here.geopandas_adapter.geopandas_adapter module#
HERE Platform Python SDK, GeoPandas adapter access package
- class here.geopandas_adapter.geopandas_adapter.GeoPandasAdapter(partition_column: str = 'partition_id', timestamp_column: str = 'partition_timestamp', including_default_value_fields: bool = True, preserving_proto_field_name: bool = True)[source]#
Bases: Adapter
This adapter transforms data from and to pd.DataFrame and gpd.GeoDataFrame when geometry information such as longitude and latitude is involved.
An adapter controls the encoding and decoding process of platform data. It transforms data from and to adapter-specific data structures and supports reading, writing, encoding and decoding a variety of MIME content types.
For the list of MIME content types supported when reading and writing a layer with the read_* and write_* functions of the Layer and its subclasses, please see the documentation of GeoPandasDecoder and GeoPandasEncoder.
All the operations involving content pass through an adapter when the parameters encode or decode are True, their default value. These are parameters of the read_* and write_* functions. If a content type is not supported, or if reading or writing raw content is preferred, pass False to skip encoding or decoding and deal with raw bytes instead.
- property content_adapter: ContentAdapter#
The adapter specialized for content.
- from_feature_ids(feature_ids: Iterator[str], **kwargs) Series[source]#
Adapt a sequence of feature identifiers to a Series.
- Parameters:
feature_ids – sequence of feature identifiers
kwargs – additional parameters are passed unchanged to
pd.Series(). For additional information please see: https://pandas.pydata.org/docs/reference/api/pandas.Series.html
- Returns:
a Series with the feature identifiers
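A minimal plain-pandas sketch of the mapping described above (the example identifiers are hypothetical):

```python
import pandas as pd

# Sketch of from_feature_ids: a sequence of feature identifiers
# becomes a pd.Series holding those identifiers.
feature_ids = iter(["FID-1", "FID-2", "FID-3"])  # hypothetical ids
series = pd.Series(feature_ids)

print(series.tolist())  # ['FID-1', 'FID-2', 'FID-3']
```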
- from_geo_features(features: Iterator[Feature], **kwargs) GeoDataFrame[source]#
Adapt a sequence of geographic features to a GeoDataFrame.
- Parameters:
features – sequence of geographic features
kwargs – additional parameters are passed unchanged to
gpd.GeoDataFrame.from_features(). For additional information please see: https://geopandas.org/docs/reference/api/geopandas.GeoDataFrame.from_features.html
- Returns:
a new gpd.GeoDataFrame containing the features
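Under the hood this corresponds to geopandas' own from_features; a minimal sketch with a hypothetical GeoJSON-like feature (assuming geopandas is installed):

```python
import geopandas as gpd

# Sketch of the adaptation: GeoJSON-like feature mappings become
# rows of a GeoDataFrame with a geometry column.
features = [
    {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [13.405, 52.52]},
        "properties": {"name": "Berlin"},  # hypothetical property
    }
]
gdf = gpd.GeoDataFrame.from_features(features)

print(gdf["name"].tolist())    # ['Berlin']
print(gdf.geometry.iloc[0].x)  # 13.405
```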
- from_index_data(partitions_data: Iterator[Tuple[IndexPartition, bytes]], content_type: str, schema: Schema | None, **kwargs) DataFrame[source]#
Adapt index partition metadata and data to a pd.DataFrame.
- Parameters:
partitions_data – sequence of partition metadata and data from an index layer
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see GeoPandasDecoder
- Returns:
partition data as pd.DataFrame or gpd.GeoDataFrame
- from_index_metadata(partitions: Iterator[IndexPartition], **kwargs) DataFrame[source]#
Adapt index partition metadata to a pd.DataFrame.
- Parameters:
partitions – sequence of partition metadata from an index layer
kwargs – unused
- Returns:
partition metadata as pd.DataFrame
- from_stream_data(partitions_data: Iterator[Tuple[StreamPartition, bytes]], content_type: str, schema: Schema | None, **kwargs) DataFrame[source]#
Adapt stream partition metadata and data to a pd.DataFrame.
- Parameters:
partitions_data – sequence of partition metadata and data from a stream layer
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see GeoPandasDecoder
- Returns:
stream message data as pd.DataFrame or gpd.GeoDataFrame
- from_stream_metadata(partitions: Iterator[StreamPartition], **kwargs) DataFrame[source]#
Adapt stream partition metadata to a pd.DataFrame.
- Parameters:
partitions – sequence of partition metadata from a stream layer
kwargs – unused
- Returns:
partition metadata as pd.DataFrame
- from_versioned_data(partitions_data: Iterator[Tuple[VersionedPartition, bytes]], content_type: str, schema: Schema | None, **kwargs) DataFrame[source]#
Adapt versioned partition metadata and data to a pd.DataFrame.
- Parameters:
partitions_data – sequence of partition metadata and data from a versioned layer
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see GeoPandasDecoder
- Returns:
partition data as pd.DataFrame or gpd.GeoDataFrame
- from_versioned_metadata(partitions: Iterator[VersionedPartition], **kwargs) DataFrame[source]#
Adapt versioned partition metadata to a pd.DataFrame.
- Parameters:
partitions – sequence of partition metadata from a versioned layer
kwargs – unused
- Returns:
partition metadata as pd.DataFrame
- from_volatile_data(partitions_data: Iterator[Tuple[VolatilePartition, bytes]], content_type: str, schema: Schema | None, **kwargs) DataFrame[source]#
Adapt volatile partition metadata and data to a pd.DataFrame.
- Parameters:
partitions_data – sequence of partition metadata and data from a volatile layer
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see GeoPandasDecoder
- Returns:
partition data as pd.DataFrame or gpd.GeoDataFrame
- from_volatile_metadata(partitions: Iterator[VolatilePartition], **kwargs) DataFrame[source]#
Adapt volatile partition metadata to a pd.DataFrame.
- Parameters:
partitions – sequence of partition metadata from a volatile layer
kwargs – unused
- Returns:
partition metadata as pd.DataFrame
- to_feature_ids(data: Series, **kwargs) Iterator[str][source]#
Adapt data from a Series to a sequence of feature identifiers.
Values are converted to str; NA values are discarded.
- Parameters:
data – a Series containing feature identifiers
kwargs – unused
- Returns:
sequence of feature identifiers
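The conversion rule above can be sketched in plain pandas, independently of the SDK:

```python
import pandas as pd

# Sketch of to_feature_ids: values become str, NA values are discarded.
data = pd.Series(["FID-1", None, "FID-2"])  # hypothetical ids with one NA
feature_ids = list(data.dropna().astype(str))

print(feature_ids)  # ['FID-1', 'FID-2']
```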
- to_geo_features(data: GeoDataFrame, **kwargs) Iterator[Feature][source]#
Adapt data in a GeoDataFrame to a sequence of geographic features.
- Parameters:
data – the gpd.GeoDataFrame to adapt
kwargs – additional parameters are passed unchanged to gpd.GeoDataFrame.iterfeatures(). For additional information please see: https://geopandas.org/docs/reference/api/geopandas.GeoDataFrame.iterfeatures.html
- Returns:
sequence of geographic features from the GeoDataFrame
- to_index_single_data(data: DataFrame, content_type: str, schema: Schema | None, **kwargs) bytes[source]#
Adapt a DataFrame to be stored in an index layer.
- Parameters:
data – data in the form of DataFrame
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see GeoPandasEncoder
- Returns:
data encoded for an index layer
- Raises:
ValueError – in case the content type is not supported by the adapter
- to_stream_data(layer: StreamLayer, data, content_type: str, schema: Schema | None, timestamp: int | None, **kwargs) Iterator[Tuple[str | int, bytes, int | None]][source]#
Adapt data from the target format to stream partition metadata and data.
- Parameters:
layer – the layer all the metadata and data belong to
data – adapter-specific, the data to adapt
content_type – the MIME content type of the layer
schema – optional Schema of the layer
timestamp – optional timestamp for all the messages, if none is specified in data, in milliseconds since the Unix epoch (1970-01-01T00:00:00 UTC)
kwargs – adapter-specific, please consult the documentation of the specific adapter for the parameters and types it supports
- Yield:
partition id, data and timestamp for the stream layer
- Raises:
ValueError – in case required columns are missing
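The exact layout is adapter-specific, but a DataFrame like the following sketches the shape to_stream_data consumes, using the default column names from the GeoPandasAdapter constructor ('partition_id', 'partition_timestamp'); the payload column is hypothetical:

```python
import pandas as pd

# Hypothetical input for to_stream_data: one row per message, with the
# partition id and optional timestamp columns named as configured in the
# adapter constructor (defaults: 'partition_id', 'partition_timestamp').
df = pd.DataFrame(
    {
        "partition_id": ["23618402", "23618403"],
        "partition_timestamp": [1_700_000_000_000, 1_700_000_001_000],  # ms since epoch
        "payload": ["msg-a", "msg-b"],  # hypothetical content column
    }
)

# If a required column such as the partition id were missing,
# to_stream_data would raise ValueError.
print(list(df.columns))
```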
- to_stream_metadata(layer: StreamLayer, partitions: DataFrame, **kwargs) Iterator[StreamPartition][source]#
Adapt what to publish from the target format to stream partition metadata.
- Parameters:
layer – the layer all the metadata and data belong to
partitions – the pd.DataFrame of partition metadata to append
kwargs – unused
- Yield:
the StreamPartition objects that are adapted
- to_versioned_data(layer: VersionedLayer, data: pd.DataFrame, content_type: str, schema: Schema | None, **kwargs) Iterator[Tuple[str | int, bytes]][source]#
Adapt data from a sequence of partition ids and data to versioned partition ids and data.
- Parameters:
layer – the layer all the metadata and data belong to
data – data as pd.DataFrame or gpd.GeoDataFrame
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see GeoPandasEncoder
- Returns:
sequence of partition ids and data for the versioned layer
- to_versioned_metadata(layer: VersionedLayer, partitions_update: DataFrame | None, partitions_delete: Series | None, **kwargs) Tuple[Iterator[VersionedPartition], Iterator[str | int]][source]#
Adapt a pd.DataFrame of metadata and a pd.Series of keys to versioned partition metadata and partition ids to update and delete.
- Parameters:
layer – the layer all the metadata and data belong to
partitions_update – the pd.DataFrame of partition metadata to update, if any
partitions_delete – the pd.Series of partition ids to delete, if any
kwargs – unused
- Returns:
tuple of iterators, the first with the VersionedPartition objects that have to be updated, the second with the partition ids to delete
- to_volatile_data(layer: VolatileLayer, data: pd.DataFrame, content_type: str, schema: Schema | None, **kwargs) Iterator[Tuple[str | int, bytes]][source]#
Adapt data from a sequence of partition ids and data to volatile partition ids and data.
- Parameters:
layer – the layer all the metadata and data belong to
data – data as pd.DataFrame or gpd.GeoDataFrame
content_type – the MIME content type of the layer
schema – optional Schema of the layer
kwargs – additional, content-type-specific parameters, see GeoPandasEncoder
- Returns:
sequence of partition id and data for the volatile layer
- to_volatile_metadata(layer: VolatileLayer, partitions_update: DataFrame | None, partitions_delete: Series | None, **kwargs) Tuple[Iterator[VolatilePartition], Iterator[str | int]][source]#
Adapt a pd.DataFrame of metadata and a pd.Series of keys to volatile partition metadata and partition ids to update and delete.
- Parameters:
layer – the layer all the metadata and data belong to
partitions_update – the pd.DataFrame of partition metadata to update, if any
partitions_delete – the pd.Series of partition ids to delete, if any
kwargs – unused
- Returns:
tuple of iterators, the first with the VolatilePartition objects that have to be updated, the second with the partition ids to delete
- class here.geopandas_adapter.geopandas_adapter.GeoPandasContentAdapter(partition_column: str)[source]#
Bases: ContentAdapter
Specialization of the GeoPandasAdapter to map tabular-like content from content bindings to GeoDataFrame or DataFrame.
- from_objects(fields: type, data: Iterator[object], single_element: bool = False, index_partition: None | str | Callable[[object], Partition] = None, index_id: None | str | Callable[[object], Identifier] = None, index_ref: None | str | Callable[[object], Ref | Iterable[Ref]] = None) DataFrame | GeoDataFrame[source]#
Adapt content from a structured representation to a pandas DataFrame or geopandas GeoDataFrame.
It can optionally perform indexing of objects, based on their partition, identifier and set of references to other objects. Indexing is specified by naming the field of the object that contains the value to index, or by passing a function that calculates that value from the object.
- Parameters:
fields – the fields to extract, as specified by a dataclass. Field names are looked up among the attributes of each object via getattr. When missing, None or an equivalent is used. Each field has a type that describes its semantics: it is used to adapt the value to the most appropriate representation for the output format. TypeError is raised in case this is not possible.
data – the objects to adapt to the target format. Fields not mentioned in fields are discarded. Expected but missing fields and identifiers are considered None. Field values may be of any type compatible with the type declared for the field. Partition ids don’t have to be unique, but they have to be contiguous: all the objects with a given partition identifier must be returned in sequence. Object identifiers, when present, must be unique across the whole content.
single_element – the data contains exactly one element; the content adapter can use this information to optimize or return a specialized representation
index_partition – index the content by partition, using the field specified
index_id – index the content by object identifier, using the field specified
index_ref – index the content by references, using the field specified. Each object can contain zero, one or more references, and references can be shared among multiple objects.
- Returns:
objects in a dataframe, indexed as requested
- Raises:
ValueError – if the fields are not described by a dataclass
KeyError – in case partition id, object id or reference is needed but not present
TypeError – in case partition or object id is not of type int or string. Also raised in case field values are not of the type declared for the field, or if they can’t be converted to it.
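A plain-Python sketch of the extraction behavior described above; it mimics the documented getattr lookup and None fallback, and is not the SDK implementation:

```python
from dataclasses import dataclass, fields as dc_fields

import pandas as pd

# The dataclass declares which fields to extract; attributes missing
# on an object fall back to None, as documented for from_objects.
@dataclass
class Road:  # hypothetical field specification
    id: str
    speed_limit: int

class Obj:  # arbitrary source objects; extra attributes are discarded
    def __init__(self, **kw):
        self.__dict__.update(kw)

objects = [Obj(id="r1", speed_limit=50, ignored="x"), Obj(id="r2")]

rows = [
    {f.name: getattr(obj, f.name, None) for f in dc_fields(Road)}
    for obj in objects
]
df = pd.DataFrame(rows)

print(list(df.columns))  # ['id', 'speed_limit']
```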
- class here.geopandas_adapter.geopandas_adapter.GeoPandasDecoder(including_default_value_fields: bool = True, preserving_proto_field_name: bool = True)[source]#
Bases: Decoder
Implementation of a Decoder to work with pd.DataFrame and gpd.GeoDataFrame.
- decode_blob(data: bytes, content_type: str, schema: Schema | None = None, **kwargs)[source]#
Decode one single blob of data.
- Parameters:
data – the encoded data
content_type – the MIME content type to be decoded
schema – the schema, if the content type requires one
kwargs –
additional, content-type-specific parameters for the decoder:
For Protobuf (application/protobuf or application/x-protobuf):
- record_path: the name of a schema field that is decoded and transformed to DataFrame. It can reference nested fields by concatenating the field names with .. When referencing a single Protobuf sub-message, that message is decoded into one single dataframe row. When referencing repeated Protobuf messages, each repeated message is decoded in its own row, resulting in multiple rows per partition. Fields that are not Protobuf messages or repeated fields containing single values (ints, strings, …) are not supported because it is not possible to transform them to a dataframe. If not specified, the whole blob is decoded as a single message. Messages are decoded, normalized (see max_level) and passed to pd.DataFrame.from_records() together with the rest of kwargs: this turns each field of the normalized messages into a column of the resulting dataframe.
- record_prefix: if True, prefix the column names with the record_path. If a non-empty string, that string is used as prefix. . is used as separator.
- max_level: normalize each record of the decoded Protobuf message up to the specified maximum level in depth. 0 disables normalization.
- geometry_col: name of a column that contains geometries that is converted to a geopandas GeoSeries, resulting in a GeoDataFrame returned in place of a pandas DataFrame. For the supported formats, please see documentation of here.geopandas_adapter.geo_utils.to_geometry. Geometry field and sub-fields are excluded from normalization. If not specified, a pandas DataFrame is returned and geometry is not interpreted.
- geometry_crs: the CRS to set in the GeoDataFrame, when applicable.
- The rest of the parameters are passed unchanged to pd.DataFrame.from_records() for further customizations. For additional information please see: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.from_records.html
For Parquet (application/x-parquet):
- engine: an optional parameter for the type of engine used to parse the parquet data; allowed values are [auto, fastparquet, pyarrow]. If ‘auto’, the behavior is to try ‘pyarrow’, falling back to ‘fastparquet’ if ArrowNotImplementedError is raised.
- The rest of the parameters are passed unchanged to pd.read_parquet() for further customizations. For additional information please see: https://pandas.pydata.org/docs/reference/api/pandas.read_parquet.html
For CSV (text/csv):
- sep: delimiter or column separator to use.
- header: row number(s) to use as the column names, and the start of the data. Default behavior is to infer the column names: if no names are passed the behavior is identical to header=0 and column names are inferred from the first line of the file; if column names are passed explicitly then the behavior is identical to header=None. Explicitly pass header=0 to replace existing names. The header can be a list of integers that specify row locations for a multi-index on the columns, e.g. [0, 1, 3]. Intervening rows that are not specified are skipped (e.g. 2 in this example is skipped). Note that this parameter ignores commented lines and empty lines if skip_blank_lines=True, so header=0 denotes the first line of data rather than the first line of the file.
- names: list of column names to use. If the file contains a header row, then you should explicitly pass header=0 to override the column names. Duplicates in this list are not allowed.
- index_col: column(s) to use as the row labels of the DataFrame, either given as string name or column index. If a sequence of int/str is given, a MultiIndex is used. Note: index_col=False can be used to force pandas to not use the first column as the index, e.g. when you have a malformed file with delimiters at the end of each line.
For JSON (application/json):
- orient: indication of expected JSON string format. The set of possible orients is:
  - ‘split’: dict like {index -> [index], columns -> [columns], data -> [values]}
  - ‘records’: list like [{column -> value}, … , {column -> value}]
  - ‘index’: dict like {index -> {column -> value}}
  - ‘columns’: dict like {column -> {index -> value}}
  - ‘values’: just the values array
  - ‘table’: dict like {‘schema’: {schema}, ‘data’: {data}}
- lines: set to True to read the file as a json object per line
- nrows: the number of lines from the line-delimited json file to read. This can only be passed if lines=True. If None, all the rows are returned.
For GeoJSON (application/geo+json or application/vnd.geo+json): no additional parameters available.
- Returns:
the decoded blob; its type corresponds to the type declared in the property supported_content_types for the content type
- Raises:
UnsupportedContentTypeDecodeException – in case the content type is not decodable
ValueError – if the schema is mandatory for the content type but missing
DecodeException – in case the blob can’t be properly decoded
SchemaException – in case the schema can’t be used to decode the content
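The JSON parameters above mirror pandas' own pd.read_json; a plain-pandas sketch of orient and lines on a hypothetical line-delimited blob:

```python
import io

import pandas as pd

# Sketch of JSON decoding: with lines=True, each line of the blob is
# one JSON object, decoded into one dataframe row.
blob = b'{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'
df = pd.read_json(io.BytesIO(blob), orient="records", lines=True)

print(df["name"].tolist())  # ['a', 'b']
```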
- property supported_content_types: Dict[str, type | Tuple[type, ...]]#
- Returns:
the dictionary of MIME content types supported when decoding single blobs with the decode_blob function of this decoder, each with the type of the decoded data.
- class here.geopandas_adapter.geopandas_adapter.GeoPandasEncoder[source]#
Bases: Encoder
Implementation of an Encoder to work with pd.DataFrame and gpd.GeoDataFrame.
- encode_blob(data, content_type: str, schema: Schema | None = None, **kwargs) bytes[source]#
Encode one single blob of data.
- Parameters:
data – the data to be encoded; its type corresponds to the type declared in the property supported_content_types for the content type
content_type – the MIME content type to be encoded
schema – the schema, if the content type requires one
kwargs –
additional, content-type-specific parameters for the encoder:
For Parquet (application/x-parquet):
- engine: an optional parameter for the type of engine used to write the parquet data; allowed values are [auto, fastparquet, pyarrow]. If ‘auto’, the behavior is to try ‘pyarrow’, falling back to ‘fastparquet’ if ArrowNotImplementedError is raised.
- The rest of the parameters are passed unchanged to pd.DataFrame.to_parquet() for further customizations. For additional information please see: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_parquet.html
For CSV (text/csv): for parameters and general info, please see: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html
For JSON (application/json):
- orient: indication of the JSON string format to produce. The set of possible orients is:
  - ‘split’: dict like {index -> [index], columns -> [columns], data -> [values]}
  - ‘records’: list like [{column -> value}, … , {column -> value}]
  - ‘index’: dict like {index -> {column -> value}}
  - ‘columns’: dict like {column -> {index -> value}}
  - ‘values’: just the values array
  - ‘table’: dict like {‘schema’: {schema}, ‘data’: {data}}
- lines: if orient is ‘records’, write out line-delimited json format.
For additional parameters and general info, please see: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_json.html
For GeoJSON (application/geo+json or application/vnd.geo+json): for parameters and general info, please see: https://geopandas.org/docs/reference/api/geopandas.GeoDataFrame.to_json.html#geopandas.GeoDataFrame.to_json
For Protobuf (application/protobuf or application/x-protobuf): for parameters and general info, please see: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_records.html
- Returns:
the encoded data
- Raises:
UnsupportedContentTypeEncodeException – in case the content type is not encodable
ValueError – if the schema is mandatory for the content type but missing
EncodeException – in case the blob can’t be properly encoded
SchemaException – in case the schema can’t be used to encode the content
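The JSON orients above mirror pd.DataFrame.to_json; a plain-pandas sketch of two of them:

```python
import pandas as pd

# Sketch of JSON encoding: orient controls the JSON layout.
df = pd.DataFrame({"id": [1, 2], "name": ["a", "b"]})

records = df.to_json(orient="records")
print(records)  # [{"id":1,"name":"a"},{"id":2,"name":"b"}]

# With lines=True, each record is written on its own line.
lines = df.to_json(orient="records", lines=True)
```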
- property supported_content_types: Dict[str, type | Tuple[type, ...]]#
- Returns:
the dictionary of MIME content types supported when encoding single blobs with the encode_blob function of this encoder, each with the type of the encoded data.