pandas.api.extensions.ExtensionDtype

class pandas.api.extensions.ExtensionDtype

A custom data type, to be paired with an ExtensionArray.

See also

extensions.register_extension_dtype

Register an ExtensionType with pandas as a class decorator.

extensions.ExtensionArray

Abstract base class for custom 1-D array types.

Notes

The interface includes the following abstract members, which subclasses must implement:

  • type

  • name

  • construct_array_type
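As an illustration of the three required members, here is a minimal sketch. The names TemperatureDtype and TemperatureArray are hypothetical, and the array class is only a stub: a usable ExtensionArray has to implement the full array interface, which is omitted here.

```python
from pandas.api.extensions import ExtensionArray, ExtensionDtype


class TemperatureDtype(ExtensionDtype):
    """Hypothetical dtype whose scalars are plain Python floats."""

    # Plain class attributes are enough to override the abstract
    # `type` and `name` properties of the base class.
    type = float
    name = "temperature"

    @classmethod
    def construct_array_type(cls):
        # Return the array class (not an instance) paired with this dtype.
        return TemperatureArray


class TemperatureArray(ExtensionArray):
    """Stub only: a real array must implement the ExtensionArray interface."""

    @property
    def dtype(self):
        return TemperatureDtype()


dtype = TemperatureDtype()
print(dtype.name, dtype.type, dtype.construct_array_type())
```

Because name is a plain string here, the inherited construct_from_string can rebuild the dtype from "temperature" without further code.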

The following attributes and methods influence the behavior of the dtype in pandas operations:

  • _is_numeric

  • _is_boolean

  • _get_common_dtype
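The hooks listed above can be overridden as in the sketch below. MeasurementDtype is a hypothetical name, and the rule used in _get_common_dtype (accept this dtype and NumPy floats, reject everything else) is an assumption chosen for illustration, not a pandas recommendation.

```python
import numpy as np
from pandas.api.extensions import ExtensionDtype


class MeasurementDtype(ExtensionDtype):
    """Hypothetical numeric dtype."""

    type = float
    name = "measurement"

    @property
    def _is_numeric(self) -> bool:
        # Ask pandas to treat columns of this dtype as numeric.
        return True

    def _get_common_dtype(self, dtypes):
        # Return a dtype that can hold every dtype in `dtypes`, or None
        # if there is no such dtype. Illustrative rule: only this dtype
        # and NumPy floating-point dtypes are accepted.
        for dtype in dtypes:
            same_family = isinstance(dtype, MeasurementDtype)
            numpy_float = isinstance(dtype, np.dtype) and dtype.kind == "f"
            if not (same_family or numpy_float):
                return None
        return self


dtype = MeasurementDtype()
print(dtype._is_numeric, dtype._is_boolean)
print(dtype._get_common_dtype([dtype, np.dtype("float64")]))
print(dtype._get_common_dtype([dtype, np.dtype("int64")]))
```

_is_boolean is left at its inherited default in this sketch.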

The na_value class attribute can be used to set the default NA value for this type. numpy.nan is used by default.
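A short sketch of both cases follows. The class names are hypothetical, and pandas.NA is used only as one possible alternative NA value.

```python
import numpy as np
import pandas as pd
from pandas.api.extensions import ExtensionDtype


class DefaultNADtype(ExtensionDtype):
    """Hypothetical dtype that keeps the inherited NA value."""

    type = float
    name = "default_na"


class MaskedNADtype(ExtensionDtype):
    """Hypothetical dtype that uses pandas.NA as its NA value."""

    type = float
    name = "masked_na"
    # A class attribute overrides the default provided by the base class.
    na_value = pd.NA


print(np.isnan(DefaultNADtype().na_value))  # inherited default is NaN
print(MaskedNADtype().na_value is pd.NA)
```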

ExtensionDtypes are required to be hashable. The base class provides a default implementation, which relies on the _metadata class attribute. _metadata should be a tuple containing the names, as strings, of the attributes that define your data type. For example, with PeriodDtype that’s the freq attribute.

If you have a parametrized dtype, you should set the _metadata class attribute.

Ideally, the attributes in _metadata will match the parameters to your ExtensionDtype.__init__ (if any). If any of the attributes in _metadata don’t implement the standard __eq__ or __hash__, the default implementations here will not work.
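The sketch below shows how _metadata drives the default equality and hashing for a parametrized dtype. UnitDtype and its unit parameter are hypothetical names chosen for illustration.

```python
from pandas.api.extensions import ExtensionDtype


class UnitDtype(ExtensionDtype):
    """Hypothetical parametrized dtype; `unit` is its only parameter."""

    type = float
    # The attribute names listed here are compared by the default __eq__
    # and combined by the default __hash__.
    _metadata = ("unit",)

    def __init__(self, unit: str = "celsius") -> None:
        self.unit = unit

    @property
    def name(self) -> str:
        return f"unit[{self.unit}]"


a = UnitDtype("celsius")
b = UnitDtype("celsius")
c = UnitDtype("kelvin")

print(a == b)              # True: same metadata
print(a == c)              # False: different unit
print(hash(a) == hash(b))  # True: equal dtypes hash equally
print(len({a, b, c}))      # 2: a and b collapse into one set entry
```

Here the entries in _metadata match the parameters of __init__, as the text above recommends. Because name is a property rather than a plain string in this sketch, such a dtype would normally also override construct_from_string.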

For interaction with Apache Arrow (pyarrow), a __from_arrow__ method can be implemented: this method receives a pyarrow Array or ChunkedArray as its only argument and is expected to return the appropriate pandas ExtensionArray for this dtype and the passed values:

class ExtensionDtype:

    def __from_arrow__(
        self, array: Union[pyarrow.Array, pyarrow.ChunkedArray]
    ) -> ExtensionArray:
        ...

This class does not inherit from abc.ABCMeta for performance reasons. Methods and properties required by the interface raise pandas.errors.AbstractMethodError, and no register method is provided for registering virtual subclasses.
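A consequence is that a missing member is reported on first access rather than when the dtype is instantiated. The sketch below illustrates this with a hypothetical IncompleteDtype that implements nothing.

```python
from pandas.api.extensions import ExtensionDtype
from pandas.errors import AbstractMethodError


class IncompleteDtype(ExtensionDtype):
    """Hypothetical subclass that implements none of the required members."""


# Instantiation succeeds: without ABCMeta there is no check at this point.
dtype = IncompleteDtype()

missing = []
for member in ("name", "type"):
    try:
        getattr(dtype, member)
    except AbstractMethodError:
        # Raised by the base-class property when it is not overridden.
        missing.append(member)

print(missing)  # ['name', 'type']
```

AbstractMethodError derives from NotImplementedError, so code that catches NotImplementedError handles it as well.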

Attributes

kind

A character code (one of 'biufcmMOSUV'), default 'O'

na_value

Default NA value to use for this type.

name

A string identifying the data type.

names

Ordered list of field names, or None if there are no fields.

type

The scalar type for the array.

Methods

construct_array_type()

Return the array type associated with this dtype.

construct_from_string(string)

Construct this type from a string.

is_dtype(dtype)

Check if we match 'dtype'.
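For a parametrized dtype, construct_from_string is usually overridden so that the parameters can be parsed back out of the string. The sketch below assumes a hypothetical UnitDtype whose string form is "unit[<unit>]"; the regular expression and the class are illustrative only.

```python
import re

import numpy as np
from pandas.api.extensions import ExtensionDtype


class UnitDtype(ExtensionDtype):
    """Hypothetical parametrized dtype with string form 'unit[<unit>]'."""

    type = float
    _metadata = ("unit",)
    _pattern = re.compile(r"^unit\[(?P<unit>.+)\]$")

    def __init__(self, unit: str) -> None:
        self.unit = unit

    @property
    def name(self) -> str:
        return f"unit[{self.unit}]"

    @classmethod
    def construct_from_string(cls, string):
        if not isinstance(string, str):
            raise TypeError(f"Expected a string, got {type(string).__name__}")
        match = cls._pattern.match(string)
        if match is None:
            # TypeError signals that the string does not describe this dtype.
            raise TypeError(f"Cannot construct a '{cls.__name__}' from '{string}'")
        return cls(**match.groupdict())


parsed = UnitDtype.construct_from_string("unit[kelvin]")
print(parsed.unit)
print(UnitDtype.is_dtype(parsed))               # an instance matches
print(UnitDtype.is_dtype("unit[kelvin]"))       # so does a parseable string
print(UnitDtype.is_dtype(np.dtype("float64")))  # NumPy dtypes never match
```

Raising TypeError for strings that do not match is what lets comparisons such as parsed == "int64" evaluate to False instead of failing.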

© 2008–2021, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/1.3.4/reference/api/pandas.api.extensions.ExtensionDtype.html