pandas.util.hash_array

pandas.util.hash_array(vals, encoding='utf8', hash_key='0123456789123456', categorize=True)

Given a 1d array, return an array of deterministic integers.

Parameters
vals:ndarray or ExtensionArray
encoding:str, default ‘utf8’

Encoding used for the data and the hash key when they are strings.

hash_key:str, default _default_hash_key (‘0123456789123456’)

Key used when hashing string values.

categorize:bool, default True

Whether to first categorize object arrays before hashing. This is more efficient when the array contains duplicate values.

Returns
ndarray[np.uint64, ndim=1]

Hashed values, with the same length as vals.
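
Examples

The following is a minimal sketch of how the parameters above might be used, not an official example from the pandas documentation. The input array and the alternative key "abcdefghijklmnop" are arbitrary illustrative values; the sketch assumes the key is 16 characters long, like the default.

```python
import numpy as np
import pandas as pd

# Object array with a repeated value; equal inputs should hash equally.
vals = np.array(["a", "b", "a"], dtype=object)

hashed = pd.util.hash_array(vals)
print(hashed.dtype, hashed.shape)

# Hashing is deterministic, so a second call should give the same values.
repeat = pd.util.hash_array(vals)
print((hashed == repeat).all())

# categorize is described as an efficiency option; compare both settings.
uncategorized = pd.util.hash_array(vals, categorize=False)
print((hashed == uncategorized).all())

# Hash the same strings with a different key (illustrative value only).
rekeyed = pd.util.hash_array(vals, hash_key="abcdefghijklmnop")
print((hashed != rekeyed).all())
```

Because the result has one uint64 per input element, it can be used for tasks such as bucketing or comparing values across runs, provided the same encoding and hash_key are used each time.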

© 2008–2021, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/1.3.4/reference/api/pandas.util.hash_array.html