pandas.DataFrame.to_parquet
DataFrame.to_parquet(fname, engine='auto', compression='snappy', index=None, partition_cols=None, **kwargs)
Write a DataFrame to the binary parquet format.
New in version 0.21.0.
This function writes the dataframe as a parquet file. You can choose between parquet backends and optionally compress the output. See the user guide for more details.
Parameters:
fname : str
File path or root directory path. When writing a partitioned dataset, it is used as the root directory path.
Changed in version 0.24.0.
engine : {‘auto’, ‘pyarrow’, ‘fastparquet’}, default ‘auto’
Parquet library to use. If ‘auto’, then the option io.parquet.engine is used. The default io.parquet.engine behavior is to try ‘pyarrow’, falling back to ‘fastparquet’ if ‘pyarrow’ is unavailable.
compression : {‘snappy’, ‘gzip’, ‘brotli’, None}, default ‘snappy’
Name of the compression to use. Use None for no compression.
index : bool, default None
If True, include the dataframe’s index(es) in the file output. If False, they will not be written to the file. If None, the behavior depends on the chosen engine.
New in version 0.24.0.
partition_cols : list, optional, default None
Column names by which to partition the dataset. Columns are partitioned in the order they are given.
New in version 0.24.0.
**kwargs
Additional arguments passed to the parquet library. See pandas io for more details.
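The partition_cols behavior described above can be sketched as follows. This is a minimal illustration, not part of the official examples: the helper name write_partitioned, the sample columns year and value, and the temporary root directory are all hypothetical. It assumes a parquet engine with partitioning support is installed, and it skips the write otherwise.

```python
import importlib.util
import os
import tempfile

import pandas as pd

# Assumption: at least one parquet engine is importable; otherwise skip the write.
HAVE_ENGINE = any(
    importlib.util.find_spec(name) is not None
    for name in ("pyarrow", "fastparquet")
)


def write_partitioned(df, root, partition_cols):
    """Write df under root and return the subdirectories created there.

    write_partitioned is a hypothetical helper for this sketch.
    """
    # The target is passed positionally, so it is treated as the root
    # directory of the partitioned dataset.
    df.to_parquet(root, partition_cols=partition_cols)
    return sorted(
        entry
        for entry in os.listdir(root)
        if os.path.isdir(os.path.join(root, entry))
    )


# Hypothetical sample data: two distinct values in the partition column.
df = pd.DataFrame({"year": [2019, 2019, 2020], "value": [1.0, 2.0, 3.0]})

partition_dirs = []
if HAVE_ENGINE:
    with tempfile.TemporaryDirectory() as root:
        partition_dirs = write_partitioned(df, root, ["year"])
    print(partition_dirs)
```

Each distinct value of the partition column gets its own subdirectory under the root, which is why fname is interpreted as a directory rather than a single file in this mode.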
See also
read_parquet - Read a parquet file.
DataFrame.to_csv - Write a csv file.
DataFrame.to_sql - Write to a sql table.
DataFrame.to_hdf - Write to hdf.
Notes
This function requires either the fastparquet or pyarrow library.
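Because one of the two libraries must be installed, it can be useful to check which are importable before calling to_parquet. The sketch below uses only the standard library; the helper name available_parquet_engines is hypothetical. It only approximates what engine='auto' would pick, since pandas also honors the io.parquet.engine option and may apply its own version checks.

```python
import importlib.util


def available_parquet_engines():
    """Return importable parquet engines, in the order 'auto' tries them.

    available_parquet_engines is a hypothetical helper for this sketch.
    """
    candidates = ("pyarrow", "fastparquet")
    return [
        name for name in candidates
        if importlib.util.find_spec(name) is not None
    ]


engines = available_parquet_engines()
if engines:
    print("engine='auto' would be expected to use:", engines[0])
else:
    print("No parquet engine found; install pyarrow or fastparquet.")
```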
Examples
>>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
>>> df.to_parquet('df.parquet.gzip',
...               compression='gzip')  # doctest: +SKIP
>>> pd.read_parquet('df.parquet.gzip')  # doctest: +SKIP
   col1  col2
0     1     3
1     2     4
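As a further illustration, the following sketch shows the effect of the index parameter by writing a dataframe with a labelled index and reading it back. The helper name roundtrip and the temporary file name are hypothetical, and the code assumes a parquet engine is installed (it skips the round trip otherwise). Exact read-back behavior may vary between engines and versions.

```python
import importlib.util
import os
import tempfile

import pandas as pd

# Assumption: at least one parquet engine is importable; otherwise skip.
HAVE_ENGINE = any(
    importlib.util.find_spec(name) is not None
    for name in ("pyarrow", "fastparquet")
)


def roundtrip(df, **kwargs):
    """Write df to a temporary parquet file and read it back.

    roundtrip is a hypothetical helper; kwargs go to DataFrame.to_parquet.
    """
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "df.parquet")
        df.to_parquet(target, **kwargs)
        return pd.read_parquet(target)


df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4]}, index=["a", "b"])

if HAVE_ENGINE:
    # index=True stores the labels, so they come back on read.
    with_index = roundtrip(df, index=True)
    # index=False drops the labels; compression=None disables compression.
    without_index = roundtrip(df, index=False, compression=None)
    print(list(with_index.index))
    print(list(without_index.index))
```

Passing index explicitly avoids relying on the engine-dependent default that applies when index is None.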
© 2008–2012, AQR Capital Management, LLC, Lambda Foundry, Inc. and PyData Development Team
Licensed under the 3-clause BSD License.
https://pandas.pydata.org/pandas-docs/version/0.24.2/reference/api/pandas.DataFrame.to_parquet.html