tf.data.experimental.save

Saves the content of the given dataset.

Example usage:

import tempfile
path = os.path.join(tempfile.gettempdir(), "saved_data")
# Save a dataset
dataset = tf.data.Dataset.range(2)
tf.data.experimental.save(dataset, path)
new_dataset = tf.data.experimental.load(path,
    tf.TensorSpec(shape=(), dtype=tf.int64))
for elem in new_dataset:
  print(elem)
tf.Tensor(0, shape=(), dtype=int64)
tf.Tensor(1, shape=(), dtype=int64)

The saved dataset is saved in multiple file "shards". By default, the dataset output is divided to shards in a round-robin fashion but custom sharding can be specified via the shard_func function. For example, you can save the dataset to using a single shard as follows:

dataset = make_dataset()
def custom_shard_func(element):
  return 0
dataset = tf.data.experimental.save(
    path="/path/to/data", ..., shard_func=custom_shard_func)
Note: The directory layout and file format used for saving the dataset is considered an implementation detail and may change. For this reason, datasets saved through tf.data.experimental.save should only be consumed through tf.data.experimental.load, which is guaranteed to be backwards compatible.
Args
dataset The dataset to save.
path Required. A directory to use for saving the dataset.
compression Optional. The algorithm to use to compress data when writing it. Supported options are GZIP and NONE. Defaults to NONE.
shard_func Optional. A function to control the mapping of dataset elements to file shards. The function is expected to map elements of the input dataset to int64 shard IDs. If present, the function will be traced and executed as graph computation.

© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r2.3/api_docs/python/tf/data/experimental/save