tf.contrib.layers.sparse_column_with_vocabulary_file

Creates a _SparseColumn with vocabulary file configuration.

Use this when your sparse features are in string or integer format and you have a vocabulary file that maps each value to an integer ID:

output_id = LookupIdFromVocab(input_feature_string)
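
The lookup can be pictured with a small pure-Python sketch. It is not the library's implementation: it assumes the vocabulary file holds one value per line and that a value's ID is its zero-based line number. The helpers load_vocab and lookup_id are hypothetical names, and zlib.crc32 is only a stand-in for whatever hash the library uses to pick an out-of-vocabulary bucket.

```python
import os
import tempfile
import zlib


def load_vocab(vocabulary_file, vocab_size):
    """Read the first vocab_size lines; the zero-based line number is the ID."""
    vocab = {}
    with open(vocabulary_file, encoding="utf-8") as f:
        for index, line in enumerate(f):
            if index >= vocab_size:
                break
            vocab[line.rstrip("\n")] = index
    return vocab


def lookup_id(value, vocab, num_oov_buckets=0, default_value=-1):
    """Return the vocabulary ID for value, or an out-of-vocabulary fallback."""
    if value in vocab:
        return vocab[value]
    if num_oov_buckets > 0:
        # Assumption: crc32 stands in for the library's hash function, and
        # bucket IDs are placed directly after the in-vocabulary IDs.
        bucket = zlib.crc32(value.encode("utf-8")) % num_oov_buckets
        return len(vocab) + bucket
    # No buckets: fall back to default_value.
    return default_value


# Write a throwaway vocabulary file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "vocab.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("apple\nbanana\ncherry\n")
    vocab = load_vocab(path, vocab_size=3)

ids = [lookup_id(v, vocab) for v in ["banana", "durian"]]
print(ids)  # [1, -1]
```

With num_oov_buckets=4, the unknown value "durian" would instead map to an ID somewhere in the range 3 to 6 in this sketch.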

Args
column_name A string defining sparse column name.
vocabulary_file The vocabulary filename.
num_oov_buckets The number of out-of-vocabulary buckets. If zero, all out-of-vocabulary features are ignored.
vocab_size The number of elements in the vocabulary.
default_value The value to use for out-of-vocabulary feature values. Defaults to -1.
combiner A string specifying how to reduce the values when the sparse column is multivalent. Currently "mean", "sqrtn" and "sum" are supported, with "sum" the default. "sqrtn" often achieves good accuracy, in particular with bag-of-words columns.
  • "sum": do not normalize features in the column
  • "mean": do L1 normalization on features in the column
  • "sqrtn": do L2 normalization on features in the column
  For more information, see tf.embedding_lookup_sparse.
dtype The type of features. Only string and integer types are supported.
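
To make the three combiner options concrete, here is a minimal pure-Python sketch of how several embedding rows for one multivalent feature could be reduced to a single vector. It follows the normalizations described for tf.embedding_lookup_sparse (divide by the sum of weights for "mean", by the square root of the sum of squared weights for "sqrtn"). The function name combine and the sample rows are illustrative assumptions, not part of the API.

```python
import math


def combine(vectors, combiner="sum", weights=None):
    """Reduce a list of equal-length vectors to one vector."""
    if not vectors:
        raise ValueError("vectors must not be empty")
    if weights is None:
        # Assumption: unweighted features all carry weight 1.0.
        weights = [1.0] * len(vectors)
    dim = len(vectors[0])
    # Weighted sum per dimension; this is the "sum" result as-is.
    total = [
        sum(w * vec[i] for w, vec in zip(weights, vectors))
        for i in range(dim)
    ]
    if combiner == "sum":
        return total
    if combiner == "mean":
        norm = sum(weights)  # L1 norm of the weights
    elif combiner == "sqrtn":
        norm = math.sqrt(sum(w * w for w in weights))  # L2 norm of the weights
    else:
        raise ValueError("unsupported combiner: %r" % combiner)
    return [x / norm for x in total]


# Four hypothetical embedding rows looked up for one example.
rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
print(combine(rows, "sum"))    # [16.0, 20.0]
print(combine(rows, "mean"))   # [4.0, 5.0]
print(combine(rows, "sqrtn"))  # [8.0, 10.0]
```

With four unit-weight rows, "mean" divides the sum by 4 and "sqrtn" divides it by 2, which is why "sqrtn" sits between the other two in magnitude.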
Returns
A _SparseColumn with vocabulary file configuration.
Raises
ValueError If vocab_size is not defined.
ValueError If dtype is neither string nor integer.

© 2020 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/versions/r1.15/api_docs/python/tf/contrib/layers/sparse_column_with_vocabulary_file