Is there a way to stack two tensorflow datasets?
Solution 1:
The tf.data.Dataset.concatenate() method is the closest analog of tf.stack() when working with datasets. If you have two datasets with the same structure (i.e. same types for each component, but possibly different shapes):
dataset_1 = tf.data.Dataset.range(10, 20)
dataset_2 = tf.data.Dataset.range(60, 70)
then you can concatenate them as follows:
combined_dataset = dataset_1.concatenate(dataset_2)
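To check what the concatenation produces, here is a minimal sketch, assuming TensorFlow 2.x with eager execution (as_numpy_iterator() is not available in older releases). The name combined_values is only for illustration.

```python
import tensorflow as tf

dataset_1 = tf.data.Dataset.range(10, 20)
dataset_2 = tf.data.Dataset.range(60, 70)

# Elements of dataset_2 are appended after the elements of dataset_1.
combined_dataset = dataset_1.concatenate(dataset_2)

# Materialise the elements as plain Python ints for inspection.
combined_values = [int(x) for x in combined_dataset.as_numpy_iterator()]
print(combined_values)
```

The result has 20 elements: the values 10 to 19 followed by 60 to 69. No new axis is added; the dataset just gets longer.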
Solution 2:
If by stacking you mean what tf.stack() and np.stack() do:
Stacks a list of rank-R tensors into one rank-(R+1) tensor.
https://www.tensorflow.org/api_docs/python/tf/stack
Join a sequence of arrays along a new axis.
https://docs.scipy.org/doc/numpy/reference/generated/numpy.stack.html
then I believe the closest you can come with a tf.data.Dataset is Dataset.zip():
@staticmethod
zip(datasets)
Creates a Dataset by zipping together the given datasets.
https://www.tensorflow.org/api_docs/python/tf/data/Dataset?version=stable#zip
This lets you iterate over multiple datasets in lockstep: each element of the zipped dataset is a tuple holding the corresponding elements of the original datasets, much like iterating over the first axis of a stack()ed tensor or matrix.
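You can then also use .map(lambda *t: tf.stack(t)) or .map(lambda *t: tf.stack(t, axis=-1)) to stack the tensors along a new dimension at the front or back, respectively. (Passing tf.stack directly to map() does not work on a zipped dataset, because map() unpacks each tuple into separate positional arguments, so the second tensor would be taken as the axis.)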
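As a rough illustration of zipping and then stacking, here is a sketch assuming TensorFlow 2.x with eager execution. The datasets ds_a and ds_b and their values are made up for the example. The zipped components are gathered with *t because map() passes them as separate arguments.

```python
import tensorflow as tf

# Two hypothetical datasets, each with three elements of shape (2,).
ds_a = tf.data.Dataset.from_tensor_slices([[1, 2], [3, 4], [5, 6]])
ds_b = tf.data.Dataset.from_tensor_slices([[10, 20], [30, 40], [50, 60]])

# Each zipped element is a tuple (a, b) of corresponding elements.
zipped = tf.data.Dataset.zip((ds_a, ds_b))

# New axis at the front: each element becomes [a, b] with shape (2, 2).
stacked_front = zipped.map(lambda *t: tf.stack(t, axis=0))
# New axis at the back: a and b become the columns of each element.
stacked_back = zipped.map(lambda *t: tf.stack(t, axis=-1))

front_values = [x.tolist() for x in stacked_front.as_numpy_iterator()]
back_values = [x.tolist() for x in stacked_back.as_numpy_iterator()]

print(front_values[0])  # [[1, 2], [10, 20]]
print(back_values[0])   # [[1, 10], [2, 20]]
```

The number of elements stays at three in both cases; only the shape of each element changes, from (2,) to (2, 2).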
If you instead want what tf.concat() and np.concatenate() do, use Dataset.concatenate().