What does tf.gfile do in TensorFlow?

Solution 1:

For anyone landing here, the following answer was provided by a Googler on the question "Why use tensorflow gfile? (for file I/O)":

The main roles of the tf.gfile module are:

  1. To provide an API that is close to Python's file objects, and

  2. To provide an implementation based on TensorFlow's C++ FileSystem API.

The C++ FileSystem API supports multiple file system implementations, including local files, Google Cloud Storage (using a gs:// prefix), and HDFS (using an hdfs:// prefix). TensorFlow exports these as tf.gfile, so that you can use these implementations for saving and loading checkpoints, writing TensorBoard logs, and accessing training data (among other uses). However, if all of your files are local, you can use the regular Python file API without any problem.
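To make the "close to Python's file objects" point concrete, here is a minimal sketch of writing and reading a file through GFile. It assumes a TensorFlow version that exposes the module as tf.io.gfile (older 1.x code spells it tf.gfile.GFile). The helper names write_text and read_text and the gs:// bucket in the comment are made up for illustration, and the fallback to the built-in open is only there so the snippet runs without TensorFlow installed.

```python
import os
import tempfile

try:
    import tensorflow as tf
    # Assumption: the module lives at tf.io.gfile (older 1.x code used tf.gfile.GFile).
    open_file = tf.io.gfile.GFile
except ImportError:
    # Fallback so the sketch still runs without TensorFlow; local files only.
    open_file = open


def write_text(path, text):
    """Write text to path using the file-object style API."""
    with open_file(path, "w") as f:
        f.write(text)


def read_text(path):
    """Read the whole file at path back as a string."""
    with open_file(path, "r") as f:
        return f.read()


# A local path is used here. With GFile, a URI such as
# "gs://my-bucket/notes.txt" (hypothetical) would go through the same calls,
# provided the installed TensorFlow build supports that file system.
local_path = os.path.join(tempfile.mkdtemp(), "notes.txt")
write_text(local_path, "hello gfile\n")
content = read_text(local_path)
print(content)
```

The only thing that would change between local and remote storage is the path string, which is why the same code can be pointed at checkpoints or training data wherever they live.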

Solution 2:

As you correctly point out, tf.gfile is an abstraction for accessing the file system and is documented here. It is recommended over the plain Python file API because it gives you some portability: the same code can work with any file system TensorFlow supports, not only local disk.