As @Feng has already noted, reading files from Drive is very slow. This tutorial suggests using a memory-mapped file format such as HDF5 or LMDB to overcome this issue. That way the I/O operations are much faster (for a complete explanation of the speed gain of the HDF5 format, see this).
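A minimal sketch of the idea, assuming you have already packed your dataset into an HDF5 file on the local disk with datasets named 'images' and 'labels' (the file path and both names are placeholders):

import h5py

# Open the HDF5 file once; slicing then reads only the requested chunk
# from local disk instead of doing one Drive round-trip per image.
with h5py.File('/content/dataset.h5', 'r') as f:
    images = f['images']            # h5py Dataset, read lazily
    labels = f['labels']
    batch = images[0:32]            # loads just these 32 samples
    batch_labels = labels[0:32]
    print(batch.shape, batch_labels.shape)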


It's very slow to read files from Google Drive.

For example, I have one big file (39 GB).

It took more than 10 minutes to run '!cp drive/big.file /content/'.

After I shared my file and got the URL from Google Drive, it took only 5 minutes to run '!wget -c -O big.file http://share.url.from.drive'. The download speed can reach up to 130 MB/s.
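Putting the two approaches side by side in a Colab cell (the file name and share URL are placeholders carried over from above):

# Slow: copies through the mounted Drive (FUSE) layer.
!cp drive/big.file /content/

# Faster: direct HTTP download of the same file via its share link.
!wget -c -O /content/big.file http://share.url.from.drive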


Reading files from Google Drive slows down your training process. The solution is to upload a zip file of the dataset, copy it into Colab's local storage, and unzip it there, as sketched below. Hope it is clear for you.
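A minimal sketch, assuming the archive was uploaded to Drive as dataset.zip (all paths and names are placeholders):

import shutil
import zipfile

# Copy the archive once from the mounted Drive to Colab's local disk.
shutil.copy('/content/drive/MyDrive/dataset.zip', '/content/dataset.zip')

# Extract locally; subsequent reads hit the fast local disk, not Drive.
with zipfile.ZipFile('/content/dataset.zip', 'r') as archive:
    archive.extractall('/content/dataset')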


I had the same issue, and here's how I solved it.

First, make sure the GPU is enabled (it is not by default) by going to Runtime -> Change runtime type and choosing GPU as your Hardware accelerator.
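To confirm that TensorFlow actually sees the GPU, you can run:

import tensorflow as tf

# Prints a non-empty list when a GPU runtime is active.
print(tf.config.list_physical_devices('GPU'))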

Then, as shown here, you can use the cache() and prefetch() functions to optimize performance. Example:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Load dataset
train_ds = keras.preprocessing.image_dataset_from_directory('Data/train', labels='inferred')
val_ds = keras.preprocessing.image_dataset_from_directory('Data/test', labels='inferred')

# Standardize data (optional)
normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)
train_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
val_ds = val_ds.map(lambda x, y: (normalization_layer(x), y))

# Cache to RAM and overlap preprocessing with training (optional)
AUTOTUNE = tf.data.experimental.AUTOTUNE
train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

# Train (model is assumed to be defined and compiled elsewhere)
model.fit(train_ds, validation_data=val_ds, epochs=3)
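If the dataset does not fit in RAM, cache() also accepts a file path: the first epoch then writes a cache to Colab's local disk and later epochs read from it (the path here is a placeholder):

# Cache to a local file instead of RAM for datasets larger than memory.
train_ds = train_ds.cache('/content/train.cache').prefetch(buffer_size=AUTOTUNE)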

I was facing the same issue. Here's how I solved it:

  1. Upload the zip file of the dataset to Google Drive.
  2. Mount the drive in Colab and then unzip the dataset file into a separate folder (other than ../drive) in Colab itself, as sketched after this list.
  3. Do your business.
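A hedged sketch of steps 2 and 3, assuming the archive was uploaded to Drive as dataset.zip (all paths are placeholders):

from google.colab import drive
import zipfile

# Step 2: mount Drive, then extract into a local folder outside /content/drive.
drive.mount('/content/drive')
with zipfile.ZipFile('/content/drive/MyDrive/dataset.zip', 'r') as archive:
    archive.extractall('/content/dataset')

# Step 3: point your data loaders at the local copy in /content/dataset.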

It worked for me. I don't know the exact reason, but since Colab accesses its local directory faster than the mounted Drive directory, that seems to be the gist of the problem.