Google Colab is very slow compared to my PC
As @Feng has already noted, reading files from Drive is very slow. This tutorial suggests using a memory-mapped file format such as HDF5 or LMDB to overcome this issue. That way the I/O operations are much faster (for a complete explanation of the speed gain of the HDF5 format, see this).
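HDF5 and LMDB need extra libraries, but the underlying memory-mapping idea can be sketched with Python's standard mmap module; the file created below is just a small stand-in for a large dataset file:

```python
import mmap
import os
import tempfile

# Create a small binary file to stand in for a large dataset file
# (on Colab this would be your HDF5/LMDB file; the path is an assumption):
path = os.path.join(tempfile.mkdtemp(), 'sample.bin')
with open(path, 'wb') as f:
    f.write(bytes(range(256)) * 4)   # 1024 bytes of sample data

# Memory-map the file: slices are read on demand from disk,
# without loading the whole file into memory first.
with open(path, 'rb') as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        chunk = mm[100:200]          # random access into the file
print(len(chunk))  # 100
```

This is why formats built on memory mapping avoid the per-read overhead that makes many small Drive reads so slow.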
It's very slow to read files from Google Drive.
For example, I have one big file (39 GB).
It took more than 10 minutes when I ran '!cp drive/big.file /content/'.
After I shared my file and got the URL from Google Drive, it took only 5 minutes when I ran '!wget -c -O big.file http://share.url.from.drive'. The download speed can reach 130 MB/s.
Reading files from Google Drive slows down your training process. The solution is to upload a zip file to Colab and unzip it there. Hope that is clear for you.
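As a rough sketch, the unzip step can also be done from Python with the standard zipfile module; the Drive and destination paths in the comment are assumptions, not fixed Colab paths:

```python
import zipfile
from pathlib import Path

def unzip_to_local(zip_path, dest):
    """Extract an archive onto the local disk; returns the number of entries."""
    Path(dest).mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
        return len(zf.namelist())

# On Colab the call might look like (paths are assumptions):
# unzip_to_local('/content/drive/MyDrive/dataset.zip', '/content/dataset')
```

Copying one big zip is a single sequential read, which is much faster than thousands of small per-file reads from the mounted Drive.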
I had the same issue, and here's how I solved it.
First, make sure the GPU is enabled (it is not by default) by going to Runtime -> Change runtime type and choosing GPU as your Hardware accelerator.
Then, as shown here, you can use the cache() and prefetch() functions to optimize performance. Example:
```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# Load dataset
train_ds = keras.preprocessing.image_dataset_from_directory('Data/train', labels="inferred")
val_ds = keras.preprocessing.image_dataset_from_directory('Data/test', labels="inferred")

# Standardize data (optional)
normalization_layer = layers.experimental.preprocessing.Rescaling(1./255)
train_ds = train_ds.map(lambda x, y: (normalization_layer(x), y))
val_ds = val_ds.map(lambda x, y: (normalization_layer(x), y))

# Cache to RAM and prefetch (optional)
AUTOTUNE = tf.data.experimental.AUTOTUNE
train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)

# Train
model.fit(train_ds, validation_data=val_ds, epochs=3)
```
I was facing the same issue. Here's how I solved it:
- Upload the zip file of the dataset to Google Drive.
- Mount the drive in Colab, then unzip the dataset file into a separate folder (other than ../drive) in Colab itself.
- Do your business.
It worked for me. I don't know the exact reason, but since Colab accesses its local directory faster than the mounted drive directory, that may be the gist of the problem.
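If you want to check the local-vs-Drive difference yourself, a small timing helper like this sketch can be pointed at both directories; the Colab paths in the comments are assumptions:

```python
import os
import time

def time_small_reads(root):
    """Read every file under root once; return elapsed wall-clock seconds."""
    start = time.perf_counter()
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            with open(os.path.join(dirpath, name), 'rb') as f:
                f.read()
    return time.perf_counter() - start

# On Colab, compare the two locations (paths are assumptions):
# time_small_reads('/content/drive/MyDrive/dataset')  # mounted Drive: slow
# time_small_reads('/content/dataset')                # local copy: fast
```

For image datasets made of many small files, the gap between the two numbers is usually dramatic, which is why unzipping to local disk helps so much.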