ImportError: libcublas.so.10.0: cannot open shared object file: No such file or directory

Solution 1:

I downloaded CUDA 10.0 from NVIDIA's CUDA 10.0 download page.

Then I installed it using the following commands:

sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
sudo apt-get update
sudo apt-get install cuda-10-0

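I then installed cuDNN v7.5.0 for CUDA 10.0 from the cuDNN download page; you need to log in with an NVIDIA developer account to access it.

After choosing the correct version, I downloaded the archive (the Power build in my case, hence the ppc64le paths below; adjust the paths to match the layout of the archive you download) and copied the cuDNN include and lib files into the CUDA 10.0 tree: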

sudo cp -P cuda/targets/ppc64le-linux/include/cudnn.h /usr/local/cuda-10.0/include/
sudo cp -P cuda/targets/ppc64le-linux/lib/libcudnn* /usr/local/cuda-10.0/lib64/
sudo chmod a+r /usr/local/cuda-10.0/lib64/libcudnn*

Finally, I updated .bashrc so that PATH and LD_LIBRARY_PATH include CUDA 10.0. If your .bashrc does not already contain these lines, add them:

export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
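
The `${PATH:+:${PATH}}` expansion adds `:` plus the old value only when the variable is already set and non-empty, so the result never contains an empty entry (the dynamic loader treats an empty entry in LD_LIBRARY_PATH as the current directory). A minimal Python sketch of the same logic follows; `prepend_path` is a hypothetical helper name used only for illustration.

```python
def prepend_path(new_dir, current):
    """Mimic NEW${VAR:+:${VAR}}: join with ':' only if current is non-empty."""
    if current:
        return f"{new_dir}:{current}"
    return new_dir


# Variable unset or empty: no separator is added.
print(prepend_path("/usr/local/cuda-10.0/lib64", ""))
# Variable already set: the new directory goes in front, separated by one ':'.
print(prepend_path("/usr/local/cuda-10.0/lib64", "/opt/lib"))
```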

And after all these steps, I managed to import tensorflow in python3 successfully.
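
To check the loader directly, without importing TensorFlow, something like the following sketch may help. It asks the dynamic loader to open the libraries by name through `ctypes`; `can_load` is a hypothetical helper, and the second library name (`libcudnn.so.7`) is an assumption based on the cuDNN 7 install above. Run it from a new shell so the .bashrc changes are in effect.

```python
import ctypes


def can_load(soname):
    """Return True if the dynamic loader can open the shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the file is missing or cannot be loaded.
        return False
    return True


for name in ("libcublas.so.10.0", "libcudnn.so.7"):
    print(name, "loadable" if can_load(name) else "NOT found by the loader")
```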

Solution 2:

If you are using CUDA 10.1 (as directed in https://www.tensorflow.org/install/gpu), the problem is that libcublas.so.10 was moved out of the cuda-10.1 directory and into cuda-10.2 (!).

Copying from this answer: https://github.com/tensorflow/tensorflow/issues/26182#issuecomment-684993950

... libcublas.so.10 sits in /usr/local/cuda-10.2/lib64 (surprise from nvidia - installation of 10.1 installs some 10.2 stuff) but only /usr/local/cuda is in include path which points to /usr/local/cuda-10.1.

The fix is to add that directory to your library search path (LD_LIBRARY_PATH):

export LD_LIBRARY_PATH=/usr/local/cuda-10.2/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Note: This fix is known to work in Cuda 10.1, V10.1.243 (print your version with nvcc -V).
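
If you are not sure which directory actually holds libcublas on your machine, a small search like the sketch below can tell you what to put in LD_LIBRARY_PATH. The helper name `find_library_dirs` and the two subdirectories it checks (`lib64` and `targets/x86_64-linux/lib`) are assumptions taken from the paths mentioned in this thread; adjust them for other layouts.

```python
import glob
import os


def find_library_dirs(name_prefix, roots_glob="/usr/local/cuda*"):
    """Return sorted real directories under the CUDA roots holding name_prefix*."""
    hits = set()
    for root in glob.glob(roots_glob):
        for sub in ("lib64", os.path.join("targets", "x86_64-linux", "lib")):
            candidate = os.path.join(root, sub)
            if glob.glob(os.path.join(candidate, name_prefix + "*")):
                # realpath collapses symlinks such as /usr/local/cuda -> cuda-10.1
                hits.add(os.path.realpath(candidate))
    return sorted(hits)


print(find_library_dirs("libcublas.so"))
```

An empty list means no matching file was found under /usr/local/cuda*, in which case the library is either not installed or lives somewhere else.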

Solution 3:

CUDA 10.1 (installed as per the TensorFlow docs) fails with errors saying libcublas.so.10.0 cannot be found. The libraries exist in /usr/local/cuda-10.1/targets/x86_64-linux/lib/ but not under the name TensorFlow is looking for.

Another (now lost) Stack Overflow post said this was a pinned-dependency issue with the package that could be fixed by passing an extra CLI flag to apt. That did not fix the issue for me.

The workaround I tested is to modify the installation instructions to downgrade to CUDA 10.0:

# Uninstall packages from tensorflow installation instructions 
sudo apt-get remove cuda-10-1 \
    libcudnn7 \
    libcudnn7-dev \
    libnvinfer6 \
    libnvinfer-dev \
    libnvinfer-plugin6

# WORKS: Downgrade to CUDA-10.0
sudo apt-get install -y --no-install-recommends \
    cuda-10-0 \
    libcudnn7=7.6.4.38-1+cuda10.0  \
    libcudnn7-dev=7.6.4.38-1+cuda10.0;
sudo apt-get install -y --no-install-recommends \
    libnvinfer6=6.0.1-1+cuda10.0 \
    libnvinfer-dev=6.0.1-1+cuda10.0 \
    libnvinfer-plugin6=6.0.1-1+cuda10.0;

Upgrading to CUDA 10.2 seems to suffer from the same problem:

# BROKEN: Upgrade to CUDA-10.2 
# use `apt show -a libcudnn7 libnvinfer7` to find 10.2 compatible version numbers
sudo apt-get install -y --no-install-recommends \
    cuda-10-2 \
    libcudnn7=7.6.5.32-1+cuda10.2  \
    libcudnn7-dev=7.6.5.32-1+cuda10.2;
sudo apt-get install -y --no-install-recommends \
    libnvinfer7=7.0.0-1+cuda10.2 \
    libnvinfer-dev=7.0.0-1+cuda10.2 \
    libnvinfer-plugin7=7.0.0-1+cuda10.2;

Test GPU Visibility in Python

python3
>>> import tensorflow as tf
>>> tf.test.is_gpu_available()
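
The same check can be run as a script that reports the ImportError instead of stopping on it. This is a sketch, not an official recipe: `gpu_status` is a hypothetical helper, and it prefers `tf.config.list_physical_devices`, which I understand to be available from TensorFlow 2.1, falling back to the call shown above on older releases.

```python
def gpu_status():
    """Return a one-line summary of TensorFlow import and GPU visibility."""
    try:
        import tensorflow as tf
    except ImportError as exc:
        # A missing libcublas.so.* surfaces here as an ImportError.
        return f"import failed: {exc}"
    if hasattr(tf.config, "list_physical_devices"):
        gpus = tf.config.list_physical_devices("GPU")
        return f"tensorflow {tf.__version__}: {len(gpus)} GPU(s) visible"
    # Older releases: use the check from the interactive session above.
    return f"tensorflow {tf.__version__}: GPU available = {tf.test.is_gpu_available()}"


print(gpu_status())
```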

FutureWarnings on tensorflow import

https://github.com/tensorflow/tensorflow/issues/30427

Two solutions:

  • pip3 install tf-nightly-gpu
  • pip3 install "numpy<1.17"

Update:

You also need a TensorFlow version that matches your CUDA version.

Tensorflow / CUDA version combinations:

  • Tensorflow v2.x does not support CUDA 9 (Ubuntu 18.04 default)
  • Tensorflow v2.1.0 works with CUDA 10.1
  • Tensorflow v2.0.0 works with CUDA 10.0

For the full list, see: https://www.tensorflow.org/install/source#tested_build_configurations
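
As a quick lookup, the pairs above can be kept in a small table. The sketch below contains only the two combinations stated in this answer, not the full official list, and the names `TF_TO_CUDA` and `tf_versions_for` are made up for illustration.

```python
# Only the pairs stated in this thread; the official table lists many more.
TF_TO_CUDA = {
    "2.1.0": "10.1",
    "2.0.0": "10.0",
}


def tf_versions_for(cuda_version):
    """Return the TensorFlow versions in TF_TO_CUDA that match a CUDA version."""
    return sorted(tf for tf, cuda in TF_TO_CUDA.items() if cuda == cuda_version)


print(tf_versions_for("10.0"))
print(tf_versions_for("10.1"))
```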

You may need to reinstall TensorFlow, pinning a version that matches your CUDA:

pip uninstall tensorflow tensorflow-gpu
pip install tensorflow==2.1.0 tensorflow-gpu==2.1.0
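
To confirm which versions pip actually installed, without importing TensorFlow (which would fail if the CUDA libraries are still missing), a sketch using the standard-library `importlib.metadata` module (Python 3.8 or newer) might look like this. `installed_version` is a hypothetical helper name.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a pip package, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for pkg in ("tensorflow", "tensorflow-gpu"):
    print(pkg, installed_version(pkg))
```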

Then add CUDA to $PATH and $LD_LIBRARY_PATH in ~/.bashrc

~/.bashrc

# CUDA Environment Setup: https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#environment-setup
for CUDA_BIN_DIR in `find /usr/local/cuda-*/bin   -maxdepth 0`; do export PATH="$PATH:$CUDA_BIN_DIR"; done;
for CUDA_LIB_DIR in `find /usr/local/cuda-*/lib64 -maxdepth 0`; do export LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}$CUDA_LIB_DIR"; done;

export            PATH=`echo $PATH            | tr ':' '\n' | awk '!x[$0]++' | tr '\n' ':' | sed 's/:$//g'` # Deduplicate $PATH
export LD_LIBRARY_PATH=`echo $LD_LIBRARY_PATH | tr ':' '\n' | awk '!x[$0]++' | tr '\n' ':' | sed 's/:$//g'` # Deduplicate $LD_LIBRARY_PATH
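
For reference, the deduplication pipeline above (split on `:`, keep the first occurrence of each entry, join again) can be sketched in Python as follows. `dedupe_path` is a hypothetical name, and unlike the awk version this sketch also drops empty entries.

```python
def dedupe_path(value, sep=":"):
    """Drop repeated and empty entries, keeping first occurrences in order."""
    entries = [entry for entry in value.split(sep) if entry]
    # dict.fromkeys keeps insertion order, so the first occurrence wins.
    return sep.join(dict.fromkeys(entries))


print(dedupe_path("/usr/local/cuda-10.0/bin:/usr/bin:/usr/local/cuda-10.0/bin"))
```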

Solution 4:

This error occurs when the installed versions of CUDA and TensorFlow are not compatible. I encountered a similar ImportError while running TensorFlow 1.13.0 with CUDA 9. Since I had installed TensorFlow in a virtual environment with pip, I just uninstalled TensorFlow 1.13.0 and installed TensorFlow 1.12.0 as follows:

    pip uninstall tensorflow-gpu tensorflow-estimator tensorboard
    pip install tensorflow-gpu==1.12.0

Everything now works.
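
The error message itself names the library and version suffix that the loader could not find, which is a useful hint about which CUDA release your TensorFlow build expects. A minimal sketch for pulling that out of the message, assuming it has the same shape as the error in the question (`missing_library` is a hypothetical helper):

```python
import re

# Matches e.g. "libcublas.so.10.0: cannot open shared object file"
_MISSING_LIB = re.compile(r"(lib\w+)\.so\.([\d.]+): cannot open shared object file")


def missing_library(message):
    """Return (library name, version suffix) from a loader error, or None."""
    match = _MISSING_LIB.search(message)
    return (match.group(1), match.group(2)) if match else None


err = ("ImportError: libcublas.so.10.0: cannot open shared object file: "
       "No such file or directory")
print(missing_library(err))
```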

Solution 5:

As CalderBot mentioned, you can also copy the libraries from the cuda-10.2 directory into cuda-10.1:

sudo cp -r /usr/local/cuda-10.2/lib64/libcu* /usr/local/cuda-10.1/lib64/