Machine Learning on external GPU with CUDA and late MBP 2016?
Solution 1:
I was finally able to get Nvidia Titan Xp + MacBook Pro + Akitio Node + TensorFlow + Keras working together.
I wrote a gist with the procedure:
https://gist.github.com/jganzabal/8e59e3b0f59642dd0b5f2e4de03c7687
Here is what I did. This configuration worked for me, hope it helps.
It is based on: https://becominghuman.ai/deep-learning-gaming-build-with-nvidia-titan-xp-and-macbook-pro-with-thunderbolt2-5ceee7167f8b
and on: https://stackoverflow.com/questions/44744737/tensorflow-mac-os-gpu-support
Hardware
- Nvidia Video Card: Titan Xp
- EGPU: Akitio Node
- MacBook Pro (Retina, 13-inch, Early 2015)
- Apple Thunderbolt3 to Thunderbolt2 Adapter
- Apple Thunderbolt2 Cable
Software versions
- macOS Sierra Version 10.12.6
- GPU Driver Version: 10.18.5 (378.05.05.25f01)
- CUDA Driver Version: 8.0.61
- cuDNN v5.1 (Jan 20, 2017), for CUDA 8.0 (you need to register to download it)
- tensorflow-gpu 1.0.0
- Keras 2.0.8
Procedure:
Install GPU driver
- Shut down your system, then power it up again while holding the ⌘ and R keys until you see the Apple logo; this boots you into Recovery Mode.
- From the menu bar, click Utilities > Terminal, type 'csrutil disable; reboot' and press Enter to execute the command. This disables System Integrity Protection (SIP) and restarts the machine.
- When your Mac has restarted, run these commands in Terminal:
cd ~/Desktop; git clone https://github.com/goalque/automate-eGPU.git
chmod +x ~/Desktop/automate-eGPU/automate-eGPU.sh
sudo ~/Desktop/automate-eGPU/automate-eGPU.sh
- Unplug your eGPU from your Mac and restart. This is important: if you do not unplug your eGPU, you may end up with a black screen after restarting.
- When your Mac has restarted, open Terminal and execute this command:
sudo ~/Desktop/automate-eGPU/automate-eGPU.sh -a
- Plug your eGPU into your Mac via Thunderbolt 2.
- Restart your Mac.
Install CUDA, cuDNN, Tensorflow and Keras
At the moment, Keras 2.0.8 needs TensorFlow 1.0.0. tensorflow-gpu 1.0.0 needs CUDA 8.0, and cuDNN v5.1 is the version that worked for me. I tried other combinations but they didn't seem to work.
- Download CUDA 8.0 (CUDA Toolkit 8.0 GA2, Feb 2017)
- Install it and follow the instructions
- Set the environment variables by adding these lines to ~/.bash_profile (e.g. vim ~/.bash_profile):
export CUDA_HOME=/usr/local/cuda
export DYLD_LIBRARY_PATH="$CUDA_HOME/lib:$CUDA_HOME:$CUDA_HOME/extras/CUPTI/lib"
export LD_LIBRARY_PATH=$DYLD_LIBRARY_PATH
(If ~/.bash_profile does not exist, create it. It is executed every time you open a terminal window.)
- Download and install cuDNN (cudnn-8.0-osx-x64-v5.1). You need to register before downloading it.
- Copy the cuDNN files into the CUDA directory (an optional load check follows this list):
cd ~/Downloads/cuda
sudo cp include/* /usr/local/cuda/include/
sudo cp lib/* /usr/local/cuda/lib/
- Create an environment and install tensorflow:
conda create -n egpu python=3
source activate egpu
pip install tensorflow-gpu==1.0.0
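As an optional check, you can confirm from plain Python (no TensorFlow needed) that the CUDA and cuDNN dylibs copied above are loadable. This is a sketch of my own: the library names are taken from the TensorFlow log output shown further below, and the /usr/local/cuda/lib path is assumed to match the copy step above.
import ctypes

# Paths/names assume the default CUDA 8.0 install plus the cuDNN copy step above
for lib in ('/usr/local/cuda/lib/libcublas.8.0.dylib',
            '/usr/local/cuda/lib/libcudnn.5.dylib'):
    ctypes.CDLL(lib)  # raises OSError if the library is missing or unloadable
    print('loaded', lib)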
Verify it works
Run the following script:
import tensorflow as tf

with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

with tf.Session() as sess:
    print(sess.run(c))
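To also see which device each op actually ran on, the same script can be run with device placement logging turned on. This is a small sketch using TensorFlow 1.x's standard log_device_placement option; it is not part of the original procedure.
import tensorflow as tf

with tf.device('/gpu:0'):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
    c = tf.matmul(a, b)

# log_device_placement=True makes TensorFlow print the device chosen for each op
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(c))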
- Install Keras in the environment and set TensorFlow as the backend:
pip install --upgrade --no-deps keras  # the --no-deps flag prevents pip from installing the tensorflow dependency
KERAS_BACKEND=tensorflow python -c "from keras import backend"
Output:
Using TensorFlow backend.
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcublas.8.0.dylib locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcudnn.5.dylib locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcufft.8.0.dylib locally
I tensorflow/stream_executor/dso_loader.cc:126] Couldn't open CUDA library libcuda.1.dylib. LD_LIBRARY_PATH: /usr/local/cuda/lib:/usr/local/cuda:/usr/local/cuda/extras/CUPTI/lib
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcuda.dylib locally
I tensorflow/stream_executor/dso_loader.cc:135] successfully opened CUDA library libcurand.8.0.dylib locally
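As an extra sanity check (not part of the original steps), a tiny Keras model trained on random data should trigger the same CUDA library messages and show the GPU being used during training; the layer sizes and toy data here are arbitrary.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# Toy data: 1000 samples with 20 features and binary labels
x = np.random.random((1000, 20))
y = np.random.randint(2, size=(1000, 1))

model = Sequential()
model.add(Dense(64, activation='relu', input_dim=20))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])

# If the eGPU is picked up, the TensorFlow session created here will log the GPU
model.fit(x, y, epochs=2, batch_size=32)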
Solution 2:
I was able to get an NVIDIA GTX 1080 Ti working on the Akitio Node with my iMac (late 2013). I'm using a Thunderbolt 2 > 3 adapter, though on newer Macs you can use the faster TB3 directly.
There are various eGPU set-ups described at eGPU.io, and you might find one that describes your computer/enclosure/card combination precisely. These tutorials are mostly about accelerating a display with an eGPU, though for training NNs you obviously don't need to follow all of the steps.
Here's roughly what I did:
- Install CUDA according to official documentation.
- Disable SIP (Google for a tutorial). Disabling it is needed by the automate-eGPU.sh script and later also by TensorFlow.
- Run the automate-eGPU.sh script (with sudo) that everybody at eGPU.io seems to rely on.
- Install cuDNN. The files from NVIDIA's website should go under /usr/local/cuda with the rest of your CUDA libraries and includes.
- Uninstall CPU-only TensorFlow and install one with GPU support. When installing with pip install tensorflow-gpu, I had no installation errors, but got a segfault when requiring TensorFlow in Python. It turns out there are some environment variables that have to be set (a bit differently than the CUDA installer suggests), which were described in a GitHub issue comment. (A quick device check follows this list.)
- I also tried compiling TensorFlow from source, which didn't work before I set the env vars as described in the previous step.
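For a quick check that the GPU build of TensorFlow actually sees the eGPU, you can list the local devices from Python. This is a sketch using TensorFlow's device_lib helper; the exact device names it prints will depend on your card and setup.
from tensorflow.python.client import device_lib

# Expect an entry with device_type 'GPU' (e.g. '/gpu:0') in addition to the CPU
for device in device_lib.list_local_devices():
    print(device.name, device.device_type)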
From iStat Menus I can verify that my external GPU is indeed used during training. This TensorFlow installation didn't work with Jupyter, though; hopefully there's a workaround for that.
I haven't used this set-up much, so I'm not sure about the performance increase (or bandwidth limitations), but eGPU + TensorFlow/CUDA is certainly possible now that NVIDIA has started releasing proper drivers for macOS.