TensorFlow with GPU on your Mac
Installing CUDA, cuDNN and TensorFlow on a Mac

As part of Udacity’s Self-Driving Car Nanodegree, I had the opportunity to use a GPU-powered server for the Traffic Sign Classifier and Behavioral Cloning projects in Term 1. Using that hardware was not the painful experience I was expecting, because Udacity provided an AMI with the necessary software already installed, and I didn’t need to install anything else. The only problem I encountered was updating the NVIDIA driver, and that was easily done. During that process, I read a bit about GPUs, CUDA and cuDNN. It was awesome to see this development and the application of these platforms to Deep Learning. My Mac has an NVIDIA video card, so I was up for local adventures too!

CUDA Prerequisites

To use GPU-powered TensorFlow on your Mac, there are multiple system requirements and libraries to install. Here is a summary of those requirements and steps, each covered in detail below:

- A GPU card with CUDA Compute Capability 3.0 or higher
- The NVIDIA CUDA driver
- The CUDA Toolkit 8.0 (which needs Xcode and the native command-line tools)
- cuDNN v5.1
- A Python environment (Conda in my case) in which to install TensorFlow with GPU support

When all of that is installed and checked, TensorFlow with GPU support can be installed. I don’t know about you, but this is a long list to me. Nevertheless, I could see great performance improvements from using GPUs in my experiments, so it is worth trying to set it up locally if you already have the hardware. This article describes the process of setting up CUDA and TensorFlow with GPU support in a Conda environment. It is not the only way to do it, but I want to leave it resting somewhere I can find it if I need it in the future, and also share it to help anybody else with the same objective. And the journey begins!

Check that you have a CUDA-capable GPU card with CUDA Compute Capability 3.0 or higher.

First, you need to know your video card. Go to “About This Mac” and get the model from there:

About This Mac

In my case, it is an NVIDIA GeForce GT 750M. Then you need to see if the card is supported by CUDA by finding your card here:

Card supported

Now that you have hardware support confirmed, let’s move forward and install the driver.

Install the CUDA Driver.

There is an option to install the driver when you install the CUDA Toolkit 8.0, but I preferred to install the driver first, to make sure I had the latest version. Go to this URL and download the latest version. At the time of writing, it is 8.0.83:

CUDA driver

Install CUDA Toolkit 8.0

You can find the installation steps for Mac OS X here. There are some system requirements:

- A CUDA-capable GPU
- A supported version of Mac OS X
- Xcode with the Clang compiler and the native command-line tools
- The NVIDIA CUDA Toolkit itself

The first two requirements are met at this point; let’s get to the last two.

Install Xcode and native command line tools

I didn’t have to install Xcode because I already had it installed, but here is a tutorial on how to do it. The tutorial also covers the installation of the command-line tools. In my case, I just needed to install them with xcode-select --install. It is always good to verify the installation with /usr/bin/cc --version; you should see an Apple LLVM/clang version printed.
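
For reference, the two commands together look roughly like this (the exact version string printed depends on your Xcode release):

    # Install the native command-line tools (skip if already installed)
    xcode-select --install
    # Verify the compiler is available; it should report an Apple LLVM/clang version
    /usr/bin/cc --version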

Download and install the CUDA Toolkit

Go to this URL to download the toolkit for the appropriate OS, architecture, and version:

CUDA Toolkit download

Optionally, verify the download was correct with an MD5 checksum: openssl md5 <THE_FILE_YOU_DOWNLOAD>.

Double-click the file and follow the installation wizard. On the package selection screen, uncheck the CUDA Drivers because they were installed before. When the installation finishes, add the following to your .bash_profile:
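
The exact paths depend on where the toolkit was installed; by default it goes under /Developer/NVIDIA/CUDA-8.0, with /usr/local/cuda symlinked to it, so something along these lines should work:

    # CUDA Toolkit location (default install path on macOS)
    export CUDA_HOME=/usr/local/cuda
    # Make nvcc and the other CUDA tools available on the PATH
    export PATH="$CUDA_HOME/bin:$PATH"
    # Let the dynamic linker find the CUDA libraries
    export DYLD_LIBRARY_PATH="$CUDA_HOME/lib:$DYLD_LIBRARY_PATH"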

It is always good to verify that the driver is loaded and running.
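
A minimal check, assuming the driver installed its default kernel extension, is to grep for it:

    # The CUDA driver loads as a kernel extension; look for it
    kextstat | grep -i cuda
    # The output should include a com.nvidia.CUDA entry with the driver version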

Compile samples

Now everything CUDA-related should be installed correctly, but we can have some fun compiling and running the CUDA samples to further verify that everything is indeed installed properly:
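
The first attempt was something along these lines, building one of the utility samples in place (assuming the default install location under /Developer/NVIDIA/CUDA-8.0):

    # Try building the deviceQuery sample where the toolkit installed it
    cd /Developer/NVIDIA/CUDA-8.0/samples/1_Utilities/deviceQuery
    make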

And the following error happens: nvcc rejects the host compiler, reporting its version (‘80100’) as unsupported!

After googling it, I found this is a known issue, described here. Following the steps suggested by mlloreda, downgrading to CLT 8.2 should work:
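
The gist of the workaround, as I understand it, is to install the Command Line Tools 8.2 package from Apple’s developer downloads and point xcode-select at the standalone tools; roughly:

    # After installing the Command Line Tools 8.2 package,
    # switch the active developer directory to the standalone tools
    sudo xcode-select --switch /Library/Developer/CommandLineTools
    # Confirm the compiler reported is now the older version
    /usr/bin/cc --version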

With all of that done, the ‘80100’ error is gone, but a new error arrived:

It turns out that there is no write permission in the directory where the samples live. You need to make a writable copy of the samples:
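
The toolkit ships a small helper script for exactly this; assuming it is on the PATH after the .bash_profile changes above, something like:

    # Copy the samples into a writable location under the home directory
    cuda-install-samples-8.0.sh ~/
    # Build deviceQuery again, this time from the writable copy
    cd ~/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery && make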

And this time, it works!!!

Now that we are at it, why not compile a few more:
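
For example, a couple more from the same samples tree (the directory names are as they appear in the CUDA samples):

    # Build two more samples from the writable copy
    cd ~/NVIDIA_CUDA-8.0_Samples/1_Utilities/bandwidthTest && make
    cd ~/NVIDIA_CUDA-8.0_Samples/0_Simple/vectorAdd && make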

Let’s run deviceQuery and see what happens:
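
From the writable copy, that is simply:

    # Run the freshly built deviceQuery binary
    ~/NVIDIA_CUDA-8.0_Samples/1_Utilities/deviceQuery/deviceQuery
    # The report should list the GPU (GeForce GT 750M here), its compute capability,
    # and end with "Result = PASS"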

All good so far. Let’s go ahead and compile all the samples with make. This takes a while to finish; there are a lot of samples there, and very interesting stuff. I just ran one more: bandwidthTest.
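
Something like this, from the top of the samples tree:

    # Build everything (this takes a while)
    cd ~/NVIDIA_CUDA-8.0_Samples && make
    # Run the bandwidth test; it should also finish with "Result = PASS"
    ./1_Utilities/bandwidthTest/bandwidthTest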

It is good to play with this, but we need to keep going to get to the TensorFlow part. CUDA is done; next up, cuDNN.

Install cuDNN v5.1

To download cuDNN, you need to create a developer account here and then proceed to the download section:

cuDNN download

I created a directory ~/cudnn and untarred the downloaded files there. After that is done, add the following to your .bash_profile:
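
Assuming the cuDNN v5.1 tarball for CUDA 8.0 ended up in ~/Downloads (the exact file name may differ), the steps look roughly like this:

    # Unpack cuDNN into its own directory; the tarball extracts a cuda/ folder
    mkdir -p ~/cudnn
    tar -xzvf ~/Downloads/cudnn-8.0-osx-x64-v5.1.tgz -C ~/cudnn
    # Then, in ~/.bash_profile, let the dynamic linker find the cuDNN libraries
    export DYLD_LIBRARY_PATH="$HOME/cudnn/cuda/lib:$DYLD_LIBRARY_PATH"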

This was an easy step!

Creating a Conda environment and installing TensorFlow

Even though Anaconda is not officially supported, the installation worked quite well:
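
A minimal sketch of what that looks like; the environment name and Python version are just my choices, and the tensorflow-gpu package version you end up with depends on what pip resolves at the time:

    # Create and activate a fresh Conda environment
    conda create -n tensorflow-gpu python=3.5
    source activate tensorflow-gpu
    # Install the GPU-enabled TensorFlow package with pip
    pip install --upgrade tensorflow-gpu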

Everything is set. Let’s verify the installation by running the TensorFlow code suggested in the “Validate your installation” section of the docs:
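
The snippet from the docs can be run as a one-liner inside the activated environment; when TensorFlow creates the session, the log should mention the GPU device if everything is wired up correctly:

    # The "Hello, TensorFlow!" check from the install docs
    python -c "import tensorflow as tf; print(tf.Session().run(tf.constant('Hello, TensorFlow!')))"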

Great! Everything looks like it is working. It was a long journey, but it was fun! There are a lot of things to learn and a lot of different weird messages in these scripts. It is just the beginning. There is a free Udacity course that looks promising: Intro to Parallel Programming. It could be interesting to see the difference between this “lower” level and other CPU-based platforms. I certainly like chickens more than oxen (a reference to the course trailer video)!

Enjoy!

*****
Written by Darien Martinez Torres on 07 June 2017