
[Quick Tutorial] Setting Up a Deep Learning Environment Fast: CUDA + TensorFlow + nvidia-docker

Introduction


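If you want to run deep learning code in an isolated environment, the best way is to use Docker containers to keep each program separate. The benefits of Docker won't be repeated here; what follows is the complete installation process.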

Step 1. Install the NVIDIA driver


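Start by running ubuntu-drivers devices to list the drivers available for your graphics card. If you have no particular preference, simply let autoinstall pick and install one.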
ubuntu-drivers devices
sudo ubuntu-drivers autoinstall

Rebooting after the installation is recommended.
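
If you would rather install one specific driver than rely on autoinstall, the recommended package can be pulled out of the ubuntu-drivers devices listing. The sketch below assumes the listing contains lines of the form "driver : <package> - ... recommended"; the helper name recommended_driver is hypothetical and only used for illustration.

```shell
# Print the package name from the first line that ubuntu-drivers marks as "recommended".
recommended_driver() {
    awk '/recommended/ && !seen { print $3; seen = 1 }'
}

if command -v ubuntu-drivers >/dev/null 2>&1; then
    pkg="$(ubuntu-drivers devices | recommended_driver)"
    echo "Recommended driver: ${pkg:-none found}"
    # To install it explicitly: sudo apt install "$pkg"
else
    echo "ubuntu-drivers not found; this helper only applies to Ubuntu" >&2
fi
```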


Step 2. Check that the installation succeeded


nvidia-smi can be used to monitor GPU utilization:
nvidia-smi

If the driver was installed successfully, you should see the current state of the GPU.
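
For scripted checks, nvidia-smi can also print selected fields as CSV instead of the full table. The following is a minimal sketch; the function name check_gpu and the NVIDIA_SMI override variable are hypothetical and exist only to make the check reusable.

```shell
# Print name, driver version, and memory usage for each GPU as CSV.
# Returns 1 if the nvidia-smi command cannot be found.
check_gpu() {
    smi="${NVIDIA_SMI:-nvidia-smi}"
    if ! command -v "$smi" >/dev/null 2>&1; then
        echo "$smi not found: the driver is probably not installed" >&2
        return 1
    fi
    "$smi" --query-gpu=name,driver_version,memory.used,memory.total --format=csv
}

check_gpu || echo "GPU check failed; revisit Step 1" >&2
```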



Step 3. Install CUDA & cuDNN


# Add NVIDIA package repository
sudo apt-key adv --fetch-keys http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/7fa2af80.pub
wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
sudo apt install ./cuda-repo-ubuntu1604_9.1.85-1_amd64.deb
wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1604/x86_64/nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
sudo apt install ./nvidia-machine-learning-repo-ubuntu1604_1.0.0-1_amd64.deb
sudo apt update
# Install CUDA and tools. Include optional NCCL 2.x
sudo apt install cuda9.0 cuda-cublas-9-0 cuda-cufft-9-0 cuda-curand-9-0 \
    cuda-cusolver-9-0 cuda-cusparse-9-0 libcudnn7=7.2.1.38-1+cuda9.0 \
    libnccl2=2.2.13-1+cuda9.0 cuda-command-line-tools-9-0
# Optional: Install the TensorRT runtime (must be after CUDA install)
sudo apt update
sudo apt install libnvinfer4=4.1.2-1+cuda9.0
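
After the packages above are installed, it can help to confirm which CUDA-related versions actually landed on the system. This is a rough sketch that filters the dpkg -l listing; the helper name filter_cuda_packages and the name pattern it matches are assumptions, not part of any NVIDIA tooling.

```shell
# Keep only installed ("ii") packages whose names look CUDA-related,
# and print each package name with its version.
filter_cuda_packages() {
    awk '$1 == "ii" && $2 ~ /cuda|cudnn|nccl|nvinfer/ { print $2, $3 }'
}

if command -v dpkg >/dev/null 2>&1; then
    dpkg -l | filter_cuda_packages
else
    echo "dpkg not found: this check only applies to Debian/Ubuntu systems" >&2
fi
```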


Step 4. Install Docker


sudo apt-get update
sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg2 \
    software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo apt-key fingerprint 0EBFCD88
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
sudo apt-get update
sudo apt-get install docker-ce
sudo groupadd docker
sudo usermod -aG docker $USER
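
Note that the new docker group membership only applies to sessions started after the usermod call, so docker may still require sudo until you log in again. The sketch below checks whether the current session already has the group; the helper name has_group is hypothetical.

```shell
# Succeeds if the space-separated group list in $1 (as printed by `id -nG`)
# contains the group named in $2.
has_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

if has_group "$(id -nG)" docker; then
    echo "docker group is active: 'docker run --rm hello-world' should work without sudo"
else
    echo "docker group not active yet: log out and back in, or run 'newgrp docker'"
fi
```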


Step 5. Install nvidia-docker


# If you have nvidia-docker 1.0 installed: we need to remove it and all existing GPU containers
docker volume ls -q -f driver=nvidia-docker | xargs -r -I{} -n1 docker ps -q -a -f volume={} | xargs -r docker rm -f
sudo apt-get purge -y nvidia-docker
# Add the package repositories
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | \
sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
# Install nvidia-docker2 and reload the Docker daemon configuration
sudo apt-get install -y nvidia-docker2
sudo pkill -SIGHUP dockerd
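
The nvidia-docker2 package registers a runtime named "nvidia" in the Docker daemon configuration. A quick way to confirm that, assuming the configuration lives at the default path /etc/docker/daemon.json, is sketched below; the helper name has_nvidia_runtime is hypothetical.

```shell
# Succeeds if the daemon config file ($1, default /etc/docker/daemon.json)
# mentions a runtime named "nvidia".
has_nvidia_runtime() {
    grep -q '"nvidia"' "${1:-/etc/docker/daemon.json}" 2>/dev/null
}

if has_nvidia_runtime; then
    echo "nvidia runtime is registered"
else
    echo "nvidia runtime not found in /etc/docker/daemon.json"
fi
```
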

Step 6. Test whether nvidia-docker is installed correctly

Use the following command to check whether a container can read the GPU information of the host:
docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
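
Once nvidia-smi works inside a container, the same runtime can be used to confirm that TensorFlow itself sees the GPU. This is only a sketch: the image tag tensorflow/tensorflow:1.12.0-gpu is an assumption (a TensorFlow 1.x GPU build chosen to match CUDA 9.0), and tf_gpu_check, TF_IMAGE, and DOCKER are hypothetical names used for illustration.

```shell
# Run a one-line TensorFlow GPU check inside a container using the nvidia runtime.
# TF_IMAGE and DOCKER can be overridden; the default image tag is an assumption.
tf_gpu_check() {
    "${DOCKER:-docker}" run --runtime=nvidia --rm \
        "${TF_IMAGE:-tensorflow/tensorflow:1.12.0-gpu}" \
        python -c 'import tensorflow as tf; print(tf.test.is_gpu_available())'
}

# Call the function once the nvidia-smi test above succeeds:
#   tf_gpu_check
```

If the container can reach the GPU, the last line of output should be True; the first run also downloads the image, which can take a while.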


That's all!



