Difference between revisions of "Tensorflow Guide"
From Storrs HPC Wiki
Latest revision as of 08:06, 3 May 2021
Tensorflow | |
---|---|
Author | Google Brain Team |
Website | https://www.tensorflow.org/ |
Source | HistoryVersions |
Category | Machine Learning, Deep Neural Networks |
Help | documentation |
Current Supported Versions in HPC
CPU Support Version | Corresponding Python Module | Dependencies | Release Date |
---|---|---|---|
v0.10.0rc0 | python/2.7.6 | - | Jul 29, 2016 |
- | - | - | - |
GPU & CPU Support Version | Corresponding Python Module | Dependencies | Release Date |
---|---|---|---|
v0.12.1 | python/2.7.6-gcc-unicode | cuda, cudnn | Dec 25, 2016 |
v1.2.0 | python/3.6.1 | cuda, cudnn | Jun 16, 2017 |
Import Modules
TensorFlow v0.10.0rc0 (CPU Support Only)
module load python/2.7.6
TensorFlow v1.2.0 (Both GPU and CPU Support)
module load cuda/8.0.61 cudnn/6.0 sqlite/3.18.0 tcl/8.6.6.8606 python/3.6.1
Or:
module load gcc/5.4.0-alt cuda/8.0.61 cudnn/6.0 sqlite/3.18.0 tcl/8.6.6.8606 python/3.6.3-gcc540a
TensorFlow v0.12.1 (Both GPU and CPU Support)
module load cuda/8.0.61 cudnn/6.0 python/2.7.6-gcc-unicode
Notice:
- 1: This version supports both CPU and GPU.
- 2: When TensorFlow is imported, you should see the following automatic output, which means the GPU dependency libraries were loaded successfully.
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so.6 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
- 3: If you run Python on a regular compute node, you will see the error below. Please do not worry about it! It only means that GPU support is unavailable because the current node is not a GPU node. You can still run calculations on the CPU without any problem.
E tensorflow/stream_executor/cuda/cuda_driver.cc:509] failed call to cuInit: CUresult(-1)
I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:145] kernel driver does not appear to be running on this host (cn02): /proc/driver/nvidia/version does not exist
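The diagnostic message in item 3 points at `/proc/driver/nvidia/version`. As a rough sketch, a script could test for that file before deciding whether to expect GPU support. The helper name `node_has_nvidia_driver` is hypothetical, and the file's presence only suggests that a driver is loaded; it does not guarantee that TensorFlow can use a GPU.

```python
import os

# Path taken from the diagnostic message quoted in item 3.
NVIDIA_VERSION_FILE = "/proc/driver/nvidia/version"


def node_has_nvidia_driver(path=NVIDIA_VERSION_FILE):
    """Return True if the NVIDIA kernel driver version file exists on this node."""
    return os.path.isfile(path)


if node_has_nvidia_driver():
    print("NVIDIA driver found: TensorFlow may be able to use the GPU.")
else:
    print("No NVIDIA driver on this node: expect CPU-only execution.")
```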
Application Usage
Check Version Number
$ python
Python 2.7.6 (default, Apr 20 2016, 16:39:52)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> print tf.__version__
0.12.1
>>>
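The session above uses the Python 2 `print` statement, which fails under the python/3.6.1 module. A small sketch of the same check that should run under either interpreter, and that reports a missing installation instead of raising, might look like this (the function name `tensorflow_version` is an assumption for illustration):

```python
def tensorflow_version():
    """Return the installed TensorFlow version string, or None if it cannot be imported."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__


version = tensorflow_version()
# A single parenthesized argument prints the same way in Python 2 and Python 3.
print("TensorFlow version: %s" % version)
```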
Example: How to Build Computational Graph
$ python
Python 2.7.6 (default, Apr 20 2016, 16:39:52)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> node1 = tf.constant(3.0, tf.float32)
>>> node2 = tf.constant(4.0)
>>> node3 = tf.add(node1, node2)
>>> sess = tf.Session()
>>> sess.run(node3)
7.0
>>> a = tf.placeholder(tf.float32)
>>> b = tf.placeholder(tf.float32)
>>> adder_node = a + b
>>> print(sess.run(adder_node, {a: 3, b:4.5}))
7.5
>>> print(sess.run(adder_node, {a: [1,3], b: [2, 4]}))
[ 3. 7.]
>>> add_and_triple = adder_node * 3
>>> print(sess.run(add_and_triple, {a: 3, b:4.5}))
22.5
>>>
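The key idea in the session above is deferred evaluation: building `adder_node` computes nothing, and values flow only when `sess.run` is called with a feed dictionary. The toy sketch below imitates that behaviour in plain Python. It is not TensorFlow code; the `Node` class and the `constant`, `placeholder` and `run` helpers are hypothetical stand-ins, and they handle scalars only.

```python
class Node:
    """A toy graph node: records an operation and its inputs, computes nothing yet."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

    def __add__(self, other):
        return Node("add", (self, _wrap(other)))

    def __mul__(self, other):
        return Node("mul", (self, _wrap(other)))


def _wrap(x):
    """Turn plain Python numbers into constant nodes."""
    return x if isinstance(x, Node) else constant(x)


def constant(value):
    return Node("const", value=value)


def placeholder():
    return Node("placeholder")


def run(node, feed=None):
    """Evaluate a node recursively, reading placeholder values from the feed dict."""
    feed = feed or {}
    if node.op == "const":
        return node.value
    if node.op == "placeholder":
        return feed[node]
    left, right = (run(n, feed) for n in node.inputs)
    return left + right if node.op == "add" else left * right


node3 = constant(3.0) + constant(4.0)
a = placeholder()
b = placeholder()
adder_node = a + b                 # builds the graph only
add_and_triple = adder_node * 3

print(run(node3))                          # 7.0
print(run(adder_node, {a: 3, b: 4.5}))     # 7.5
print(run(add_and_triple, {a: 3, b: 4.5})) # 22.5
```

The same graph can be evaluated repeatedly with different feed values, which is what the placeholders in the TensorFlow session are for.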
Example: Logging Device placement (GPU Version Guide)
hpc-xin@cn02:~$ ssh gpu01
hpc-xin@gpu01:~$ module purge
hpc-xin@gpu01:~$ module load cuda/8.0.61 cudnn/6.0 python/2.7.6-gcc-unicode
hpc-xin@gpu01:~$ python
Python 2.7.6 (default, Apr 20 2016, 16:39:52)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcurand.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcuda.so.1 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcudnn.so.6 locally
I tensorflow/stream_executor/dso_loader.cc:128] successfully opened CUDA library libcublas.so locally
>>> a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
>>> b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
>>> c = tf.matmul(a, b)
>>> sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40m, pci bus id: 0000:03:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating TensorFlow device (/gpu:1) -> (device: 1, name: Tesla K40m, pci bus id: 0000:82:00.0)
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40m, pci bus id: 0000:03:00.0
/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: Tesla K40m, pci bus id: 0000:82:00.0
I tensorflow/core/common_runtime/direct_session.cc:255] Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40m, pci bus id: 0000:03:00.0
/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: Tesla K40m, pci bus id: 0000:82:00.0
>>> print(sess.run(c))
MatMul: (MatMul): /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:827] MatMul: (MatMul)/job:localhost/replica:0/task:0/gpu:0
b: (Const): /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:827] b: (Const)/job:localhost/replica:0/task:0/gpu:0
a: (Const): /job:localhost/replica:0/task:0/gpu:0
I tensorflow/core/common_runtime/simple_placer.cc:827] a: (Const)/job:localhost/replica:0/task:0/gpu:0
Const: (Const): /job:localhost/replica:0/task:0/cpu:0
I tensorflow/core/common_runtime/simple_placer.cc:827] Const: (Const)/job:localhost/replica:0/task:0/cpu:0
[[ 22. 28.]
 [ 49. 64.]]
You can also run this code with TensorFlow 1.2.0 under Python 3.6.1. The result is the same, but the intermediate log output differs slightly.
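To confirm that the printed matrix is what the 2x3 by 3x2 product should give, the arithmetic can be reproduced without TensorFlow. The sketch below uses a hypothetical pure-Python `matmul` helper on the same values as `a` and `b` in the session.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    if len(x[0]) != len(y):
        raise ValueError("inner dimensions do not match")
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


# Same values as the constants in the session, laid out with shapes [2, 3] and [3, 2].
a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
b = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

c = matmul(a, b)
print(c)  # [[22.0, 28.0], [49.0, 64.0]]
```

For instance, the top-left entry is 1*1 + 2*3 + 3*5 = 22, matching the first value printed by `sess.run(c)`.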
More Examples
https://www.tensorflow.org/tutorials/
Python Packages That Depend on TensorFlow
Keras
As an example, here we will:
- Interactively run the MNIST example code from Keras: https://keras.io/examples/mnist_cnn/ (save the code into a file called mnist_cnn.py).
- Run with CPU only, then with a K40 GPU, and finally with a V100 GPU. The approximate processing times per epoch are:
  - CPU = ~5 minutes
  - GPU K40 = 10 seconds
  - GPU V100 = 3 seconds
srun -p gpu --gres gpu:1 --pty bash
# srun: job 2886234 queued and waiting for resources
# srun: job 2886234 has been allocated resources
module purge
module load cuda/8.0.61 cudnn/6.0 tcl/8.6.6.8606 sqlite/3.18.0 python/3.6.1
which python
# Setting CUDA_VISIBLE_DEVICES to an empty string hides the GPU from TensorFlow so that we can run in CPU-only mode.
CUDA_VISIBLE_DEVICES="" python mnist_cnn.py
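The same CPU-only effect can probably be achieved from inside a script, as long as the variable is set before TensorFlow is imported, because CUDA reads it when it initializes. This is a minimal sketch under that assumption; the guarded import is only there so the snippet also runs where TensorFlow is not installed.

```python
import os

# Hide all GPUs from CUDA. This must happen before TensorFlow is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

try:
    import tensorflow as tf  # imported only after the variable is set
except ImportError:
    tf = None  # TensorFlow is not installed in this environment

print("CUDA_VISIBLE_DEVICES=%r" % os.environ["CUDA_VISIBLE_DEVICES"])
```

Setting the variable on the command line, as in the block above, has the advantage that mnist_cnn.py itself stays unmodified.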
For the V100 GPU:
srun -p gpu_v100 --gres gpu:1 --pty bash
module purge
module load cuda/8.0.61 cudnn/6.0 tcl/8.6.6.8606 sqlite/3.18.0 python/3.6.1
which python
python mnist_cnn.py
Anaconda route
module purge
module load bazel/3.1.0 cuda/10.1 cudnn/7.6.5 gcc/8.4.0-rhel7 anaconda/5.1.0
Create an environment and then activate it (TensorFlow GPU version):
conda create -n tf_env -c anaconda -c conda-forge/label/gcc7 -c conda-forge/label/cf201901 python tensorflow-gpu=2.4.1 numpy pandas matplotlib seaborn scikit-learn scipy keras IPython jupyterlab tqdm pillow librosa pysoundfile pydub
source activate tf_env
Alternative (TensorFlow GPU version): pinning every package version can speed up dependency solving.
conda create -n tf_env -c anaconda -c conda-forge/label/gcc7 -c conda-forge/label/cf201901 python=3.7.9 tensorflow-gpu=2.4.1 numpy=1.19.1 pandas=1.1.3 matplotlib=3.3.1 seaborn=0.11.0 scikit-learn=0.23.2 scipy=1.6.2 keras=2.4.3 ipython=7.18.1 jupyterlab=2.2.6 tqdm=4.50.2 pillow=8.0.0 librosa=0.6.2 pysoundfile=0.10.2 pydub=0.23.0
source activate tf_env
If package verification takes a long time, it may help to skip it by running the following command before the conda create command:
conda config --set safety_checks disabled
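After activating tf_env, it can be useful to check that the requested packages are actually importable. The sketch below is one way to do that with the standard library only. The helper name `missing_modules` is hypothetical, and the import names in `EXPECTED` are assumptions based on the usual mapping from package name to module name (for example scikit-learn imports as `sklearn`, pillow as `PIL`, pysoundfile as `soundfile`).

```python
import importlib.util

# Assumed import names for the packages listed in the conda create command.
EXPECTED = [
    "tensorflow", "numpy", "pandas", "matplotlib", "seaborn", "sklearn",
    "scipy", "keras", "IPython", "jupyterlab", "tqdm", "PIL", "librosa",
    "soundfile", "pydub",
]


def missing_modules(names):
    """Return the names from the list that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(EXPECTED)
if missing:
    print("Not importable in this environment: " + ", ".join(missing))
else:
    print("All expected packages were found.")
```

`find_spec` only locates each module without importing it, so the check stays fast even for large packages such as TensorFlow.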