Caffe Guide

From Storrs HPC Wiki
Caffe
Author: Yangqing Jia + contributors
Website: http://caffe.berkeleyvision.org/
Source: GitHub
Category: Deep Learning
Help: documentation


Import Modules

module load gcc/4.9.3 \
	intelics/2016.1-full \
	protobuf/2.6.1 \
	gflags/2.1.2 \
	opencv/2.4.13 \
	cuda/7.5 \
	cudnn/5.0 \
	python/2.7.6 \
	boost/1.61.0 \
	glog/0.3.3 \
	hdf5/1.8.17 \
	leveldb/fa6dc01 \
	lmdb/1 \
	snappy/1.1.3 \
	gstreamer/1.8.1

module load caffe/rc3

Application Usage

Python

$ python
Python 2.7.6 (default, Apr 20 2016, 16:39:52)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import caffe
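
Once the import succeeds, a common next step is to choose between CPU and GPU execution. The following is a minimal sketch using the pycaffe calls caffe.set_device, caffe.set_mode_gpu, and caffe.set_mode_cpu. The helper name configure_caffe and its gpu_id argument are hypothetical, introduced only for illustration.

```python
def configure_caffe(gpu_id=None):
    """Select CPU or GPU execution for pycaffe.

    gpu_id is a hypothetical parameter for this sketch: None selects
    CPU mode, an integer selects that GPU within the job's allocation.
    """
    # Imported inside the function so that the caffe module only needs
    # to be loaded (via "module load") when the helper is actually called.
    import caffe

    if gpu_id is None:
        caffe.set_mode_cpu()
        return "cpu"

    # Pick the device first, then switch the backend to GPU mode.
    caffe.set_device(gpu_id)
    caffe.set_mode_gpu()
    return "gpu:%d" % gpu_id
```

Inside a job that requested a GPU, configure_caffe(0) would select the first allocated device; calling it with no argument keeps Caffe on the CPU.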

Native

The native Caffe executable is named caffe.bin.

GPU

To use GPUs, submit your job to the gpu partition and request the number of GPUs that you need (maximum of 2):

#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1

# your caffe commands here

To verify that Caffe is able to access the GPUs:

$ caffe.bin device_query -gpu 0,1
I0727 23:38:25.242207 157390 caffe.cpp:112] Querying GPUs 0,1
I0727 23:38:25.252252 157390 common.cpp:168] Device id:                     0
I0727 23:38:25.252285 157390 common.cpp:169] Major revision number:         3
I0727 23:38:25.252288 157390 common.cpp:170] Minor revision number:         5
I0727 23:38:25.252290 157390 common.cpp:171] Name:                          Tesla K40m
I0727 23:38:25.252292 157390 common.cpp:172] Total global memory:           12079136768
I0727 23:38:25.252300 157390 common.cpp:173] Total shared memory per block: 49152
I0727 23:38:25.252301 157390 common.cpp:174] Total registers per block:     65536
I0727 23:38:25.252303 157390 common.cpp:175] Warp size:                     32
I0727 23:38:25.252306 157390 common.cpp:176] Maximum memory pitch:          2147483647
I0727 23:38:25.252308 157390 common.cpp:177] Maximum threads per block:     1024
I0727 23:38:25.252310 157390 common.cpp:178] Maximum dimension of block:    1024, 1024, 64
I0727 23:38:25.252313 157390 common.cpp:181] Maximum dimension of grid:     2147483647, 65535, 65535
I0727 23:38:25.252316 157390 common.cpp:184] Clock rate:                    745000
I0727 23:38:25.252318 157390 common.cpp:185] Total constant memory:         65536
I0727 23:38:25.252321 157390 common.cpp:186] Texture alignment:             512
I0727 23:38:25.252323 157390 common.cpp:187] Concurrent copy and execution: Yes
I0727 23:38:25.252326 157390 common.cpp:189] Number of multiprocessors:     15
I0727 23:38:25.252328 157390 common.cpp:190] Kernel execution timeout:      No
I0727 23:38:26.136673 157390 common.cpp:168] Device id:                     1
I0727 23:38:26.136693 157390 common.cpp:169] Major revision number:         3
I0727 23:38:26.136695 157390 common.cpp:170] Minor revision number:         5
I0727 23:38:26.136698 157390 common.cpp:171] Name:                          Tesla K40m
I0727 23:38:26.136701 157390 common.cpp:172] Total global memory:           12079136768
I0727 23:38:26.136703 157390 common.cpp:173] Total shared memory per block: 49152
I0727 23:38:26.136706 157390 common.cpp:174] Total registers per block:     65536
I0727 23:38:26.136708 157390 common.cpp:175] Warp size:                     32
I0727 23:38:26.136711 157390 common.cpp:176] Maximum memory pitch:          2147483647
I0727 23:38:26.136713 157390 common.cpp:177] Maximum threads per block:     1024
I0727 23:38:26.136716 157390 common.cpp:178] Maximum dimension of block:    1024, 1024, 64
I0727 23:38:26.136719 157390 common.cpp:181] Maximum dimension of grid:     2147483647, 65535, 65535
I0727 23:38:26.136721 157390 common.cpp:184] Clock rate:                    745000
I0727 23:38:26.136724 157390 common.cpp:185] Total constant memory:         65536
I0727 23:38:26.136726 157390 common.cpp:186] Texture alignment:             512
I0727 23:38:26.136729 157390 common.cpp:187] Concurrent copy and execution: Yes
I0727 23:38:26.136731 157390 common.cpp:189] Number of multiprocessors:     15
I0727 23:38:26.136734 157390 common.cpp:190] Kernel execution timeout:      No
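
For scripted checks, the log above can be summarized programmatically. The sketch below assumes the line format shown in the sample output (each GPU section starts with a "Device id:" line) and that caffe.bin is on the PATH after the modules are loaded. Because the log may be written to stderr, the sketch merges stderr into stdout. The names parse_device_query and query_devices are hypothetical helpers, not part of Caffe.

```python
import re
import subprocess

# Matches the "key: value" tail of a log line such as
# "I0727 ... common.cpp:171] Name:        Tesla K40m"
_FIELD = re.compile(r"\]\s*(?P<key>[^:\]]+):\s*(?P<value>.*\S)\s*$")


def parse_device_query(log_text):
    """Return one dict of properties per GPU found in device_query output."""
    devices = []
    for line in log_text.splitlines():
        match = _FIELD.search(line)
        if not match:
            continue  # e.g. the "Querying GPUs 0,1" banner line
        key = match.group("key").strip()
        value = match.group("value")
        if key == "Device id":
            devices.append({})  # a new device section starts here
        if devices:
            devices[-1][key] = value
    return devices


def query_devices(gpu_ids="0"):
    """Run caffe.bin device_query and parse its log output."""
    proc = subprocess.Popen(
        ["caffe.bin", "device_query", "-gpu", gpu_ids],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT
    )
    output, _ = proc.communicate()
    return parse_device_query(output.decode("utf-8", "replace"))
```

Within a GPU job, len(query_devices("0,1")) gives the number of devices Caffe reported, and each entry's "Name" key holds the device model string.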

Interactive

To start an interactive session on a GPU node:

$ fisbatch -p gpu --gres=gpu:1