ΠΠΏΠ΅ΡΠ°ΡΠΈΠΎΠ½Π½Π°Ρ ΡΠΈΡΡΠ΅ΠΌΠ°: Ubuntu 14.04
Π£ΡΡΠ°Π½ΠΎΠ²Π»Π΅Π½Π½Π°Ρ Π²Π΅ΡΡΠΈΡ CUDA ΠΈ cuDNN: 7.5 ΠΈ 4.0.7
(ΠΏΡΠΈΠ»ΠΎΠΆΠΈΡΠ΅ Π²ΡΠ²ΠΎΠ΄ ls -l /path/to/cuda/lib/libcud*
):
ΠΡΠΈ ΡΡΡΠ°Π½ΠΎΠ²ΠΊΠ΅ ΠΈΠ· ΠΈΡΡ ΠΎΠ΄Π½ΠΈΠΊΠΎΠ² ΡΠΊΠ°ΠΆΠΈΡΠ΅ Ρ Π΅Ρ ΡΠΈΠΊΡΠ°ΡΠΈΠΈ: 4a4f2461533847dde239851ecebe5056088a828c
ΠΠ°ΠΏΡΡΡΠΈΡΠ΅ ΡΠ»Π΅Π΄ΡΡΡΠΈΠΉ ΠΊΠΎΠ΄
import tensorflow as tf
def main():
a = tf.Variable(1)
init_a = tf.initialize_all_variables()
with tf.Session() as sess:
sess.run(init_a)
with tf.device("/gpu:0"):
b = tf.constant(2)
init_b = tf.initialize_all_variables()
with tf.Session() as sess:
sess.run(init_b)
with tf.device("/cpu:0"):
c = tf.Variable(2)
init_c = tf.initialize_all_variables()
with tf.Session() as sess:
sess.run(init_c)
with tf.device("/gpu:0"):
d = tf.Variable(2)
init_d = tf.initialize_all_variables()
with tf.Session() as sess:
sess.run(init_d)
if __name__ == '__main__':
main()
(ΠΡΠ»ΠΈ ΠΆΡΡΠ½Π°Π»Ρ Π±ΠΎΠ»ΡΡΠΈΠ΅, Π·Π°Π³ΡΡΠ·ΠΈΡΠ΅ ΠΈΡ ΠΊΠ°ΠΊ Π²Π»ΠΎΠΆΠ΅Π½ΠΈΠ΅).
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.266
pciBusID 0000:05:00.0
Total memory: 12.00GiB
Free memory: 11.02GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 1 with properties:
name: GeForce GTX 980
major: 5 minor: 2 memoryClockRate (GHz) 1.2785
pciBusID 0000:09:00.0
Total memory: 4.00GiB
Free memory: 3.91GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:59] cannot enable peer access from device ordinal 0 to device ordinal 1
I tensorflow/core/common_runtime/gpu/gpu_init.cc:59] cannot enable peer access from device ordinal 1 to device ordinal 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 1
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y N
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 1: N Y
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:05:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 980, pci bus id: 0000:09:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:05:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 980, pci bus id: 0000:09:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:05:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 980, pci bus id: 0000:09:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:05:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 980, pci bus id: 0000:09:00.0)
Traceback (most recent call last):
File "test_multi_gpu.py", line 30, in <module>
main()
File "test_multi_gpu.py", line 26, in main
sess.run(init_d)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 332, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 572, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 652, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 672, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: Cannot assign a device to node 'Variable_2': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available
[[Node: Variable_2 = Variable[container="", dtype=DT_INT32, shape=[], shared_name="", _device="/device:GPU:0"]()]]
Caused by op u'Variable_2', defined at:
File "test_multi_gpu.py", line 30, in <module>
main()
File "test_multi_gpu.py", line 23, in main
d = tf.Variable(2)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 211, in __init__
dtype=dtype)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 292, in _init_from_args
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/state_ops.py", line 139, in variable_op
container=container, shared_name=shared_name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 351, in _variable
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/op_def_library.py", line 693, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2177, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1161, in __init__
self._traceback = _extract_stack()
Π― ΡΠ°ΠΊΠΆΠ΅ Π·Π°ΠΌΠ΅ΡΠΈΠ», ΡΡΠΎ Π² Π΄ΠΎΠΊΡΠΌΠ΅Π½ΡΠ°ΡΠΈΠΈ ΠΏΠΎ ΠΈΡΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°Π½ΠΈΡ Π³ΡΠ°ΡΠΈΡΠ΅ΡΠΊΠΈΡ ΠΏΡΠΎΡΠ΅ΡΡΠΎΡΠΎΠ² Π½Π΅ ΡΠΏΠΎΠΌΠΈΠ½Π°Π΅ΡΡΡ ΠΎ tf.Variable, ΠΎΠ½Π° Π²ΠΊΠ»ΡΡΠ°Π΅Ρ ΡΠΎΠ»ΡΠΊΠΎ tf.constant ΠΈ tf.matmul.
Π₯ΠΎΡΠΎΡΠΎ, Ρ Π½Π°ΡΠ΅Π» Π΄ΠΎΠΊΡΠΌΠ΅Π½ΡΠ°ΡΠΈΡ ΠΈΠ· [Convolutional Neural Networks] (https://www.tensorflow.org/versions/r0.8/tutorials/deep_cnn/index.html),
ΡΠΈΡΠ°ΡΡ:
All variables are pinned to the CPU and accessed via tf.get_variable() in order to share them in a multi-GPU version. See how-to on Sharing Variables.
Π― Ρ
ΠΎΡΡ ΡΠΏΡΠΎΡΠΈΡΡ, ΠΏΠΎΡΠΊΠΎΠ»ΡΠΊΡ tf.Variables ΠΏΡΠΈΠΊΡΠ΅ΠΏΠ»Π΅Π½ ΠΊ ΠΏΡΠΎΡΠ΅ΡΡΠΎΡΡ Ρ ΠΏΠΎΠΌΠΎΡΡΡ ΡΠ΅Π½Π·ΠΎΡΠ½ΠΎΠ³ΠΎ ΠΏΠΎΡΠΎΠΊΠ°, ΠΌΠΎΠΆΠ΅ΠΌ Π»ΠΈ ΠΌΡ ΠΈΡΠΏΡΠ°Π²ΠΈΡΡ ΡΡΡ ΠΎΡΠΈΠ±ΠΊΡ? ΠΡΠΆΠ½ΠΎ Π»ΠΈ Π½Π°ΠΌ ΠΎΡΠ΅Π½Ρ Π²Π½ΠΈΠΌΠ°ΡΠ΅Π»ΡΠ½ΠΎ ΡΠ»Π΅Π΄ΠΈΡΡ Π·Π° ΡΠ΅ΠΌ, ΡΡΠΎΠ±Ρ ΠΈΡΠΊΠ»ΡΡΠΈΡΡ ΠΎΠ±ΡΡΠ²Π»Π΅Π½ΠΈΠ΅ tf.Variable Π·Π° ΠΏΡΠ΅Π΄Π΅Π»Π°ΠΌΠΈ ΠΎΠ±Π»Π°ΡΡΠΈ with tf.device('/gpu:xx')
, ΠΈΠ»ΠΈ ΠΈΡΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΡ netsted with tf.device(None)
Π΄Π»Ρ Π΅Π³ΠΎ ΠΎΠ±ΡΠ°Π±ΠΎΡΠΊΠΈ?
ΠΡΠ°ΠΊ, Π΅ΡΡΡ Π½Π΅ΠΊΠΎΡΠΎΡΡΠ΅ ΠΎΠΏΠ΅ΡΠ°ΡΠΈΠΈ, ΠΊΠΎΡΠΎΡΡΠ΅ Π½Π΅Π΄ΠΎΠΏΡΡΡΠΈΠΌΡ Π΄Π»Ρ tf.device (), Π½Π°ΠΏΡΠΈΠΌΠ΅Ρ tf.nn.local_response_normalization (),
Π‘ΠΌ. ΠΠΎΠ΄ Π½ΠΈΠΆΠ΅:
with tf.device("/gpu:0"):
d = tf.placeholder("float", shape=[100, 100, 100, 10])
with tf.device(None):
lrn1 = tf.nn.local_response_normalization(d, depth_radius=5, bias=1.0, alpha=1e-4, beta=0.75)
lrn2 = tf.nn.local_response_normalization(d, depth_radius=5, bias=1.0, alpha=1e-4, beta=0.75)
init_d = tf.initialize_all_variables()
with tf.Session() as sess:
sess.run(init_d)
r = np.random.randn(100, 100, 100, 10)
sess.run(lrn1, feed_dict={d: r}) #Run ok
sess.run(lrn2, feed_dict={d: r}) # Error
Π Π΅Π·ΡΠ»ΡΡΠ°Ρ Π½ΠΈΠΆΠ΅:
Traceback (most recent call last):
File "test_multi_gpu.py", line 44, in <module>
main()
File "test_multi_gpu.py", line 40, in main
sess.run(lrn2, feed_dict={d: r})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 332, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 572, in _run
feed_dict_string, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 652, in _do_run
target_list, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 672, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: Cannot assign a device to node 'LRN_1': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available
[[Node: LRN_1 = LRN[alpha=0.0001, beta=0.75, bias=1, depth_radius=5, _device="/device:GPU:0"](Placeholder)]]
Caused by op u'LRN_1', defined at:
File "test_multi_gpu.py", line 44, in <module>
main()
File "test_multi_gpu.py", line 34, in main
lrn2 = tf.nn.local_response_normalization(d, depth_radius=5, bias=1.0, alpha=1e-4, beta=0.75)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 737, in lrn
bias=bias, alpha=alpha, beta=beta, name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/op_def_library.py", line 693, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2177, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1161, in __init__
self._traceback = _extract_stack()
ΠΡΠΌΠ°Ρ, ΠΏΡΠΈΡΠΈΠ½Π° ΡΡΠΎΠΉ ΠΎΡΠΈΠ±ΠΊΠΈ ΠΌΠΎΠΆΠ΅Ρ Π±ΡΡΡ Π΄ΠΎΡΡΠ°ΡΠΎΡΠ½ΠΎ ΡΡΠ½Π°. Π tf.nn.local_response_normalization
Π΅ΡΡΡ Π²Π½ΡΡΡΠ΅Π½Π½ΡΡ tf.Variable, ΠΊΠΎΡΠΎΡΡΡ ΠΌΡ Π½Π΅ ΠΌΠΎΠ³Π»ΠΈ ΠΈΡΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΡ Π²ΠΎ Π²Π½Π΅ΡΠ½Π΅ΠΌ ΠΊΠΎΠ΄Π΅, ΡΡΠΎΠ±Ρ ΠΎΡΡΠ°Π²Π°ΡΡΡΡ Π²ΡΡΠΈΡΠ»ΠΈΡΠ΅Π»ΡΠ½ΡΠΌ ΡΠ·Π»ΠΎΠΌ Π΄Π»Ρ ΡΠΊΠ°Π·Π°Π½Π½ΠΎΠ³ΠΎ Π³ΡΠ°ΡΠΈΡΠ΅ΡΠΊΠΎΠ³ΠΎ ΠΏΡΠΎΡΠ΅ΡΡΠΎΡΠ°, ΠΈΡΠΊΠ»ΡΡΠ°Ρ Π²ΡΠ΅ Π²Π½ΡΡΡΠ΅Π½Π½ΠΈΠ΅ ΠΏΠ΅ΡΠ΅ΠΌΠ΅Π½Π½ΡΠ΅.
ΠΠ° Π΄Π°Π½Π½ΡΠΉ ΠΌΠΎΠΌΠ΅Π½Ρ Ρ Π΄ΡΠΌΠ°Ρ, ΡΡΠΎ ΡΠ΅Π½Π·ΠΎΡΠ½ΡΠΉ ΠΏΠΎΡΠΎΠΊ Π΄ΠΎΠ»ΠΆΠ΅Π½ Π²ΡΠΏΠΎΠ»Π½ΡΡΡ ΠΎΠ΄Π½ΠΎ ΠΈΠ· Π΄Π²ΡΡ ΡΠ»Π΅Π΄ΡΡΡΠΈΡ Π΄Π΅ΠΉΡΡΠ²ΠΈΠΉ:
tf.device(None)
ΡΡΠΎΠ±Ρ ΠΏΠΎΠΌΠΎΡΡ ΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΠ΅Π»Ρ Π·Π°Π²Π΅ΡΡΠΈΡΡ ΡΠ²ΠΎΠΉ ΠΊΠΎΠ΄, Π²Π΅ΡΠ½ΠΎ?ΠΡΠΎΠ±Π»Π΅ΠΌΠ° Π²ΡΡΠΎΠΊΠΎΠ³ΠΎ ΡΡΠΎΠ²Π½Ρ Π΄ΠΎΠ»ΠΆΠ½Π° Π±ΡΡΡ ΡΡΡΡΠ°Π½Π΅Π½Π° Ρ ΠΏΠΎΠΌΠΎΡΡΡ ΠΏΠΎΡΡΠΎΡΠ½Π½ΠΎΠΉ ΡΠ°Π±ΠΎΡΡ tf.Variable
ignore tf.device()
Π½Π΅ ΡΡΠ°Π±ΠΎΡΠ°ΡΡ, ΠΏΠΎΡΠΎΠΌΡ ΡΡΠΎ ΠΌΠ½ΠΎΠ³ΠΈΠ΅ ΠΈΠ· Π½Π°ΡΠΈΡ
ΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΠ΅Π»Π΅ΠΉ, ΠΎΡΠΎΠ±Π΅Π½Π½ΠΎ Π² ΡΠ°ΡΠΏΡΠ΅Π΄Π΅Π»Π΅Π½Π½ΡΡ
Π½Π°ΡΡΡΠΎΠΉΠΊΠ°Ρ
, ΠΈΡΠΏΠΎΠ»ΡΠ·ΡΡΡ ΡΡΠΎ Π΄Π»Ρ Π½Π°ΡΡΡΠΎΠΉΠΊΠΈ ΡΠ΅ΡΠ²Π΅ΡΠΎΠ² ΠΏΠ°ΡΠ°ΠΌΠ΅ΡΡΠΎΠ².) Π ΠΊΡΠ°ΡΠΊΠΎΡΡΠΎΡΠ½ΠΎΠΉ ΠΏΠ΅ΡΡΠΏΠ΅ΠΊΡΠΈΠ²Π΅ ΠΏΠΎΠΏΡΠΎΠ±ΡΠΉΡΠ΅ ΠΈΡΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΡ ΠΌΡΠ³ΠΊΠΎΠ΅ ΡΠ°Π·ΠΌΠ΅ΡΠ΅Π½ΠΈΠ΅ Π² ΡΠ²ΠΎΠ΅ΠΌ ΡΠ΅Π°Π½ΡΠ΅. ΠΊΠΎΠ½ΡΡΡΡΠΊΡΠΎΡ:
config = tf.ConfigProto(allow_soft_placement=True)
with tf.Session(config=config) as sess:
# ...
Π‘ΠΏΠ°ΡΠΈΠ±ΠΎ Π·Π° Π²Π°ΡΠ΅ ΠΏΡΠ΅Π΄Π»ΠΎΠΆΠ΅Π½ΠΈΠ΅, ΠΏΠΎΡ
ΠΎΠΆΠ΅, ΠΈΡΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°Π½ΠΈΠ΅ allow_soft_placement=True
ΡΠ΅ΡΠΈΡ ΠΏΡΠΎΠ±Π»Π΅ΠΌΡ. ΠΠ°ΠΊ ΡΠΊΠ°Π·Π°Π½ΠΎ Π² β 2292, Π»ΡΡΡΠ΅ ΡΠ»ΡΡΡΠΈΡΡ ΡΠΎΠΎΡΠ²Π΅ΡΡΡΠ²ΡΡΡΠΈΠΉ Π΄ΠΎΠΊΡΠΌΠ΅Π½Ρ, ΡΡΠΎΠ±Ρ ΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΠ΅Π»Ρ Π·Π½Π°Π» ΠΎΠ± ΡΡΠΎΠΌ.
Π‘Π°ΠΌΡΠΉ ΠΏΠΎΠ»Π΅Π·Π½ΡΠΉ ΠΊΠΎΠΌΠΌΠ΅Π½ΡΠ°ΡΠΈΠΉ
ΠΡΠΎΠ±Π»Π΅ΠΌΠ° Π²ΡΡΠΎΠΊΠΎΠ³ΠΎ ΡΡΠΎΠ²Π½Ρ Π΄ΠΎΠ»ΠΆΠ½Π° Π±ΡΡΡ ΡΡΡΡΠ°Π½Π΅Π½Π° Ρ ΠΏΠΎΠΌΠΎΡΡΡ ΠΏΠΎΡΡΠΎΡΠ½Π½ΠΎΠΉ ΡΠ°Π±ΠΎΡΡ
tf.Variable
ignoretf.device()
Π½Π΅ ΡΡΠ°Π±ΠΎΡΠ°ΡΡ, ΠΏΠΎΡΠΎΠΌΡ ΡΡΠΎ ΠΌΠ½ΠΎΠ³ΠΈΠ΅ ΠΈΠ· Π½Π°ΡΠΈΡ ΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΠ΅Π»Π΅ΠΉ, ΠΎΡΠΎΠ±Π΅Π½Π½ΠΎ Π² ΡΠ°ΡΠΏΡΠ΅Π΄Π΅Π»Π΅Π½Π½ΡΡ Π½Π°ΡΡΡΠΎΠΉΠΊΠ°Ρ , ΠΈΡΠΏΠΎΠ»ΡΠ·ΡΡΡ ΡΡΠΎ Π΄Π»Ρ Π½Π°ΡΡΡΠΎΠΉΠΊΠΈ ΡΠ΅ΡΠ²Π΅ΡΠΎΠ² ΠΏΠ°ΡΠ°ΠΌΠ΅ΡΡΠΎΠ².) Π ΠΊΡΠ°ΡΠΊΠΎΡΡΠΎΡΠ½ΠΎΠΉ ΠΏΠ΅ΡΡΠΏΠ΅ΠΊΡΠΈΠ²Π΅ ΠΏΠΎΠΏΡΠΎΠ±ΡΠΉΡΠ΅ ΠΈΡΠΏΠΎΠ»ΡΠ·ΠΎΠ²Π°ΡΡ ΠΌΡΠ³ΠΊΠΎΠ΅ ΡΠ°Π·ΠΌΠ΅ΡΠ΅Π½ΠΈΠ΅ Π² ΡΠ²ΠΎΠ΅ΠΌ ΡΠ΅Π°Π½ΡΠ΅. ΠΊΠΎΠ½ΡΡΡΡΠΊΡΠΎΡ: