Tensorflow: Multi-GPU usage bug related to tf.Variable pinned to CPU

Created on May 9, 2016  ·  3 comments  ·  Source: tensorflow/tensorflow

Environment info

Operating system: Ubuntu 14.04

Installed version of CUDA and cuDNN: 7.5 and 4.0.7
(please attach the output of ls -l /path/to/cuda/lib/libcud*):

If installed from source, provide the commit hash: 4a4f2461533847dde239851ecebe5056088a828c

Steps to reproduce

Run the following code:

import tensorflow as tf

def main():
    a = tf.Variable(1)
    init_a = tf.initialize_all_variables()
    with tf.Session() as sess:
        sess.run(init_a)

    with tf.device("/gpu:0"):
        b = tf.constant(2)
        init_b = tf.initialize_all_variables()
    with tf.Session() as sess:
        sess.run(init_b)

    with tf.device("/cpu:0"):
        c = tf.Variable(2)
        init_c = tf.initialize_all_variables()
    with tf.Session() as sess:
        sess.run(init_c)

    with tf.device("/gpu:0"):
        d = tf.Variable(2)
        init_d = tf.initialize_all_variables()
    with tf.Session() as sess:
        sess.run(init_d)

if __name__ == '__main__':
    main()

Logs or other output that would be helpful

(If logs are large, please upload them as an attachment.)

I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcublas.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcudnn.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcufft.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:108] successfully opened CUDA library libcurand.so locally
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties: 
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.266
pciBusID 0000:05:00.0
Total memory: 12.00GiB
Free memory: 11.02GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 1 with properties: 
name: GeForce GTX 980
major: 5 minor: 2 memoryClockRate (GHz) 1.2785
pciBusID 0000:09:00.0
Total memory: 4.00GiB
Free memory: 3.91GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:59] cannot enable peer access from device ordinal 0 to device ordinal 1
I tensorflow/core/common_runtime/gpu/gpu_init.cc:59] cannot enable peer access from device ordinal 1 to device ordinal 0
I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 1 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0:   Y N 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 1:   N Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:05:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 980, pci bus id: 0000:09:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:05:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 980, pci bus id: 0000:09:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:05:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 980, pci bus id: 0000:09:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:05:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:756] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 980, pci bus id: 0000:09:00.0)
Traceback (most recent call last):
  File "test_multi_gpu.py", line 30, in <module>
    main()
  File "test_multi_gpu.py", line 26, in main
    sess.run(init_d)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 332, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 572, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 652, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 672, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: Cannot assign a device to node 'Variable_2': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available
     [[Node: Variable_2 = Variable[container="", dtype=DT_INT32, shape=[], shared_name="", _device="/device:GPU:0"]()]]
Caused by op u'Variable_2', defined at:
  File "test_multi_gpu.py", line 30, in <module>
    main()
  File "test_multi_gpu.py", line 23, in main
    d = tf.Variable(2)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 211, in __init__
    dtype=dtype)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/variables.py", line 292, in _init_from_args
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/state_ops.py", line 139, in variable_op
    container=container, shared_name=shared_name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_state_ops.py", line 351, in _variable
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/op_def_library.py", line 693, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2177, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1161, in __init__
    self._traceback = _extract_stack()

I also noticed that the documentation on using GPUs does not mention tf.Variable; its examples involve only tf.constant and tf.matmul.

OK, I found this in [Convolutional Neural Networks](https://www.tensorflow.org/versions/r0.8/tutorials/deep_cnn/index.html).
Quote:

All variables are pinned to the CPU and accessed via tf.get_variable() in order to share them in a multi-GPU version. See how-to on Sharing Variables.

Since tf.Variables are pinned to the CPU by tensorflow, I would like to ask whether this error can be fixed. Do I have to check very carefully that every tf.Variable declaration stays outside the with tf.device('/gpu:xx') scope, or should I handle it with a nested with tf.device(None)?

All 3 comments

So there are ops that are not valid under tf.device(), such as tf.nn.local_response_normalization().
See the code below.

    # Fragment from the body of main(); assumes "import numpy as np" at the top of the file.
    with tf.device("/gpu:0"):
        d = tf.placeholder("float", shape=[100, 100, 100, 10])
        # tf.device(None) ignores the enclosing "/gpu:0" scope, so lrn1 has no explicit device.
        with tf.device(None):
            lrn1 = tf.nn.local_response_normalization(d, depth_radius=5, bias=1.0, alpha=1e-4, beta=0.75)
        # lrn2 is still pinned to "/gpu:0".
        lrn2 = tf.nn.local_response_normalization(d, depth_radius=5, bias=1.0, alpha=1e-4, beta=0.75)
        init_d = tf.initialize_all_variables()
    with tf.Session() as sess:
        sess.run(init_d)
        r = np.random.randn(100, 100, 100, 10)
        sess.run(lrn1, feed_dict={d: r})  # runs OK
        sess.run(lrn2, feed_dict={d: r})  # raises InvalidArgumentError

The output is as follows.

Traceback (most recent call last):
  File "test_multi_gpu.py", line 44, in <module>
    main()
  File "test_multi_gpu.py", line 40, in main
    sess.run(lrn2, feed_dict={d: r})
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 332, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 572, in _run
    feed_dict_string, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 652, in _do_run
    target_list, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 672, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors.InvalidArgumentError: Cannot assign a device to node 'LRN_1': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available
     [[Node: LRN_1 = LRN[alpha=0.0001, beta=0.75, bias=1, depth_radius=5, _device="/device:GPU:0"](Placeholder)]]
Caused by op u'LRN_1', defined at:
  File "test_multi_gpu.py", line 44, in <module>
    main()
  File "test_multi_gpu.py", line 34, in main
    lrn2 = tf.nn.local_response_normalization(d, depth_radius=5, bias=1.0, alpha=1e-4, beta=0.75)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 737, in lrn
    bias=bias, alpha=alpha, beta=beta, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/op_def_library.py", line 693, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2177, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1161, in __init__
    self._traceback = _extract_stack()

I think the cause of this error is clear enough: there is a tf.Variable inside tf.nn.local_response_normalization. From the calling code, I cannot keep the compute nodes on the specified GPU while excluding all the internal variables.

For now, I think tensorflow should do one of the two things below.

  1. Make tf.Variable unaffected by tf.device(). (This may be the preferred option.)
  2. List the ops that must be wrapped in tf.device(None), to help users complete their code. Is that right?

The high-level problem should be addressed by @vrv's ongoing work on improving device placement. (Making tf.Variable ignore tf.device() would not work, because many users rely on it to configure parameter servers, especially in the distributed setting.) In the short term, try enabling soft placement in the session constructor:

config = tf.ConfigProto(allow_soft_placement=True)
with tf.Session(config=config) as sess:
    # ...
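The effect of allow_soft_placement can be pictured with a small pure-Python model. This is only a sketch, not TensorFlow's placer: it assumes that an op whose requested device has no registered kernel is moved to the CPU when soft placement is on and rejected otherwise. The KERNELS table, place() and PlacementError are hypothetical names, and the table only mirrors what the error messages in this thread report (no GPU kernel for the int32 Variable or for LRN).

```python
# Toy model of hard vs. soft device placement; not TensorFlow's placer.
# Hypothetical kernel table: which device types can run which op.
# The GPU gaps mirror the two errors reported in this thread.
KERNELS = {
    "Variable[int32]": {"CPU"},
    "LRN": {"CPU"},
    "MatMul": {"CPU", "GPU"},
}


class PlacementError(Exception):
    """Raised when an explicit device request cannot be satisfied."""


def place(op, requested, allow_soft_placement=False):
    """Return the device type an op ends up on.

    requested is "CPU", "GPU", or "" for no explicit request.
    """
    supported = KERNELS[op]
    if requested == "":
        # No request: prefer the GPU when a kernel exists for it.
        return "GPU" if "GPU" in supported else "CPU"
    if requested in supported:
        return requested
    if allow_soft_placement:
        # Soft placement: fall back to the CPU instead of failing.
        return "CPU"
    raise PlacementError(
        "Cannot assign a device to node %r: no supported kernel for %s"
        % (op, requested))


if __name__ == "__main__":
    print(place("MatMul", "GPU"))
    print(place("LRN", "GPU", allow_soft_placement=True))
    try:
        place("Variable[int32]", "GPU")
    except PlacementError as err:
        print("hard placement failed:", err)
```

Under hard placement (the default), the last call fails in the same way as sess.run(init_d) in the original report. Under soft placement the op quietly lands on the CPU, so it may be worth also setting log_device_placement=True in the ConfigProto to see where each op ends up.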

Thanks for the suggestion. Using allow_soft_placement=True seems to solve the problem. As mentioned in #2292, it would be good to improve the relevant documentation so that users know about this.
