Tensorflow: Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

Created on Dec 21, 2018  ·  181 comments  ·  Source: tensorflow/tensorflow

Please make sure that this is a bug. tag: bug_template

System information

  • Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes and No (described below)
  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Manjaro
  • Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device:
  • TensorFlow installed from (source or binary): tf-nightly-gpu (Dec 19, r1.13)
  • TensorFlow version (use command below): 1.13.0-dev20181219
  • Python version: 3.7.1
  • Bazel version (if compiling from source):
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version: CUDA 10 with cuDNN 7.4.1
  • GPU model and memory: RTX 2070 8GB

Describe the current behavior
I'm running a CNN model on MNIST. When running on the GPU, I get
2018-12-20 20:09:13.644176: E tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

I did some digging and realized that it is a memory issue, which shouldn't be the case: I have 32GB of RAM and 64GB of swap. I ran htop while running the model and had 20+GB free, which is more than enough to fit the 8GB vRAM mappings.

Using gpu_options.allow_growth = True gets the model to work properly, and setting os.environ['CUDA_VISIBLE_DEVICES'] = '-1' also works. This means that I am facing a memory issue, but I don't see how.

Also, using gpu_options.allow_growth = True does not fix the same issue when trying to run the tensorflow/models/official/mnist/ model, which should behave similarly to my code.

Code to reproduce the issue

import os
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import math
import time
# Killing optional CPU driver warnings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
tf.logging.set_verbosity(tf.logging.ERROR)


class Model:

    def __init__(self, image, label):
        """
        A Model class contains a computational graph that classifies images
        to predictions. Each of its methods builds part of the graph
        on Model initialization. Do not modify the constructor, as doing so
        would break the autograder. You may, however, add class variables
        to use in your graph-building. e.g. learning rate, 

        image: the input image to the computational graph as a tensor
        label: the correct label of an image as a tensor
        prediction: the output prediction of the computational graph,
                    produced by self.forward_pass()
        optimize: the model's optimizing tensor produced by self.optimizer()
        loss: the model's loss produced by computing self.loss_function()
        accuracy: the model's prediction accuracy
        """
        self.image = image
        self.label = label

        # TO-DO: Add any class variables you want to use.

        self.prediction = self.forward_pass()
        self.loss = self.loss_function()
        self.optimize = self.optimizer()
        self.accuracy = self.accuracy_function()

    def forward_pass(self):
        """
        Predicts a label given an image using convolution layers

        :return: the prediction as a tensor
        """
        filter_1 = tf.Variable(tf.truncated_normal([3, 3, 1, 8], stddev=0.1))
        conv_1 = tf.nn.conv2d(self.image, filter_1, [1, 1, 1, 1], "SAME")

        reshaped = tf.reshape(conv_1, shape=[50, -1])

        L1 = reshaped.shape[1].value
        L2 = 500
        W1 = tf.Variable(tf.random_normal([L1, L2], mean=0, stddev=0.01))
        b1 = tf.Variable(tf.random_normal([L2], mean=0, stddev=0.01))
        relu_1 = tf.nn.relu(tf.matmul(reshaped, W1) + b1)

        W2 = tf.Variable(tf.random_normal([L2, 10], mean=0, stddev=0.01))
        b2 = tf.Variable(tf.random_normal([10], mean=0, stddev=0.01))
        logits = tf.nn.relu(tf.matmul(relu_1, W2) + b2)
        return logits

    def loss_function(self):
        """
        Calculates the model cross-entropy loss

        :return: the loss of the model as a tensor
        """
        loss = tf.losses.softmax_cross_entropy(onehot_labels=self.label, logits=self.prediction)
        return loss

    def optimizer(self):
        """
        Optimizes the model loss using gradient descent (SGD)

        :return: the optimizer as a tensor
        """
        learning_rate = 0.1
        sgd = tf.train.GradientDescentOptimizer(learning_rate)
        train = sgd.minimize(self.loss)
        return train

    def accuracy_function(self):
        """
        Calculates the model's prediction accuracy by comparing
        predictions to correct labels; no need to modify this

        :return: the accuracy of the model as a tensor
        """
        correct_prediction = tf.equal(tf.argmax(self.prediction, 1),
                                      tf.argmax(self.label, 1))
        return tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


def main():
    t_start = time.time()

    mnist = input_data.read_data_sets("data/mnist/", one_hot=True)
    batch_sz = 50
    batch = 2000

    inputs = tf.placeholder(shape=[batch_sz, 28, 28, 1], dtype=tf.float32)
    labels = tf.placeholder(shape=[batch_sz, 10], dtype=tf.float32)

    model = Model(inputs, labels)

    session_config = tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))
    sess = tf.Session(config=session_config)

    # sess = tf.Session()

    sess.run(tf.global_variables_initializer())
    for i in range(batch):
        next_image, next_label = mnist.train.next_batch(batch_sz)
        next_image = next_image.reshape((batch_sz, 28, 28, 1))
        sess.run(model.optimize, feed_dict={inputs: next_image, labels: next_label})

    acc, test_images, test_labels = 0, mnist.test.images, mnist.test.labels
    test_batch = math.ceil(len(test_images) / batch_sz)
    for i in range(test_batch):
        batch_images = test_images[i * batch_sz: (i + 1) * batch_sz]
        batch_images = batch_images.reshape((batch_sz, 28, 28, 1))
        batch_labels = test_labels[i * batch_sz: (i + 1) * batch_sz]
        acc += sess.run(model.accuracy, feed_dict={inputs: batch_images, labels: batch_labels})
    acc /= test_batch
    print(acc)

    print(time.time() - t_start, 'seconds')

    return


if __name__ == '__main__':
    main()
TF 2.0 gpu bug

All 181 comments

I have the same issue with the same GPU: "CUDNN_STATUS_INTERNAL_ERROR".

RTX 2070 GPU
CUDA 10
cuDNN 7.4.2
Ubuntu 18.04
tf-nightly-gpu (r1.13, Jan 13)
Python 3.6.7

2019-01-15 05:01:03.503415: I tensorflow/stream_executor/platform/default/dso_loader.cc:154] successfully opened CUDA library libcublas.so.10.0 locally
2019-01-15 05:01:03.752563: I tensorflow/stream_executor/platform/default/dso_loader.cc:154] successfully opened CUDA library libcudnn.so.7 locally
2019-01-15 05:01:04.905618: E tensorflow/stream_executor/cuda/cuda_dnn.cc:493] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-01-15 05:01:04.908147: E tensorflow/stream_executor/cuda/cuda_dnn.cc:493] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-01-15 05:01:04.908191: W tensorflow/core/framework/op_kernel.cc:1412] OP_REQUIRES failed at conv_ops_fused.cc:801 : Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

๋‚˜๋Š” ๊ฐ™์€ ๋ฌธ์ œ๊ฐ€

RTX2080 GPU
CUDA 10
cudnn 7.4.2

๋‹ค์Œ tf ๋ฒ„์ „ tf-nightly-gpu์™€ ๋งˆ์Šคํ„ฐ (060b6e32ad)์—์„œ ์ž์ฒด ์ปดํŒŒ์ผ ๋œ ๋ฒ„์ „์„ ์‹œ๋„ํ–ˆ์Šต๋‹ˆ๋‹ค.
์ถ”๊ฐ€ ๋””๋ฒ„๊ทธ ์ •๋ณด๋ฅผ ์–ป๊ธฐ ์œ„ํ•ด ๋‹ค์Œ ENVIRONMENT ๋ณ€์ˆ˜๋ฅผ ์„ค์ •ํ•  ์ˆ˜ ์žˆ์Œ์„ ์•Œ์•˜์Šต๋‹ˆ๋‹ค.

CUDNN_LOGINFO_DBG = 1;
CUDNN_LOGDEST_DBG = stdout
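
For anyone who prefers to set these two variables from Python rather than from the shell, here is a minimal sketch. It assumes the variables only need to be present in the process environment before cuDNN is loaded, which in practice means before TensorFlow is imported. The helper name `enable_cudnn_debug_logging` is hypothetical and not part of any library.

```python
import os


def enable_cudnn_debug_logging(dest="stdout"):
    """Set the two cuDNN logging variables quoted above for this process.

    Hypothetical helper: it only writes to os.environ and does not
    import or touch TensorFlow itself.
    """
    os.environ["CUDNN_LOGINFO_DBG"] = "1"
    os.environ["CUDNN_LOGDEST_DBG"] = dest
    return {
        name: os.environ[name]
        for name in ("CUDNN_LOGINFO_DBG", "CUDNN_LOGDEST_DBG")
    }


settings = enable_cudnn_debug_logging()
print(settings)

# Import TensorFlow only after the variables are in place, e.g.:
# import tensorflow as tf
```

Passing a file path as `dest` instead of `stdout` should keep the cuDNN log separate from the TensorFlow log, assuming cuDNN accepts a file name there as its documentation describes.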

Then I get the following error:

I0117 14:11:24.441819 140433563125568 basic_session_run_hooks.py:594] Saving checkpoints for 0 into /tmp/mnist/model.ckpt.
2019-01-17 14:11:25.916269: I tensorflow/stream_executor/platform/default/dso_loader.cc:154] successfully opened CUDA library libcublas.so.10.0 locally

I! CuDNN (v7402) function cudnnCreate() called:
I! Time: 2019-01-17T14:11:26.079184 (0d+0h+0m+0s since start)
I! Process=29255; Thread=29356; GPU=NULL; Handle=NULL; StreamId=NULL.

2019-01-17 14:11:26.079151: I tensorflow/stream_executor/platform/default/dso_loader.cc:154] successfully opened CUDA library libcudnn.so.7 locally

I! CuDNN (v7402) function cudnnCreate() called:
I! Time: 2019-01-17T14:11:26.571897 (0d+0h+0m+0s since start)
I! Process=29255; Thread=29356; GPU=NULL; Handle=NULL; StreamId=NULL.

2019-01-17 14:11:26.571858: E tensorflow/stream_executor/cuda/cuda_dnn.cc:493] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-01-17 14:11:26.579375: E tensorflow/stream_executor/cuda/cuda_dnn.cc:493] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

I! CuDNN (v7402) function cudnnCreate() called:
I! Time: 2019-01-17T14:11:26.579803 (0d+0h+0m+0s since start)
I! Process=29255; Thread=29356; GPU=NULL; Handle=NULL; StreamId=NULL.

2019-01-17 14:11:26.585818: E tensorflow/stream_executor/cuda/cuda_dnn.cc:493] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-01-17 14:11:26.585850: W ./tensorflow/stream_executor/stream.h:2109] attempting to perform DNN operation using StreamExecutor without DNN support
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1335, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1320, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1408, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
  [[{{node Discriminator_1/Conv/Conv2D}}]]
  [[train/discriminator_train/train_op/control_dependency/_569]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/dj/projects/gan/tf_models/research/gan/mnist/train.py", line 151, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "/home/dj/projects/gan/tf_models/research/gan/mnist/train.py", line 147, in main
    get_hooks_fn=tfgan.get_joint_train_hooks())
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/gan/python/train.py", line 1200, in gan_train
    config=config)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/training/python/training/training.py", line 546, in train
    loss = session.run(train_op, run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 693, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1188, in run
    run_metadata=run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1287, in run
    raise six.reraise(*original_exc_info)
  File "/usr/local/lib/python3.6/dist-packages/six.py", line 693, in reraise
    raise value
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1272, in run
    return self._sess.run(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1336, in run
    feed_dict, options)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/monitored_session.py", line 1362, in _call_hook_before_run
    request = hook.before_run(run_context)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/contrib/gan/python/train.py", line 1061, in before_run
    run_context.session.run(self._train_ops)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 930, in run
    run_metadata_ptr)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1153, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1329, in _do_run
    run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py", line 1349, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
  [[node Discriminator_1/Conv/Conv2D (defined at home/dj/projects/gan/tf_models/research/gan/mnist/networks.py:152) ]]
  [[train/discriminator_train/train_op/control_dependency/_569]]

Errors may have originated from an input operation.
Input Source operations connected to node Discriminator_1/Conv/Conv2D:
  inputs/batch/n (defined at home/dj/projects/gan/tf_models/research/gan/mnist/data_provider.py:67)

Original stack trace for 'Discriminator_1/Conv/Conv2D':
  File "home/dj/projects/gan/tf_models/research/gan/mnist/train.py", line 151, in <module>
    tf.app.run()
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "home/dj/projects/gan/tf_models/research/gan/mnist/train.py", line 87, in main
    [FLAGS.batch_size, FLAGS.noise_dims]))
  File "usr/local/lib/python3.6/dist-packages/tensorflow/contrib/gan/python/train.py", line 118, in gan_model
    discriminator_real_outputs = discriminator_fn(real_data, generator_inputs)
  File "home/dj/projects/gan/tf_models/research/gan/mnist/networks.py", line 176, in unconditional_discriminator
    net = _discriminator_helper(img, False, None, weight_decay)
  File "home/dj/projects/gan/tf_models/research/gan/mnist/networks.py", line 152, in _discriminator_helper
    net = layers.conv2d(img, 64, [4, 4], stride=2)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1155, in convolution2d
    conv_dims=2)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/contrib/framework/python/ops/arg_scope.py", line 182, in func_with_args
    return func(*args, **current_args)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/contrib/layers/python/layers/layers.py", line 1058, in convolution
    outputs = layer.apply(inputs)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1228, in apply
    return self.__call__(inputs, *args, **kwargs)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/layers/base.py", line 531, in __call__
    outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 564, in __call__
    outputs = self.call(inputs, *args, **kwargs)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/convolutional.py", line 196, in call
    outputs = self._convolution_op(inputs, self.kernel)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 966, in __call__
    return self.conv_op(inp, filter)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 591, in __call__
    return self.call(inp, filter)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 208, in __call__
    name=self.name)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/nn_ops.py", line 1578, in conv2d
    name=name)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gen_nn_ops.py", line 1040, in conv2d
    data_format=data_format, dilations=dilations, name=name)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 501, in new_func
    return func(*args, **kwargs)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
    op_def=op_def)
  File "usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

Does anyone have any ideas? I am about to reinstall my complete environment :-(

Try compiling r1.13 from source. It takes a long time, but it should fix your problem. At least it fixed mine.

I did try compiling from source, but ran into the same issue. I was finally able to fix my problem by setting config.gpu_options.allow_growth = True.

I've been having the same issue (on an RTX 2060, Ubuntu 18.04, Python 3.6.7, CUDA 10.0.130, cuDNN 7.4.2, Tensorflow 1.13.0-rc0 from source). Thanks to @va-andrew's suggestion I have it working with the allow_growth option set.

FWIW, in the course of searching for solutions to this, it seems that this issue is a common problem with the RTX series (although it might be a general problem with CUDA 10.0, since the new cards don't support the older versions). It would be great if the defaults could be updated in the 1.13 release so that special options don't need to be set for these cards.

I also experienced this under the following configuration:

  • Running the tf benchmarks from https://github.com/tensorflow/benchmarks
  • RTX 2080
  • Ubuntu 18.04
  • CUDA 10.0
  • Nvidia driver 415.27
  • Tensorflow 1.13.0-dev20190125
  • CuDNN 7.4.2
  • Python 3

The Tensorflow Docker GPU containers with any stable release don't work either (they segfault right away rather than report CUDNN_STATUS_INTERNAL_ERROR).

Curiously, everything works fine on Windows 10 with Tensorflow v1.12!

And as others have reported, setting allow_growth allows things to run properly.

Same problem here.

  • RTX 2070
  • Ubuntu 18.04
  • CudNN 7.4.2 (but I have tried compiling with other, older versions with no luck)
  • Tensorflow 1.13.0-dev20190125 (also tried Tensorflow 1.12 compiled with Cuda 10)

And as others have reported, setting allow_growth=TRUE allows things to run.

Closing this issue since it's resolved. Thanks!

@ymodak Could you please reference the PR that fixed this bug?

I have a similar issue with tf-nightly-gpu-2.0-preview on the RTX 2080.

Same issue with an RTX2080. I spent two days recompiling and bug hunting until I found this fix.
(the allow_growth=true thing fixed it)

You made my day

How do you actually set allow_growth=true? I have tf-nightly-gpu-2.0-preview and tried:

import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config, ...)

but I get this error:

AttributeError Traceback (most recent call last)
in <module>()
      1 import tensorflow as tf
----> 2 config = tf.ConfigProto()

AttributeError: module 'tensorflow' has no attribute 'ConfigProto'

How can I set allow_growth in tensorflow 2.0?

OK, I got it working in tf-nightly-gpu-2.0-preview and an ipython notebook:

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)
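
For readers on a released TensorFlow 2.x who would rather not create a compat.v1 session, a possible alternative is the `tf.config.experimental.set_memory_growth` call. This is a different technique from the one posted above, not something confirmed in this thread, and the sketch assumes it runs before any op has initialized the GPUs. The wrapper name `enable_gpu_memory_growth` is hypothetical.

```python
def enable_gpu_memory_growth():
    """Ask TensorFlow 2.x to grow GPU memory on demand.

    Hypothetical wrapper. Returns the names of the GPUs it configured,
    or an empty list when TensorFlow or a GPU is not available.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return []  # TensorFlow is not installed; nothing to configure

    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # Must be called before the GPUs are initialized,
        # otherwise TensorFlow raises a RuntimeError.
        tf.config.experimental.set_memory_growth(gpu, True)
    return [gpu.name for gpu in gpus]


configured = enable_gpu_memory_growth()
print("memory growth enabled on:", configured)
```

Calling this once at the top of a script or notebook, before building any model, is the intended usage; whether it avoids CUDNN_STATUS_INTERNAL_ERROR on a given card is something to verify locally.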

Same issue here; setting gpu_options.allow_growth = True fixed it.

@newhouseb how/where did you set that to true for all the benchmarks? Was it an easy change?

Is a blanket allow growth a solution?

It is turned off by default.
https://www.tensorflow.org/guide/using_gpu#allowing_gpu_memory_growth

In my program memory management is important.

I would like to limit the amount of GPU memory used by TF, because in my graphics application the GPU memory is used for other things, and keeping TF within a limited space is important to prevent out-of-memory errors.

I am working in C++ under Windows.

Adding the allow growth option results in an OOM error.

Without this line of code the model runs fine on the same machine with the same card.

With OOM error

options.config.mutable_gpu_options()->set_allow_growth(true);
options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(fraction);

Without OOM error

//options.config.mutable_gpu_options()->set_allow_growth(true);
options.config.mutable_gpu_options()->set_per_process_gpu_memory_fraction(fraction);

So trying to solve this problem with set allow growth results in a segfault.

@ymodak This bug is not fixed. Arguably, using any sort of convnet should work in the default configuration. Either allow_growth should be true by default, or it should be fixed so this works, or there should be a better error than CUDNN_STATUS_INTERNAL_ERROR.

@ymodak It looks like this issue was closed prematurely. While there is a workaround for this issue, it involves changing application code. As a result, the example code does not work _out of the box_ on RTX cards, and most recipes online will also need modification.

@samhodge can't you prevent the OOM by using config.gpu_options.per_process_gpu_memory_fraction = 0.4, as suggested on the tensorflow documentation page you posted yourself?

I'm confused by this boolean hack to enable tensorflow-gpu on my RTX 2080: will this allow_growth = True be an issue if I only use my GPU for one tensorflow script / jupyter notebook at a time? (in addition to the standard GPU usage for the screen etc.)

I intend to set up a static ML stack on a computer and would like to know whether this will end up in a mess at some point (big grid searches, models with a lot of parameters, etc.). I haven't figured out yet whether I need to build from source to avoid this internal error or just change this boolean.

OK, I think I found the source of my problems. Before creating my session I measure the free GPU RAM, so if I am on an 8Gb card and 6Gb are free I use a fraction of 0.75. Occasionally that ends in an OOM, but recently I have been experimenting with 0.95 * 0.75 and have yet to see an OOM. So if you push the space allocated to Tensorflow to the limit, it sometimes clashes. Obviously, if the inputs and outputs of an individual Op don't fit it will OOM, but I measure against this and use the GPU or CPU depending on which fits.
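
The arithmetic described in this comment can be sketched as a small helper. This is an illustration only: the function name `safe_memory_fraction` is hypothetical, the 0.95 safety factor is the value mentioned above, and the free and total byte counts are assumed to come from whatever measurement is available (the comment does not say how they were obtained).

```python
def safe_memory_fraction(free_bytes, total_bytes, safety=0.95):
    """Return a per-process GPU memory fraction with some headroom.

    Hypothetical helper: scales the free/total ratio by a safety
    factor so the framework does not claim every free byte.
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    # Clamp the measurement so the result always stays within [0, safety].
    free_bytes = min(max(free_bytes, 0), total_bytes)
    return safety * free_bytes / total_bytes


GIB = 1024 ** 3
# The example from the comment: an 8 GB card with 6 GB free.
fraction = safe_memory_fraction(6 * GIB, 8 * GIB)
print(round(fraction, 4))  # 0.7125, i.e. 0.95 * 0.75

# With the TF1-style config used earlier in this thread, the value would be
# passed as: config.gpu_options.per_process_gpu_memory_fraction = fraction
```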

@samhodge great, so in the end the allow_growth boolean hack does provide a solution, as long as no major GPU operations are launched in parallel and what tensorflow processes _at a time_ (batch size could be critical) does not overflow the memory provided by the GPU...?

Everything uses the GPU, even your browser.

I'm running into the same problem on a GTX 1050 using tensorflow-gpu 1.13.1 from pip with CUDA 10.0 / cuDNN 7.4.2.24 / Nvidia driver 410 / Ubuntu 16.04.

I still have the same issue here, but "config.gpu_options.allow_growth = True" doesn't fix the problem. It happens on both TF-gpu 1.14.1 and TF-gpu 2.0. RTX1070, CUDA 10.0, Ubuntu 18.04, Nvidia driver 430.09.

The descriptions of the problems you are seeing make me believe that (particular versions of) cuDNN try to allocate GPU memory when creating the handle. If TensorFlow has already taken all the memory (either because config.gpu_options.allow_growth = false, or because per_process_gpu_memory_fraction is close to 1.0), there is no memory left to allocate for cuDNN.

You could confirm this by running TensorFlow through nvprof and generating an API trace to inspect the failing cuMemAlloc call.

Issue #6698 seems to discuss the same problem. Some people noticed that they had accidentally used a cuDNN release that doesn't match their CUDA version. Could you please verify that you are using cuDNN for CUDA 10 when running with CUDA 10?
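
One rough way to do this check for conda-installed packages is to compare the CUDA version embedded in the package names. The sketch below assumes conda-style names such as the `cudatoolkit-10.0.130-0` and `cudnn-7.3.1-cuda10.0.0_0` that appear later in this thread; the function name and the second, mismatching package name are hypothetical. It only inspects names and says nothing about which libraries TensorFlow actually loads at run time.

```python
import re


def cudnn_matches_cuda(cudnn_pkg, cudatoolkit_pkg):
    """Return True if a conda-style cuDNN package name targets the
    same CUDA major.minor version as the cudatoolkit package name.

    Hypothetical helper based purely on package naming.
    """
    cudnn = re.search(r"cuda(\d+)\.(\d+)", cudnn_pkg)
    toolkit = re.match(r"cudatoolkit-(\d+)\.(\d+)", cudatoolkit_pkg)
    if not cudnn or not toolkit:
        raise ValueError("unrecognized package name")
    return cudnn.groups() == toolkit.groups()


# Names as they appear in this thread: both target CUDA 10.0.
print(cudnn_matches_cuda("cudnn-7.3.1-cuda10.0.0_0", "cudatoolkit-10.0.130-0"))

# Hypothetical mismatch: a cuDNN build for CUDA 9.0 next to a CUDA 10.0 toolkit.
print(cudnn_matches_cuda("cudnn-7.3.1-cuda9.0_0", "cudatoolkit-10.0.130-0"))
```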

It turns out I didn't have cuDNN installed correctly because I'm a fool. I got that in, reinstalled the TF2 nightly, added the lines to allow growth, and all is good.

How do I delete cudatoolkit and cudnn from Conda?

The cudnn included with (or embedded in) Anaconda gives the error below, so I want to remove the conda-installed cudatoolkit and cudnn and install the standalone CUDA and cudnn from Nvidia's website.

Error: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

However, I cannot remove them with the following commands.
conda remove --name cuda --all
conda remove --name cudnn --all

I see two entries, cudatoolkit-10.0.130-0 and cudnn-7.3.1-cuda10.0.0_0, in the following paths.

/home/anaconda3/pkgs/cudatoolkit-10.0.130-0
/home/anaconda3/pkgs/cudnn-7.3.1-cuda10.0.0_0

How can I delete (or remove) the cuda and cudnn that are included with (or embedded in) Anaconda?

Thanks in advance,

Mike

@mikechen66 What is the output of conda? It might be because other packages depend on cuda and cudnn. Why do you want to delete them in the first place? If you want a custom environment, use miniconda rather than anaconda. Miniconda only comes with conda, and you need to install all the packages you need manually.

Hi tydlwav:

Thanks for your feedback. After checking the version compatibility and release dates of the core libraries, I installed the related development environment, ran the simple MNIST test code, and got the output below.

I think that Anaconda3 cannot even support the core libraries of cudnn and TensorFlow, so it is a big issue with Anaconda3. I therefore want to delete the lightweight cudnn libraries from Anaconda and use the standalone, more capable Nvidia cuda and cudnn libraries to run the test code. Please give me some suggestions.

  1. Installation environment

Nvidia GeForce RTX 2060
Graphics driver: NVIDIA-Linux-x86_64-415.27 (Jan 15, 2019)
First version that supports the RTX 2060
Anaconda3: Anaconda3-2019.03-Linux-x86_64.sh (2019.04-04)
-cudatoolkit-10.0.130-0
-cudnn-7.3.1-cuda10.0.0_0
-TensorFlow 13.1
-Jupyter notebook and ipykernel
--installed by default with Anaconda3

  2. MNIST test code:

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Flatten, MaxPooling2D, Conv2D
from keras.callbacks import TensorBoard

(X_train, y_train), (X_test, y_test) = mnist.load_data()

X_train = X_train.reshape(60000, 28, 28, 1).astype('float32')
X_test = X_test.reshape(10000, 28, 28, 1).astype('float32')

X_train /= 255
X_test /= 255

n_classes = 10
y_train = keras.utils.to_categorical(y_train, n_classes)
y_test = keras.utils.to_categorical(y_test, n_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(n_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

tensor_board = TensorBoard('./logs/LeNet-MNIST-1')

model.fit(X_train, y_train, batch_size=128, epochs=15, verbose=1,
          validation_data=(X_test, y_test), callbacks=[tensor_board])

  3. Output:

Using TensorFlow backend.

WARNING:tensorflow:From
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From
Instructions for updating:
Please use rate instead of keep_prob. Rate should be set to rate = 1 - keep_prob.
WARNING:tensorflow:From /home/mike/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/tensorflow/python/ops/math_ops.py:3066: to_int32 (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.cast instead.
Train on 60000 samples, validate on 10000 samples
Epoch 1/15

UnknownError                              Traceback (most recent call last)
in
     34
     35 model.fit(X_train, y_train, batch_size=128, epochs=15, verbose=1,
---> 36           validation_data=(X_test, y_test), callbacks=[tensor_board])

~/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/keras/engine/training.py in fit(self, x, y, batch_size, epochs, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, initial_epoch, steps_per_epoch, validation_steps, **kwargs)
   1037                                         initial_epoch=initial_epoch,
   1038                                         steps_per_epoch=steps_per_epoch,
-> 1039                                         validation_steps=validation_steps)
   1040
   1041     def evaluate(self, x=None, y=None,

~/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/keras/engine/training_arrays.py in fit_loop(model, f, ins, out_labels, batch_size, epochs, verbose, callbacks, val_f, val_ins, shuffle, callback_metrics, initial_epoch, steps_per_epoch, validation_steps)
    197                     ins_batch[i] = ins_batch[i].toarray()
    198
--> 199                 outs = f(ins_batch)
    200                 outs = to_list(outs)
    201                 for l, o in zip(out_labels, outs):

~/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py in __call__(self, inputs)
   2713                 return self._legacy_call(inputs)
   2714
-> 2715             return self._call(inputs)
   2716         else:
   2717             if py_any(is_tensor(x) for x in inputs):

~/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/keras/backend/tensorflow_backend.py in _call(self, inputs)
   2673             fetched = self._callable_fn(*array_vals, run_metadata=self.run_metadata)
   2674         else:
-> 2675             fetched = self._callable_fn(*array_vals)
   2676         return fetched[:len(self.outputs)]
   2677

~/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/tensorflow/python/client/session.py in __call__(self, *args, **kwargs)
   1437           ret = tf_session.TF_SessionRunCallable(
   1438               self._session._session, self._handle, args, status,
-> 1439               run_metadata_ptr)
   1440         if run_metadata:
   1441

~/anaconda3/envs/tf-gpu/lib/python3.7/site-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    526             None, None,
    527             compat.as_text(c_api.TF_Message(self.status.status)),
--> 528             c_api.TF_GetCode(self.status.status))
    529     # Delete the underlying status object from memory otherwise it stays alive
    530     # as there is a reference to status from this from the traceback due to

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv2d_1/convolution}}]]
[[{{node metrics/acc/Mean}}]]

Hello tydlwav:

I used the following command to remove both cuda and cudnn. However, both libraries are still present in Anaconda3, even though they no longer work. I think Anaconda3 tries to protect its core libraries from being removed. This may be a core feature of Continuum's, although it seems buggy. I will try either an independent Nvidia cuda (without nvcc) and cudnn, or a fresh cuda or cudnn installed with conda.

Uninstall command:

conda remove cudatoolkit

Collecting package metadata: done
Solving environment: done

Package Plan

environment location: /home/mike/anaconda3/envs/tf-gpu

removed specs:
- cudatoolkit

The following packages will be REMOVED:

cudatoolkit-10.0.130-0
cudnn-7.3.1-cuda10.0_0
cupti-10.0.130-0
keras-2.2.4-0
tensorflow-1.13.1-gpu_py37hc158e3b_0
tensorflow-base-1.13.1-gpu_py37h8d69cac_0
tensorflow-gpu-1.13.1-h0d30ee6_0

Proceed ([y]/n)? y

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Notes:

After I removed both, Jupyter Notebook showed "No module named 'tensorflow'", which means the uninstallation succeeded. However, both cudatoolkit and cudnn are still found in Anaconda3, although they do not work.

/home/anaconda3/pkgs/cudatoolkit-10.0.130-0
/home/anaconda3/pkgs/cudnn-7.3.1-cuda10.0.0_0

You have already removed them. The files in pkgs are for installation; they are a cache of packages downloaded for installing. Also, this is not the place to discuss conda environment problems. It is not relevant to this issue. You may want to try Stack Overflow.

I'm a little confused about the state of this issue. I'm using an RTX 2080, cuda 10.1, cudnn v7.5.1.10 and tensorflow 1.14.

Using the allow growth workaround works, but maybe I have a different version mismatch?

Will there be a fix for this in tensorflow 1.14?

Thank you

Thanks. There is a compatibility issue among the RTX 20XX Turing series, TensorFlow and Anaconda. The RTX 20XX series supports cudnn 7.5.0, TensorFlow only supports cudnn 7.4, and Anaconda ships a streamlined 7.3.1, so there is a total mismatch among the three vendors. In addition, the RTX 20XX series has a big compatibility issue with Ubuntu 16.04 LTS. Sometimes Ubuntu 16.04 crashed, and I had to bring two bootable USB sticks to reinstall the OS. So I upgraded two PCs to Ubuntu 18.04 LTS and installed Miniconda. Next I will try a higher version of Tensorflow.

Notes:

Nvidia has its own customized Ubuntu 18.04 LTS for the Jetson TX1/TX2 and Jetson Nano mobile GPU platforms. Nvidia seems to target new products such as the RTX 20XX series at Ubuntu 18.04 LTS rather than the older Ubuntu 16.04. However, I do not know whether Continuum has an upgrade plan for the Nvidia RTX 20XX Turing series.

The RTX series is well supported as of now. I have used tf with an RTX 2070 through a conda environment on a non-Ubuntu distribution. That should be the worst-case scenario, and it still works fine. Cuda and cudnn are backward compatible, so using newer versions should not be a problem. You should simply create a new Python 3.6 environment with conda create -n tf python==3.6.8 and then run conda install tensorflow-gpu.

That is great. I have compiled from source and had clients working with Tensorflow 1.12.0, CUDA 10.0 and CUDNN 7.4.2.24 on most hardware, but I have had issues with a handful of clients who have RTX cards when running a CNN with cudnn on the GPU. I may have accidentally packaged the wrong CUDNN, the one for CUDA 9.0; the file names are identical.

Can anyone confirm that these versions work on the RTX2080 and other Turing-based cards?

Hello tydlwav:

Following your suggestion, I installed Miniconda and the related python and tensorflow environment. I still get the error: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize.......
Please help me find a solution.

Here are the steps I followed.

  1. Install Python 3.6.8 as you instructed.
    conda create -n tf python==3.6.8

  2. Activate tf
    conda activate tf

  3. Install tensorflow-gpu in the tf environment as you instructed.
    conda install tensorflow-gpu

The installed packages include cudatoolkit and cudnn, as follows.
....................................................................................................
cudatoolkit    pkgs/main/linux-64::cudatoolkit-10.0.130-0
cudnn          pkgs/main/linux-64::cudnn-7.3.1-cuda10.0_0
....................................................................................................

  4. Install Jupyter Notebook, ipykernel and the related environment.

1). Install Jupyter Notebook
conda install jupyter notebook

2). Install ipykernel for Jupyter Notebook
conda install ipykernel jupyter

3). Create the TensorFlow-GPU kernel for Jupyter Notebook
python -m ipykernel install --user --name tf-gpu --display-name "TensorFlow-GPU"

  5. Open Jupyter Notebook

1). Start Jupyter Notebook from the command line
jupyter notebook

2). Click TensorFlow-GPU
Clicking the "New" menu on the web page and selecting "TensorFlow-GPU" opens a cell in the Jupyter Notebook web page. The page address is:
http://localhost:8888/notebooks/Untitled3.ipynb?kernel_name=tf-gpu

  6. Paste and run the simple MNIST test code

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Flatten, MaxPooling2D, Conv2D
from keras.callbacks import TensorBoard

(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Reshape to (samples, height, width, channels) and scale pixels to [0, 1]
X_train = X_train.reshape(60000, 28, 28, 1).astype('float32')
X_test = X_test.reshape(10000, 28, 28, 1).astype('float32')

X_train /= 255
X_test /= 255

# One-hot encode the digit labels
n_classes = 10
y_train = keras.utils.to_categorical(y_train, n_classes)
y_test = keras.utils.to_categorical(y_test, n_classes)

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(n_classes, activation='softmax'))

model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

tensor_board = TensorBoard('./logs/LeNet-MNIST-1')

model.fit(X_train, y_train, batch_size=128, epochs=15, verbose=1,
          validation_data=(X_test, y_test), callbacks=[tensor_board])

  7. Same error as in the previous message:

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv2d_1/convolution}}]]
[[{{node metrics/acc/Mean}}]]

Thanks,

Mike

Hi tydlwav:

By the way, I also installed keras with the following command.
conda install keras-gpu

The installation is correct, yet I still get the error. So I assume it is a version compatibility issue between Miniconda and the RTX20XX Turing series. The error is the same as with Anaconda. I found that the cudnn and cuda versions in Miniconda and Anaconda are the same.

That is fairly interesting. I got cuda 10 and cudnn7.3 with conda about a month and a half ago. I have not used tensorflow since then. If it does not work for you, you can build from source. That always works for me. If you are just starting out, I would recommend using pytorch. You would have a much easier time installing it and getting things working.

Hello tydlwav:

I will try other options such as pytorch. Now that Google has released tensorflow-gpu 1.14, can I use Miniconda to install a standalone tensorflow-gpu 1.14 from the Google Tensorflow website, as follows?

Google tensorflow: https://www.tensorflow.org/install/source

Notes:

Conda only has tensorflow-gpu builds from 1.0.1 to 1.13.1, as listed below. The builds are too old to keep up with the official Google TensorFlow and the official Nvidia GeForce RTX 20XX (2060~2080) Turing series.

Command:
conda search tensorflow-gpu

Loading channels: done

Name             Version   Build                     Channel
tensorflow-gpu   1.0.1     py27_4                    pkgs/free
tensorflow-gpu   1.0.1     py35_4                    pkgs/free
tensorflow-gpu   1.0.1     py36_4                    pkgs/free
tensorflow-gpu   1.1.0     np111py27_0               pkgs/free
tensorflow-gpu   1.1.0     np111py35_0               pkgs/free
tensorflow-gpu   1.1.0     np111py36_0               pkgs/free
tensorflow-gpu   1.1.0     np112py27_0               pkgs/free
tensorflow-gpu   1.1.0     np112py35_0               pkgs/free
tensorflow-gpu   1.1.0     np112py36_0               pkgs/free
tensorflow-gpu   1.2.1     py27cuda7.5cudnn5.1_0     pkgs/free
tensorflow-gpu   1.2.1     py27cuda7.5cudnn6.0_0     pkgs/free
tensorflow-gpu   1.2.1     py27cuda8.0cudnn5.1_0     pkgs/free
tensorflow-gpu   1.2.1     py27cuda8.0cudnn6.0_0     pkgs/free
tensorflow-gpu   1.2.1     py35cuda7.5cudnn5.1_0     pkgs/free
tensorflow-gpu   1.2.1     py35cuda7.5cudnn6.0_0     pkgs/free
tensorflow-gpu   1.2.1     py35cuda8.0cudnn5.1_0     pkgs/free
tensorflow-gpu   1.2.1     py35cuda8.0cudnn6.0_0     pkgs/free
tensorflow-gpu   1.2.1     py36cuda7.5cudnn5.1_0     pkgs/free
tensorflow-gpu   1.2.1     py36cuda7.5cudnn6.0_0     pkgs/free
tensorflow-gpu   1.2.1     py36cuda8.0cudnn5.1_0     pkgs/free
tensorflow-gpu   1.2.1     py36cuda8.0cudnn6.0_0     pkgs/free
tensorflow-gpu   1.3.0     0                         pkgs/free
tensorflow-gpu   1.4.1     0                         pkgs/main
tensorflow-gpu   1.5.0     0                         pkgs/main
tensorflow-gpu   1.6.0     0                         pkgs/main
tensorflow-gpu   1.7.0     0                         pkgs/main
tensorflow-gpu   1.8.0     h7b35bdc_0                pkgs/main
tensorflow-gpu   1.9.0     hf154084_0                pkgs/main
tensorflow-gpu   1.10.0    hf154084_0                pkgs/main
tensorflow-gpu   1.11.0    h0d30ee6_0                pkgs/main
tensorflow-gpu   1.12.0    h0d30ee6_0                pkgs/main
tensorflow-gpu   1.13.1    h0d30ee6_0                pkgs/main

They are not too old, since I have used conda's tf 1.12 release with an RTX 2070. New hardware is usually backward compatible, and RTX is no different. Most likely some odd environment issue is at play. I will not have access to an RTX machine until July, so I cannot help with testing right now. Building from source should solve your problem. I have never failed to run a convnet on tf built from source (assuming the build was configured correctly).

Once again, this is not the right place to discuss tensorflow distribution problems. You can post on Stack Overflow or reddit and link to it here. More people will see it and be able to help you that way.

Your problem is not a bug, and it is not what this issue is discussing.

@chsigg Your diagnosis that this is a problem with CUDNN trying to allocate GPU memory resources that tensorflow has already allocated seems right to me. Simply setting per_process_gpu_memory_fraction=0.9 instead of 0.95 was enough to resolve my issue.
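
A minimal sketch of the setting described above, assuming the TensorFlow 1.x session API (`tf.ConfigProto`, `tf.Session`) used elsewhere in this thread. The helper name `make_gpu_config` is hypothetical, and 0.9 is simply the value reported to work in the comment above; a suitable fraction probably depends on the card and on what else is using its memory.

```python
def make_gpu_config(memory_fraction=0.9):
    """Return a TF 1.x ConfigProto that caps TensorFlow's share of GPU memory.

    `make_gpu_config` is a hypothetical helper, not part of TensorFlow.
    """
    if not 0.0 < memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be in (0, 1], got %r" % (memory_fraction,))

    # Imported lazily so the argument check works without TensorFlow installed.
    import tensorflow as tf  # assumption: TensorFlow 1.x

    config = tf.ConfigProto()
    # TensorFlow's allocator will not claim more than this fraction of each GPU.
    config.gpu_options.per_process_gpu_memory_fraction = memory_fraction
    return config


# Usage (TensorFlow 1.x):
#   import tensorflow as tf
#   sess = tf.Session(config=make_gpu_config(0.9))
```

Lowering the fraction trades some memory available to TensorFlow for headroom outside its allocator, which is what the comment above reports as sufficient.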

I was also facing this problem. I fixed it by updating cuDNN to version 7.6.

tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above

Tensorflow-gpu: 1.13.1
Cuda: 10.0
CuDNN: 7.3.1

Also, conda had installed tensorflow and CuDNN.
conda list cudnn

cudnn                     7.3.1                cuda10.0_0    anaconda

What I did:

  1. Uninstalled conda tensorflow.
    conda remove tensorflow
  2. Uninstalled conda cuDNN.
    conda remove cudnn
  3. Installed tensorflow with pip.
    pip install tensorflow
  4. Downloaded the corresponding cuDNN 7.6 runtime deb file from https://developer.nvidia.com/cudnn
  5. Installed it with sudo dpkg -i libcudnn_xxxxx_amd64.deb

@nluehr Any comments? Can we make MinSystemMemory() aware of the cuda/cudnn version?

It is a legitimate memory error. If you are using tf.keras, do the following at the top of your file:
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
tf.keras.backend.set_session(tf.Session(config=config))

I ran into this issue as well and was able to solve it using @va-andrew's solution. Specifically, I used @colinsteidtmann's implementation, since I use some of the tensorflow.keras functions in my code. I spent a long time debugging this problem, so thank you both for your contributions.

EDIT: I was just looking at the tensorflow documentation (https://www.tensorflow.org/guide/using_gpu), and you can also tell it to allow memory growth by setting the environment variable TF_FORCE_GPU_ALLOW_GROWTH to true. It also says that this configuration is platform specific, so YMMV (it works for me on Ubuntu 18.04).

For reference, I am running:
Ubuntu 18.04.2 LTS, Gigabyte GeForce RTX 2080 Turbo, NVIDIA driver 430.26, CUDA 10.0.130, cuDNN 7.4.2.24, tensorflow-gpu 1.13.1, python 3.6. I run tensorflow inside a virtual environment, using spyder 3.3.4.

I have a second computer with exactly the same hardware. I set it up following the same instructions and installed from the same files, and I hit this issue on that machine as well. No surprise there.

I have a third computer with the same hardware, except that it has a 2080Ti instead of a 2080. I set it up following the same instructions and again installed from the same files. This time, however, there was no issue.

So I am led to believe that this is not related to some conflict between the CUDA, cuDNN and driver versions, and that it is not an incorrect installation. Rather, it is related to the video card model. I have only seen this issue mentioned for the RTX 2060, 2070 and 2080.

Fortunately, using the workaround is not a big inconvenience.
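
As a small illustration of the environment-variable route mentioned in the edit above, the variable can also be set from inside the Python script. This is only a sketch: it assumes the variable has to be in place before TensorFlow initializes the GPU, so it is set before the import to be safe.

```python
import os

# Set the variable before TensorFlow is imported, so it is already in place
# when the GPU devices are initialized.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

# import tensorflow as tf  # import TensorFlow only after this point
```

Setting it in the shell before launching Python should have the same effect and avoids touching the code at all.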

I was also facing this problem. I fixed it by updating cuDNN to version 7.6.

tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above

Tensorflow: 1.13.1
Cuda: 10.0
CuDNN: 7.3.1

Also, conda had installed tensorflow and CuDNN.
conda list cudnn

cudnn                     7.3.1                cuda10.0_0    anaconda

What I did:

1. Uninstalled conda tensorflow.
   `conda remove tensorflow`

2. Uninstall conda cuDNN
   `conda remove cudnn`

3. Install tensorflow with pip
   `pip install tensorflow`

4. Download corresponding cuDNN 7.6 runtime deb file from https://developer.nvidia.com/cudnn

5. Install it with `sudo dpkg -i libcudnn7_-1+cuda9.0_amd64.deb`

@alexforever86 After your update, are you sure you are running on the GPU and not on the CPU? It seems you were using the GPU before the update (given the error message referencing cuDNN), but I wonder about afterwards. You used "pip install tensorflow", but shouldn't it be "pip install tensorflow-gpu"? Also, you said you are using CUDA 10, but the cuDNN deb file you listed is for cuda9.0, so that should not work.

So it may be that you are not actually using the GPU, in which case this is not proof that updating to cuDNN 7.6 resolves the issue.

@synapse8 You are absolutely right about tensorflow-gpu and the cuDNN version. I am also quite confused by my own comment now, and I no longer remember the details. Anyway, below are the current versions on my system.
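
One way to check the point raised above, whether TensorFlow is actually using a GPU after the update, is sketched below. It assumes `tf.test.is_gpu_available()`, which exists in TensorFlow 1.x; the wrapper name `tensorflow_sees_gpu` is hypothetical.

```python
def tensorflow_sees_gpu():
    """Return True/False if TensorFlow can be imported, or None if it is not installed.

    `tensorflow_sees_gpu` is a hypothetical helper, not part of TensorFlow.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # With the CPU-only `tensorflow` pip package this is expected to be False.
    return bool(tf.test.is_gpu_available())


print("GPU visible to TensorFlow:", tensorflow_sees_gpu())
```

A result of False after switching from the conda tensorflow-gpu package to `pip install tensorflow` would be consistent with the suspicion that the model silently fell back to the CPU.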

pip show tensorflow-gpu
Name: tensorflow-gpu
Version: 1.13.1

nvidia-smi
NVIDIA-SMI 430.26 Driver Version: 430.26 CUDA Version: 10.2

sudo apt search cudnn | grep installed
libcudnn7/now 7.6.0.64-1+cuda10.0 amd64
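
Since an earlier comment mixed up a cuda9.0 cuDNN package with a CUDA 10 install, a small check of the package version string can help. The sketch below assumes the apt version format shown above (cuDNN version, package revision, then `+cuda` and the CUDA version); `parse_libcudnn_version` is a hypothetical helper.

```python
import re


def parse_libcudnn_version(version):
    """Split an apt version such as '7.6.0.64-1+cuda10.0' into (cudnn, cuda)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)-\d+\+cuda(\d+(?:\.\d+)*)", version)
    if match is None:
        raise ValueError("unrecognized libcudnn version string: %r" % (version,))
    return match.group(1), match.group(2)


cudnn_version, cuda_version = parse_libcudnn_version("7.6.0.64-1+cuda10.0")
print("cuDNN", cudnn_version, "built for CUDA", cuda_version)
```

Comparing the second value with the installed CUDA toolkit version (10.0 in this thread) shows whether the cuDNN package matches.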

@alexforever86 With the configuration you just mentioned, do you still see this issue? (I assume it works for you.) I recently set up a system with cuda10, the 410 driver, cudnn 7.6 and TF-gpu 1.14 (pip install) and have not seen the issue.

@robzor92 I have been using tensorflow-gpu 1.13, and out of curiosity I just installed 1.14 to test whether it resolved the issue. I still get the error and still have to use the 'allow growth' workaround (again, not that big a deal).

What video card are you using?

@synapse8 I tried it with a GTX 1070.

@synapse8 I also just tried the sample code provided by the creator of this thread, and it worked without problems. However, I would not claim that this is only a problem of the RTX line, since I saw the same problem on a GTX 1050Ti with TF 1.13.1, using the same driver/cuda/cudnn combination I posted before.

@robzor92 I suspect the 1050Ti's problem is its small VRAM size. The RTX cards hit this on basic CNN MNIST models. I suspect that NVIDIA's tweaking of VRAM allocation on the RTX cards is somehow messing things up.

I have the same error with tensorflow 1.14.0 and an RTX2080. In my case, however, the error occurs only when I use a convolution layer.

2019-07-14 21:48:13.041683: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-14 21:48:13.064262: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3600000000 Hz
2019-07-14 21:48:13.064955: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55abe99bcd30 executing computations on platform Host. Devices:
2019-07-14 21:48:13.064967: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2019-07-14 21:48:13.066219: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2019-07-14 21:48:13.153748: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-14 21:48:13.154195: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55abebb44f00 executing computations on platform CUDA. Devices:
2019-07-14 21:48:13.154207: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): GeForce RTX 2080, Compute Capability 7.5
2019-07-14 21:48:13.154317: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-14 21:48:13.154707: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties: 
name: GeForce RTX 2080 major: 7 minor: 5 memoryClockRate(GHz): 1.71
pciBusID: 0000:01:00.0
2019-07-14 21:48:13.154845: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-07-14 21:48:13.155504: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-07-14 21:48:13.156112: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-07-14 21:48:13.156265: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-07-14 21:48:13.157040: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-07-14 21:48:13.157646: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-07-14 21:48:13.159661: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-07-14 21:48:13.159730: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-14 21:48:13.160165: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-14 21:48:13.160542: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-07-14 21:48:13.160559: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-07-14 21:48:13.161120: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-14 21:48:13.161129: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187]      0 
2019-07-14 21:48:13.161133: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0:   N 
2019-07-14 21:48:13.161331: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-14 21:48:13.161730: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1005] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-14 21:48:13.162120: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6794 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080, pci bus id: 0000:01:00.0, compute capability: 7.5)
2019-07-14 21:48:13.497639: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-07-14 21:48:14.077729: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-07-14 21:48:14.080055: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Traceback (most recent call last):
  File "test.py", line 16, in <module>
    print(model.predict(test_inputs))
  File "/home/yudai/.local/share/virtualenvs/pipenv_practice-DKmRVcs4/lib/python3.7/site-packages/tensorflow/python/keras/engine/training.py", line 1078, in predict
    callbacks=callbacks)
  File "/home/yudai/.local/share/virtualenvs/pipenv_practice-DKmRVcs4/lib/python3.7/site-packages/tensorflow/python/keras/engine/training_arrays.py", line 363, in model_iteration
    batch_outs = f(ins_batch)
  File "/home/yudai/.local/share/virtualenvs/pipenv_practice-DKmRVcs4/lib/python3.7/site-packages/tensorflow/python/keras/backend.py", line 3292, in __call__
    run_metadata=self.run_metadata)
  File "/home/yudai/.local/share/virtualenvs/pipenv_practice-DKmRVcs4/lib/python3.7/site-packages/tensorflow/python/client/session.py", line 1458, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node conv2d/Conv2D}}]]
  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node conv2d/Conv2D}}]]
     [[flatten/Reshape/_7]]
0 successful operations.
0 derived errors ignored.

I tried config.gpu_options.allow_growth = True, but it does not solve this error.

I would appreciate any help.

Thank you.

Same problem with an RTX 2070.

I made an interesting observation that might help track down this error or find a viable solution.
I also get the error Failed to get convolution algorithm, with a reference to Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR.
System: laptop with an Nvidia Quadro P2000, Ubuntu 18.04, tf 1.13.1, cuda10, cudnn 7.4.2
As mentioned, I can run the program smoothly using allow_growth, so thanks for that.

Interestingly, I get this error only when using tf.layers.conv..., but after switching to tf.keras.layers.... the program runs without allow_growth, so something in the keras code seems to work better than in the tf code. Maybe somebody can use this information to track down a solution from keras.
Sadly, I am sticking with tf.layers for now, because they provide easy weight sharing through variable scopes, which keras does not support.

@DavidS3141 That's interesting.

When I use pytorch, torch.cuda.is_available is True and I can use convolution layers without any trouble, so I believe the cause is tensorflow, but I do not know what is wrong.

I agree with @Hayashi-Yudai. The same is true of MXNet: the identical configuration works fine where Tensorflow fails.

Environment:
RTX2080
Ubuntu 18.10
Driver 430.26
CUDA 10.0 (also 10.1, which is not yet supported by TF)
cuDNN 7.6.1
mxnet-cu100 1.4.1
tensorflow-gpu 1.14.0

Hello, I am using the weights of a model pre-trained on the COCO dataset with a ResNet50 backbone to train on my CSV dataset. I am getting this error: Unknown: Failed to get convolution algorithm.
--batch-size 7 --steps 9 --epochs 4
--snapshot-path snapshot --tensorboard-dir tensorboard
csv dataset/train.csv dataset/classes.csv

I tried to resolve the problem by running the following script from the command line in the virtual environment:
python

import tensorflow

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession
config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

as well as
import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config)

but it did not resolve my error.

I am using:
Ubuntu 16.0
Cuda: 10.0
Tensorflow 1.14.0

Error:
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
(0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv1/convolution}}]]
[[loss/add/_2377]]
(1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
[[{{node conv1/convolution}}]]
0 successful operations.
0 derived errors ignored.
terminate called without an active exception
Aborted (core dumped)
Any help would be appreciated.

Same problem here. The allow_growth workaround works. Otherwise I get this error on the most basic MNIST tensorflow dataset.

RTX2060 mobile here.

The problem occurs with tensorflow compiled from the r2.0 branch as well as with TF 1.4 installed via conda (tensorflow-gpu).

@Hayashi-Yudai

I tried config.gpu_options.allow_growth = True, but it does not solve this error.

What were the exact commands you added to your code? If they were different, try the following instead ...

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
tf.keras.backend.set_session(tf.Session(config=config))

@synapse8 Thank you for your comment. I tried it, but the result was the same.

By the way, I tried nvidia-docker and it went well, except that the Python version is 3.5.
https://docs.nvidia.com/deeplearning/frameworks/tensorflow-release-notes/running.html#running

Additional information: if you do not mind using python 3.6.8 and tensorflow-gpu 1.12.0, you can use anaconda.

conda create -n <virtual env name> python=3.6.8
conda install tensorflow-gpu==1.12.0
conda install cudnn==7.3.1    # By default, cudnn7.6 is installed but it causes the error

I tested a tf-2.0.0-beta1 build from source with CUDA 10.1 and cuDNN 7.6.2.4, and the error does not appear.

You can find Docker images for building a tf-gpu package and a tf-base package here:
https://github.com/edowson/docker-tensorflow

The Anaconda channels do not have cudnn==7.6.2 at the time of writing this comment.

Windows 7: while trying to set up a new machine, I banged my head against the wall over Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR for quite a while.

Reinstalls and many other suggestions from this and other threads did not fix it.

To test whether a missing cudnn64_7.dll would cause a different error than CUDNN_STATUS_INTERNAL_ERROR, I renamed the DLL. After confirming that the error was instead a CUDNN NOT INSTALLED type of error, I undid the file name change.

Magically, everything started working.

I have no idea why or how, but it does. I hope this helps someone else. If not, it only takes a few seconds to try.

I ran into this issue because I had mistakenly called tf.Session twice:

sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

# several lines of code later...

sess = tf.Session(config=config)

This is probably not the root cause for most people, but it is worth checking.

Just to share: "allow_growth = True" solves the issue on my system below.
RTX 2080ti, ubuntu18.04, cuda9.0, cudnn7, tf1.9

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config)

It has to do with the fraction of memory available for loading GPU resources to create the cudnn handle, also known as per_process_gpu_memory_fraction.
Reducing this memory fraction yourself will solve the error.

> sess_config = tf.ConfigProto(gpu_options =
> tf.GPUOptions(per_process_gpu_memory_fraction=0.7),
> allow_soft_placement = True)
> 
> with tf.Session(config=sess_config) as sess:
>      sess.run([whatever])

Use as small a fraction as will fit in your memory. (In the code I use 0.7; you can start with 0.3 or even smaller and then increase it until you get the same error. That is your limit.)
Pass it to your tf.Session(), tf.train.MonitoredTrainingSession(), or the Supervisor's sv.managed_session() as config.

This should allow your GPU to create a cudnn handle for your TensorFlow code.
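The "start small, then increase until it fails" procedure described above can be written as a short search loop. The sketch below is only an illustration: `largest_working_fraction` and the `works` callback are hypothetical names, and `works(fraction)` is assumed to launch the training job in a fresh process with `per_process_gpu_memory_fraction` set to that value and to report whether it ran without the cuDNN error.

```python
def largest_working_fraction(works, start_pct=30, stop_pct=100, step_pct=10):
    """Return the largest tested fraction for which works(fraction) is True.

    `works` is a hypothetical callback: it should start the training job in a
    new process with per_process_gpu_memory_fraction=fraction and return True
    if the job ran without CUDNN_STATUS_INTERNAL_ERROR.
    """
    best = None
    # Integer percentages avoid floating-point drift while stepping.
    for pct in range(start_pct, stop_pct + 1, step_pct):
        fraction = pct / 100
        if not works(fraction):
            break  # first failure: the previous value is the limit
        best = fraction
    return best


if __name__ == "__main__":
    # Stand-in for a real trial: pretend every fraction up to 0.7 works.
    print(largest_working_fraction(lambda fraction: fraction <= 0.7))
```

Each trial needs its own process because the GPU memory configuration cannot be changed once TensorFlow has initialized the device.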

The new approach in TF 2.0 for setting config.gpu_options.allow_growth = True is:

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  # Currently, memory growth needs to be the same across GPUs
  try:
    for gpu in gpus:
      tf.config.experimental.set_memory_growth(gpu, True)
  except RuntimeError as e:
    print(e)

With this code snippet and TF 2.0 RC1, the error no longer appears.
However, given how many people own 20XX Nvidia GPUs, I think it would be a good idea to address this problem natively before the final version of TF 2.0 is released.

I had the same issue with a 1080Ti and a TitanX on TF1.4, and the suggestions from @va-andrew and @oscarlinux saved the day! It reminds me why I switched to PyTorch in the first place and never went back. Unfortunately there are still people using TF, so I still have to go through this pain whenever I use their codebase. Maybe it is time to play a bit with ONNX.

For anyone else finding this after upgrading to TensorFlow 2.0: the API and the code are slightly different.

Ubuntu 18
Tensorflow 2.0
Tensorflow-gpu 2.0
GeForce RTX 2070

Updated code for this system:

import tensorflow as tf
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.Session(config=config)

This solution worked for me. (TF-GPU 2.0, Windows 10, GeForce RTX 2070)

physical_devices = tf.config.experimental.list_physical_devices('GPU')
assert len(physical_devices) > 0, "Not enough GPU hardware devices available"
tf.config.experimental.set_memory_growth(physical_devices[0], True)

Adding another data point:
RTX 2080ti, ubuntu18.04, cuda10.0, cudnn7
In my case it does not work with tf 1.14 or 1.15rc3.

@w4nderlust, for 1.14 and 1.15 you will want to keep setting the session config option config.gpu_options.allow_growth = True. Is that what you are reporting does not work, or the tf.config.experimental mechanism?

Sorry, I should have been more precise: I am reporting that without config.gpu_options.allow_growth = True it still does not work in my configuration, with both 1.14 and 1.15rc3.

I think I found a better workaround than config.gpu_options.allow_growth = True.

For my setup (RTX 2070, Docker image tensorflow:1.15.0-gpu-py3), setting the config as shown below avoids CUDNN_STATUS_INTERNAL_ERROR while still allocating the whole GPU memory.
This is very useful for large models that would not fit into memory in allow_growth mode but just fit when the whole memory is allocated.

To allocate the whole memory on RTX:
config.gpu_options.per_process_gpu_memory_fraction = 1.0

I tried this with TF 2.0, but it does not seem to work.
Ubuntu 18.04, RTX 2080, CUDA 10, cuDNN 7.6.

For TF 2.0, the API for limiting GPU memory usage has changed.

gpus = tf.config.experimental.list_physical_devices('GPU')

tf.config.experimental.set_virtual_device_configuration(gpus[0], [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=1024)])

@nluehr, do you know why this issue shows up only on RTX?

It is hard for me to debug this directly because I do not have access to an RTX GPU.

@sanjoy I am running the display on the integrated GPU. There are no other applications on my single RTX GPU while TensorFlow is running.

I tried using this with TensorFlow 2.0:

    config = tf.compat.v1.ConfigProto()
    config.gpu_options.allow_growth = True
    session = tf.compat.v1.Session(config=config)

It fixes the cudnn error on my RTX 2080, but training is only as fast as on the 1050Ti in my laptop!
While training a CNN:

Tue Nov 12 19:22:35 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.26       Driver Version: 440.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2080    Off  | 00000000:2D:00.0 Off |                  N/A |
|  0%   37C    P2    75W / 265W |   2904MiB /  7979MiB |     27%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1026      G   /usr/lib/Xorg                                200MiB |
|    0      6420      G   cinnamon                                      43MiB |
|    0     21073      C   /home/clementpoiret/anaconda3/bin/python    2647MiB |
+-----------------------------------------------------------------------------+

Adding

gpus = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_virtual_device_configuration(gpus[0], [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=7000)])

did not solve the issue. Without allow_growth I get the cudnn error, and in any case my RTX is only using about 3 GB of memory.

Any idea?

I tried

    gpus = tf.config.experimental.list_physical_devices('GPU')
    tf.config.experimental.set_memory_growth(gpus[0], True)
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=7900)])

but cudnn is still throwing an error.

I also get this error in the tensorflow 1.15.0-py3-gpu Docker image (Ubuntu 18.04) with two Titan V GPUs (@sanjoy), which are not RTX cards. However, the error seems to occur only on my GPU0, where Xorg and gnome-shell are using GPU memory, while GPU1 only has Python using GPU memory and does not throw this error. Unfortunately the error is intermittent: sometimes I can remove the Docker container, recreate it with the same settings and the same code, and the error goes away. Sometimes not.

I was able to fix it using the Keras backend interface with:

import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True
allow_growth_session = tf.Session(config=config)
tf.keras.backend.set_session(allow_growth_session)

Here is the nvidia-smi output for both GPUs:

| NVIDIA-SMI 440.26       Driver Version: 440.26       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  TITAN V             Off  | 00000000:01:00.0  On |                  N/A |
| 46%   63C    P2    51W / 250W |   7936MiB / 12065MiB |     31%      Default |
+-------------------------------+----------------------+----------------------+
|   1  TITAN V             Off  | 00000000:02:00.0 Off |                  N/A |
| 52%   70C    P2   131W / 250W |  12014MiB / 12066MiB |     60%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1564      G   /usr/lib/xorg/Xorg                            56MiB |
|    0      1607      G   /usr/bin/gnome-shell                          58MiB |
|    0      2428      G   /usr/lib/xorg/Xorg                           442MiB |
|    0      2574      G   /usr/bin/gnome-shell                         289MiB |
|    0      3292      G   ...p/pycharm-professional/167/jbr/bin/java    12MiB |
|    0      6794      G   anki                                          60MiB |
|    0     10336      G   /usr/lib/firefox/firefox                       6MiB |
|    0     16986      C   python                                      6981MiB |
|    1      4057      C   python                                     12001MiB |
+-----------------------------------------------------------------------------+
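To compare how much memory the desktop processes hold on each GPU, the process table above can be summed per GPU and per process type (G for graphics, C for compute). This is a minimal sketch that assumes the table layout shown in this comment; `memory_by_gpu_and_type` is a hypothetical helper, and the sample rows are copied from the output above.

```python
import re
from collections import defaultdict

# One process row: | GPU  PID  Type  Process name  Usage |
ROW = re.compile(r"^\|\s+(\d+)\s+(\d+)\s+([A-Z+]+)\s+(\S+)\s+(\d+)MiB\s+\|$")


def memory_by_gpu_and_type(table_text):
    """Sum per-process GPU memory (MiB), keyed by (gpu_index, process_type)."""
    totals = defaultdict(int)
    for line in table_text.splitlines():
        match = ROW.match(line.strip())
        if match:
            gpu, _pid, kind, _name, mib = match.groups()
            totals[(int(gpu), kind)] += int(mib)
    return dict(totals)


SAMPLE = """
|    0      1564      G   /usr/lib/xorg/Xorg                            56MiB |
|    0      1607      G   /usr/bin/gnome-shell                          58MiB |
|    0      2428      G   /usr/lib/xorg/Xorg                           442MiB |
|    0      2574      G   /usr/bin/gnome-shell                         289MiB |
|    0      3292      G   ...p/pycharm-professional/167/jbr/bin/java    12MiB |
|    0      6794      G   anki                                          60MiB |
|    0     10336      G   /usr/lib/firefox/firefox                       6MiB |
|    0     16986      C   python                                      6981MiB |
|    1      4057      C   python                                     12001MiB |
"""

if __name__ == "__main__":
    for key, mib in sorted(memory_by_gpu_and_type(SAMPLE).items()):
        print(key, mib, "MiB")
```

For these rows the graphics processes on GPU 0 add up to 923 MiB, while GPU 1 has no graphics processes at all, which is consistent with the observation that only GPU 0 shares its memory with the desktop.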

With TF 2.0 installed via conda, using the allow_growth flag makes the issue go away, but it also makes training very, very slow, slower than what I had with TF 1.x.

@clementpoiret and @EKami, does your training speed up if you replace config.gpu_options.allow_growth = True with config.gpu_options.per_process_gpu_memory_fraction = 0.8?

@synapse8 I do not see an equivalent in the TensorFlow 2.0 documentation. Is there a way to do that with tf.config.experimental?

Edit: I will try setting the memory this way to see whether it solves the issue:

import subprocess
import tensorflow as tf


def get_gpus_memory():
    """Get the max gpu memory.

    Returns
    -------
    usage: list
        Returns a list of total memory for each gpus.
    """
    result = subprocess.check_output([
        "nvidia-smi", "--query-gpu=memory.total",
        "--format=csv,nounits,noheader"
    ]).decode("utf-8")

    gpus_memory = [int(x) for x in result.strip().split("\n")]
    return gpus_memory


def setup_gpus(allow_growth=True, memory_fraction=.9):
    """Setup GPUs.

    Parameters:
    allow_growth (Boolean)
    memory_fraction (Float): Set maximum memory usage, with 1 using
        maximum memory
    """
    gpus = tf.config.experimental.list_physical_devices("GPU")
    if gpus:
        try:
            # Currently, memory growth needs to be the same across GPUs
            for i, gpu in enumerate(gpus):
                memory = get_gpus_memory()[i]

                tf.config.experimental.set_memory_growth(gpu, allow_growth)

                # Setting memory limit to max*fraction
                tf.config.experimental.set_virtual_device_configuration(
                    gpu, [
                        tf.config.experimental.VirtualDeviceConfiguration(
                            memory_limit=memory * memory_fraction)
                    ])

                logical_gpus = tf.config.experimental.list_logical_devices(
                    "GPU")
                print(len(gpus), "Physical GPUs,", len(logical_gpus),
                      "Logical GPUs")
        except RuntimeError as e:
            # Memory growth must be set before GPUs have been initialized
            print(e)

This way we can conveniently call setup_gpus(True, .9).
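The two pieces of arithmetic inside setup_gpus can be checked without a GPU. The sketch below separates the parsing from the nvidia-smi call and rounds the limit down to a whole number of MiB (the function above passes the floating-point product directly). `parse_memory_totals` and `memory_limit_mib` are hypothetical helper names, and the parser assumes that the `--format=csv,nounits,noheader` output is one integer per line.

```python
def parse_memory_totals(csv_text):
    """Parse the text printed by
    `nvidia-smi --query-gpu=memory.total --format=csv,nounits,noheader`
    (assumed: one integer MiB value per line) into a list of ints."""
    return [int(line) for line in csv_text.strip().splitlines() if line.strip()]


def memory_limit_mib(total_mib, memory_fraction):
    """Return the per-GPU limit in whole MiB for a fraction in (0, 1]."""
    if not 0 < memory_fraction <= 1:
        raise ValueError("memory_fraction must be in (0, 1]")
    return int(total_mib * memory_fraction)


if __name__ == "__main__":
    # 7979 MiB is the total reported for the RTX 2080 earlier in the thread.
    totals = parse_memory_totals("7979\n")
    print([memory_limit_mib(total, 0.9) for total in totals])
```

Keeping the parsing separate also makes it possible to query nvidia-smi once rather than once per GPU inside the loop.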

@clementpoiret: I believe the tf.config.experimental.set_memory_growth call is unnecessary, because tf.config.experimental.set_virtual_device_configuration overrides that flag: it slices up the GPU memory and preallocates the allocated memory.

This issue is not limited to RTX. Or to TF 2.0.

Adding:
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession
config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

solves the "Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR" issue in my environment, which is as follows:

nvidia-smi | NVIDIA-SMI 430.50 Driver Version: 430.50 CUDA Version: 10.1 | | 0 GeForce GT 1030 | 49% 67C P0 N/A / 30W | 1957MiB / 2000MiB | 94%

python -c 'import tensorflow as tf; print(tf.__version__)' 1.14.0
Could this be an issue with the maximum contiguous block allocation in the NVIDIA drivers, where allocating the same total amount of memory is fine as long as it is done in smaller blocks?

Hi,

I cannot reproduce this on my machine, so I will need some help root-causing it. Is there someone here who can reproduce the problem and is willing to do some hands-on debugging?

As a starting point, I would like to understand why MinSystemMemory does not preserve enough memory for cuDNN. If someone with a setup that reproduces this issue can add some logging (as a local patch) to find out how much memory MinSystemMemory returns, that would be great. Also, does increasing the magic 0.05 number in MinSystemMemory help the situation?

@sanjoy I have a version that exhibits this issue. How would I go about accessing MinSystemMemory or "setting the magic 0.05 number"? I have gone back to using CUDA 9.1 for the most part, but I do not mind trying a few things.

@odinsbane You will have to build TensorFlow from source to do what I suggest below.

The first step is to add LOG(INFO) or std::cerr lines to MinSystemMemory to print available_memory and the return value of MinSystemMemory. Does available_memory agree with what nvidia-smi prints? How much memory are we leaving for the system?

Second, does increasing the 0.05 magic number to, say, 0.07 help at all?
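To get a feel for what changing the magic number means in bytes, the reserve can be computed directly. This is only a sketch: it assumes the reserve is simply the stated fraction of available_memory, which is a simplification of MinSystemMemory, and `reserved_bytes` is a hypothetical helper. The 8 GiB figure is an example value, not a measurement from the thread.

```python
def reserved_bytes(available_memory, percent):
    """Bytes set aside for the system, assuming reserve = percent% of available.

    Integer arithmetic (floor division) keeps the result exact.
    """
    return available_memory * percent // 100


if __name__ == "__main__":
    available = 8 * 1024**3  # example: 8 GiB reported as available
    for percent in (5, 7):
        mib = reserved_bytes(available, percent) / 1024**2
        print(f"{percent}% -> {mib:.1f} MiB reserved")
```

Under this assumption, going from 0.05 to 0.07 on an 8 GiB card leaves roughly 160 MiB more outside TensorFlow's own allocator.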

This one works! Thank you!

from keras.backend.tensorflow_backend import set_session
import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
config.log_device_placement = True
sess = tf.Session(config=config)
set_session(sess)

We are facing a similar problem on an RTX 2070 (Ubuntu 18.04, TF2). We tried different combinations of CUDA 10.0 and libcudnn7.xxx versions, but the error keeps coming back.
On another machine we have a GTX 1080ti, and that one runs without issue.
The nvidia-driver is 430.50 in both cases.

It is not caused by tf.keras.utils.plot_model: after removing it the error still appears, but less frequently.
Update: it only happens when I use tf.keras.utils.plot_model. I will keep trying.

============

I have a similar issue with an RTX 2080 Ti on Ubuntu 18.04.3 LTS, tf 1.15, CUDA 10.0.

What is weird in my case is that this happens only very occasionally, and once it happens it lasts for minutes to hours and then disappears by itself.

I tried all of the solutions above, and none of them fixes it immediately. I also tried doing nothing and just waiting, and it eventually goes away.

Things I also tried that are not mentioned above:

  1. Removing the ~/.nv directory
  2. Simply rebooting

FYI, the error logs:

2019-12-21 14:47:30.785233: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-12-21 14:47:30.959825: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-12-21 14:47:31.722238: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-12-21 14:47:31.749524: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Traceback (most recent call last):
  File "train_cifar.py", line 204, in <module>
    main()
  File "train_cifar.py", line 133, in main
    validation_data=(x_test, output_test), callbacks=callbacks, verbose=0)
  File "/home/xxx/anaconda3/envs/tf-1-gpu/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 727, in fit
    use_multiprocessing=use_multiprocessing)
  File "/home/xxx/anaconda3/envs/tf-1-gpu/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_generator.py", line 603, in fit
    steps_name='steps_per_epoch')
  File "/home/xxx/anaconda3/envs/tf-1-gpu/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_generator.py", line 265, in model_iteration
    batch_outs = batch_function(*batch_data)
  File "/home/xxx/anaconda3/envs/tf-1-gpu/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 1017, in train_on_batch
    outputs = self.train_function(ins)  # pylint: disable=not-callable
  File "/home/xxx/anaconda3/envs/tf-1-gpu/lib/python3.7/site-packages/tensorflow_core/python/keras/backend.py", line 3476, in __call__
    run_metadata=self.run_metadata)
  File "/home/xxx/anaconda3/envs/tf-1-gpu/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1472, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node stem_layer/conv2d/Conv2D}}]]
     [[metrics/classifier_acc/Identity/_1749]]
  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node stem_layer/conv2d/Conv2D}}]]
0 successful operations.
0 derived errors ignored.

We are facing related issues.

System specs

  • Ubuntu 18.04.3 LTS
  • RTX 2070
  • Python 3.7.1
  • tf-gpu 2.0.0
  • CUDA V10.0.130
  • libcudnn7 7.6.2

The error is triggered when I try to use LSTM, GRU, RNN, etc.

Actual error

2019-12-23 16:09:00.912238: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-12-23 16:09:01.408990: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2019-12-23 16:09:01.409043: W tensorflow/core/framework/op_kernel.cc:1622] OP_REQUIRES failed at cudnn_rnn_ops.cc:1491 : Unknown: Fail to find the dnn implementation.

  File "/home/alex/anaconda3/envs/tf/lib/python3.7/site-packages/tensorflow_core/python/keras/layers/recurrent_v2.py", line 961, in call
    **cudnn_lstm_kwargs)
  File "/home/alex/anaconda3/envs/tf/lib/python3.7/site-packages/tensorflow_core/python/keras/layers/recurrent_v2.py", line 1174, in cudnn_lstm
    rnn_mode='lstm')
  File "/home/alex/anaconda3/envs/tf/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_cudnn_rnn_ops.py", line 109, in cudnn_rnn
    ctx=_ctx)
  File "/home/alex/anaconda3/envs/tf/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_cudnn_rnn_ops.py", line 198, in cudnn_rnn_eager_fallback
    attrs=_attrs, ctx=_ctx, name=name)
  File "/home/alex/anaconda3/envs/tf/lib/python3.7/site-packages/tensorflow_core/python/eager/execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.UnknownError: Fail to find the dnn implementation. [Op:CudnnRNN]

Apparent problem

It looks as if all my memory gets eaten up pretty fast. The problem seems to come up only in GPU mode; the same code works fine on CPU.

Trials

  • Allow memory growth
  • Create a virtual device with limited memory

Both trials produce the same error.

Any ideas?

I cannot make progress on this issue because I cannot reproduce it. If you can reliably reproduce this on your machine, you can help. Here is how: https://github.com/tensorflow/tensorflow/issues/24496#issuecomment-560963770, https://github.com/tensorflow/tensorflow/issues/24496#issuecomment-561366750

Hi @sanjoy, I am very willing to help, but unfortunately I cannot build tf from source: I run my experiments on my university's equipment, and my personal laptop has no GPU. Are there other ways to obtain the required log?

I found the following code on Stack Overflow. Could it help?

import tensorflow as tf  # TF 1.x only: tensorflow.contrib was removed in TF 2.0
from tensorflow.contrib.memory_stats.python.ops.memory_stats_ops import BytesInUse

with tf.device('/device:GPU:0'):  # Replace with device you are interested in
  bytes_in_use = BytesInUse()
with tf.Session() as sess:
  print(sess.run(bytes_in_use))

> Are there other ways to obtain the required log?

I will check in a VLOG statement to get this information. Once that is done, will you be able to install tf-nightly and reproduce this (with some extra flags; I will let you know exactly which ones)?

Sure, I can install a package on that computer if it is available from pip or conda, and I use a virtual environment. I will try to reproduce the error.

Can you please install tf-nightly (so that it picks up the commit that adds the logging) and run with the environment variable TF_CPP_VMODULE set to gpu_device=5? That should print two lines like these:

2019-12-26 12:07:37.196206: I tensorflow/core/common_runtime/gpu/gpu_device.cc:837] available_memory = 12319588352                                             
2019-12-26 12:07:37.196221: I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] min_system_memory = 615979417                                              

Can you please report these numbers here?
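One way to collect these two values from a Python script is to set the variable before TensorFlow is imported and then pull the numbers out of the captured log text. The sketch below shows only the parsing part plus a consistency check. `parse_memory_log` is a hypothetical helper, the regular expression assumes the line format shown above, and the 5% check assumes the 0.05 factor discussed earlier in the thread.

```python
import os
import re

# Set before `import tensorflow`, the usual pattern for TF_CPP_* variables.
os.environ["TF_CPP_VMODULE"] = "gpu_device=5"

LOG_VALUE = re.compile(r"\b(available_memory|min_system_memory) = (\d+)")


def parse_memory_log(log_text):
    """Extract available_memory and min_system_memory (bytes) from log text."""
    return {name: int(value) for name, value in LOG_VALUE.findall(log_text)}


# The two example lines quoted in the comment above.
SAMPLE_LOG = """\
2019-12-26 12:07:37.196206: I tensorflow/core/common_runtime/gpu/gpu_device.cc:837] available_memory = 12319588352
2019-12-26 12:07:37.196221: I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] min_system_memory = 615979417
"""

if __name__ == "__main__":
    values = parse_memory_log(SAMPLE_LOG)
    share = values["min_system_memory"] / values["available_memory"]
    print(values, f"reserved share = {share:.3f}")
```

For the example lines, min_system_memory is exactly 5% of available_memory rounded down, which matches the 0.05 factor.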

Sorry, my current code is not compatible with tf 2.0 (I use 1.15). I am trying to update it; please give me some time.

This problem seems to be related to my RTX 2080. I have a desktop GTX 1080 and everything seems fine there. I then used conda to clone the conda environment to my RTX 2080 notebook, where I use tensorflow2.0.0-gpu. As soon as the application code uses Conv2d, LSTM, or GRU, the problem appears.
Previously I used the following code to work around it:
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)

But since a few days ago, the method above no longer works.

I have the same issue with a GTX 960M.

Hi @sanjoy, I just got this output:

2019-12-30 17:38:23.824323: I tensorflow/core/common_runtime/gpu/gpu_device.cc:837] available_memory = 10840309760
2019-12-30 17:38:23.824328: I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] min_system_memory = 542015488

Thanks!

Unfortunately, this did not help as much as I had hoped. If I clamp MinSystemMemory on a local build to 542015488 (i.e. min_system_memory = std::min(min_system_memory, 542015488ll)), resnet (for instance) seems to work fine, and I do not get any errors from cuDNN.

@sanjoy I am able to (mostly consistently) reproduce the issue on my end.

Relevant messages from the latest nightly:

With memory growth explicitly allowed

2019-12-30 22:51:06.846774: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
WARNING:tensorflow:Falling back to tensorflow client, its recommended to install the cloud tpu client directly with pip install cloud-tpu-client .
2019-12-30 22:51:08.851660: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2019-12-30 22:51:08.877811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1558] Found device 0 with properties: 
pciBusID: 0000:08:00.0 name: GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.7715GHz coreCount: 15 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 238.66GiB/s
2019-12-30 22:51:08.887672: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2019-12-30 22:51:08.895277: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2019-12-30 22:51:08.906016: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2019-12-30 22:51:08.913767: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2019-12-30 22:51:08.921329: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2019-12-30 22:51:08.930208: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2019-12-30 22:51:08.941818: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2019-12-30 22:51:08.945713: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1700] Adding visible gpu devices: 0
TF GPU device: PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')



CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1
Tensorflow Version: 2.1.0-dev20191230
Tensorflow_addons Version: 0.7.0-dev



Preparing data
Loading dataset
100%|██████████| 80/80 [00:03<00:00, 21.61it/s]
100%|██████████| 68/68 [00:00<00:00, 447.32it/s]
Performing NLP
100%|██████████| 80/80 [00:00<00:00, 13332.71it/s]
100%|██████████| 68/68 [00:00<?, ?it/s]
Transforming dataset
Generating primitives and constructing vocabulary
100%|██████████| 80/80 [00:00<00:00, 139.11it/s]
100%|██████████| 68/68 [00:00<00:00, 4249.86it/s]
Encoding primitives
100%|โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ| 16654/16654 [00:00<00:00, 33640.74it/s]
100%|โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ| 805/805 [00:00<00:00, 33538.43it/s] 
2019-12-30 22:51:22.970554: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-12-30 22:51:22.977228: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1558] Found device 0 with properties: 
pciBusID: 0000:08:00.0 name: GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.7715GHz coreCount: 15 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 238.66GiB/s
2019-12-30 22:51:22.983571: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2019-12-30 22:51:22.986832: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2019-12-30 22:51:22.990667: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2019-12-30 22:51:22.993801: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2019-12-30 22:51:22.996967: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2019-12-30 22:51:23.002629: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2019-12-30 22:51:23.006072: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2019-12-30 22:51:23.010482: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1700] Adding visible gpu devices: 0
2019-12-30 22:51:23.557556: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1087] TensorFlow compiled with CUDA 10.1 and cuDNN 7.6.5
2019-12-30 22:51:23.560870: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1099] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-30 22:51:23.564144: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105]      0 
2019-12-30 22:51:23.569159: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1118] 0:   N
2019-12-30 22:51:23.571310: I tensorflow/core/common_runtime/gpu/gpu_device.cc:837] available_memory = 7038160076
2019-12-30 22:51:23.573861: I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] min_system_memory = 351908003
2019-12-30 22:51:23.576728: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1370] GPUDevice PlatformGpuId 0 TfGpuId 0 on bus 1 numa: 0 pci: 0000:08:00.0 DeviceLocality: bus_id: 1
links {
}

2019-12-30 22:51:23.583814: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1244] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6376 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:08:00.0, compute capability: 6.1)
2019-12-30 22:51:23.590034: I tensorflow/core/common_runtime/gpu/gpu_device.cc:249] Created stream[0] = 000002093BAB9860
2019-12-30 22:51:23.594885: I tensorflow/core/common_runtime/gpu/gpu_device.cc:268] Created host_to_device_stream[0] = 000002093BAB9360
2019-12-30 22:51:23.597951: I tensorflow/core/common_runtime/gpu/gpu_device.cc:273] Created device_to_host_stream[0] = 000002093BABA960
2019-12-30 22:51:23.600920: I tensorflow/core/common_runtime/gpu/gpu_device.cc:289] Created device_to_device_stream[0] = 000002093BAB8EE0

Without changing the GPU device configuration:

2019-12-30 22:54:47.762913: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
WARNING:tensorflow:Falling back to tensorflow client, its recommended to install the cloud tpu client directly with pip install cloud-tpu-client .
2019-12-30 22:54:50.073199: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2019-12-30 22:54:50.100339: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1558] Found device 0 with properties:
pciBusID: 0000:08:00.0 name: GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.7715GHz coreCount: 15 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 238.66GiB/s
2019-12-30 22:54:50.105836: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2019-12-30 22:54:50.115940: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2019-12-30 22:54:50.127341: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2019-12-30 22:54:50.131871: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2019-12-30 22:54:50.139786: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2019-12-30 22:54:50.144940: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2019-12-30 22:54:50.159197: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2019-12-30 22:54:50.162685: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1700] Adding visible gpu devices: 0
TF GPU device: PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')



CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1
Tensorflow Version: 2.1.0-dev20191230
Tensorflow_addons Version: 0.7.0-dev



Preparing data
Loading dataset
100%|██████████| 80/80 [00:03<00:00, 21.71it/s]
100%|██████████| 68/68 [00:00<00:00, 433.07it/s]
Performing NLP
100%|██████████| 80/80 [00:00<00:00, 13332.18it/s]
100%|██████████| 68/68 [00:00<?, ?it/s]
Transforming dataset
Generating primitives and constructing vocabulary
100%|██████████| 80/80 [00:00<00:00, 140.34it/s]
100%|██████████| 68/68 [00:00<00:00, 4249.55it/s]
Encoding primitives
100%|██████████| 16654/16654 [00:00<00:00, 33039.93it/s]
100%|██████████| 805/805 [00:00<00:00, 33537.43it/s]
2019-12-30 22:55:04.084880: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-12-30 22:55:04.088867: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1558] Found device 0 with properties:
pciBusID: 0000:08:00.0 name: GeForce GTX 1070 computeCapability: 6.1
coreClock: 1.7715GHz coreCount: 15 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: 238.66GiB/s
2019-12-30 22:55:04.094516: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_101.dll
2019-12-30 22:55:04.097049: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_10.dll
2019-12-30 22:55:04.099754: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_10.dll
2019-12-30 22:55:04.102329: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_10.dll
2019-12-30 22:55:04.105131: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_10.dll
2019-12-30 22:55:04.108029: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_10.dll
2019-12-30 22:55:04.110629: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2019-12-30 22:55:04.114339: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1700] Adding visible gpu devices: 0
2019-12-30 22:55:04.655119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1087] TensorFlow compiled with CUDA 10.1 and cuDNN 7.6.5
2019-12-30 22:55:04.658124: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1099] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-30 22:55:04.660826: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105]      0
2019-12-30 22:55:04.662403: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1118] 0:   N
2019-12-30 22:55:04.664213: I tensorflow/core/common_runtime/gpu/gpu_device.cc:837] available_memory = 7038160076
2019-12-30 22:55:04.666185: I tensorflow/core/common_runtime/gpu/gpu_device.cc:838] min_system_memory = 351908003
2019-12-30 22:55:04.668490: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1370] GPUDevice PlatformGpuId 0 TfGpuId 0 on bus 1 numa: 0 pci: 0000:08:00.0 DeviceLocality: bus_id: 1
links {
}

2019-12-30 22:55:04.672820: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1244] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 6376 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1070, pci bus id: 0000:08:00.0, compute capability: 6.1)
2019-12-30 22:55:04.677690: I tensorflow/core/common_runtime/gpu/gpu_device.cc:249] Created stream[0] = 0000021EC0CF5840
2019-12-30 22:55:04.679747: I tensorflow/core/common_runtime/gpu/gpu_device.cc:268] Created host_to_device_stream[0] = 0000021EC0CF58C0
2019-12-30 22:55:04.682343: I tensorflow/core/common_runtime/gpu/gpu_device.cc:273] Created device_to_host_stream[0] = 0000021EC0CF5940
2019-12-30 22:55:04.685266: I tensorflow/core/common_runtime/gpu/gpu_device.cc:289] Created device_to_device_stream[0] = 0000021EC0CF59C0
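
As a cross-check on the log above, the "6376 MB" in the "Created TensorFlow device" line matches the two memory figures printed just before it. The sketch below only redoes that arithmetic with the logged values. The idea that TensorFlow holds back about 5% of the available memory is an inference from these two numbers, not something the log states.

```python
# Values copied from the gpu_device.cc lines in the log above (bytes).
available_memory = 7038160076
min_system_memory = 351908003

# Memory left for the TensorFlow device after the reserved part is subtracted.
usable_bytes = available_memory - min_system_memory
usable_mib = usable_bytes // (1024 * 1024)

# Share of the available memory that was held back (assumed to be a ~5% reserve).
reserved_fraction = min_system_memory / available_memory

print(usable_mib)                    # 6376
print(round(reserved_fraction, 3))   # 0.05
```

So without memory growth, TensorFlow claims nearly all free GPU memory up front in both runs.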

Edit: model info, in case it helps.

Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
Feature_1 (InputLayer)          [(None, 150)]        0
__________________________________________________________________________________________________
Feature_2 (InputLayer)          [(None, 150)]        0
__________________________________________________________________________________________________
embedding (Embedding)           (None, 150, 64)      5632        Feature_1[0][0]
__________________________________________________________________________________________________
embedding_1 (Embedding)         (None, 150, 64)      2944        Feature_2[0][0]
__________________________________________________________________________________________________
bidirectional (Bidirectional)   (None, 150, 128)     66048       embedding[0][0]
__________________________________________________________________________________________________
bidirectional_1 (Bidirectional) (None, 150, 128)     66048       embedding_1[0][0]
__________________________________________________________________________________________________
concatenate (Concatenate)       (None, 150, 256)     0           bidirectional[0][0]
                                                                 bidirectional_1[0][0]
__________________________________________________________________________________________________
bidirectional_2 (Bidirectional) (None, 64)           73984       concatenate[0][0]
__________________________________________________________________________________________________
dense (Dense)                   (None, 32)           2080        bidirectional_2[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 1)            33          dense[0][0]
==================================================================================================
Total params: 216,769
Trainable params: 216,769
Non-trainable params: 0

Here is a minimal example using TF 1.15 that produces this error. RTX 2070, NVIDIA driver 440.44, CUDA version 10.2.

import tensorflow as tf
import tensorflow.keras.applications as applications
import tensorflow.keras.utils as utils
import numpy as np

num_samples = 1000
height = 224
width = 224
num_classes = 1000

model = applications.ResNet50(weights=None, input_shape=(height, width, 3), classes=num_classes)

parallel_model = utils.multi_gpu_model(model, gpus=2, cpu_relocation=True)
parallel_model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

x = np.random.random((num_samples, height, width, 3))
y = np.random.random((num_samples, num_classes))

parallel_model.fit(x, y, epochs=20, batch_size=256)

print('all done')
Train on 1000 samples
Epoch 1/20
2020-02-06 15:06:40.524918: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-06 15:06:41.291528: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-06 15:06:41.329183: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 822083584 exceeds 10% of system memory.
2020-02-06 15:06:42.082319: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 851705856 exceeds 10% of system memory.
2020-02-06 15:06:42.293092: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 822083584 exceeds 10% of system memory.
2020-02-06 15:06:43.173764: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 822083584 exceeds 10% of system memory.
2020-02-06 15:06:43.820074: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-02-06 15:06:44.390897: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 822083584 exceeds 10% of system memory.
2020-02-06 15:06:45.839525: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-02-06 15:06:45.856793: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-02-06 15:06:45.883423: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Traceback (most recent call last):
  File "./test_tf.py", line 19, in <module>
    parallel_model.fit(x, y, epochs=20, batch_size=256)
  File "/nix/store/520352w3m8lyj2zgv647qfqrws5q798n-python3.7-tensorflow-gpu-1.15.0/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training.py", line 727, in fit
    use_multiprocessing=use_multiprocessing)
  File "/nix/store/520352w3m8lyj2zgv647qfqrws5q798n-python3.7-tensorflow-gpu-1.15.0/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_arrays.py", line 675, in fit
    steps_name='steps_per_epoch')
  File "/nix/store/520352w3m8lyj2zgv647qfqrws5q798n-python3.7-tensorflow-gpu-1.15.0/lib/python3.7/site-packages/tensorflow_core/python/keras/engine/training_arrays.py", line 394, in model_iteration
    batch_outs = f(ins_batch)
  File "/nix/store/520352w3m8lyj2zgv647qfqrws5q798n-python3.7-tensorflow-gpu-1.15.0/lib/python3.7/site-packages/tensorflow_core/python/keras/backend.py", line 3476, in __call__
    run_metadata=self.run_metadata)
  File "/nix/store/520352w3m8lyj2zgv647qfqrws5q798n-python3.7-tensorflow-gpu-1.15.0/lib/python3.7/site-packages/tensorflow_core/python/client/session.py", line 1472, in __call__
    run_metadata_ptr)
tensorflow.python.framework.errors_impl.UnknownError: 2 root error(s) found.
  (0) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
    [[{{node replica_1/resnet50/conv1_conv/Conv2D}}]]
  (1) Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
    [[{{node replica_1/resnet50/conv1_conv/Conv2D}}]]
    [[training/RMSprop/gradients/gradients/Switch_482/_3893]]
0 successful operations.
1 derived errors ignored.
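
For comparison, the `gpu_options.allow_growth = True` workaround from the original report could be applied to a TF 1.15 script like the one above roughly as follows. This is a minimal sketch under assumptions: the helper name `make_growth_session` is made up for illustration, TensorFlow is imported lazily inside it, and it has to run before the model is built. It is not confirmed to fix the multi-GPU case shown here.

```python
def make_growth_session():
    """Create a TF 1.x session whose GPU allocator grows on demand.

    Hypothetical helper for illustration; assumes TensorFlow 1.15 is installed.
    """
    import tensorflow as tf  # lazy import so the helper can be defined anywhere

    config = tf.compat.v1.ConfigProto()
    # Allocate GPU memory as needed instead of reserving almost all of it up front.
    config.gpu_options.allow_growth = True

    session = tf.compat.v1.Session(config=config)
    # Make Keras use this session for everything built afterwards.
    tf.compat.v1.keras.backend.set_session(session)
    return session
```

In the script above, `make_growth_session()` would go right after the imports, before `applications.ResNet50(...)` is called.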

In a separate issue, https://github.com/tensorflow/tensorflow/issues/36501, I noted that while these options let the code run, watching the actual memory usage of the GPUs shows that it is not really doing incremental memory allocation at all. So the option above fixes the error, but it does not actually do what it claims to do. I used the same model on older TF versions such as 1.2, and those did real incremental memory allocation.

Same problem as everyone here! After installing tf 2.1, I can't run a simple MNIST example without adding memory growth to the GPU. I use a 2080 Ti.
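
For anyone unsure what "adding memory growth" looks like in TF 2.x, here is a minimal sketch. The helper name `enable_memory_growth` is made up for illustration and TensorFlow is imported lazily inside it. It uses `tf.config.experimental.set_memory_growth`, which has to be called before any GPU has been initialized. The `TF_FORCE_GPU_ALLOW_GROWTH` environment variable is an alternative switch that must be set before TensorFlow touches the GPU. This is a workaround, not a fix for the underlying cuDNN error.

```python
import os

# Alternative switch: must be set before TensorFlow initializes the GPU.
os.environ.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")


def enable_memory_growth():
    """Turn on on-demand GPU memory allocation for every visible GPU.

    Hypothetical helper for illustration; assumes TensorFlow 2.x is installed.
    """
    import tensorflow as tf  # lazy import so the env var above is set first

    gpus = tf.config.experimental.list_physical_devices("GPU")
    for gpu in gpus:
        # Raises RuntimeError if the GPU has already been initialized.
        tf.config.experimental.set_memory_growth(gpu, True)
    return gpus
```

Call `enable_memory_growth()` once at the top of the script, before creating any model or tensor.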

The major problem I face is that I cannot run tensorflow-probability with tf 2.1 without getting the cursed CUDNN internal error, even with memory growth added to the code. I have tried installing tf 2.0, CUDA 10.0 and CUDA 10.1, and different CUDNN versions. After completely reinstalling Ubuntu I got the simple MNIST example to work without memory growth, but not the tensorflow-probability example. Finally I tried the official TensorFlow nightly Docker image and still got the same error when using tensorflow-probability (tf 2.2 inside the container). Everything runs fine on CPU. I also ran the same Docker image on a machine with a 1080 Ti and that worked... I feel there is definitely something wrong with the RTX series.

Error from the tf docker image with the tensorflow-probability example, with additional cudnn debug info:

TF VERSION: 2.2.0-dev20200208
2020-02-11 08:51:05.891560: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-02-11 08:51:05.912465: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3696000000 Hz
2020-02-11 08:51:05.913040: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x57b1fd0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-02-11 08:51:05.913052: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-02-11 08:51:05.914414: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-02-11 08:51:05.975016: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.975364: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5679220 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-02-11 08:51:05.975376: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce RTX 2080 Ti, Compute Capability 7.5
2020-02-11 08:51:05.975477: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.975744: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1558] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.75GiB deviceMemoryBandwidth: 573.69GiB/s
2020-02-11 08:51:05.975865: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-11 08:51:05.976745: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-11 08:51:05.977582: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-02-11 08:51:05.977722: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-02-11 08:51:05.978636: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-02-11 08:51:05.979165: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-02-11 08:51:05.981150: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-11 08:51:05.981216: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.981528: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.981792: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1700] Adding visible gpu devices: 0
2020-02-11 08:51:05.981812: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-11 08:51:05.982323: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1099] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-11 08:51:05.982331: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105]      0 
2020-02-11 08:51:05.982335: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1118] 0:   N 
2020-02-11 08:51:05.982395: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.982687: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.982959: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1244] Created TensorFlow device (/device:GPU:0 with 9604 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
2020-02-11 08:51:05.983594: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.983864: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1558] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.75GiB deviceMemoryBandwidth: 573.69GiB/s
2020-02-11 08:51:05.983881: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-11 08:51:05.983889: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-11 08:51:05.983896: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-02-11 08:51:05.983904: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-02-11 08:51:05.983912: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-02-11 08:51:05.983920: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-02-11 08:51:05.983928: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-11 08:51:05.983961: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.984238: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.984497: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1700] Adding visible gpu devices: 0
2020-02-11 08:51:05.984508: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1099] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-11 08:51:05.984512: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105]      0 
2020-02-11 08:51:05.984516: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1118] 0:   N 
2020-02-11 08:51:05.984563: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.984842: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.985099: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1244] Created TensorFlow device (/device:GPU:0 with 9604 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
SUCCESS: Found GPU: /device:GPU:0
2020-02-11 08:51:05.989382: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.989649: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1558] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.75GiB deviceMemoryBandwidth: 573.69GiB/s
2020-02-11 08:51:05.989663: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-11 08:51:05.989671: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-11 08:51:05.989678: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-02-11 08:51:05.989684: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-02-11 08:51:05.989691: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-02-11 08:51:05.989700: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-02-11 08:51:05.989709: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-11 08:51:05.989744: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.990021: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.990347: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1700] Adding visible gpu devices: 0
2020-02-11 08:51:05.990544: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.990807: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1558] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce RTX 2080 Ti computeCapability: 7.5
coreClock: 1.545GHz coreCount: 68 deviceMemorySize: 10.75GiB deviceMemoryBandwidth: 573.69GiB/s
2020-02-11 08:51:05.990820: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-02-11 08:51:05.990828: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-11 08:51:05.990834: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-02-11 08:51:05.990841: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-02-11 08:51:05.990848: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-02-11 08:51:05.990854: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-02-11 08:51:05.990861: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-02-11 08:51:05.990892: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.991171: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.991426: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1700] Adding visible gpu devices: 0
2020-02-11 08:51:05.991437: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1099] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-02-11 08:51:05.991441: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1105]      0 
2020-02-11 08:51:05.991444: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1118] 0:   N 
2020-02-11 08:51:05.991486: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.991763: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-02-11 08:51:05.992022: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1244] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 9604 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 Ti, pci bus id: 0000:01:00.0, compute capability: 7.5)
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/linalg/linear_operator_lower_triangular.py:158: calling LinearOperator.__init__ (from tensorflow.python.ops.linalg.linear_operator) with graph_parents is deprecated and will be removed in a future version.
Instructions for updating:
Do not pass `graph_parents`.  They will  no longer be used.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow_core/python/ops/linalg/linear_operator_lower_triangular.py:158: calling LinearOperator.__init__ (from tensorflow.python.ops.linalg.linear_operator) with graph_parents is deprecated and will be removed in a future version.
Instructions for updating:
Do not pass `graph_parents`.  They will  no longer be used.
2020-02-11 08:51:06.822991: W tensorflow/python/util/util.cc:319] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
Epoch 1/15
2020-02-11 08:51:07.907445: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-02-11 08:51:09.832694: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7

I! CuDNN (v7604) function cudnnCreate() called:
i! Time: 2020-02-11T08:51:09.832722 (0d+0h+0m+4s since start)
i! Process=205; Thread=269; GPU=NULL; Handle=NULL; StreamId=NULL.

2020-02-11 08:51:10.409902: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

I! CuDNN (v7604) function cudnnCreate() called:
i! Time: 2020-02-11T08:51:10.410012 (0d+0h+0m+5s since start)
i! Process=205; Thread=269; GPU=NULL; Handle=NULL; StreamId=NULL.

2020-02-11 08:51:10.417952: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
      1/Unknown - 4s 4s/stepTraceback (most recent call last):
  File "VAE_MNIST_tfp.py", line 150, in <module>
    validation_data=eval_dataset)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training.py", line 718, in fit
    use_multiprocessing=use_multiprocessing)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_v2.py", line 341, in fit
    total_epochs=epochs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_v2.py", line 128, in run_one_epoch
    batch_outs = execution_function(iterator)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/keras/engine/training_v2_utils.py", line 98, in execution_function
    distributed_function(input_fn))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/def_function.py", line 576, in __call__
    result = self._call(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/def_function.py", line 640, in _call
    return self._stateless_fn(*args, **kwds)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 2414, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1660, in _filtered_call
    self.captured_inputs)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 1741, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/function.py", line 598, in call
    ctx=ctx)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node model/conv2d/Conv2D (defined at VAE_MNIST_tfp.py:150) ]] [Op:__inference_distributed_function_4291]

Errors may have originated from an input operation.
Input Source operations connected to node model/conv2d/Conv2D:
 model/lambda/sub (defined at VAE_MNIST_tfp.py:98)

Function call stack:
distributed_function

@sanjoy I have the same issue with an RTX 2080 and can build from source if needed.

@odinsbane you will have to build TensorFlow from source to do what I suggest below.

The first step is to add LOG(INFO) or std::cerr lines to MinSystemMemory to print out available_memory and the return value of MinSystemMemory. Does available_memory agree with what nvidia-smi prints? How much memory are we leaving for the system?

Second, does increasing the 0.05 magic number to, say, 0.07 help at all?

๋งค์ง ๋„˜๋ฒ„ 0.05 ๋งค์ง ๋„˜๋ฒ„ ๋ฅผ 0.1 ๋กœ ๋ณ€๊ฒฝํ•˜์—ฌ ์†Œ์Šค์—์„œ ๋นŒ๋“œํ•˜๋ฉด ๋ฌธ์ œ๊ฐ€ ํ•ด๊ฒฐ๋˜๋Š” ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค (์ ์–ด๋„ 1.15.2์˜ ๊ฒฝ์šฐ)!

์‹œ๋„๋Ÿฌ์šด ๊ฒŒ์‹œ๋ฌผ์˜ ๋ฐ”๋‹ค์—์„œ ์ตœ์†Œ ์‹œ์Šคํ…œ ๋ฉ”๋ชจ๋ฆฌ ๋งค์ง ๋„˜๋ฒ„๋Š” ์™„์ „ํžˆ ๋…ผ๋ฆฌ์ ์œผ๋กœ ๋ณด์ž…๋‹ˆ๋‹ค. ๊ณต์œ ํ•ด ์ฃผ์…”์„œ ๊ฐ์‚ฌํ•ฉ๋‹ˆ๋‹ค!

@chsigg any suggestions? Could we initialize cuDNN, cuBLAS, and the other NVIDIA libraries before we reserve all of the GPU memory?

We could also enable allow_growth by default, but that will take time.

This problem seems to be related to my RTX 2080. I have a desktop with a GTX 1080 where everything seems fine. I used conda to clone that conda environment onto my RTX 2080 notebook, which runs tensorflow2.0.0-gpu. As soon as the application code uses Conv2d, LSTM, or GRU, this problem appears.
Previously, I used the following code to work around it:

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)

But as of a few days ago, the method above no longer works.

I had been trying to run the Lambda Tensorflow2-tutorial basic image classification code for days and kept getting the same cudnn handle error until I tried this solution. It finally runs on an RTX 2070 Max Q and uses minimal GPU memory.

I have run into this problem too
anacondacloud install tensorflow-gpu2.0

rtx2070s
tensorflow-gpu.2.0.0
cuda 10.0.13
cudnn 7.6.5
Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

I have run into this problem too
anacondacloud install tensorflow-gpu2.0

rtx2070s
tensorflow-gpu.2.0.0
cuda 10.0.13
cudnn 7.6.5
Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

Did you insert:

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        print(e)

at the top of your entry code?

After quite some time experimenting with an apparently different problem, a failure in tf.signal.stft, I finally came across this thread and tried the solution of allowing memory growth. It solved my problem as well.
I installed tensorflow-gpu=2.1 with cudatoolkit=10.1 from anaconda, but I also tried installing tensorflow-gpu via pip, with exactly the same result. I can reproduce this under linux-ubuntu 18.04 and debian 9.12 with the cards

   GeForce GTX 1050 Ti with Max-Q Design   
   GeForce GTX 1050 Ti
   GeForce RTX 2080 Ti

๋‚˜๋Š” ๋˜ํ•œ ์šฐ๋ฆฌ ์—ฐ๊ตฌ์‹ค์—์„œ ๋‘ ๊ฐœ์˜ ๋‹ค๋ฅธ ์นด๋“œ๋ฅผ ์‹œ๋„ํ–ˆ์Šต๋‹ˆ๋‹ค

  GeForce GTX 1080 Ti
  TITAN Xp COLLECTORS EDITION

where the code runs fine with and without allowing memory growth.

My minimal reproduction is below. Interestingly, the problem is not conv2d. I can change the order of these three commands, and it is always the third one that fails.

import sys
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus and len(sys.argv)> 1 and sys.argv[1].startswith("-a"):
    print("allowing growth")
    growth = True
else:
    print("nogrowth")
    growth = False

try:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, growth)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
except RuntimeError as e:
    print(e)

tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
tf.matmul(tf.zeros((2,2,2)), tf.zeros((2,2,2)))
tf.nn.conv2d(tf.zeros((2,20,20,20), dtype=tf.float32),
                                         filters=tf.zeros((2,2,20,20), dtype=tf.float32),
            strides=(1,1,1,1), padding="VALID")
print("done")

๋‚˜๋Š” ๋˜ํ•œ์ด ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒํ–ˆ์Šต๋‹ˆ๋‹ค
anacondacloud install tensorflow-gpu2.0
rtx2070s
tensorflow-gpu.2.0.0
cuda 10.0.13
cudnn 7.6.5
cudnn ํ•ธ๋“ค์„ ๋งŒ๋“ค ์ˆ˜ ์—†์Šต๋‹ˆ๋‹ค : CUDNN_STATUS_INTERNAL_ERROR
์ปจ๋ณผ ๋ฃจ์…˜ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๊ฐ€์ ธ์˜ฌ ์ˆ˜ ์—†์Šต๋‹ˆ๋‹ค. ์ด๋Š” cuDNN์ด ์ดˆ๊ธฐํ™”์— ์‹คํŒจํ–ˆ๊ธฐ ๋•Œ๋ฌธ์ผ ์ˆ˜ ์žˆ์œผ๋ฏ€๋กœ ์œ„์— ์ธ์‡„ ๋œ ๊ฒฝ๊ณ  ๋กœ๊ทธ ๋ฉ”์‹œ์ง€๊ฐ€ ์žˆ๋Š”์ง€ ํ™•์ธํ•˜์‹ญ์‹œ์˜ค.

์‚ฝ์ž… ํ–ˆ์Šต๋‹ˆ๊นŒ :

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
try:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
except RuntimeError as e:
    print(e)

์ž…๋ ฅ ํ•œ ์ฝ”๋“œ ์ƒ๋‹จ์—?

๋„ค,์ด ๋ฌธ์ œ๋ฅผ ์ด๋ ‡๊ฒŒ ํ•ด๊ฒฐํ–ˆ์Šต๋‹ˆ๋‹ค. ๊ฐ์‚ฌํ•ฉ๋‹ˆ๋‹ค !!

๋‚˜๋Š” ๊ฐ™์€ ๋ฌธ์ œ๊ฐ€ ์žˆ์—ˆ๊ณ  allow_growth = True ๊ฐ€ ํ•ด๊ฒฐ์ฑ…์ด์—ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ TensorFlow 2์˜ ๊ฒฝ์šฐ์ด๋ฅผ ์ˆ˜ํ–‰ํ•˜๋ ค๋ฉด ๋‹ค์Œ ์ค„์„ ์ถ”๊ฐ€ํ•ด์•ผํ•ฉ๋‹ˆ๋‹ค.

gpu_devices = tf.config.experimental.list_physical_devices('GPU') for device in gpu_devices: tf.config.experimental.set_memory_growth(device, True)

์ด ๋ฌธ์ œ์˜ @opcecco ์‚ฌ์šฉ์ž์—๊ฒŒ ๊ฐ์‚ฌ๋“œ๋ฆฝ๋‹ˆ๋‹ค : https://github.com/tensorflow/tensorflow/issues/25446

ํฅ๋ฏธ๋กญ๊ฒŒ๋„ ๋ฌธ์ œ๋Š” conv2d๊ฐ€ ์•„๋‹™๋‹ˆ๋‹ค. ์ด ์„ธ ๋ช…๋ น์˜ ์ˆœ์„œ๋ฅผ ๋ณ€๊ฒฝํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ ํ•ญ์ƒ ํ•˜๋‚˜๊ฐ€ ์‹คํŒจํ•˜๋Š” ๊ฒƒ์ด ์„ธ ๋ฒˆ์งธ์ž…๋‹ˆ๋‹ค.

@roebel ๋ช‡ ๊ฐ€์ง€ ๋‹ค๋ฅธ 6 ๊ฐœ์˜ ์ˆœ์—ด์— ๋Œ€ํ•œ ๋กœ๊ทธ๋ฅผ ์ฒจ๋ถ€ ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๊นŒ?

ํ”„๋กœ๊ทธ๋žจ์„ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๋ณ€๊ฒฝํ•˜๋ฉด ์–ด๋–ป๊ฒŒ ๋˜๋‚˜์š”?

tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
tf.matmul(tf.zeros((2,2,2)), tf.zeros((2,2,2)))
tf.nn.conv2d(tf.zeros((2,20,20,20), dtype=tf.float32),
                                         filters=tf.zeros((2,2,20,20), dtype=tf.float32),
            strides=(1,1,1,1), padding="VALID")

Does the failure still happen at conv2d, or does it happen at the third stft?

@sanjoy sure, here are three variations of the script above with the order of the commands changed, and a fourth variant that starts with 4 stft and ends with conv2d.

The four logs use the script from
https://github.com/tensorflow/tensorflow/issues/24496#issuecomment-593098386
with the last four lines replaced.

In short, the results depending on the order:

stft->blas->conv2d fails when executing conv2d
conv2d->stft->blas fails when executing stft (so not the third one, but blas seems to be already loaded for conv2d)
matmul->conv2d->stft fails when executing stft
stft->stft->stft->stft->matmul->conv2d fails when conv2d is executed. Please see the logs below.

Feel free to ask for other variants if needed.

conv2d last :

tf.matmul(tf.zeros((2,2,2)), tf.zeros((2,2,2)))
tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
tf.nn.conv2d(tf.zeros((2,20,20,20), dtype=tf.float32),
                                         filters=tf.zeros((2,2,20,20), dtype=tf.float32),
            strides=(1,1,1,1), padding="VALID")
print("done")

log.conv2d.last.txt

matmul last :

tf.nn.conv2d(tf.zeros((2,20,20,20), dtype=tf.float32),
                                         filters=tf.zeros((2,2,20,20), dtype=tf.float32),
            strides=(1,1,1,1), padding="VALID")
tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
tf.matmul(tf.zeros((2,2,2)), tf.zeros((2,2,2)))
print("done")

log.matmul.last.txt

stft last :

tf.matmul(tf.zeros((2,2,2)), tf.zeros((2,2,2)))
tf.nn.conv2d(tf.zeros((2,20,20,20), dtype=tf.float32),
                                         filters=tf.zeros((2,2,20,20), dtype=tf.float32),
            strides=(1,1,1,1), padding="VALID")
tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
print("done")

log.stft.last.txt

4 stft first, conv2d last :

tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
tf.matmul(tf.zeros((2,2,2)), tf.zeros((2,2,2)))
tf.nn.conv2d(tf.zeros((2,20,20,20), dtype=tf.float32),
                                         filters=tf.zeros((2,2,20,20), dtype=tf.float32),
            strides=(1,1,1,1), padding="VALID")
print("done")

log.multi_stft.first.txt

๋งŽ์€ ๊ฐ์‚ฌ

๋‹ค์Œ ๊ตฌ์„ฑ์—์„œ ๋™์ผํ•œ ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒํ–ˆ์Šต๋‹ˆ๋‹ค.
(์†Œ์Šค ๋˜๋Š” ๋ฐ”์ด๋„ˆ๋ฆฌ)์—์„œ ์„ค์น˜๋œ TensorFlow : r1.13.1, r.1.13.2, r1.14
Python ๋ฒ„์ „ : 3.6.1
Bazel ๋ฒ„์ „ (์†Œ์Šค์—์„œ ์ปดํŒŒ์ผํ•˜๋Š” ๊ฒฝ์šฐ) :
GCC / ์ปดํŒŒ์ผ๋Ÿฌ ๋ฒ„์ „ (์†Œ์Šค์—์„œ ์ปดํŒŒ์ผํ•˜๋Š” ๊ฒฝ์šฐ) :
CUDA / cuDNN ๋ฒ„์ „ : cuDNN 7.4.1์ด์žˆ๋Š” CUDA 10
GPU ๋ชจ๋ธ ๋ฐ ๋ฉ”๋ชจ๋ฆฌ : RTX 2070 8GB.

๋‚˜๋Š”์ด ๋ฌธ์ œ๋ฅผ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ํ•ด๊ฒฐํ–ˆ์Šต๋‹ˆ๋‹ค.
(์†Œ์Šค ๋˜๋Š” ๋ฐ”์ด๋„ˆ๋ฆฌ)์—์„œ ์„ค์น˜๋œ TensorFlow : r1.12.0
Python ๋ฒ„์ „ : 3.6.9
GCC / ์ปดํŒŒ์ผ๋Ÿฌ ๋ฒ„์ „ : 4.8
CUDA / cuDNN ๋ฒ„์ „ : cuDNN 7.1.4๊ฐ€์žˆ๋Š” CUDA 9.0
GPU ๋ชจ๋ธ ๋ฐ ๋ฉ”๋ชจ๋ฆฌ : RTX 2070 8GB.
๋‹น์‹ ์—๊ฒŒ ๋„์›€์ด๋˜๊ธฐ๋ฅผ ๋ฐ”๋ž๋‹ˆ๋‹ค

๋‚˜๋Š” ๋˜ํ•œ ํ™˜๊ฒฝ ๋ณ€์ˆ˜ TF_FORCE_GPU_ALLOW_GROWTH = true๋ฅผ ์ถ”๊ฐ€ํ•˜์—ฌ ํ•ด๊ฒฐ ๋œ ์ด๋Ÿฌํ•œ ๋ฌธ์ œ์— ์ง๋ฉดํ–ˆ์Šต๋‹ˆ๋‹ค.

๊ตฌ์„ฑ์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.
์œˆ๋„์šฐ 10
์†Œ์Šค r2.0์—์„œ ์ปดํŒŒ์ผ ๋œ Tensorflow
Bazel : 0.26.1
C ++ ์ปดํŒŒ์ผ๋Ÿฌ : MSVC 2017
CUDA : 10
cuDNN : 7.6.5
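
The TF_FORCE_GPU_ALLOW_GROWTH variable from this comment can also be set from inside the script. This is a minimal sketch, assuming the variable has to be in place before TensorFlow initializes the GPU, so it is set before the import here. The TensorFlow import is left as a comment so the snippet runs on its own.

```python
import os

# Set before TensorFlow touches the GPU; assumed here to mean "before the import".
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

# import tensorflow as tf  # import TensorFlow only after the variable is set

print("TF_FORCE_GPU_ALLOW_GROWTH =", os.environ["TF_FORCE_GPU_ALLOW_GROWTH"])
```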

Intel 4930 CPU, NVIDIA Titan XP (Pascal)
Ubuntu 18.04.4, latest miniconda,
`!conda list | grep "cud"` gives

    cudatoolkit               10.1.243             h6bb024c_0  
    cudnn                     7.6.5                cuda10.1_0  

`!conda list | grep "tensor"` gives

tensorboard               2.1.0                     py3_0  
tensorflow                2.1.0           gpu_py37h7a4bb67_0  
tensorflow-base           2.1.0           gpu_py37h6c5654b_0  
tensorflow-estimator      2.1.0              pyhd54b08b_0  
tensorflow-gpu            2.1.0                h0d30ee6_0  

The first cell in the jupyter notebook is:

import tensorflow as tf
gpu_devices = tf.config.experimental.list_physical_devices('GPU')
for device in gpu_devices: tf.config.experimental.set_memory_growth(device, True)

๋ชจ๋ธ์€ ์ด ๋งค๊ฐœ ๋ณ€์ˆ˜๊ฐ€ 112,269 ์ธ ๋ณ€ํ˜• ์˜คํ†  ์ธ์ฝ”๋”์ž…๋‹ˆ๋‹ค.
x_train.shape, y_train.shape, x_test.shape, y_test.shape๋Š”
((106496, 32, 32, 1), (106496,), (12288, 32, 32, 1), (12288,))

์ฝ”๋“œ์—๋Š” ๋‹ค์Œ์ด ํฌํ•จ๋ฉ๋‹ˆ๋‹ค.

batch_size=64
var_auto_encoder.fit(x_train, x_train, verbose=1, 
                 batch_size=batch_size, epochs=100,
                 validation_data=(x_test, x_test))

and it fails. The console shows

2020-03-18 15:46:03.019451: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-03-18 15:46:03.179472: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-03-18 15:46:03.566267: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-03-18 15:46:03.569842: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-03-18 15:46:03.569907: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node conv2d/Conv2D}}]]
2020-03-18 15:46:03.573206: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

์œ„์—์„œ ์–ธ๊ธ‰ํ–ˆ๋“ฏ์ด ์ฒซ ๋ฒˆ์งธ ์…€ ๋Œ€์‹ ์— f๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.2
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

then I get this error

2020-03-18 15:55:43.050094: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-03-18 15:55:43.050123: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-03-18 15:55:43.050150: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-03-18 15:55:43.050177: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-03-18 15:55:43.050209: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-03-18 15:55:43.050246: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-03-18 15:55:43.050273: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-03-18 15:55:43.050337: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-18 15:55:43.050720: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-18 15:55:43.051063: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-03-18 15:55:43.051097: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-03-18 15:55:43.051108: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0 
2020-03-18 15:55:43.051116: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N 
2020-03-18 15:55:43.051201: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-18 15:55:43.051573: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-03-18 15:55:43.051915: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 16 MB memory) -> physical GPU (device: 0, name: TITAN X (Pascal), pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-03-18 15:56:07.877181: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-03-18 15:56:07.882424: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-03-18 15:56:07.886148: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-03-18 15:56:07.889830: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR


Why am I having the problem if I allow memory growth? Do I need to reboot to reinitialize the gpu?

ํฅ๋ฏธ๋กญ๊ฒŒ๋„, ์ œ๊ฐ€ ๊ณ ์ƒํ•˜๋Š” ๋™์•ˆ ๋ฉ”๋‰ด ๋ฐ”์—์žˆ๋Š” ๋นจ๊ฐ„์ƒ‰ '์ฐธ๊ฐ€ ๋ถˆ๊ฐ€'ํ‘œ์‹œ์—์„œ '์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•˜์—ฌ ์˜์กด์„ฑ์ด ์ถฉ์กฑ๋˜์ง€ ์•Š์•˜์Šต๋‹ˆ๋‹ค'๋ผ๋Š” ๋ฉ”์‹œ์ง€๋ฅผ ๋ฐ›์•˜์Šต๋‹ˆ๋‹ค.
์†Œํ”„ํŠธ์›จ์–ด ์—…๋ฐ์ดํŠธ๋ฅผ ์‹คํ–‰ํ–ˆ๋Š”๋ฐ libcudnn7-dev ๋ฐ libcudnn7-doc๋ฅผ ์ œ๊ฑฐํ•˜๊ณ  ์‹ถ์Šต๋‹ˆ๋‹ค.
Linux์™€ ๊ด€๋ จ๋œ 57 ๊ฐœ์˜ ๋‹ค๋ฅธ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ๋ฅผ ์—…๊ทธ๋ ˆ์ด๋“œํ•ฉ๋‹ˆ๋‹ค.

ํŽธ์ง‘ : ์žฌ๋ถ€ํŒ… ํ›„ ๋ชจ๋ธ์ด ๋‹ค์Œ์„ ์‚ฌ์šฉํ•˜์—ฌ ์„ฑ๊ณต์ ์œผ๋กœ ํ•™์Šตํ•˜๋Š” ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค.

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.2
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

or this:

import tensorflow as tf
gpu_devices = tf.config.experimental.list_physical_devices('GPU')
for device in gpu_devices: tf.config.experimental.set_memory_growth(device, True)

GPU์˜ ๋ฉ”๋ชจ๋ฆฌ ์‚ฌ์šฉ๋ฅ ์€ ๋ฐฐ์น˜ ํฌ๊ธฐ๊ฐ€ 16 ์ธ ๊ฒฝ์šฐ 700MB ๋ฏธ๋งŒ์ด๋ฉฐ
๋ฐฐ์น˜ ํฌ๊ธฐ๊ฐ€ 256 ์ธ ๊ฒฝ์šฐ ~ 1GB (3 ๋ฐฐ ๋” ๋น ๋ฅด๊ฒŒ ํ•™์Šต)

์†Œ์Šค์—์„œ ์ปดํŒŒ์ผ์„ ์‹œ๋„ํ–ˆ์ง€๋งŒ ๊ฐ™์€ ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒํ–ˆ์Šต๋‹ˆ๋‹ค. ๋งˆ์นจ๋‚ด config.gpu_options.allow_growth = True ์„ค์ •ํ•˜๋Š” ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•  ์ˆ˜์žˆ์—ˆ์Šต๋‹ˆ๋‹ค.

ํ•˜์ง€๋งŒ ๋ช…๋ น ์ค„์—์„œ์ด ๋ฌธ์ œ๋ฅผ ๋งŒ๋‚œ ๊ฒฝ์šฐ ์ด๋Ÿฌํ•œ ์ฝ”๋“œ๋ฅผ ์ถ”๊ฐ€ํ•˜๋Š” ๋ฐฉ๋ฒ•์€ ๋ฌด์—‡์ž…๋‹ˆ๊นŒ?
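
For a script that cannot be edited, one option is to put the TF_FORCE_GPU_ALLOW_GROWTH variable reported earlier in this thread into the environment of the launched process. The sketch below uses a hypothetical helper name, `run_with_growth`, and a stand-in child command that only echoes the variable. It does not start TensorFlow.

```python
import os
import subprocess
import sys


def run_with_growth(args):
    """Hypothetical helper: run `python <args>` with TF_FORCE_GPU_ALLOW_GROWTH=true."""
    env = dict(os.environ, TF_FORCE_GPU_ALLOW_GROWTH="true")
    return subprocess.run([sys.executable, *args], env=env,
                          capture_output=True, text=True)


if __name__ == "__main__":
    # Stand-in for a real training script: the child just reports the variable.
    result = run_with_growth(
        ["-c", "import os; print(os.environ['TF_FORCE_GPU_ALLOW_GROWTH'])"])
    print(result.stdout.strip())
```

From a POSIX shell, the same effect can be had by prefixing the command with the assignment, for example `TF_FORCE_GPU_ALLOW_GROWTH=true python train.py`, where `train.py` is a placeholder name.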

I have run into this problem too
anacondacloud install tensorflow-gpu2.0
rtx2070s
tensorflow-gpu.2.0.0
cuda 10.0.13
cudnn 7.6.5
Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

Did you insert:

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        print(e)

at the top of your entry code?

I was having exactly the same problem as above. Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR

The solution from @robosmith fixes my issue completely!

My specs:
RTX 2070
Ubuntu 18.04 LTS
Tensorflow 2.1.0
Keras 2.3.0
cudnn 7.6.5
cuda10.1.0
conda 4.8.3
python 3.7.7

Built via conda install tensorflow-gpu keras

Thank you so much! This is the first time I have gotten TF-2 to work at all! And TF-1 had stopped working altogether, which is why I decided to upgrade and 'see what happens'!

Thank you!

config.gpu_options.allow_growth = True

When you use tensorflow 2.0, you can use
tf.config.experimental.set_memory_growth(tf.config.list_physical_devices('GPU')[0], True)
This line goes after import tensorflow as tf but before the rest of your code.

I tried compiling from source but ran into the same issue. I was finally able to fix my problem by setting config.gpu_options.allow_growth = True.

This code is shared so that both tensorflow and keras users can find it more quickly.
Source from here

# Tensorflow
import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config, ...)


#And for Keras
from keras.callbacks import ModelCheckpoint
from keras.models import Model, load_model, save_model, Sequential
from keras.layers import Dense, Activation, Dropout, Input, Masking, TimeDistributed, LSTM, Conv1D
from keras.layers import GRU, Bidirectional, BatchNormalization, Reshape
from keras.optimizers import Adam
from keras.backend.tensorflow_backend import set_session
import tensorflow as tf
config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # dynamically grow the memory used on the GPU
config.log_device_placement = True  # to log device placement (on which device the operation ran)
sess = tf.Session(config=config)
set_session(sess)  # set this TensorFlow session as the default session for Keras

Just wanted to chime in and say that the problem is still there.

My specs:
Ubuntu 20.04
NVIDIA RTX 2070
Nvidia_driver 440.64
Tensorflow-gpu 2.0.1 (installed through conda, which automatically installs Cudatoolkit and CuDNN in the same env)
cudatoolkit 10.1.243
cudnn 7.6.5

The problem is solved by tf.config.experimental.set_memory_growth(tf.config.list_physical_devices('GPU')[0], True)

However, this seems more like a workaround than an actual fix, and a lot of people have 20XX cards these days. There should probably be an update that addresses this issue.

Update: Since I am dual-booting, I checked Windows as well. The problem persists there.
Windows 10
Nvidia driver 445.87
Other than that everything is similar

Installing the latest driver (445.87) for my RTX 2080 solved this issue for me.

@NBouman That is interesting, but for me on Ubuntu 18.04 with a GeForce GTX 1050 TI, I just updated to the latest available driver, 440.82, and allowing memory growth is still required to make it work.

> Installing the latest driver (445.87) for my RTX 2080 solved this issue for me.

@NBouman What OS are you using? I am on Ubuntu 20.04, the latest driver I could find is 440.82, and, as for @roebel, the problem persists.

@roebel @eduardoscsouza I am on Windows 10, on the machine that previously had this issue.

For tensorflow 2.0.0 it worked with:
tf.config.experimental.set_memory_growth(tf.config.experimental.list_physical_devices('GPU')[0],True)

Thank you!!! Thousands of thanks!!!!!!!!!!!!!!!

OS: ubuntu 18.04 lts

Driver Version: 435.21

CUDA: cudatoolkit 10.1

CUDNN: cudnn-7.6.5-cuda10.1_0

I used anaconda to install tensorflow:

conda create -n tf-gpu tensorflow-gpu

The cudatoolkit and cudnn are installed automatically by anaconda through the command above.

I have the same problem. The error:

coreClock: 1.5315GHz coreCount: 3 deviceMemorySize: 1.96GiB deviceMemoryBandwidth: 44.76GiB/s
2020-05-12 17:58:44.119679: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-05-12 17:58:44.119694: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-05-12 17:58:44.119707: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2020-05-12 17:58:44.119719: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2020-05-12 17:58:44.119732: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2020-05-12 17:58:44.119744: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2020-05-12 17:58:44.119756: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-12 17:58:44.119819: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-12 17:58:44.120069: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-12 17:58:44.120277: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1697] Adding visible gpu devices: 0
2020-05-12 17:58:44.120308: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2020-05-12 17:58:44.174976: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-05-12 17:58:44.175003: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      0 
2020-05-12 17:58:44.175012: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] 0:   N 
2020-05-12 17:58:44.175136: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-12 17:58:44.175392: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-12 17:58:44.175624: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-05-12 17:58:44.175844: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1241] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1439 MB memory) -> physical GPU (device: 0, name: GeForce MX150, pci bus id: 0000:01:00.0, compute capability: 6.1)
2020-05-12 17:58:44.177113: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55abc3d20b80 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-05-12 17:58:44.177129: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): GeForce MX150, Compute Capability 6.1
2020-05-12 17:58:44.177749: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 376320000 exceeds 10% of system memory.
2020-05-12 17:58:44.787493: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 376320000 exceeds 10% of system memory.
WARNING:tensorflow:Layer my_model is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2.  The layer has dtype float32 because it's dtype defaults to floatx.

If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.

To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.

2020-05-12 17:58:45.311821: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2020-05-12 17:58:45.467966: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-05-12 17:58:45.904025: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-05-12 17:58:45.913861: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-05-12 17:58:45.913978: W tensorflow/core/common_runtime/base_collective_executor.cc:217] BaseCollectiveExecutor::StartAbort Unknown: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node my_model/conv2d/Conv2D}}]]

So we have here a problem that is unresolved (apart from a workaround that goes against the official recommendation not to use memory growth, for more efficient memory handling). There has not been much feedback from the dev team. I wonder why?

This bug seems to affect quite a variety of tensorflow versions (1.13, 2.0, 2.1). If I saw correctly, all problems are reported to happen with cuda 10. The code runs fine on many cards but not on others.
Could somebody from the dev team tell us whether this hints at a problem in the cuda driver rather than in the tensorflow layer? In that case it would certainly be helpful to pass the bug report on to the NVIDIA support pages, wouldn't it?

Could somebody from the tensorflow dev team comment on how they see this bug? Is anybody looking into this?

Have people been checking whether there are two CuDNN 7 shared libraries on the path or LD library path? There are no minor or patch numbers in this library name, but version mismatches can lead to this error message.
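A quick way to run that check is to list every libcudnn file reachable through the library path. This is only a sketch: the function name is made up for illustration, and it assumes a Linux-style LD_LIBRARY_PATH. It does not look at rpath entries or system default directories, so an empty result does not prove there is no second copy.

```python
import os
from pathlib import Path


def find_cudnn_libraries(search_path):
    """Return (path, resolved file name) for every libcudnn.so* on the given path string."""
    found = []
    for entry in search_path.split(os.pathsep):
        directory = Path(entry)
        if not entry or not directory.is_dir():
            continue  # skip empty entries and directories that do not exist
        for candidate in sorted(directory.glob("libcudnn.so*")):
            # resolve() follows symlinks, which exposes the fully versioned file name
            found.append((str(candidate), candidate.resolve().name))
    return found


if __name__ == "__main__":
    for link, target in find_cudnn_libraries(os.environ.get("LD_LIBRARY_PATH", "")):
        print(f"{link} -> {target}")
```

If more than one directory shows up, the resolved names (for example libcudnn.so.7.6.5 next to libcudnn.so.7.4.1) show which minor and patch versions are competing.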

I opened a bug report at NVIDIA. I will let you know what comes out of it.

Indeed, I have many versions of libcudnn installed; each anaconda env has its own version.
Normally anaconda installs with rpath set up properly, so it is rather difficult not to get the right libraries.

I made an strace and grepped for the libraries that are opened when it fails.
They consistently come from the anaconda env dir that hosts the tensorflow package (see below),
apart from libcuda, which is version 440.82 and which I compiled with the NVIDIA installer.

I can set LD_LIBRARY_PATH to one of the other anaconda env lib dirs with a different cudatoolkit and a different libcudnn; the trace remains the same.
Note as well that it is not libcudnn that causes the problem. It is always the third libcuxyz library that is used, and only on specific GPUs (I have used the same install script on different machines with different GPUs; some work, some don't), and they all work if memory growth is enabled.

(tf2.1) m3088.roebel: (test_sd) 510> grep open trace.log  | grep libcu | grep -v -- -1
openat(AT_FDCWD, "/usr/lib/x86_64-linux-gnu/libcuda.so.1", O_RDONLY|O_CLOEXEC) = 4
openat(AT_FDCWD, "/data/anasynth/anaconda3/envs/tf2.1/lib/python3.7/site-packages/tensorflow_core/python/../../../../libcudart.so.10.1", O_RDONLY|O_CLOEXEC) = 11
openat(AT_FDCWD, "/data/anasynth/anaconda3/envs/tf2.1/lib/python3.7/site-packages/tensorflow_core/python/../../../../libcublas.so.10", O_RDONLY|O_CLOEXEC) = 11
openat(AT_FDCWD, "/data/anasynth/anaconda3/envs/tf2.1/lib/python3.7/site-packages/tensorflow_core/python/../../../.././libcublasLt.so.10", O_RDONLY|O_CLOEXEC) = 11
openat(AT_FDCWD, "/data/anasynth/anaconda3/envs/tf2.1/lib/python3.7/site-packages/tensorflow_core/python/../../../../libcufft.so.10", O_RDONLY|O_CLOEXEC) = 11
openat(AT_FDCWD, "/data/anasynth/anaconda3/envs/tf2.1/lib/python3.7/site-packages/tensorflow_core/python/../../../../libcurand.so.10", O_RDONLY|O_CLOEXEC) = 11
openat(AT_FDCWD, "/data/anasynth/anaconda3/envs/tf2.1/lib/python3.7/site-packages/tensorflow_core/python/../../../../libcusolver.so.10", O_RDONLY|O_CLOEXEC) = 11
openat(AT_FDCWD, "/data/anasynth/anaconda3/envs/tf2.1/lib/python3.7/site-packages/tensorflow_core/python/../../../../libcusparse.so.10", O_RDONLY|O_CLOEXEC) = 11
openat(AT_FDCWD, "/data/anasynth/anaconda3/envs/tf2.1/lib/python3.7/site-packages/tensorflow_core/python/../../../../libcudnn.so.7", O_RDONLY|O_CLOEXEC) = 11
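For anyone who wants to repeat this check on their own trace, here is a small sketch that does roughly what the grep pipeline above does and also normalizes the directories. The function name and regular expression are illustrative, not part of any tool, and the sketch assumes strace's usual openat(...) = fd line format.

```python
import posixpath
import re

# Matches openat() calls on libcu* files and captures the path, file name and return value
OPENAT_RE = re.compile(r'openat\([^,]+, "([^"]*/(libcu[^/"]*))", [^)]*\) = (-?\d+)')


def opened_cuda_libraries(trace_lines):
    """Map each successfully opened libcu* library name to its normalized directory."""
    libraries = {}
    for line in trace_lines:
        match = OPENAT_RE.search(line)
        if match is None or int(match.group(3)) < 0:
            continue  # not a CUDA library, or the open failed (return value -1)
        path, name = match.group(1), match.group(2)
        libraries[name] = posixpath.normpath(posixpath.dirname(path))
    return libraries
```

Called on the lines of trace.log, it returns one entry per library, so a library loaded from an unexpected directory stands out immediately.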

I got the same issue on Ubuntu 20.04 with a GeForce RTX 2060 SUPER. A NN with dense layers works well, but with CNN layers I get Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
Adding tf.config.experimental.set_memory_growth(tf.config.list_physical_devices('GPU')[0], True) makes no difference to the error.
I followed the installation instructions at https://www.tensorflow.org/install/gpu, and nvidia-smi shows:
Driver Version: 440.64.00 CUDA Version: 10.2
My conda env has:

cudatoolkit               10.1.243             h6bb024c_0
cudnn                     7.6.5                cuda10.1_0
tensorflow-gpu            2.1.0                h0d30ee6_0

In a conda env with tf 1.15 I get the same error. It would be great if this could be fixed.

Update

After setting export TF_FORCE_GPU_ALLOW_GROWTH=true everything works. I was under the impression that tf.config.experimental.set_memory_growth(tf.config.list_physical_devices('GPU')[0], True) would do the same thing, but it does not. I think this should be stated clearly on the TensorFlow GPU support web page.
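If exporting the variable in the shell is not convenient, the same setting can be made from Python. This is a sketch with a made-up helper name; the only thing taken from this thread is the variable TF_FORCE_GPU_ALLOW_GROWTH itself. I am assuming the variable has to be in place before TensorFlow initializes the GPU, so setting it before the import is the safe choice.

```python
import os


def force_gpu_allow_growth(env=None):
    """Set TF_FORCE_GPU_ALLOW_GROWTH unless the caller already chose a value."""
    env = os.environ if env is None else env
    return env.setdefault("TF_FORCE_GPU_ALLOW_GROWTH", "true")


force_gpu_allow_growth()
# Import TensorFlow only after the variable is set, for example:
# import tensorflow as tf
```

Using setdefault keeps a value that was already exported in the shell, so the script does not silently override a deliberate choice.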

So you are illustrating my point: libcudnn.so.7 does not say 7.XXX.YYY, and on top of that 7.XXX.YYY has a further dependency on CUDA 10.2, 10.1, 10.0, 9.2, 9.1, 9.0, etc.

I have not seen the error since I started managing the path well and managing the amount of memory available before initialising a graph of a known size, making sure that the targeted GPU only used enough memory for the graph and enough memory to query how much CUDA memory is available.

I think it is a resource problem. How much memory is available when you start the process, and how much memory does your graph use?

@kognat-docs

> So you are illustrating my point: libcudnn.so.7 does not say 7.XXX.YYY, and on top of that 7.XXX.YYY has a further dependency on CUDA 10.2, 10.1, 10.0, 9.2, 9.1, 9.0, etc.

The question you posed was "Have people been checking whether there are two CuDNN 7 shared libraries on the path or LD library path?" And my answer was: I have checked this.
I sent you the trace.

> I have not seen the error since I started managing the path

What do you mean by managing the path?
I always manage my paths! I have installed a conda environment that I verified to be consistent! Everything is as it was packaged by anaconda. I verified this.

Anyway, you may believe I am too stupid to set up anaconda. Well,
I have now downloaded the official docker image

tensorflow/tensorflow:2.1.0-gpu-py3

and run my script in there. It crashes if I do not have

export TF_FORCE_GPU_ALLOW_GROWTH=true

Can I manage paths any better?

> and managing the amount of memory available before initialising a graph of a known size, making sure that the targeted GPU only used enough memory for the graph and enough memory to query how much CUDA memory is available.

> I think it is a resource problem. How much memory is available when you start the process, and how much memory does your graph use?

As I wrote above in my report, there is no graph (or, better said, there is hardly a graph)! I just run these four lines

import tensorflow as tf
tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
tf.matmul(tf.zeros((2,2,2)), tf.zeros((2,2,2)))
tf.nn.conv2d(tf.zeros((2,20,20,20), dtype=tf.float32), filters=tf.zeros((2,2,20,20), dtype=tf.float32), strides=(1,1,1,1), padding="VALID")

and it crashes. If I change the order of the three operations, it always crashes after these three operations (I explained this in my bug report).

Just for the fun of it I counted the bytes: less than 83kB of data memory is required. The GPU is empty, I do not use it for graphics, and no other processes are running on it. On the various systems there are 4GB or 11GB available! Besides, I know how to run nvidia-smi! So the card is empty, yet I cannot run these 4 lines that require 84kB!
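One way to arrive at a figure just under 83kB is to add up the float32 input tensors of the four-line script. That is an assumption on my part: the comment does not say how the bytes were counted, and outputs and library workspaces are left out here.

```python
from math import prod

FLOAT32_BYTES = 4

# Shapes of the float32 tensors created as inputs by the four-line script
input_shapes = {
    "stft signal": (3000,),
    "matmul lhs": (2, 2, 2),
    "matmul rhs": (2, 2, 2),
    "conv2d input": (2, 20, 20, 20),
    "conv2d filters": (2, 2, 20, 20),
}


def tensor_bytes(shape, itemsize=FLOAT32_BYTES):
    """Size in bytes of a dense tensor with the given shape."""
    return prod(shape) * itemsize


total_input_bytes = sum(tensor_bytes(shape) for shape in input_shapes.values())
print(total_input_bytes)  # 82464
```

Whichever way the count is done, the data is several orders of magnitude smaller than the 4GB or 11GB on the cards, which is the point being made.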

For your information, an error due to exhausted memory looks quite different; I get those as well. For my real graphs I am perfectly able to detect these and react accordingly.

Thanks for your efforts anyway.

@roebel Did you see @sanjoy's comment about debugging from the cpp: https://github.com/tensorflow/tensorflow/issues/24496#issuecomment-561366750 ?

I have not gotten around to recompiling tensorflow and trying it out. Their versions move so fast that it would take me a while to set up and compile everything. Also, 1.15 dropped support for the gcc version I use, and 1.13 does not receive any updates, so it was somewhat pointless for me to debug this anyway.

@roebel I did not recall what triggered the issue for you.

See https://github.com/tensorflow/tensorflow/issues/24496#issuecomment-480549043

This is why I thought it was memory related. This issue has not affected me for some time, nor the users of my software on a variety of platforms.

Yes, if there is a bug, it seems to be triggered only by particular situations.

Thanks. I will see whether I can compile the most recent version, tf2.2.0.

In fact I tried the docker image with tensorflow 2.2; it uses the same version of cuda, 10.1, and has the same problem.

I thought this was a Windows-only issue, so I installed an Ubuntu environment from scratch, only to find out that my graphics card (RTX 2080) is the issue. Unfortunately, I think I am going to choose a different machine learning platform because of this, since it seems to have been a problem since 2018.

Have you watched how much memory is in use by running watch on nvidia-smi, with an interval of 50ms, while your process is running?
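If watch is not available, the same observation can be scripted. This is a sketch under a few assumptions: nvidia-smi is on the PATH and accepts its --query-gpu=memory.used and --format=csv,noheader,nounits options, and the function names are made up for illustration. Each nvidia-smi call has its own start-up cost, so the real spacing between samples may be longer than the requested 50ms.

```python
import shutil
import subprocess
import time


def parse_memory_used(csv_output):
    """Parse one nvidia-smi reading: one line per GPU, used memory in MiB."""
    return [int(line) for line in csv_output.splitlines() if line.strip()]


def poll_gpu_memory(samples=20, interval_s=0.05):
    """Sample used GPU memory repeatedly; returns [] when nvidia-smi is not installed."""
    if shutil.which("nvidia-smi") is None:
        return []
    readings = []
    for _ in range(samples):
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        )
        readings.append(parse_memory_used(result.stdout))
        time.sleep(interval_s)
    return readings
```

Running poll_gpu_memory() in a second terminal while the failing script starts shows whether the used memory jumps right before the cuDNN error appears.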

See this fix that worked for other people:

https://github.com/tensorflow/tensorflow/issues/24496#issuecomment-497202806

Here is a related post from 4 years ago:

https://stackoverflow.com/questions/34199233/how-to-prevent-tensorflow-from-allocating-the-totality-of-a-gpu-memory

Or you can read the friendly manual:
https://www.tensorflow.org/guide/gpu#limiting_gpu_memory_growth

So you can apply the patch without touching the code, just by altering your runtime environment.

Another way to enable this option is to set the environmental variable TF_FORCE_GPU_ALLOW_GROWTH to true. This configuration is platform specific.

Good news!
Following
https://github.com/tensorflow/tensorflow/issues/24496#issuecomment-561366750

I rebuilt version 2.1 using the anaconda tensorflow recipe from here:
https://github.com/AnacondaRecipes/tensorflow_recipes

I added two prints in MinSystemMemory showing available_memory and min_system_memory.
On my system with a GeForce GTX 1050 Ti, with the TF standard log disabled,
I got this:

TF_CPP_MIN_LOG_LEVEL=2 python run_cuda.py 
=========================================================
MinSystemMemory: available_memory::4163764224
MinSystemMemory: min_system_memory::314572800
=========================================================
1 Physical GPUs, 1 Logical GPUs
2020-05-21 09:44:32.143642: E tensorflow/stream_executor/cuda/cuda_fft.cc:223] failed to make cuFFT batched plan:5
2020-05-21 09:44:32.143671: E tensorflow/stream_executor/cuda/cuda_fft.cc:426] Initialize Params: rank: 1 elem_count: 512 input_embed: 512 input_stride: 1 input_distance: 512 output_embed: 257 output_stride: 1 output_distance: 257 batch_count: 20
2020-05-21 09:44:32.143677: F tensorflow/stream_executor/cuda/cuda_fft.cc:435] failed to initialize batched cufft plan with customized allocator: Failed to make cuFFT batched plan.
Aborted

nvidia-smi reports that the GPU has 4040MiB. On this system X is running on the card and uses 13MiB, so the numbers seem fine.

min_system_memory is set like this:

    min_system_memory =                                                                                                                        
        std::max(int64{314572800}, static_cast<int64>(available_memory * 0.05));                                                               

So the larger of the two values is chosen anyway. Instead, I added a mechanism to force min_system_memory via an environment variable, TF_FORCE_MIN_SYSTEM_MEMORY_MB.
Then running

TF_FORCE_MIN_SYSTEM_MEMORY_MB=310 TF_CPP_MIN_LOG_LEVEL=2 python run_cuda.py 
=========================================================
MinSystemMemory: available_memory::4163764224
MinSystemMemory: min_system_memory::314572800
MinSystemMemory: forced min_system_memory::325058560
=========================================================
1 Physical GPUs, 1 Logical GPUs
done

problem solved!

Unfortunately I do not currently have a system with a working RTX card, and I am not sure when those will be working again. If anybody is willing to test this on such a card, I can provide the pip package and the contents of the conda environment for ubuntu linux that needs to be installed to run it.

Nice one @roebel !

It might be worth proposing that as a pull request and adding it to the docs.

> It might be worth proposing that as a pull request and adding it to the docs.

Certainly, but the problem is that the solution will probably not work for other cards.
For my GTX 1050 the total memory is 4GB, and the default system memory retained
by tensorflow is max(300MB, 4GB * 0.05). So for the GTX1050 this is 300MB, which apparently is too small. As mentioned above, I need to increase it to 310MB.

Now for the RTX2080 the total memory is 11GB, which with max(300MB, 11GB * 0.05)
selects a system memory of 550MB, which according to the findings on the 1050
should normally be enough.

I will have access to the RTX2080 GPUs again by the end of the week and will see
what I get there.

Finally I have been able to run the patched library on the rtx 2080 cards.
As expected, the patched version does not pass. Here is the script again

import tensorflow as tf
tf.signal.stft(tf.zeros(3000, dtype=tf.float32), 512, 128)
tf.matmul(tf.zeros((2,2,2)), tf.zeros((2,2,2)))
tf.nn.conv2d(tf.zeros((2,20,20,20), dtype=tf.float32), filters=tf.zeros((2,2,20,20), dtype=tf.float32), strides=(1,1,1,1), padding="VALID")

And here is the matrix of the available memory reported by gpu_device.cc,
the default value of min_system_memory as selected in gpu_device.cc, and
the minimum value of min_system_memory I need to select for the script not to abort:

Card | AvailMem | Def MinSysMem | Required MinSysMem
:------- | :----------- | :---------- | :-----------------------
1050TI | 4163764224 | 314572800 | 325058560
1080TI | 11567431680 | 578371584 | 335544320
2080 TI | 11381964800 | 569098240 | 618659840

So while the 1050 and 1080 run the script with about the same memory size,
the RTX2080 requires nearly twice as much memory. This does not sound good
to me.

Any suggestions for what to try to get this down to a comparable value?
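The Def MinSysMem column can be reproduced from the std::max expression quoted earlier, and converting the last column to MiB makes the gap easier to see. This is a plain Python transcription for checking the arithmetic, not TensorFlow code; the function name is illustrative, and TF_FORCE_MIN_SYSTEM_MEMORY_MB exists only in the patched build described in this thread.

```python
MIB = 1024 * 1024


def default_min_system_memory(available_bytes):
    """Python transcription of max(int64{314572800}, available_memory * 0.05)."""
    return max(300 * MIB, int(available_bytes * 0.05))


# (available memory, smallest working min_system_memory) per card, from the table
cards = {
    "1050TI": (4163764224, 325058560),
    "1080TI": (11567431680, 335544320),
    "2080 TI": (11381964800, 618659840),
}

for name, (available, required) in cards.items():
    default = default_min_system_memory(available)
    verdict = "enough" if default >= required else "too small"
    print(f"{name}: default {default // MIB} MiB, required {required // MIB} MiB ({verdict})")
```

In MiB the required values are 310, 320 and 590, so the 2080 TI needs about 1.9 times what the 1050TI needs, and its 5% default falls short just as the fixed 300 MiB floor does on the 1050TI.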

I have been struggling with this in my C++ application for a number of iterations.

What it came down to in the end was the following.

Only run models on the GPU when enough memory is available to run the model.

So the amount of memory that the model will require is quantifiable.

So you need to have GPU memory, as a percentage, that will fit that model.

Then you also need to know exactly how much memory is available on the card before allocating the memory, which is subject to race conditions because you do not know what else is using CUDA memory at the same time on the operating system.

But the race condition aside, you also need to measure the free memory.

This is done using cudaMemInfo, which in itself uses memory.

So, provided that you have enough memory to run cudaMemInfo once to measure, you need to make sure that enough memory is free to fit the model and to run cudaMemInfo one more time; then and only then can you allocate a sufficient percentage of the available VRAM on that card for running the model.
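The decision described above can be sketched as plain arithmetic. Everything here is illustrative: the function name and the query_overhead_bytes allowance are made up, and the free and total values are assumed to come from a memory query (as far as I know the CUDA runtime call is cudaMemGetInfo, which the comment above calls cudaMemInfo).

```python
def gpu_fraction_for_model(free_bytes, total_bytes, model_bytes, query_overhead_bytes=0):
    """Return the fraction of total GPU memory to reserve, or None if the model does not fit.

    free_bytes and total_bytes are whatever the memory query reported;
    query_overhead_bytes is a hypothetical allowance for the query itself.
    """
    needed = model_bytes + query_overhead_bytes
    if needed > free_bytes:
        return None  # not enough free memory: fall back to the CPU or wait
    return needed / total_bytes


# Example: 2 GiB model on an 8 GiB card with 4 GiB currently free
print(gpu_fraction_for_model(4 * 2**30, 8 * 2**30, 2 * 2**30))  # 0.25
```

In TF 1.x a fraction like this could be passed as config.gpu_options.per_process_gpu_memory_fraction; the race condition mentioned above still applies, because the free value may be stale by the time the allocation happens.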

Anyhow, the take-home from my random babble is that cudaMemInfo is required to poll the amount of memory available to allocate.

Maybe the amount of memory used by cudaMemInfo is somehow different on a Turing based card compared to a Pascal based card. I can get someone from NVIDIA to have a look if you wish.

Yeah, I cannot find a reference to cudaMemInfo at all, but that seems like the kind of footprint that would be the max of 300MB and 5% of the card's memory.

Having a look at:

https://github.com/tensorflow/tensorflow/blob/r2.2/tensorflow/core/common_runtime/gpu/gpu_process_state.cc

It does not seem to be using this per se.

I do not think we should be playing cat and mouse with the amount of memory we need to reserve for system libraries.

Instead, IMO we should initialize the system libraries before the BFC allocator has had a chance to allocate the rest of the GPU memory.

CC @chsigg

Probably one should do this only if allow memory growth is off. Otherwise you will always need about 580MB on the 2080 even if you don't need all the operators.

I ran a few more tests on the minimum system memory requirements for running combinations of the three operations from my test case. I compare only the 1080 and 2080 cards. You won't find Conv2D alone because it initializes blas in any case. The result:

GPU | MatMul | STFT | Conv2D + MatMul | MatMul + STFT | MatMul + STFT + Conv2D
:--- | :--- | :--- | :--- | :--- | :---
1080 | 140 MB | 130 MB | 290 MB | 170 MB | 320 MB
2080 | 190 MB | 190 MB | 520 MB | 250 MB | 580 MB

One can see that on the 2080, CUDA requires an overhead for each operation, and that this overhead increases when more libraries are used. In most cases the overhead is <100MB, but it becomes >220MB once Conv2D is involved.
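The overhead figures quoted above follow directly from the table. A small sketch that recomputes them (the dictionary layout and variable names are only illustrative, the numbers are copied from the table):

```python
# Minimum memory in MB per operation mix, copied from the table above.
required_mb = {
    "MatMul": {"1080": 140, "2080": 190},
    "STFT": {"1080": 130, "2080": 190},
    "Conv2D + MatMul": {"1080": 290, "2080": 520},
    "MatMul + STFT": {"1080": 170, "2080": 250},
    "MatMul + STFT + Conv2D": {"1080": 320, "2080": 580},
}

# Extra memory the 2080 needs compared to the 1080 for the same operations.
overhead_mb = {ops: mem["2080"] - mem["1080"] for ops, mem in required_mb.items()}

if __name__ == "__main__":
    for ops, extra in overhead_mb.items():
        print(f"{ops}: +{extra} MB on the 2080")
```

The rows without Conv2D come out at 50 to 80 MB, the two rows with Conv2D at 230 and 260 MB.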

If @samhodge has been in contact with NVIDIA

Hello everyone!
I solved a similar problem by limiting memory growth, and you can try that.

You can find the code in the section Limit memory growth.

(This is my first comment on GitHub)

I had a similar issue before. Manually limiting GPU memory helped. https://github.com/tensorflow/tensorflow/issues/25160#issuecomment-643703167

I have the same issue on Ubuntu 20.04 with a GeForce RTX 2060 SUPER. A NN with dense layers works well. But with CNN layers I get Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
Adding tf.config.experimental.set_memory_growth(tf.config.list_physical_devices('GPU')[0], True) makes no difference to the error.
I followed the installation according to https://www.tensorflow.org/install/gpu and nvidia-smi shows:
Driver Version: 440.64.00 CUDA Version: 10.2
My conda environment has:

cudatoolkit               10.1.243             h6bb024c_0  
cudnn                     7.6.5                cuda10.1_0  
tensorflow-gpu            2.1.0                h0d30ee6_0

I get the same error in a conda environment with tf 1.15. It would be great if this could be fixed.

Update

After doing export TF_FORCE_GPU_ALLOW_GROWTH=true everything works. I was under the impression that tf.config.experimental.set_memory_growth(tf.config.list_physical_devices('GPU')[0], True) would do the same thing, but it does not. I think this should be stated clearly on the TensorFlow GPU support web page.
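For cases where exporting the variable in the shell is inconvenient, the same setting can be applied from Python. This is a minimal sketch, assuming the variable is read when TensorFlow sets up the GPU, so it has to be in place before that point; setting it before the import is the cautious choice. The helper name `force_gpu_allow_growth` is hypothetical.

```python
import os


def force_gpu_allow_growth(env=os.environ):
    """Set TF_FORCE_GPU_ALLOW_GROWTH=true in the given environment mapping."""
    env["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"
    return env


# Call this first, then import TensorFlow afterwards.
force_gpu_allow_growth()
# import tensorflow as tf  # import only after the variable has been set
```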

Dude, your solution saved my life.

Nvidia just released the 440.100 and 450.51 (Beta) Linux display drivers.
I tried 440.100, and it didn't fix the issue. Has anyone tried the beta 450.51?

Hello.

I tried 450.36.06. Check https://github.com/tensorflow/tensorflow/issues/25160#issuecomment-643703167

The code that worked for me:

import tensorflow as tf
config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.compat.v1.InteractiveSession(config=config)

Code to reproduce the issue

import os
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import math
import time
# Killing optional CPU driver warnings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
tf.logging.set_verbosity(tf.logging.ERROR)


class Model:

    def __init__(self, image, label):
        """
        A Model class contains a computational graph that classifies images
        to predictions. Each of its methods builds part of the graph
        on Model initialization. Do not modify the constructor, as doing so
        would break the autograder. You may, however, add class variables
        to use in your graph-building. e.g. learning rate, 

        image: the input image to the computational graph as a tensor
        label: the correct label of an image as a tensor
        prediction: the output prediction of the computational graph,
                    produced by self.forward_pass()
        optimize: the model's optimizing tensor produced by self.optimizer()
        loss: the model's loss produced by computing self.loss_function()
        accuracy: the model's prediction accuracy
        """
        self.image = image
        self.label = label

        # TO-DO: Add any class variables you want to use.

        self.prediction = self.forward_pass()
        self.loss = self.loss_function()
        self.optimize = self.optimizer()
        self.accuracy = self.accuracy_function()

    def forward_pass(self):
        """
        Predicts a label given an image using convolution layers

        :return: the prediction as a tensor
        """
        filter_1 = tf.Variable(tf.truncated_normal([3, 3, 1, 8], stddev=0.1))
        conv_1 = tf.nn.conv2d(self.image, filter_1, [1, 1, 1, 1], "SAME")

        reshaped = tf.reshape(conv_1, shape=[50, -1])

        L1 = reshaped.shape[1].value
        L2 = 500
        W1 = tf.Variable(tf.random_normal([L1, L2], mean=0, stddev=0.01))
        b1 = tf.Variable(tf.random_normal([L2], mean=0, stddev=0.01))
        relu_1 = tf.nn.relu(tf.matmul(reshaped, W1) + b1)

        W2 = tf.Variable(tf.random_normal([L2, 10], mean=0, stddev=0.01))
        b2 = tf.Variable(tf.random_normal([10], mean=0, stddev=0.01))
        logits = tf.nn.relu(tf.matmul(relu_1, W2) + b2)
        return logits

    def loss_function(self):
        """
        Calculates the model cross-entropy loss

        :return: the loss of the model as a tensor
        """
        loss = tf.losses.softmax_cross_entropy(onehot_labels=self.label, logits=self.prediction)
        return loss

    def optimizer(self):
        """
        Optimizes the model loss using a gradient descent optimizer

        :return: the optimizer as a tensor
        """
        learning_rate = 0.1
        sgd = tf.train.GradientDescentOptimizer(learning_rate)
        train = sgd.minimize(self.loss)
        return train

    def accuracy_function(self):
        """
        Calculates the model's prediction accuracy by comparing
        predictions to correct labels - no need to modify this

        :return: the accuracy of the model as a tensor
        """
        correct_prediction = tf.equal(tf.argmax(self.prediction, 1),
                                      tf.argmax(self.label, 1))
        return tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


def main():
    t_start = time.time()

    mnist = input_data.read_data_sets("data/mnist/", one_hot=True)
    batch_sz = 50
    batch = 2000

    inputs = tf.placeholder(shape=[batch_sz, 28, 28, 1], dtype=tf.float32)
    labels = tf.placeholder(shape=[batch_sz, 10], dtype=tf.float32)

    model = Model(inputs, labels)

    session_config = tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True))
    sess = tf.Session(config=session_config)

    # sess = tf.Session()

    sess.run(tf.global_variables_initializer())
    for i in range(batch):
        next_image, next_label = mnist.train.next_batch(batch_sz)
        next_image = next_image.reshape((batch_sz, 28, 28, 1))
        sess.run(model.optimize, feed_dict={inputs: next_image, labels: next_label})

    acc, test_images, test_labels = 0, mnist.test.images, mnist.test.labels
    test_batch = math.ceil(len(test_images) / batch_sz)
    for i in range(test_batch):
        batch_images = test_images[i * batch_sz: (i + 1) * batch_sz]
        batch_images = batch_images.reshape((batch_sz, 28, 28, 1))
        batch_labes = test_labels[i * batch_sz: (i + 1) * batch_sz]
        acc += sess.run(model.accuracy, feed_dict={inputs: batch_images, labels: batch_labes})
    acc /= test_batch
    print(acc)

    print(time.time() - t_start, 'seconds')

    return


if __name__ == '__main__':
    main()

This worked for me. RTX 2060, Ubuntu 18.04, Python 3.6

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True
sess = InteractiveSession(config=config)
with sess.as_default():
    pass  # run your training or inference code here

Hi @bm777

Here is a summary of how I understand the problem after investigating it a few months ago.

GPU model and memory: RTX 2070 8GB
... 32GB of RAM and 64GB of ...

The problem is not system memory, it is GPU memory!

os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

That works because you are not using the GPU!

A few explanations:

TF has two modes of operation:

  1. allow memory growth = false: In this case TF pre-allocates some memory for the system libraries, using a rough guess of how much memory will be needed. As you can read here https://github.com/tensorflow/tensorflow/issues/24496#issuecomment-633953715 TF uses the formula max(300MB, GPU-MEM * fac) for this guess, with fac = 0.05 for TF2.1 and fac = 0.07 for TF2.2. You have 8GB, which gives 400MB of pre-allocated GPU memory under TF2.1 and 560MB under TF2.2.

    I have experimentally evaluated the pre-allocated memory needed for a few GPUs and TF2.1 here https://github.com/tensorflow/tensorflow/issues/24496#issuecomment-637715002 and here https://github.com/tensorflow/tensorflow

    For Conv2D operations I needed 520MB. Under TF2.1 you would have less than that, but under TF2.2 you would have more. Unfortunately you don't mention your TF version, but I assume you use TF2.1. If you used TF2.2 and it still fails, this may be because you use a different GPU. Anyway, the fact is that it fails. See below.

  2. allow memory growth = true: TF does not use any pre-allocated memory and loads the libraries as they come. The TF documentation declares this problematic because of potential memory fragmentation, and it is therefore off by default.
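The pre-allocation formula from item 1 can be checked with a short calculation. This is a sketch of the formula as quoted in the comment, not TensorFlow's actual code; the function name is hypothetical, and 8GB is taken as 8000MB, which is the rounding the quoted 400MB and 560MB figures imply.

```python
def reserved_system_memory_mb(gpu_mem_mb, fac):
    """max(300MB, GPU-MEM * fac), as quoted in the comment above."""
    return max(300.0, gpu_mem_mb * fac)


if __name__ == "__main__":
    for version, fac in (("TF2.1", 0.05), ("TF2.2", 0.07)):
        reserved = reserved_system_memory_mb(8000, fac)
        print(f"{version}: about {round(reserved)} MB reserved on an 8GB card")
```

With the 520MB measured for Conv2D, the TF2.1 reservation falls short while the TF2.2 reservation is sufficient, which matches the reasoning in the comment. On cards of 6GB or less under TF2.1 the 300MB floor applies.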

My take:

Given the wide range of memory required by the libraries, which depends on the operations you perform on your GPU, it seems very difficult to get the allow memory growth = false mode right (see https://github.com/tensorflow/tensorflow/issues/24496#issuecomment-637950411). The current solution, increasing the size of the pre-allocated memory as was done for TF2.2, is problematic if your GPU is rather small. It blocks memory from use on the assumption that you will need all available libraries (blas, Conv, FFT, and I don't know whether there are others). If you don't use all of these, the pre-allocated memory is wasted, which reduces the model size you can load for your application. On the other hand, I believe the memory fragmentation problem can be prevented if you create the model early, so that the system libraries are loaded before training starts. This seems to be what happens in most cases anyway, so it seems beneficial, especially for GPUs with little memory and especially when training a single model, not to pre-allocate but to use allow memory growth = true.

Personally I use GPUs with memory ranging from 4GB to 11GB, and following the argument above I have set TF_FORCE_GPU_ALLOW_GROWTH=true for all of them. For the moment I have not had any problems with that.

Hi @roebel

I too was thinking about this kind of memory allocation error. This is clearly it for me.
The GPU memory looks fine now.

In the past I tested many options for pre-allocating memory 😢:

import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
try:
    # Cap the first GPU at 5044 MB through a single virtual device
    tf.config.experimental.set_virtual_device_configuration(
        gpus[0],
        [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=5044)])
    """process...."""
except Exception as e:
    raise e

Personally, I use a GPU with 6GB of memory.
And thank you @roebel for this new option, TF_FORCE_GPU_ALLOW_GROWTH=true, which forces my GPU to allocate 😊.

I had the same issue. I can say for sure that the problem only occurs on my 2070 RTX and NOT on a Titan RTX running exactly the same code.

https://github.com/DeepLabCut/DeepLabCut/issues/837

Just upgrade to Tensorflow 2.3 with CUDA 11 and cudnn 8.0. It magically solved all my problems, and now I don't even need the config.gpu_options.allow_growth = True workaround.

Unfortunately, I need to run code that only supports tensorflow 1.X.

Upgrading from 2.2 to 2.3 also solved this for me, even with an explicit TF_FORCE_GPU_ALLOW_GROWTH=false (at least now I can run the delf demo code; I have not tested anything else).

I'm still on CUDA 10.1, Cudnn 7.6.5.

Is there a fix for this issue with tensorflow 2 and python3?

I have an:
RTX 2080

I am getting this message:

2020-08-20 12:38:27.172496: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
2020-08-20 12:38:27.177708: E tensorflow/stream_executor/cuda/cuda_dnn.cc:329] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
Traceback (most recent call last):
  File "/home/anantha/Desktop/RaspiCar/car.py", line 85, in <module>
    tnet.train(x, y)
  File "/home/anantha/Desktop/RaspiCar/car.py", line 65, in train
    self.model.fit(x, y, epochs=epochs)
  File "/home/anantha/.local/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 66, in _method_wrapper
    return method(self, *args, **kwargs)
  File "/home/anantha/.local/lib/python3.8/site-packages/tensorflow/python/keras/engine/training.py", line 848, in fit
    tmp_logs = train_function(iterator)
  File "/home/anantha/.local/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 580, in __call__
    result = self._call(*args, **kwds)
  File "/home/anantha/.local/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 644, in _call
    return self._stateless_fn(*args, **kwds)
  File "/home/anantha/.local/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2420, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/anantha/.local/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1661, in _filtered_call
    return self._call_flat(
  File "/home/anantha/.local/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1745, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/home/anantha/.local/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 593, in call
    outputs = execute.execute(
  File "/home/anantha/.local/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node sequential/conv2d/Conv2D (defined at /Desktop/RaspiCar/car.py:65) ]] [Op:__inference_train_function_951]

Function call stack:
train_function

If your problem has the same cause as the problems treated in this issue (which I cannot tell from your report), then there are a few solutions that you can easily find by reading the last 10-20 posts in this thread.

I fixed it with this:

config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.compat.v1.Session(config=config)
sess.as_default()

I had the same problem with an RTX 2080. The following code worked for me.

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

Thanks, everybody

I think we can stop posting the allow_growth fix now :)

RTX 2070 here. I was getting this error, but now, running with TF_FORCE_GPU_ALLOW_GROWTH=true (which, as other commenters have pointed out, fixes it for them), the error message changes to an out of memory error (even though I have plenty of memory):

2020-10-17 16:35:11.717658: I tensorflow/stream_executor/cuda/cuda_driver.cc:831] failed to allocate 3.87G (4159818752 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory

But my GPU has 8GB, and only about 250MB were in use before I started the process. So I don't understand: why can't it allocate 3.87GB? (Lowering the batch size had no effect; the weights hdf5 file is less than 200MB.)
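As a quick check of the numbers in that log line, the byte count and the "3.87G" figure are the same quantity expressed in binary gigabytes. A small sketch (the helper name is illustrative):

```python
def bytes_to_gib(n_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return n_bytes / 2**30


requested = 4159818752  # bytes, taken from the log line above

if __name__ == "__main__":
    print(f"{bytes_to_gib(requested):.2f} GiB requested")
```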

TF_FORCE_GPU_ALLOW_GROWTH=true worked for me.
tf.config.experimental.set_memory_growth(gpu, True) worked as well.

Here is my configuration:
GPU GTX 1650
cuda-10-1 10.1.243-1
libcudnn7 7.6.5.32-1 + cuda10.1
Ubuntu 18.04.5 LTS

ํ™˜๊ฒฝ ๋ณ€์ˆ˜๋ฅผ ์„ค์ •ํ•  ์ˆ˜์—†๋Š” ์‚ฌ๋žŒ์€ https://www.tensorflow.org/guide/gpu์— ์ œ์•ˆ ๋œ๋Œ€๋กœ ์‹œ๋„ ํ•  ์ˆ˜
gpus = tf.config.experimental.list_physical_devices ( 'GPU')
GPU ์ธ ๊ฒฝ์šฐ :
์‹œํ—˜:
# ํ˜„์žฌ ๋ฉ”๋ชจ๋ฆฌ ์ฆ๊ฐ€๋Š” GPU ์ „์ฒด์—์„œ ๋™์ผํ•ด์•ผํ•ฉ๋‹ˆ๋‹ค.
GPU์˜ GPU :
tf.config.experimental.set_memory_growth (gpu, True)
logical_gpus = tf.config.experimental.list_logical_devices ( 'GPU')
print (len (gpus), "๋ฌผ๋ฆฌ์  GPU", len (logical_gpus), "๋…ผ๋ฆฌ์  GPU")
e๋กœ RuntimeError ์ œ์™ธ :
# GPU๊ฐ€ ์ดˆ๊ธฐํ™”๋˜๊ธฐ ์ „์— ๋ฉ”๋ชจ๋ฆฌ ์ฆ๊ฐ€๋ฅผ ์„ค์ •ํ•ด์•ผํ•ฉ๋‹ˆ๋‹ค.
์ธ์‡„ (e)

Typing the command mentioned here into the terminal worked for me.

https://github.com/tensorflow/tfjs/issues/671#issuecomment-494832790

This issue seems to have been noticed and resolved in tensorflow 2.3.0.

  • CUDA 10.1
  • GPU : Quadro RTX 6000
  • Tensorflow 2.2.0
  • cudnn 7.6.5

Same issue:
tensorflow/stream_executor/cuda/cuda_dnn.cc:328] Could not create cudnn handle: CUDNN_STATUS_INTERNAL_ERROR
tensorflow.python.framework.errors_impl.UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.

And the allow_growth = True workaround does not help.

After upgrading tensorflow to 2.3.0, the problem disappeared, even without adding the allow_growth = True line.

OK, I made it work in tf-nightly-gpu-2.0-preview and an ipython notebook:

from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

It works in my case
