Tensorflow: ValueError: ์ฒ˜์Œ ์‚ฌ์šฉํ•œ ๊ฒƒ๊ณผ ๋‹ค๋ฅธ ๋ณ€์ˆ˜ ๋ฒ”์œ„๋กœ RNNCell์„ ์žฌ์‚ฌ์šฉํ•˜๋ ค๊ณ  ํ–ˆ์Šต๋‹ˆ๋‹ค.

Created on 8 Mar 2017  ·  102 comments  ·  Source: tensorflow/tensorflow

๋‹ค์Œ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•œ ์ฒซ ๋ฒˆ์งธ ์‚ฌ์šฉ์ž์ธ์ง€ ํ™•์‹คํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

ValueError: Attempt to reuse RNNCell with a different variable scope than its first use. First use of cell was with scope 'rnn/multi_rnn_cell/cell_0/basic_lstm_cell', this attempt is with scope 'rnn/multi_rnn_cell/cell_1/basic_lstm_cell'. Please create a new instance of the cell if you would like it to use a different set of weights. If before you were using: MultiRNNCell([BasicLSTMCell(...)] * num_layers), change to: MultiRNNCell([BasicLSTMCell(...) for _ in range(num_layers)]). If before you were using the same cell instance as both the forward and reverse cell of a bidirectional RNN, simply create two instances (one for forward, one for reverse). In May 2017, we will start transitioning this cell's behavior to use the existing stored weights, if any, when it is called with scope=None (which can lead to silent model degradation, so this error will remain until then.)

With the code snippet:

  import tensorflow as tf
  from tensorflow.contrib import rnn

  hidden_size = 100
  batch_size  = 100
  num_steps   = 100
  num_layers  = 100
  is_training = True
  keep_prob   = 0.4

  input_data = tf.placeholder(tf.float32, [batch_size, num_steps])
  lstm_cell = rnn.BasicLSTMCell(hidden_size, forget_bias=0.0, state_is_tuple=True)

  if is_training and keep_prob < 1:
      lstm_cell = rnn.DropoutWrapper(lstm_cell)
  cell = rnn.MultiRNNCell([lstm_cell for _ in range(num_layers)], state_is_tuple=True)

  _initial_state = cell.zero_state(batch_size, tf.float32)

  iw = tf.get_variable("input_w", [1, hidden_size])
  ib = tf.get_variable("input_b", [hidden_size])
  inputs = [tf.nn.xw_plus_b(i_, iw, ib) for i_ in tf.split(input_data, num_steps, 1)]

  if is_training and keep_prob < 1:
      inputs = [tf.nn.dropout(input_, keep_prob) for input_ in inputs]

  outputs, states = rnn.static_rnn(cell, inputs, initial_state=_initial_state)

๋‚˜๋Š” ์šด์ด ์—†์—ˆ์Šต๋‹ˆ๋‹ค. ์•„๋ฌด๋„ ๋‚˜์—๊ฒŒ ํƒˆ์ถœ๊ตฌ๋ฅผ ๋ณด์—ฌ์ค„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๊นŒ?

awaiting tensorflower

Most helpful comment

๋‚˜๋Š” ๊ฐ™์€ ๋ฌธ์ œ๋ฅผ ๋งŒ๋‚ฌ๋‹ค. ๋งˆ์Šคํ„ฐ ๋ธŒ๋žœ์น˜์—์„œ ๋ชจ๋‘ ์ปดํŒŒ์ผ๋œ ๋ฒ„์ „์„ ์‚ฌ์šฉํ•˜๊ณ  ๊ณ„์‹œ๋‹ค๋ฉด ์ตœ๊ทผ commit ์œผ๋กœ ์ธํ•ด ๋ฐœ์ƒํ•˜๋Š” ๋™์ผํ•œ ๋ฌธ์ œ๋ผ๊ณ  ์ƒ๊ฐํ•ฉ๋‹ˆ๋‹ค. ์ปค๋ฐ‹ ๋ฉ”์‹œ์ง€๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๋งํ•ฉ๋‹ˆ๋‹ค.

tf.contrib.rnn์˜ ๋ชจ๋“  RNNCell์ด tf.layers Layers์ฒ˜๋Ÿผ ์ž‘๋™ํ•˜๋„๋ก ํ•˜์‹ญ์‹œ์˜ค.
์—ฌ:

  1. __call__์„ ์ฒ˜์Œ ์‚ฌ์šฉํ•˜๋ฉด ์‚ฌ์šฉํ•œ ๋ฒ”์œ„๊ฐ€ ์…€์— ์ €์žฅ๋ฉ๋‹ˆ๋‹ค. RNNCell์€ ํ•ด๋‹น ๋ฒ”์œ„์—์„œ ๊ฐ€์ค‘์น˜๋ฅผ ์ƒ์„ฑํ•˜๋ ค๊ณ  ์‹œ๋„ํ•˜์ง€๋งŒ ์ผ๋ถ€๊ฐ€ ์ด๋ฏธ ์„ค์ •๋œ ๊ฒฝ์šฐ RNNCell์ด ์ธ์ˆ˜ ์žฌ์‚ฌ์šฉ=True๋กœ ๊ตฌ์„ฑ๋˜์ง€ ์•Š์€ ๊ฒฝ์šฐ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•ฉ๋‹ˆ๋‹ค.

  2. ๋™์ผํ•œ ์…€ ์ธ์Šคํ„ด์Šค์˜ __call__์˜ ํ›„์† ์‚ฌ์šฉ์€ ๋™์ผํ•œ ๋ฒ”์œ„์— ์žˆ์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.
    ๊ทธ๋ ‡์ง€ ์•Š์€ ๊ฒฝ์šฐ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•ฉ๋‹ˆ๋‹ค.
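The two rules above can be mimicked in a few lines of plain Python, with no TensorFlow involved (MockCell below is just a toy stand-in, not the real RNNCell implementation):

```python
class MockCell:
    """Toy stand-in mimicking the scope-tracking rules above; not real TF code."""
    def __init__(self, reuse=False):
        self._reuse = reuse
        self._scope = None

    def __call__(self, scope):
        if self._scope is None:
            self._scope = scope  # rule 1: the first use stores the scope
        elif scope != self._scope and not self._reuse:
            # rule 2: later calls on the same instance must stay in that scope
            raise ValueError("Attempt to reuse cell: first scope %r, this attempt %r"
                             % (self._scope, scope))
        return "output"

cell = MockCell()
cell("rnn/multi_rnn_cell/cell_0/basic_lstm_cell")
try:
    # same instance under a new scope, as with MultiRNNCell([cell] * num_layers)
    cell("rnn/multi_rnn_cell/cell_1/basic_lstm_cell")
except ValueError as e:
    print(e)
```

A fresh instance per scope, as in MultiRNNCell([BasicLSTMCell(...) for _ in range(num_layers)]), never trips the check.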

In my case, running the ptb tutorial, all I had to do was add a parameter named reuse at line 112:

def lstm_cell():
  return tf.contrib.rnn.BasicLSTMCell(
      size, forget_bias=0.0, state_is_tuple=True, reuse=tf.get_variable_scope().reuse)

๊ทธ๋Ÿฌ๋ฉด ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค.

All 102 comments

I get the same error when trying to run the translate example (even when running its small self-test), which can be found at https://github.com/tensorflow/models/tree/master/tutorials/rnn/translate .

๋‚˜๋Š” ๊ฐ™์€ ๋ฌธ์ œ๋ฅผ ๋งŒ๋‚ฌ๋‹ค. ๋งˆ์Šคํ„ฐ ๋ธŒ๋žœ์น˜์—์„œ ๋ชจ๋‘ ์ปดํŒŒ์ผ๋œ ๋ฒ„์ „์„ ์‚ฌ์šฉํ•˜๊ณ  ๊ณ„์‹œ๋‹ค๋ฉด ์ตœ๊ทผ commit ์œผ๋กœ ์ธํ•ด ๋ฐœ์ƒํ•˜๋Š” ๋™์ผํ•œ ๋ฌธ์ œ๋ผ๊ณ  ์ƒ๊ฐํ•ฉ๋‹ˆ๋‹ค. ์ปค๋ฐ‹ ๋ฉ”์‹œ์ง€๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๋งํ•ฉ๋‹ˆ๋‹ค.

tf.contrib.rnn์˜ ๋ชจ๋“  RNNCell์ด tf.layers Layers์ฒ˜๋Ÿผ ์ž‘๋™ํ•˜๋„๋ก ํ•˜์‹ญ์‹œ์˜ค.
์—ฌ:

  1. __call__์„ ์ฒ˜์Œ ์‚ฌ์šฉํ•˜๋ฉด ์‚ฌ์šฉํ•œ ๋ฒ”์œ„๊ฐ€ ์…€์— ์ €์žฅ๋ฉ๋‹ˆ๋‹ค. RNNCell์€ ํ•ด๋‹น ๋ฒ”์œ„์—์„œ ๊ฐ€์ค‘์น˜๋ฅผ ์ƒ์„ฑํ•˜๋ ค๊ณ  ์‹œ๋„ํ•˜์ง€๋งŒ ์ผ๋ถ€๊ฐ€ ์ด๋ฏธ ์„ค์ •๋œ ๊ฒฝ์šฐ RNNCell์ด ์ธ์ˆ˜ ์žฌ์‚ฌ์šฉ=True๋กœ ๊ตฌ์„ฑ๋˜์ง€ ์•Š์€ ๊ฒฝ์šฐ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•ฉ๋‹ˆ๋‹ค.

  2. ๋™์ผํ•œ ์…€ ์ธ์Šคํ„ด์Šค์˜ __call__์˜ ํ›„์† ์‚ฌ์šฉ์€ ๋™์ผํ•œ ๋ฒ”์œ„์— ์žˆ์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.
    ๊ทธ๋ ‡์ง€ ์•Š์€ ๊ฒฝ์šฐ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•ฉ๋‹ˆ๋‹ค.

ptb tutorial ์„ ์‹คํ–‰ํ•˜๋Š” ์ œ ๊ฒฝ์šฐ์—๋Š” 112๋ฒˆ์งธ ์ค„์— reuse ๋ผ๋Š” ์ด๋ฆ„์˜ ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ ์ถ”๊ฐ€ํ•˜๊ธฐ๋งŒ ํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค.

def lstm_cell():
  return tf.contrib.rnn.BasicLSTMCell(
      size, forget_bias=0.0, state_is_tuple=True, reuse=tf.get_variable_scope().reuse)

๊ทธ๋Ÿฌ๋ฉด ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค.

@ebrevdo ์ด๊ฑฐ ์ข€ ๋ด์ฃผ์‹œ๊ฒ ์–ด์š” ?

The problem repeats when using the Windows/GPU build 105 with the Shakespeare RNN repo.

There are no issues when running the code with the Win 1.0.0/GPU release.

That repo looks like it targets tf 1.0, not the intermediate releases.


@tongda , I am using the Tensorflow 1.0 release version, working on MacOS in CPU mode. I will switch to the master branch to check whether adding the "reuse" parameter works. Thanks.

doncat99: if you do that, make sure your code queries the tensorflow version and raises a flag if the version is lower than that of the master branch. You may want to check against:

  from tensorflow.core import versions
  versions.GIT_VERSION

2017๋…„ 3์›” 8์ผ ์ˆ˜์š”์ผ ์˜คํ›„ 6์‹œ 58๋ถ„์— doncat99 [email protected] ์—์„œ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ์ผ์Šต๋‹ˆ๋‹ค.

@tongda https://github.com/tongda , ๋ฆด๋ฆฌ์Šค ๋ฒ„์ „์„ ์‚ฌ์šฉํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.
Tensorflow 1.0, CPU ๋ชจ๋“œ์˜ MacOS์—์„œ ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค. ๋งˆ์Šคํ„ฐ๋กœ ๊ฐˆ์•„ํƒ€๊ฒ ์Šต๋‹ˆ๋‹ค
๋ถ„๊ธฐ์— "์žฌ์‚ฌ์šฉ" ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ ์ถ”๊ฐ€ํ•˜์—ฌ ์ž‘๋™ํ•˜๋Š”์ง€ ํ™•์ธํ•˜์„ธ์š”. ๊ฐ์‚ฌํ•ฉ๋‹ˆ๋‹ค.

โ€”
๋‹น์‹ ์ด ์–ธ๊ธ‰๋˜์—ˆ๊ธฐ ๋•Œ๋ฌธ์— ์ด๊ฒƒ์„ ๋ฐ›๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.
์ด ์ด๋ฉ”์ผ์— ์ง์ ‘ ๋‹ต์žฅํ•˜๊ณ  GitHub์—์„œ ํ™•์ธํ•˜์„ธ์š”.
https://github.com/tensorflow/tensorflow/issues/8191#issuecomment-285240438 ,
๋˜๋Š” ์Šค๋ ˆ๋“œ ์Œ์†Œ๊ฑฐ
https://github.com/notifications/unsubscribe-auth/ABtim66cU9e16lgD-2D0QLGcQCiHbV0zks5rj2rbgaJpZM4MWl4f
.

@ebrevdo So what changes are proposed to get the Shakespeare RNN to work with the intermediate stable release?

๋‹ค์Œ์€ ๋นŒ๋“œ#105์—์„œ ์‹คํŒจํ•˜๋Š” ์ฝ”๋“œ์˜ ์ฃผ์š” ์•„ํ‚คํ…์ฒ˜ ์„น์…˜์ž…๋‹ˆ๋‹ค.

#
# the model (see FAQ in README.md)
#
lr = tf.placeholder(tf.float32, name='lr')  # learning rate
pkeep = tf.placeholder(tf.float32, name='pkeep')  # dropout parameter
batchsize = tf.placeholder(tf.int32, name='batchsize')

# inputs
X = tf.placeholder(tf.uint8, [None, None], name='X')    # [ BATCHSIZE, SEQLEN ]
Xo = tf.one_hot(X, ALPHASIZE, 1.0, 0.0)                 # [ BATCHSIZE, SEQLEN, ALPHASIZE ]
# expected outputs = same sequence shifted by 1 since we are trying to predict the next character
Y_ = tf.placeholder(tf.uint8, [None, None], name='Y_')  # [ BATCHSIZE, SEQLEN ]
Yo_ = tf.one_hot(Y_, ALPHASIZE, 1.0, 0.0)               # [ BATCHSIZE, SEQLEN, ALPHASIZE ]
# input state
Hin = tf.placeholder(tf.float32, [None, INTERNALSIZE*NLAYERS], name='Hin')  # [ BATCHSIZE, INTERNALSIZE * NLAYERS]

# using a NLAYERS=3 layers of GRU cells, unrolled SEQLEN=30 times
# dynamic_rnn infers SEQLEN from the size of the inputs Xo

onecell = rnn.GRUCell(INTERNALSIZE)
dropcell = rnn.DropoutWrapper(onecell, input_keep_prob=pkeep)
multicell = rnn.MultiRNNCell([dropcell for _ in range(NLAYERS)], state_is_tuple=False)
multicell = rnn.DropoutWrapper(multicell, output_keep_prob=pkeep)
Yr, H = tf.nn.dynamic_rnn(multicell, Xo, dtype=tf.float32, initial_state=Hin)
# Yr: [ BATCHSIZE, SEQLEN, INTERNALSIZE ]
# H:  [ BATCHSIZE, INTERNALSIZE*NLAYERS ] # this is the last state in the sequence

I can't seem to find any documentation regarding the reuse flag.

๋ฏธ๋ฆฌ ๊ฐ์‚ฌ๋“œ๋ฆฝ๋‹ˆ๋‹ค.

Use:

multicell = rnn.MultiRNNCell([rnn.DropoutWrapper(rnn.GRUCell(INTERNALSIZE),
input_keep_prob=pkeep) for _ in range(NLAYERS)], state_is_tuple=False)

which creates a separate GRUCell object for each layer.


seq2seq ์ž์Šต์„œ ๋ชจ๋ธ ์—์„œ ์ด ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•˜๋Š” ์ด์œ ๋ฅผ ์ดํ•ดํ•  ์ˆ˜ ์—†์Šต๋‹ˆ๋‹ค.

cell = tf.contrib.rnn.MultiRNNCell([single_cell() for _ in range(num_layers)])

(source), where the cell is created:

def single_cell():
    return tf.contrib.rnn.GRUCell(size)

@ebrevdo ์ด ๋ฌธ์ œ์— ๋Œ€ํ•ด ๋‹ค์‹œ ์•Œ๋ ค์ฃผ์…”์„œ ๊ฐ์‚ฌํ•ฉ๋‹ˆ๋‹ค. ๋ถˆํ–‰ํžˆ๋„ ์ œ์•ˆ ๋œ ๋ณ€๊ฒฝ ์‚ฌํ•ญ์€ ์•ž์„œ ์–ธ๊ธ‰ ํ•œ ์˜ค๋ฅ˜์™€ ํ•จ๊ป˜ ๋ฌธ์ œ๋ฅผ ๊ทธ๋Œ€๋กœ ๋‘ก๋‹ˆ๋‹ค. seq2seq ํŠœํ† ๋ฆฌ์–ผ ์— ๋Œ€ํ•œ ์œ„์˜ ์„ค๋ช…์„ ๊ฐ์•ˆํ•  ๋•Œ ์šฐ๋ฆฌ๋Š” ๋ชจ๋‘ ๊ฐ™์€ ๋ณดํŠธ์— ์žˆ๋‹ค๊ณ  ์ƒ๊ฐํ•ฉ๋‹ˆ๊นŒ?

์ •ํ™•ํžˆ ๊ฐ™์€ ์˜ค๋ฅ˜๋ผ๊ณ  ํ™•์‹ ํ•˜์‹ญ๋‹ˆ๊นŒ? ์—ฌ๊ธฐ์— ๋ณต์‚ฌํ•˜์—ฌ ๋ถ™์—ฌ๋„ฃ์œผ์‹ญ์‹œ์˜ค.

๋‚ด ๋‚˜์œ, ๋‚˜๋Š” ๋ฐฉ๊ธˆ ๊ด€๋ จ ์ฝ”๋“œ์— ๋Œ€ํ•œ ๋ณ€๊ฒฝ ํ”„๋กœ์„ธ์Šค๋ฅผ ๋‹ค์‹œ (์ฒ˜์Œ๋ถ€ํ„ฐ) ๊ฑฐ์ณค๊ณ  ์ œ์•ˆ๋œ ๋Œ€๋กœ ๋‹ค์‹œ ์‹คํ–‰ํ–ˆ์Šต๋‹ˆ๋‹ค. ์˜ค๋ฅ˜๊ฐ€ ์ •๋ง ์ œ๊ฑฐ๋˜์—ˆ๊ณ  ์ด์ œ ์˜ฌ๋“œ ๋ฐ”๋“œ๊ฐ€ ํ™˜๊ฐ์„ ํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค ๐Ÿ‘

So, thx, not sure where I went wrong yesterday, but it was clearly on me.

Tensorflow 1.0 ๋ฆด๋ฆฌ์Šค ๋ฒ„์ „์„ ์‚ฌ์šฉํ•˜๊ณ  MacOS์—์„œ CPU ๋ชจ๋“œ๋กœ ์ž‘์—…ํ•  ๋•Œ๋„ ๋™์ผํ•œ ๋ฌธ์ œ๊ฐ€ ๋ฐœ์ƒํ–ˆ์Šต๋‹ˆ๋‹ค. "reuse" ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ ์ถ”๊ฐ€ํ•˜๋”๋ผ๋„

def cell():
    return tf.contrib.rnn.BasicLSTMCell(rnn_size,state_is_tuple=True,reuse=tf.get_variable_scope().reuse)

muticell = tf.contrib.rnn.MultiRNNCell([cell for _ in range(num_layers)], state_is_tuple=True)

๋‹ค์ค‘ ์…€์ด ์ž˜๋ชป๋œ ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค... "cell() for _ in
๋ฒ”์œ„(...)"
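The difference is plain Python: the list comprehension above repeats the function object itself instead of calling it (sketch with a stand-in class, not real TF):

```python
class FakeCell:
    """Stand-in for tf.contrib.rnn.BasicLSTMCell(...); not real TF code."""
    pass

def cell():
    return FakeCell()

wrong = [cell for _ in range(2)]     # the function object, listed twice
right = [cell() for _ in range(2)]   # a brand-new cell per layer

assert wrong[0] is wrong[1]                         # same object repeated
assert right[0] is not right[1]                     # distinct instances
assert all(isinstance(c, FakeCell) for c in right)
```

Passing the uncalled function into MultiRNNCell hands it something that isn't a cell at all, so the reuse parameter never even comes into play.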


I tried to run the translate example: python2.7 translate.py --data_dir data/ --train_dir train/ --size=256 --num_layers=2 --steps_per_checkpoint=50

MultiRNNCell์„ ์‚ฌ์šฉํ•˜๋Š” ๋ฐฉ๋ฒ•์ด ์˜ฌ๋ฐ”๋ฅธ ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค.
cell = tf.contrib.rnn.MultiRNNCell([single_cell() for _ in range(num_layers)])

ํ•˜์ง€๋งŒ ๊ฐ™์€ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ–ˆ์Šต๋‹ˆ๋‹ค.
ValueError: RNNCell ์žฌ์‚ฌ์šฉ ์‹œ๋„์ฒซ ๋ฒˆ์งธ ์‚ฌ์šฉ๊ณผ ๋‹ค๋ฅธ ๋ณ€์ˆ˜ ๋ฒ”์œ„๋ฅผ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. ์…€์˜ ์ฒซ ๋ฒˆ์งธ ์‚ฌ์šฉ์€ ๋ฒ”์œ„ 'embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/multi_rnn_cell/cell_0/gru_cell'์ด์—ˆ๊ณ , ์ด ์‹œ๋„๋Š” ๋ฒ”์œ„ 'embedding_attention_seq2seq/rnn/multi_rnn_cell/cell_0/gru_cell'์ž…๋‹ˆ๋‹ค. ๋‹ค๋ฅธ ๊ฐ€์ค‘์น˜ ์„ธํŠธ๋ฅผ ์‚ฌ์šฉํ•˜๋ ค๋ฉด ์…€์˜ ์ƒˆ ์ธ์Šคํ„ด์Šค๋ฅผ ๋งŒ๋“œ์‹ญ์‹œ์˜ค. ์ด์ „์— MultiRNNCell([GRUCell(...)] * num_layers)์„ ์‚ฌ์šฉํ–ˆ๋‹ค๋ฉด MultiRNNCell([GRUCell(...) for _ in range(num_layers)])๋กœ ๋ณ€๊ฒฝํ•ฉ๋‹ˆ๋‹ค. ์–‘๋ฐฉํ–ฅ RNN์˜ ์ˆœ๋ฐฉํ–ฅ ๋ฐ ์—ญ๋ฐฉํ–ฅ ์…€ ๋ชจ๋‘์™€ ๋™์ผํ•œ ์…€ ์ธ์Šคํ„ด์Šค๋ฅผ ์‚ฌ์šฉํ•˜๊ธฐ ์ „์— ๋‘ ๊ฐœ์˜ ์ธ์Šคํ„ด์Šค๋ฅผ ์ƒ์„ฑํ•˜๊ธฐ๋งŒ ํ•˜๋ฉด ๋ฉ๋‹ˆ๋‹ค(ํ•˜๋‚˜๋Š” ์ •๋ฐฉํ–ฅ, ํ•˜๋‚˜๋Š” ์—ญ๋ฐฉํ–ฅ). 2017๋…„ 5์›”์— scope=None(์ž๋™ ๋ชจ๋ธ ์ €ํ•˜๋กœ ์ด์–ด์งˆ ์ˆ˜ ์žˆ์œผ๋ฏ€๋กœ ์ด ์˜ค๋ฅ˜๋Š” ๊ทธ๋•Œ๊นŒ์ง€ ์œ ์ง€๋จ)์œผ๋กœ ํ˜ธ์ถœ๋  ๋•Œ ๊ธฐ์กด์— ์ €์žฅ๋œ ๊ฐ€์ค‘์น˜๋ฅผ ์‚ฌ์šฉํ•˜๋„๋ก ์ด ์…€์˜ ๋™์ž‘์„ ์ „ํ™˜ํ•˜๊ธฐ ์‹œ์ž‘ํ•  ๊ฒƒ์ž…๋‹ˆ๋‹ค.

@bowu - ์šด์ด ์ข‹์•˜์Šต๋‹ˆ๊นŒ? ์•„์ง ์‹œ๋„ํ•˜์ง€ ์•Š์•˜๋‹ค๋ฉด ์ตœ์‹  ์†Œ์Šค์—์„œ tensorflow๋ฅผ ๋‹ค์‹œ ์„ค์น˜ํ•˜์‹ญ์‹œ์˜ค. core_rnn ํŒŒ์ผ ์ค‘ ์ผ๋ถ€๊ฐ€ ์ผ๋ถ€ ๋ณ€๊ฒฝ๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ์ง€๊ธˆ ๋‚˜๋ฅผ ์œ„ํ•ด ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค.

@robmsylvester I reinstalled tensorflow from the latest source, but still the same error. I was on branch master, and the latest commit is commit 2a4811054a9e6b83e1f5a2705a92aab50e151b13. What was the most recent commit when you built the repository?

Hi, I am using Tensorflow r1.0, GPU, built from source. I am trying to follow the unmodified Seq2Seq translation tutorial, but I get the same error, i.e.

ValueError: Attempt to reuse RNNCell with a different variable scope than its first use. First use of cell was with scope 'embedding_attention_seq2seq/embedding_attention_decoder/attention_decoder/multi_rnn_cell/cell_0/gru_cell', this attempt is with scope 'embedding_attention_seq2seq/rnn/multi_rnn_cell/cell_0/gru_cell'.....

The relevant part of the code in my seq2seq_model.py is:

 # Create the internal multi-layer cell for our RNN.
    def single_cell():
      return tf.contrib.rnn.GRUCell(size)
    if use_lstm:
      def single_cell():
        return tf.contrib.rnn.BasicLSTMCell(size)
    cell = single_cell()
    if num_layers > 1:
      cell = tf.contrib.rnn.MultiRNNCell([single_cell() for _ in range(num_layers)])

What should I do to resolve the issue?

GRUCell์ด ์ƒ์„ฑ๋˜๋Š” ํ˜ธ์ถœ์— "reuse=tf.get_variable_scope().reuse"๋ฅผ ์ถ”๊ฐ€ํ•ด๋„ ๋„์›€์ด ๋˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

Thanks a ton!

@prashantserai - ์œ„์—์„œ MultiRNNCell ๋ผ์ธ์„ ์ œ๊ฑฐํ•˜์—ฌ ๋„คํŠธ์›Œํฌ๋ฅผ ํ•˜๋‚˜์˜ ๋ ˆ์ด์–ด๋กœ ํšจ๊ณผ์ ์œผ๋กœ ๋งŒ๋“œ๋Š” ๊ฒฝ์šฐ ์–ด๋–ค ์ผ์ด ๋ฐœ์ƒํ•˜๋Š”์ง€ ํ™•์ธํ•˜์‹ญ์‹œ์˜ค. ๊ทธ๋Ÿฌ๋ฉด ์ž‘๋™ํ•ฉ๋‹ˆ๊นŒ? MultiRNNCell์˜ ์–ด๋”˜๊ฐ€์— ๋ฒ„๊ทธ๊ฐ€ ์žˆ์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋‚˜๋Š” ์ตœ๊ทผ ์–ด๋”˜๊ฐ€์—์„œ ์•„๋งˆ๋„ ์Šคํƒ ์˜ค๋ฒ„ํ”Œ๋กœ์— ๋Œ€ํ•ด ์ฝ์—ˆ์Šต๋‹ˆ๋‹ค.

์Šคํƒํ˜• lstm/gru๋ฅผ ์ง์ ‘ ๊ตฌํ˜„ํ•˜๋ฉด ์ด ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•˜์ง€ ์•Š๊ณ  ๋™์ผํ•œ ๊ธฐ๋Šฅ์„ ๊ตฌํ˜„ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค(์‹ค์ œ๋กœ ๋” ๋งŽ์ด, ์–‘๋ฐฉํ–ฅ ์•„ํ‚คํ…์ฒ˜, ์ด์ƒํ•œ ์ž”์—ฌ ๋ฐ ๊ฑด๋„ˆ๋›ฐ๊ธฐ ์—ฐ๊ฒฐ ๋“ฑ์œผ๋กœ ์›ํ•˜๋Š” ๋ชจ๋“  ์ž‘์—…์„ ์ž์œ ๋กญ๊ฒŒ ์ˆ˜ํ–‰ํ•  ์ˆ˜ ์žˆ๊ธฐ ๋•Œ๋ฌธ์—). .)

@robmsylvester I tried with num_layers=1, which should effectively skip that line, but the same error persisted. Any other ideas? Thanks for your input.

Hmm. One thing that stands out to me is in the referenced legacy seq2seq file:

encoder_cell = copy.deepcopy(cell)

์ธ์ฝ”๋” ์ธก๊ณผ ๋””์ฝ”๋” ์ธก ๋ชจ๋‘์—์„œ ๋™์ผํ•œ ์•„ํ‚คํ…์ฒ˜๊ฐ€ ์‚ฌ์šฉ๋˜๊ธฐ ๋•Œ๋ฌธ์— ์ด ๋ผ์ธ์ด ์‚ฌ์šฉ๋œ ๊ฒƒ์œผ๋กœ ๋ณด์ž…๋‹ˆ๋‹ค. ๊ทธ๋“ค์€ ์…€์˜ ๋ณต์‚ฌ๋ณธ์„ ๋งŒ๋“  ๋‹ค์Œ ์…€ ์ธ์ˆ˜๋ฅผ ์–ดํ…์…˜ ๋””์ฝ”๋” ์ž„๋ฒ ๋”ฉ ํ•จ์ˆ˜์™€ ํ•จ๊ป˜ ์ „๋‹ฌํ•œ ๋‹ค์Œ ์–ดํ…์…˜ ๋””์ฝ”๋” ์ž์ฒด๋กœ ์ „๋‹ฌํ•ฉ๋‹ˆ๋‹ค.

seq2seq ๋ชจ๋ธ ํŒŒ์ผ์— ์ธ์ฝ”๋” ์…€๊ณผ ๋””์ฝ”๋” ์…€์„ ๋ช…์‹œ์ ์œผ๋กœ ์ƒ์„ฑํ•˜๊ณ  ๋‘˜ ๋‹ค ๋ ˆ๊ฑฐ์‹œ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ํŒŒ์ผ์— ์ „๋‹ฌํ•˜์—ฌ ํ•จ์ˆ˜์™€ ํ•ด๋‹น ์ธ์ˆ˜๋ฅผ ์•ฝ๊ฐ„ ์กฐ์ •ํ•˜๋ฉด ์–ด๋–ป๊ฒŒ ๋ฉ๋‹ˆ๊นŒ?

@robmsylvester ๊ฐ€ ์ž‘๋™ํ•˜๋Š” ์…€ ๋ฒ”์œ„๋ฅผ ๋ณ€๊ฒฝํ•ด์„œ๋Š” ์•ˆ ๋ฉ๋‹ˆ๊นŒ? ๋‹ค๋ฅธ ๋‘ ๊ฐ€์ง€ ์˜ˆ์—์„œ๋„ ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค. ์ œ ์ƒ๊ฐ์—๋Š” ์ด๊ฒƒ์€ ๋งค์šฐ ์ถ”์•…ํ•œ ํ•ด๊ฒฐ ๋ฐฉ๋ฒ•์ด ๋  ๊ฒƒ์ž…๋‹ˆ๋‹ค. ๋” ๊นจ๋—ํ•œ ์†”๋ฃจ์…˜์ด ์žˆ์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์–ด์ฉŒ๋ฉด ์šฐ๋ฆฌ๊ฐ€ ๋ญ”๊ฐ€๋ฅผ ๋†“์น˜๊ณ  ์žˆ์Šต๋‹ˆ๊นŒ? ( seq2seq ํŠœํ† ๋ฆฌ์–ผ์—์„œ๋„ ๋™์ผํ•œ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•˜์—ฌ ์œ„์˜ ๋ชจ๋“  ์†”๋ฃจ์…˜์„ ์‹œ๋„ํ–ˆ์Šต๋‹ˆ๋‹ค.)

@iamgroot42 - Yeah, that 'fix' is admittedly very ugly, but it's more about trying to locate where the issue might be. I'll play around with it in a few hours and see if I can track something down.

The copy.deepcopy is actually there because the legacy functions are unmaintained;
there are no resources to maintain/update them. If you'd like to
introduce a backwards-compatible change where the user can provide a
second cell for the decoding stage, falling back to deepcopy if it's None,
then I'd be happy to review a PR. Keep in mind that it would have to
be a backwards-compatible change.
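The backwards-compatible shape being proposed might look roughly like this (a sketch with hypothetical parameter names; the real embedding_attention_seq2seq signature takes a single cell, and the stand-in class below is not TF code):

```python
import copy

class FakeGRUCell:
    """Stand-in for rnn.GRUCell(size); not real TF code."""
    def __init__(self, size):
        self.size = size

def embedding_attention_seq2seq_sketch(cell, encoder_cell=None):
    """Hypothetical extension: accept an optional second cell for the encoder."""
    if encoder_cell is None:
        # old behavior, kept for backwards compatibility
        encoder_cell = copy.deepcopy(cell)
    return encoder_cell, cell

# Caller that opts in: two independent cells, no deepcopy needed.
enc, dec = embedding_attention_seq2seq_sketch(FakeGRUCell(1024), FakeGRUCell(1024))
assert enc is not dec

# Old call sites are unchanged and still work.
enc, dec = embedding_attention_seq2seq_sketch(FakeGRUCell(1024))
assert enc is not dec and enc.size == dec.size
```

Existing callers see no difference, while new callers can avoid sharing (or deep-copying) one cell across the encoder and decoder scopes.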


@ebrevdo - ์ƒ๊ฐํ•ด๋ณผ๊ฒŒ์š”. ๋‚˜๋Š” ์ด๊ฒƒ๊ณผ ๋งค์šฐ ์œ ์‚ฌํ•˜๊ฒŒ ์ž‘๋™ํ•˜์ง€๋งŒ ์›ํ•˜๋Š” ๊ณณ์— ์–‘๋ฐฉํ–ฅ ๋ ˆ์ด์–ด๋ฅผ ์‚ฝ์ž…ํ•  ์ˆ˜ ์žˆ๋Š” ๋ณ„๋„์˜ ํด๋ž˜์Šค๋ฅผ ํ†ตํ•ด ์…€์„ ์ƒ์„ฑํ•˜๊ณ , ์›ํ•˜๋Š” ๊ณณ์— ์ž”์ฐจ๋ฅผ ์‚ฝ์ž…ํ•˜๊ณ , ์ž…๋ ฅ์„ concat ๋Œ€ ํ•ฉ๊ณ„๋กœ ๋ณ‘ํ•ฉํ•˜๋Š” ๋“ฑ์˜ ๋ช‡ ๊ฐ€์ง€ ๋‹ค๋ฅธ ๋ฒˆ์—ญ๊ธฐ๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. ์ •์  RNN์„ ์‚ฌ์šฉํ•˜์—ฌ ์ด ํŠœํ† ๋ฆฌ์–ผ๋กœ ๋‚ด ํด๋ž˜์Šค๋ฅผ ์•„์ฃผ ์‰ฝ๊ฒŒ ๋งˆ์ด๊ทธ๋ ˆ์ด์…˜ํ•  ์ˆ˜ ์žˆ๋‹ค๊ณ  ์ƒ๊ฐํ•ฉ๋‹ˆ๋‹ค. ๋‚ด๊ฐ€ ์•Œ๋ ค ์ฃผ๋งˆ.

@ebrevdo I'm running Tensorflow r1.0 (tensorflow-1.0.1-cp36-cp36m-linux_x86_64) on Red Hat and have the latest version of the translation tutorial from Github. Is there currently a way to get this working?

๋ฒˆ์—ญ ํŠœํ† ๋ฆฌ์–ผ์ด TF 1.0์—์„œ ์ž‘๋™ํ•˜์ง€ ์•Š๋Š” ๊ฒƒ์€ ์œ ๊ฐ์ž…๋‹ˆ๋‹ค. ์šฐ๋ฆฌ๋Š” ๊ทธ๊ฒƒ์„ ์ˆ˜์ •ํ•ด์•ผํ•ฉ๋‹ˆ๋‹ค. @lukaszkaiser ์ข€ ๋ด์ฃผ์‹œ๊ฒ ์–ด์š” ? ์šฐ๋ฆฌ๋Š” ์ƒˆ๋กœ์šด ํŠœํ† ๋ฆฌ์–ผ์„ ์ž‘์—… ์ค‘์ด์ง€๋งŒ ์•„์ง ๋ช‡ ์ฃผ ๋‚จ์•˜๊ณ  ์ž‘๋™ํ•˜๋ ค๋ฉด ์•ผ๊ฐ„ ๋ฒ„์ „์˜ TensorFlow(๋˜๋Š” TF 1.1 ๋˜๋Š” 1.2)๊ฐ€ ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.

(lukasz; it's hard to identify from the various comments which part of the tutorial is broken in TF 1.0. Any chance you can identify the line and help make it work?)

@ebrevdo ์ด ํŠœํ† ๋ฆฌ์–ผ์ž…๋‹ˆ๋‹ค. ์ด ๋ผ์ธ ํด๋Ÿฌ์Šคํ„ฐ์— ์˜ค๋ฅ˜๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์— ์ „๋‹ฌ๋œ ์…€์€ ๋ ˆ๊ฑฐ์‹œ seq2seq ๋ชจ๋ธ์˜ ์—ญ๋ฐฉํ–ฅ ๋ฐ ์ˆœ๋ฐฉํ–ฅ ๋‹จ๊ณ„ ๋ชจ๋‘์— ์‚ฌ์šฉ๋˜๋ฉฐ, ๋™์ผํ•œ ์…€์ด ๋‹ค๋ฅธ ๋ฒ”์œ„์—์„œ ์‚ฌ์šฉ๋˜๊ธฐ ๋•Œ๋ฌธ์— ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•ฉ๋‹ˆ๋‹ค.

@iamgroot42 would you be willing to make a PR with the required change? That would be great; I don't have the cycles to do it myself right now. Thanks!

๋‚˜๋Š” TF 1.0์ด remotes/origin/r1.0 ๋ธŒ๋žœ์น˜์˜ ์†Œ์Šค์—์„œ ์ปดํŒŒ์ผ๋œ ๊ฒฝ์šฐ ์ตœ์‹  ๋ฒ„์ „์˜ ๋ฒˆ์—ญ ํŠœํ† ๋ฆฌ์–ผ์—์„œ ์ž˜ ์ž‘๋™ํ•œ๋‹ค๋Š” ๊ฒƒ์„ ์•Œ์•„์ฐจ๋ ธ์Šต๋‹ˆ๋‹ค.

$ git clone https://github.com/tensorflow/tensorflow
$ cd tensorflow
$ git checkout remotes/origin/r1.0

๊ทธ๋Ÿฐ ๋‹ค์Œ TensorFlow๋ฅผ ๋นŒ๋“œํ•˜๊ณ  ์„ค์น˜ํ•˜๋ฉด ์ž˜ ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค.

The remotes/origin/r1.1 branch has the "different variable scope" error.
I modified the code as @robmsylvester suggested:

seq2seq ๋ชจ๋ธ ํŒŒ์ผ์— ์ธ์ฝ”๋” ์…€๊ณผ ๋””์ฝ”๋” ์…€์„ ๋ช…์‹œ์ ์œผ๋กœ ์ƒ์„ฑํ•˜๊ณ  ๋‘˜ ๋‹ค ๋ ˆ๊ฑฐ์‹œ ๋ผ์ด๋ธŒ๋Ÿฌ๋ฆฌ ํŒŒ์ผ์— ์ „๋‹ฌํ•˜์—ฌ ํ•จ์ˆ˜์™€ ํ•ด๋‹น ์ธ์ˆ˜๋ฅผ ์•ฝ๊ฐ„ ์กฐ์ •ํ•˜๋ฉด ์–ด๋–ป๊ฒŒ ๋ฉ๋‹ˆ๊นŒ?

๊ทธ๋ฆฌ๊ณ  ๊ทธ๊ฒƒ์€ ์ง€๊ธˆ ๋‚˜๋ฅผ ์œ„ํ•ด ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค.

@oxwsds The Tensorflow I was using is 1.0.1, so maybe that's why it has the error.

I tried what @robmsylvester actually suggested, and training started (it finished in 2 days 13 hours).. but it fails with an error while decoding:

  File "/homes/3/serai/.conda/envs/tensorflow_r1.0_gpu/lib/python3.6/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 883, in embedding_attention_seq2seq
    initial_state_attention=initial_state_attention)
  File "/homes/3/serai/.conda/envs/tensorflow_r1.0_gpu/lib/python3.6/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 787, in embedding_attention_decoder
    initial_state_attention=initial_state_attention)
  File "/homes/3/serai/.conda/envs/tensorflow_r1.0_gpu/lib/python3.6/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 686, in attention_decoder
    cell_output, state = cell(x, state)
  File "/homes/3/serai/.conda/envs/tensorflow_r1.0_gpu/lib/python3.6/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 796, in __call__
    % (len(self.state_size), state))
ValueError: Expected state to be a tuple of length 3, but received: Tensor("model_with_buckets/embedding_attention_seq2seq/rnn/gru_cell_4/add:0", shape=(?, 1024), dtype=float32)

๋””์ฝ”๋”ฉ์„ ์‹œ๋„ ํ–ˆ์Šต๋‹ˆ๊นŒ?

@prashantserai I'm not sure, but what you encountered seems to be a different issue.

@prashantserai If it fails only when decoding, could it be related to using a batch size of 1? If you lower the batch size to 1 during training, does the model continue to train?

@bowu same error here. Mac OS Sierra, TensorFlow 1.1.0-rc1, Python 2.7.10 and Python 3.6.1.

@robmsylvester ๊ทธ๊ฒƒ์€ ๋ฐฐ์น˜ ํฌ๊ธฐ๋„ 1๋กœ ์„ฑ๊ณต์ ์œผ๋กœ ํ›ˆ๋ จํ–ˆ์ง€๋งŒ ๊ฐ™์€ ๋ฐฉ์‹์ด๋‚˜ ์œ ์‚ฌํ•œ ๋ฐฉ์‹์œผ๋กœ ๋””์ฝ”๋”ฉํ•˜๋Š” ๋™์•ˆ ์‹คํŒจํ–ˆ์Šต๋‹ˆ๋‹ค. ์—ฌ๊ธฐ์— ์ „์ฒด ์—ญ์ถ”์ ์ด ์žˆ์Šต๋‹ˆ๋‹ค.. ๋‚ด๊ฐ€ ์ด๊ฒƒ์„ ์—ฐ๊ฒฐ๋œ ์˜ค๋ฅ˜๋กœ ์ƒ๊ฐํ•œ ์ด์œ ๋Š” ์— ๋Œ€ํ•œ ์ฐธ์กฐ ๋•Œ๋ฌธ์ด์—ˆ์Šต๋‹ˆ๋‹ค. seq2seq_f(์ˆ˜์ •๋œ ํ•จ์ˆ˜ ์ค‘ ํ•˜๋‚˜)(์ˆ˜์ •๋œ ์ค„์„ ๋‚˜ํƒ€๋‚ด๋Š” #prashant ์ฃผ์„์€ ์ถ”์ ์˜ ์ผ๋ถ€์ž„)

2017-04-10 11:32:27.447042: I tensorflow/core/common_runtime/gpu/gpu_device.cc:887] Found device 0 with properties: 
name: GeForce GTX 780 Ti
major: 3 minor: 5 memoryClockRate (GHz) 0.928
pciBusID 0000:42:00.0
Total memory: 2.95GiB
Free memory: 2.88GiB
2017-04-10 11:32:27.447094: I tensorflow/core/common_runtime/gpu/gpu_device.cc:908] DMA: 0 
2017-04-10 11:32:27.447102: I tensorflow/core/common_runtime/gpu/gpu_device.cc:918] 0:   Y 
2017-04-10 11:32:27.447118: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 780 Ti, pci bus id: 0000:42:00.0)
Traceback (most recent call last):
  File "translate.py", line 322, in <module>
    tf.app.run()
  File "/homes/3/serai/.conda/envs/tensorflow_r1.0_gpu/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "translate.py", line 317, in main
    decode()
  File "translate.py", line 248, in decode
    model = create_model(sess, True)
  File "translate.py", line 136, in create_model
    dtype=dtype)
  File "/data/data6/scratch/serai/models/tutorials/rnn/translate/seq2seq_model.py", line 168, in __init__
    softmax_loss_function=softmax_loss_function)
  File "/homes/3/serai/.conda/envs/tensorflow_r1.0_gpu/lib/python3.6/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 1203, in model_with_buckets
    decoder_inputs[:bucket[1]])
  File "/data/data6/scratch/serai/models/tutorials/rnn/translate/seq2seq_model.py", line 167, in <lambda>
    self.target_weights, buckets, lambda x, y: seq2seq_f(x, y, True),
  File "/data/data6/scratch/serai/models/tutorials/rnn/translate/seq2seq_model.py", line 144, in seq2seq_f
    dtype=dtype) #prashant
  File "/homes/3/serai/.conda/envs/tensorflow_r1.0_gpu/lib/python3.6/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 883, in embedding_attention_seq2seq
    initial_state_attention=initial_state_attention)
  File "/homes/3/serai/.conda/envs/tensorflow_r1.0_gpu/lib/python3.6/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 787, in embedding_attention_decoder
    initial_state_attention=initial_state_attention)
  File "/homes/3/serai/.conda/envs/tensorflow_r1.0_gpu/lib/python3.6/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 686, in attention_decoder
    cell_output, state = cell(x, state)
  File "/homes/3/serai/.conda/envs/tensorflow_r1.0_gpu/lib/python3.6/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 796, in __call__
    % (len(self.state_size), state))
ValueError: Expected state to be a tuple of length 3, but received: Tensor("model_with_buckets/embedding_attention_seq2seq/rnn/gru_cell_4/add:0", shape=(?, 1024), dtype=float32)

@oxwsds ์œ„์˜ ์ „์ฒด ์ถ”์ ์„ ๊ธฐ๋ฐ˜์œผ๋กœ ๊ท€ํ•˜์˜ ์˜๊ฒฌ์ด ๋ณ€๊ฒฝ๋ฉ๋‹ˆ๊นŒ?

@prashantserai ๋””์ฝ”๋”ฉ์„ ์‹œ๋„ํ–ˆ๋Š”๋ฐ ์ž˜ ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค. encoder_cell arg ๋ฅผ tf.contrib.legacy_seq2seq.embedding_attention_seq2seq ํ•จ์ˆ˜์— ์ถ”๊ฐ€ํ•˜๊ณ  $ translate/seq2seq_model.py ์—์„œ ์…€์„ ๋งŒ๋“ค๊ณ  seq2seq_f ํ•จ์ˆ˜์—์„œ ํ˜ธ์ถœ๋œ ํ•จ์ˆ˜์— ์ „๋‹ฌํ•ฉ๋‹ˆ๋‹ค. ์ฝ”๋“œ๋ฅผ ์–ด๋–ป๊ฒŒ ๋ณ€๊ฒฝํ–ˆ์Šต๋‹ˆ๊นŒ?

@oxwsds @robmsylvester @ebrevdo
I finally have something working now (the results for my single-layer 256-unit network are kind of terrible, but that's probably because the network is super light and I didn't tune the parameters at all)
Thanks a lot everyone...!!!!!

_Here are my closing thoughts:_

What @oxwsds said, that the tutorial (in its current form) works without needing a fix when Tensorflow is compiled from the remotes/origin/r1.0 branch, is TRUE. The sad bit, though, is that the Tensorflow version where I needed a fix within the Tensorflow code and the version from remotes/origin/r1.0 were both labeled identically.

์ฃผ์„์—์„œ @robmsylvester ์˜ ์ˆ˜์ • ์‚ฌํ•ญ(์•„๋ž˜์— ๋ณต์‚ฌ๋จ)์€ Tutorial์ด ๊ธฐ๋ณธ์ ์œผ๋กœ ์ž‘๋™ํ•˜์ง€ ์•Š๋Š” ๋‚ด ๋ฒ„์ „์˜ Tensorflow์—์„œ ์ž‘๋™ํ–ˆ์Šต๋‹ˆ๋‹ค(TF 1.1์—์„œ๋„ ์ž‘๋™ํ•ด์•ผ ํ•œ๋‹ค๊ณ  ์ƒ๊ฐํ•ฉ๋‹ˆ๋‹ค). ๊ตฌํ˜„ํ•˜๊ธฐ๊ฐ€ ์•ฝ๊ฐ„ ์ง€์ €๋ถ„ํ•˜์ง€๋งŒ ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ฆ‰, ๋ญ”๊ฐ€๋ฅผ ๋งํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค :-P
์ด ์ „์— ๋‚ด ๋งˆ์ง€๋ง‰ ๋‘ ๋Œ“๊ธ€์˜ ์˜ค๋ฅ˜๋Š” ๋‚ด ์‹ค์ˆ˜๋กœ ์ธํ•œ ๊ฒƒ์ž…๋‹ˆ๋‹ค. ๋”๋ฏธ์ฒ˜๋Ÿผ ํ›ˆ๋ จ ์ค‘์—๋งŒ ๋ ˆ์ด์–ด์™€ ์€๋‹‰ ์œ ๋‹› ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ ์ง€์ •ํ•˜๊ณ  ๋””์ฝ”๋”ฉํ•˜๋Š” ๋™์•ˆ ๊ธฐ๋ณธ๊ฐ’์„ ์‚ฌ์šฉํ•˜๋„๋ก ์ฝ”๋“œ๋ฅผ ๊ทธ๋Œ€๋กœ ๋‘์—ˆ์Šต๋‹ˆ๋‹ค. (ํŠœํ† ๋ฆฌ์–ผ์˜ ์ด ๋ถ€๋ถ„์€ ์•ฝ๊ฐ„ ๋” ๋”๋ฏธ ์ฆ๊ฑฐ๊ฐ€ ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค: https://www.tensorflow.org/tutorials/seq2seq#lets_run_it )

Hmm. One thing that stands out to me is in the referenced legacy seq2seq file:

encoder_cell = copy.deepcopy(cell)

It appears this line is used because the same architecture is used on both the encoder side and the decoder side. They make a copy of the cell, then pass the cell argument along to the attention decoder embedding function, then to the attention decoder itself.

What if you create the encoder cell and decoder cell explicitly in the seq2seq model file and pass them both along to the legacy library file, tweaking the functions and their arguments a little bit?

ํ”ผ๋“œ๋ฐฑ ๊ฐ์‚ฌ๋“œ๋ฆฝ๋‹ˆ๋‹ค! TF๋งˆ๋‹ค ๋ญ”๊ฐ€ ๋‹ค๋ฅธ๊ฒŒ ๋ณด์ด๋„ค์š”.
pypi์™€ ํ•ด๋‹น ํƒœ๊ทธ์—์„œ? ๊ฑดํ•œ, ๊ทธ๊ฒŒ ๊ฐ€๋Šฅํ•ด?

2017๋…„ 4์›” 10์ผ ์›”์š”์ผ ์˜คํ›„ 9:05 prashantserai [email protected]
์ผ๋‹ค:

@oxwsds https://github.com/oxwsds @robmsylvester
https://github.com/robmsylvester @ebrevdo https://github.com/ebrevdo
๋‚˜๋Š” ๋งˆ์นจ๋‚ด ์ง€๊ธˆ ์ž‘๋™ํ•˜๋Š” ๋ฌด์–ธ๊ฐ€๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค (๋‚ด ๋ง์€, ๋‚ด ์‹ฑ๊ธ€์— ๋Œ€ํ•œ ๊ฒฐ๊ณผ
๋ ˆ์ด์–ด 256 ๋‹จ์œ„ ๋„คํŠธ์›Œํฌ๋Š” ์ผ์ข…์˜ ๋”์ฐํ•˜์ง€๋งŒ ์•„๋งˆ๋„
๋„คํŠธ์›Œํฌ๊ฐ€ ์ดˆ๊ฒฝ๋Ÿ‰์ด๊ณ  ๋งค๊ฐœ๋ณ€์ˆ˜๋ฅผ ์ „ํ˜€ ์กฐ์ •ํ•˜์ง€ ์•Š์•˜๊ธฐ ๋•Œ๋ฌธ์—)

๋‚ด ๊ฒฐ๋ก ์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

@oxwsds https://github.com/oxwsds ๋Š” ํŠœํ† ๋ฆฌ์–ผ์ดํ˜„์žฌ ํ˜•์‹) Tensorflow๊ฐ€ ์žˆ์„ ๋•Œ ์ˆ˜์ •ํ•  ํ•„์š” ์—†์ด ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค.remotes/origin/r1.0 ๋ธŒ๋žœ์น˜์—์„œ ์ปดํŒŒ์ผ๋œ ๊ฒƒ์€ TRUE ์ž…๋‹ˆ๋‹ค. ์Šฌํ”ˆ ๋น„ํŠธ
๋‚ด๊ฐ€ ์ˆ˜์ •ํ•œ Tensorflow ๋ฒ„์ „์ด์ง€๋งŒ
Tensorflow ์ฝ”๋“œ ๋‚ด์—์„œ ํ•„์š”ํ–ˆ์œผ๋ฉฐ remotes/origin/r1.0์˜ ๋ฒ„์ „
๋‘˜ ๋‹ค ๋™์ผํ•˜๊ฒŒ ํ‘œ์‹œ๋˜์—ˆ์Šต๋‹ˆ๋‹ค.

@robmsylvester https://github.com/robmsylvester ์˜ ๋Œ“๊ธ€ ์ˆ˜์ •
(์•„๋ž˜์— ๋ณต์‚ฌ) ์ž์Šต์„œ๊ฐ€ ์žˆ๋Š” Tensorflow ๋ฒ„์ „์—์„œ ์ž‘๋™ํ–ˆ์Šต๋‹ˆ๋‹ค.
๊ธฐ๋ณธ์ ์œผ๋กœ ์ž‘๋™ํ•˜์ง€ ์•Š์•˜์Šต๋‹ˆ๋‹ค(TF 1.1์—์„œ๋„ ์ž‘๋™ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค). ๊ทธ๊ฒƒ์€
๊ตฌํ˜„ํ•˜๊ธฐ๊ฐ€ ์•ฝ๊ฐ„ ์ง€์ €๋ถ„ํ•˜์ง€๋งŒ ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
:-ํ”ผ
์ด ์ „์— ๋‚ด ๋งˆ์ง€๋ง‰ ๋‘ ๋Œ“๊ธ€์˜ ์˜ค๋ฅ˜๋Š” ๋‚ด ์‹ค์ˆ˜๋กœ ์ธํ•œ ๊ฒƒ์ž…๋‹ˆ๋‹ค. ์ข‹๋‹ค
๋”๋ฏธ, ๋‚˜๋Š” ๋ ˆ์ด์–ด์™€ ์€๋‹‰ ์œ ๋‹› ๋งค๊ฐœ๋ณ€์ˆ˜๋งŒ ์ง€์ •ํ•˜๊ณ  ์žˆ์—ˆ์Šต๋‹ˆ๋‹ค.
ํ›ˆ๋ จ ์ค‘์— ๋””์ฝ”๋”ฉํ•˜๋Š” ๋™์•ˆ ๊ธฐ๋ณธ๊ฐ’์„ ์‚ฌ์šฉํ•˜๋„๋ก ์ฝ”๋“œ๋ฅผ ๊ทธ๋Œ€๋กœ ๋‘์—ˆ์Šต๋‹ˆ๋‹ค. (์ด๊ฒƒํŠœํ† ๋ฆฌ์–ผ์˜ ์ผ๋ถ€๋Š” ์•ฝ๊ฐ„ ๋” ๋”๋ฏธ ์ฆ๊ฑฐ๊ฐ€ ๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.https://www.tensorflow.org/tutorials/seq2seq#lets_run_ithttps://www.tensorflow.org/tutorials/seq2seq#lets_run_it )

Hmm. One thing that jumps out at me is in the referenced legacy seq2seq
file:

encoder_cell = copy.deepcopy(cell)

This line appears to be used because the same architecture is used on both
the encoder and decoder side. They make a copy of the cell, then pass the
cell argument along to the attention decoder embedding function, then to
the attention decoder itself.

What if you explicitly create the encoder cell and decoder cell in the
seq2seq model file and pass both to the legacy library file, tweaking the
functions and their arguments a bit?

โ€”
๋‹น์‹ ์ด ์–ธ๊ธ‰๋˜์—ˆ๊ธฐ ๋•Œ๋ฌธ์— ์ด๊ฒƒ์„ ๋ฐ›๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.
์ด ์ด๋ฉ”์ผ์— ์ง์ ‘ ๋‹ต์žฅํ•˜๊ณ  GitHub์—์„œ ํ™•์ธํ•˜์„ธ์š”.
https://github.com/tensorflow/tensorflow/issues/8191#issuecomment-293143828 ,
๋˜๋Š” ์Šค๋ ˆ๋“œ ์Œ์†Œ๊ฑฐ
https://github.com/notifications/unsubscribe-auth/ABtimxvcfFnbWbpj7aUs3BUjwGEFj6p5ks5ruvvygaJpZM4MWl4f
.

For information, I ran into this issue while stacking LSTM cells.
My original code was:

    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(hidden_size, forget_bias=0.0, state_is_tuple=True)
    if is_training and keep_prob < 1:
      lstm_cell = tf.nn.rnn_cell.DropoutWrapper(
          lstm_cell, output_keep_prob=keep_prob)
    cell = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * num_layers, state_is_tuple=True)

๊ทธ๋Ÿฐ ๋‹ค์Œ ๋‹ค์Œ ์ฝ”๋“œ๋กœ ๋ชจ๋ธ์„ ๋งŒ๋“œ๋Š” ๊ฒƒ์€ ๊ดœ์ฐฎ์•˜์ง€๋งŒ ๋‹ค๋ฅธ ๋ชจ๋ธ๊ณผ ๋ณ€์ˆ˜๋ฅผ ๊ณต์œ ํ•  ์ˆ˜ ์—†์—ˆ์Šต๋‹ˆ๋‹ค. (์˜ˆ๋ฅผ ๋“ค์–ด ํ…์„œ๋ฅผ ๊ณต์œ ํ•ด์•ผ ํ•˜๋Š” train_model ๋ฐ valid_model์„ ์ƒ์„ฑํ•˜๋ฉด ์‹คํŒจํ•ฉ๋‹ˆ๋‹ค)

    lstm_creator = lambda: tf.contrib.rnn.BasicLSTMCell(
                                        hidden_size, 
                                        forget_bias=0.0, state_is_tuple=True)
    if is_training and keep_prob < 1:
      cell_creator = lambda:tf.contrib.rnn.DropoutWrapper(
          lstm_creator(), output_keep_prob=keep_prob)
    else:
      cell_creator = lstm_creator

    cell = tf.contrib.rnn.MultiRNNCell([cell_creator() for _ in range(num_layers)], state_is_tuple=True)
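Incidentally, the aliasing that makes `[lstm_cell] * num_layers` fail is plain Python list semantics, not TensorFlow. A minimal sketch (no TensorFlow needed; `FakeCell` is just a stand-in class for illustration) shows that list multiplication repeats one object, while a comprehension builds distinct ones:

```python
class FakeCell:
    """Stand-in for an RNN cell; only object identity matters here."""
    pass

cell = FakeCell()
shared = [cell] * 3                        # three references to ONE object
distinct = [FakeCell() for _ in range(3)]  # three separate objects

print(len({id(c) for c in shared}))    # 1: every "layer" is the same cell
print(len({id(c) for c in distinct}))  # 3: each layer gets its own cell
```

This is why MultiRNNCell complains: with `[cell] * num_layers` the same cell instance is asked to build its variables in cell_0's scope and then again in cell_1's scope.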

So finally I made lstm_creator a function, like lstm_cell in tensorflow/models/tutorials/rnn/ptb/ptb_word_lm.py#L112. I now have:

def lstm_cell():
  # With the latest TensorFlow source code (as of Mar 27, 2017),
  # the BasicLSTMCell will need a reuse parameter which is unfortunately not
  # defined in TensorFlow 1.0. To maintain backwards compatibility, we add
  # an argument check here:
  if 'reuse' in inspect.getargspec(
      tf.contrib.rnn.BasicLSTMCell.__init__).args:
    return tf.contrib.rnn.BasicLSTMCell(
        size, forget_bias=0.0, state_is_tuple=True,
        reuse=tf.get_variable_scope().reuse)
  else:
    return tf.contrib.rnn.BasicLSTMCell(
        size, forget_bias=0.0, state_is_tuple=True)

lstm_creator = lstm_cell
if is_training and keep_prob < 1:
  cell_creator = lambda: tf.contrib.rnn.DropoutWrapper(
      lstm_creator(), output_keep_prob=keep_prob)
else:
  cell_creator = lstm_creator

cell = tf.contrib.rnn.MultiRNNCell(
    [cell_creator() for _ in range(num_layers)], state_is_tuple=True)

It now works fully.
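The `inspect.getargspec` guard above is a general pattern for passing an argument only when the callee's signature supports it. A TensorFlow-free sketch of the same check (using `inspect.getfullargspec`, since `getargspec` was removed in modern Python; `old_ctor` and `new_ctor` are hypothetical stand-ins for the two BasicLSTMCell versions):

```python
import inspect

def old_ctor(size, forget_bias=0.0):
    # Mimics the TF 1.0 constructor: no `reuse` parameter.
    return ("cell", size, forget_bias)

def new_ctor(size, forget_bias=0.0, reuse=None):
    # Mimics the later constructor that grew a `reuse` parameter.
    return ("cell", size, forget_bias, reuse)

def make_cell(ctor, size, reuse_flag):
    # Pass `reuse` only when the constructor accepts it, mirroring the
    # backwards-compatibility check in the PTB tutorial snippet above.
    if 'reuse' in inspect.getfullargspec(ctor).args:
        return ctor(size, reuse=reuse_flag)
    return ctor(size)

print(make_cell(old_ctor, 128, True))  # ('cell', 128, 0.0)
print(make_cell(new_ctor, 128, True))  # ('cell', 128, 0.0, True)
```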

์ด ์ž‘์—…์„ ์‹คํ–‰ํ•˜๋ ค๊ณ  ํ•˜๋ฉด ๋™์ผํ•œ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•ฉ๋‹ˆ๋‹ค.

https://gist.github.com/danijar/c7ec9a30052127c7a1ad169eeb83f159#file -blog_tensorflow_sequence_classification-py-L38

@pltrdy ์˜ ์†”๋ฃจ์…˜์€ ์ด์ƒํ•˜๊ฒŒ ๋‚˜๋ฅผ ์œ„ํ•ด ๊ทธ๊ฒƒ์„ํ•˜์ง€ ์•Š์•˜์Šต๋‹ˆ๋‹ค. ๋‚˜๋Š” ์ ์ 

ValueError: Variable rnn/multi_rnn_cell/cell_0/basic_lstm_cell/weights does not exist, or was not created with tf.get_variable(). Did you mean to set reuse=None in VarScope?

@aep did you use the function from https://github.com/tensorflow/models/blob/master/tutorials/rnn/ptb/ptb_word_lm.py#L112 as mentioned at the end of my post (now edited for clarity)?

cells=[]
for _ in range(15):
    cell = create_lstm_cell(config)
    cells.append(cell)
lsmt_layers = rnn.MultiRNNCell(cells)

๊ทธ๊ฒƒ์€ ๋‚ด ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ–ˆ๋‹ค

์ด์ „ ๋ฒ„์ „์˜ Tensorflow๋ฅผ ์„ค์น˜ํ•˜์—ฌ ์ด ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ–ˆ์Šต๋‹ˆ๋‹ค.
pip install -Iv tensorflow==1.0

seq2seq ํŠœํ† ๋ฆฌ์–ผ์„ ์‹คํ–‰ํ•  ๋•Œ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ–ˆ์Šต๋‹ˆ๋‹ค.

@ebrevdo ๊ฐ€ ๋งํ•œ ๊ฒƒ๊ณผ ๊ด€๋ จํ•˜์—ฌ ์†”๋ฃจ์…˜์€ ๋ ˆ๊ฑฐ์‹œ seq2seq ์ฝ”๋“œ๋ฅผ ์ˆ˜์ •ํ•˜๋Š” ๊ฒƒ์ด ์•„๋‹ˆ๋ผ ์ ๊ทน์ ์œผ๋กœ ์œ ์ง€ ๊ด€๋ฆฌ๋˜๋Š” contrib.seq2seq ํŒจํ‚ค์ง€๋ฅผ ๋Œ€์‹  ์‚ฌ์šฉํ•˜๋„๋ก ์ž์Šต์„œ๋ฅผ ์—…๋ฐ์ดํŠธํ•˜๋Š” ๊ฒƒ์ด๋ผ๊ณ  ์ƒ๊ฐํ•ฉ๋‹ˆ๋‹ค. ์ฒ˜์Œ์œผ๋กœ ์‹คํ–‰ํ•œ tensorflow ํ”„๋กœ๊ทธ๋žจ์ด ๋งŽ์€ ์˜ค๋ฅ˜๋ฅผ ๋ฑ‰์–ด๋‚ด๋Š” ๊ฒƒ์€ ๋งค์šฐ ์‚ฌ๊ธฐ๋ฅผ ์ €ํ•˜์‹œํ‚ต๋‹ˆ๋‹ค. ์ด๋ฒˆ ์ฃผ์— ์‹œ๊ฐ„์ด ๋˜๋ฉด PR์„ ์ œ์ถœํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค.

We have a new seq2seq tutorial in the works. We had hoped to release it by
the end of last month, but it's being delayed. It uses the new API.


@ebrevdo I get the same error when running the sequence_to_sequence model from the tensorflow 1.1 website. And I tried the 'reuse' parameter but failed. Could you tell me when the new seq2seq tutorial will be released?

It will come out at the same time as tf 1.2, because we rely on new
features from that release.


@ebrevdo ์ €๋„ ๊ฐ™์€ ๋ฌธ์ œ์— ์ง๋ฉดํ•ด ์žˆ์œผ๋ฉฐ seq2seq๋กœ ์ง„ํ–‰ํ•  ์ˆ˜ ์—†์Šต๋‹ˆ๋‹ค. ์ƒˆ ํŠœํ† ๋ฆฌ์–ผ์˜ ๊ฐ€๋Šฅํ•œ ๋‚ ์งœ๋ฅผ ์•Œ๋ ค์ฃผ์‹œ๋ฉด ์ •๋ง ๋„์›€์ด ๋  ๊ฒƒ์ž…๋‹ˆ๋‹ค.
๋„์™€ ์ฃผ์…”์„œ ์ •๋ง๋กœ ๊ณ ๋ง™์Šต๋‹ˆ๋‹ค.

Installing with pip install tensorflow==1.0 (TensorFlow 1.0) works (for the translation tutorial).

I have version 1.1.0-rc2.

Will TF 1.2 fix this issue? Please let me know how to continue training my model. TF 1.0 works, but it lacks the DeviceWrapper API for multiple GPUs.

ํ…์„œ ํ๋ฆ„ 1.1๊ณผ ๋™์ผํ•œ ๋ฌธ์ œ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. ์—ฌ์ „ํžˆ ์†”๋ฃจ์…˜ ์ž‘์—… ์ค‘

I tried several things and was finally able to use tensorflow 1.1, but I had to make the following change (based on Tshzzz above):

์ด๊ฒƒ์„ ์ œ๊ฑฐํ•˜์‹ญ์‹œ์˜ค:
multicell = rnn.MultiRNNCell([dropcell]*NLAYERS, state_is_tuple=False)

๊ทธ๋ฆฌ๊ณ  ์ด๊ฒƒ์„ ์ถ”๊ฐ€ํ•˜์‹ญ์‹œ์˜ค:
์…€=[]
_ ๋ฒ”์œ„ ๋‚ด(NLAYERS):
์…€ = rnn.DropoutWrapper(tf.contrib.rnn.GRUCell(INTERNALSIZE), input_keep_prob=pkeep)
cells.append(์…€)
multicell = rnn.MultiRNNCell(์…€, state_is_tuple=False)

@ebrevdo Congrats, TF 1.2 has just been released. Has the new tutorial also been released somewhere, or is it coming soon?

๊ฐ์‚ฌ ํ•ด์š”

์ถœ์‹œ๋˜๋ฉด ๊ณต์ง€ํ•  ์˜ˆ์ •์ž…๋‹ˆ๋‹ค. ์ž‘์—… ์ค‘์ž…๋‹ˆ๋‹ค.


For anyone using tensorflow-gpu==1.1.0 and hitting this error: switching to 1.0.0 via pip install tensorflow-gpu==1.0.0 does not fix it. At least it didn't for me.

I ran into this issue on both mac and ubuntu, and compiling from source worked both times. So did:
pip install https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.0.0-cp34-cp34m-linux_x86_64.whl

@ajaanbaahu Still waiting for the new tf 1.2 seq2seq tutorial.

pip install tensorflow==1.0 ์„ ์‚ฌ์šฉํ•˜์—ฌ ์ €์—๊ฒŒ ํšจ๊ณผ์ ์ด์—ˆ์Šต๋‹ˆ๋‹ค.

tf r1.2์˜ ๊ฒฝ์šฐ deepcopy ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ–ˆ์Šต๋‹ˆ๋‹ค. ๋ชจ๋ธ ์˜ค๋ฅ˜ #1050์„ ์ˆœ์„œ ๋Œ€๋กœ ๋‚˜์—ดํ•œ ๋Œ€๋กœ

As a newcomer, my two cents:
the following code triggers a mistake similar to this one
(a part of my code):

lstm_cell = self.LSTMCell(self.num_hidden)
lstm_entity = tf.contrib.rnn.DropoutWrapper(lstm_cell, output_keep_prob=0.5)
layer = tf.contrib.rnn.MultiRNNCell([lstm_entity] * self.num_layer)
__, _ = tf.nn.dynamic_rnn(layer, self.data, dtype=tf.float64)

๋‹ค์Œ๊ณผ ๊ฐ™์€ ์˜ค๋ฅ˜ ๋คํ”„:

Traceback (most recent call last):
  File "IntentNet.py", line 71, in <module>
    net = Net(data, target, 5, 1)
  File "IntentNet.py", line 45, in __init__
    __, _ = tf.nn.dynamic_rnn(layer, self.data, dtype=tf.float64)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 553, in dynamic_rnn
    dtype=dtype)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 720, in _dynamic_rnn_loop
    swap_memory=swap_memory)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2623, in while_loop
    result = context.BuildLoop(cond, body, loop_vars, shape_invariants)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2456, in BuildLoop
    pred, body, original_loop_vars, loop_vars, shape_invariants)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/control_flow_ops.py", line 2406, in _BuildLoop
    body_result = body(*packed_vars_for_body)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 705, in _time_step
    (output, new_state) = call_cell()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/rnn.py", line 691, in <lambda>
    call_cell = lambda: cell(input_t, state)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 953, in __call__
    cur_inp, new_state = cell(cur_inp, cur_state)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 713, in __call__
    output, new_state = self._cell(inputs, state, scope)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 235, in __call__
    with _checked_scope(self, scope or "basic_lstm_cell", reuse=self._reuse):
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 77, in _checked_scope
    type(cell).__name__))
ValueError: Attempt to reuse RNNCell <tensorflow.contrib.rnn.python.ops.core_rnn_cell_impl.BasicLSTMCell object at 0x7fe4fc7bd150> with a different variable scope than its first use.  First use of cell was with scope 'rnn/multi_rnn_cell/cell_0/basic_lstm_cell', this attempt is with scope 'rnn/multi_rnn_cell/cell_1/basic_lstm_cell'.  Please create a new instance of the cell if you would like it to use a different set of weights.  If before you were using: MultiRNNCell([BasicLSTMCell(...)] * num_layers), change to: MultiRNNCell([BasicLSTMCell(...) for _ in range(num_layers)]).  If before you were using the same cell instance as both the forward and reverse cell of a bidirectional RNN, simply create two instances (one for forward, one for reverse).  In May 2017, we will start transitioning this cell's behavior to use existing stored weights, if any, when it is called with scope=None (which can lead to silent model degradation, so this error will remain until then.)

But after making the following fix, it works:

"""
lstm_cell = self.LSTMCell(self.num_hidden)
lstm_entity = tf.contrib.rnn.DropoutWrapper(lstm_cell, output_keep_prob=0.5)
layer = tf.contrib.rnn.MultiRNNCell([lstm_entity] * self.num_layer)
"""
layer = []
for i in range(self.num_layer):
    lstm_cell = self.LSTMCell(self.num_hidden)
    lstm_entity = tf.contrib.rnn.DropoutWrapper(lstm_cell, output_keep_prob=0.5)
    layer.append(lstm_entity)
layer = tf.contrib.rnn.MultiRNNCell(layer)
__, _ = tf.nn.dynamic_rnn(layer, self.data, dtype=tf.float64)

Neither of these workarounds worked for me on Tensorflow 1.1.

MultiRNNCell ์…€์ด ์žˆ๋Š” seq2seq ๋ชจ๋ธ์„ ์‚ฌ์šฉํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.

I had to go back to 1.0.1: pip3 install tensorflow==1.0

Is anyone having this problem when working with legacy_seq2seq.rnn_decoder()?

@oxwsds ๋ง์”€ํ•˜์‹  ๋Œ€๋กœ tf.contrib.legacy_seq2seq.embedding_attention_seq2seq์˜ ์ž…๋ ฅ ์ธ์ˆ˜ ์…€์„ ๋‘ ๊ฐœ์˜ ๋‹ค๋ฅธ ์…€ {encoder_cells, decoder_cells}๋กœ ๋ณ€๊ฒฝํ•ฉ๋‹ˆ๋‹ค. ๋งˆ์ง€๋ง‰์œผ๋กœ seq2seq ๋ชจ๋ธ์ด ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค. 73200 setps ํ›„์— ๋‹นํ˜น๋„ 5.54๋ฅผ ์–ป์Šต๋‹ˆ๋‹ค.
๊ทธ๋Ÿฐ ๋‹ค์Œ ๋””์ฝ”๋”ฉ ๋ถ€๋ถ„์„ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.

Who is the president of the United States?
Qui est le prรฉsident des ร‰tats-Unis?

๋ฌธ์ œ ํ•ด๊ฒฐ๋จ. ๊ฐ์‚ฌ ํ•ด์š”.

@doncat99
seq2seq.py ์˜ copy.deepcopy(cell) $์ด(๊ฐ€) ์ ์šฉ๋˜์ง€ ์•Š๋Š” ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค.
๋”ฐ๋ผ์„œ seq2seq_model.py ์˜ ๊ด€๋ จ ๋ถ€๋ถ„์„ ๋‹ค์Œ์œผ๋กœ ๋ณ€๊ฒฝํ•ฉ๋‹ˆ๋‹ค.

    if num_layers > 1:
      cell_enc = tf.contrib.rnn.MultiRNNCell([single_cell() for _ in range(num_layers)])
      cell_dec = tf.contrib.rnn.MultiRNNCell([single_cell() for _ in range(num_layers)])

    # The seq2seq function: we use embedding for the input and attention.
    def seq2seq_f(encoder_inputs, decoder_inputs, do_decode):
      return seq2seq.embedding_attention_seq2seq(
          encoder_inputs,
          decoder_inputs,
          cell_enc,
          cell_dec,
          num_encoder_symbols=source_vocab_size,
          num_decoder_symbols=target_vocab_size,
          embedding_size=size,
          output_projection=output_projection,
          feed_previous=do_decode,
          dtype=dtype)

@supermeatboy82, could you share your code?

Upgrading to Tensorflow 1.2.0 and creating the cells in a loop instead of by list multiplication fixed this for me.

I got an error on TF 1.2 when running translate.py. Details:
name: GeForce GTX 1080 Ti
major: 6 minor: 1 memoryClockRate(GHz) 1.582
pciBusID 0000:02:00.0
Total memory: 10.91GiB
Free memory: 10.76GiB
2017-06-22 09:15:04.485252: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2017-06-22 09:15:04.485256: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2017-06-22 09:15:04.485265: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:02:00.0)
Creating 3 layers of 1024 units.
์—ญ์ถ”์ (๊ฐ€์žฅ ์ตœ๊ทผ ํ˜ธ์ถœ ๋งˆ์ง€๋ง‰):
ํŒŒ์ผ "translate.py", 322ํ–‰,
tf.app.run()
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/platform/app.py", 48ํ–‰, ์‹คํ–‰ ์ค‘
_sys.exit(main(_sys.argv[:1] + flags_passthrough))
ํŒŒ์ผ "translate.py", 319ํ–‰, ๋ฉ”์ธ
๊ธฐ์ฐจ()
๊ธฐ์ฐจ์—์„œ ํŒŒ์ผ "translate.py", 178ํ–‰
๋ชจ๋ธ = create_model(sess, False)
create_model์˜ ํŒŒ์ผ "translate.py", 136ํ–‰
dtype=dtype)
ํŒŒ์ผ "/data/research/github/dl/tensorflow/tensorflow/models/tutorials/rnn/translate/seq2seq_model.py", 179ํ–‰, __init__
softmax_loss_function=softmax_loss_function)
model_with_buckets์˜ ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", 1206ํ–‰
๋””์ฝ”๋”_์ž…๋ ฅ[:๋ฒ„ํ‚ท[1]])
ํŒŒ์ผ "/data/research/github/dl/tensorflow/tensorflow/models/tutorials/rnn/translate/seq2seq_model.py", ๋ผ์ธ 178, in
๋žŒ๋‹ค x, y: seq2seq_f(x, y, False),
ํŒŒ์ผ "/data/research/github/dl/tensorflow/tensorflow/models/tutorials/rnn/translate/seq2seq_model.py", 142ํ–‰, seq2seq_f
dtype=dtype)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", 848ํ–‰, embedding_attention_seq2seq
์ธ์ฝ”๋”_์…€ = copy.deepcopy(์…€)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 174ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/site-packages/tensorflow/python/layers/base.py", 476ํ–‰, __deepcopy__
setattr(๊ฒฐ๊ณผ, k, copy.deepcopy(v, ๋ฉ”๋ชจ))
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 230ํ–‰, _deepcopy_list
y.append(deepcopy(a, ๋ฉ”๋ชจ))
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 190ํ–‰, deepcopy
y = _reconstruct(x, rv, 1, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 334ํ–‰, _reconstruct
์ƒํƒœ = deepcopy(์ƒํƒœ, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 257ํ–‰, _deepcopy_dict
y[deepcopy(ํ‚ค, ๋ฉ”๋ชจ)] = deepcopy(๊ฐ’, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 190ํ–‰, deepcopy
y = _reconstruct(x, rv, 1, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 334ํ–‰, _reconstruct
์ƒํƒœ = deepcopy(์ƒํƒœ, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 257ํ–‰, _deepcopy_dict
y[deepcopy(ํ‚ค, ๋ฉ”๋ชจ)] = deepcopy(๊ฐ’, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 190ํ–‰, deepcopy
y = _reconstruct(x, rv, 1, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 334ํ–‰, _reconstruct
์ƒํƒœ = deepcopy(์ƒํƒœ, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 257ํ–‰, _deepcopy_dict
y[deepcopy(ํ‚ค, ๋ฉ”๋ชจ)] = deepcopy(๊ฐ’, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 190ํ–‰, deepcopy
y = _reconstruct(x, rv, 1, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 334ํ–‰, _reconstruct
์ƒํƒœ = deepcopy(์ƒํƒœ, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 257ํ–‰, _deepcopy_dict
y[deepcopy(ํ‚ค, ๋ฉ”๋ชจ)] = deepcopy(๊ฐ’, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 257ํ–‰, _deepcopy_dict
y[deepcopy(ํ‚ค, ๋ฉ”๋ชจ)] = deepcopy(๊ฐ’, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 230ํ–‰, _deepcopy_list
y.append(deepcopy(a, ๋ฉ”๋ชจ))
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 190ํ–‰, deepcopy
y = _reconstruct(x, rv, 1, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 334ํ–‰, _reconstruct
์ƒํƒœ = deepcopy(์ƒํƒœ, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 257ํ–‰, _deepcopy_dict
y[deepcopy(ํ‚ค, ๋ฉ”๋ชจ)] = deepcopy(๊ฐ’, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 190ํ–‰, deepcopy
y = _reconstruct(x, rv, 1, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 334ํ–‰, _reconstruct
์ƒํƒœ = deepcopy(์ƒํƒœ, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 257ํ–‰, _deepcopy_dict
y[deepcopy(ํ‚ค, ๋ฉ”๋ชจ)] = deepcopy(๊ฐ’, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 190ํ–‰, deepcopy
y = _reconstruct(x, rv, 1, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 334ํ–‰, _reconstruct
์ƒํƒœ = deepcopy(์ƒํƒœ, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 257ํ–‰, _deepcopy_dict
y[deepcopy(ํ‚ค, ๋ฉ”๋ชจ)] = deepcopy(๊ฐ’, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 230ํ–‰, _deepcopy_list
y.append(deepcopy(a, ๋ฉ”๋ชจ))
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 237ํ–‰, _deepcopy_tuple
y.append(deepcopy(a, ๋ฉ”๋ชจ))
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 257ํ–‰, _deepcopy_dict
y[deepcopy(ํ‚ค, ๋ฉ”๋ชจ)] = deepcopy(๊ฐ’, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 190ํ–‰, deepcopy
y = _reconstruct(x, rv, 1, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 334ํ–‰, _reconstruct
์ƒํƒœ = deepcopy(์ƒํƒœ, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 163ํ–‰, deepcopy
y = ๋ณต์‚ฌ๊ธฐ(x, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 257ํ–‰, _deepcopy_dict
y[deepcopy(ํ‚ค, ๋ฉ”๋ชจ)] = deepcopy(๊ฐ’, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 190ํ–‰, deepcopy
y = _reconstruct(x, rv, 1, ๋ฉ”๋ชจ)
ํŒŒ์ผ "/home/lscm/opt/anaconda2/lib/python2.7/copy.py", 343ํ–‰, _reconstruct
y.__dict__.update(์ƒํƒœ)
AttributeError: 'NoneType' ๊ฐœ์ฒด์— '์—…๋ฐ์ดํŠธ' ์†์„ฑ์ด ์—†์Šต๋‹ˆ๋‹ค.

ํŠœํ† ๋ฆฌ์–ผ์˜ ๋ฒˆ์—ญ ๋ชจ๋ธ์—์„œ self_test() ๋ฅผ ์‹คํ–‰ํ•  ๋•Œ embedding_attention_seq2seq() copy.deepcopy(cell) ๋กœ ์ธํ•œ ์˜ค๋ฅ˜๋„ ๋งŒ๋‚ฌ์Šต๋‹ˆ๋‹ค.
Seq2SeqModel seq2seq_f() ์˜ ์ฝ”๋“œ๋ฅผ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๋ณ€๊ฒฝํ•˜๋ ค๊ณ  ํ–ˆ์Šต๋‹ˆ๋‹ค.

    def seq2seq_f(encoder_inputs, decoder_inputs, do_decode=False):
        tmp_cell = copy.deepcopy(cell) #new
        return tf.contrib.legacy_seq2seq.embedding_attention_seq2seq(
            encoder_inputs,
            decoder_inputs,
            tmp_cell, #new
            num_encoder_symbols=source_vocab_size,
            num_decoder_symbols=target_vocab_size,
            embedding_size=size,
            output_projection=output_projection,
            feed_previous=do_decode,
            dtype=dtype)

๊ทธ๋Ÿฌ๋ฉด ์ด์ œ ์˜ค๋ฅ˜๊ฐ€ ์—†์Šต๋‹ˆ๋‹ค.
๊ทธ๋Ÿฌ๋‚˜ ์‹ ์ธ์œผ๋กœ์„œ ์—ฌ๊ธฐ ์ฝ”๋“œ๊ฐ€ ์ด์ „๊ณผ ๊ฐ™์ด ์ž‘๋™ํ•˜๋Š”์ง€ ์—ฌ๋ถ€๋ฅผ ์•Œ์ง€ ๋ชปํ•˜๋ฉฐ ๋ณ€๊ฒฝ ์‚ฌํ•ญ์œผ๋กœ ์ธํ•ด ๋ชจ๋ธ์ด ๋Š๋ฆฌ๊ฒŒ ์‹คํ–‰๋˜๋Š” ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค.

I'd like to update everyone that downgrading tensorflow to 1.0.0 (tensorflow-gpu) worked for me. The model is working as expected. I assume the CPU version of 1.0.0 should also work as expected?
Thanks :)

Hi guys, not sure whether there's still interest, but I found that the problem relates to copying the cell passed as a parameter to the embedding_attention_seq2seq function, since the same cell definition is used for both the encoder and the decoder. I think this tutorial is deprecated anyway, given that it uses a seq2seq model with bucketing as opposed to dynamic seq2seq. However, I'm pasting a modified version of the function that works. The function to update lives in the file tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py.

Thanks,
Fabio

```!ํŒŒ์ด์ฌ
def embedding_attention_seq2seq(encoder_inputs,
๋””์ฝ”๋”_์ž…๋ ฅ,
enc_cell,
dec_cell,
num_encoder_symbols,
num_decoder_symbols,
์ž„๋ฒ ๋”ฉ_ํฌ๊ธฐ,
num_heads=1,
output_projection=์—†์Œ,
feed_previous=๊ฑฐ์ง“,
dtype=์—†์Œ,
๋ฒ”์œ„=์—†์Œ,
initial_state_attention=๊ฑฐ์ง“):
"""์ฃผ์˜๋ฅผ ๊ธฐ์šธ์—ฌ sequence-to-sequence ๋ชจ๋ธ์„ ํฌํ•จํ•ฉ๋‹ˆ๋‹ค.

์ด ๋ชจ๋ธ์€ ๋จผ์ € ์ƒˆ๋กœ ์ƒ์„ฑ๋œ ์ž„๋ฒ ๋”ฉ(๋ชจ์–‘
[num_encoder_symbols x input_size]). ๊ทธ๋Ÿฐ ๋‹ค์Œ RNN์„ ์‹คํ–‰ํ•˜์—ฌ ์ธ์ฝ”๋”ฉํ•ฉ๋‹ˆ๋‹ค.
์ž„๋ฒ ๋””๋“œ ์ธ์ฝ”๋”_์ž…๋ ฅ์„ ์ƒํƒœ ๋ฒกํ„ฐ์— ๋„ฃ์Šต๋‹ˆ๋‹ค. ์ด ์ถœ๋ ฅ์„ ์œ ์ง€ํ•ฉ๋‹ˆ๋‹ค.
๋‚˜์ค‘์— ์ฃผ์˜๋ฅผ ๋Œ๊ธฐ ์œ„ํ•ด ๋ชจ๋“  ๋‹จ๊ณ„์—์„œ RNN์„ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค. ๋‹ค์Œ์œผ๋กœ ๋””์ฝ”๋”_์ž…๋ ฅ์„ ํฌํ•จํ•ฉ๋‹ˆ๋‹ค.
์ƒˆ๋กœ ์ƒ์„ฑ๋œ ๋‹ค๋ฅธ ์ž„๋ฒ ๋”ฉ(๋ชจ์–‘ [num_decoder_symbols x
input_size]). ๊ทธ๋Ÿฐ ๋‹ค์Œ ๋งˆ์ง€๋ง‰์œผ๋กœ ์ดˆ๊ธฐํ™”๋œ ์ฃผ์˜ ๋””์ฝ”๋”๋ฅผ ์‹คํ–‰ํ•ฉ๋‹ˆ๋‹ค.
์ธ์ฝ”๋” ์ƒํƒœ, ์ž„๋ฒ ๋””๋“œ ๋””์ฝ”๋”_์ž…๋ ฅ ๋ฐ ์ธ์ฝ”๋” ์ถœ๋ ฅ์— ์ฃผ์˜.

๊ฒฝ๊ณ : output_projection์ด None์ด๋ฉด ์ฃผ์˜ ๋ฒกํ„ฐ์˜ ํฌ๊ธฐ
๋ณ€์ˆ˜๋Š” num_decoder_symbols์— ๋น„๋ก€ํ•˜์—ฌ ๋งŒ๋“ค์–ด์ง€๋ฉฐ ํด ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์ธ์ˆ˜:
์ธ์ฝ”๋”_์ž…๋ ฅ: [batch_size] ๋ชจ์–‘์˜ 1D int32 ํ…์„œ ๋ชฉ๋ก์ž…๋‹ˆ๋‹ค.
๋””์ฝ”๋”_์ž…๋ ฅ: ๋ชจ์–‘์ด [batch_size]์ธ 1D int32 ํ…์„œ์˜ ๋ชฉ๋ก์ž…๋‹ˆ๋‹ค.
cell: tf.nn.rnn_cell.RNNCell ์…€ ๊ธฐ๋Šฅ ๋ฐ ํฌ๊ธฐ๋ฅผ ์ •์˜ํ•ฉ๋‹ˆ๋‹ค.
num_encoder_symbols: ์ •์ˆ˜; ์ธ์ฝ”๋” ์ธก์˜ ๊ธฐํ˜ธ ์ˆ˜.
num_decoder_symbols: ์ •์ˆ˜; ๋””์ฝ”๋” ์ธก์˜ ์‹ฌ๋ณผ ์ˆ˜.
Embedding_size: ์ •์ˆ˜, ๊ฐ ์‹ฌ๋ณผ์— ๋Œ€ํ•œ ์ž„๋ฒ ๋”ฉ ๋ฒกํ„ฐ์˜ ๊ธธ์ด.
num_heads: Attention_states์—์„œ ์ฝ์€ ์ฃผ์˜ ํ—ค๋“œ์˜ ์ˆ˜์ž…๋‹ˆ๋‹ค.
output_projection: ์—†์Œ ๋˜๋Š” ์ถœ๋ ฅ ํˆฌ์˜ ๊ฐ€์ค‘์น˜ ์Œ(W, B) ๋ฐ
ํŽธ๊ฒฌ; W์˜ ๋ชจ์–‘์€ [output_size x num_decoder_symbols]์ด๊ณ  B๋Š”
๋ชจ์–‘ [num_decoder_symbols]; ์ œ๊ณต๋˜๊ณ  feed_previous=True์ธ ๊ฒฝ์šฐ ๊ฐ๊ฐ
๊ณต๊ธ‰๋œ ์ด์ „ ์ถœ๋ ฅ์€ ๋จผ์ € W๋ฅผ ๊ณฑํ•˜๊ณ  B๋ฅผ ๋”ํ•ฉ๋‹ˆ๋‹ค.
feed_previous: ๋ถ€์šธ ๋˜๋Š” ์Šค์นผ๋ผ ๋ถ€์šธ ํ…์„œ ์ฐธ์ด๋ฉด ์ฒซ ๋ฒˆ์งธ
Decoder_inputs("GO" ๊ธฐํ˜ธ)๊ฐ€ ์‚ฌ์šฉ๋˜๊ณ  ๋‹ค๋ฅธ ๋ชจ๋“  ๋””์ฝ”๋”
์ž…๋ ฅ์€ ์ด์ „ ์ถœ๋ ฅ์—์„œ โ€‹โ€‹๊ฐ€์ ธ์˜ต๋‹ˆ๋‹ค(embedding_rnn_decoder์—์„œ์™€ ๊ฐ™์ด).
False์ธ ๊ฒฝ์šฐ Decoder_inputs๋Š” ์ฃผ์–ด์ง„ ๋Œ€๋กœ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค(ํ‘œ์ค€ ๋””์ฝ”๋” ๊ฒฝ์šฐ).
dtype: ์ดˆ๊ธฐ RNN ์ƒํƒœ์˜ dtype(๊ธฐ๋ณธ๊ฐ’: tf.float32).
๋ฒ”์œ„: ์ƒ์„ฑ๋œ ํ•˜์œ„ ๊ทธ๋ž˜ํ”„์— ๋Œ€ํ•œ VariableScope; ๊ธฐ๋ณธ๊ฐ’์€
"embedding_attention_seq2seq".
initial_state_attention: False(๊ธฐ๋ณธ๊ฐ’)์ธ ๊ฒฝ์šฐ ์ดˆ๊ธฐ ์ฃผ์˜๋Š” 0์ž…๋‹ˆ๋‹ค.
True์ด๋ฉด ์ดˆ๊ธฐ ์ƒํƒœ์—์„œ ์ฃผ์˜ ์ดˆ๊ธฐํ™” ๋ฐ ์ฃผ์˜
์ƒํƒœ.

๋ณด๊ณ :
(์ถœ๋ ฅ, ์ƒํƒœ) ํ˜•์‹์˜ ํŠœํ”Œ, ์—ฌ๊ธฐ์„œ:
output: 2D Tensor์˜ Decoder_inputs์™€ ๋™์ผํ•œ ๊ธธ์ด์˜ ๋ชฉ๋ก
์ƒ์„ฑ๋œ ๋ชจ์–‘์„ ํฌํ•จํ•˜๋Š” [batch_size x num_decoder_symbols]
์ถœ๋ ฅ.
state: ์ตœ์ข… ์‹œ๊ฐ„ ๋‹จ๊ณ„์—์„œ ๊ฐ ๋””์ฝ”๋” ์…€์˜ ์ƒํƒœ.
[batch_size x cell.state_size] ๋ชจ์–‘์˜ 2D Tensor์ž…๋‹ˆ๋‹ค.
""
variable_scope.variable_scope(
๋ฒ”์œ„ ๋˜๋Š” "embedding_attention_seq2seq", dtype=dtype) ๋ฒ”์œ„๋กœ:
dtype = ๋ฒ”์œ„.dtype
# ์ธ์ฝ”๋”.

encoder_cell = enc_cell

encoder_cell = core_rnn_cell.EmbeddingWrapper(
    encoder_cell,
    embedding_classes=num_encoder_symbols,
    embedding_size=embedding_size)
encoder_outputs, encoder_state = rnn.static_rnn(
    encoder_cell, encoder_inputs, dtype=dtype)

# First calculate a concatenation of encoder outputs to put attention on.
top_states = [
    array_ops.reshape(e, [-1, 1, encoder_cell.output_size]) for e in encoder_outputs
]
attention_states = array_ops.concat(top_states, 1)

# Decoder.
output_size = None
if output_projection is None:
  dec_cell = core_rnn_cell.OutputProjectionWrapper(dec_cell, num_decoder_symbols)
  output_size = num_decoder_symbols

if isinstance(feed_previous, bool):
  return embedding_attention_decoder(
      decoder_inputs,
      encoder_state,
      attention_states,
      dec_cell,
      num_decoder_symbols,
      embedding_size,
      num_heads=num_heads,
      output_size=output_size,
      output_projection=output_projection,
      feed_previous=feed_previous,
      initial_state_attention=initial_state_attention)

# If feed_previous is a Tensor, we construct 2 graphs and use cond.
def decoder(feed_previous_bool):
  reuse = None if feed_previous_bool else True
  with variable_scope.variable_scope(
      variable_scope.get_variable_scope(), reuse=reuse):
    outputs, state = embedding_attention_decoder(
        decoder_inputs,
        encoder_state,
        attention_states,
        dec_cell,
        num_decoder_symbols,
        embedding_size,
        num_heads=num_heads,
        output_size=output_size,
        output_projection=output_projection,
        feed_previous=feed_previous_bool,
        update_embedding_for_previous=False,
        initial_state_attention=initial_state_attention)
    state_list = [state]
    if nest.is_sequence(state):
      state_list = nest.flatten(state)
    return outputs + state_list

outputs_and_state = control_flow_ops.cond(feed_previous,
                                          lambda: decoder(True),
                                          lambda: decoder(False))
outputs_len = len(decoder_inputs)  # Outputs length same as decoder inputs.
state_list = outputs_and_state[outputs_len:]
state = state_list[0]
if nest.is_sequence(encoder_state):
  state = nest.pack_sequence_as(
      structure=encoder_state, flat_sequence=state_list)
return outputs_and_state[:outputs_len], state

```

@fabiofumarola Thanks for the function. It looks really helpful. I also saw that the tutorial is deprecated. I'm still waiting for the official tutorial release, which I believe uses the new API. Is there any code I could look at to start coding with the new API?
I'd appreciate the help. Thanks again :)

@syw2014 Did you solve the problem?

@w268wang ์€ ์•„์ง ๋‹ค๋ฅธ ์†”๋ฃจ์…˜์„ ๊ธฐ๋‹ค๋ฆฌ๊ณ  ์žˆ์ง€๋งŒ @Miopas์˜ ์˜๊ฒฌ์€ ์‹œ๋„ํ•ด ๋ณผ ์ˆ˜ ์žˆ์œผ๋ฉฐ @fabiofumarola ์˜ ์†”๋ฃจ์…˜์„ ์‹œ๋„ํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.

It says TypeError: embedding_attention_seq2seq() missing 1 required positional argument: 'dec_cell'
after using the update posted by @fabiofumarola. Could you help?

Yes, because with my suggested update you have to change the
embedding_attention_seq2seq function. If you go into the source files of
your tensorflow release, you can change the method definition yourself.

2017๋…„ 7์›” 2์ผ ์ผ์š”์ผ 18:15, sachinh35 [email protected] trote

TypeError:embding_attention_seq2seq() ๋ˆ„๋ฝ 1์ด ํ•„์š”ํ•˜๋‹ค๊ณ  ํ‘œ์‹œ๋ฉ๋‹ˆ๋‹ค.
์œ„์น˜ ์ธ์ˆ˜: 'dec_cell'

โ€”
๋‹น์‹ ์ด ์–ธ๊ธ‰๋˜์—ˆ๊ธฐ ๋•Œ๋ฌธ์— ์ด๊ฒƒ์„ ๋ฐ›๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.
์ด ์ด๋ฉ”์ผ์— ์ง์ ‘ ๋‹ต์žฅํ•˜๊ณ  GitHub์—์„œ ํ™•์ธํ•˜์„ธ์š”.
https://github.com/tensorflow/tensorflow/issues/8191#issuecomment-312500996 ,
๋˜๋Š” ์Šค๋ ˆ๋“œ ์Œ์†Œ๊ฑฐ
https://github.com/notifications/unsubscribe-auth/ABepUEc3W8m5CVDQGnCLu4dcJVFwwLDZks5sJ8IOgaJpZM4MWl4f
.

>

Gmail ๋ชจ๋ฐ”์ผ์—์„œ ์ „์†ก๋จ

์˜ˆ, ๋‚˜๋Š” ๊ฐ™์€ ์ผ์„ํ–ˆ์Šต๋‹ˆ๋‹ค. tensorflow ๋ฆด๋ฆฌ์Šค์—์„œ seq2seq.py ํŒŒ์ผ์˜ ๊ธฐ๋Šฅ์„ ๋ณ€๊ฒฝํ–ˆ์Šต๋‹ˆ๋‹ค. ์—ฌ์ „ํžˆ ๋™์ผํ•œ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•ฉ๋‹ˆ๋‹ค. ํ•จ์ˆ˜์— ๋Œ€ํ•œ ์ธ์ˆ˜๊ฐ€ ํ•˜๋‚˜ ๋” ์žˆ์Šต๋‹ˆ๊นŒ?

์˜ˆ, ์ด์ œ ์ฝ”๋“œ์—์„œ rnn_cells์— ์ง€์ •ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์ธ์ฝ”๋”์šฉ
๋””์ฝ”๋”์— ๋Œ€ํ•œ ๋˜ ๋‹ค๋ฅธ.

2017๋…„ 7์›” 2์ผ ์ผ์š”์ผ 20:54 fabio fumarola [email protected] ์ด ๋‹ค์Œ๊ณผ ๊ฐ™์ด ์ผ์Šต๋‹ˆ๋‹ค.

๋„ค

2017๋…„ 7์›” 2์ผ ์ผ์š”์ผ 18:50์— sachinh35 [email protected] ์—์„œ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ์ผ์Šต๋‹ˆ๋‹ค.

์˜ˆ, ๋‚˜๋Š” ๊ฐ™์€ ์ผ์„ํ–ˆ์Šต๋‹ˆ๋‹ค. seq2seq.py ํŒŒ์ผ์˜ ๊ธฐ๋Šฅ์„ ๋ณ€๊ฒฝํ–ˆ์Šต๋‹ˆ๋‹ค.
ํ…์„œํ”Œ๋กœ ๋ฆด๋ฆฌ์Šค. ์—ฌ์ „ํžˆ ๋™์ผํ•œ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•ฉ๋‹ˆ๋‹ค. ํ•˜๋‚˜ ์žˆ๋‚˜์š”
๊ธฐ๋Šฅ์— ๋Œ€ํ•œ ๋” ๋งŽ์€ ์ธ์ˆ˜?

โ€”
๋‹น์‹ ์ด ์–ธ๊ธ‰๋˜์—ˆ๊ธฐ ๋•Œ๋ฌธ์— ์ด๊ฒƒ์„ ๋ฐ›๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.
์ด ์ด๋ฉ”์ผ์— ์ง์ ‘ ๋‹ต์žฅํ•˜๊ณ  GitHub์—์„œ ํ™•์ธํ•˜์„ธ์š”.
https://github.com/tensorflow/tensorflow/issues/8191#issuecomment-312503106 ,
๋˜๋Š” ์Šค๋ ˆ๋“œ ์Œ์†Œ๊ฑฐ
https://github.com/notifications/unsubscribe-auth/ABepUOXTQC_mzLuhcwW0iZRVkLmmr8yIks5sJ8pugaJpZM4MWl4f
.

>

Gmail ๋ชจ๋ฐ”์ผ์—์„œ ์ „์†ก๋จ

๋‚˜๋Š” ์ด๊ฒƒ์— ์™„์ „ํžˆ ์ƒˆ๋กœ์šด ๊ฒƒ์ž…๋‹ˆ๋‹ค. ์ด๊ฒƒ์€ ์•„์ฃผ ๊ธฐ๋ณธ์ ์ธ ์งˆ๋ฌธ์ผ ์ˆ˜ ์žˆ์ง€๋งŒ ์ด ์ฝ”๋“œ์—์„œ ๋””์ฝ”๋” ์…€๋กœ ์ „๋‹ฌํ•  ์ธ์ˆ˜๋ฅผ ๋งํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๊นŒ? ์ž์ฒด ๋ฐ์ดํ„ฐ ์„ธํŠธ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ tensorflow ์ž์Šต์„œ์— ํ‘œ์‹œ๋œ ๋Œ€๋กœ seq2seq๋ฅผ ๊ฐœ๋ฐœํ•˜๋ ค๊ณ  ํ•ฉ๋‹ˆ๋‹ค.

```
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import random

import numpy as np
from six.moves import xrange  # pylint: disable=redefined-builtin
import tensorflow as tf

import data_utils


class Seq2SeqModel(object):
  def __init__(self,
               source_vocab_size,
               target_vocab_size,
               buckets,
               size,
               num_layers,
               max_gradient_norm,
               batch_size,
               learning_rate,
               learning_rate_decay_factor,
               use_lstm=False,
               num_samples=512,
               forward_only=False,
               dtype=tf.float32):

self.source_vocab_size = source_vocab_size
self.target_vocab_size = target_vocab_size
self.buckets = buckets
self.batch_size = batch_size
self.learning_rate = tf.Variable(
    float(learning_rate), trainable=False, dtype=dtype)
self.learning_rate_decay_op = self.learning_rate.assign(
    self.learning_rate * learning_rate_decay_factor)
self.global_step = tf.Variable(0, trainable=False)


output_projection = None
softmax_loss_function = None

if num_samples > 0 and num_samples < self.target_vocab_size:
  w_t = tf.get_variable("proj_w", [self.target_vocab_size, size], dtype=dtype)
  w = tf.transpose(w_t)
  b = tf.get_variable("proj_b", [self.target_vocab_size], dtype=dtype)
  output_projection = (w, b)

  def sampled_loss(labels, inputs):
    labels = tf.reshape(labels, [-1, 1])

    local_w_t = tf.cast(w_t, tf.float32)
    local_b = tf.cast(b, tf.float32)
    local_inputs = tf.cast(inputs, tf.float32)
    return tf.cast(
        tf.nn.sampled_softmax_loss(local_w_t, local_b, local_inputs, labels,
                                   num_samples, self.target_vocab_size),
        dtype)
  softmax_loss_function = sampled_loss


def single_cell():
  return tf.nn.rnn_cell.GRUCell(size)
if use_lstm:
  def single_cell():
    return tf.nn.rnn_cell.BasicLSTMCell(size)
cell = single_cell()
if num_layers > 1:
  cell = tf.nn.rnn_cell.MultiRNNCell([single_cell() for _ in range(num_layers)])


def seq2seq_f(encoder_inputs, decoder_inputs, do_decode):
  return tf.contrib.legacy_seq2seq.embedding_attention_seq2seq(
      encoder_inputs,
      decoder_inputs,
      cell,
      num_encoder_symbols=source_vocab_size,
      num_decoder_symbols=target_vocab_size,
      embedding_size=size,
      output_projection=output_projection,
      feed_previous=do_decode,
      dtype=dtype)


self.encoder_inputs = []
self.decoder_inputs = []
self.target_weights = []
for i in xrange(buckets[-1][0]):  # Last bucket is the biggest one.
  self.encoder_inputs.append(tf.placeholder(tf.int32, shape=[None],
                                            name="encoder{0}".format(i)))
for i in xrange(buckets[-1][1] + 1):
  self.decoder_inputs.append(tf.placeholder(tf.int32, shape=[None],
                                            name="decoder{0}".format(i)))
  self.target_weights.append(tf.placeholder(dtype, shape=[None],
                                            name="weight{0}".format(i)))

# Our targets are decoder inputs shifted by one.
targets = [self.decoder_inputs[i + 1]
           for i in xrange(len(self.decoder_inputs) - 1)]

# Training outputs and losses.
if forward_only:
  self.outputs, self.losses = tf.contrib.legacy_seq2seq.model_with_buckets(
      self.encoder_inputs, self.decoder_inputs, targets,
      self.target_weights, buckets, lambda x, y: seq2seq_f(x, y, True),
      softmax_loss_function=softmax_loss_function)
  # If we use output projection, we need to project outputs for decoding.
  if output_projection is not None:
    for b in xrange(len(buckets)):
      self.outputs[b] = [
          tf.matmul(output, output_projection[0]) + output_projection[1]
          for output in self.outputs[b]
      ]
else:
  self.outputs, self.losses = tf.contrib.legacy_seq2seq.model_with_buckets(
      self.encoder_inputs, self.decoder_inputs, targets,
      self.target_weights, buckets,
      lambda x, y: seq2seq_f(x, y, False),
      softmax_loss_function=softmax_loss_function)

# Gradients and SGD update operation for training the model.
params = tf.trainable_variables()
if not forward_only:
  self.gradient_norms = []
  self.updates = []
  opt = tf.train.GradientDescentOptimizer(self.learning_rate)
  for b in xrange(len(buckets)):
    gradients = tf.gradients(self.losses[b], params)
    clipped_gradients, norm = tf.clip_by_global_norm(gradients,
                                                     max_gradient_norm)
    self.gradient_norms.append(norm)
    self.updates.append(opt.apply_gradients(
        zip(clipped_gradients, params), global_step=self.global_step))

self.saver = tf.train.Saver(tf.global_variables())

def ๋‹จ๊ณ„(์ž์ฒด, ์„ธ์…˜, ์ธ์ฝ”๋”_์ž…๋ ฅ, ๋””์ฝ”๋”_์ž…๋ ฅ, ๋Œ€์ƒ_๊ฐ€์ค‘์น˜,
๋ฒ„ํ‚ท ID, forward_only):

# Check if the sizes match.
encoder_size, decoder_size = self.buckets[bucket_id]
if len(encoder_inputs) != encoder_size:
  raise ValueError("Encoder length must be equal to the one in bucket,"
                   " %d != %d." % (len(encoder_inputs), encoder_size))
if len(decoder_inputs) != decoder_size:
  raise ValueError("Decoder length must be equal to the one in bucket,"
                   " %d != %d." % (len(decoder_inputs), decoder_size))
if len(target_weights) != decoder_size:
  raise ValueError("Weights length must be equal to the one in bucket,"
                   " %d != %d." % (len(target_weights), decoder_size))

# Input feed: encoder inputs, decoder inputs, target_weights, as provided.
input_feed = {}
for l in xrange(encoder_size):
  input_feed[self.encoder_inputs[l].name] = encoder_inputs[l]
for l in xrange(decoder_size):
  input_feed[self.decoder_inputs[l].name] = decoder_inputs[l]
  input_feed[self.target_weights[l].name] = target_weights[l]

# Since our targets are decoder inputs shifted by one, we need one more.
last_target = self.decoder_inputs[decoder_size].name
input_feed[last_target] = np.zeros([self.batch_size], dtype=np.int32)

# Output feed: depends on whether we do a backward step or not.
if not forward_only:
  output_feed = [self.updates[bucket_id],  # Update Op that does SGD.
                 self.gradient_norms[bucket_id],  # Gradient norm.
                 self.losses[bucket_id]]  # Loss for this batch.
else:
  output_feed = [self.losses[bucket_id]]  # Loss for this batch.
  for l in xrange(decoder_size):  # Output logits.
    output_feed.append(self.outputs[bucket_id][l])

outputs = session.run(output_feed, input_feed)
if not forward_only:
  return outputs[1], outputs[2], None  # Gradient norm, loss, no outputs.
else:
  return None, outputs[0], outputs[1:]  # No gradient norm, loss, outputs.

def get_batch(์ž์‹ , ๋ฐ์ดํ„ฐ, ๋ฒ„ํ‚ท ID):

encoder_size, decoder_size = self.buckets[bucket_id]
encoder_inputs, decoder_inputs = [], []

# Get a random batch of encoder and decoder inputs from data,
# pad them if needed, reverse encoder inputs and add GO to decoder.
for _ in xrange(self.batch_size):
  encoder_input, decoder_input = random.choice(data[bucket_id])

  # Encoder inputs are padded and then reversed.
  encoder_pad = [data_utils.PAD_ID] * (encoder_size - len(encoder_input))
  encoder_inputs.append(list(reversed(encoder_input + encoder_pad)))

  # Decoder inputs get an extra "GO" symbol, and are padded then.
  decoder_pad_size = decoder_size - len(decoder_input) - 1
  decoder_inputs.append([data_utils.GO_ID] + decoder_input +
                        [data_utils.PAD_ID] * decoder_pad_size)

# Now we create batch-major vectors from the data selected above.
batch_encoder_inputs, batch_decoder_inputs, batch_weights = [], [], []

# Batch encoder inputs are just re-indexed encoder_inputs.
for length_idx in xrange(encoder_size):
  batch_encoder_inputs.append(
      np.array([encoder_inputs[batch_idx][length_idx]
                for batch_idx in xrange(self.batch_size)], dtype=np.int32))

# Batch decoder inputs are re-indexed decoder_inputs, we create weights.
for length_idx in xrange(decoder_size):
  batch_decoder_inputs.append(
      np.array([decoder_inputs[batch_idx][length_idx]
                for batch_idx in xrange(self.batch_size)], dtype=np.int32))

  # Create target_weights to be 0 for targets that are padding.
  batch_weight = np.ones(self.batch_size, dtype=np.float32)
  for batch_idx in xrange(self.batch_size):
    # We set weight to 0 if the corresponding target is a PAD symbol.
    # The corresponding target is decoder_input shifted by 1 forward.
    if length_idx < decoder_size - 1:
      target = decoder_inputs[batch_idx][length_idx + 1]
    if length_idx == decoder_size - 1 or target == data_utils.PAD_ID:
      batch_weight[batch_idx] = 0.0
  batch_weights.append(batch_weight)
return batch_encoder_inputs, batch_decoder_inputs, batch_weights
```
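As an aside, the `single_cell()` factory in the snippet above is exactly what avoids the RNNCell-reuse ValueError this issue is about: a list comprehension over an *existing* cell object still shares one instance across layers, whereas calling a factory per layer creates independent cells. A minimal pure-Python sketch of the difference (`FakeCell` is an illustrative stand-in, not TF's API):

```python
# FakeCell mimics the relevant behavior of an RNNCell: the first call to
# build() pins the cell to a variable scope; building the same instance
# under a different scope raises, like TF's "Attempt to reuse RNNCell" error.

class FakeCell:
    def __init__(self):
        self.scope = None

    def build(self, scope):
        if self.scope is None:
            self.scope = scope  # first use pins the cell to this scope
        elif self.scope != scope:
            raise ValueError(
                "Attempt to reuse cell in scope %r, first used in %r"
                % (scope, self.scope))

# Broken: the list holds two references to the SAME object.
shared = FakeCell()
layers = [shared for _ in range(2)]  # equivalent to [shared] * 2
layers[0].build("cell_0")
try:
    layers[1].build("cell_1")  # same instance, second scope -> ValueError
except ValueError as e:
    print(e)

# Fixed: a factory returns a fresh, independent cell per layer.
layers = [FakeCell() for _ in range(2)]
layers[0].build("cell_0")
layers[1].build("cell_1")  # fine: each cell owns its own scope
```

The same reasoning is why a bidirectional RNN needs two separately constructed cell instances, one forward and one backward.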

์ด๊ฒƒ์€ ์Šคํƒ ์˜ค๋ฒ„ํ”Œ๋กœ์— ๋Œ€ํ•œ ์ข‹์€ ์งˆ๋ฌธ์ž…๋‹ˆ๋‹ค.


Okay! Thanks anyway! :)

@ebrevdo Is there any update on when the new seq2seq tutorial using the new API will come out?
Thanks, amazing work!

Yes, I'm waiting for the new tutorial too... would love to know if it's coming out soon.. @ebrevdo

์ปค๋„ ํ…Œ์ŠคํŠธ์—์„œ ์ฝ”๋“œ๋ฅผ ๊ฐ€์ ธ์˜ค๊ณ  ๊ธฐ์กด seq2seq๋กœ ๋น” ๊ฒ€์ƒ‰์„ ๊ฐœ์กฐํ•˜๋ ค๊ณ  ์‹œ๋„ํ–ˆ์ง€๋งŒ ๋„์ „์ ์ด์—ˆ์Šต๋‹ˆ๋‹ค...

Hoping it lands this week!


Hi everyone,

An update on this issue: the same thing happens with tensorflow 1.1-gpu for Mac OS X.

@tshi1983
I had the same problem with tensorflow 1.1-gpu for Ubuntu.
Upgrading to tf 1.2 still didn't work.
Then I changed the embedding_attention_seq2seq function in
tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py
as @fabiofumarola suggested above.
Training starts now. I haven't tested decoding yet.

์…€ ์ •์˜์˜ ์ฝ”๋“œ๋ฅผ seq2seq_f๋กœ ์ด๋™ํ•ฉ๋‹ˆ๋‹ค.

def seq2seq_f(encoder_inputs, decoder_inputs, do_decode):
      def single_cell():
        return tf.contrib.rnn.GRUCell(size)
      if use_lstm:
        def single_cell():
          return tf.contrib.rnn.BasicLSTMCell(size)
      cell = single_cell()
      if num_layers > 1:
        cell = tf.contrib.rnn.MultiRNNCell([single_cell() for _ in range(num_layers)])
      return tf.contrib.legacy_seq2seq.embedding_attention_seq2seq(
      ...
      )

๊ทธ๋Ÿฐ ๋‹ค์Œ "python translate.py --data_dir data/ --train_dir checkpoint/ --size=256 --num_layers=2 --steps_per_checkpoint=50"์ด ์ž‘๋™ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

@huxuanlai ์ž‘๋™ํ•ฉ๋‹ˆ๋‹ค! ์ ์–ด๋„ ์ง€๊ธˆ์€ ํ›ˆ๋ จ ์ค‘์ด์•ผ, thx!

@huxuanlai ์ €์—๊ฒŒ๋„ ํšจ๊ณผ์ ์ž…๋‹ˆ๋‹ค.

๋™์ผํ•œ AttributeError: 'NoneType' object has no attribute 'update' ๋ฅผ ๋ฐ›๊ณ  ์žˆ์ง€๋งŒ tf.contrib.legacy_seq2seq.model_with_buckets ์ž…๋‹ˆ๋‹ค. ์šฐ๋ถ„ํˆฌ 16.04 lts์—์„œ tf 1.2.1(GPU)์„ ์‹คํ–‰ ์ค‘์ž…๋‹ˆ๋‹ค.

์ด๊ฒƒ์€ ๋ฒ„ํ‚ท์ด 1๊ฐœ ์ด์ƒ์ธ ๊ฒฝ์šฐ์—๋งŒ ๋ฐœ์ƒํ•˜๋Š” ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค.

์ „์ฒด ์—ญ์ถ”์ :

Traceback (most recent call last):
  File "chatbot.py", line 262, in <module>
    main()
  File "chatbot.py", line 257, in main
    train()
  File "chatbot.py", line 138, in train
    model.build_graph()
  File "/home/jkarimi91/Projects/cs20/code/hw/a3/model.py", line 134, in build_graph
    self._create_loss()
  File "/home/jkarimi91/Projects/cs20/code/hw/a3/model.py", line 102, in _create_loss
    softmax_loss_function=self.softmax_loss_function)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 1206, in model_with_buckets
    decoder_inputs[:bucket[1]])
  File "/home/jkarimi91/Projects/cs20/code/hw/a3/model.py", line 101, in <lambda>
    lambda x, y: _seq2seq_f(x, y, False),
  File "/home/jkarimi91/Projects/cs20/code/hw/a3/model.py", line 76, in _seq2seq_f
    feed_previous=do_decode)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 848, in embedding_attention_seq2seq
    encoder_cell = copy.deepcopy(cell)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 174, in deepcopy
    y = copier(memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 476, in __deepcopy__
    setattr(result, k, copy.deepcopy(v, memo))
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 230, in _deepcopy_list
    y.append(deepcopy(a, memo))
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 190, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 334, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 257, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 190, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 334, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 257, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 190, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 334, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 257, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 190, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 334, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 257, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 257, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 230, in _deepcopy_list
    y.append(deepcopy(a, memo))
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 190, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 334, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 257, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 190, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 334, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 257, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 190, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 334, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 257, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 230, in _deepcopy_list
    y.append(deepcopy(a, memo))
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 237, in _deepcopy_tuple
    y.append(deepcopy(a, memo))
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 257, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 190, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 334, in _reconstruct
    state = deepcopy(state, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 163, in deepcopy
    y = copier(x, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 257, in _deepcopy_dict
    y[deepcopy(key, memo)] = deepcopy(value, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 190, in deepcopy
    y = _reconstruct(x, rv, 1, memo)
  File "/home/jkarimi91/Apps/anaconda2/envs/tf/lib/python2.7/copy.py", line 343, in _reconstruct
    y.__dict__.update(state)
AttributeError: 'NoneType' object has no attribute 'update'

@Tshzzz @jtubert
Thanks, your solution worked for me. My tf version is 1.1.0.

๋‚˜๋Š” ๋‹ค์Œ์—์„œ ๋ณ€๊ฒฝํ–ˆ๋‹ค:

    lstm_cell = tf.contrib.rnn.BasicLSTMCell(HIDDEN_SIZE, state_is_tuple=True)
    cell = tf.contrib.rnn.MultiRNNCell([lstm_cell for _ in range(NUM_LAYERS)])
    output, _ = tf.nn.dynamic_rnn(cell, X, dtype=tf.float32)  

to this:

    cells=[]
    for _ in range(NUM_LAYERS):
        cell = tf.contrib.rnn.BasicLSTMCell(HIDDEN_SIZE, state_is_tuple=True)
        cells.append(cell)
    multicell = tf.contrib.rnn.MultiRNNCell(cells, state_is_tuple=True)
    output, _ = tf.nn.dynamic_rnn(multicell, X, dtype=tf.float32)

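The change above works because the root cause is object identity, not TensorFlow itself: building a layer list from one cell instance makes every layer share the same weights/scope, while building a fresh cell per layer gives each its own. A minimal plain-Python sketch (the `ToyCell` class and its scope-tracking are illustrative stand-ins, not TF API):

```python
class ToyCell:
    """Stand-in for an RNN cell: binds its weights to the scope of its first call."""
    def __init__(self):
        self.scope = None  # records the scope of the first use

    def __call__(self, scope):
        if self.scope is None:
            self.scope = scope          # first use: bind to this scope
        elif self.scope != scope:       # reuse under a different scope
            raise ValueError(
                "Attempt to reuse cell first used in scope %r "
                "under scope %r" % (self.scope, scope))
        return "output"

# Broken pattern: one instance listed twice, like MultiRNNCell([cell] * n).
# The second layer runs under a new scope and triggers the error.
shared = [ToyCell()] * 2
assert shared[0] is shared[1]           # one object, two list slots
try:
    for i, c in enumerate(shared):
        c("cell_%d" % i)
except ValueError as e:
    print("shared:", e)

# Fixed pattern: distinct instances, like the list comprehension above.
distinct = [ToyCell() for _ in range(2)]
assert distinct[0] is not distinct[1]   # separate objects, separate weights
for i, c in enumerate(distinct):
    c("cell_%d" % i)                    # each layer binds to its own scope
```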
์ด๊ฒƒ์€ ์—ฌ์ „ํžˆ โ€‹โ€‹๊ณ ์ •๋˜์ง€ ์•Š์•˜์œผ๋ฉฐ ๊ฐ€๋Šฅํ•œ ๋ชจ๋“  ์†”๋ฃจ์…˜์„ ์‹œ๋„ํ–ˆ์ง€๋งŒ ์ด ์Šค๋ ˆ๋“œ์™€ stackoverflow์—์„œ ์–ธ๊ธ‰ํ•œ ์†”๋ฃจ์…˜์€ tensorflow 1.3 ๋˜๋Š” 1.2 ๋˜๋Š” 1.1์—์„œ ์ž‘๋™ํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

์ด ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ–ˆ์Šต๋‹ˆ๋‹ค.
TypeError: embedding_attention_seq2seq() missing 1 required positional argument: 'dec_cell'

์˜ค๋ฅ˜๋Š” seq2seq_model.py์˜ 142ํ–‰์ธ seq2seq_model.py์˜ ์ด ํ•จ์ˆ˜๋ฅผ ๊ฐ€๋ฆฌํ‚ต๋‹ˆ๋‹ค.

    def seq2seq_f(encoder_inputs, decoder_inputs, do_decode):
        return tf.contrib.legacy_seq2seq.embedding_attention_seq2seq(
            encoder_inputs,
            decoder_inputs,
            cell,
            num_encoder_symbols=source_vocab_size,
            num_decoder_symbols=target_vocab_size,
            embedding_size=size,
            output_projection=output_projection,
            feed_previous=do_decode,
            dtype=dtype)

์ด ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•˜์—ฌ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•œ ์‚ฌ๋žŒ์€ ์ด ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๋„๋ก ๋„์™€์ฃผ์„ธ์š”.

ValueError: Attempt to reuse RNNCell with a different variable scope than its first use. First use of cell was with scope 'rnn/multi_rnn_cell/cell_0/gru_cell', this attempt is with scope 'rnn/multi_rnn_cell/cell_1/gru_cell'. Please create a new instance of the cell if you would like it to use a different set of weights. If before you were using: MultiRNNCell([GRUCell(...)] * num_layers), change to: MultiRNNCell([GRUCell(...) for _ in range(num_layers)]). If before you were using the same cell instance as both the forward and reverse cell of a bidirectional RNN, simply create two instances (one for forward, one for reverse). In May 2017, we will start transitioning this cell's behavior to use existing stored weights, if any, when it is called with scope=None (which can lead to silent model degradation, so this error will remain until then.)

Original code:

    from tensorflow.contrib import rnn

    inputs = tf.placeholder(dtype=tf.int32, shape=[None, None], name="inputs")
    keep_prob = tf.placeholder(dtype=tf.float32, name="keep_prob")
    cell = rnn.GRUCell(10)
    cell = rnn.DropoutWrapper(cell=cell, input_keep_prob=keep_prob)
    cell = rnn.MultiRNNCell([cell for _ in range(5)], state_is_tuple=True)

    outputs, states = tf.nn.dynamic_rnn(cell=cell, inputs=look_up, dtype=tf.float32)

Solution:

    inputs = tf.placeholder(dtype=tf.int32, shape=[None, None], name="inputs")
    keep_prob = tf.placeholder(dtype=tf.float32, name="keep_prob")
    cell = rnn.MultiRNNCell([rnn.DropoutWrapper(rnn.GRUCell(10), input_keep_prob=keep_prob) for _ in range(5)], state_is_tuple=True)
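Note why the original code fails even though `DropoutWrapper` is applied: the wrapper owns no weights of its own, so listing one wrapped cell five times still reuses the single inner GRU cell across all five layer scopes. A plain-Python sketch of that delegation (the `ToyCell`/`ToyWrapper` classes are illustrative stand-ins, not TF API):

```python
class ToyCell:
    """Stand-in for an RNN cell: binds its weights to the scope of its first call."""
    def __init__(self):
        self.scope = None
    def __call__(self, scope):
        if self.scope is None:
            self.scope = scope
        elif self.scope != scope:
            raise ValueError("cell reused in a different scope")
        return "output"

class ToyWrapper:
    """Stand-in for DropoutWrapper: owns no weights, just delegates to the inner cell."""
    def __init__(self, cell):
        self.cell = cell
    def __call__(self, scope):
        return self.cell(scope)

# Broken pattern: one inner cell, wrapped once, listed five times.
inner = ToyCell()
layers = [ToyWrapper(inner) for _ in range(5)]   # five wrappers, ONE cell
failed = False
try:
    for i, layer in enumerate(layers):
        layer("cell_%d" % i)
except ValueError:
    failed = True
assert failed   # the shared inner cell collides across layer scopes

# Fixed pattern, as in the solution above: a fresh cell inside every wrapper.
layers = [ToyWrapper(ToyCell()) for _ in range(5)]
for i, layer in enumerate(layers):
    layer("cell_%d" % i)   # each layer has its own inner cell and scope
```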

Is this issue present in the tf nightlies?

On Oct 1, 2017, 8:34 AM, "Baohua Zhou" notifications@github.com wrote:

I have the same issue when using tensorflow 1.1 on CPU with iOS.

โ€”
๋‹น์‹ ์ด ์–ธ๊ธ‰๋˜์—ˆ๊ธฐ ๋•Œ๋ฌธ์— ์ด๊ฒƒ์„ ๋ฐ›๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.
์ด ์ด๋ฉ”์ผ์— ์ง์ ‘ ๋‹ต์žฅํ•˜๊ณ  GitHub์—์„œ ํ™•์ธํ•˜์„ธ์š”.
https://github.com/tensorflow/tensorflow/issues/8191#issuecomment-333384725 ,
๋˜๋Š” ์Šค๋ ˆ๋“œ ์Œ์†Œ๊ฑฐ
https://github.com/notifications/unsubscribe-auth/ABtimwOv7vf5vvFXBllbZryjCFwmJcU6ks5sn7DxgaJpZM4MWl4f
.

AttributeError: 'NoneType' object has no attribute 'update'

With tf=1.3:

ValueError: Attempt to reuse RNNCell with a different variable scope than its first use. First use of cell was with scope 'embedding_attention_seq2seq/rnn/multi_rnn_cell/cell_0/gru_cell', this attempt is with scope 'embedding_attention_seq2seq/rnn/multi_rnn_cell/cell_1/gru_cell'. Please create a new instance of the cell if you would like it to use a different set of weights. If before you were using: MultiRNNCell([GRUCell(...)] * num_layers), change to: MultiRNNCell([GRUCell(...) for _ in range(num_layers)]). If before you were using the same cell instance as both the forward and reverse cell of a bidirectional RNN, simply create two instances (one for forward, one for reverse). In May 2017, we will start transitioning this cell's behavior to use existing stored weights, if any, when it is called with scope=None (which can lead to silent model degradation, so this error will remain until then.)

14์ผ ๋™์•ˆ ํ™œ๋™์ด ์—†์—ˆ์œผ๋ฉฐ awaiting tensorflower ๋ ˆ์ด๋ธ”์ด ์ง€์ •๋˜์—ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ์— ๋”ฐ๋ผ ๋ ˆ์ด๋ธ” ๋ฐ/๋˜๋Š” ์ƒํƒœ๋ฅผ ์—…๋ฐ์ดํŠธํ•˜์‹ญ์‹œ์˜ค.


ํ•ด๊ฒฐ์ฑ…์€ ์ตœ์‹  ๋ฒ„์ „์˜ TF๋กœ ์ด๋™ํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค. ์ด ์Šค๋ ˆ๋“œ๋Š” ์›๋ž˜ ๋ฌธ์ œ์—์„œ ํฌ๊ฒŒ ๋ฒ—์–ด๋‚ฌ์Šต๋‹ˆ๋‹ค. ํ์‡„.

์ฆ‰๊ฐ์ ์ธ ์†”๋ฃจ์…˜์„ ์›ํ•œ๋‹ค๋ฉด ๋‚ด๊ฐ€ ์‹œ๋„ํ•œ ๊ฒƒ์„ ์‹œ๋„ ํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

pip install tensorflow==1.0
๋ฌธ์ œ๋Š” tenorflow 1.1 ๋ฒ„์ „์— ์žˆ์Šต๋‹ˆ๋‹ค. ์ €์—๊ฒŒ ํšจ๊ณผ์ ์ด์—ˆ์Šต๋‹ˆ๋‹ค.

์ด ํŽ˜์ด์ง€๊ฐ€ ๋„์›€์ด ๋˜์—ˆ๋‚˜์š”?
0 / 5 - 0 ๋“ฑ๊ธ‰