Many non-experts are using the code from http://stackoverflow.com/questions/33949786/how-could-i-use-batch-normalization-in-tensorflow?answertab=votes#tab-top.
Given how important batch normalization is for training DNNs, it would be convenient to have an official layer for it.
I've been working with parts of it myself. There is a batch_norm layer here:
https://github.com/tensorflow/tensorflow/blob/b826b79718e3e93148c3545e7aa3f90891744cc0/tensorflow/contrib/layers/python/layers/layers.py#L100
I think there is something wrong with this layer. During training everything is fine and the loss decreases very nicely. But at test time the accuracy drops to zero.
Specifically, the test accuracy is zero when using is_training=False.
I know batch normalization behaves differently in the training phase and the test phase (see this Quora answer), but I think this implementation is unclear.
Same here; I got unexpected behavior with is_training=False. What is the correct way to change this flag? I'm currently using tf.cond, because the layer doesn't take tf.placeholders by itself.
@pawni is_training has to be a Python boolean value. It can't be a tf.cond.
@ppwwyyxx well, I'm doing tf.cond(placeholder, batch_norm(.., is_training=True), batch_norm(.., is_training=False)). Or should one do batch_norm(.., is_training=variable) and change that variable outside the graph when needed? Or do you mean one should do batch_norm(.., is_training=tf.cond(placeholder))? That doesn't seem right either.
There may be a problem with your current approach as well. You need to double-check that the two batch_norm ops you create share the same scope, otherwise they won't share the underlying mean/variance statistics.
The reuse argument may help with that, though I may be using my own version of the bn layer.
Same scope and reuse=True. Hmm, it seems to work for me, I think. It would be great if the layer documentation could briefly explain how best to handle the switch from training to test.
@sguada FYI
Currently batch_norm requires a Python boolean, but we are working on adding the option to pass a Tensor.
@pawni If you don't want to worry about updating moving_mean and moving_variance, set updates_collections=None to make sure they are updated in place. Otherwise you need to make sure the update_ops that get added to tf.GraphKeys.UPDATE_OPS are run during training.
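For concreteness, a minimal sketch of the two options (the names x, bn1/bn2 and the toy loss are illustrative, not from the thread):

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 10])

# Option 1: updates_collections=None forces the moving-average updates
# in place, as part of the forward pass.
net1 = tf.contrib.layers.batch_norm(x, is_training=True,
                                    updates_collections=None, scope='bn1')

# Option 2 (the default): updates go into tf.GraphKeys.UPDATE_OPS and
# must be run together with the train step.
net2 = tf.contrib.layers.batch_norm(x, is_training=True, scope='bn2')
loss = tf.reduce_mean(tf.square(net2))
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)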
I think tensorflow needs two hyper-methods that change the state of the model, the way Torch does. They would just switch the model's state. That seems very easy to me.
Is there a small script of a very simple NN showing the "official" proper way to use the BN layer? I'd really appreciate it.
Sorry if this is slightly repetitive, but it seems the API discusses BN in a different interface: https:
Isn't that the official way to use BN? I'm confused about how to use it, SO seems outdated, and there's a layer at a link different from the API. How exactly does one do it? I wasn't sure whether to go to SO or ask here.
Sorry for the spam, but what's wrong with just using something like this:
def standard_batch_norm(l, x, n_out, phase_train, scope='BN'):
    """
    Batch normalization on feedforward maps.
    Args:
        x: Vector
        n_out: integer, depth of input maps
        phase_train: boolean tf.Variable, true indicates training phase
        scope: string, variable scope
    Return:
        normed: batch-normalized maps
    """
    with tf.variable_scope(scope + l):
        #beta = tf.Variable(tf.constant(0.0, shape=[n_out], dtype=tf.float64), name='beta', trainable=True, dtype=tf.float64)
        #gamma = tf.Variable(tf.constant(1.0, shape=[n_out], dtype=tf.float64), name='gamma', trainable=True, dtype=tf.float64)
        init_beta = tf.constant(0.0, shape=[n_out], dtype=tf.float64)
        init_gamma = tf.constant(1.0, shape=[n_out], dtype=tf.float64)
        beta = tf.get_variable(name='beta'+l, dtype=tf.float64, initializer=init_beta, regularizer=None, trainable=True)
        gamma = tf.get_variable(name='gamma'+l, dtype=tf.float64, initializer=init_gamma, regularizer=None, trainable=True)
        batch_mean, batch_var = tf.nn.moments(x, [0], name='moments')
        ema = tf.train.ExponentialMovingAverage(decay=0.5)
        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)
        mean, var = tf.cond(phase_train, mean_var_with_update, lambda: (ema.average(batch_mean), ema.average(batch_var)))
        normed = tf.nn.batch_normalization(x, mean, var, beta, gamma, 1e-3)
    return normed
Then it's simple to tell tensorflow which one to use via the feed dictionary, e.g.:
feed_dict = {x: Xminibatch, y_: Yminibatch, phase_train: True}
sess.run(fetches=[merged,train_step], feed_dict=feed_dict)
I'd like suggestions on whether my implementation is wrong or unclear (note it's easy to extend to convolutions etc.; I just didn't paste that code).
@pawni @ppwwyyxx did you decide whether reuse should be set to true to solve the scoping issue?
@brando90 currently I'm doing something like:
def BatchNorm(inputT, is_training=True, scope=None):
    return tf.cond(isTraining,
                   lambda: batch_norm(inputT, is_training=True,
                                      center=False, updates_collections=None, scope=scope),
                   lambda: batch_norm(inputT, is_training=False,
                                      updates_collections=None, center=False, scope=scope, reuse=True))
However, I think #3265 basically wants to implement it like that. The dropout implementation can be used as a reference: https:
With updates_collections=None the updates happen in place, and it's easier to use a tf.cond() and let is_training be a Tensor. That's compared with the case where the updates are delayed and the update_ops are run later. I'll try to get the first part in soon.
@brando90 @pawni his code doesn't quite work; it needs to be changed as follows:
def BatchNorm(inputT, is_training=True, scope=None):
    # Note: is_training is tf.placeholder(tf.bool) type
    return tf.cond(is_training,
                   lambda: batch_norm(inputT, is_training=True,
                                      center=False, updates_collections=None, scope=scope),
                   lambda: batch_norm(inputT, is_training=False,
                                      updates_collections=None, center=False, scope=scope, reuse=True))
And when running at training or test time:
# when training
sess.run([opt, loss], feed_dict={x: bx, y: by, is_training: True})
# when testing
sess.run([opt, loss], feed_dict={x: bx, y: by, is_training: False})
This code works. And as #3265 says, it would be great if tf.contrib.layers.batch_norm could take the is_training variable as a tf.placeholder.
@nmhkahn @pawni thanks for the code snippets. They were very helpful for adding batch normalization to my convolutional network. Training seems to work very well; testing does not. In some versions of the code the training accuracy is much higher than the test accuracy, which probably means the batch-norm parameters aren't being shared. In other versions of the code I get "ValueError: Variable conv1/beta already exists, disallowed. Did you mean to set reuse=True in VarScope?", which seems to indicate it's trying to re-create the parameter... when I'm trying to reuse it.
Could someone provide an example of how to call the BatchNorm function during training and testing so that variable sharing is handled correctly?
Thanks for the help.
Update July 25, 2016:
@nmhkahn @pawni thanks for the comments. After looking at the contrib code more closely, I understood what my problem was. During training and testing we update or reuse four variables (beta, gamma, moving_mean, moving_variance), and these have to be unique per layer, so I needed to set a scope for each layer. I did it like this:
conv1 = tf.nn.relu(batch_norm_layer(conv2d_stride2_valid(data, W_conv1) + b_conv1, train_phase, scope="conv1"))
where batch_norm_layer is similar to the examples from @nmhkahn @pawni, conv2d_stride2_valid is just a def for defining a convolutional layer, and W_conv1 and b_conv1 are variables holding the weights and the bias. Since I'm using batch normalization, the bias term could probably be removed.
The network works well now. I noticed that the test accuracy only starts rising after the training accuracy does; in retrospect that makes sense, since it takes a while to collect the dataset statistics used for testing. But in my first tests it looked as if something was wrong. Thanks for the comments and for making batch normalization available to the community.
@nmhkahn how is that different from pawni's suggestion?
@brando90 nmhkahn's version fixes a small error (isTraining → is_training).
@diegoAtAlpine I ran into the same problems, though I'm not sure why.
@nmhkahn @pawni thank you!
sess.run([opt, loss], feed_dict={x: bx, y: by, is_training: True})

Doesn't this mean you are using is_training as a placeholder? I've seen people comment that is_training should be a placeholder, but here is my version of it:
def batch_norm_layer(x, train_phase, scope_bn):
    bn_train = batch_norm(x, decay=0.999, center=True, scale=True,
                          is_training=True,
                          reuse=None,  # is this right?
                          trainable=True,
                          scope=scope_bn)
    bn_inference = batch_norm(x, decay=0.999, center=True, scale=True,
                              is_training=False,
                              reuse=True,  # is this right?
                              trainable=True,
                              scope=scope_bn)
    z = tf.cond(train_phase, lambda: bn_train, lambda: bn_inference)
    return z
Is this not correct?
I already extended tf.contrib.layers.batch_norm so that a Tensor or placeholder can be passed for is_training. It will be merged into TF contrib soon and will be available at:
https://github.com/tensorflow/tensorflow/commit/9da5fc8e6425cabd61fc36f0dcc1823a093d5c1d#diff-94bbcef0ec8a5cdef55f705e99c2b2ed
Is it just me, or does adding this BN layer slow down training of a single epoch considerably?
@brando90 It slows training down for me as well, but I think that's expected since it needs to compute some statistics. And your version looks fine to me.
@nmhkahn quick question. When you wrote (for testing):
sess.run([opt, loss], feed_dict={x: bx, y: by, is_training: False})
can bx in theory be any dataset? I.e. even data the model wasn't trained on?
@brando90 That's right.
I'm also confused about the is_training and reuse flags. I made a program following the CIFAR example, where the code is structured like CIFAR's, and I run it multi-GPU style (for training). So I have one script for training (like cifar10_multigpu.py) and one for testing (like cifar10_eval.py).
So:

for ii in xrange(2):  # Num of GPU
    with tf.device('/gpu:%d' % ii):
        with tf.name_scope('device_%d' % ii) as scope:
            data_batch, label_batch = factory.GetShuffleBatch(batch_size)
            unnormalized_logits = factory.MyModel(dataBatch=data_batch, numClasses=numClasses,
                                                  isTraining=True)
            # More stuff happening
            tf.get_variable_scope().reuse_variables()
Inference happens in the function MyModel. (Below is an example of the function; in reality I use more layers and neurons.)
def MyModel(data_batch, num_classes, feature_dim):
    # Hidden Layer 1
    with tf.variable_scope('hidden1') as scope:
        weights = variable_on_cpu('weights', [feature_dim, 256], tf.truncated_normal_initializer(stddev=0.04))
        biases = variable_on_cpu('biases', [256], tf.constant_initializer(0.001))
        hidden1 = tf.nn.relu(tf.matmul(data_batch, weights) + biases, name=scope.name)
    # Hidden Layer 2
    with tf.variable_scope('hidden2') as scope:
        weights = variable_on_cpu('weights', [256, 256], tf.truncated_normal_initializer(stddev=0.04))
        biases = variable_on_cpu('biases', [256], tf.constant_initializer(0.001))
        hidden2 = tf.nn.relu(tf.matmul(hidden1, weights) + biases, name=scope.name)
    # output, unnormalized softmax
    with tf.variable_scope('softmax_unnorm') as scope:
        weights = variable_on_cpu('weights', [256, num_classes], tf.truncated_normal_initializer(stddev=1/num_classes))
        biases = variable_on_cpu('biases', [num_classes], tf.constant_initializer(0.0))
        softmax_un = tf.add(tf.matmul(hidden2, weights), biases, name=scope.name)
    return softmax_un
But now, if I want to perform batch normalization, would I do this?
def MyModel(data_batch, num_classes, feature_dim, isTraining):
    with tf.variable_scope('bnormalization') as scope:
        norm_data_batch = tcl.batch_norm(inputs=dataBatch, epsilon=0.0001, is_training=isTraining,
                                         reuse=True, scope=scope)
    # Hidden Layer 1
    with tf.variable_scope('hidden1') as scope:
        weights = variable_on_cpu('weights', [feature_dim, 256], tf.truncated_normal_initializer(stddev=0.04))
        biases = variable_on_cpu('biases', [256], tf.constant_initializer(0.001))
        hidden1 = tf.nn.relu(tf.matmul(data_batch, weights) + biases, name=scope.name)
In the training phase I got the following error:

Variable bnormalization/beta does not exist, disallowed. Did you mean to set reuse=None in VarScope?
From what I've read in this thread, reuse=None should be used in the training phase. Is that part correct? If so, since I'm using two GPUs, should I use reuse=None on the first GPU and reuse=True on the second? Or, since I run tf.get_variable_scope().reuse_variables(), is that handled automatically?
Finally, in the test phase, should I use is_training=False and reuse=True?
Any help is greatly appreciated.
Now tf.contrib.layers.batch_norm accepts a Tensor, Variable, or placeholder as is_training:
https://github.com/tensorflow/tensorflow/commit/9da5fc8e6425cabd61fc36f0dcc1823a093d5c1d#diff-94bbcef0ec8a5cdef55f705e99c2b2ed
Is it normal for batch normalization to make an experiment worse? I tried it on a 2-layer NN based on the MNIST beginners tutorial and I consistently get worse results when BN is present (compared with 0.9477 without it).
My script is here: https://github.com/brando90/tensor_flow_experiments/blob/master/tf_tutorials/beginner_tutorial_MNIST_BN.py
Has anyone experienced these issues, or is BN just like this and I need to do something else to make it work?
With the latest version of tf.contrib.layers.batch_norm, the important thing is to pass updates_collections=None so that moving_mean and moving_variance are updated in place; otherwise you need to gather the update_ops and make sure they are run.
I'd encourage you to build your model using tf.contrib.layers or tf.contrib.slim:
slim = tf.contrib.slim

def build_NN_two_hidden_layers(x, is_training):
    batch_norm_params = {'is_training': is_training, 'decay': 0.9, 'updates_collections': None}
    with slim.arg_scope([slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.contrib.layers.xavier_initializer(),
                        biases_initializer=tf.constant_initializer(0.1),
                        normalizer_fn=slim.batch_norm,
                        normalizer_params=batch_norm_params):
        net = slim.fully_connected(x, 50, scope='A1')
        net = slim.fully_connected(net, 49, scope='A2')
        y = slim.fully_connected(net, 10, activation_fn=tf.nn.softmax, normalizer_fn=None, scope='A3')
    return y
@sguada I changed my old version (which used tf.cond to manually indicate whether it was training), and accuracy seems to be back up to 95. Why did updates_collections need to be changed to None? Could you explain why it makes such a large accuracy difference? It seems like an important change (and if it matters this much, maybe None should be the default?). Thanks! :)
Also, you mentioned it could be a placeholder and didn't have to be done manually. But when I pass a placeholder for is_training I get:
TypeError: Using a `tf.Tensor` as a Python `bool` is not allowed. Use `if t is not None:` instead of `if t:` to test if a tensor is defined, and use the logical TensorFlow ops to test the value of a tensor.
and it points at the batch_norm code. It would be good to show how this placeholder is supposed to be used, because right now I don't seem to understand how. Thanks :)
@brando90
The relevant part of the code is at L227-256. As you'll notice, there is a with ops.control_dependencies statement that forces the updates. I think the default, for code that gets used out of the box, should be None.
Regarding my comment above in #1122: since tf.get_variable_scope().reuse_variables() takes care of the issue, the reuse argument of batch_norm must be None in the training phase. This is related to the variable_op_scope statement (read its documentation in tensorflow).
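To illustrate how reuse_variables() interacts with batch_norm's variables, a sketch (my_model is a made-up stand-in for a model function):

import tensorflow as tf

def my_model(x, is_training):
    # beta/gamma/moving_mean/moving_variance are created through
    # tf.get_variable, so they follow the enclosing variable scope.
    return tf.contrib.layers.batch_norm(x, is_training=is_training,
                                        updates_collections=None, scope='bn')

with tf.variable_scope('model') as vs:
    x0 = tf.placeholder(tf.float32, [None, 10])
    tower0 = my_model(x0, is_training=True)   # creates the variables
    vs.reuse_variables()                      # equivalent to reuse=True from here on
    x1 = tf.placeholder(tf.float32, [None, 10])
    tower1 = my_model(x1, is_training=True)   # shares the same variables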
Using batch_norm with a tf.placeholder:
x = tf.placeholder(tf.float32, [None, 784])
is_training = tf.placeholder(tf.bool, [], name='is_training')
y = build_NN_two_hidden_layers(x, is_training)
# For training
sess.run(y, {is_training: True, x: train_data})
# For eval
sess.run(y, {is_training: False, x: eval_data})
The problem before was that moving_mean and moving_variance were not updated after each step; when updates_collections is None the updates are forced as part of the computation.
However, if your network has many batch_norm layers, it is more efficient to collect all the update ops and run them together, so each layer doesn't need to wait for its updates to finish.
y = build_model_with_batch_norm(x, is_training)
update_ops = tf.group(*tf.get_collection(tf.GraphKeys.UPDATE_OPS))
sess.run([y, update_ops])
Has there been any progress on speeding this up relative to the baseline?
For an autoencoder task I tried to use batch norm in a 2-layer densely connected NN (whitened MNIST dataset, relu units), but I keep getting NaN errors. Does anyone know why this could be? Is this even possible with BN? It seems fishy, but it didn't seem to be my learning setup or rate (though I would have thought BN acts as a kind of regularizer against that, so maybe not).
@sguada I don't understand the correct usage of batch_norm, particularly with respect to the updates_collections flag. If I understand correctly, setting the flag to None makes the network less efficient, so one should leave updates_collections=tf.GraphKeys.UPDATE_OPS and collect all the batch_norm updates and run them together.
The batch_norm updates are collected by running update_ops = tf.group(*tf.get_collection(tf.GraphKeys.UPDATE_OPS)).
I have many different models using different batch_norm layers. Would this work correctly?:
# model 1
y1 = build_model_with_batch_norm(x, is_training)
update_ops1 = tf.group(*tf.get_collection(tf.GraphKeys.UPDATE_OPS))
sess.run([y1, update_ops1])

# model 2
y2 = build_model_with_batch_norm(x, is_training)
update_ops2 = tf.group(*tf.get_collection(tf.GraphKeys.UPDATE_OPS))
sess.run([y2, update_ops2])
Could you explain this part a bit more? Thanks in advance.
Just put them in different collection keys:
# While building your 1st model...
tf.contrib.layers.batch_norm(..., updates_collections="updates-model1")
# same for 2nd model with key "updates-model2"

# model 1
y1 = build_model_with_batch_norm(x, is_training)
update_ops1 = tf.group(*tf.get_collection("updates-model1"))
sess.run([y1, update_ops1])

# model 2
y2 = build_model_with_batch_norm(x, is_training)
update_ops2 = tf.group(*tf.get_collection("updates-model2"))
sess.run([y2, update_ops2])
Additionally, the documentation is probably outdated. It says to do the following:
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
if update_ops:
    updates = tf.group(update_ops)
    total_loss = control_flow_ops.with_dependencies([updates], total_loss)
But!
Edit:
The documentation should be updated to something like:
from tensorflow.python import control_flow_ops

update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
if update_ops:
    updates = tf.tuple(update_ops)
    total_loss = control_flow_ops.with_dependencies(updates, total_loss)
Edit 2:
After doing some runs with my network, I can say that manually fetching tf.GraphKeys.UPDATE_OPS during graph construction, in contrast to using updates_collections=None, shows no performance difference. Edit: it's hard to say whether my results are correct, but the network as a whole did indeed get about 1.5x faster. As far as I know, the BN statistics are computed on the CPU, not the GPU.
Has anyone seen performance benefits? Please share your results :)
Back to the performance question: is the current batch norm layer completely unable to benefit from GPUs? Has anyone seen GPU benefits with this batch norm implementation?
You can test it for yourself:
https://github.com/tensorflow/tensorflow/blob/4addf4b5806cd731949c6582a83f5824599cd1ef/tensorflow/python/ops/batch_norm_benchmark.py
Sorry for the spam, but the documentation doesn't really explain how to use this BN with convolutions (should that be provided somewhere?). In short: how does it know that it should apply and learn the same parameters per feature map (rather than per activation)?
(Is there at least a code snippet that does this?)
The slim batch_norm wrapper normalizes over the last dimension of your input tensor. So for a 2D input tensor coming from a fully connected layer, it normalizes over the batch and hence performs per-activation normalization. For a 4D tensor coming from a convolution, it normalizes over the first three dimensions (batch, width, depth) and hence performs per-feature normalization. @sguada maybe can be a bit more descriptive about this.
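In other words, the axis convention is the same as when calling tf.nn.moments directly (the shapes below are made up):

import tensorflow as tf

# 2D input from a fully connected layer: normalize over the batch axis,
# giving one mean/variance per activation.
fc = tf.placeholder(tf.float32, [None, 256])             # [batch, units]
fc_mean, fc_var = tf.nn.moments(fc, axes=[0])            # shapes [256]

# 4D input from a convolution: normalize over batch, height and width,
# giving one mean/variance per feature map.
conv = tf.placeholder(tf.float32, [None, 28, 28, 64])    # NHWC
conv_mean, conv_var = tf.nn.moments(conv, axes=[0, 1, 2])  # shapes [64]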
@nmhkahn regarding your code snippet: why is reuse set to None when is_training=True? Doesn't that mean the scaling parameter gamma and the offset parameter beta get re-initialized at every training step? In the original paper, I thought beta and gamma are learned along with the original model parameters. For that, shouldn't they be initialized only once and then reused at every training step?
tf.cond(is_training,
        lambda: batch_norm(inputT, is_training=True, updates_collections=None, scope=scope),
        lambda: batch_norm(inputT, is_training=False, updates_collections=None, scope=scope, reuse=True))
Many thanks to the TF team for their work on making batch_norm available and effective. From my searching, this thread is the best resource on how to use it. Various issues and ideas are flying around here, and it's hard to work out the consensus advice for the simplest, standard use case of the batch_norm layer. I think there would be a lot of value in expanding the documentation to specify the exact recommended usage.
My first attempt at figuring it out led me to the following code:
is_training_ph = tf.placeholder(tf.bool)
...
with tf.variable_scope('bn_test_layer') as vs:
    layer_output = tf.cond(is_training_ph,
        lambda: tf.contrib.layers.batch_norm(layer_input, is_training=True, center=True, scale=True, activation_fn=tf.nn.relu, updates_collections=None, scope=vs),
        lambda: tf.contrib.layers.batch_norm(layer_input, is_training=False, center=True, scale=True, activation_fn=tf.nn.relu, updates_collections=None, scope=vs, reuse=True))
Then I set is_training_ph to True for training and False for testing. This doesn't work for me: the model trains well, but the test performance is terrible. In contrast, if I keep is_training_ph = True at test time, it works well. Thus I presume there is still a scope problem and it isn't finding the appropriate existing variables.
@davek44 I'm using the same code framework you're using and I observe the same thing: with is_training=True on during the training phase and is_training=False for the validation and/or test phase, the model trains just as the paper describes (it converges faster and I can use a larger learning rate), but the test performance is terrible. If I leave is_training=True on all the time, the model trains the same as if no batch norm layer were inserted. I haven't figured out what's wrong; I plan to monitor the parameters with TensorBoard. Please update if you diagnose the cause of this behavior.
tf.contrib.layers.batch_norm can take a Tensor as is_training, so there's no need to do anything special:
is_training_ph = tf.placeholder(tf.bool)
outputs = tf.contrib.layers.batch_norm(layer_input, is_training=is_training_ph, center=True, scale=True, activation_fn=tf.nn.relu, updates_collections=None, scope='batch_norm')
Test performance degrades in the same way with this code.
It's impossible to know without more detail, but my guess is that training ran for only a few iterations, so the moving_mean and moving_variance haven't converged yet.
You can change the batch_size during testing and check how performance degrades as you make it smaller.
> Test performance degrades in the same way with this code.
I ran into exactly the same problem feeding is_training as a placeholder, with both tf.slim batchnorm and tf.cond.
In the former case, when I inspected the trained model, the moving mean and moving variance consisted entirely of zeros.
In the latter case, the moving mean and variance look more reasonable (with different values), but if I use is_training=False at test time, performance is very bad. With is_training=True it works better, but I think that just uses the mean and variance within the test batch.
@nmduc @davek44 I wrote some code to track the moving mean and moving variance computed by tf.contrib.layers.batch_norm during training and testing. I found the value of decay matters a lot (it uses an exponential decay to compute the moving mean and moving variance): with a decay setting closer to 1.0 (i.e. decay=.999), the moving mean drops toward 0. I did two test runs with exactly the same code but different decay settings in tf.contrib.layers.batch_norm, and the validation/test accuracy looked much more reasonable with the smaller decay.
Test-run results with decay=0.9:
Test-run results with decay=0.999 (decay=0.999 is the default setting of tf.contrib.layers.batch_norm):
(Also, it seems a larger decay value requires the model to be trained longer to see the change in validation accuracy.)
Maybe that fixes it. @zhongyuk thanks for sharing your analysis.
I'd also encourage the developers to consider making decay=0.9 the default. Even 0.99 doesn't work for me. 0.9 is also the default value in Torch's implementation; see the momentum parameter at https://github.com/torch/nn/blob/master/BatchNormalization.lua
@zhongyuk Thanks for sharing. It works for me now too.
This seems important. @sguada should we send a PR documenting the fact that decay may need to be lowered significantly from 1.0 for correct behavior here? I was confident this parameter never needed fine-tuning, but it may be a side effect of my distributed setup.
We could also change the default to 0.9, or document its impact on small datasets or small numbers of updates more properly.
@vincentvanhoucke In our distributed setup we usually do millions of updates, so it's fine, but in other cases like this one, where only a few hundred updates are done, it makes a big difference.
For example, using decay=0.999 leaves a bias of 0.36 after 1000 updates; that bias drops to 0.000045 after 10000 updates and to 0.0 after 50000 updates.
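Those numbers are just the exponential factor decay ** num_updates, which you can sanity-check in plain Python:

# Residual weight of the initial value after n updates, i.e. how biased
# the moving average still is toward its (zero) initialization.
for decay in (0.999, 0.99, 0.9):
    for n in (100, 1000, 10000, 50000):
        print('decay=%s after %d updates: %.6g' % (decay, n, decay ** n))
# decay=0.999: 0.999**1000 ~ 0.37, 0.999**10000 ~ 4.5e-05, 0.999**50000 ~ 2e-22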
Note that there's also the issue of test performance degrading when using a small batch size (anything smaller than, say, 10, rather than the 200 I used for training, lowers the test accuracy). I switched between test/training mode using a tf.placeholder.
ãã®ãããæ£èŠåã¬ã€ã€ãŒããã¬ãŒãã³ã°ã®åæãæ¹åããããã«æ©èœããããšã¯çŽ æŽãããããšã§ãããã¢ãã«ãæ¬çªç°å¢ã«é©çšã§ããªãå Žåã¯ãããã䜿çšããŠãããŸãæå³ããããŸããã ãã®ããããã«ã ã¬ã€ã€ãŒã䜿çšããŠãå°ããªããŒã¿ãµã³ãã«ãŸãã¯åäžã®ããŒã¿ãµã³ãã«ã§è¯å¥œãªãã¹ãããã©ãŒãã³ã¹ã確èªã§ãã人ã¯ããŸããïŒ
If you use is_training=False with small batches, or even batch_size=1, test performance should be good, since it uses the statistics learned during training rather than the batch statistics. You need to make sure the statistics have converged; with the default decay=0.999 that means at least 50k updates.
To follow up on the TF developer's confirmation, I tracked the convergence of the statistics with two different decay settings (and training batch_size=1). With decay=0.99, the statistics converge (bias < 0.001) after 550-600 steps of training/updates. With decay=0.9, the statistics converge (bias < 0.001) within 100 steps of training/updates.
@sguada Thanks. Does that mean the output shouldn't actually depend on the batch size? I'm noticing very slight changes that have a large impact on accuracy (well, my definition of performance makes it sensitive to such slight changes). To be precise, all the values of my 128-dimensional output tensor grow in magnitude, and the total length of the vector scales almost linearly with the batch size. Per value that isn't a huge difference, but it has a big effect when computing vector distances in a latent space.
@zhongyuk Thanks. I ran about 5k updates with decay=0.9, so it should have converged, and the test with a large batch size performs fine. But even if it hadn't converged, wouldn't I see the same difference at training time?
I'll investigate further and check whether I can reproduce the issue on a different task. Thanks for the quick feedback so far!
@dominikandreas If the poor test performance is caused by the statistics not converging, training performance would be quite good while test performance is bad, because during training batch normalization is done using only the training batch statistics. At test time, however, all the accumulated moving-average statistics from the training batches are used to normalize the input tensor.
I found an error in my code; batch normalization is working fine now :-) Thanks for your support.
Hi @zhongyuk, how did you track the moving mean and variance?
Thanks!
@rogertrullo Generally I set up TensorBoard to track the moving mean and variance. Apart from that, I also tried fetching the statistics through tf.get_variable("moving_mean") within the scope, during both training and inference, to monitor the bias.
Hi,
I have the same issue as described here: training results are good, but validation/test results are bad after using batch_norm.
I use the function like this:
conv_normed1 = tf.contrib.layers.batch_norm(conv1 + block1_layer3_1_biases, updates_collections=None, scale=True, decay=batch_norm_decay, center=True, is_training=is_training)
The decay value is 0.9.
Do I need to set the reuse flag?
I'd be glad for any help.
I'm using batch_norm as discussed in this thread (with a tf.bool for training, and ops.GraphKeys.UPDATE_OPS) and everything works.
Saving and restoring with:
saver = tf.train.Saver()
works fine.
But when saving with:
saver = tf.train.Saver(tf.trainable_variables() + [global_step])
so I can save storage space (by not saving the gradients and such), I get an error on restore:
"Uninitialized value unpool4/convc/bn/moving_mean"
Obviously this is because the moving_mean (and moving_variance) are not being saved for any of the layers. Since I have lots of them (nested across many layers), what is the most efficient way to add them to the list of values to save? Also, since these variables get trained, why aren't they added to the trainable_variables collection?
@mshunshin The moving mean and variance are not trainable variables: there are no gradients for them; they just accumulate statistics over the batches of examples.
To save/restore them all, you can use tf.global_variables().
For me, things started working once I used this wrapper:
def batch_norm_wrapper(x, phase, decay, scope, reuse):
    with tf.variable_scope(scope, reuse=reuse):
        normed = tf.contrib.layers.batch_norm(x, center=True, scale=True, decay=decay, is_training=phase, scope='bn', updates_collections=None, reuse=reuse)
    return normed
In my opinion, the whole business of scope and reuse is not made clear in this thread.
Thanks. With tf.global_variables() the save files are much larger, since I think they include the gradients; in the end I used:
saver = tf.train.Saver([x for x in tf.global_variables() if 'Adam' not in x.name])
and, since the session manager's init doesn't initialize them properly:
sess.run(tf.variables_initializer([x for x in tf.global_variables() if 'Adam' in x.name]))
(using tf.train.AdamOptimizer)
You can also use tf.model_variables(), which contains the variables of the model, i.e. the moving_mean.
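For example, a saver that keeps the checkpoint small but still captures the BN statistics might look like this (a sketch, not an official recipe):

import tensorflow as tf

# trainable_variables() covers weights/biases/beta/gamma;
# model_variables() additionally contains moving_mean/moving_variance.
save_list = list(set(tf.trainable_variables() + tf.model_variables()))
saver = tf.train.Saver(save_list)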
@sguada Sorry to bother you. I'm using slim.batch_norm and getting high training performance but low validation/test performance. I suspect it's caused by improper use of parameters like reuse and scope. There are many issues around batch normalization, and it's hard to find a complete code snippet showing how to use it, i.e. how to pass the different parameters in the different phases.
For instance, say I use tf.GraphKeys.UPDATE_OPS to control the dependencies and set is_training through a placeholder. When I feed {is_training: False}, validation performance still degrades.
An official and complete (i.e. covering training, validation, and test) batch normalization example would be greatly appreciated.
Thanks in advance!
Hi,
every time you use batch norm you need to set a different scope and feed the reuse input according to the training/test phase you're in (TRUE for test, FALSE for training).
@ishaybee Thanks for the help. I found my problem: it was the cold start of moving_mean/moving_variance. Since I hadn't trained for enough steps, the estimated moving mean/variance weren't that stable. The result: the model trains quite well (you know, the loss drops quickly at first), but the validation performance is erratic (because the estimated population mean/variance aren't stable enough).
When I train the model longer, validation accuracy becomes nicer.
Another important thing is to use slim.learning.create_train_op to create the train op. Don't use the tf native tf.train.GradientDescentOptimizer(0.1).minimize(loss).
So the answer is: I was using batch normalization correctly, but I hadn't fully understood its dynamics during training.
================
Also:
@soloice, note how the following parameters are passed into the layer in order to call batch_norm:

batch_norm_params = {'is_training': is_training, 'decay': 0.9, 'updates_collections': None}

If updates_collections is not set to None (i.e. if the mean updates are not done in place inside BatchNorm), you can't expect the surrounding layer (e.g. conv2d) to somehow run the tf.GraphKeys.UPDATE_OPS that the BatchNorm layer needs in order to update the moving mean, and so you couldn't run on test data later.
Or try running the UPDATE_OPS explicitly, like this:
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
if update_ops:
    updates = tf.group(*update_ops)
    cross_entropy = control_flow_ops.with_dependencies([updates], cross_entropy)
Update: I just quoted your code exactly, and I see you do use UPDATE_OPS.
As for the "cold start": as discussed above, lowering the BatchNorm moving-average decay (an input parameter) from its default of 0.999 to something like 0.95 can speed up the warm-up.
@pavelbulanov Thank you so much for the help! I'll try a smaller value of decay and see how it helps.
================
Update: using a small decay (e.g. 0.9 or 0.95) helps a lot. The validation loss drops quickly when I set decay to 0.9. However, the drawback of a small decay is its small effective range: the result is dominated by a few recent samples, so it isn't a good estimate of the population mean/variance. One needs to balance a quick start (small decay) against a longer effective range (large decay).
Hi,
I tried to implement a batch normalization layer with the help of the suggestions in this issue, but I get more than 70% error in validation and testing... the decay is lower for the non-training calls...
Here is my code:
def BatchNorm(inputT, is_training=False, scope=None):
    return tf.cond(
        is_training,
        lambda: tf.contrib.layers.batch_norm(inputT, is_training=True, reuse=None, decay=0.999, epsilon=1e-5, center=True, scale=True, updates_collections=None, scope=scope),
        lambda: tf.contrib.layers.batch_norm(inputT, is_training=False, reuse=True, decay=0.900, epsilon=1e-5, center=True, scale=True, updates_collections=None, scope=scope)
    )
Thanks in advance.
@Alexivia It seems you're using two different batch normalization layers? You should use only one BN layer (with different is_training parameters, of course).
Thanks for the advice @soloice.
I tried with only the is_training and reuse parameters differing:
lambda: tf.contrib.layers.batch_norm(inputT, is_training=True, reuse=None, decay=0.9, epsilon=1e-5, center=True, scale=True, updates_collections=None, scope=scope),
lambda: tf.contrib.layers.batch_norm(inputT, is_training=False, reuse=True, decay=0.9, epsilon=1e-5, center=True, scale=True, updates_collections=None, scope=scope)
but still I don't get good validation and test results... more than 70% error...
Hi,
please see my wrapper above; I think you need to use "with tf.variable_scope(scope, reuse=reuse):".
Hi @ishaybee,
I followed your advice; my code is now:
def BatchNorm(inputT, is_training=False, reuse=True, scope=None):
    with tf.variable_scope(scope, reuse=reuse):
        return tf.contrib.layers.batch_norm(inputT, is_training=is_training, reuse=reuse, scope=scope, updates_collections=None, decay=0.9, center=True, scale=True)
and I'm feeding is_training and reuse through the feed_dict, but I get the error ValueError("The reuse parameter must be True or False or None.").
Try feeding reuse as a Python variable (an input of the model-building function), not as a placeholder.
I tried that, and the complaints about the value went away... but I don't think the placeholder values are being used: I forced different values on the batch_norm function and saw no change, and in TensorBoard the placeholder doesn't appear to be connected to the graph... (see the attached image)
My code now looks like this:
Batch normalization wrapper:
def BatchNorm(inputT, is_training=False, reuse=None, scope=None):
    with tf.variable_scope(scope):
        return tf.contrib.layers.batch_norm(inputT, is_training=is_training, reuse=reuse, scope=scope, updates_collections=None, decay=0.9, center=True, scale=True)
Model definition:
def model(data, train=False, is_training=False, reuse=None):
    # 1st conv layer
    with tf.name_scope('conv1') as scope:
        conv = tf.nn.conv2d(
            <...>
        norm = BatchNorm(pool, is_training=is_training, reuse=reuse, scope=scope)
Training:
feed_dict = {train_data_node: batch_data,
             train_labels_node: batch_labels,
             is_training: True,
             reuse: None}
# Run the optimizer to update weights.
sess.run(optimizer, feed_dict=feed_dict)
Validation:
batch_predictions = sess.run(eval_prediction, feed_dict={eval_data: data[-EVAL_BATCH_SIZE:, ...], is_training: False, reuse: True})
is_training can be a placeholder, but reuse has to be a boolean value; it can't be a tensor or a placeholder.
I'm not sure what you're trying to do; in most cases using static values solves the problem. For example, this pattern works well:
def model(data, is_training=False, reuse=None, scope='my_model'):
    # Define a variable scope to contain all the variables of your model
    with tf.variable_scope(scope, 'model', data, reuse=reuse):
        # 1 layer
        net = tf.contrib.layers.conv2d(data, ....)
        # ...
        net = tf.contrib.layers.batch_norm(net, is_training)
    return net

train_outputs = model(train_data, is_training=True)
eval_outputs = model(eval_data, is_training=False, reuse=True)
eval_predictions = sess.run(eval_outputs, feed_dict={eval_data: data[-EVAL_BATCH_SIZE:, ...]})
Unless you need your model to change behavior dynamically, you don't need a placeholder for is_training. The trick is to build the model twice, sharing the variables the second time.
Thanks @sguada! After applying your suggestions, I finally got batch_norm working!
It would be helpful if the API 1.0 documentation reflected the fact that the update ops need to be manually added to the graph. As a new tf user, I noticed my test error was weird and then had to spend quite some time debugging my graph until I realized batch normalization was the problem. Then I had to spend more time figuring out that, by default, the variables tracking the moments are not updated unless you use a contrib function for optimization. Since in 1.0 there is no option to set update_collections to None, there is no hint in the documentation that this can be a problem. It also seems like it would make sense to have a parameter that adds the control-flow dependencies to the op run in the training case.
@danrsc Exactly. The usage of the BN layer is quite confusing. I suggested adding documentation or a complete official tutorial on batch normalization, but unfortunately no one has responded. = =
Completely agree. BN usage is very tricky, and I think the documentation is currently insufficient. This needs to be fixed for such a commonly used layer.
Reopening to give the documentation issues visibility.
@sguada assigning to you for triage. It might be worth putting a technical writer on this case.
I got confused by this issue last week and wasted three days of training... I hope the docs get fixed soon and an official batch normalization example gets added to the API documentation.
@sguada You said "tf.contrib.layers.batch_norm can take a Tensor as is_training, so there's no need to do anything special."
However, the comment in the code says that if is_training doesn't have a constant value, because it is a Tensor, a Variable or a Placeholder, then is_training_value will be None and needs_moments will be true.
Does this mean that if is_training is set as a placeholder, needs_moments will be true even in the test phase?
As far as I know, the moments aren't needed during testing.
So if is_training is a Variable or a Placeholder, that means it can change, so the graph that computes the moments is needed, and the layer builds it.
Then at run time, depending on whether the value is True or False, it uses either the batch moments or the moving_mean and moving_variance.
So during testing set the value to False, and the moments won't be used.
@sguada @brando90
def batch_norm_layer(self, x, train_phase, scope_bn):
    bn_train = batch_norm(x, decay=0.9, center=False, scale=True,
                          updates_collections=None,
                          is_training=True,
                          reuse=None,
                          variables_collections=[UPDATE_OPS_COLLECTION],
                          trainable=True,
                          scope=scope_bn)
    bn_inference = batch_norm(x, decay=0.9, center=False, scale=True,
                              updates_collections=None,
                              is_training=False,
                              reuse=True,
                              variables_collections=[UPDATE_OPS_COLLECTION],
                              trainable=True,
                              scope=scope_bn)
    z = tf.cond(train_phase, lambda: bn_train, lambda: bn_inference)
    return z
I built batchnorm this way, but I have no idea why the moving mean and moving variance get updated during testing.
As @sguada said, I tried building two models, but the model with is_training=False crashes:
W tensorflow/core/framework/op_kernel.cc:993] Not found: Key fully_connected_5/weights not found in checkpoint
W tensorflow/core/framework/op_kernel.cc:993] Not found: Key fully_connected_6/weights not found in checkpoint
W tensorflow/core/framework/op_kernel.cc:993] Not found: Key fully_connected_7/biases not found in checkpoint
W tensorflow/core/framework/op_kernel.cc:993] Not found: Key fully_connected_6/biases not found in checkpoint
W tensorflow/core/framework/op_kernel.cc:993] Not found: Key fully_connected_7/weights not found in checkpoint
W tensorflow/core/framework/op_kernel.cc:993] Not found: Key history_embeddings_1 not found in checkpoint
W tensorflow/core/framework/op_kernel.cc:993] Not found: Key global_step_1 not found in checkpoint
I think there should be a concrete example of running batch norm with fully connected layers and with a CNN. Seeing how everyone trying to use this feature runs into trouble, it's awful that people are apparently training models for days expecting things to just work.
Funnily enough, it takes ages to restore a model after training with batch_norm. Most likely I'll wait until TF 2.0 before I retry anything like this.
@MisayaZ You don't need to create two batch_norm layers; you can just pass train_phase (assuming it's a tf.bool) to batch_norm. Also, you're passing UPDATE_OPS_COLLECTION as variables_collections, which changes which collections the variables get added to.
The following should work:
z = batch_norm(x, decay=0.9, center=False, scale=True, updates_collections=None,
               is_training=train_phase, scope=scope_bn)
@OktayGardener Not sure what model you're trying to create; it looks like the variables aren't being saved in your checkpoint.
batch_norm also works with fully connected layers.
slim = tf.contrib.slim

def model(data, is_training=False, reuse=None, scope='my_model'):
    # Define a variable scope to contain all the variables of your model
    with tf.variable_scope(scope, 'model', data, reuse=reuse):
        # Configure arguments of fully_connected layers
        with slim.arg_scope([slim.fully_connected],
                            activation_fn=tf.nn.relu,
                            normalizer_fn=slim.batch_norm):
            # Configure arguments of batch_norm layers
            with slim.arg_scope([slim.batch_norm],
                                decay=0.9,                  # Adjust decay to the number of iterations
                                updates_collections=None,   # Make sure updates happen automatically
                                is_training=is_training):   # Switch behavior from training to non-training
                net = slim.fully_connected(data, 100, scope='fc1')
                net = slim.fully_connected(net, 200, scope='fc2')
                # ...
                # Don't use activation_fn nor batch_norm in the last layer
                net = slim.fully_connected(net, 10, activation_fn=None, normalizer_fn=None, scope='fc10')
    return net
@sguada Thanks. I built a network with batchnorm implemented as you mentioned above:

z = batch_norm(x, decay=0.9, center=False, scale=True, updates_collections=None,
               is_training=train_phase, scope=scope_bn)
The speed is slow, so I used the tensorflow benchmark tool and got the following compute times:
I tensorflow/core/util/stat_summarizer.cc:392] ============================== Top by Computation Time ==============================
I tensorflow/core/util/stat_summarizer.cc:392] [node type] [start] [first] [avg ms] [%] [cdf%] [mem KB] [Name]
I tensorflow/core/util/stat_summarizer.cc:392] Conv2D 106.164 51.354 51.004 23.145% 23.145% 692.224 conv8/Conv2D
I tensorflow/core/util/stat_summarizer.cc:392] Conv2D 85.187 19.115 19.283 8.750% 31.896% 692.224 conv7/Conv2D
I tensorflow/core/util/stat_summarizer.cc:392] SquaredDifference 11.967 15.105 14.331 6.503% 38.399% 11075.584 conv1/batch_norm/moments/sufficient_statistics/SquaredDifference
I tensorflow/core/util/stat_summarizer.cc:392] Mul 11.970 14.162 13.495 6.124% 44.523% 11075.584 conv1/batch_norm/batchnorm/mul_1
I tensorflow/core/util/stat_summarizer.cc:392] Conv2D 3.948 8.170 7.986 3.624% 48.146% 11075.584 conv1/Conv2D
I tensorflow/core/util/stat_summarizer.cc:392] Sub 11.960 10.176 7.943 3.604% 51.751% 11075.584 conv1/batch_norm/moments/sufficient_statistics/Sub
I tensorflow/core/util/stat_summarizer.cc:392] SquaredDifference 45.570 5.908 7.177 3.257% 55.007% 5537.792 conv2/batch_norm/moments/sufficient_statistics/SquaredDifference
I tensorflow/core/util/stat_summarizer.cc:392] Mul 45.574 7.755 6.902 3.132% 58.140% 5537.792 conv2/batch_norm/batchnorm/mul_1
I tensorflow/core/util/stat_summarizer.cc:392] Conv2D 40.692 5.408 4.845 2.199% 60.338% 5537.792 conv2/Conv2D
I tensorflow/core/util/stat_summarizer.cc:392] Sub 45.563 6.067 4.784 2.171% 62.509% 5537.792 con
I don't understand why some operations under moments, such as conv1/batch_norm/moments/sufficient_statistics/SquaredDifference, are executed during testing and take so much time.
The moments aren't needed during testing, so why are some operations under moments being executed?
Hi,
with the above batch_norm layer from contrib.layers, I get nan as the output of the validation graph, while the training graph runs seamlessly. Any idea what I might be missing?
What I use:
def batchnormlayer(inputs, numout, train_model):
    with tf.variable_scope("batch_norm") as scope_bn:
        epsilon = 1e-3
        return tf.contrib.layers.batch_norm(inputs, decay=0.9, updates_collections=None,
                                            scale=True, scope=scope_bn,
                                            is_training=train_model, epsilon=epsilon,
                                            fused=True, reuse=scope_bn.reuse)
Thanks,
As a follow-up: I'm reusing 16 layers of batch_norm.
However, I found that reusing 4 layers works.
I just noticed that killing the tensorflow process and restarting makes the error worse (i.e. worse than the error at the last checkpoint). I also found that removing batch_norm makes this problem go away. After looking at the code for a while, I think it may be because the variable values aren't restored from the shadow variables, as they would be if the ExponentialMovingAverage class were used to manage the moving averages. That would mean that when I evaluate in a separate process, I get the last value of the variable rather than the moving average. Am I interpreting this correctly, and is this the intended behavior? It seems like you'd want the shadow variable values restored...
I found the problem: the moving variance in my case goes negative after some iterations.
The output of the tensor Model/clip_logits/batch_norm/moving_variance:0, which is present in tf.model_variables(), is:
Moving variance (shape = (101,)) =
[ 214.70379639 95.36338043 0.57885742 189.49542236 102.72473145
137.14886475 286.57333374 111.06427002 154.98750305 167.75219727
207.83955383 211.14007568 158.23495483 171.61665344 116.81361389
115.77380371 43.59399796 137.75064087 181.75245667 161.37339783
215.21934509 92.88521576 191.23846436 336.3946228 259.85919189
299.47039795 186.23222351 165.19311523 262.82446289 170.11567688
233.56843567 209.35050964 115.96807861 154.34109497 295.5770874
123.6055603 295.76187134 296.88583374 240.88217163 247.32983398
87.15661621 217.69897461 133.00698853 -4.80375671 344.77462769
291.50601196 117.77174377 265.83712769 207.90093994 194.186203
220.21418762 178.03738403 115.27571869 196.62184143 228.8089447
191.53205872 331.36807251 151.55435181 197.2951355 179.67504883
181.09727478 90.09922791 173.30133057 102.6836853 160.9434967
236.59512329 168.05305481 403.36340332 41.14326096 185.93409729
130.57434082 266.31509399 101.44387817 163.88059998 290.25015259
244.52597046 229.86647034 158.14352417 202.68774414 187.78227234
248.78218079 126.0978241 171.41891479 274.40740967 119.84254456
202.53045654 200.20608521 214.04730225 111.53284454 222.03184509
244.81187439 172.23052979 187.09806824 194.62802124 255.26345825
293.63598633 307.91036987 210.86982727 308.88919067 144.94792175
229.69013977]
As you can see, one of the dimensions has a negative variance. How is this possible?
P.S. The batch norm layer is used right after the last fully connected layer of the network, before the softmax.
@raghavgoyal14 Are you using it with fused=True? I had a similar issue and it went away when I used the fused version.
@abred: Yes, I used fused=True; the problem is the same.
@sguada Hi, I have a question.
The definition of contrib.layers.batch_norm in tensorflow:
def batch_norm(inputs,
               decay=0.999,
               center=True,
               scale=False,
               epsilon=0.001,
               activation_fn=None,
               param_initializers=None,
               param_regularizers=None,
               updates_collections=ops.GraphKeys.UPDATE_OPS,
               is_training=True,
               reuse=None,
               variables_collections=None,
               outputs_collections=None,
               trainable=True,
               batch_weights=None,
               fused=False,
               data_format=DATA_FORMAT_NHWC,
               zero_debias_moving_mean=False,
               scope=None,
               renorm=False,
               renorm_clipping=None,
               renorm_decay=0.99):
scale: If True, multiply by gamma. If False, gamma is not used. When the next layer is linear (also e.g. nn.relu), this can be disabled since the scaling can be done by the next layer.
When I use tf.contrib.layers.batch_norm(input, scale=False), does "scale=False" mean that gamma is dropped from "y = gamma * x + beta" during training? Thank you very much.
When scale=False, gamma is a constant 1.
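For reference, the transform from the BN paper is

\hat{x} = \frac{x - \mu}{\sqrt{\sigma^2 + \varepsilon}}, \qquad y = \gamma \hat{x} + \beta

so scale=False fixes \gamma at 1 (and center=False fixes \beta at 0); the normalization by the mean and variance still happens either way.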
@ppwwyyxx, thank you very much.
@MisayaZ I was seeing the same behavior using batch_norm with a placeholder for "is_training". In the trace I saw that the moments are computed even at test time, so I went into the source code and found this:
# If `is_training` doesn't have a constant value, because it is a `Tensor`,
# a `Variable` or `Placeholder` then is_training_value will be None and
# `needs_moments` will be true.
is_training_value = utils.constant_value(is_training)
need_moments = is_training_value is None or is_training_value
if need_moments:
    # here it defines the moments
It seems that when "is_training" is a Variable or a placeholder, the moments get defined, and they also appear to be computed at run time, even when you set the placeholder to "False". I would have preferred to leave it as a placeholder, because that allows periodic testing during training without redefining the graph, but I've decided to use a constant and define different behaviors for training and test; now the moments aren't computed at test time.
@tano297 Thank you. I now also use "is_training" as a constant. Keeping it as a placeholder and doing periodic testing changes the values of the moving mean and moving variance. The inference also takes longer, because it computes the mean and variance of the input and updates the moving mean and moving variance. The correct way to test is to define different behaviors for training and testing, as you mentioned.
@tano297 @MisayaZ
But doesn't the "smart_cond" in

is_training_value = utils.constant_value(is_training)
need_updates = is_training_value is None or is_training_value
if need_updates:
    ...
    outputs = utils.smart_cond(is_training, _force_updates, no_updates)

make sure the updates are only computed and applied if is_training evaluates to True?
@abred: Yes indeed, but I was referring to line 391, where the moving-average update happens within _fused_batch_norm():
# If `is_training` doesn't have a constant value, because it is a `Tensor`,
# a `Variable` or `Placeholder` then is_training_value will be None and
# `need_updates` will be true.
is_training_value = utils.constant_value(is_training)
need_updates = is_training_value is None or is_training_value
if need_updates:
    ...
    outputs = utils.smart_cond(is_training, _force_updates, no_updates)
    ...
I'm talking about line 753 within batch_norm():
# If `is_training` doesn't have a constant value, because it is a `Tensor`,
# a `Variable` or `Placeholder` then is_training_value will be None and
# `needs_moments` will be true.
is_training_value = utils.constant_value(is_training)
need_moments = is_training_value is None or is_training_value
if need_moments:
    ...
    mean, variance = utils.smart_cond(is_training,
                                      _force_updates,
                                      moving_vars_fn)
    ...
The smart cond in this case (as far as I'm concerned) decides whether to update the moving averages, but the moments get computed either way.
@tano297 You're right about that; I was in the wrong place. But still:
lines 755-770 compute the moments, yet the moments are only used within _force_updates, which is only executed if is_training evaluates to True, aren't they?
And thus

mean, variance = utils.smart_cond(is_training, _force_updates, moving_vars_fn)

should be equivalent to (line 804)

mean, variance = moving_mean, moving_variance

whenever is_training evaluates to False, in which case the moments part of the graph is unused, so it shouldn't be executed.
But I haven't tested it, so I might be wrong about that :)
@tano297 @abred You're right. The moving mean and moving variance get changed when I use batchnorm like this:
def batch_norm_layer(self, x, train_phase, scope_bn):
    bn_train = batch_norm(x, decay=0.9, center=False, scale=True,
                          updates_collections=None,
                          is_training=True,
                          reuse=None,
                          variables_collections=[UPDATE_OPS_COLLECTION],
                          trainable=True,
                          scope=scope_bn)
    bn_inference = batch_norm(x, decay=0.9, center=False, scale=True,
                              updates_collections=None,
                              is_training=False,
                              reuse=True,
                              variables_collections=[UPDATE_OPS_COLLECTION],
                              trainable=True,
                              scope=scope_bn)
    z = tf.cond(train_phase, lambda: bn_train, lambda: bn_inference)
    return z
If I use it as follows:
z = batch_norm(x, decay=0.9, center=False, scale=True, updates_collections=None,
               is_training=train_phase, scope=scope_bn)
the moving mean and moving variance aren't changed during testing, but the speed is very slow.
Hi @zhongyuk,
I also ran into the issue where I get good results using is_training=True for both training and inference, but bad results when setting is_training=False during inference (worse than with is_training=True). According to your analysis, if I understand correctly, simply setting decay=0.9 in BN should solve the problem. Am I right?
By the way, do I need to retrain the model from scratch with decay=0.9? Or can I resume training from the checkpoint (i.e. the one trained with decay=0.999)?
Thanks!
@nmduc @davek44
Hi, I also ran into the issue where I get good results using is_training=True for both training and inference, but bad results when setting is_training=False during inference (worse than with is_training=True). Did you solve this problem? Thanks!
@tyshiwo I just set decay=0.9 for batch_norm and it works well so far.
After all these comments about how to use batch norm properly, I was confused: here is what I have. Please correct me if I'm wrong.
batch_norm = tf.contrib.layers.batch_norm(conv,
                                          center=True,
                                          scale=True,
                                          reuse=phase_train_py,
                                          scope='bn',
                                          is_training=is_training)
where phase_train_py is a Python boolean variable and is_training is a placeholder taking a boolean value. Using tf.cond seems wrong to me; otherwise the function wouldn't come with boolean parameters. In other words, if tf.cond were the way to do it, one would have to use one batch_norm function for training and another for testing. So the developers let us change these boolean variables in order to change the function's behavior. What I do is set phase_train_py to False during training while is_training is True, and the opposite during testing. Since tensors or placeholders can't be changed except through sess.run, I deliberately change phase_train_py before running the graph. Example:
if condition:
    phase_train_py = False
    sess.run(to_run_list, feed_dict={phase_train: True})
else:
    phase_train_py = True
    sess.run(to_run_list, feed_dict={phase_train: False})
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Maybe you need to read this:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
It seems TF v1.3 still has this problem. I'm sure I paid attention to the following details, but I still failed to use the official tf.contrib.layers.batch_norm with is_training=False during evaluation (it's fine as long as I keep is_training=True during evaluation):
1. decay: the exponential moving average is effectively an alpha filter from signal processing, and the time it takes to converge is roughly 1/(1-decay) training steps. For decay=0.999 you need 1/0.001 = 1000 steps to converge. So set a decay appropriate for your number of training steps (see the short derivation after this list).
2. Use updates_collections=None and set reuse to the appropriate value. The only way I could use the official batch_norm was to build two graphs, one for training and one for evaluation, with is_training=True and is_training=False respectively. This way you don't need to switch dynamically between training and evaluation. But it's a clumsy way, since you need to build more than one graph.
Finally, I wrote the moving averages myself, and it worked! Here it is (based on code from the web, with my own modifications):
import numpy as np
import tensorflow as tf

def bn_layer(x, scope, is_training, epsilon=0.001, decay=0.99, reuse=None):
    """
    Performs a batch normalization layer

    Args:
        x: input tensor
        scope: scope name
        is_training: python boolean value
        epsilon: the variance epsilon - a small float number to avoid dividing by 0
        decay: the moving average decay

    Returns:
        The ops of a batch normalization layer
    """
    with tf.variable_scope(scope, reuse=reuse):
        shape = x.get_shape().as_list()
        # gamma: a trainable scale factor
        gamma = tf.get_variable("gamma", shape[-1], initializer=tf.constant_initializer(1.0), trainable=True)
        # beta: a trainable shift value
        beta = tf.get_variable("beta", shape[-1], initializer=tf.constant_initializer(0.0), trainable=True)
        moving_avg = tf.get_variable("moving_avg", shape[-1], initializer=tf.constant_initializer(0.0), trainable=False)
        moving_var = tf.get_variable("moving_var", shape[-1], initializer=tf.constant_initializer(1.0), trainable=False)
        if is_training:
            # tf.nn.moments == Calculate the mean and the variance of the tensor x
            avg, var = tf.nn.moments(x, np.arange(len(shape)-1), keep_dims=True)
            avg = tf.reshape(avg, [avg.shape.as_list()[-1]])
            var = tf.reshape(var, [var.shape.as_list()[-1]])
            #update_moving_avg = moving_averages.assign_moving_average(moving_avg, avg, decay)
            update_moving_avg = tf.assign(moving_avg, moving_avg*decay + avg*(1-decay))
            #update_moving_var = moving_averages.assign_moving_average(moving_var, var, decay)
            update_moving_var = tf.assign(moving_var, moving_var*decay + var*(1-decay))
            control_inputs = [update_moving_avg, update_moving_var]
        else:
            avg = moving_avg
            var = moving_var
            control_inputs = []
        with tf.control_dependencies(control_inputs):
            output = tf.nn.batch_normalization(x, avg, var, offset=beta, scale=gamma, variance_epsilon=epsilon)
    return output


def bn_layer_top(x, scope, is_training, epsilon=0.001, decay=0.99):
    """
    Returns a batch normalization layer that automatically switches between train and test phases based on the
    tensor is_training

    Args:
        x: input tensor
        scope: scope name
        is_training: boolean tensor or variable
        epsilon: epsilon parameter - see batch_norm_layer
        decay: epsilon parameter - see batch_norm_layer

    Returns:
        The correct batch normalization layer based on the value of is_training
    """
    #assert isinstance(is_training, (ops.Tensor, variables.Variable)) and is_training.dtype == tf.bool
    return tf.cond(
        is_training,
        lambda: bn_layer(x=x, scope=scope, epsilon=epsilon, decay=decay, is_training=True, reuse=None),
        lambda: bn_layer(x=x, scope=scope, epsilon=epsilon, decay=decay, is_training=False, reuse=True),
    )
You just use the bn_layer_top function when building the graph; its is_training parameter is a tf.placeholder. Then you can freely switch the placeholder to True during training and to False during evaluation, via feed_dict.
I hope this helps the community.
When you use slim.batch_norm, be sure to use slim.learning.create_train_op instead of tf.train.GradientDescentOptimizer(lr).minimize(loss) or other optimizers. Try it and see if it works!
@vincentvanhoucke You wrote in another post in this thread:

> The slim batch_norm wrapper normalizes over the last dimension of your input tensor. So for a 2D input tensor coming from a fully connected layer, it normalizes over the batch and hence performs per-activation normalization. For a 4D tensor coming from a convolution, it normalizes over the first three dimensions (batch, width, depth) and hence performs per-feature normalization. @sguada maybe can be a bit more descriptive about this.
Do you mean the function tf.contrib.layers.batch_norm by "slim batch_norm wrapper"? If so, I'd suggest adding this information to the docstring of that function. It would make it very clear that this function performs batch normalization exactly as described in the paper, for both FC and Conv2D layers. At the moment there is only the text "Can be used as a normalizer function for conv2d and fully_connected", and it's unclear whether that relates to the topic of the normalization axes.
@ZahlGraf I'd happily consider a PR that clarifies the documentation. We've been at this so long that I no longer have a good sense of what is obvious or not, and I'd welcome a documentation clarification from someone with a fresh perspective on the topic.
@vincentvanhoucke
I created a PR with a more detailed description, based mainly on your statements in this thread:
https://github.com/tensorflow/tensorflow/pull/15653
Please remove the assignee, as this issue is inviting external contributions. Otherwise, remove the contributions welcome label. Thank you.
Closing this bug, since the original request to add a batch norm layer has been resolved. Some of the recent documentation issues appear to have their own PRs.
If you run into problems with batch_norm, please ask a question on StackOverflow or open another issue.