TensorFlow: Unclear documentation for tf.layers.dense flattening behavior

Created on 7 Mar 2017  ·  3 Comments  ·  Source: tensorflow/tensorflow

Documentation states:

Note: if the inputs tensor has a rank greater than 2, then it is
flattened prior to the initial matrix multiply by w.

However, the following returns a tensor with shape (2, 2, 2, 400), as if the input had not been flattened:
tf.layers.dense(tf.placeholder(tf.float32, (2,2,2,2)), 400)

docs-bug

All 3 comments

Whoops, sorry, didn't mean to remove myself. Investigating internally, was trying to add the person I think is responsible.

So, after investigating, I was told that it behaves like tensordot, so both the behavior and the current documentation are correct. The engineer I talked to thought we could use more precise language, but that it wouldn't necessarily help, since the current wording is clear and unambiguous in context. (See tensordot: https://docs.scipy.org/doc/numpy/reference/generated/numpy.tensordot.html)

If you want to try to make a pull request yourself to do some rewording, that would be welcome, but I'm closing this bug for now since we're not going to do anything about it internally.
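
For anyone checking the tensordot explanation against the shape from the original report, here is a minimal NumPy-only sketch. The `kernel` array is a hypothetical stand-in for the layer's weights (assumed to have shape (last input dim, units)); it is not read from TensorFlow.

```python
import numpy as np

# Same input shape as the placeholder in the original report.
x = np.ones((2, 2, 2, 2), dtype=np.float32)

# Hypothetical stand-in for the layer's kernel: (last input dim, units).
kernel = np.ones((2, 400), dtype=np.float32)

# Contract the last axis of x with the first axis of the kernel.
# All leading axes of x are carried through unchanged.
y = np.tensordot(x, kernel, axes=[(3,), (0,)])
print(y.shape)  # (2, 2, 2, 400)
```

Only the last axis is consumed, which is why the three leading dimensions of size 2 survive in the output.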

Ah thanks, tensordot connection makes it more clear.

It seems tf.layers.dense works by contracting the last index of the input tensor with the first index of the weights tensor. The word "flatten" was confusing; I'm not sure what it means here.

Toy example I made for myself to translate tf.layers.dense into the equivalent np.tensordot:

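import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.layers, sessions)

tf.reset_default_graph()
x0 = np.ones((3, 3, 3))
w0 = np.arange(6).reshape((3, 2))
x = tf.constant(x0)
y = tf.layers.dense(x, 2)  # creates variables dense/kernel and dense/bias

# Look up the layer's variables by name; the kernel is (last input dim, units).
var_dict = {v.op.name: v for v in tf.global_variables()}
assert var_dict["dense/kernel"].get_shape() == (3, 2)

sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
# Overwrite the kernel with known values and zero the bias.
sess.run(tf.assign(var_dict["dense/kernel"], w0))
sess.run(tf.assign(var_dict["dense/bias"], np.zeros((2,))))

# Contract the last axis of x0 with the first axis of w0.
expected_y0 = np.tensordot(x0, w0, axes=[(2,), (0,)])
y0 = sess.run(y)
np.testing.assert_allclose(y0, expected_y0)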
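
One plausible reading of "flatten" in the docs, offered as an assumption rather than a statement about TensorFlow's internals: collapse all leading axes into one, do an ordinary 2-D matrix multiply, then restore the leading axes. The NumPy sketch below (the helper name `dense_via_flatten` is made up for illustration) checks that this gives the same result as the tensordot contraction.

```python
import numpy as np

def dense_via_flatten(x, kernel, bias):
    """Hypothetical helper: flatten leading axes, matmul, restore shape."""
    lead_shape = x.shape[:-1]
    # Collapse every axis except the last into a single "batch" axis.
    x2d = x.reshape(-1, x.shape[-1])
    y2d = x2d @ kernel + bias
    # Put the leading axes back; the last axis is now the number of units.
    return y2d.reshape(lead_shape + (kernel.shape[1],))

rng = np.random.default_rng(0)
x0 = rng.standard_normal((3, 4, 5))
w0 = rng.standard_normal((5, 2))
b0 = np.zeros(2)

flat = dense_via_flatten(x0, w0, b0)
contracted = np.tensordot(x0, w0, axes=[(2,), (0,)]) + b0

# Both routes agree, so "flatten then multiply" and "contract the last
# axis" describe the same computation up to a final reshape.
np.testing.assert_allclose(flat, contracted)
print(flat.shape)  # (3, 4, 2)
```

Under this reading the output keeps its leading dimensions because the flattening is undone after the multiply, which would explain the (2, 2, 2, 400) result in the original report.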