Documentation states:
Note: if the
inputs
tensor has a rank greater than 2, then it is
flattened prior to the initial matrix multiply byw
.
However, the following returns tensor with shape shape=(2, 2, 2, 400)
, as if the input has not been flattened
tf.layers.dense(tf.placeholder(tf.float32, (2,2,2,2)), 400)
Whoops, sorry, didn't mean to remove myself. Investigating internally, was trying to add the person I think is responsible.
So, after investigation, what I was told was that it behaves like tensordot, and so the behavior and the current documentation are both correct -- and the engineer I talked to said that they thought we could use more precise language but that it wouldn't necessarily help as the current language is clear and unambiguous when in context. (See tensordot: https://docs.scipy.org/doc/numpy/reference/generated/numpy.tensordot.html)
If you want to try to make a pull request yourself to do some rewording, that would be welcome, but I'm closing this bug for now since we're not going to do anything about it internally.
Ah thanks, tensordot connection makes it more clear.
It seems tf.dense
behaves by contracting last index of input tensor with first index of weights tensor. The "flatten" word was confusing, not sure what it means here
Toy example I made for myself to translate tf.dense to equivalent np.tensordot
tf.reset_default_graph()
x0 = np.ones((3, 3, 3))
w0 = np.arange(6).reshape((3, 2))
x = tf.constant(x0)
y = tf.layers.dense(x, 2)
var_dict = {v.op.name: v for v in tf.global_variables()}
assert(var_dict["dense/kernel"].get_shape() == (3, 2))
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
sess.run(tf.assign(var_dict["dense/kernel"], w0))
sess.run(tf.assign(var_dict["dense/bias"], np.zeros((2,))))
expected_y0 = np.tensordot(x0,w0,axes=[(2,),(0,)])
y0 = sess.run(y)
np.testing.assert_allclose(y0, expected_y0)
Most helpful comment
Ah thanks, tensordot connection makes it more clear.
It seems
tf.dense
behaves by contracting last index of input tensor with first index of weights tensor. The "flatten" word was confusing, not sure what it means hereToy example I made for myself to translate tf.dense to equivalent np.tensordot