Hey TensorFlow,
Lately, I have been using the argmax function, but I have always placed a tf.stop_gradient before it. However, when I remove the stop_gradient, TensorFlow still works fine.
Maybe I'm misunderstanding something, but argmax is not a differentiable function. How is backprop still working when I remove it? Shouldn't an error be thrown when argmax is used without a stop_gradient?
If it is possible to differentiate argmax, I would greatly appreciate any resource showing how this is done. Thanks, TF!
The gradient is defined almost everywhere (argmax is piecewise constant, so the gradient is zero wherever it exists), so it could be defined in practice. It's not very useful, though, so it's not registered for this op in TensorFlow.
import tensorflow as tf

x = tf.Variable([1., 1.])
z = tf.argmax(x, 0)
sess = tf.Session()
xgrad = tf.gradients(z, x)
sess.run(tf.global_variables_initializer())
sess.run(xgrad)
LookupError: No gradient defined for operation 'ArgMax' (op type: ArgMax)
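To see why that almost-everywhere gradient would be useless, here is a small plain-Python sketch (not the TensorFlow API; the `argmax` helper is defined just for this illustration): argmax is piecewise constant, so away from ties a small perturbation of the input never changes the output index, and every finite difference is exactly zero.

```python
# Plain-Python illustration: argmax is piecewise constant, so its
# derivative is zero wherever it is defined (i.e. away from ties).
def argmax(xs):
    # Index of the largest element (first one on ties).
    return max(range(len(xs)), key=lambda i: xs[i])

x = [1.0, 3.0, 2.0]
eps = 1e-6

for i in range(len(x)):
    bumped = list(x)
    bumped[i] += eps
    # The output index does not move, so the finite difference
    # (argmax(x + eps*e_i) - argmax(x)) / eps is exactly 0.
    assert argmax(bumped) == argmax(x)
```

A zero gradient like this carries no information for training, which is why TensorFlow leaves the op without a registered gradient rather than defining it.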