Pytorch: Bilinear interpolation behavior inconsistent with TF, CoreML and Caffe

Created on 17 Aug 2018 · 4 Comments · Source: pytorch/pytorch

Issue description

While comparing and transferring models between Caffe, TF and Pytorch, I found that the output of bilinear interpolation differs between all three. Caffe uses depthwise transposed convolutions instead of a straightforward resize, so it is easy to reimplement in both TF and Pytorch.
However, the outputs of TF and Pytorch differ when align_corners=False, which is the default for both.

Code example

import cv2
import numpy as np
import tensorflow as tf  # TF 1.x API (tf.Session, tf.image.resize_bilinear)
import torch
import torch.nn as nn

# Load as RGB, resize to 256x256 and scale to [0, 1]; NHWC layout for TF.
img = cv2.resize(cv2.imread('./lenna.png')[:, :, ::-1], (256, 256))
img = img.reshape(1, 256, 256, 3).astype('float32') / 255.
output_size = [512, 512]
output = tf.image.resize_bilinear(tf.convert_to_tensor(img), output_size, align_corners=True)
with tf.Session() as sess:
    values = sess.run([output])
out_tf = values[0].astype('float32')[0]

# Same (already scaled) image in NCHW layout for Pytorch.
nimg = img.transpose(0, 3, 1, 2)
out_pt = nn.functional.interpolate(torch.from_numpy(nimg),
                                   scale_factor=2,
                                   mode='bilinear',
                                   align_corners=True)
# Back to HWC so it can be compared with the TF output.
out_pt = out_pt.numpy().transpose(0, 2, 3, 1)[0]

print(np.max(np.abs(out_pt - out_tf)))
# output 5.6624413e-06

But

# Same comparison as above, this time with align_corners=False on both sides.
img = cv2.resize(cv2.imread('./lenna.png')[:, :, ::-1], (256, 256))
img = img.reshape(1, 256, 256, 3).astype('float32') / 255.
output_size = [512, 512]
output = tf.image.resize_bilinear(tf.convert_to_tensor(img), output_size, align_corners=False)
with tf.Session() as sess:
    values = sess.run([output])
out_tf = values[0].astype('float32')[0]

# Same (already scaled) image in NCHW layout for Pytorch.
nimg = img.transpose(0, 3, 1, 2)
out_pt = nn.functional.interpolate(torch.from_numpy(nimg),
                                   scale_factor=2,
                                   mode='bilinear',
                                   align_corners=False)
# Back to HWC so it can be compared with the TF output.
out_pt = out_pt.numpy().transpose(0, 2, 3, 1)[0]

print(np.max(np.abs(out_pt - out_tf)))
# output 0.22745097

[Figure: output difference between Pytorch and TF, multiplied by 10]

The output of CoreML is consistent with TF, so it seems that there is a bug in the Pytorch implementation of bilinear interpolation with align_corners=False.

Diff is reproducible both on cpu and cuda with cudnn 7.1, cuda 9.1.

Source

libfun

Most helpful comment

For the Pytorch implementation, here's the logic that computes the src_idx <-> dst_idx mapping.
https://github.com/pytorch/pytorch/blob/master/aten/src/THNN/generic/linear_upsampling.h#L22
For the TF implementation, the scale is calculated in the same way. https://github.com/tensorflow/tensorflow/blob/f66daa493e7383052b2b44def2933f61faf196e0/tensorflow/core/kernels/image_resizer_state.h#L41
But the src_idx <-> dst_idx mapping is different. https://github.com/tensorflow/tensorflow/blob/6795a8c3a3678fb805b6a8ba806af77ddfe61628/tensorflow/core/kernels/resize_bilinear_op.cc#L85
I think Pytorch takes the pixel center into account when scaling, while TF doesn't.
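
To make the difference concrete, here is a minimal pure-Python sketch of the two mappings along a single axis, written from the description in this comment rather than copied from either code base. The helper names source_coord and resize_linear_1d are made up for illustration, and the half_pixel flag is only a switch inside this sketch, not a parameter of either library.

```python
import math

def source_coord(dst_index, in_size, out_size, half_pixel):
    """Map an output index to a fractional input coordinate along one axis."""
    scale = in_size / out_size
    if half_pixel:
        # Pixel-center convention (attributed above to Pytorch with
        # align_corners=False); negative coordinates are clamped to 0.
        return max((dst_index + 0.5) * scale - 0.5, 0.0)
    # No pixel-center offset (attributed above to TF with align_corners=False).
    return dst_index * scale

def resize_linear_1d(values, out_size, half_pixel):
    """Linearly interpolate a list of floats to out_size samples."""
    in_size = len(values)
    out = []
    for i in range(out_size):
        src = source_coord(i, in_size, out_size, half_pixel)
        lo = min(int(math.floor(src)), in_size - 1)  # left neighbour
        hi = min(lo + 1, in_size - 1)                # right neighbour, clamped at the edge
        frac = src - lo
        out.append(values[lo] * (1.0 - frac) + values[hi] * frac)
    return out

row = [0.0, 1.0]
pt_like = resize_linear_1d(row, 4, half_pixel=True)
tf_like = resize_linear_1d(row, 4, half_pixel=False)
print(pt_like)  # [0.0, 0.25, 0.75, 1.0]
print(tf_like)  # [0.0, 0.5, 1.0, 1.0]
```

Even on a two-pixel row upsampled by 2, the two conventions place the samples differently, which is in line with the large difference reported for align_corners=False.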

ailzhang on 17 Aug 2018

👍2

All 4 comments

Thanks for the report. We will look into this!

SsnL on 17 Aug 2018

I've run a few experiments and it seems that Pytorch with align_corners=False is consistent with python-opencv, PIL and skimage with mode='edge'. I'll look into CoreML more closely today.

libfun on 17 Aug 2018

👍1

ailzhang on 17 Aug 2018

👍2

For the Pytorch implementation, here's the logic that computes the src_idx <-> dst_idx mapping.
https://github.com/pytorch/pytorch/blob/master/aten/src/THNN/generic/linear_upsampling.h#L22
For the TF implementation, the scale is calculated in the same way. https://github.com/tensorflow/tensorflow/blob/f66daa493e7383052b2b44def2933f61faf196e0/tensorflow/core/kernels/image_resizer_state.h#L41
But the src_idx <-> dst_idx mapping is different. https://github.com/tensorflow/tensorflow/blob/6795a8c3a3678fb805b6a8ba806af77ddfe61628/tensorflow/core/kernels/resize_bilinear_op.cc#L85
I think Pytorch takes the pixel center into account when scaling, while TF doesn't.

So do you think there is any workaround in Pytorch to make its interpolation consistent with that of TensorFlow?
I thought it might be possible to use torch.nn.functional.grid_sample() in Pytorch.
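
One possible shape of such a workaround is sketched below. It is an untested-against-TF assumption, not a confirmed fix: it builds a sampling grid from src = dst * (in / out), i.e. the mapping without the pixel-center offset described earlier in the thread, and passes it to grid_sample. The helper name resize_bilinear_tf_legacy is made up here, and the sketch assumes a Pytorch version whose grid_sample accepts align_corners and an input with more than one pixel along each spatial axis.

```python
import torch
import torch.nn.functional as F

def resize_bilinear_tf_legacy(x, out_h, out_w):
    """Bilinear resize of an NCHW tensor using src = dst * (in / out).

    Hypothetical helper; assumes in_h > 1 and in_w > 1.
    """
    n, _, in_h, in_w = x.shape
    # Fractional source coordinates without the half-pixel offset.
    src_y = torch.arange(out_h, dtype=x.dtype, device=x.device) * (in_h / out_h)
    src_x = torch.arange(out_w, dtype=x.dtype, device=x.device) * (in_w / out_w)
    # With align_corners=True, grid_sample maps -1 to pixel 0 and +1 to pixel in - 1.
    gy = 2.0 * src_y / (in_h - 1) - 1.0
    gx = 2.0 * src_x / (in_w - 1) - 1.0
    grid_y = gy.view(out_h, 1).expand(out_h, out_w)
    grid_x = gx.view(1, out_w).expand(out_h, out_w)
    # grid_sample expects shape (N, H_out, W_out, 2) with (x, y) in the last dim.
    grid = torch.stack((grid_x, grid_y), dim=-1).unsqueeze(0).repeat(n, 1, 1, 1)
    # padding_mode='border' clamps coordinates that fall past the last pixel.
    return F.grid_sample(x, grid, mode='bilinear',
                         padding_mode='border', align_corners=True)

x = torch.tensor([[[[0.0, 1.0], [2.0, 3.0]]]])
print(resize_bilinear_tf_legacy(x, 4, 4)[0, 0])
```

Whether this matches tf.image.resize_bilinear bit for bit would still need to be checked with the comparison script from the issue description.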