The implementation would be easy, but it could help many people who suffer from the headache of calculating how much padding they need.
cc @ezyang @gchanan @zou3519 @albanD @mruberry
This seems worth doing. What is the interface you are proposing? Something like nn.Conv2d(..., padding="same")?
Note that if you are looking for the same behavior as TensorFlow, the implementation will not be that straightforward, because the number of pixels to add depends on the input size. See https://github.com/caffe2/caffe2/blob/master/caffe2/proto/caffe2_legacy.proto for reference.
Thank you for indicating the issue and the reference.
To resolve the issue stated by @fmassa, I propose two interfaces.
First, as @soumith mentioned, the first interface would be like nn.Conv*d(..., padding="same"), calculating the padding on every forward() call.
However, that would be inefficient when the input shape is already known in the initialization phase. Therefore, I suggest an interface like nn.CalcPadConv*d(<almost same parameters as Conv*d>). With it, a user can calculate the padding from the known width and height at initialization, and pass the output (the shape of the padding) to the padding parameter of nn.Conv2d(...).
I'm not sure whether the second proposal is a premature optimization.
What do you think about these? Is there a better name?
I think the biggest source of inefficiency will come from the fact that we will need to add an F.pad layer before every convolution that requires the padding=same case (because the amount of padding might not be the same on the left and right sides); see for example how TensorFlow has to handle that in the cudnn case. So that means that nn.CalcPadConv*d would normally be as expensive as nn.Conv*d(..., padding="same").
This could be made more efficient if we supported different paddings for each side of the convolution (like in Caffe2, so left, right, top, bottom), but cudnn still doesn't support that so we would require the extra padding in those cases.
Also, I think if we add padding="same" to nn.Conv*d, we should probably do the same for nn.*Pool*d, right?
I think what bothers me a bit is that users might expect the behavior of padding=same to be equivalent to TF, but they might not be expecting a performance drop.
What do you think?
Why would that be inefficient? Couldn't we just compute the padding at every forward step? The cost should be tiny, so there's no need to optimize that. Maybe I don't fully understand the semantics, but I can't see why F.pad would be needed.
Making padding dependent on input size is quite bad. We just had an internal discussion about this, with @Yangqing outlining why this is a bad idea for a variety of serialization and efficiency reasons.
@fmassa, what I intended was to calculate a "constant" padding shape in __init__() using nn.CalcPadConv*d(). As you said, this won't work when the calculated padding is odd. Therefore, either an F.pad layer needs to be added, or F.conv*d needs to support odd padding.
EDIT: Then what I suggested should be a function, placed in, say, torch.nn.utils or torch.utils.
As a result, what I suggest is a simple utility function like this (pseudocode):
def calc_pad_conv1d(width, padding='same', check_symmetric=True, ... <params that conv1d has>):
    shape = <calculate padding>
    assert not check_symmetric or <shape is symmetric>, \
        'Calculated padding shape is asymmetric, which is not supported by conv1d. ' \
        'If you just want to get the value, consider using check_symmetric=False.'
    return shape

width = 100  # for example
padding = calc_pad_conv1d(width, ...)
m = nn.Conv1d(..., padding=padding)
Also, the function could be used together with F.pad, at the user's discretion.
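For readers who want something runnable, here is a minimal sketch of such a utility. It assumes TensorFlow-style SAME arithmetic (output width equal to ceil(width / stride), with any extra element placed on the right). The name calc_pad_conv1d and the check_symmetric flag come from the pseudocode above; the kernel_size, stride and dilation parameters and the ValueError are assumptions made for illustration.

```python
def calc_pad_conv1d(width, kernel_size, stride=1, dilation=1, check_symmetric=True):
    """Return (left, right) padding so that the output width is ceil(width / stride)."""
    effective_kernel = (kernel_size - 1) * dilation + 1
    out_width = (width + stride - 1) // stride  # ceil(width / stride)
    total = max(0, (out_width - 1) * stride + effective_kernel - width)
    left = total // 2
    right = total - left  # assumption: the extra element goes on the right
    if check_symmetric and left != right:
        raise ValueError(
            'Calculated padding is asymmetric, which conv1d does not support '
            'directly. Pass check_symmetric=False to get the raw values.')
    return left, right


print(calc_pad_conv1d(100, kernel_size=3))                         # (1, 1)
print(calc_pad_conv1d(100, kernel_size=4, check_symmetric=False))  # (1, 2)
```

A symmetric result can be passed straight to the padding argument of nn.Conv1d; an asymmetric one would have to go through F.pad first.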
@qbx2 maybe I don't fully understand your proposal, but if we want to replicate TensorFlow's behavior I don't think this is enough.
Here is a snippet of what I think mimics TensorFlow's SAME padding (I'm writing it in the functional interface, so that nn.Conv2d can just call into F.conv2d_same_padding):
def conv2d_same_padding(input, weight, bias=None, stride=1, dilation=1, groups=1):
    # NOTE: stride and dilation are indexed below, so they must be passed
    # as (rows, cols) pairs despite the scalar defaults.
    input_rows = input.size(2)
    filter_rows = weight.size(2)
    effective_filter_size_rows = (filter_rows - 1) * dilation[0] + 1
    out_rows = (input_rows + stride[0] - 1) // stride[0]
    # Total padding needed along the rows
    padding_rows = max(0, (out_rows - 1) * stride[0] +
                       effective_filter_size_rows - input_rows)
    rows_odd = (padding_rows % 2 != 0)
    # same for padding_cols and cols_odd
    if rows_odd or cols_odd:
        # An odd total gets its extra row/column at the bottom/right
        input = F.pad(input, [0, int(cols_odd), 0, int(rows_odd)])
    return F.conv2d(input, weight, bias, stride,
                    padding=(padding_rows // 2, padding_cols // 2),
                    dilation=dilation, groups=groups)
It was mostly copy-pasted from TensorFlow code in here and here.
As you can see, there are a lot of hidden things going on there, and that's why I think it might not be worth adding a padding='same' option. And I think not replicating the SAME behavior of TensorFlow is not ideal either.
Thoughts?
@fmassa Yes, you're right. It may be inefficient to calculate the padding on every forward() call.
However, my proposal is NOT to calculate the padding on every forward() call. A researcher (developer) may know the sizes of the images passed to nn.Conv2d before runtime. If they want 'same' padding, they can use the function to calculate the padding required to mimic 'SAME'.
For example, consider a researcher who has images of size 200x200, 300x300, and 400x400. They can calculate the padding for the three cases in the initialization phase and just pass the images to F.pad() with the corresponding padding. Alternatively, they can change the padding field of nn.Conv2d before the forward() call. Refer to this:
>>> import torch
>>> import torch.nn as nn
>>> from torch.autograd import Variable
>>> m = nn.Conv2d(1,1,1)
>>> m(Variable(torch.randn(1,1,2,2))).shape
torch.Size([1, 1, 2, 2])
>>> m.padding = (1, 1)
>>> m(Variable(torch.randn(1,1,2,2))).shape
torch.Size([1, 1, 4, 4])
Yes, I just want to add the "padding calculating utility function" to PyTorch core.
When the researcher wants padding that depends on each input image size, they can combine the function with F.pad() before passing the image to nn.Conv2d. I want to let the code writer decide whether to pad the inputs on every forward() call or not.
Is there any plan of implementing a similar api in pytorch in the near future? People coming from a tensorflow / keras background will certainly appreciate it.
So, a basic padding calculation strategy (which does not give the same results as TensorFlow, but the shapes are similar) is to have
def _get_padding(padding_type, kernel_size):
    assert padding_type in ['SAME', 'VALID']
    if padding_type == 'SAME':
        return tuple((k - 1) // 2 for k in kernel_size)
    return tuple(0 for _ in kernel_size)
Is that what you have in mind @im9uri ?
It's similar to what I had in mind, but as you mentioned previously the calculation gets complicated with stride and dilation.
Also having such an api in other convolution operations such as ConvTranspose2d would be great.
I think that "sliding-window operators" should all support asymmetric padding.
About the "same" argument...
@soumith Can you explain why making padding depend on the input size is bad, please?
If that's a problem, anyway, a pragmatic solution could be to require stride == 1 when using "same". For stride == 1, the padding doesn't depend on the input size and can be computed a single time. The constructor should raise a ValueError if the user attempts to use padding='same' with stride > 1.
I know it's not the cleanest solution, but the constraint sounds reasonable enough to me given that:
[...] stride > 1, and that makes the use of the word "same" a bit misleading IMO;
[...] stride > 1, while if we give "same" its original semantics, it of course doesn't make any sense to use a strided convolution if you want the output to have the same size as the input.
The conv2d documentation gives the explicit formulas for output sizes. Equating e.g. Hout with Hin, one can solve for the padding:
def _get_padding(size, kernel_size, stride, dilation):
    padding = ((size - 1) * (stride - 1) + dilation * (kernel_size - 1)) // 2
    return padding
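The formula can be checked numerically. The sketch below restates the symmetric-padding output-size formula from the conv2d documentation and plugs the computed padding back in; conv_out_size, get_padding and check are helper names introduced here for illustration. It also shows the limitation: the exact requirement is 2 * padding = (size - 1) * (stride - 1) + dilation * (kernel_size - 1), so when that quantity is odd, the floor division leaves the output one element short.

```python
def conv_out_size(size, kernel_size, stride, dilation, padding):
    """Output-size formula from the conv2d documentation, for symmetric padding."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def get_padding(size, kernel_size, stride, dilation):
    return ((size - 1) * (stride - 1) + dilation * (kernel_size - 1)) // 2


def check(size, kernel_size, stride, dilation):
    """Return (output size, whether the exact value of 2 * padding is even)."""
    twice_p = (size - 1) * (stride - 1) + dilation * (kernel_size - 1)
    p = get_padding(size, kernel_size, stride, dilation)
    return conv_out_size(size, kernel_size, stride, dilation, p), twice_p % 2 == 0


print(check(10, 3, 1, 1))  # (10, True):  odd kernel, output matches the input
print(check(10, 4, 1, 1))  # (9, False):  even kernel, one element short
print(check(10, 3, 2, 1))  # (9, False):  stride 2 with an odd exact total
```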
Since same padding means padding = (kernel_size - stride)//2, what if padding = "same" were introduced such that it automatically reads the kernel size and stride (both are already arguments of nn.Conv2d) and applies the padding accordingly?
Here is a very simple Conv2d layer with same padding for reference. It only supports square kernels and stride=1, dilation=1, groups=1.
import torch

class Conv2dSame(torch.nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, bias=True, padding_layer=torch.nn.ReflectionPad2d):
        super().__init__()
        # Total padding per dimension is kernel_size - 1, split as (ka, kb)
        ka = kernel_size // 2
        kb = ka - 1 if kernel_size % 2 == 0 else ka
        self.net = torch.nn.Sequential(
            padding_layer((ka, kb, ka, kb)),
            torch.nn.Conv2d(in_channels, out_channels, kernel_size, bias=bias)
        )

    def forward(self, x):
        return self.net(x)

c = Conv2dSame(1, 3, 5)
print(c(torch.rand((16, 1, 10, 10))).shape)
# torch.Size([16, 3, 10, 10])
If this is still being evaluated for being added to PyTorch, then regarding the tradeoffs between complexity / inefficiency vs. ease-of-use for developers:
In the road to 1.0 blog post, it states:
PyTorch’s central goal is to provide a great platform for research and hackability. So, while we add all these [production-use] optimizations, we’ve been working with a hard design constraint to never trade these off against usability.
Anecdotally, I come from a background of using Keras as well as the original tf.layers / estimator APIs. All have support for same padding. I'm currently reimplementing a convnet I had originally written in TF with PyTorch, and the fact that I've had to build in the arithmetic for zero-padding myself has cost me about a half-day of time.
If the "central goal" really is focused on usability, then I'd argue that even if there's an efficiency hit to computing zero-padding on every forward pass (as mentioned above), the time saved in terms of developer efficiency and maintainability (e.g. not having to write custom code to compute zero padding) may be worth the tradeoff. Thoughts?
I would use this feature
It doesn't make sense to me why an optional API of padding=SAME can't be offered. If someone is willing to incur the additional cost of padding, then let them do so. For many researchers, quick prototyping is a requirement.
Yes, if someone can please add and approve this, it would be great.
Definitely add this, conner wants it.
Does PyTorch support this now? Can we do the same thing as the first layers of VGG and set padding = (kernel_size - 1) / 2?
The VGG network keeps the output size unchanged within the first group. You can then use stride to resize the feature map. Does that sound OK?
Here is one example of a conv2d with same padding, from deepfakes:
# Modify the conv2d function to use same padding.
# Code adapted from @fmassa in 'https://github.com/pytorch/pytorch/issues/3867'
# and from the TensorFlow source code.
import math

import torch
import torch.utils.data
from torch.nn import functional as F
from torch.nn.functional import pad
from torch.nn.modules import Module
from torch.nn.modules.utils import _single, _pair, _triple
from torch.nn.parameter import Parameter
class _ConvNd(Module):

    def __init__(self, in_channels, out_channels, kernel_size, stride,
                 padding, dilation, transposed, output_padding, groups, bias):
        super(_ConvNd, self).__init__()
        if in_channels % groups != 0:
            raise ValueError('in_channels must be divisible by groups')
        if out_channels % groups != 0:
            raise ValueError('out_channels must be divisible by groups')
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.dilation = dilation
        self.transposed = transposed
        self.output_padding = output_padding
        self.groups = groups
        if transposed:
            self.weight = Parameter(torch.Tensor(
                in_channels, out_channels // groups, *kernel_size))
        else:
            self.weight = Parameter(torch.Tensor(
                out_channels, in_channels // groups, *kernel_size))
        if bias:
            self.bias = Parameter(torch.Tensor(out_channels))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        n = self.in_channels
        for k in self.kernel_size:
            n *= k
        stdv = 1. / math.sqrt(n)
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)

    def __repr__(self):
        s = ('{name}({in_channels}, {out_channels}, kernel_size={kernel_size}'
             ', stride={stride}')
        if self.padding != (0,) * len(self.padding):
            s += ', padding={padding}'
        if self.dilation != (1,) * len(self.dilation):
            s += ', dilation={dilation}'
        if self.output_padding != (0,) * len(self.output_padding):
            s += ', output_padding={output_padding}'
        if self.groups != 1:
            s += ', groups={groups}'
        if self.bias is None:
            s += ', bias=False'
        s += ')'
        return s.format(name=self.__class__.__name__, **self.__dict__)
class Conv2d(_ConvNd):

    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1, bias=True):
        kernel_size = _pair(kernel_size)
        stride = _pair(stride)
        padding = _pair(padding)
        dilation = _pair(dilation)
        super(Conv2d, self).__init__(
            in_channels, out_channels, kernel_size, stride, padding, dilation,
            False, _pair(0), groups, bias)

    def forward(self, input):
        return conv2d_same_padding(input, self.weight, self.bias, self.stride,
                                   self.padding, self.dilation, self.groups)
# Custom conv2d, because PyTorch doesn't have a padding='same' option.
# The padding argument is kept for signature compatibility but is ignored.
def conv2d_same_padding(input, weight, bias=None, stride=1, padding=1, dilation=1, groups=1):
    # Accept plain ints as well as (rows, cols) pairs
    stride = _pair(stride)
    dilation = _pair(dilation)
    input_rows, input_cols = input.size(2), input.size(3)
    filter_rows, filter_cols = weight.size(2), weight.size(3)
    out_rows = (input_rows + stride[0] - 1) // stride[0]
    out_cols = (input_cols + stride[1] - 1) // stride[1]
    padding_rows = max(0, (out_rows - 1) * stride[0] +
                       (filter_rows - 1) * dilation[0] + 1 - input_rows)
    # The columns use the width, stride[1] and dilation[1], not the row values
    padding_cols = max(0, (out_cols - 1) * stride[1] +
                       (filter_cols - 1) * dilation[1] + 1 - input_cols)
    rows_odd = (padding_rows % 2 != 0)
    cols_odd = (padding_cols % 2 != 0)
    # An odd total gets its extra row/column at the bottom/right
    if rows_odd or cols_odd:
        input = pad(input, [0, int(cols_odd), 0, int(rows_odd)])
    return F.conv2d(input, weight, bias, stride,
                    padding=(padding_rows // 2, padding_cols // 2),
                    dilation=dilation, groups=groups)
Just dropping by to say I'd also very much appreciate this. Currently porting a simple model over from tensorflow and the calculations are taking a very long time for me to figure out...
Looks like this thread just died out. Given the number of thumbs up here, it would be really great to add this feature for faster prototyping.
I'll write a proposal for this and we can find someone to implement it.
I'm putting this against the v1.1 milestone.
Thank you, you are awesome! I also filed a separate feature request to make the padding argument accept a 4-tuple. This would allow for asymmetric as well as symmetric padding, which is also a good low-cost route to get halfway there.
@soumith It would be nice to have a SAME padding mode in PyTorch.
@soumith How about using a compile-type interface?
model=torch.compile(model,input_shape=(3,224,224))
I made a Conv2D with same padding that supports dilation and strides, based on how TensorFlow does theirs. This one calculates it in real time though, if you want to precalculate it just move the padding to init() and have an input size parameter.
import torch as tr
import math

class Conv2dSame(tr.nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1, dilation=1):
        super(Conv2dSame, self).__init__()
        self.F = kernel_size
        self.S = stride
        self.D = dilation
        self.layer = tr.nn.Conv2d(in_channels, out_channels, kernel_size, stride, dilation=dilation)

    def forward(self, x_in):
        N, C, H, W = x_in.shape
        H2 = math.ceil(H / self.S)
        W2 = math.ceil(W / self.S)
        # Total padding for rows (height) and columns (width), clamped at zero
        Pr = max(0, (H2 - 1) * self.S + (self.F - 1) * self.D + 1 - H)
        Pc = max(0, (W2 - 1) * self.S + (self.F - 1) * self.D + 1 - W)
        # ZeroPad2d takes (left, right, top, bottom), so the column padding comes first
        x_pad = tr.nn.ZeroPad2d((Pc//2, Pc - Pc//2, Pr//2, Pr - Pr//2))(x_in)
        x_out = self.layer(x_pad)
        return x_out
Ex1:
Input shape: (1, 3, 96, 96)
Filters: 64
Size: 9x9
Conv2dSame(3, 64, 9)
Padded shape: (1, 3, 104, 104)
Output shape: (1, 64, 96, 96)
Ex2:
Same as before, but with stride=2
Conv2dSame(3, 64, 9, 2)
Padded shape = (1, 3, 103, 103)
Output shape = (1, 64, 48, 48)
@jpatts I believe your output shape calculation is wrong; it should be ceil(input_dimension / stride). Integer division in Python is floor division, so your code should give a different result from TensorFlow for e.g. h=w=28, stride=3, kernel_size=1.
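To make the floor-versus-ceil point concrete, here is a small sketch using the h=28, stride=3, kernel_size=1 case mentioned above. total_padding is a helper name introduced for illustration; it applies the TensorFlow-style padding rule to whichever output size it is given, without clamping, so a negative result is visible.

```python
import math


def total_padding(size, kernel_size, stride, out_size, dilation=1):
    """Total SAME-style padding for a target output size (may be negative)."""
    return (out_size - 1) * stride + (kernel_size - 1) * dilation + 1 - size


size, stride, kernel_size = 28, 3, 1
out_floor = size // stride           # floor division
out_ceil = math.ceil(size / stride)  # what SAME padding is meant to produce

print(out_floor, total_padding(size, kernel_size, stride, out_floor))  # 9 -3
print(out_ceil, total_padding(size, kernel_size, stride, out_ceil))    # 10 0
```

With the floor-based output size the rule asks for negative padding, which a padding layer would treat as cropping, whereas the ceil-based size needs no padding at all in this case.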
Here is a variant that does the calculation beforehand:
def pad_same(in_dim, ks, stride, dilation=1):
    """
    References:
          https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/common_shape_fns.h
          https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/common_shape_fns.cc#L21
    """
    assert stride > 0
    assert dilation >= 1
    effective_ks = (ks - 1) * dilation + 1
    out_dim = (in_dim + stride - 1) // stride
    p = max(0, (out_dim - 1) * stride + effective_ks - in_dim)

    padding_before = p // 2
    padding_after = p - padding_before
    return padding_before, padding_after
If the input dimension is known and not calculated on the fly, this can be used e.g.:
from torch import nn

# Pass this to nn.Sequential
def conv2d_samepad(in_dim, in_ch, out_ch, ks, stride, dilation=1, bias=True):
    pad_before, pad_after = pad_same(in_dim, ks, stride, dilation)
    if pad_before == pad_after:
        return [nn.Conv2d(in_ch, out_ch, ks, stride, pad_after, dilation, bias=bias)]
    else:
        return [nn.ZeroPad2d((pad_before, pad_after, pad_before, pad_after)),
                nn.Conv2d(in_ch, out_ch, ks, stride, 0, dilation, bias=bias)]
However, in this case some book-keeping needs to be done for the input dimension (this is the core issue), so if you use the above you may find the following useful:
def conv_outdim(in_dim, padding, ks, stride, dilation):
    if isinstance(padding, int) or isinstance(padding, tuple):
        return conv_outdim_general(in_dim, padding, ks, stride, dilation)
    elif isinstance(padding, str):
        assert padding in ['same', 'valid']
        if padding == 'same':
            return conv_outdim_samepad(in_dim, stride)
        else:
            return conv_outdim_general(in_dim, 0, ks, stride, dilation)
    else:
        raise TypeError('Padding can be int/tuple or str=same/valid')

def conv_outdim_general(in_dim, padding, ks, stride, dilation=1):
    # See https://arxiv.org/pdf/1603.07285.pdf, eq (15)
    return ((in_dim + 2 * padding - ks - (ks - 1) * (dilation - 1)) // stride) + 1

def conv_outdim_samepad(in_dim, stride):
    return (in_dim + stride - 1) // stride
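As a rough illustration of that book-keeping, the sketch below threads one spatial dimension through a stack of layers. It condenses the output-size helpers above into a single out_dim function (a name introduced here), and the layer specifications are hypothetical.

```python
def out_dim(in_dim, padding, ks, stride, dilation=1):
    """Output size for padding given as an int or as 'same' / 'valid'."""
    if padding == 'same':
        return (in_dim + stride - 1) // stride
    if padding == 'valid':
        padding = 0
    return (in_dim + 2 * padding - ks - (ks - 1) * (dilation - 1)) // stride + 1


# Hypothetical layer specs: (padding, kernel size, stride)
layers = [
    ('same', 3, 1),
    ('same', 3, 2),
    ('valid', 5, 1),
    (1, 3, 2),
]

dims = [224]
for padding, ks, stride in layers:
    dims.append(out_dim(dims[-1], padding, ks, stride))

print(dims)  # [224, 224, 112, 108, 54]
```

Each entry of dims is the in_dim that the next layer's padding calculation would need.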
@mirceamironenco thanks for pointing that out, I made this quick and dirty and never checked. Updated to use ceiling instead
@harritaylor Agree, this feature would definitely simplify porting of Keras/TF models into PyTorch. Every once in a while, I still use "manual" calculations of padding size to build my same-padded layers.
@kylemcdonald
In your Conv2dSame layer, should it be kb = ka - 1 if kernel_size % 2 else ka, or not?
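One way to settle this is to check the arithmetic: with stride=1 and dilation=1, the two pads must add up to kernel_size - 1 for the output to keep the input size. The sketch below compares the original condition with the variant asked about; the function names are introduced here for illustration.

```python
def pads_original(kernel_size):
    ka = kernel_size // 2
    kb = ka - 1 if kernel_size % 2 == 0 else ka
    return ka, kb


def pads_variant(kernel_size):
    ka = kernel_size // 2
    kb = ka - 1 if kernel_size % 2 else ka
    return ka, kb


def out_size(size, kernel_size, pads):
    # Output length of a convolution with stride=1, dilation=1
    return size + sum(pads) - kernel_size + 1


for k in (2, 3, 4, 5):
    print(k, out_size(10, k, pads_original(k)), out_size(10, k, pads_variant(k)))
# 2 10 11
# 3 10 9
# 4 10 11
# 5 10 9
```

Under these assumptions the original condition (kernel_size % 2 == 0) preserves the size for both even and odd kernels, while the variant is off by one in each case.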
Will this also apply to Conv1d?
Maybe adding a new padding method to the class ConvND would be an elegant choice; by overriding the method, the padding scheme could easily be extended.
I can probably take this if @soumith ever wrote that proposal or if someone summarizes what needs to be done. There's been a lot of discussion above and I'm not sure what we've settled on. Are we calculating padding dependent on input data or not, do we need to implement padding="same" for pool as well, etc.?
I'd like to add causal padding as well, and please also add this to conv1d.
I stopped following the comments at some point, but I think this feature is very well done in Keras. You should follow it exactly.
@Chillee here you go:
We should add padding to the following layers:
For the first PR, let's keep it simple and just stick to Conv*d.
The complexity discussed above is around the layer becoming dynamic in nature once a same padding option is written. That is, it goes from the parameters of the layer being statically known, which is great for model export (for example ONNX export), to the parameters of the layer being dynamic. In this case, the dynamic parameter is padding.
While this looks pretty harmless, non-staticness gets pretty important in limited runtimes, like mobile or exotic-hardware runtimes, where for example you want to do static shape analysis and optimization.
The other practical downside is that this dynamically calculated padding is not always symmetric anymore, because depending on the size / stride of the kernel, the dilation factor, and the input size, the padding might have to be asymmetric (i.e. a different padding amount on the left side vs. the right). It would mean that you cannot use CuDNN kernels, for example.
Currently, the signature of Conv2d is:
torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')
Here, we support padding as an int or a tuple of ints (i.e. one for each dimension of height / width).
We should support an additional overload for padding that would take a string, with the value same.
The same padding should pad the input before giving it to the convolution in such a way that the output size is the same as the input size.
When 'same' is given to padding, we have to calculate the amount of left and right padding needed in each dimension.
There are two cases to consider after the required L (left) and R (right) padding is calculated:
L == R (symmetric): call F.conv2d with a padding value equal to L.
L != R (asymmetric): compute input_padded = F.pad(input, ...) and send the input_padded into F.conv2d.
Needless to say, it has to be tested to also work on the JIT path.
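The two cases can be sketched for a single dimension in plain Python. This is only an illustration of the decision, not the proposed implementation: plan_same_padding is a hypothetical helper, it uses the TensorFlow-style rule discussed earlier in the thread, and placing the extra element on the right is an assumption.

```python
def plan_same_padding(in_size, kernel_size, stride=1, dilation=1):
    """Decide how 'same' padding would be applied along one dimension."""
    out_size = (in_size + stride - 1) // stride  # ceil(in_size / stride)
    total = max(0, (out_size - 1) * stride
                + (kernel_size - 1) * dilation + 1 - in_size)
    left = total // 2
    right = total - left  # assumption: the extra element goes on the right
    if left == right:
        # Symmetric: the convolution's own padding argument is enough
        return ('conv_padding', left)
    # Asymmetric: pad explicitly first, then convolve with padding=0
    return ('explicit_pad', (left, right))


print(plan_same_padding(10, 3))  # ('conv_padding', 1)
print(plan_same_padding(10, 4))  # ('explicit_pad', (1, 2))
```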
@Chillee for reference, here is a potential implementation to get inspiration from: https://github.com/mlperf/inference/blob/master/others/edge/object_detection/ssd_mobilenet/pytorch/utils.py#L40
It matched the TF implementation for the configurations it was tested on, but the testing was not exhaustive.
@soumith Some quick questions:
(1) [...] functional.conv2d? The design you wrote seems to imply that it shouldn't. There's nothing about padding="same" that seems like it should be specific to layers. (EDIT: Nvm, didn't realize the F.conv2d impl I was looking at was the quantized one.)
(2) [...] valid padding mode is simply equivalent to ours with padding=0, right?
Also, it doesn't seem that there will be an easy fix for the user to deal with asymmetric padding. The full rule for determining the amount of padding that needs to occur is (ceil(x/stride) - 1)*stride + (filter-1)*dilation + 1 - x along a dimension. In particular, we will need to do asymmetric padding when this is not a multiple of 2. As a counterexample to your hope that this only happens with even-sized filters, take input = 11, stride=3, filter=3, dilation=1, which needs a total padding of 1. I don't see any simple rules for resolving the situations in which this can happen.
Furthermore, we won't be able to statically determine the padding except in the case when stride=1, as then ceil(x/stride) = x, and we have padding equal to (filter-1)*dilation.
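The rule is easy to evaluate directly. The sketch below computes the total padding along one dimension for a few inputs and confirms that the result is independent of the input size when stride=1; total_same_padding is a name introduced here for illustration.

```python
import math


def total_same_padding(x, stride, filt, dilation):
    """Total padding along one dimension, following the rule quoted above."""
    return (math.ceil(x / stride) - 1) * stride + (filt - 1) * dilation + 1 - x


for x in (10, 11, 12):
    print(x, total_same_padding(x, stride=3, filt=3, dilation=1))
# 10 2
# 11 1
# 12 0

# With stride=1 the total reduces to (filt - 1) * dilation for every input size
static = {total_same_padding(x, stride=1, filt=3, dilation=2) for x in range(1, 100)}
print(static)  # {4}
```

For stride=3, filter=3, dilation=1 the total is even for an input of 10 and odd for an input of 11, so odd totals do occur with odd-sized filters once the stride is larger than 1.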
@Chillee about (1), no reason, I hadn't thought through the implications -- perf or otherwise.
(2) Yes.
Furthermore, we won't be able to statically determine the padding except in the case when stride=1, as then ceil(x/stride) = x, and we have padding equal to (filter-1)*dilation
Yes, but stride=1 is common-enough and the benefits of static padding good enough that we should definitely handle it specially.
About asymmetric padding, oh welll.....
It doesn't make sense to me why can't an optional API of padding=SAME be offered? If someone is willing to incur the additional cost of padding then let them do so. For many researchers, quick prototyping is a requirement.
Yes,
Agree! I got stuck in this fuckin “padding” for 4 hours.
Do we have any update about solution for this issue?
Wow, and here I thought that PyTorch would be easier than Keras/TensorFlow 2.0...
@zwep there is a bit more effort in getting started. You have to write your training loop, which can be annoying, and you have to write layers more explicitly. Once you get that done (once), you can advance much farther on the actual improvement beyond that.
My rule of thumb is: use Keras if it's something you have done a million times / super standard.
Use PyTorch any time there is research and development involved.
Here is my code for padded 1d convs:
import torch
from torch import nn
import numpy as np
import torch.nn.functional as F  # F.pad lives in torch.nn.functional


class Conv1dSamePad(nn.Module):
    def __init__(self, in_channels, out_channels, filter_len, stride=1, **kwargs):
        super(Conv1dSamePad, self).__init__()
        self.filter_len = filter_len
        self.conv = nn.Conv1d(in_channels, out_channels, filter_len, padding=(self.filter_len // 2), stride=stride,
                              **kwargs)
        nn.init.xavier_uniform_(self.conv.weight)
        # nn.init.constant_(self.conv.bias, 1 / out_channels)

    def forward(self, x):
        if self.filter_len % 2 == 1:
            return self.conv(x)
        else:
            # An even filter with padding filter_len // 2 yields one extra output; drop it
            return self.conv(x)[:, :, :-1]


class Conv1dCausalPad(nn.Module):
    def __init__(self, in_channels, out_channels, filter_len, **kwargs):
        super(Conv1dCausalPad, self).__init__()
        self.filter_len = filter_len
        self.conv = nn.Conv1d(in_channels, out_channels, filter_len, **kwargs)
        nn.init.xavier_uniform_(self.conv.weight)

    def forward(self, x):
        # Pad only on the left so that no output depends on future inputs
        padding = (self.filter_len - 1, 0)
        return self.conv(F.pad(x, padding))


class Conv1dPad(nn.Module):
    def __init__(self, in_channels, out_channels, filter_len, padding="same", groups=1):
        super(Conv1dPad, self).__init__()
        if padding not in ["same", "causal"]:
            raise Exception("invalid padding type %s" % padding)
        self.conv = Conv1dCausalPad(in_channels, out_channels, filter_len, groups=groups) \
            if padding == "causal" else Conv1dSamePad(in_channels, out_channels, filter_len, groups=groups)

    def forward(self, x):
        return self.conv(x)
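The idea behind the causal variant can be shown without PyTorch. The sketch below is a pure-Python 1D cross-correlation (the operation nn.Conv1d computes) with len(kernel) - 1 zeros padded on the left only; causal_conv1d is a name introduced for illustration and it handles a single channel with stride 1 and no dilation.

```python
def causal_conv1d(signal, kernel):
    """Cross-correlate signal with kernel after left-padding with zeros."""
    pad = len(kernel) - 1
    padded = [0] * pad + list(signal)
    # Output t only reads signal[t - pad] .. signal[t], never later samples
    return [sum(k * padded[t + i] for i, k in enumerate(kernel))
            for t in range(len(signal))]


print(causal_conv1d([1, 2, 3, 4], [1, 1]))  # [1, 3, 5, 7]
```

The output has the same length as the input, and changing a later input sample leaves all earlier outputs unchanged.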
@danFromTelAviv He man, thanks for the code. Will keep that pytorch philosophy in mind!
It's 2020. Still no padding='same' in PyTorch?
This is one way to get same padding working for any kernel size, stride and dilation (even kernel sizes work too).
import math

from torch import nn


class Conv1dSame(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1, dilation=1):
        super().__init__()
        # Symmetric padding can overshoot by one element; remember whether to trim it
        self.cut_last_element = (kernel_size % 2 == 0 and stride == 1 and dilation % 2 == 1)
        self.padding = math.ceil((1 - stride + dilation * (kernel_size-1))/2)
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size, padding=self.padding, stride=stride, dilation=dilation)

    def forward(self, x):
        if self.cut_last_element:
            return self.conv(x)[:, :, :-1]
        else:
            return self.conv(x)
I want the "same padding" feature in nn.Conv2d too.
BTW, in addition to the perf/serialization concerns discussed above, there are correctness/accuracy reasons why the size-dependent "same" padding mode in TF is not a good default. I discussed this in https://github.com/tensorflow/tensorflow/issues/18213 and showed that much of Google's own code actually uses a size-independent "same" padding mode instead.
It seems there is no ongoing work on this issue right now, but if there is, I hope it's a size-independent solution.
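To illustrate what size-dependent means here, the sketch below compares the TensorFlow-style rule with a fixed per-side padding of (kernel - 1) // 2 for kernel=3 and stride=2. The fixed rule is just one possible size-independent convention, assumed here for illustration rather than taken from the linked issue; both function names are introduced here.

```python
def tf_same_pads(size, kernel, stride):
    """(before, after) padding under the TensorFlow-style SAME rule."""
    out = (size + stride - 1) // stride
    total = max(0, (out - 1) * stride + kernel - size)
    return total // 2, total - total // 2


def fixed_pads(kernel):
    """Size-independent alternative (assumed): (kernel - 1) // 2 on each side."""
    p = (kernel - 1) // 2
    return p, p


for size in (224, 225):
    print(size, tf_same_pads(size, kernel=3, stride=2), fixed_pads(3))
# 224 (0, 1) (1, 1)
# 225 (1, 1) (1, 1)
```

Under the TensorFlow-style rule the padding before the first element changes with the parity of the input size, so the sampling grid of the strided convolution shifts relative to the image, whereas the fixed rule pads the same way for every input.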
Hi, @ppwwyyxx Yuxin, thank you for the response.
I think the implementation from @McHughes288 is good, and I would like to hear your opinion of it.
Here is my solution for Conv1D SAME padding (it only works correctly when dilation==1 and groups==1; it gets more complicated when you consider dilation and groups):
import torch.nn.functional as F
from torch import nn


class Conv1dSamePadding(nn.Conv1d):
    """Represents the "Same" padding functionality from Tensorflow.
    NOTE: Only work correctly when dilation == 1, groups == 1 !!!
    """

    def forward(self, input):
        size, kernel, stride = input.size(-1), self.weight.size(
            2), self.stride[0]
        # Total padding so that the output length is ceil(size / stride)
        out_size = (size + stride - 1) // stride
        padding = max(0, (out_size - 1) * stride + kernel - size)
        if padding != 0:
            # pad left by padding // 2, pad right by padding - padding // 2
            # in Tensorflow, one more padding value(default: 0) is on the right when needed
            input = F.pad(input, (padding // 2, padding - padding // 2))
        return F.conv1d(input=input,
                        weight=self.weight,
                        bias=self.bias,
                        stride=stride,
                        dilation=1,
                        groups=1)
@Chillee did you intend to continue working on this feature? I'm going to unassign you for now so that we can better track progress of this issue, please feel free to reassign if you are still working on it.
After reading the code of @wizcheu, I created another version of conv1d with padding='same':
import torch
import torch.nn.functional as F
from torch import nn


class Conv1dPaddingSame(nn.Module):
    '''pytorch version of padding=='same'
    ============== ATTENTION ================
    Only work when dilation == 1, groups == 1
    =========================================
    '''

    def __init__(self, in_channels, out_channels, kernel_size, stride):
        super(Conv1dPaddingSame, self).__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        self.weight = nn.Parameter(torch.rand((out_channels,
                                               in_channels, kernel_size)))
        # nn.Conv1d default set bias=True,so create this param
        self.bias = nn.Parameter(torch.rand(out_channels))

    def forward(self, x):
        batch_size, num_channels, length = x.shape
        # out_length = ceil(length / stride)
        if length % self.stride == 0:
            out_length = length // self.stride
        else:
            out_length = length // self.stride + 1
        # Total padding needed. The padding argument of F.conv1d is symmetric,
        # so an odd total is applied with F.pad (extra element on the right).
        total = max(0, out_length * self.stride +
                    self.kernel_size - length - self.stride)
        x = F.pad(x, (total // 2, total - total // 2))
        out = F.conv1d(input=x,
                       weight=self.weight,
                       stride=self.stride,
                       bias=self.bias)
        return out
Is there any update on this?
any updates??
@peterbell10 has linked a draft PR that you can follow.