Pytorch: [Feature Request] Implement "same" padding for convolution operations?

Created on 25 Nov 2017  ·  59 comments  ·  Source: pytorch/pytorch

๊ตฌํ˜„์€ ์‰ฝ์ง€๋งŒ ํ•„์š”ํ•œ ํŒจ๋”ฉ ์ˆ˜๋ฅผ ๊ณ„์‚ฐํ•˜๋Š” ๋ฐ ์–ด๋ ค์›€์„ ๊ฒช๋Š” ๋งŽ์€ ์‚ฌ๋žŒ๋“ค์„ ๋„์šธ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

cc @ezyang @gchanan @zou3519 @albanD @mruberry

enhancement high priority convolution nn triaged

Most helpful comment

Is there any plan to implement a similar API in pytorch in the near future? People coming from a tensorflow / keras background would certainly appreciate it.

All 59 comments

์ด๊ฒƒ์€ ํ•  ๊ฐ€์น˜๊ฐ€ ์žˆ๋Š” ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค. ๋‹น์‹ ์ด ์ œ์•ˆํ•˜๋Š” ์ธํ„ฐํŽ˜์ด์Šค๋Š” ๋ฌด์—‡์ž…๋‹ˆ๊นŒ? nn.Conv2d(..., padding="same") ์ฒ˜๋Ÿผ?

TensorFlow์˜ ๋™์ผํ•œ ๋™์ž‘์„ ์ฐพ๊ณ  ์žˆ๋Š” ๊ฒฝ์šฐ ์ถ”๊ฐ€ํ•  ํ”ฝ์…€ ์ˆ˜๊ฐ€ ์ž…๋ ฅ ํฌ๊ธฐ์— ๋”ฐ๋ผ ๋‹ฌ๋ผ์ง€๊ธฐ ๋•Œ๋ฌธ์— ๊ตฌํ˜„์ด ๊ฐ„๋‹จํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค. ์ฐธ์กฐ๋Š” https://github.com/caffe2/caffe2/blob/master/caffe2/proto/caffe2_legacy.proto ๋ฅผ ์ฐธ์กฐํ•˜์‹ญ์‹œ์˜ค.

Thanks for pointing out the problem and the reference.
To address the issue @fmassa mentioned, I propose two interfaces.
First, as @soumith mentioned, the first interface would look like nn.Conv*d(..., padding="same"), computing the padding on every forward() call.
However, this would be inefficient when the input shape is known at the initialization stage. So I also suggest an interface like nn.CalcPadConv*d(<almost same parameters as Conv*d>). With it, a user can compute the padding at initialization from the known width and height, and pass the output (the padding shape) to the padding parameter of nn.Conv2d(...).
I'm not sure whether the second proposal might be a premature optimization.
What do you think about these? Any ideas for better names?

๋น„ํšจ์œจ์˜ ๊ฐ€์žฅ ํฐ ์›์ธ์€ padding=same ์ผ€์ด์Šค๋ฅผ ํ•„์š”๋กœ ํ•˜๋Š” ๋‹ค๋ฅธ ๋ชจ๋“  ์ปจ๋ณผ๋ฃจ์…˜ ์ „์— F.pad ๋ ˆ์ด์–ด๋ฅผ ์ถ”๊ฐ€ํ•ด์•ผ ํ•œ๋‹ค๋Š” ์‚ฌ์‹ค์—์„œ ๋น„๋กฏ๋  ๊ฒƒ์ด๋ผ๊ณ  ์ƒ๊ฐํ•ฉ๋‹ˆ๋‹ค TensorFlow๊ฐ€ cudnn ์ผ€์ด์Šค์—์„œ ์ด๋ฅผ ์ฒ˜๋ฆฌํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ์ฐธ์กฐํ•˜์‹ญ์‹œ์˜ค. ๋”ฐ๋ผ์„œ nn.CalcPadConv*d ๋Š” ์ผ๋ฐ˜์ ์œผ๋กœ nn.Conv*d(..., padding="same") ๋งŒํผ ๋น„์Œ‰๋‹ˆ๋‹ค.

์ด๊ฒƒ์€ ์ปจ๋ณผ๋ฃจ์…˜์˜ ๊ฐ ๋ฉด์— ๋Œ€ํ•ด ๋‹ค๋ฅธ ํŒจ๋”ฉ์„ ์ง€์›ํ•˜๋ฉด ๋” ํšจ์œจ์ ์ผ ์ˆ˜ ์žˆ์ง€๋งŒ(Caffe2์—์„œ์™€ ๊ฐ™์ด ์™ผ์ชฝ, ์˜ค๋ฅธ์ชฝ, ์œ„์ชฝ, ์•„๋ž˜์ชฝ) cudnn์€ ์—ฌ์ „ํžˆ โ€‹โ€‹์ด๋ฅผ ์ง€์›ํ•˜์ง€ ์•Š์œผ๋ฏ€๋กœ ์ด๋Ÿฌํ•œ ๊ฒฝ์šฐ์— ์ถ”๊ฐ€ ํŒจ๋”ฉ์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค. .

Also, if we do add padding="same" to nn.Conv*d, I think we should probably do the same for nn.*Pool*d, right?

์ œ ์ƒ๊ฐ์—๋Š” ์‚ฌ์šฉ์ž๊ฐ€ padding=same ์˜ ๋™์ž‘์ด TF์™€ ๋™์ผํ•  ๊ฒƒ์œผ๋กœ ์˜ˆ์ƒํ•  ์ˆ˜ ์žˆ์ง€๋งŒ ์„ฑ๋Šฅ ์ €ํ•˜๋ฅผ ๊ธฐ๋Œ€ํ•˜์ง€ ์•Š์„ ์ˆ˜๋„ ์žˆ๋‹ค๋Š” ์ ์ด ์ €๋ฅผ ์•ฝ๊ฐ„ ๊ดด๋กญํžˆ๋Š” ๊ฒƒ ๊ฐ™์Šต๋‹ˆ๋‹ค.

์–ด๋–ป๊ฒŒ ์ƒ๊ฐํ•˜๋‚˜์š”?

์™œ ๊ทธ๊ฒƒ์ด ๋น„ํšจ์œจ์ ์ž…๋‹ˆ๊นŒ? ๋ชจ๋“  ์ „์ง„ ๋‹จ๊ณ„์—์„œ ํŒจ๋”ฉ์„ ๊ณ„์‚ฐํ•  ์ˆ˜ ์—†์Šต๋‹ˆ๊นŒ? ๋น„์šฉ์ด ์ž‘์•„์•ผ ํ•˜๋ฏ€๋กœ ์ตœ์ ํ™”ํ•  ํ•„์š”๊ฐ€ ์—†์Šต๋‹ˆ๋‹ค. ์˜๋ฏธ๋ฅผ ์™„์ „ํžˆ ์ดํ•ดํ•˜์ง€ ๋ชปํ•˜๋Š” ๊ฒƒ์ผ ์ˆ˜๋„ ์žˆ์ง€๋งŒ F.pad ์ด ํ•„์š”ํ•œ ์ด์œ ๋ฅผ ์•Œ ์ˆ˜ ์—†์Šต๋‹ˆ๋‹ค.

์ž…๋ ฅ ํฌ๊ธฐ์— ๋”ฐ๋ผ ํŒจ๋”ฉ์„ ๋งŒ๋“œ๋Š” ๊ฒƒ์€ ๋งค์šฐ ๋‚˜์ฉ๋‹ˆ๋‹ค. ๋‹ค์–‘ํ•œ ์ง๋ ฌํ™” ๋ฐ ํšจ์œจ์„ฑ ์ด์œ ๋กœ ์ด๊ฒƒ์ด ์™œ ๋‚˜์œ ์ƒ๊ฐ์ธ์ง€ ์„ค๋ช…ํ•˜๋Š” @Yangqing ๊ณผ ํ•จ๊ป˜

@fmassa, what I intended was to compute the "same" padding shape in __init__() of nn.CalcPadConv*d(). As you said, this approach only fails when the computed padding is odd. So adding an F.pad layer, or F.conv*d support for odd padding, would help.

EDIT: what I proposed should then be a function, placed in e.g. torch.nn.utils or torch.utils.

๊ฒฐ๊ณผ์ ์œผ๋กœ ๋‚ด๊ฐ€ ์ œ์•ˆํ•˜๋Š” ๊ฒƒ์€ (์˜์‚ฌ ์ฝ”๋“œ)์™€ ๊ฐ™์€ ๊ฐ„๋‹จํ•œ ์œ ํ‹ธ๋ฆฌํ‹ฐ ๊ธฐ๋Šฅ์ž…๋‹ˆ๋‹ค.

def calc_pad_conv1d(width, padding='same', check_symmetric=True, ... <params that conv1d has>):
    shape = <calculate padding>

    assert not check_symmetric or <shape is symmetric>, \
        'Calculated padding shape is asymmetric, which is not supported by conv1d. ' \
        'If you just want to get the value, consider using check_symmetric=False.'

    return shape


width = 100  # for example
padding = calc_pad_conv1d(width, ...)
m = nn.Conv1d(..., padding=padding)

๋˜ํ•œ ์ด ๊ธฐ๋Šฅ์€ ์‚ฌ์šฉ์ž์—๊ฒŒ ์œ ๋ฆฌํ•˜๊ฒŒ F.pad ์™€ ํ•จ๊ป˜ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

@qbx2 ์•„๋งˆ๋„ ๊ท€ํ•˜์˜ ์ œ์•ˆ์„ ์™„์ „ํžˆ ์ดํ•ดํ•˜์ง€ ๋ชปํ•  ์ˆ˜๋„ ์žˆ์ง€๋งŒ TensorFlow ๋™์ž‘์„ ๋ณต์ œํ•˜๋ ค๋Š” ๊ฒฝ์šฐ ์ด๊ฒƒ์œผ๋กœ ์ถฉ๋ถ„ํ•˜์ง€ ์•Š๋‹ค๊ณ  ์ƒ๊ฐํ•ฉ๋‹ˆ๋‹ค.

๋‹ค์Œ์€ TensorFlow SAME ํŒจ๋”ฉ์„ ๋ชจ๋ฐฉํ•œ ๊ฒƒ์œผ๋กœ ์ƒ๊ฐ๋˜๋Š” ์Šค๋‹ˆํŽซ์ž…๋‹ˆ๋‹ค( nn.Conv2d ๊ฐ€ F.conv2d_same_padding ํ˜ธ์ถœํ•  ์ˆ˜ ์žˆ๋„๋ก ๊ธฐ๋Šฅ ์ธํ„ฐํŽ˜์ด์Šค์— ์ ์–ด ๋‘ก๋‹ˆ๋‹ค).

def conv2d_same_padding(input, weight, bias=None, stride=1, dilation=1, groups=1):
  input_rows = input.size(2)
  filter_rows = weight.size(2)
  effective_filter_size_rows = (filter_rows - 1) * dilation[0] + 1
  out_rows = (input_rows + stride[0] - 1) // stride[0]
  padding_rows = max(0, (out_rows - 1) * stride[0] +
                     effective_filter_size_rows - input_rows)
  rows_odd = (padding_rows % 2 != 0)

  # same computation for padding_cols, along the width dimension
  input_cols = input.size(3)
  filter_cols = weight.size(3)
  effective_filter_size_cols = (filter_cols - 1) * dilation[1] + 1
  out_cols = (input_cols + stride[1] - 1) // stride[1]
  padding_cols = max(0, (out_cols - 1) * stride[1] +
                     effective_filter_size_cols - input_cols)
  cols_odd = (padding_cols % 2 != 0)

  if rows_odd or cols_odd:
    input = F.pad(input, [0, int(cols_odd), 0, int(rows_odd)])

  return F.conv2d(input, weight, bias, stride,
                  padding=(padding_rows // 2, padding_cols // 2),
                  dilation=dilation, groups=groups)

Mostly copy-pasted from the TensorFlow code here and here.

As you can see, there is a lot of hidden stuff going on there, and that's why it might not be worth adding a padding='same'. And I think not replicating the SAME behavior from TensorFlow wouldn't be ideal either.

์ƒ๊ฐ?

@fmassa Yes, you're right. Computing the padding on every forward() call can be inefficient.

๊ทธ๋Ÿฌ๋‚˜ ๋‚ด ์ œ์•ˆ์€ forward() ํ˜ธ์ถœ๋งˆ๋‹ค ํŒจ๋”ฉ์„ ๊ณ„์‚ฐํ•˜์ง€ ์•Š๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค. ์—ฐ๊ตฌ์›(๊ฐœ๋ฐœ์ž)์€ ๋Ÿฐํƒ€์ž„ ์ „์— ์ด๋ฏธ์ง€์˜ ํฌ๊ธฐ๋ฅผ nn.Conv2d ์˜ˆ์ƒํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๊ทธ๋ฆฌ๊ณ  ๊ทธ/๊ทธ๋…€๊ฐ€ '๋™์ผํ•œ' ํŒจ๋”ฉ์„ ์›ํ•˜๋ฉด 'SAME'์„ ๋ชจ๋ฐฉํ•˜๊ธฐ ์œ„ํ•ด ํ•„์š”ํ•œ ํŒจ๋”ฉ์„ ๊ณ„์‚ฐํ•˜๋Š” ๊ธฐ๋Šฅ์„ ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์˜ˆ๋ฅผ ๋“ค์–ด ์—ฐ๊ตฌ์›์ด 200x200, 300x300, 400x400 ํฌ๊ธฐ์˜ ์ด๋ฏธ์ง€๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ๋Š” ๊ฒฝ์šฐ๋ฅผ ์ƒ๊ฐํ•ด ๋ณด์‹ญ์‹œ์˜ค. ๊ทธ๋Ÿฐ ๋‹ค์Œ ์ดˆ๊ธฐํ™” ๋‹จ๊ณ„์—์„œ ์„ธ ๊ฐ€์ง€ ๊ฒฝ์šฐ์— ๋Œ€ํ•œ ํŒจ๋”ฉ์„ ๊ณ„์‚ฐํ•˜๊ณ  ํ•ด๋‹น ํŒจ๋”ฉ๊ณผ ํ•จ๊ป˜ ์ด๋ฏธ์ง€๋ฅผ F.pad() ์ „๋‹ฌํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋˜๋Š” forward() ํ˜ธ์ถœ ์ „์— nn.Conv2d ์˜ ํŒจ๋”ฉ ํ•„๋“œ๋ฅผ ๋ณ€๊ฒฝํ•ฉ๋‹ˆ๋‹ค. ๋‹ค์Œ์„ ์ฐธ์กฐํ•˜์‹ญ์‹œ์˜ค.

>>> import torch
>>> import torch.nn as nn
>>> from torch.autograd import Variable
>>> m = nn.Conv2d(1,1,1)
>>> m(Variable(torch.randn(1,1,2,2))).shape
torch.Size([1, 1, 2, 2])
>>> m.padding = (1, 1)
>>> m(Variable(torch.randn(1,1,2,2))).shape
torch.Size([1, 1, 4, 4])

Yes, what I want is to add a "padding-computing utility function" to the pytorch core.

์—ฐ๊ตฌ์ž๊ฐ€ ๊ฐ ์ž…๋ ฅ ์ด๋ฏธ์ง€ ํฌ๊ธฐ์— ๋Œ€ํ•œ ์ข…์† ํŒจ๋”ฉ์„ ์›ํ•  ๋•Œ nn.Conv2d ์ด๋ฏธ์ง€๋ฅผ ์ „๋‹ฌํ•˜๊ธฐ ์ „์— F.pad() ์™€ ํ•จ์ˆ˜๋ฅผ ๊ฒฐํ•ฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ฝ”๋“œ ์ž‘์„ฑ์ž๊ฐ€ ๋ชจ๋“  forward() ํ˜ธ์ถœ์—์„œ ์ž…๋ ฅ์„ ์ฑ„์šธ์ง€ ์—ฌ๋ถ€๋ฅผ ๊ฒฐ์ •ํ•˜๋„๋ก ํ•˜๊ณ  ์‹ถ์Šต๋‹ˆ๋‹ค.

Is there any plan to implement a similar API in pytorch in the near future? People coming from a tensorflow / keras background would certainly appreciate it.

๋”ฐ๋ผ์„œ ๊ธฐ๋ณธ ํŒจ๋”ฉ ๊ณ„์‚ฐ ์ „๋žต(TensorFlow์™€ ๋™์ผํ•œ ๊ฒฐ๊ณผ๋ฅผ ์ œ๊ณตํ•˜์ง€ ์•Š์ง€๋งŒ ๋ชจ์–‘์€ ์œ ์‚ฌํ•จ)์€

def _get_padding(padding_type, kernel_size):
    assert padding_type in ['SAME', 'VALID']
    if padding_type == 'SAME':
        return tuple((k - 1) // 2 for k in kernel_size)
    return tuple(0 for _ in kernel_size)
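For example (my usage sketch), the tuple it returns can go straight into the padding argument:

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3, padding=_get_padding('SAME', (3, 3)))
print(conv(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 16, 32, 32])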

@im9uri

It's similar to what I had in mind, but as mentioned earlier, the calculation gets complicated with stride and dilation.

Also, it would be nice to have such an API in other convolution operations such as ConvTranspose2d.

"์Šฌ๋ผ์ด๋”ฉ ์ฐฝ ์—ฐ์‚ฐ์ž"๋Š” ๋ชจ๋‘ ๋น„๋Œ€์นญ ํŒจ๋”ฉ์„ ์ง€์›ํ•ด์•ผ ํ•œ๋‹ค๊ณ  ์ƒ๊ฐํ•ฉ๋‹ˆ๋‹ค.

"๊ฐ™์€" ์ฃผ์žฅ์— ๋Œ€ํ•ด...
@sumith ์ž…๋ ฅ ํฌ๊ธฐ์— ๋”ฐ๋ผ ํŒจ๋”ฉ์„ ๋งŒ๋“œ๋Š” ๊ฒƒ์ด ์™œ
์–ด์จŒ๋“  ๊ทธ๊ฒƒ์ด ๋ฌธ์ œ๋ผ๋ฉด ์‹ค์šฉ์ ์ธ ํ•ด๊ฒฐ์ฑ…์€ "๋™์ผ"์„ ์‚ฌ์šฉํ•  ๋•Œ stride == 1 ๋ฅผ ์š”๊ตฌํ•˜๋Š” ๊ฒƒ์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. stride == 1 ์˜ ๊ฒฝ์šฐ ํŒจ๋”ฉ์€ ์ž…๋ ฅ ํฌ๊ธฐ์— ์˜์กดํ•˜์ง€ ์•Š์œผ๋ฉฐ ํ•œ ๋ฒˆ์— ๊ณ„์‚ฐํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์‚ฌ์šฉ์ž๊ฐ€ padding='same' ์™€ ํ•จ๊ป˜ stride > 1 padding='same' ๋ฅผ ์‚ฌ์šฉํ•˜๋ ค๊ณ  ํ•˜๋ฉด ์ƒ์„ฑ์ž๋Š” ValueError ๋ฐœ์ƒ์‹œ์ผœ์•ผ ํ•ฉ๋‹ˆ๋‹ค.

๋‚˜๋Š” ๊ทธ๊ฒƒ์ด ๊ฐ€์žฅ ๊นจ๋—ํ•œ ํ•ด๊ฒฐ์ฑ…์€ ์•„๋‹ˆ์ง€๋งŒ ์ œ์•ฝ ์กฐ๊ฑด์ด ๋‹ค์Œ๊ณผ ๊ฐ™์ด ๋‚˜์—๊ฒŒ ์ถฉ๋ถ„ํžˆ ํ•ฉ๋ฆฌ์ ์œผ๋กœ ๋“ค๋ฆฐ๋‹ค๋Š” ๊ฒƒ์„ ์••๋‹ˆ๋‹ค.

  1. "same"์ด๋ผ๋Š” ๋ ˆ์ด๋ธ”์˜ ์›๋ž˜ ์˜๋ฏธ๋Š” strided convolution์ด ์•„๋‹Œ ๊ฒฝ์šฐ์— ๋„์ž…๋˜์—ˆ์œผ๋ฉฐ ์ถœ๋ ฅ์€ ์ž…๋ ฅ์˜ _same_ ํฌ๊ธฐ๋ฅผ ๊ฐ–์Šต๋‹ˆ๋‹ค. ๋ฌผ๋ก  ์ด๊ฒƒ์€ stride > 1 ๋Œ€ํ•œ tensorflow์—์„œ ์‚ฌ์‹ค์ด ์•„๋‹ˆ๋ฉฐ "๋™์ผํ•œ"์ด๋ผ๋Š” ๋‹จ์–ด๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ IMO๋ฅผ ์•ฝ๊ฐ„ ์˜คํ•ดํ•˜๊ฒŒ ๋งŒ๋“ญ๋‹ˆ๋‹ค.
  2. "๋™์ผ"์„ ์‚ฌ์šฉํ•˜๋ ค๋Š” ๊ฒฝ์šฐ์˜ 99%๋ฅผ ๋‹ค๋ฃน๋‹ˆ๋‹ค. ๋ˆ„๊ตฐ๊ฐ€๊ฐ€ stride > 1 ๋Œ€ํ•œ tensorflow์˜ ๋™์ž‘์„ ์ •๋ง๋กœ ํ•„์š”๋กœ ํ•˜๋Š” ๊ฒฝ์šฐ๋ฅผ ๊ฑฐ์˜ ์ƒ์ƒํ•  ์ˆ˜ ์—†์ง€๋งŒ, ์›๋ž˜ ์˜๋ฏธ๋ฅผ "๋™์ผํ•˜๊ฒŒ" ์ œ๊ณตํ•˜๋ฉด ์Œ, ๋ฌผ๋ก  strided convolution์„ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์€ ์˜๋ฏธ๊ฐ€ ์—†์Šต๋‹ˆ๋‹ค. ์ถœ๋ ฅ์„ ์›ํ•˜๋Š” ๊ฒฝ์šฐ ์ž…๋ ฅ์˜ ํฌ๊ธฐ๊ฐ€ ๋™์ผํ•ฉ๋‹ˆ๋‹ค.

The conv2d documentation gives an explicit formula for the output size. Equating Hout with Hin, for example, you can solve for the padding:

def _get_padding(size, kernel_size, stride, dilation):
    padding = ((size - 1) * (stride - 1) + dilation * (kernel_size - 1)) // 2
    return padding

๋™์ผํ•œ ํŒจ๋”ฉ์€ ํŒจ๋”ฉ = (kernel_size - stride)//2๋ฅผ ์˜๋ฏธํ•˜๊ธฐ ๋•Œ๋ฌธ์— ํŒจ๋”ฉ = "๋™์ผ"์ด ๋„์ž…๋˜์–ด ์ž‘์„ฑ๋  ๋•Œ ์ปค๋„ ํฌ๊ธฐ์™€ ๋ณดํญ(nn.Conv2d์—์„œ๋„ ์–ธ๊ธ‰๋จ)์„ ์ž๋™์œผ๋กœ ์ฝ๊ณ  ํŒจ๋”ฉ์„ ์ ์šฉํ•ฉ๋‹ˆ๋‹ค. ๊ทธ์— ๋”ฐ๋ผ ์ž๋™์œผ๋กœ

๋‹ค์Œ์€ ์ฐธ์กฐ์šฉ์œผ๋กœ same ํŒจ๋”ฉ์ด ์žˆ๋Š” ๋งค์šฐ ๊ฐ„๋‹จํ•œ Conv2d ๋ ˆ์ด์–ด์ž…๋‹ˆ๋‹ค. ์ •์‚ฌ๊ฐํ˜• ์ปค๋„๊ณผ stride=1, dilation=1, groups=1๋งŒ ์ง€์›ํ•ฉ๋‹ˆ๋‹ค.

import torch

class Conv2dSame(torch.nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, bias=True, padding_layer=torch.nn.ReflectionPad2d):
        super().__init__()
        ka = kernel_size // 2
        kb = ka - 1 if kernel_size % 2 == 0 else ka
        self.net = torch.nn.Sequential(
            padding_layer((ka,kb,ka,kb)),
            torch.nn.Conv2d(in_channels, out_channels, kernel_size, bias=bias)
        )
    def forward(self, x):
        return self.net(x)

c = Conv2dSame(1,3,5)
print(c(torch.rand((16,1,10,10))).shape)

# torch.Size([16, 3, 10, 10])

์ด๊ฒƒ์ด ์—ฌ์ „ํžˆ PyTorch์— ์ถ”๊ฐ€๋˜๋Š” ๊ฒƒ์œผ๋กœ ํ‰๊ฐ€๋˜๊ณ  ์žˆ๋‹ค๋ฉด ๊ฐœ๋ฐœ์ž๋ฅผ ์œ„ํ•œ ๋ณต์žก์„ฑ/๋น„ํšจ์œจ์„ฑ ๋Œ€ ์‚ฌ์šฉ ์šฉ์ด์„ฑ ๊ฐ„์˜ ์ ˆ์ถฉ์ ์— ๋Œ€ํ•ด:

The "road to 1.0" blog post states:

PyTorch์˜ ์ค‘์‹ฌ ๋ชฉํ‘œ๋Š” ์—ฐ๊ตฌ ๋ฐ ํ•ดํ‚น ๊ฐ€๋Šฅ์„ฑ์„ ์œ„ํ•œ ํ›Œ๋ฅญํ•œ ํ”Œ๋žซํผ์„ ์ œ๊ณตํ•˜๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค. ๋”ฐ๋ผ์„œ ์ด๋Ÿฌํ•œ ๋ชจ๋“  [ํ”„๋กœ๋•์…˜ ์‚ฌ์šฉ] ์ตœ์ ํ™”๋ฅผ ์ถ”๊ฐ€ํ•˜๋Š” ๋™์•ˆ ์šฐ๋ฆฌ๋Š” ์‚ฌ์šฉ์„ฑ๊ณผ ์ด๋ฅผ ์ ˆ์ถฉํ•˜์ง€ ์•Š๋„๋ก ์—„๊ฒฉํ•œ ์„ค๊ณ„ ์ œ์•ฝ ์กฐ๊ฑด์œผ๋กœ ์ž‘์—…ํ•ด ์™”์Šต๋‹ˆ๋‹ค.

์ผํ™”์ ์œผ๋กœ ์ €๋Š” Keras์™€ ์›๋ž˜ tf.layers / estimator API๋ฅผ ์‚ฌ์šฉํ•œ ๊ฒฝํ—˜์ด ์žˆ์Šต๋‹ˆ๋‹ค. ๋ชจ๋‘ same ํŒจ๋”ฉ์„ ์ง€์›ํ•ฉ๋‹ˆ๋‹ค. ์ €๋Š” ํ˜„์žฌ PyTorch๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ TF์—์„œ ์›๋ž˜ ์ž‘์„ฑํ–ˆ๋˜ convnet์„ ๋‹ค์‹œ ๊ตฌํ˜„ํ•˜๊ณ  ์žˆ์œผ๋ฉฐ, ์ œ๋กœ ํŒจ๋”ฉ์„ ์œ„ํ•ด ์‚ฐ์ˆ ์„ ์ง์ ‘ ๊ตฌ์ถ•ํ•ด์•ผ ํ–ˆ๊ธฐ ๋•Œ๋ฌธ์— ์•ฝ ๋ฐ˜๋‚˜์ ˆ์˜ ์‹œ๊ฐ„์ด ์†Œ์š”๋˜์—ˆ์Šต๋‹ˆ๋‹ค.

"์ค‘์‹ฌ ๋ชฉํ‘œ"๊ฐ€ ์‹ค์ œ๋กœ ์‚ฌ์šฉ์„ฑ์— ์ค‘์ ์„ ๋‘”๋‹ค๋ฉด ๋ชจ๋“  ์ „์ง„ ํŒจ์Šค(์œ„์—์„œ ์–ธ๊ธ‰ํ•œ ๋ฐ”์™€ ๊ฐ™์ด)์—์„œ ์ œ๋กœ ํŒจ๋”ฉ์„ ๊ณ„์‚ฐํ•˜๋Š” ๋ฐ ํšจ์œจ์„ฑ์ด ๋–จ์–ด์ง€๋”๋ผ๋„ ๊ฐœ๋ฐœ์ž ํšจ์œจ์„ฑ ๋ฐ ์œ ์ง€ ๊ด€๋ฆฌ ์ธก๋ฉด์—์„œ ์‹œ๊ฐ„์ด ์ ˆ์•ฝ๋œ๋‹ค๊ณ  ์ฃผ์žฅํ•˜๋Š” ๊ฒƒ๋ณด๋‹ค ์˜ˆ๋ฅผ ๋“ค์–ด ์ œ๋กœ ํŒจ๋”ฉ์„ ๊ณ„์‚ฐํ•˜๊ธฐ ์œ„ํ•ด ์‚ฌ์šฉ์ž ์ •์˜ ์ฝ”๋“œ๋ฅผ ์ž‘์„ฑํ•  ํ•„์š”๊ฐ€ ์—†์Œ์€ ์ ˆ์ถฉ์˜ ๊ฐ€์น˜๊ฐ€ ์žˆ์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ƒ๊ฐ?

์ด ๊ธฐ๋Šฅ์„ ์‚ฌ์šฉํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค

padding=SAME ์˜ ์„ ํƒ์  API๋ฅผ ์ œ๊ณตํ•  ์ˆ˜ ์—†๋Š” ์ด์œ ๋Š” ๋ฌด์—‡์ž…๋‹ˆ๊นŒ? ๋ˆ„๊ตฐ๊ฐ€๊ฐ€ ํŒจ๋”ฉ์— ๋Œ€ํ•œ ์ถ”๊ฐ€ ๋น„์šฉ์„ ๊ธฐ๊บผ์ด ์ง€๋ถˆํ•  ์˜์‚ฌ๊ฐ€ ์žˆ๋‹ค๋ฉด ๊ทธ๋ ‡๊ฒŒ ํ•˜๋„๋ก ํ•˜์‹ญ์‹œ์˜ค. ๋งŽ์€ ์—ฐ๊ตฌ์ž์—๊ฒŒ ๋น ๋ฅธ ํ”„๋กœํ† ํƒ€์ดํ•‘์€ ์š”๊ตฌ ์‚ฌํ•ญ์ž…๋‹ˆ๋‹ค.

Yes, it would be nice if someone could add and approve this.

ํ™•์‹คํžˆ ์ด๊ฒƒ์„ ์ถ”๊ฐ€ํ•˜์‹ญ์‹œ์˜ค. ์ฝ”๋„ˆ๋Š” ๊ทธ๊ฒƒ์„ ์›ํ•ฉ๋‹ˆ๋‹ค.

Does pytorch support this now? Can I set padding = (kernel_size-1)/2, using operations like the ones at the beginning of VGG?
The VGG network can then ensure the output size does not change in the first group, and use strides to resize the feature maps afterwards. Is that right?

๋‹ค์Œ์€ deepfakes์—์„œ ๋™์ผํ•œ conv2d ํŒจ๋”ฉ์„ ํ˜ธ์ถœํ•˜๋Š” ํ•œ ๊ฐ€์ง€ ์˜ˆ์ž…๋‹ˆ๋‹ค.

# modify conv2d function to use same padding
# code referred to @fmassa in 'https://github.com/pytorch/pytorch/issues/3867'
# and tensorflow source code

import torch.utils.data
from torch.nn import functional as F

import math
import torch
from torch.nn.parameter import Parameter
from torch.nn.functional import pad
from torch.nn.modules import Module
from torch.nn.modules.utils import _single, _pair, _triple


class _ConvNd(Module):

    def __init__(self, in_channels, out_channels, kernel_size, stride,
                 padding, dilation, transposed, output_padding, groups, bias):
        super(_ConvNd, self).__init__()
        if in_channels % groups != 0:
            raise ValueError('in_channels must be divisible by groups')
        if out_channels % groups != 0:
            raise ValueError('out_channels must be divisible by groups')
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.kernel_size = kernel_size
        self.stride = stride
        self.padding = padding
        self.dilation = dilation
        self.transposed = transposed
        self.output_padding = output_padding
        self.groups = groups
        if transposed:
            self.weight = Parameter(torch.Tensor(
                in_channels, out_channels // groups, *kernel_size))
        else:
            self.weight = Parameter(torch.Tensor(
                out_channels, in_channels // groups, *kernel_size))
        if bias:
            self.bias = Parameter(torch.Tensor(out_channels))
        else:
            self.register_parameter('bias', None)
        self.reset_parameters()

    def reset_parameters(self):
        n = self.in_channels
        for k in self.kernel_size:
            n *= k
        stdv = 1. / math.sqrt(n)
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)

    def __repr__(self):
        s = ('{name}({in_channels}, {out_channels}, kernel_size={kernel_size}'
             ', stride={stride}')
        if self.padding != (0,) * len(self.padding):
            s += ', padding={padding}'
        if self.dilation != (1,) * len(self.dilation):
            s += ', dilation={dilation}'
        if self.output_padding != (0,) * len(self.output_padding):
            s += ', output_padding={output_padding}'
        if self.groups != 1:
            s += ', groups={groups}'
        if self.bias is None:
            s += ', bias=False'
        s += ')'
        return s.format(name=self.__class__.__name__, **self.__dict__)


class Conv2d(_ConvNd):

    def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1, bias=True):
        kernel_size = _pair(kernel_size)
        stride = _pair(stride)
        padding = _pair(padding)
        dilation = _pair(dilation)
        super(Conv2d, self).__init__(
            in_channels, out_channels, kernel_size, stride, padding, dilation,
            False, _pair(0), groups, bias)

    def forward(self, input):
        return conv2d_same_padding(input, self.weight, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


# custom conv2d, because pytorch doesn't have a "padding='same'" option.
def conv2d_same_padding(input, weight, bias=None, stride=1, padding=1, dilation=1, groups=1):
    # note: the `padding` argument is ignored; 'same' padding is derived from the input size

    input_rows = input.size(2)
    filter_rows = weight.size(2)
    effective_filter_size_rows = (filter_rows - 1) * dilation[0] + 1
    out_rows = (input_rows + stride[0] - 1) // stride[0]
    padding_rows = max(0, (out_rows - 1) * stride[0] +
                       effective_filter_size_rows - input_rows)
    rows_odd = (padding_rows % 2 != 0)

    # same computation along the width dimension
    input_cols = input.size(3)
    filter_cols = weight.size(3)
    effective_filter_size_cols = (filter_cols - 1) * dilation[1] + 1
    out_cols = (input_cols + stride[1] - 1) // stride[1]
    padding_cols = max(0, (out_cols - 1) * stride[1] +
                       effective_filter_size_cols - input_cols)
    cols_odd = (padding_cols % 2 != 0)

    if rows_odd or cols_odd:
        input = pad(input, [0, int(cols_odd), 0, int(rows_odd)])

    return F.conv2d(input, weight, bias, stride,
                    padding=(padding_rows // 2, padding_cols // 2),
                    dilation=dilation, groups=groups)
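If I read it correctly, the class above can then be used as a drop-in module (my example, arbitrary shapes):

m = Conv2d(3, 8, kernel_size=3, stride=2)
y = m(torch.randn(1, 3, 28, 28))
print(y.shape)  # torch.Size([1, 8, 14, 14]) -> 14 == ceil(28 / 2), as in TF SAME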

์ด ์ ์— ๋Œ€ํ•ด ๋งค์šฐ ๊ฐ์‚ฌํ•˜๊ฒŒ ์ƒ๊ฐํ•ฉ๋‹ˆ๋‹ค. ํ˜„์žฌ tensorflow์—์„œ ๊ฐ„๋‹จํ•œ ๋ชจ๋ธ์„ ์ด์‹ํ•˜๊ณ  ์žˆ์œผ๋ฉฐ ๊ณ„์‚ฐ์„ ์ดํ•ดํ•˜๋Š” ๋ฐ ๋งค์šฐ ์˜ค๋žœ ์‹œ๊ฐ„์ด ๊ฑธ๋ฆฝ๋‹ˆ๋‹ค...

This thread seems to have just died. Given the number of thumbs up here, it would be really nice to add this feature for faster prototyping.

์ด์— ๋Œ€ํ•œ ์ œ์•ˆ์„œ๋ฅผ ์ž‘์„ฑํ•˜๊ณ  ์ด๋ฅผ ๊ตฌํ˜„ํ•  ์‚ฌ๋žŒ์„ ์ฐพ์„ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.
๋‚˜๋Š” ์ด๊ฒƒ์„ v1.1 ์ด์ •ํ‘œ์— ๋Œ€ํ•ด ๋†“๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค.

๊ฐ์‚ฌํ•ฉ๋‹ˆ๋‹ค, ๋‹น์‹ ์€ ๊ต‰์žฅํ•ฉ๋‹ˆ๋‹ค! ๋˜ํ•œ ํŒจ๋”ฉ ์ธ์ˆ˜๊ฐ€ 4-ํŠœํ”Œ์„ ํ—ˆ์šฉํ•˜๋„๋ก ๋ณ„๋„์˜ ๊ธฐ๋Šฅ ์š”์ฒญ ์„ ์ œ์ถœํ–ˆ์Šต๋‹ˆ๋‹ค. ์ด๊ฒƒ์€ ๋Œ€์นญ ํŒจ๋”ฉ๋ฟ๋งŒ ์•„๋‹ˆ๋ผ ๋Œ€์นญ ํŒจ๋”ฉ๋„ ํ—ˆ์šฉํ•˜๋ฉฐ ์ด๋Š” ์ค‘๊ฐ„์— ๋„๋‹ฌํ•˜๊ธฐ ์œ„ํ•œ ์ข‹์€ ์ €๋น„์šฉ ๊ฒฝ๋กœ์ด๊ธฐ๋„ ํ•ฉ๋‹ˆ๋‹ค.

@soumith It would be nice if pytorch had a same padding mode.

@soumith ์ปดํŒŒ์ผ ์œ ํ˜• ์ธํ„ฐํŽ˜์ด์Šค๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์€ ์–ด๋–ป์Šต๋‹ˆ๊นŒ?

model=torch.compile(model,input_shape=(3,224,224))

I made a Conv2D with same padding that supports dilation and strides, following how TensorFlow does it. It computes the padding in real time; if you want to precompute it, just move the padding into init() and add an input-size parameter.

import torch as tr
import math

class Conv2dSame(tr.nn.Module):

    def __init__(self, in_channels, out_channels, kernel_size, stride=1, dilation=1):
        super(Conv2dSame, self).__init__()
        self.F = kernel_size
        self.S = stride
        self.D = dilation
        self.layer = tr.nn.Conv2d(in_channels, out_channels, kernel_size, stride, dilation=dilation)

    def forward(self, x_in):
        N, C, H, W = x_in.shape
        H2 = math.ceil(H / self.S)
        W2 = math.ceil(W / self.S)
        Pr = (H2 - 1) * self.S + (self.F - 1) * self.D + 1 - H
        Pc = (W2 - 1) * self.S + (self.F - 1) * self.D + 1 - W
        # nn.ZeroPad2d expects (left, right, top, bottom): column padding first, then rows
        x_pad = tr.nn.ZeroPad2d((Pc//2, Pc - Pc//2, Pr//2, Pr - Pr//2))(x_in)
        x_out = self.layer(x_pad)
        return x_out

Example 1:
Input shape: (1, 3, 96, 96)
Filters: 64
Kernel size: 9x9

Conv2dSame(3, 64, 9)

Padded shape: (1, 3, 104, 104)
Output shape: (1, 64, 96, 96)

Example 2:
Same as before, but with stride=2

Conv2dSame(3, 64, 9, 2)

Padded shape = (1, 3, 103, 103)
Output shape = (1, 64, 48, 48)

@jpatts I believe your output shape computation is wrong; it should be ceil(input_dimension / stride). Python's integer division is floor division - your code should produce different results from tensorflow for e.g. h=w=28, stride=3, kernel_size=1.

๋‹ค์Œ์€ ๋ฏธ๋ฆฌ ๊ณ„์‚ฐ์„ ์ˆ˜ํ–‰ํ•˜๋Š” ๋ณ€ํ˜•์ž…๋‹ˆ๋‹ค.

def pad_same(in_dim, ks, stride, dilation=1):
    """
    References:
          https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/common_shape_fns.h
          https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/framework/common_shape_fns.cc#L21
    """
    assert stride > 0
    assert dilation >= 1
    effective_ks = (ks - 1) * dilation + 1
    out_dim = (in_dim + stride - 1) // stride
    p = max(0, (out_dim - 1) * stride + effective_ks - in_dim)

    padding_before = p // 2
    padding_after = p - padding_before
    return padding_before, padding_after

์ž…๋ ฅ ์ฐจ์›์ด ์•Œ๋ ค์ ธ ์žˆ๊ณ  ์ฆ‰์‹œ ๊ณ„์‚ฐ๋˜์ง€ ์•Š๋Š” ๊ฒฝ์šฐ ๋‹ค์Œ๊ณผ ๊ฐ™์ด ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

# Pass this to nn.Sequential
def conv2d_samepad(in_dim, in_ch, out_ch, ks, stride, dilation=1, bias=True):
    pad_before, pad_after = pad_same(in_dim, ks, stride, dilation)
    if pad_before == pad_after:
        return [nn.Conv2d(in_ch, out_ch, ks, stride, pad_after, dilation, bias=bias)]
    else:
        return [nn.ZeroPad2d((pad_before, pad_after, pad_before, pad_after)),
                nn.Conv2d(in_ch, out_ch, ks, stride, 0, dilation, bias=bias)]

๊ทธ๋Ÿฌ๋‚˜ ์ด ๊ฒฝ์šฐ ์ž…๋ ฅ ์ฐจ์›์— ๋Œ€ํ•ด ์ผ๋ถ€ ๋ถ€๊ธฐ ๊ด€๋ฆฌ๋ฅผ ์ˆ˜ํ–‰ํ•ด์•ผ ํ•˜๋ฏ€๋กœ(์ด๊ฒƒ์ด ํ•ต์‹ฌ ๋ฌธ์ œ์ž„) ์œ„์˜ ๋‚ด์šฉ์„ ์‚ฌ์šฉํ•˜๋ฉด ์œ ์šฉํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

def conv_outdim(in_dim, padding, ks, stride, dilation):
    if isinstance(padding, int) or isinstance(padding, tuple):
        return conv_outdim_general(in_dim, padding, ks, stride, dilation)
    elif isinstance(padding, str):
        assert padding in ['same', 'valid']
        if padding == 'same':
            return conv_outdim_samepad(in_dim, stride)
        else:
            return conv_outdim_general(in_dim, 0, ks, stride, dilation)
    else:
        raise TypeError('Padding can be int/tuple or str=same/valid')


def conv_outdim_general(in_dim, padding, ks, stride, dilation=1):
    # See https://arxiv.org/pdf/1603.07285.pdf, eq (15)
    return ((in_dim + 2 * padding - ks - (ks - 1) * (dilation - 1)) // stride) + 1


def conv_outdim_samepad(in_dim, stride):
    return (in_dim + stride - 1) // stride
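For instance (my example), chaining it to track shapes through a few layers:

dim = 224
for ks, stride in [(7, 2), (3, 2), (3, 1)]:
    dim = conv_outdim(dim, 'same', ks, stride, dilation=1)
print(dim)  # 56: 224 -> 112 -> 56 -> 56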

@mirceamironenco thanks for pointing that out. I made this quick and dirty and never checked it. Updated to use ceil instead.

@harritaylor agreed. This feature would definitely simplify porting Keras/TF models to PyTorch. Every now and then I still build same-padding layers with "manual" computation of the padding sizes.

@kylemcdonald

๋‹ค์Œ์€ ์ฐธ์กฐ์šฉ์œผ๋กœ same ํŒจ๋”ฉ์ด ์žˆ๋Š” ๋งค์šฐ ๊ฐ„๋‹จํ•œ Conv2d ๋ ˆ์ด์–ด์ž…๋‹ˆ๋‹ค. ์ •์‚ฌ๊ฐํ˜• ์ปค๋„๊ณผ stride=1, dilation=1, groups=1๋งŒ ์ง€์›ํ•ฉ๋‹ˆ๋‹ค.

class Conv2dSame(torch.nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, bias=True, padding_layer=torch.nn.ReflectionPad2d):
        super().__init__()
        ka = kernel_size // 2
        kb = ka - 1 if kernel_size % 2 == 0 else ka
        self.net = torch.nn.Sequential(
            padding_layer((ka,kb,ka,kb)),
            torch.nn.Conv2d(in_channels, out_channels, kernel_size, bias=bias)
        )
    def forward(self, x):
        return self.net(x)

c = Conv2dSame(1,3,5)
print(c(torch.rand((16,1,10,10))).shape)

# torch.Size([16, 3, 10, 10])

Shouldn't it be kb = ka - 1 if kernel_size % 2 else ka?

์ด๊ฒƒ์€ Conv1d์—๋„ ์ ์šฉ๋ฉ๋‹ˆ๊นŒ?

ConvND ํด๋ž˜์Šค์— ์ƒˆ๋กœ์šด ํŒจ๋”ฉ ๋ฐฉ๋ฒ•์„ ์ถ”๊ฐ€ํ•˜๋Š” ๊ฒƒ์€ ์šฐ์•„ํ•œ ์„ ํƒ์ผ ์ˆ˜ ์žˆ์œผ๋ฉฐ, ๋ฉ”์„œ๋“œ๋ฅผ ์˜ค๋ฒ„๋กœ๋“œํ•˜์—ฌ ํŒจ๋”ฉ ์ผ์ •์„ ์‰ฝ๊ฒŒ ์—ฐ์žฅํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Maybe this could be picked up if @soumith has written up that proposal, or if somebody summarizes the work that needs to be done. There was a lot of discussion above and I'm not quite sure what we decided on. Do we compute the padding depending on the input data, and should padding="same" be implemented for pooling as well?

I'd like causal padding to be added too. And please add this for conv1d as well.
I stopped following the comments at some point, but I think this feature was done very well in keras. It should be followed exactly.

@Chillee here it is:

Scope

We should add the padding to the following layers:

  • Conv*d
  • MaxPool*d
  • AvgPool*d

For the first PR, let's keep it simple and stick to Conv*d.

Complexity and downsides

The complexity discussed above is that once the same padding option is written, the layer inherently becomes dynamic in nature. That is, we move from a layer whose parameters are statically known (which is good for model exports, e.g. ONNX export) to a layer with dynamic parameters. The dynamic parameter in this case is padding.
While this looks pretty harmless, non-staticness matters a lot in restricted runtimes, for example mobile or exotic hardware runtimes, where one wants to do static shape analysis and optimization.

The other practical downside is that the dynamically computed padding is no longer always symmetric. Depending on the kernel size/stride, the dilation factor and the input size, the padding may have to be asymmetric (i.e. a different amount of padding on the left vs. the right), which means, for example, that CuDNN kernels cannot be used.

Design

Conv2d's current signature is:

torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')

Here, we support padding being an int or a tuple of ints (i.e. one for each of the height/width dimensions).
We should support an additional overload for padding that takes the string value same.

same padding should pad the input before handing it to the convolution, so that the output size is the same as the input size.

Implementation details

When padding is given 'same', we have to compute the amount of left and right padding needed in each dimension.

Once the required L (left) and R (right) padding is computed, there are two cases to consider:

  • L == R: here the padding is symmetric, and one can simply call F.conv2d with a padding value equal to L
  • L != R: here the padding is asymmetric, and it has significant performance and memory implications. We do the following:

    • We call input_padded = F.pad(input, ...) and send input_padded into F.conv2d.

    • We throw a warning about the performance implication for this case (at least for the initial release; it can be revisited if the warning turns out to be unnecessary).

    • I don't remember the details of the formula and where we enter this case, but if I remember correctly it might be as simple as having an even-sized kernel. If so, a warning makes it easily fixable on the user side.

Needless to say, it has to go through tests so that it also works in the JIT path.

@Chillee FYI, here is a potential implementation you could take inspiration from: https://github.com/mlperf/inference/blob/master/others/edge/object_detection/ssd_mobilenet/pytorch/utils.py#L40

It matched the TF implementation for the configurations I tested, although the testing wasn't exhaustive.

@soumith A couple of quick questions:

  1. Is there any reason we shouldn't implement this through functional.conv2d? The design you wrote seems to imply we shouldn't. Nothing about padding = "same" seems like it needs to be specific to the layer. (EDIT: Nvm, I didn't realize the F.conv2d impl I was looking at was the quantized one).
  2. I assume Tensorflow's valid padding mode is simply equivalent to having padding=0, correct?

๋˜ํ•œ ์‚ฌ์šฉ์ž๊ฐ€ ๋น„๋Œ€์นญ ํŒจ๋”ฉ์„ ์‰ฝ๊ฒŒ ํ•ด๊ฒฐํ•  ์ˆ˜ ์žˆ์„ ๊ฒƒ ๊ฐ™์ง€ ์•Š์Šต๋‹ˆ๋‹ค. ๋ฐœ์ƒํ•ด์•ผ ํ•˜๋Š” ํŒจ๋”ฉ์˜ ์–‘์„ ๊ฒฐ์ •ํ•˜๋Š” ์ „์ฒด ๊ทœ์น™์€ ๋‹ค์Œ๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.
์ฐจ์›์„ ๋”ฐ๋ผ (ceil(x/stride) -1)*stride + (filter-1)*dilation + 1 - x ์ž…๋‹ˆ๋‹ค. ํŠนํžˆ, ์ด๊ฒƒ์ด 2์˜ ๋ฐฐ์ˆ˜๊ฐ€ ์•„๋‹Œ ๊ฒฝ์šฐ ๋น„๋Œ€์นญ ํŒจ๋”ฉ์„ ์ˆ˜ํ–‰ํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์ด๊ฒƒ์ด ์ง์ˆ˜ ํฌ๊ธฐ์˜ ํ•„ํ„ฐ์—์„œ๋งŒ ๋ฐœ์ƒํ•˜๊ธฐ๋ฅผ ๋ฐ”๋ผ๋Š” ๊ฒƒ์— ๋Œ€ํ•œ ๋ฐ˜๋ก€๋กœ input = 10, stride=3, filter=3, dilation=1 ์ทจํ•˜์‹ญ์‹œ์˜ค. ๋‚˜๋Š” ์ด๊ฒƒ์ด ์ผ์–ด๋‚  ์ˆ˜ ์žˆ๋Š” ์ƒํ™ฉ์„ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•œ ์–ด๋–ค ๊ฐ„๋‹จํ•œ ๊ทœ์น™๋„ ๋ณด์ง€ ๋ชปํ•œ๋‹ค.

Also, we can't determine the padding statically, except when stride=1, in which case ceil(x/stride) = x and the padding equals (filter-1)*dilation.

@Chillee there's no reason for (1); I wasn't thinking about performance or other semantics.

On (2), yes.

๋˜ํ•œ stride=1์ธ ๊ฒฝ์šฐ๋ฅผ ์ œ์™ธํ•˜๊ณ ๋Š” ํŒจ๋”ฉ์„ ์ •์ ์œผ๋กœ ๊ฒฐ์ •ํ•  ์ˆ˜ ์—†์Šต๋‹ˆ๋‹ค. ceil(x/stride) = x์ด๊ณ  ํŒจ๋”ฉ์€ (filter-1)*dilation๊ณผ ๊ฐ™์Šต๋‹ˆ๋‹ค.

Yes, but stride=1 is common enough, and the benefit of static padding is certainly good enough, that it's worth special-casing.

As for the asymmetric padding, hmm.....

padding=SAME ์˜ ์„ ํƒ์  API๋ฅผ ์ œ๊ณตํ•  ์ˆ˜ ์—†๋Š” ์ด์œ ๋Š” ๋ฌด์—‡์ž…๋‹ˆ๊นŒ? ๋ˆ„๊ตฐ๊ฐ€๊ฐ€ ํŒจ๋”ฉ์— ๋Œ€ํ•œ ์ถ”๊ฐ€ ๋น„์šฉ์„ ๊ธฐ๊บผ์ด ์ง€๋ถˆํ•  ์˜์‚ฌ๊ฐ€ ์žˆ๋‹ค๋ฉด ๊ทธ๋ ‡๊ฒŒ ํ•˜๋„๋ก ํ•˜์‹ญ์‹œ์˜ค. ๋งŽ์€ ์—ฐ๊ตฌ์ž์—๊ฒŒ ๋น ๋ฅธ ํ”„๋กœํ† ํƒ€์ดํ•‘์€ ์š”๊ตฌ ์‚ฌํ•ญ์ž…๋‹ˆ๋‹ค.

Yes,

padding=SAME ์˜ ์„ ํƒ์  API๋ฅผ ์ œ๊ณตํ•  ์ˆ˜ ์—†๋Š” ์ด์œ ๋Š” ๋ฌด์—‡์ž…๋‹ˆ๊นŒ? ๋ˆ„๊ตฐ๊ฐ€๊ฐ€ ํŒจ๋”ฉ์— ๋Œ€ํ•œ ์ถ”๊ฐ€ ๋น„์šฉ์„ ๊ธฐ๊บผ์ด ์ง€๋ถˆํ•  ์˜์‚ฌ๊ฐ€ ์žˆ๋‹ค๋ฉด ๊ทธ๋ ‡๊ฒŒ ํ•˜๋„๋ก ํ•˜์‹ญ์‹œ์˜ค. ๋งŽ์€ ์—ฐ๊ตฌ์ž์—๊ฒŒ ๋น ๋ฅธ ํ”„๋กœํ† ํƒ€์ดํ•‘์€ ์š”๊ตฌ ์‚ฌํ•ญ์ž…๋‹ˆ๋‹ค.

๋™์˜ํ•˜๋‹ค! ๋‚˜๋Š” ์ด ๋นŒ์–ด๋จน์„ "ํŒจ๋”ฉ"์— 4์‹œ๊ฐ„ ๋™์•ˆ ๊ฐ‡ํ˜€ ์žˆ์—ˆ๋‹ค.

์ด ๋ฌธ์ œ์— ๋Œ€ํ•œ ์†”๋ฃจ์…˜์— ๋Œ€ํ•œ ์—…๋ฐ์ดํŠธ๊ฐ€ ์žˆ์Šต๋‹ˆ๊นŒ?

Wow, and here I thought Pytorch would be easier than Keras/Tensorflow 2.0...

@zwep it takes a bit more effort to get started. You have to write your own (potentially annoying) training loop and write the layers more explicitly. Once you're done with that (once), you're in a much better position to make actual improvements beyond it.

The rule of thumb from my experience: use Keras when you're doing something that has been done a million times / is routine top-level work;
use pytorch whenever research development is needed.

Here is my code for padded 1d convolutions:

import torch
from torch import nn
import numpy as np
import torch.nn.functional as F

class Conv1dSamePad(nn.Module):
    def __init__(self, in_channels, out_channels, filter_len, stride=1, **kwargs):
        super(Conv1dSamePad, self).__init__()
        self.filter_len = filter_len
        self.conv = nn.Conv1d(in_channels, out_channels, filter_len, padding=(self.filter_len // 2), stride=stride,
                              **kwargs)
        nn.init.xavier_uniform_(self.conv.weight)
        # nn.init.constant_(self.conv.bias, 1 / out_channels)

    def forward(self, x):
        if self.filter_len % 2 == 1:
            return self.conv(x)
        else:
            return self.conv(x)[:, :, :-1]


class Conv1dCausalPad(nn.Module):
    def __init__(self, in_channels, out_channels, filter_len, **kwargs):
        super(Conv1dCausalPad, self).__init__()
        self.filter_len = filter_len
        self.conv = nn.Conv1d(in_channels, out_channels, filter_len, **kwargs)
        nn.init.xavier_uniform_(self.conv.weight)

    def forward(self, x):
        padding = (self.filter_len - 1, 0)
        return self.conv(F.pad(x, padding))


class Conv1dPad(nn.Module):
    def __init__(self, in_channels, out_channels, filter_len, padding="same", groups=1):
        super(Conv1dPad, self).__init__()
        if padding not in ["same", "causal"]:
            raise Exception("invalid padding type %s" % padding)
        self.conv = Conv1dCausalPad(in_channels, out_channels, filter_len, groups=groups) \
            if padding == "causal" else Conv1dSamePad(in_channels, out_channels, filter_len, groups=groups)

    def forward(self, x):
        return self.conv(x)
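A quick shape check of the wrapper above (my example): 'same' keeps the length, and 'causal' pads only on the left.

x = torch.randn(2, 4, 50)
same = Conv1dPad(4, 8, filter_len=5, padding="same")
causal = Conv1dPad(4, 8, filter_len=5, padding="causal")
print(same(x).shape, causal(x).shape)  # both torch.Size([2, 8, 50])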

@danFromTelAviv thanks for the code. I'll keep that pytorch philosophy in mind!

It's 2020. Still no padding='same' in Pytorch?

์ด๊ฒƒ์€ ๋ชจ๋“  ์ปค๋„ ํฌ๊ธฐ, ๋ณดํญ ๋ฐ ํŒฝ์ฐฝ์— ๋Œ€ํ•ด ๋™์ผํ•œ ํŒจ๋”ฉ์ด ์ž‘๋™ํ•˜๋„๋ก ํ•˜๋Š” ํ•œ ๊ฐ€์ง€ ๋ฐฉ๋ฒ•์ž…๋‹ˆ๋‹ค(์ปค๋„ ํฌ๊ธฐ๋„ ์ž‘๋™ํ•จ).

import math

import torch
from torch import nn

class Conv1dSame(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1, dilation=1):
        super().__init__()
        self.cut_last_element = (kernel_size % 2 == 0 and stride == 1 and dilation % 2 == 1)
        self.padding = math.ceil((1 - stride + dilation * (kernel_size-1))/2)
        self.conv = nn.Conv1d(in_channels, out_channels, kernel_size, padding=self.padding, stride=stride, dilation=dilation)

    def forward(self, x):
        if self.cut_last_element:
            return self.conv(x)[:, :, :-1]
        else:
            return self.conv(x)
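Quick shape checks (my examples), including an even kernel and a dilated one:

x = torch.randn(1, 1, 10)
print(Conv1dSame(1, 4, kernel_size=4)(x).shape)              # torch.Size([1, 4, 10])
print(Conv1dSame(1, 4, kernel_size=5, dilation=2)(x).shape)  # torch.Size([1, 4, 10])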

nn.Conv2d ์—๋„ "๋™์ผํ•œ ํŒจ๋”ฉ" ๊ธฐ๋Šฅ์ด ํ•„์š”ํ•ฉ๋‹ˆ๋‹ค.

BTW, ์œ„์—์„œ ๋…ผ์˜ํ•œ ์„ฑ๋Šฅ/์ง๋ ฌํ™” ๋ฌธ์ œ ์™ธ์—๋„ TF์˜ ํฌ๊ธฐ ์ข…์† "๋™์ผํ•œ" ํŒจ๋”ฉ ๋ชจ๋“œ๊ฐ€ ์ข‹์€ ๊ธฐ๋ณธ๊ฐ’์ด ์•„๋‹Œ ์ด์œ ์— ๋Œ€ํ•œ ์ •ํ™•์„ฑ/์ •ํ™•์„ฑ ์ด์œ ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. https://github.com/tensorflow/tensorflow/issues/18213 ์—์„œ ๋…ผ์˜ํ–ˆ์œผ๋ฉฐ ์‹ค์ œ๋กœ ๋งŽ์€ Google ์ž์ฒด ์ฝ”๋“œ๊ฐ€ ํฌ๊ธฐ ๋…๋ฆฝ์ ์ธ "๋™์ผํ•œ" ํŒจ๋”ฉ ๋ชจ๋“œ๋ฅผ ๋Œ€์‹  ์‚ฌ์šฉํ•œ๋‹ค๋Š” ๊ฒƒ์„ ๋ณด์—ฌ์ฃผ์—ˆ์Šต๋‹ˆ๋‹ค.

์ด ๋ฌธ์ œ์— ๋Œ€ํ•ด ํ˜„์žฌ ์ง„ํ–‰ ์ค‘์ธ ์ž‘์—…์ด ์—†๋Š” ๊ฒƒ ๊ฐ™์ง€๋งŒ ๋งŒ์•ฝ ์žˆ๋‹ค๋ฉด ํฌ๊ธฐ ๋…๋ฆฝ์ ์ธ ์†”๋ฃจ์…˜์ด๊ธฐ๋ฅผ ๋ฐ”๋ž๋‹ˆ๋‹ค.

Hi @ppwwyyxx Yuxin, thanks for your reply.
I think @McHughes288's implementation is good, and I'm curious about your opinion on it.

๋‹ค์Œ์€ Conv1D SAME ํŒจ๋”ฉ์— ๋Œ€ํ•œ ๋‚ด ์†”๋ฃจ์…˜์ž…๋‹ˆ๋‹ค( dilation==1 & groups==1 ์ธ ๊ฒฝ์šฐ์—๋งŒ ์˜ฌ๋ฐ”๋ฅด๊ฒŒ ์ž‘๋™ํ•˜๋ฉฐ ํŒฝ์ฐฝ ๋ฐ ๊ทธ๋ฃน์„ ๊ณ ๋ คํ•  ๋•Œ ๋” ๋ณต์žกํ•จ).

import torch.nn.functional as F
from torch import nn

class Conv1dSamePadding(nn.Conv1d):
    """Represents the "Same" padding functionality from Tensorflow.
    NOTE: Only works correctly when dilation == 1, groups == 1 !!!
    """
    def forward(self, input):
        size, kernel, stride = input.size(-1), self.weight.size(
            2), self.stride[0]
        # TF 'SAME': total padding so that output length == ceil(size / stride)
        out_size = (size + stride - 1) // stride
        padding = max(0, (out_size - 1) * stride + kernel - size)
        if padding != 0:
            # pad left by padding // 2, pad right by padding - padding // 2
            # in Tensorflow, one more padding value(default: 0) is on the right when needed
            input = F.pad(input, (padding // 2, padding - padding // 2))
        return F.conv1d(input=input,
                        weight=self.weight,
                        bias=self.bias,
                        stride=stride,
                        dilation=1,
                        groups=1)
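Quick check against the TF behavior (my example):

import torch

conv = Conv1dSamePadding(1, 8, kernel_size=3, stride=2)
print(conv(torch.randn(1, 1, 11)).shape)  # torch.Size([1, 8, 6]) -> 6 == ceil(11 / 2)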

@Chillee ์ด ๊ธฐ๋Šฅ์„ ๊ณ„์† ์ž‘์—…ํ•  ์˜ํ–ฅ์ด ์žˆ์Šต๋‹ˆ๊นŒ? ์ด ๋ฌธ์ œ์˜ ์ง„ํ–‰ ์ƒํ™ฉ์„ ๋” ์ž˜ ์ถ”์ ํ•  ์ˆ˜ ์žˆ๋„๋ก ์ง€๊ธˆ์€ ํ• ๋‹น์„ ์ทจ์†Œํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค. ์•„์ง ์ž‘์—… ์ค‘์ธ ๊ฒฝ์šฐ ์–ธ์ œ๋“ ์ง€ ๋‹ค์‹œ ํ• ๋‹นํ•ด ์ฃผ์„ธ์š”.

@wizcheu ์˜ ์ฝ”๋“œ๋ฅผ ์ฝ์€ ํ›„ padding='same'์œผ๋กœ ๋‹ค๋ฅธ ๋ฒ„์ „์˜ conv1d๋ฅผ ๋งŒ๋“ญ๋‹ˆ๋‹ค.

import math

import torch
import torch.nn.functional as F
from torch import nn

class Conv1dPaddingSame(nn.Module):
    '''pytorch version of padding=='same'
    ============== ATTENTION ================
    Only works when dilation == 1, groups == 1
    =========================================
    '''
    def __init__(self, in_channels, out_channels, kernel_size, stride):
        super(Conv1dPaddingSame, self).__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        self.weight = nn.Parameter(torch.rand((out_channels, 
                                                 in_channels, kernel_size)))
        # nn.Conv1d sets bias=True by default, so create this param
        self.bias = nn.Parameter(torch.rand(out_channels))

    def forward(self, x):
        batch_size, num_channels, length = x.shape
        if length % self.stride == 0:
            out_length = length // self.stride
        else:
            out_length = length // self.stride + 1

        pad = math.ceil((out_length * self.stride + 
                         self.kernel_size - length - self.stride) / 2)
        out = F.conv1d(input=x, 
                       weight = self.weight,
                       stride = self.stride, 
                       bias = self.bias,
                       padding=pad)
        return out
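Shape check (my example):

conv = Conv1dPaddingSame(1, 8, kernel_size=3, stride=2)
print(conv(torch.randn(1, 1, 11)).shape)  # torch.Size([1, 8, 6]) -> 6 == ceil(11 / 2)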

์ด์— ๋Œ€ํ•œ ์—…๋ฐ์ดํŠธ๊ฐ€ ์žˆ์Šต๋‹ˆ๊นŒ?

์–ด๋–ค ์—…๋ฐ์ดํŠธ??

@peterbell10 has linked a draft PR that you can follow.

์ด ํŽ˜์ด์ง€๊ฐ€ ๋„์›€์ด ๋˜์—ˆ๋‚˜์š”?
0 / 5 - 0 ๋“ฑ๊ธ‰