Detectron: Out of memory on 4GB card

Created on 24 Jan 2018  ·  24 comments  ·  Source: facebookresearch/Detectron

I'm trying to run Faster-RCNN on an Nvidia GTX 1050Ti, but I'm running out of memory. nvidia-smi says that about 170MB is already in use, but does Faster-RCNN really use 3.8GB of VRAM to process an image?

I tried Mask-RCNN too (the model in the getting started tutorial) and got about 4 images in (5 if I closed my browser) before it crashed.

Is this a bug, or does it really just need more than 4GB of memory?

INFO infer_simple.py: 111: Processing demo/18124840932_e42b3e377c_k.jpg -> /home/px046/prog/Detectron/output/18124840932_e42b3e377c_k.jpg.pdf
terminate called after throwing an instance of 'caffe2::EnforceNotMet'
  what():  [enforce fail at blob.h:94] IsType<T>(). wrong type for the Blob instance. Blob contains nullptr (uninitialized) while caller expects caffe2::Tensor<caffe2::CUDAContext> .
Offending Blob name: gpu_0/conv_rpn_w.
Error from operator: 
input: "gpu_0/res4_5_sum" input: "gpu_0/conv_rpn_w" input: "gpu_0/conv_rpn_b" output: "gpu_0/conv_rpn" name: "" type: "Conv" arg { name: "kernel" i: 3 } arg { name: "exhaustive_search" i: 0 } arg { name: "pad" i: 1 } arg { name: "order" s: "NCHW" } arg { name: "stride" i: 1 } device_option { device_type: 1 cuda_gpu_id: 0 } engine: "CUDNN"
*** Aborted at 1516787658 (unix time) try "date -d @1516787658" if you are using GNU date ***
PC: @     0x7f08de455428 gsignal
*** SIGABRT (@0x3e800000932) received by PID 2354 (TID 0x7f087cda9700) from PID 2354; stack trace: ***
    @     0x7f08de4554b0 (unknown)
    @     0x7f08de455428 gsignal
    @     0x7f08de45702a abort
    @     0x7f08d187db39 __gnu_cxx::__verbose_terminate_handler()
    @     0x7f08d187c1fb __cxxabiv1::__terminate()
    @     0x7f08d187c234 std::terminate()
    @     0x7f08d1897c8a execute_native_thread_routine_compat
    @     0x7f08def016ba start_thread
    @     0x7f08de52741d clone
    @                0x0 (unknown)
Aborted (core dumped)

enhancement

All 24 comments

Hi @Omegastick, the memory requirements of the Faster R-CNN algorithm vary depending on a number of factors, including the backbone network architecture and the test image scales used.

For example, you can run Faster R-CNN with the default ResNet-50 configuration using:

python2 tools/infer_simple.py \
  --cfg configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml \
  --output-dir /tmp/detectron-visualizations \
  --image-ext jpg \
  --wts https://s3-us-west-2.amazonaws.com/detectron/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl \
  demo

This should require no more than 3GB to run on the demo images.
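To make the dependence on test image scale concrete, here is a rough back-of-the-envelope sketch. It only counts the bytes of a single float32 feature map; the image sizes, the channel count (1024) and the stride (16) are illustrative assumptions, not measurements from Detectron, and the real footprint also includes weights, workspace buffers and every other activation.

```python
def feature_map_bytes(height, width, channels, stride, bytes_per_value=4):
    """Bytes needed for one float32 feature map of an image at a given stride."""
    # Ceil-divide the image size by the stride to get the feature map size.
    fm_h = -(-height // stride)
    fm_w = -(-width // stride)
    return fm_h * fm_w * channels * bytes_per_value


# Illustrative numbers only: one 1024-channel map at stride 16.
small = feature_map_bytes(640, 1024, channels=1024, stride=16)
large = feature_map_bytes(1280, 2048, channels=1024, stride=16)
print(small / 2**20, large / 2**20)  # size in MiB: 10.0 40.0
```

Doubling both sides of the input quadruples the size of every feature map, which is why a smaller test scale or a lighter backbone can make the difference on a 4GB card.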

One additional note: the current implementation uses memory optimizations during training, but not during inference. In the case of inference, it is possible to substantially reduce memory usage, since intermediate activations are not needed once they have been consumed. We will consider adding inference-only memory optimizations in the future.

@Omegastick Faster RCNN-resnet 101 and Mask RCNN-resnet 101, tested on my computer, both use about 4GB of GPU memory.

@ir413 Thank you, the model you linked works great on my machine (running at 2.5GB VRAM usage).

It would be great if inference did not require a GPU at all.

How can I run mask-rcnn on a GPU with 2GB of memory? Can anyone help me?

Is this issue caused by Caffe2 or by Detectron's implementation? Which file in Detectron should I look at to fix it?

@rbgirshick

"In the case of inference, it is possible to substantially reduce memory usage, since intermediate activations are not needed once they have been consumed. We will consider adding inference-only memory optimizations in the future."

Is this already implemented in PyTorch/Caffe2? If so, where should I dig?

@gadcam this has been on my TODO list for a long time, but unfortunately its priority keeps dropping rather than rising :/. I think caffe2.python.memonger.release_blobs_when_used (https://github.com/pytorch/pytorch/blob/master/caffe2/python/memonger.py#L229) should implement most of what we need. However, there are some non-trivial issues that need to be addressed:

  • For some networks (e.g., Mask R-CNN), multiple nets are used at inference time, so it is not possible to reason about only one graph in order to free all activations (since they might be needed by another graph, such as the mask head net).
  • This function requires using a caching memory manager, which we have not tested, so there may be problems with simply turning it on.
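The first point can be illustrated with a small, library-free sketch: before freeing anything in one net, collect the blobs that it produces and that some other net still reads. The representation of ops as (inputs, outputs) pairs and all blob names below are hypothetical and are not taken from Detectron or Caffe2.

```python
def blobs_needed_elsewhere(first_net_ops, other_nets_ops):
    """Return blobs written by the first net that any other net reads."""
    produced = {out for _, outputs in first_net_ops for out in outputs}
    consumed = {
        inp
        for ops in other_nets_ops
        for inputs, _ in ops
        for inp in inputs
    }
    return produced & consumed


# Hypothetical two-net setup: a body net and a separate mask head net.
conv_body = [
    (["data"], ["res4"]),
    (["res4"], ["fpn_p2"]),
    (["fpn_p2"], ["rois"]),
]
mask_head = [
    (["fpn_p2", "rois"], ["mask_probs"]),
]

keep = blobs_needed_elsewhere(conv_body, [mask_head])
print(sorted(keep))  # ['fpn_p2', 'rois']
```

A set computed this way is the kind of value one would expect to pass as the list of blobs that must not be freed, although whether that is sufficient for a real Mask R-CNN model is not established in this thread.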

@rbgirshick Thank you for your detailed explanation!

So, as I understand it, for us release_blobs_when_used does the job of converting a regular Proto into a "memory optimized" one.

"For some networks (e.g., Mask R-CNN), multiple nets are used at inference time, so it is not possible to reason about only one graph in order to free all activations (since they might be needed by another graph, such as the mask head net)."

In other words, does that mean we have to fill dont_free_blobs with the blobs that are used in the second stage?

"This function requires using a caching memory manager, which we have not tested, so there may be problems with simply turning it on."

So, to test it, we should set FLAGS_caffe2_cuda_memory_pool to cub (or thc), but can we do this from Python?
One of the very scarce references I could find is https://github.com/pytorch/pytorch/blob/6223bfdb1d3273a57b58b2a04c25c6114eaf3911/caffe2/core/context_gpu.cu#L190.

@gadcam

"So, as I understand it, for us release_blobs_when_used does the job of converting a regular Proto into a 'memory optimized' one."

Yes, that's right. It analyzes the computation graph, determines when each blob will no longer be used, and then inserts a memory free operation.
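The idea described here (find the last use of each blob, then insert a free right after it) can be sketched without Caffe2. This is a simplified illustration, not the memonger implementation: ops are plain (type, inputs, outputs) tuples, aliasing and in-place ops are ignored, and the toy net and blob names are made up.

```python
def insert_frees(ops, dont_free=frozenset()):
    """Return a new op list with a Free op after the last use of each blob."""
    produced = set()
    last_use = {}
    for index, (_, inputs, outputs) in enumerate(ops):
        for blob in inputs:
            last_use[blob] = index  # later ops overwrite earlier indices
        produced.update(outputs)

    result = []
    for index, op in enumerate(ops):
        result.append(op)
        for blob in sorted(set(op[1])):
            # Only free blobs computed by this net (not weights or inputs),
            # and only right after the final op that reads them.
            if blob in produced and blob not in dont_free and last_use[blob] == index:
                result.append(("Free", [blob], [blob]))
    return result


toy_net = [
    ("Conv", ["data", "w1"], ["a"]),
    ("Relu", ["a"], ["b"]),
    ("Conv", ["b", "w2"], ["c"]),
]
for op in insert_frees(toy_net, dont_free={"c"}):
    print(op)
```

In this toy net, "a" is freed after the Relu and "b" after the second Conv, while the weights and the final output "c" are left alone.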

"In other words, does that mean we have to fill dont_free_blobs with the blobs that are used in the second stage?"

Yes, with the caveat that I'm not sure how well used and/or tested this function is... from grepping the code, it looks like it is not really used. So I would keep in mind that it may not work as expected.

"So, to test it, we should set FLAGS_caffe2_cuda_memory_pool to cub (or thc), but can we do this from Python?"

Yes. I think the newly added thc memory manager is more efficient. We had to use it rather than cub for a recent (though different) use case.

@rbgirshick You are right, this looks like a risky path!

"Yes. I think the newly added thc memory manager is more efficient. We had to use it rather than cub for a recent (though different) use case."

What I meant was: do you know where we can find documentation, or do you have an example? (Really sorry to insist on this part. I might have missed something, but I was not able to find any related documentation.)

@gadcam regarding documentation, not that I'm aware of. Sorry!

@asaadaldien Really sorry to bother you, but you seem to be one of the few people who advised to

"make sure caffe2_cuda_memory_pool is set"

when using memonger or data_parallel_model (for reference, it is here).
Would you have a hint on how to enable the caching memory manager? (Using Caffe2 in Python)

@gadcam You can enable the cub cached allocator by passing cub to the caffe2_cuda_memory_pool flag, e.g.:

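from caffe2.python import workspace
# The first element plays the role of argv[0] (the program name); flags follow it
workspace.GlobalInit([
    'caffe2',
    '--caffe2_cuda_memory_pool=cub',
])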

However, this is only required when using the dynamic memonger.

@asaadaldien
There is no documentation about GlobalInit, so it would have taken me a long time to figure out how to do it.
Thank you very much for your help! Now I can start some experiments!

There is a simple solution to this issue.
You can set 'P2~P5' and 'rois' as output blobs, not just intermediate blobs. They will then not be optimized away when memory optimization is used.

It does not seem to work for me.
The model I tested is e2e_keypoint_rcnn_R-50-FPN_s1x.yaml.
I tried the test on the model.net part.

I used infer_simple.py for the test.

workspace.GlobalInit(['caffe2', '--caffe2_log_level=0', '--caffe2_cuda_memory_pool=thc']) 

and

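import copy
from caffe2.python.memonger import release_blobs_when_used

# Never free the net's external outputs; every other op input is expected to be freed
dont_free_blobs = set(model.net.Proto().external_output)
expect_frees = set(i for op in model.net.Proto().op for i in op.input)
expect_frees -= dont_free_blobs

# release_blobs_when_used returns a copy of the net with Free ops inserted
opti_net = release_blobs_when_used(model.net.Proto(), dont_free_blobs, selector_fun=None)
# Append the ops of that copy to the existing op list of model.net
model.net.Proto().op.extend(copy.deepcopy(opti_net.op))

test_release_blobs_when_used(model.net.Proto(), expect_frees)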

where test_release_blobs_when_used is inspired by https://github.com/pytorch/pytorch/blob/bf58bb5e59fa64fb49d77467f3466c6bc0cc76c5/caffe2/python/memonger_test.py#L731

def test_release_blobs_when_used(with_frees, expect_frees):
    found_frees = set()
    for op in with_frees.op:
        if op.type == "Free":
            print("OP FREE", op)
            assert op.input[0] not in found_frees  # no double frees
            found_frees.add(op.input[0])
        else:
            # Check that a freed blob is not used anymore
            for inp in op.input:
                assert inp not in found_frees
            for outp in op.output:
                assert outp not in found_frees

    # Report mismatches instead of failing, so that the run can continue
    if expect_frees != found_frees:
        print("Found - Expect frees Nb=", len(found_frees - expect_frees), found_frees - expect_frees, "\n\n\n")
        print("Expect - Found frees Nb=", len(expect_frees - found_frees), expect_frees - found_frees, "\n\n\n")

Please note that dont_free_blobs is not set to the right value!

This function tells us that no unexpected blobs are freed and that some expected frees are missing.
(This is normal, since dont_free_blobs is not correct.)
So I go ahead and run the model.

And... nothing happens. I checked using the save_graph function: the Free operations really are in the right place.

The memory usage for a sample input at this line is 1910 MB +/- 5 MB.
https://github.com/facebookresearch/Detectron/blob/6c5835862888e784e861824e0ad6ac93dd01d8f5/detectron/core/test.py#L158

But if I set the memory manager to CUB, something really surprising happens.

workspace.GlobalInit(['caffe2', '--caffe2_log_level=0', '--caffe2_cuda_memory_pool=cub']) 

The RAM usage at the RunNet line is 3 GB!! (with either the normal code or the custom code with Free ops)

I do not understand what is going on...

I am also getting an out-of-memory error when starting inference on my Jetson TX1, as described in #507.
The solution described in this thread:
python2 tools/infer_simple.py \
  --cfg configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml \
  --output-dir /tmp/detectron-visualizations \
  --image-ext jpg \
  --wts https://s3-us-west-2.amazonaws.com/detectron/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl \
  demo
also does not work. I have 4GB of RAM available in total, but I still run out of memory (although CPU and GPU memory are shared).
Is there a smaller model that I could still try?
(as described by @Omegastick)

@johannathiemich I have the same problem. There is no error, but the process gets killed. Have you solved the problem? I am also using a Jetson TX1.

@ll884856 Yes, actually I did. I replaced the base net with a SqueezeNet and retrained the net. But keep in mind that the performance is much worse than with the original ResNet backbone.
One thing you can try before changing the base net is turning off the FPN, which might help as well. The performance will decrease there too, though hopefully not as badly.
If you want, I can give you my implementation and weights for the SqueezeNet. I am currently writing my bachelor's thesis on this topic.

@johannathiemich Thank you for your reply! Actually, I am new to this field and not familiar with the architecture of Mask R-CNN. If you could share your implementation and weights, it would help me a lot in understanding and implementing Mask R-CNN. My email is [email protected].
Thank you!

Yes, you can run Mask-RCNN on a CPU, just not with Detectron.

See:
https://vimeo.com/277180815

I have a similar problem. I would really appreciate it if anybody could help me out here: https://github.com/facebookresearch/detectron2/issues/1539 I really don't understand why this is happening. After including the torch.no_grad() part, predicting 25 images as a batch on the CPU takes 9.3GB of RAM.
