I'm trying to run Faster-RCNN on an Nvidia GTX 1050Ti, but I'm running out of memory. According to nvidia-smi, about 170MB is already in use, but does Faster-RCNN really need 3.8GB of VRAM to process an image?
I also tried Mask-RCNN (the model from the getting started tutorial) and got through about 4 images (5 with my browser closed) before it crashed.
Is this a bug, or does it really need more than 4GB of memory?
INFO infer_simple.py: 111: Processing demo/18124840932_e42b3e377c_k.jpg -> /home/px046/prog/Detectron/output/18124840932_e42b3e377c_k.jpg.pdf
terminate called after throwing an instance of 'caffe2::EnforceNotMet'
what(): [enforce fail at blob.h:94] IsType<T>(). wrong type for the Blob instance. Blob contains nullptr (uninitialized) while caller expects caffe2::Tensor<caffe2::CUDAContext> .
Offending Blob name: gpu_0/conv_rpn_w.
Error from operator:
input: "gpu_0/res4_5_sum" input: "gpu_0/conv_rpn_w" input: "gpu_0/conv_rpn_b" output: "gpu_0/conv_rpn" name: "" type: "Conv" arg { name: "kernel" i: 3 } arg { name: "exhaustive_search" i: 0 } arg { name: "pad" i: 1 } arg { name: "order" s: "NCHW" } arg { name: "stride" i: 1 } device_option { device_type: 1 cuda_gpu_id: 0 } engine: "CUDNN"
*** Aborted at 1516787658 (unix time) try "date -d @1516787658" if you are using GNU date ***
PC: @ 0x7f08de455428 gsignal
*** SIGABRT (@0x3e800000932) received by PID 2354 (TID 0x7f087cda9700) from PID 2354; stack trace: ***
@ 0x7f08de4554b0 (unknown)
@ 0x7f08de455428 gsignal
@ 0x7f08de45702a abort
@ 0x7f08d187db39 __gnu_cxx::__verbose_terminate_handler()
@ 0x7f08d187c1fb __cxxabiv1::__terminate()
@ 0x7f08d187c234 std::terminate()
@ 0x7f08d1897c8a execute_native_thread_routine_compat
@ 0x7f08def016ba start_thread
@ 0x7f08de52741d clone
@ 0x0 (unknown)
Aborted (core dumped)
Hi @Omegastick, the memory requirements of the Faster R-CNN algorithm vary with several factors, including the backbone network architecture and the test image scale used.
For example, you can run Faster R-CNN with the default ResNet-50 configuration using:
python2 tools/infer_simple.py \
--cfg configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml \
--output-dir /tmp/detectron-visualizations \
--image-ext jpg \
--wts https://s3-us-west-2.amazonaws.com/detectron/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl \
demo
This should not need more than 3GB to run on the demo images.
One additional note: the current implementation uses memory optimizations during training, but not during inference. For inference, memory usage can be reduced substantially, because intermediate activations are no longer needed once they have been consumed. We will consider adding inference-only memory optimizations in the future.
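To make the point about intermediate activations concrete, here is a minimal sketch in plain Python. It assumes a simple chain of layers in which each layer reads only the previous layer's output, and the sizes are made-up numbers rather than measurements from Detectron.

```python
def peak_activation_size(activation_sizes, free_after_use):
    """Peak amount of activation memory held while running a chain of layers.

    activation_sizes[i] is the size of layer i's output. Simplifying
    assumption: each layer reads only the previous layer's output.
    """
    live = 0
    peak = 0
    for i, size in enumerate(activation_sizes):
        live += size                          # allocate this layer's output
        peak = max(peak, live)
        if free_after_use and i > 0:
            live -= activation_sizes[i - 1]   # the input has been consumed
    return peak


sizes_mb = [400, 300, 200, 100]               # hypothetical sizes in MB
keep_all = peak_activation_size(sizes_mb, free_after_use=False)
free_early = peak_activation_size(sizes_mb, free_after_use=True)
print(keep_all, free_early)
```

In this toy chain the peak drops from the sum of all activations to the largest pair of adjacent activations. Real detection networks have branches and skip connections, so the actual saving depends on the graph.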
@Omegastick I tested on my machine: Faster RCNN-resnet101 and Mask RCNN-resnet101 both use about 4GB of GPU memory.
@ir413 Thanks, the model you linked works fine on my machine (it runs with 2.5GB of VRAM in use).
It would be great if inference did not need a GPU at all.
How can I run mask-rcnn on a GPU with 2GB of memory? Can anyone help me?
Is this problem caused by Caffe2 or by the Detectron implementation? Which Detectron files should I look at to solve it?
@rbgirshick
"For inference, memory usage can be reduced substantially, because intermediate activations are no longer needed once they have been consumed. We will consider adding inference-only memory optimizations in the future."
Is anything already implemented in PyTorch / Caffe2? If so, where should we dig?
@gadcam this has been on my to-do list for a long time. I think caffe2.python.memonger.release_blobs_when_used (https://github.com/pytorch/pytorch/blob/master/caffe2/python/memonger.py#L229) should implement most of what is needed. However, there are a few non-trivial issues that need to be addressed.
@rbgirshick thank you for the detailed explanation!
As I understand it, for us release_blobs_when_used works as a converter from a normal Proto to a "memory-optimized" one.
Some networks (e.g., Mask R-CNN) use multiple nets at inference time, so you cannot simply run one graph and free all of its activations (they may be needed by another graph, such as the mask net).
In other words, dont_free_blobs needs to be filled with the blobs used in the second stage?
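As a rough illustration of that question, the set of blobs to protect could be computed as the blobs produced by the first net that any later net reads. This is only a sketch: ops are modelled as plain dictionaries and all blob names are hypothetical, not the names Detectron uses.

```python
def blobs_to_keep(first_stage_ops, later_stage_nets):
    """Blobs produced by the first net that a later net still reads."""
    produced = {blob for op in first_stage_ops for blob in op["outputs"]}
    needed_later = {
        blob
        for net in later_stage_nets
        for op in net
        for blob in op["inputs"]
    }
    return produced & needed_later


# Hypothetical two-stage model: a body net followed by a mask net.
body_net = [
    {"inputs": ["data"], "outputs": ["backbone_feat"]},
    {"inputs": ["backbone_feat"], "outputs": ["proposals"]},
    {"inputs": ["backbone_feat", "proposals"], "outputs": ["box_scores"]},
]
mask_net = [
    {"inputs": ["backbone_feat", "proposals", "mask_w"], "outputs": ["mask_probs"]},
]

dont_free = blobs_to_keep(body_net, [mask_net])
print(sorted(dont_free))
```

With real Caffe2 nets the same idea would read op.input and op.output from each net's Proto(); the final outputs of the first net would also need to be added to the set.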
This function requires a cached memory manager, which has not been tested, so simply turning it on may cause problems.
So to test it we would need to set FLAGS_caffe2_cuda_memory_pool to cub (or thc), but can that be done from Python?
One of the very few references to it that I could find is here: https://github.com/pytorch/pytorch/blob/6223bfdb1d3273a57b58b2a04c25c6114eaf3911/caffe2/core/context_gpu.cu#L190
@gadcam
"As I understand it, release_blobs_when_used works as a converter from a normal Proto to a "memory-optimized" one."
That's correct. It analyzes the computation graph, determines when each blob is no longer used, and then inserts memory free operations.
"In other words, dont_free_blobs needs to be filled with the blobs used in the second stage?"
Yes, with the caveat that I don't know how much this function is used and/or tested... grepping the code, it does not appear to be used anywhere. So be aware that it may not work as expected.
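The analysis described here (find the last use of each blob, then insert a free operation after it) can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not the memonger implementation: ops are dictionaries, only blobs produced inside the net are released, and the blob names are invented.

```python
def insert_free_ops(ops, dont_free):
    """Return a new op list with a Free op after the last use of each blob.

    Only blobs produced inside the net are freed, so external inputs such as
    weights survive. Blobs named in dont_free are never freed.
    """
    produced = {out for op in ops for out in op["outputs"]}
    last_use = {}
    for index, op in enumerate(ops):
        for blob in op["inputs"]:
            last_use[blob] = index            # later uses overwrite earlier ones

    result = []
    for index, op in enumerate(ops):
        result.append(op)
        for blob in dict.fromkeys(op["inputs"]):   # de-duplicate, keep order
            releasable = blob in produced and blob not in dont_free
            if releasable and last_use[blob] == index:
                result.append({"type": "Free", "inputs": [blob], "outputs": []})
    return result


toy_net = [
    {"type": "Conv", "inputs": ["data", "conv_w"], "outputs": ["conv"]},
    {"type": "Relu", "inputs": ["conv"], "outputs": ["relu"]},
    {"type": "FC", "inputs": ["relu", "fc_w"], "outputs": ["score"]},
    {"type": "Softmax", "inputs": ["score"], "outputs": ["prob"]},
]
optimized = insert_free_ops(toy_net, dont_free={"score"})
freed = [op["inputs"][0] for op in optimized if op["type"] == "Free"]
print(freed)
```

Here "score" stays alive because it is listed in dont_free, which is the role dont_free_blobs would play for blobs that a second-stage net still needs.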
"So to test it we would need to set FLAGS_caffe2_cuda_memory_pool to cub (or thc), but can that be done from Python?"
Yes. I think the newly added thc memory manager is more efficient. In a recent (though different) use case, we needed to use it instead of cub.
@rbgirshick you're right, it looks like a risky path!
"Yes. I think the newly added thc memory manager is more efficient. In a recent (though different) use case, we needed to use it instead of cub."
What I meant was: do you know where the documentation for doing this is, or is there an example? (I'm really sorry to insist; maybe I'm missing something, but I could not find any documentation about it.)
Regarding documentation:
@asaadaldien sorry to bother you, but you seem to be one of the few people who have given the advice "make sure caffe2_cuda_memory_pool is set" when using memonger or data_parallel_model (here for reference).
Do you have any hints on how to enable the cached memory manager? (Using Caffe2 from Python.)
@gadcam you can enable the cub caching allocator by passing cub to the caffe2_cuda_memory_pool flag. For example:
workspace.GlobalInit([
'--caffe2_cuda_memory_pool=cub',
])
However, this is only required when using the dynamic memory memonger.
@asaadaldien
There is no documentation on GlobalInit, so it took me a long time to figure out how to do this.
Thank you for your help! Now I can start running some experiments!
I have a simple solution for this problem.
You can set 'P2' to 'P5' and 'rois' as output blobs rather than just intermediate blobs; then they are not optimized away when memory optimization is used.
It does not seem to work for me.
The model I tested is e2e_keypoint_rcnn_R-50-FPN_s1x.yaml.
I tried it on the model.net part only.
I used infer_simple.py for the test, with
workspace.GlobalInit(['caffe2', '--caffe2_log_level=0', '--caffe2_cuda_memory_pool=thc'])
and
import copy
from caffe2.python.memonger import release_blobs_when_used

# Keep the net's declared outputs; every other op input is expected to be freed.
dont_free_blobs = set(model.net.Proto().external_output)
expect_frees = set(i for op in model.net.Proto().op for i in op.input)
expect_frees -= dont_free_blobs
opti_net = release_blobs_when_used(model.net.Proto(), dont_free_blobs, selector_fun=None)
model.net.Proto().op.extend(copy.deepcopy(opti_net.op))
test_release_blobs_when_used(model.net.Proto(), expect_frees)
Here, test_release_blobs_when_used is inspired by https://github.com/pytorch/pytorch/blob/bf58bb5e59fa64fb49d77467f3466c6bc0cc76c5/caffe2/python/memonger_test.py#L731:
def test_release_blobs_when_used(with_frees, expect_frees):
    found_frees = set()
    for op in with_frees.op:
        if op.type == "Free":
            print("OP FREEE", op)
            assert(not op.input[0] in found_frees)  # no double frees
            found_frees.add(op.input[0])
        else:
            # Check a freed blob is not used anymore
            for inp in op.input:
                assert(not inp in found_frees)
            for outp in op.output:
                assert(not outp in found_frees)
    try:
        assert(expect_frees == found_frees)
    except AssertionError:
        # Report the mismatch instead of aborting the run
        print("Found - Expect frees Nb=", len(found_frees - expect_frees), found_frees - expect_frees, "\n\n\n")
        print("Expect - Found frees Nb=", len(expect_frees - found_frees), expect_frees - found_frees, "\n\n\n")
        #assert(False)
Note that dont_free_blobs is not set to the correct value!
This function shows that no unexpected blob is freed, and that some expected frees are missing.
(This is normal, since dont_free_blobs is not correct.)
So I went ahead and ran the model.
And... nothing happened. I checked with the save_graph function: the Free operations really are in the right places.
Memory usage for a sample input at this line is 1910 MB +/- 5 MB:
https://github.com/facebookresearch/Detectron/blob/6c5835862888e784e861824e0ad6ac93dd01d8f5/detectron/core/test.py#L158
But when I set the memory manager to CUB, something really surprising happens.
workspace.GlobalInit(['caffe2', '--caffe2_log_level=0', '--caffe2_cuda_memory_pool=cub'])
Memory usage at the RunNet line increases to something like 3 GB!! (This happens with both the normal code and the custom code that frees blobs.)
I cannot understand what is going on...
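The thread does not resolve this observation. One thing that may be worth keeping in mind, offered as an assumption and not as a diagnosis of the CUB result above, is that a caching memory pool generally keeps freed blocks for reuse instead of returning them to the device, so the memory reported by external tools does not fall when blobs are freed. The toy allocator below only illustrates that general behavior; it is not how the cub or thc pools are implemented.

```python
class ToyCachingPool:
    """Toy caching allocator: freed blocks go back to a cache, not to the device."""

    def __init__(self):
        self.reserved = 0   # what an external tool such as nvidia-smi would see
        self.in_use = 0     # what the program is actually using
        self.cache = []     # sizes of cached free blocks

    def alloc(self, size):
        fitting = sorted(block for block in self.cache if block >= size)
        if fitting:
            block = fitting[0]          # reuse the smallest cached block that fits
            self.cache.remove(block)
        else:
            block = size                # nothing fits: reserve new device memory
            self.reserved += block
        self.in_use += block
        return block

    def free(self, block):
        self.in_use -= block
        self.cache.append(block)        # kept for reuse; reserved is unchanged


pool = ToyCachingPool()
a = pool.alloc(100)
b = pool.alloc(200)
pool.free(a)
pool.free(b)
after_free = (pool.reserved, pool.in_use)

c = pool.alloc(150)                     # served from the cached 200 block
after_reuse = (pool.reserved, pool.in_use)
print(after_free, after_reuse)
```

In this model, inserting Free ops lowers in_use and lets later allocations reuse blocks, but reserved only ever grows, and a reused block can be larger than the request.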
As described in #507, I get an out-of-memory error when I start inference on a Jetson TX1.
The solution described in this thread is the following:
python2 tools/infer_simple.py \
--cfg configs/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml \
--output-dir /tmp/detectron-visualizations \
--image-ext jpg \
--wts https://s3-us-west-2.amazonaws.com/detectron/35857389/12_2017_baselines/e2e_faster_rcnn_R-50-FPN_2x.yaml.01_37_22.KSeq0b5q/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl \
demo
It still does not work. I have 4 GB of RAM in total (CPU and GPU memory are shared), but I run out of memory.
Is there an even smaller model I could try?
As @Omegastick explained, it should need at most 2.5 GB of memory, but it still does not seem to fit on the Jetson. Are there any other suggestions I could try?
@johannathiemich I ran into the same problem. There was no error, but the process was killed. Did you solve it? I am also using a Jetson TX1.
@ll884856 Yes, I actually did. In the end I replaced the base net with SqueezeNet and retrained it. Note, however, that performance is much worse than with the original ResNet backbone.
Something you can try before changing the base net is turning off FPN; that may also help. It will reduce performance too, though hopefully not as badly.
If you like, I can send you the squeezenet implementation and weights. I am currently working on my bachelor's thesis on this topic.
@johannathiemich Thanks for your reply! I am actually new to this field and not very clear on the Mask R-CNN architecture. If you could share your implementation and weights, it would help me a great deal in understanding and implementing Mask R-CNN. My email address is [email protected].
Thank you!
You can run Mask-RCNN on the CPU without using Detectron.
See:
https://vimeo.com/277180815
I have a similar problem as well, and I would really appreciate it if someone here could help me: https://github.com/facebookresearch/detectron2/issues/1539 I really don't know why this happens. Even after adding the torch.no_grad() part, predicting a batch of 25 images on the CPU needs 9.3GB of RAM.
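Since peak memory grows with the number of images processed at once, one generic way to bound it is to split the batch into smaller chunks and run them one after another. The sketch below is an assumption about a possible workaround, not something confirmed in the linked issue; predict_batch is a hypothetical callable standing in for the real model call, which in PyTorch would normally run inside a torch.no_grad() block.

```python
def predict_in_chunks(images, predict_batch, chunk_size):
    """Run predict_batch on at most chunk_size images at a time."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    results = []
    for start in range(0, len(images), chunk_size):
        chunk = images[start:start + chunk_size]
        results.extend(predict_batch(chunk))   # only one chunk is in flight
    return results


# Stand-in for a real model: records batch sizes and returns dummy outputs.
seen_batch_sizes = []

def fake_predict(batch):
    seen_batch_sizes.append(len(batch))
    return [image * 2 for image in batch]

images = list(range(25))
outputs = predict_in_chunks(images, fake_predict, chunk_size=8)
print(seen_batch_sizes)
```

The trade-off is throughput: smaller chunks lower the peak memory at the cost of more calls into the model.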