TensorFlow: OpenCL support

Created on November 9, 2015 · 541 comments · Source: tensorflow/tensorflow

I understand that TensorFlow only supports CUDA. What would need to be done to add OpenCL support?

contributions welcome

All 541 comments

It's strange that Google ditched OpenCL in favor of proprietary CUDA.
im-just-saying

At the very least, the Eigen library would have to support OpenCL.

:+1:

:+1:

:+1:

Thumbs up and all that.

I would be interested in extending TensorFlow with OpenCL, as we have already released OpenCL Caffe: https://github.com/amd/OpenCL-caffe. Hopefully it can be integrated in a lightweight way? Is anyone interested in working together on this?

@gujunli Nice to see AMD here. /cc @naibaf7 @lunochod

That would be great.

:+1:

/cc @lukeiwanski for Eigen/OpenCL/SYCL

@gujunli I would certainly be interested in contributing. Please let me know when you plan to start.

Hi all,

Here at Codeplay we are looking into running Eigen's tensors on GPU using SYCL (a modern C++ layer on top of OpenCL). From what we have gathered so far, the GPU tensor design is very closely coupled with CUDA, and it will require interface changes for other programming models, particularly for a SYCL and OpenCL 1.2 version.

If anyone is interested in digging deeper or helping out, we are most certainly interested in contributing.

Thanks,
Luke

@lukeiwanski Thank you for the feedback. I think @benoitsteiner worked on the tensor extension part of Eigen.

:+1: I can help write some OpenCL/SYCL code if someone makes a plan and divides the work into tasks. I recommend using Boost.Compute as a wrapper for OpenCL (it makes running kernels, testing, and templating easier).

+1

:+1:

Hi all,

Just to keep you posted: we are still investigating how to change the Eigen interface to better fit the SYCL/OpenCL 1.2 programming model.
Once we come up with a reasonable approach that targets heterogeneous programming models (not only OpenCL/SYCL), we will write a proposal.

Thanks,
Luke

Please keep me updated. I developed opencl-caffe for AMD. I am also looking at TensorFlow.

Thanks.
Junli

/cc @ptillet @gongzg Is there any interest in this from Intel? I really hope we don't fragment OpenCL here the way it happened in Caffe, where there is an AMD fork, unmerged Intel PRs, another semi-official AMD PR, and a long-staging user PR. If anyone is interested in the history, see the comments on https://github.com/BVLC/caffe/pull/2610.

@bhack We are interested in this. Thanks for letting me know. If there is a proposal for an OpenCL/SYCL implementation in Eigen, we will see what we can do from the Intel side.

:+1:

There is also an interesting initiative at https://github.com/ptillet/isaac, in case we rely on the Eigen tensor extension here.

I would also like to contribute. @benoitsteiner, can you organize it?

This was included in the roadmap but is also tagged as a contribution, so some direction or bootstrapping would be really useful.

I can help organize it. Who is responsible for OpenCL support in TensorFlow now?

Thanks a lot.
Junli

Junli Gu - 谷俊丽
Coordinated Science Lab
University of Illinois at Urbana-Champaign


I just assumed it was Benoit because he self-assigned the feature, but I think you've got it, Junli! Maybe start with an email or forum thread for interested parties?

@benoitsteiner knows more about interested parties who may not have shown up in this thread (or this issue). I would wait for him to coordinate, to make sure we avoid duplicating work.

I'm interested. Is there a roadmap?

Is there a list of the CUDA-dependent libraries that TensorFlow relies on?

That would help us see whether there are immediate OpenCL alternatives.

@hsaputra
There are clFFT and clBLAS (or alternatively ViennaCL). Random number generation is a bit trickier (there is no cuRAND equivalent): either use a CPU generator and transfer the values to the GPU, or use another existing kernel for RNG.

The biggest pitfall will again be efficient convolution implementations (something like cuDNN).

There is experience with these issues here:
https://github.com/BVLC/caffe/pull/2610
https://github.com/BVLC/caffe/pull/2195
https://github.com/amd/OpenCL-caffe

TensorFlow uses the tensor extension that was upstreamed to Eigen, so I think OpenCL/SYCL support in Eigen is needed. See this thread.

@naibaf7 Thanks. Yes, I don't think there is currently a viable alternative to cuDNN for OpenCL.

The website http://opencl.org was created to support open source porting projects just like this one! We are currently installing all the necessary tools on the website, and there is space for repositories at https://github.com/OpenCL/. Later on we will add build servers to test on several types of hardware, and we can provide our expertise on how to write code that runs at full speed on a wide range of hardware.

We are starting a porting initiative for GEGL next week, but we are happy to support you as well.

@bhack From that thread and from here, it seems @lukeiwanski is looking into it. I think we have enough people willing to work on it; we just need @benoitsteiner, @lukeiwanski or @gujunli to coordinate. Benoit has been quiet, maybe he is on holiday.

I would love to contribute to this initiative.

Hi all,

We will coordinate the effort of porting Eigen's tensor module to SYCL for OpenCL, as we already have something mostly working, although it is not ready for review yet.

We favor this approach because it is less invasive to the code base. SYCL supports the single-source C++ template model that Eigen already uses.

The roadmap design is in progress, so it shouldn't be too long now.

Thanks,
Luke

@lukeiwanski Are you working with, or in contact with, upstream? Do you think it will be accepted upstream in Eigen?

+1

Great news @lukeiwanski, let us know if you need any help.

I guess you are using your own SYCL implementation. Will it be available to developers and researchers? On which platforms?

@lukeiwanski SYCL seems like the right way to go given the amount of template metaprogramming involved in Eigen. I am an experienced C++ developer with OpenCL experience gained from developing my own neural network and linear algebra library. I would love to help with this effort and get started developing with SYCL.

@bhack We are in contact with @benoitsteiner, but we will discuss our proposal with the upstream maintainers before we invest too much effort.

@DanMcLaughlin, @ville-k We are developing our own SYCL implementation, ComputeCpp (https://www.codeplay.com/products/computecpp). For more information, could you please contact me via the email address on my profile?

@lukeiwanski Is there any update or estimate regarding the plans?

+1.
I have an AMD GPU and an Intel GPU in my laptop. Both have OpenCL drivers, and AMD's support seems to be much better. I would get higher performance because I have two OpenCL devices. I hope you make it scale across OpenCL devices.

Hi all,

Thanks for the interest!
At this point we are setting up our testing infrastructure to make sure that nothing we do introduces regressions.
We are in touch with @benoitsteiner to make sure we are in sync with what he has done so far.

We are still compiling a roadmap for the integration process. It should be done in a couple of weeks, as there are a few business details to clarify.

Our goal is to bring OpenCL to TensorFlow via Eigen by the end of this year.

Thanks,

Interested. I would love to contribute.

OK, so it actually seems to be a Codeplay effort with some internal sync with Google. What is the role of the AMD and Intel subscribers here?

/cc @keryell in case there is any interest in this from the SYCL/FPGA world

My apologies for not contributing more to this discussion recently; my plate has been more than full for the past two weeks.

I will be coordinating the OpenCL effort on the TensorFlow side. Our current thinking is:

  • TensorFlow relies on C++11 and has taken a "single source" approach, so SYCL seems like a great fit.
  • We don't have much OpenCL experience in house, so we are collaborating closely with Codeplay to bridge this gap. In particular, Codeplay is currently leading the effort to add SYCL support to the Eigen tensor library.
  • TensorFlow relies on the cuDNN library to compute convolutions on NVidia GPUs. If somebody is interested in contributing an OpenCL equivalent, we would be happy to help.

To help structure the effort, I created a mailing list: [email protected].

@bhack Sure, I am interested in high-end C++ on FPGA :-)
TensorFlow sounds like a good validation use case for triSYCL too.
By the way, if anyone here is looking for an internship on this subject, I have some positions. It looks like Codeplay is looking for people too, if I trust their website.

I am really interested in the opinions of @karlrupp and @hughperkins. I hope they will join the discussion on the new Google group.

@benoitsteiner Thank you for the update. It would be wonderful if all the involved partners in @KhronosGroup (Google, Nvidia, AMD, Intel, Codeplay, Xilinx, etc.) promoted a cuDNN-like API in a standardized way: a sort of Khronos OpenVX computer vision standardization effort, but for deep learning.

@bhack Which new Google group?

Other than that, OpenCL and CUDA are very different programming approaches. CUDA works the way it does because one company has full control over everything, so it can embed binary blobs and who knows what else in the final executable. This cannot be done with OpenCL unless one goes down the SYCL path (I have my concerns...) and the SYCL compiler vendor has full control over all possible target architectures (unlikely or impossible in practice). Overall, my opinion is that a good OpenCL-enabled library needs more than just a few tweaks here and there. Probably not what you wanted to hear, but you asked for my opinion :-)

@karlrupp For the Google group, see the end of https://github.com/tensorflow/tensorflow/issues/22#issuecomment-176406416.
I asked for your opinion because you have great experience with ViennaCL in interfacing an algebra library with multiple backends (CPU, GPU, MIC). TensorFlow relies on the Eigen library and its new tensor extension, which Google contributed upstream (but only with a CUDA backend). I don't think they have yet run into many of the pitfalls you have already encountered with ViennaCL over these years of development.

@bhack We are at the face-to-face meeting in Seattle this week, but of course I cannot say whether or not we are talking about DNN libraries... :-)

@keryell Try to push the cause in Seattle ;)

@karlrupp You are right, OpenCL and CUDA are very different programming approaches. The single-source aspect found in, for example, CUDA and OpenMP 4.5 is extremely powerful from a software engineering perspective. This is why there is a SYCL standard for real C++ programmers. SYCL can be seen as CUDA on steroids, without any language extensions and with some OpenMP aspects (tasks). A typical SYCL device compiler is expected to generate SPIR-V kernels.

Your concerns about portability are less of an issue with the SPIR-V standard (a kind of portable equivalent of nVidia PTX/AMDIL/... in the Vulkan and OpenCL world), which OpenCL 2.1 and Vulkan are required to accept. So the beauty is that if you have a front end that generates SPIR-V, you do not need special knowledge of the details of the hardware it will run on. There is a Khronos open source bidirectional translator between LLVM IR and SPIR-V, so this opens up entirely new territory.

@keryell I agree that SPIR-V is a step forward. However, it does not address all the issues of exhaustive JIT compilation.

> you do not need special knowledge of the details of the hardware it will run on.

Is this a copy and paste from the OpenCL 1.0 marketing, which claimed exactly the same thing? You will _always_ need to go down to the details of the underlying hardware if you aim for maximum performance. This is especially true in the context of fast tensor contractions.

...as @scott-gray demonstrated with neon.

@karlrupp

> Is this a copy and paste from the OpenCL 1.0 marketing, which claimed exactly the same thing?

Haha. :-)

> You will always need to go down to the details of the underlying hardware if you aim for maximum performance. This is especially true in the context of fast tensor contractions.

Of course, but before playing with second-order optimizations, it is useful to have the bulk of the templated C++ code running in some accelerated way.

For the optimization, you can either stitch in your optimized binary kernels à la NervanaSys or, since SYCL is pure C++, use asm("...") with a lot of #ifdef to test for the target architecture. :-) That said, SPIR-V is itself extensible, and I cannot see why we could not put inline VHDL or Verilog in it at some point. :-)

More concretely, the recently introduced sub-group operations should help achieve good performance in a portable way, and using simple built-in ad-hoc functions may help too.

C++ adds interesting metaprogramming features that can replace most of the code generators used in clBLAS and other frameworks to produce code better suited to hardware X or Y.
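As a rough illustration of the metaprogramming point above, here is a minimal sketch in plain C++ (not SYCL). The `SmallCacheTarget` and `LargeCacheTarget` types and their tile sizes are hypothetical, invented for this example; they are not taken from clBLAS or any real device. The idea is only that a template parameter lets the compiler produce one specialised loop nest per target, which is the job an external code generator would otherwise do.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical hardware descriptions. The tile sizes are made-up examples,
// not tuned values for any real device.
struct SmallCacheTarget { static constexpr std::size_t tile = 2; };
struct LargeCacheTarget { static constexpr std::size_t tile = 4; };

// Square n x n row-major matrix multiply, blocked by a compile-time tile size.
// The compiler instantiates one specialised loop nest per Target type.
template <typename Target>
std::vector<float> tiled_matmul(const std::vector<float>& A,
                                const std::vector<float>& B, std::size_t n) {
    constexpr std::size_t T = Target::tile;
    std::vector<float> C(n * n, 0.0f);
    for (std::size_t ii = 0; ii < n; ii += T)
        for (std::size_t kk = 0; kk < n; kk += T)
            for (std::size_t jj = 0; jj < n; jj += T)
                // Inner loops work on one T x T block, clipped at the edges.
                for (std::size_t i = ii; i < ii + T && i < n; ++i)
                    for (std::size_t k = kk; k < kk + T && k < n; ++k)
                        for (std::size_t j = jj; j < jj + T && j < n; ++j)
                            C[i * n + j] += A[i * n + k] * B[k * n + j];
    return C;
}
```

Calling `tiled_matmul<SmallCacheTarget>(A, B, n)` and `tiled_matmul<LargeCacheTarget>(A, B, n)` gives the same result with different blocking; a real library would pick the target type from device queries and tune the tile sizes empirically.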

Also, N4355 in C++17 could enter the game sooner or later.

@karlrupp, @bhack The TensorFlow approach is to rely on a hardware abstraction (the tensor module) for the majority of the operations needed by a typical neural network, while relying on specialized libraries (such as cuDNN) for the few operations that are really performance critical. The hardware abstraction lets us implement most TensorFlow operations once and run them on an accelerator with more than good enough performance.

@bhack Yes, I love multidimensional arrays. Also within our area of interest, there is SG14 in the C++ committee, which tries to get everyone interested in these issues to converge on the standard.
https://groups.google.com/a/isocpp.org/forum/#!forum/sg14
Of course SYCL is part of the discussions. :-)

@benoitsteiner Mainly on cuDNN for pooling and convolution. I think that if every vendor produces an API for these operations on its own hardware, with its own binary assembly, it will not be a very scalable approach. That is why I think some performance-critical API calls would be better standardized in some way.

@keryell There are really interesting topics on matrices and tensors in the new SG14 C++ group, especially on the vector/SIMD calls agenda. But it seems nobody has talked about convolution, pooling, and other useful "stabilized" deep learning interfaces. It also seems to me that this specific standardization subgroup has people from Nvidia, Intel, AMD, Codeplay, etc., but nobody from Google, even though Google is in other groups.

:+1:

@bhack Yes, there is no machine-learning-style proposal in SG14 yet. But participation is open, so you can send some proposals. :-) Perhaps SG6 (numerics topics) is more relevant, though. I don't think they have their own mailing list or forum yet.

@gujunli Does OpenCL Caffe run on Android? Sorry for asking here, but I couldn't find anywhere else to ask. :) It would be great to have a deep learning library that runs on Android devices _and_ can use the GPU, but there doesn't seem to be one at the moment. (Correct me if I'm wrong!)

@krikru
The official (but experimental) OpenCL Caffe branch can be made to run on Android GPUs, but performance is currently far from optimal. See https://github.com/sh1r0/caffe-android-lib/issues/23 and https://github.com/BVLC/caffe/tree/opencl.

A real alternative to cuDNN could be an extension of the OpenVX standard objects with support for Tensor, NdConvolution, and NdPooling operators, and (probably) some other operators that could be considered standardizable.
The cuDNN team also has to choose which new APIs and operators to introduce in every release. Of course a standard cannot move as fast as cuDNN releases, but I think some operations and objects have enough "citation history" to be standardized.

@hughperkins At the moment I haven't tried any deep learning library on Android; I'm just doing some scouting to see which library I could potentially use. Have you tried cltorch and DeepCL on Android? I assumed cltorch worked on Android, since there is a Torch implementation dedicated specifically to Android. And why would anyone build such an implementation if there were already one that both worked on Android _and_ used OpenCL? But maybe I should have known better.

@hughperkins For some reason I imagined that torch-android was the official Torch implementation for Android, meaning that no other Torch implementation (at least no unofficial one), including cltorch, was likely to run smoothly on Android. I don't know why I thought that; of course it doesn't make any sense.

Well... Soumith kind of coordinates Torch development. He works at Facebook AI Research. So, since the torch-android repository belongs to Soumith, I would say it is fairly close to official, but for some reason it may not be part of core. I guess you could ask the question as an issue in that repository, or at https://groups.google.com/forum/#!forum/torch7. Actually, since Soumith is the main person who handles requests at https://groups.google.com/forum/#!forum/torch7, you probably want to post your question there.

> meaning that no other Torch implementation (at least no unofficial one), including cltorch, was likely to run smoothly on Android.

cltorch is not an implementation of Torch. It is a plugin that provides OpenCL. You need both.

> cltorch is not an implementation of Torch. It is a plugin that provides OpenCL. You need both.

Ah, thanks for the clarification.

@naibaf7 Do the OpenCL Caffe branch and AMD's OpenCL Caffe implementation have anything in common besides the name? Have you compared the two, or do you know whether there is any difference in performance? You write that the OpenCL branch is far from optimal performance. What does that mean, and what would be needed to improve it? It would be interesting to try it on Android.

We are going off topic.

@bhack Yes, sorry for hijacking this thread. I just didn't know where to ask the question.

@krikru
Please raise an issue about this on the Caffe branch and flag it with Android and OpenCL. Then we can discuss it further. Thanks.

@keryell It looks like the next SG14 f2f meeting in March will be hosted by Google. Will anyone from TensorFlow be there?

/cc @jfbastien

Perhaps @benoitsteiner could drop by, since he is local.
But before that event there is the full C++ F2F at the end of the month in Jacksonville, Florida.
https://isocpp.org/files/papers/N4568.pdf
Unfortunately I will not be able to attend either of them.

+1

@bhack Thanks for the pointer on multidimensional arrays. It is interesting and addresses real issues, but it looks too ad hoc to be ratified in C++ as is. Personally I use Boost.MultiArray, and I would be more confident in a polished version of Boost.MultiArray.

There are also some papers at WG21. As you can see, @jfbastien at Google has been active in WG21 and also helped host the SG14 f2f meeting at Google in March.

@bhack @keryell I think it would be worth taking this discussion to the SG14 mailing list, since the details are not related to OpenCL/TensorFlow.

Yes, with all these details it is probably no longer strictly confined to this topic. Apart from the Eigen/SYCL support, is there a plan for the cuDNN calls?

+1 Very interesting topic. I hope it arrives soon.

This thread is very interesting. I have been trying to get Caffe working on Android, and the results are surprising: Caffe running on the Mali GPU is 2-3 times slower than on the CPU, but about 4-5 times more energy efficient. The test was run on a Galaxy S6 (Mali T760, peak performance 200 GFlops).

Since GEMM is the core of convolution in Caffe, I decided to profile its performance on Android. It seems ViennaCL is not as efficient as some simple kernels. I am now able to get the GPU to run as fast as the CPU for large matrices (2k x 2k). This is still counter-intuitive, since we normally expect GPUs to be much faster.

See:
https://github.com/strin/mocha-profile

The kernel implementations can be found here:

OpenCL kernels for GEMM: https://github.com/strin/gemm-android

Any thoughts?
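For readers unfamiliar with why GEMM dominates convolution cost, here is a minimal CPU-only sketch of the usual lowering: unroll image patches into columns (often called im2col), then do a single matrix multiply. It assumes a single channel, stride 1 and no padding, and the function names are illustrative rather than Caffe's or ViennaCL's actual API.

```cpp
#include <cstddef>
#include <vector>

// C (M x N) = A (M x K) * B (K x N), all row-major. Naive reference GEMM.
std::vector<float> gemm(const std::vector<float>& A, const std::vector<float>& B,
                        std::size_t M, std::size_t K, std::size_t N) {
    std::vector<float> C(M * N, 0.0f);
    for (std::size_t i = 0; i < M; ++i)
        for (std::size_t k = 0; k < K; ++k)
            for (std::size_t j = 0; j < N; ++j)
                C[i * N + j] += A[i * K + k] * B[k * N + j];
    return C;
}

// Unroll every kh x kw patch of a single-channel H x W image into one column.
// Result is (kh*kw) x (outH*outW), row-major; stride 1, no padding.
std::vector<float> im2col(const std::vector<float>& img, std::size_t H, std::size_t W,
                          std::size_t kh, std::size_t kw) {
    const std::size_t outH = H - kh + 1, outW = W - kw + 1;
    std::vector<float> cols(kh * kw * outH * outW);
    for (std::size_t r = 0; r < kh; ++r)
        for (std::size_t c = 0; c < kw; ++c)
            for (std::size_t y = 0; y < outH; ++y)
                for (std::size_t x = 0; x < outW; ++x)
                    cols[((r * kw + c) * outH + y) * outW + x] = img[(y + r) * W + (x + c)];
    return cols;
}

// Convolution (cross-correlation, as in most deep learning frameworks)
// expressed as one 1 x (kh*kw) by (kh*kw) x (outH*outW) GEMM.
std::vector<float> conv2d(const std::vector<float>& img, std::size_t H, std::size_t W,
                          const std::vector<float>& kernel, std::size_t kh, std::size_t kw) {
    const std::size_t outH = H - kh + 1, outW = W - kw + 1;
    return gemm(kernel, im2col(img, H, W, kh, kw), 1, kh * kw, outH * outW);
}
```

With many filters the left-hand matrix gets one row per filter, so nearly all of the arithmetic ends up inside `gemm`, which is why the speed of the GEMM kernel on a given GPU matters so much.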

@bhack Thanks for sharing. That thread looks very interesting. I tried changing the DVFS setting as suggested, but saw no significant performance improvement for sgemm in ViennaCL.

+1

@strin Have you tried the latest sgemm version in the Mali SDK?

TensorFlow is late! Haha
https://gist.github.com/jarutis/ff28bca8cfb9ce0c8b1a

Will this have an impact on the strategy? http://lists.llvm.org/pipermail/llvm-dev/2016-March/096576.html
Edit:
"StreamExecutor is currently used as the runtime for the vast majority of Google's internal GPGPU applications, and a snapshot of it is included in the open-source TensorFlow_ project, where it serves as the GPGPU runtime."

+1

I hope the people working on this manage to overcome the cuDNN alternative problem by the time TensorFlow gets close to 1.0.

@martinwicke Why was this issue closed?

I don't think your commit fixes this.

You can't always use the same commit comments in different repositories ;) https://github.com/tensorflow/skflow/issues/22

Oh, GitHub.

@vrv Now that you have hyper-notified us, can you give some feedback on the stream executor strategy? ;)

I'm just going to blame GitHub for everything, including the lack of OpenCL support. ;)

@benoitsteiner may be able to comment more. I'm not sure what you mean by the 'stream executor' strategy. We currently use a version of stream executor together with cuDNN and Eigen, and they all work well together, so I'm not sure how any plans have changed on the OpenCL side.

I mean:
"What is StreamExecutor?
==========================
StreamExecutor is a unified wrapper around the CUDA and OpenCL host-side programming models (runtimes). It lets host code target either CUDA or OpenCL devices with identically-functioning data-parallel kernels."

Canned operations
===================
StreamExecutor provides several predefined kernels for common data-parallel operations.
The supported classes of operations are:

  • BLAS: basic linear algebra subprograms,
  • DNN: deep neural networks,
  • FFT: fast Fourier transforms, and
  • RNG: random number generation.
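To make the "unified host-side wrapper" idea in the quoted description concrete, here is a small sketch. It is not StreamExecutor's real API: `HostRuntime`, `FakeCudaRuntime`, `FakeOpenClRuntime` and `make_runtime` are hypothetical names, and both backends run a CPU reference loop so the example compiles anywhere.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical host-side interface for illustration; not StreamExecutor's API.
class HostRuntime {
public:
    virtual ~HostRuntime() = default;
    virtual std::string platform() const = 0;
    // "Canned" BLAS-style operation: y = alpha * x + y.
    virtual void axpy(float alpha, const std::vector<float>& x,
                      std::vector<float>& y) const = 0;
};

// Shared CPU reference so the sketch runs anywhere. A real backend would
// enqueue a kernel on a CUDA stream or an OpenCL command queue instead.
inline void reference_axpy(float alpha, const std::vector<float>& x,
                           std::vector<float>& y) {
    for (std::size_t i = 0; i < x.size() && i < y.size(); ++i) y[i] += alpha * x[i];
}

class FakeCudaRuntime : public HostRuntime {
public:
    std::string platform() const override { return "cuda"; }
    void axpy(float alpha, const std::vector<float>& x,
              std::vector<float>& y) const override { reference_axpy(alpha, x, y); }
};

class FakeOpenClRuntime : public HostRuntime {
public:
    std::string platform() const override { return "opencl"; }
    void axpy(float alpha, const std::vector<float>& x,
              std::vector<float>& y) const override { reference_axpy(alpha, x, y); }
};

// Host code picks a backend once and then stays backend-agnostic.
std::unique_ptr<HostRuntime> make_runtime(const std::string& platform) {
    if (platform == "cuda") return std::unique_ptr<HostRuntime>(new FakeCudaRuntime());
    if (platform == "opencl") return std::unique_ptr<HostRuntime>(new FakeOpenClRuntime());
    return nullptr;  // unknown platform
}
```

Host code written against `HostRuntime` does not change when the backend does, which is the property the quoted text attributes to StreamExecutor. The device kernels themselves still have to exist for each backend.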

@keryell Hi, I am also interested in implementing TensorFlow on FPGA using high-level programming languages such as Xilinx C++ or OpenCL. If you have a plan, I would be glad to contribute.

@henline Can you explain what the role of StreamExecutor will be for OpenCL, and of the relevant canned operations for TensorFlow? I still cannot see how this integrates with the SYCL plans for Eigen and cuDNN (a replacement?).

:+1: I would like to contribute to this too.

@bhack StreamExecutor provides functionality equivalent to that of the CUDA runtime and some of the CUDA libraries (such as cuBLAS or cuDNN). However, you still need to write your GPU kernels, which is what we use Eigen for.

@benoitsteiner So is it still necessary to write two kernels, one for CUDA and one for OpenCL?

@benoitsteiner So you don't yet have a tensorflow/tensorflow/stream_executor/opencl/ counterpart internally? What about the "canned operators"?

@bhack Eigen lets you write an expression describing the computation you want to perform once, and then automatically generates a kernel (which we call an evaluator) to evaluate that expression on CPU and another kernel to evaluate it on a CUDA device. Once Eigen supports OpenCL (we are getting close), it will be possible to generate the OpenCL kernel automatically as well.
For a few performance-critical TensorFlow operations (such as convolution), we use hand-optimized kernels and/or third-party libraries. In those cases we will need a good OpenCL implementation of these operations.
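The "write the expression once, generate one evaluator per device" idea can be sketched with a toy expression-template library. This is not Eigen's actual code: `Leaf`, `Sum`, `Product`, `SerialDevice` and `ChunkedDevice` are hypothetical names, and the second "device" only changes the loop structure on the CPU so that the example stays runnable without a GPU.

```cpp
#include <cstddef>
#include <vector>

// Toy expression nodes (hypothetical; not Eigen's real classes).
struct Leaf {
    const std::vector<float>& data;
    float at(std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }
};

template <typename L, typename R>
struct Sum {
    L lhs; R rhs;
    float at(std::size_t i) const { return lhs.at(i) + rhs.at(i); }
    std::size_t size() const { return lhs.size(); }
};

template <typename L, typename R>
struct Product {
    L lhs; R rhs;
    float at(std::size_t i) const { return lhs.at(i) * rhs.at(i); }
    std::size_t size() const { return lhs.size(); }
};

template <typename L, typename R>
Sum<L, R> add(L l, R r) { return {l, r}; }
template <typename L, typename R>
Product<L, R> mul(L l, R r) { return {l, r}; }

// Device tags. A real backend would launch a CUDA or OpenCL kernel here.
struct SerialDevice {};
struct ChunkedDevice { std::size_t chunk; };

// Evaluator for the serial device: one flat loop over all coefficients.
template <typename Expr>
std::vector<float> evaluate(const Expr& e, SerialDevice) {
    std::vector<float> out(e.size());
    for (std::size_t i = 0; i < e.size(); ++i) out[i] = e.at(i);
    return out;
}

// Evaluator for the chunked device: each chunk stands in for one work-group.
template <typename Expr>
std::vector<float> evaluate(const Expr& e, ChunkedDevice dev) {
    std::vector<float> out(e.size());
    const std::size_t step = dev.chunk ? dev.chunk : 1;  // guard against chunk == 0
    for (std::size_t begin = 0; begin < e.size(); begin += step) {
        const std::size_t end = begin + step < e.size() ? begin + step : e.size();
        for (std::size_t i = begin; i < end; ++i) out[i] = e.at(i);
    }
    return out;
}
```

The expression `mul(add(Leaf{a}, Leaf{b}), Leaf{c})` is written once and passed to either `evaluate` overload. Adding a new backend means adding one more evaluator, not rewriting every operation, which is the property the comment above describes.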

:+1:

Is there a plan to push more code to https://bitbucket.org/benoitsteiner/eigen-opencl? What about the SYCL compiler? It seems no open source implementation targeting GPUs has been released.

@bhack @benoitsteiner
I will soon release a cuDNN replacement for OpenCL in Caffe (only the convolution part, since that is the most performance- and memory-critical piece). It may also be useful for the TensorFlow port.

@bhack: Codeplay has made a lot of progress on the OpenCL front. Stay tuned for a big push to https://bitbucket.org/benoitsteiner/eigen-opencl in the next few weeks.

@naibaf7: A fast implementation of the convolution operation would be extremely helpful in TensorFlow. Looking forward to it.

@benoitsteiner How can I simply remove the CUDA implementation? '#ifdef GOOGLE_CUDA' is too complicated: sometimes it means CUDA, and sometimes it means GPU.
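One way to untangle the two meanings described above is to use separate macros for "a GPU path exists" and "the backend is CUDA". The sketch below uses hypothetical macro names (`EXAMPLE_USE_CUDA`, `EXAMPLE_USE_OPENCL`, `EXAMPLE_USE_GPU`); it does not reflect how TensorFlow is actually structured, where the macro in question is `GOOGLE_CUDA`.

```cpp
#include <string>

// Hypothetical macros for illustration only.
// EXAMPLE_USE_CUDA / EXAMPLE_USE_OPENCL select a vendor backend, and
// EXAMPLE_USE_GPU is derived from them to mean "some accelerator is available".
#if defined(EXAMPLE_USE_CUDA) || defined(EXAMPLE_USE_OPENCL)
#define EXAMPLE_USE_GPU 1
#endif

// Code that only cares whether any GPU path exists checks EXAMPLE_USE_GPU.
bool gpu_path_enabled() {
#ifdef EXAMPLE_USE_GPU
    return true;
#else
    return false;
#endif
}

// Code that calls vendor-specific libraries checks the specific macro.
std::string convolution_backend() {
#if defined(EXAMPLE_USE_CUDA)
    return "cudnn";
#elif defined(EXAMPLE_USE_OPENCL)
    return "opencl-kernels";
#else
    return "cpu";
#endif
}
```

Built with no macros defined, `gpu_path_enabled()` returns false and `convolution_backend()` returns "cpu"; defining `EXAMPLE_USE_OPENCL` on the compiler command line switches both without touching any CUDA-specific code.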

์ด ๋ฌธ์ œ๊ฐ€ ๋กœ๋“œ๋งต ์— ๋„๋‹ฌํ–ˆ๊ธฐ ๋•Œ๋ฌธ์—(_Platforms_ ์ฐธ์กฐ): OpenCL ์ง€์›์ด TensorFlow์— ์–ธ์ œ ์˜ํ–ฅ์„ ๋ฏธ์น ์ง€ ๋Œ€๋žต์ ์œผ๋กœ ์•Œ ์ˆ˜ ์žˆ์Šต๋‹ˆ๊นŒ? ๋ฒ„์ „ 0.9 / 1.0์ด ๋งˆ์Œ์— ๋“œ์‹œ๋‚˜์š”? 2016๋…„ 3/4๋ถ„๊ธฐ? ์•„๋‹ˆ๋ฉด 2017๋…„์ด ๋” ํ˜„์‹ค์ ์ž…๋‹ˆ๊นŒ?

@benoitsteiner Is eigen-opencl https://bitbucket.org/benoitsteiner/eigen-opencl ready enough to support OpenCL development for TensorFlow?

Does tensorflow depend only on Eigen tensors, or does it have other dependencies on Eigen?

@NEELMCW Codeplay has just released partial OpenCL support for Eigen Tensors. The code is available in this bitbucket repository. For the most part, TensorFlow depends on Eigen tensors. There are additional dependencies on Eigen for linear algebra operations, but we don't have to provide OpenCL-compatible implementations of those operations (at least not initially). So we are in a very good position to start supporting OpenCL in TensorFlow.

If you're interested in contributing, I have started tracking what needs to be done in this spreadsheet.

@benoitsteiner I am the author of a C++11 OpenCL BLAS library (https://github.com/CNugteren/CLBlast) and am currently implementing half-precision support there. I am happy to contribute to the BLAS/GEMM part of this project and/or modify CLBlast to better fit your needs.

@CNugteren
CLBlast is now also available in OpenCL-Caffe. Have you seen that? :)
Did you also get a chance to look at the libDNN convolutions?

@naibaf7 I saw it, yes! :) I haven't looked at libDNN at all so far, and I'm not sure exactly what you mean. I assume convolution is implemented using GEMM?

@CNugteren
Yes, I just thought it would be nice if you could look it over and maybe suggest some improvements or tuning tips for libdnn
(https://github.com/naibaf7/caffe/blob/master/src/caffe/greentea/libdnn.cpp).
It uses GEMM, but implicitly (not through a BLAS, only small GEMMs at the work-group level), so a higher level of parallelism is possible and no intermediate buffer is needed (to unroll the data into a GEMM layout).
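
To illustrate the difference between an explicit im2col + GEMM convolution and the implicit variant described above, here is a small pure-Python sketch. It is not libDNN's kernel: the function names are hypothetical, it handles a single channel, stride 1 and no padding, and it ignores work-groups entirely. It only shows that the intermediate buffer can be replaced by index arithmetic.

```python
def im2col(image, k):
    """Unroll every k x k window into a column: shape (k*k) x (oh*ow)."""
    h, w = len(image), len(image[0])
    oh, ow = h - k + 1, w - k + 1
    return [[image[oy + ky][ox + kx] for oy in range(oh) for ox in range(ow)]
            for ky in range(k) for kx in range(k)]

def gemm(a, b):
    """Plain matrix product of a (m x p) and b (p x n)."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def conv_explicit(image, filters, k):
    """Convolution as GEMM over an explicitly materialised im2col buffer."""
    flat = [[f[ky][kx] for ky in range(k) for kx in range(k)] for f in filters]
    return gemm(flat, im2col(image, k))

def conv_implicit(image, filters, k):
    """Same GEMM, but the im2col entry is computed on the fly from indices."""
    h, w = len(image), len(image[0])
    oh, ow = h - k + 1, w - k + 1
    out = []
    for f in filters:
        row = []
        for oy in range(oh):
            for ox in range(ow):
                acc = 0
                for r in range(k * k):
                    ky, kx = divmod(r, k)  # row r of the virtual im2col matrix
                    acc += f[ky][kx] * image[oy + ky][ox + kx]
                row.append(acc)
        out.append(row)
    return out

if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    filters = [[[1, 0], [0, 1]], [[1, 1], [1, 1]]]
    print(conv_explicit(image, filters, 2))
    print(conv_implicit(image, filters, 2))
```

Both functions return the same result; the explicit version allocates a buffer of k*k times the number of output positions, while the implicit version needs no extra memory beyond the output.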

Hi all,

@benoitsteiner thanks for mentioning our push! I hope it will be useful!

To compile this code you need a SYCL compiler. Currently the only supported compiler is Codeplay's ComputeCpp, which is available through Codeplay's evaluation program. ComputeCpp will be made available for free as a public open beta later in 2016, and then released as a free version (the ComputeCpp Community Edition) in 2017. This will let anyone compile and develop TensorFlow on OpenCL devices, such as AMD or Intel GPUs and CPUs.

btw. shouldn't this issue have an OpenCL label now? :)

Thanks,
Luke

I really hope it can also be compiled with an open-source tool. @keryell how is your new OpenCL branch going?

@bhack It would be nice to check first whether it can work with triSYCL in CPU OpenMP host-device mode. But I do not have the bandwidth to dig into the TensorFlow/Eigen build system right now. :-( If someone wants to try, feel free to do so. :-)

https://github.com/keryell/triSYCL/commits/opencl should soon allow running OpenCL kernels in OpenCL interoperability mode, but not in the SYCL single-source mode we all dream of, because we do not yet have the Clang/LLVM outliner that extracts the kernels from SYCL. However, Khronos recently open-sourced the components from AMD and Intel that support OpenCL C++ 2.2 and SPIR-V, which would be the basis for it. So it is "just" a question of time...

Could someone provide an estimate of when Tensorflow might be able to run with OpenCL (AMD GPUs)? And what does the performance/usability curve look like over time? It is hard to parse all the past information into actionable hardware purchasing advice. :)

Thanks in advance!

@djan92
Unfortunately, I'd say give it a year until it is usable. It looks like it will be built on quite cutting-edge libraries and technologies, most of which are not quite ready yet.
I am also only going to jump on board once the complete tool stack is available as open source.

@naibaf7

"Unfortunately, I'd say give it a year until it is usable. It looks like it will be built on quite cutting-edge libraries and technologies, most of which are not quite ready yet.
I am also only going to jump on board once the complete tool stack is available as open source."

Why not implement a CL version first while waiting for the SYCL port to be ready? I assume quite a few people here would be willing to help out. One year sounds too long.

@djan92
Yes, you are right. #22 is almost 8 months old and has over 100 posts! The information can get swamped!

"Could someone provide an estimate of when Tensorflow might be able to run with OpenCL (AMD GPUs)?"

TensorFlow uses the Eigen library for tensor computation (in the Tensor module). We have committed a partial implementation for OpenCL 1.2 using SYCL (https://bitbucket.org/benoitsteiner/opencl, branch Codeplay). The reason we used SYCL for this work is that this section of TensorFlow uses C++ expression trees, which is possible with SYCL for OpenCL but not directly possible with OpenCL C. Other components of TensorFlow, such as convolutions or BLAS, could use OpenCL C directly.

Currently I am working on integrating ComputeCpp (Codeplay's SYCL compiler) into the bazel build system. This should be ready soon (follow this repo: https://github.com/benoitsteiner/tensorflow-opencl/ ). Once that is done, TensorFlow should be accelerated with ComputeCpp on systems that support OpenCL SPIR (such as AMD or Intel). Further work will continue on accelerating more of TensorFlow, and on supporting more OpenCL implementations and the open-source triSYCL SYCL implementation. SYCL and OpenCL are multi-vendor, royalty-free open standards, so there are many platforms and devices that can be supported with this approach, not just AMD GPUs.

The ComputeCpp Community Edition compiler will be available for free later in 2016 (in beta form; full conformance will be released for free in early 2017).

The work on accelerating the non-C++ parts of TensorFlow (e.g. BLAS and convolutions) can be done without SYCL and implemented separately. Different hardware vendors may have their own optimized libraries for these features, which could help with acceleration. Alternatively, we could use Eigen with C++ for these features.

"And what does the performance/usability curve look like over time?"

We believe performance will improve steadily. To accelerate across a wide variety of devices we need to manage data more efficiently, which is why there is a "managed tensor" work item, so that data movement between the host and multiple devices can be managed more efficiently. Right now it is hard to predict how performance will vary across a wide range of devices. Currently very little is accelerated, but we are putting the infrastructure in place to allow open-standard acceleration in TensorFlow.

@naibaf7

"Unfortunately, I'd say give it a year until it is usable."

The basic operations should be here very soon. We are putting the basic infrastructure in place within the code to support open-standards-based acceleration. We believe that, with community support, an accelerated and usable version will be ready within a year.

"I am also only going to jump on board once the complete tool stack is available as open source."

ComputeCpp will be publicly available for free in 2016. Open-source triSYCL support should follow. Open-source OpenCL is already supported with pocl, Shamrock, Clover and Beignet.

@robertwgh
The C++ tensor code in Eigen would not be easy to port to OpenCL C without SYCL, but there are other features that would work well in OpenCL C. Have a look at this spreadsheet: https://docs.google.com/spreadsheets/d/1YbHn7dAFPPG_PgTtgCJlWhMGorUPYsF681TsZ4Y4LP0/edit#gid=0 and feel free to put your name down for the features that should use plain OpenCL C (such as BLAS and convolutions).

We are giving out an evaluation version of ComputeCpp before the public release. If you would like one, please drop me an email :)

@lukeiwanski Great, thanks for the update. I hope you are right about getting it fully featured in less than a year.

Another step for StreamExecutor in LLVM

Any chance of getting acceleration on the RX 480?

@benoitsteiner
The standalone version of LibDNN is available for integration:
https://github.com/naibaf7/libdnn

Good to hear this is being worked on. It would help if Beignet 2.0 were polished up. There is a lot of potential in Skylake and Iris right now.

A recent pull request has been added at https://github.com/benoitsteiner/tensorflow-opencl/pull/1 if anyone wants to take a look.

The OpenCL SDK from Imagination (GPU) requires an NDA to access, and we only have the shared library. Is it possible to run tensorflow based on this library?

@alephman
You don't need vendor-specific header files to build an OpenCL program. Try cl.hpp from https://www.khronos.org/registry/cl/api/1.2/cl.hpp and opencl.h/cl.h from any other SDK. For example, I have at least 3 OpenCL platforms and all of them work with one shared /usr/include/CL/cl.h.

We don't yet support TensorFlow running on OpenCL. It is a work in progress. Currently we are working on AMD GPUs. PowerVR support should follow. If you want to contribute to the development, you should contact us (Codeplay) directly. If you want to run TensorFlow on PowerVR, you will have to wait for a little more progress.

@inferrna Thanks. It sounds similar to OpenGL, which hides the vendor-specific implementation.

@andrerichards I would love to contribute to the development. How can I contact you?

The easiest way is to click "register your interest" on our page here: https://www.codeplay.com/products/computecpp
That will get you into our developer program and we can work together on this, @alephman.

If you want, you can also contribute to making it compile with an open-source alternative. See https://github.com/tensorflow/tensorflow/issues/22#issuecomment-221841173

Hi everyone!
Very happy to hear that Tensorflow support is being extended beyond Nvidia CUDA. I wonder whether you are also considering making it work on APUs like these: http://www.amd.com/en-us/products/processors/laptop-processors#sectionOne

@kgocheva
APUs support OpenCL for both the CPU and the GPU part.
This should work pretty much out of the box once OpenCL support is ready.
Meanwhile, if you already have an APU and want to try another ML framework, BVLC OpenCL Caffe already works.

@naibaf7 Thanks for the clarification. I'm looking for a cost-effective hardware/software combination to run Tensorflow locally and will definitely follow the OpenCL development progress.

@hughperkins
Yes, that can be an issue, but I think parts such as im2col/col2im and other convolution implementations could also be plugged in as external APIs if it really is a problem with the GCLA. That might also be better for the original authors of such work.

@hughperkins We are working on bringing OpenCL to TensorFlow via SYCL for OpenCL 1.2.
Please have a look at https://docs.google.com/spreadsheets/d/1YbHn7dAFPPG_PgTtgCJlWhMGorUPYsF681TsZ4Y4LP0/edit#gid=1625897530 for "todos" and progress.
We recently released a compiler for SYCL, https://www.codeplay.com/products/computesuite/computecpp , called ComputeCpp Community Edition. People can try it out!
We are also focusing on the Eigen library, https://bitbucket.org/benoitsteiner/opencl/branch/ComputeCpp , getting it to the stage required by TensorFlow's MNIST. A couple of things remain.
As for constraints, the current ComputeCpp CE release has been tested on Intel (CPU, GPU) and AMD (CPU, GPU); the supported platforms are Ubuntu 14.04 64-bit and CentOS 64-bit.
ComputeCpp can be downloaded for free and can be used in commercial and open-source projects.
Because we <3 open communities :)

@lukeiwanski Sorry for discussing/asking this here in the thread, but I think it may be of interest to others too. I understand that Codeplay is very interested in the SYCL for OpenCL implementation, and I have already heard that others are interested in this work of yours as well. I read a post by someone from Movidius, for example. However, I would like to ask what Google's contribution to this really is. Since Movidius, besides AMD and others, is listed as a Codeplay partner, I can understand that they encourage or even support SYCL for OpenCL, but as far as I know Google is not your partner and has not contributed so far?!

Don't get me wrong, I really like your work, but wouldn't it be a good idea to consolidate efforts, pool resources and try to work together with Google? To me it looks like many different parties are interested in OpenCL for TensorFlow, but a huge potential goes unused because these parties do not develop together?!

I may be wrong, and please excuse me if this has already been discussed sufficiently, but I am still not aware of any major attempt by Google (or other parties) to collaborate on this, and as a result I still do not know how the community (such as single individuals) could help or support it, whether through direct contributions, testing or other means.

@ascenator We at Google have been working closely with Luke and his Codeplay colleagues on this project for almost 12 months now. Codeplay's contribution to this effort has been tremendous, so we felt we should let them take the lead when it comes to communicating updates related to OpenCL. This is why you haven't heard much from us on the topic :)

Now that the ComputeCpp compiler is broadly available, we are planning to merge the work that has been done so far. But first we want to put together a comprehensive testing infrastructure to make sure we don't destabilize the existing codebase.

We welcome all contributions to this effort, so feel free to contact me if you want to help. We are especially interested in high-performance OpenCL kernels for matrix multiplication and convolutions. Several candidates have been suggested, but we haven't started looking into the pros and cons of each one or how to integrate them.

@benoitsteiner Thank you very much for the clarification, and sorry for my misinformation! This sounds very good and promising! I will definitely have a look at ComputeCpp then. I am really looking forward to OpenCL support for TensorFlow, because it opens up many new possibilities for robotics (the field in which I research and use TensorFlow for deep learning applications). I will at least have a look at early releases and try to test/debug. We have some Intel chips and a number of ARM CPUs waiting for tests ;)

@hughperkins... sorry, but isn't this completely off topic here? I don't see how it is relevant to OpenCL TF.

What I am more interested in knowing here is whether a tuning approach to the matrix multiplication and convolution kernels will be taken, and whether there will be a valid open-source alternative to ComputeCpp that produces SPIR-V.

Two new Khronos Group standards have been announced at https://www.khronos.org/news/press/khronos-launches-dual-neural-network-standard-initiatives

If it helps, a better version of isaac is out: https://github.com/ptillet/isaac . It provides significant speed-ups over clBLAS and cuBLAS on Maxwell, Pascal and Fiji. It also provides faster (input-aware) kernels than Tensorflow for 1D and 2D reductions.

@hughperkins It seems you have a better chance of writing a CUDA compiler for any OpenCL device than a CUDA-to-OpenCL translator.

@hughperkins Maybe the SVM feature of OpenCL 2.0 could solve the pointer issue? Everyone besides Nvidia (AMD, Intel, ARM, Qualcomm) is starting to support OpenCL 2.0. Maybe that is a good solution?

@hughperkins It is a BLAS implementation in itself. It implements some of the symbols in the clblas and cublas headers, so no recompilation or code modification is necessary. I could also implement some of the symbols for clblast.h, since it uses a different header. Some advantages of Isaac are:

  • Entirely dynamic, so it can use CUDA or OpenCL without recompilation.
  • Input-aware: it does not tune kernels only for large square matrices. It should perform well on every shape you can think of, without retuning.
  • A C++ API similar to numpy/arrayfire, with some fusion for combining element-wise operations with reductions.

@marty1885
Not really. AMD went back to 1.2 support in the AMDGPU-PRO drivers. It may be a while before full 2.0 support is widespread. Definitely not a short-term solution.

  • Yes
  • I can hack in compatibility for a number of operations if need be (e.g. forwarding **MV to GEMV). Complex support will be tricky. Double support is already there, but no architecture has been tuned for it.

@hughperkins

"It seems my code doesn't violate any obvious OpenCL rules."

Yes, passing any __global structure that contains pointers (such as an array or a struct) is incorrect, simply because those pointers may point to the memory of another device (OpenCL supports a multi-device paradigm in which one device cannot access the memory of another). But it seems possible to overcome this at the IR level, without an intermediate translation to OpenCL code. That is what I assumed you do :)
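
One common way around the pointer restriction described above is to pack the data into a single buffer and pass integer offsets instead of raw pointers. A minimal sketch of that idea, in Python for brevity; `flatten` and `read` are hypothetical helpers, and this is not taken from any project mentioned in the thread:

```python
def flatten(struct):
    """Pack every array field into one flat buffer.

    Returns the buffer plus a layout mapping each field name to
    (offset, length). Offsets stay valid on whichever device the
    buffer is copied to, which raw host pointers do not.
    """
    buffer, layout = [], {}
    for name, values in struct.items():
        layout[name] = (len(buffer), len(values))
        buffer.extend(values)
    return buffer, layout

def read(buffer, layout, name, index):
    """Device-side style access: base buffer + offset instead of a pointer."""
    offset, length = layout[name]
    if not 0 <= index < length:
        raise IndexError(f"{name}[{index}] is out of range")
    return buffer[offset + index]

if __name__ == "__main__":
    buffer, layout = flatten({"weights": [1.0, 2.0, 3.0], "bias": [0.5]})
    print(layout)
    print(read(buffer, layout, "bias", 0))
```

In an actual kernel the buffer would be a single `__global` argument and the offsets plain integer arguments, so nothing that looks like a host address ever crosses the host/device boundary.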

@benoitsteiner , @henline , https://github.com/henline/streamexecutordoc suggests that streamexecutor supports CL versions of the canned operations (such as DNN and BLAS) out of the box. Does that mean Google already has clDNN and clBLAS implementations ready for Tensorflow that are just not open-sourced yet?

Otherwise, if you want to keep the same software architecture, OpenCL 2.0+ and SYCL 2.2 support SVM.
OpenCL 2.0+ is supported by AMD and Intel GPUs, for example. In the embedded world it is often supported as a side effect even with OpenCL 1.x, since host and device memory are often the same for cost reasons.

@keryell
But the most notable platform, Linux + the new AMD GPUs (RX 480, the upcoming Vega), only supports OpenCL 1.2 for now... and who knows when that will change (my bet is a year from now). Beignet (open-source Linux Intel) for OpenCL 2.0 is also still a buggy mess; the stable version has 1.2.
Also consider that all the smaller companies making OpenCL-compatible chips are barely managing 1.2 support. So I think anything relying on OpenCL 2.0 will see very poor adoption in practice.

I think.. does any hardware vendor feel an urgency to consume SPIR-V? I think the graphics/shader pressure on Vulkan could help the OpenCL side.

@naibaf7 To go back to the discussion of OpenCL 2 or not: at some point real things have to be delivered... Otherwise there are already nVidia GPUs and CUDA with TensorFlow running... :-)
But of course, a version of TensorFlow without SVM is of some interest.

@keryell How much do you think the Vulkan SPIR-V work in drivers (which already has good device coverage) will push modern OpenCL versions?

@naibaf7 The Khronos meeting is next week in Seoul, with both OpenCL and Vulkan people, but the discussions are not public. Still, it sounds like a good idea for each world to improve the other, and at some point to benefit TensorFlow. :-)

@keryell
Yes, I hope they discuss something useful for deep learning. :)

Congratulations! Be sure to check out the HIP project, as they have worked on the same problem. They chose to create a new language called HIP, which defines what has to be converted manually (for example, checking for double-precision support by checking the compute level). As the project advances, the amount of manual translation should go down. See: https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP

What I suggest is that you use HIP and fix the few bugs that are blocking progress on Tensorflow or on your own goals, since you now understand LLVM well enough to do so. That way you do not have to solve problems they have already fixed.

@hughperkins
I can't build the python module with your fork following https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/get_started/os_setup.md#create-the-pip-package-and-install

INFO: From Compiling tensorflow/core/kernels/gather_functor_gpu.cu.cc:
gpus/crosstool: -x cuda
gpus/crosstool: using cocl
gpus/crosstool: PATH=/usr/bin:/usr/local/bin /usr/local/bin/cocl -D_FORCE_INLINES -gencode=arch=compute_30,\"code=sm_30,compute_30\"   -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=1 -DNDEBUG -DEIGEN_MPL2_ONLY -std=c++11  -I. -Ibazel-out/local_linux-py3-opt/genfiles -Iexternal/bazel_tools -Ibazel-out/local_linux-py3-opt/genfiles/external/bazel_tools -Iexternal/eigen_archive -Ibazel-out/local_linux-py3-opt/genfiles/external/eigen_archive  --compiler-bindir=/usr/bin/gcc -I . -fPIC  -x cu  -O2 -c  -o bazel-out/local_linux-py3-opt/bin/tensorflow/core/kernels/_objs/gather_functor_gpu/tensorflow/core/kernels/gather_functor_gpu.cu.pic.o tensorflow/core/kernels/gather_functor_gpu.cu.cc
dirname: invalid option -- 'O'
Try 'dirname --help' for more information.

I'm on Ubuntu 16.04; dirname is from coreutils-8.25-2ubuntu2.

@hughperkins I think that adapting the TF dockerfile in your repository to these instructions would make setup easier for others.

Yes, once there is something more functional. Basically it is a copy and paste of the instructions you posted.

I'm experimenting with this on MacOS 10.10.5, on a late-2015 MacBook with an ATI 6770M (OpenCL 1.2).

I have installed Xcode 8, Anaconda (Python 3.5) and the MacPorts equivalents of clang+llvm.

Instead of the apt-get lines, do:

sudo port install clang-3.8 llvm-3.8

Instead of using /proc/cpuinfo, do:

NUM_PROCS=$(system_profiler SPHardwareDataType | grep "Total Number of Cores" | cut -d ":" -f 2)
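
As a portable alternative to parsing /proc/cpuinfo on Linux or system_profiler on macOS, the job count can be derived from Python's standard library. This is only a sketch; `make_jobs_flag` is a hypothetical helper name. Note that `os.cpu_count()` reports logical CPUs, so on machines with hyper-threading it may be larger than the "Total Number of Cores" value used above.

```python
import os

def make_jobs_flag(default=1):
    """Build a '-jN' flag for make from the number of logical CPUs."""
    # os.cpu_count() returns None when the count cannot be determined.
    count = os.cpu_count() or default
    return f"-j{count}"

if __name__ == "__main__":
    print(make_jobs_flag())
```

The printed flag can be passed to make in place of `-j ${NUM_PROCS}`.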

Then modify the Makefile to use MacPorts and run make:

perl -pi.bak -e 's|(CLANG)=.+|$1=/opt/local/libexec/llvm-3.8/bin/clang++|' Makefile
perl -pi -e 's|(LLVM_CONFIG)=.+|$1=/opt/local/bin/llvm-config-mp-3.8|' Makefile
perl -pi -e 's|(LLVM_INCLUDE)=.+|$1=/opt/local/libexec/llvm-3.8/include|' Makefile

Update to the MacOS OpenCL directories. In future: use /System/Library/Frameworks/OpenCL.framework/Versions/Current/Headers/cl.h with an '#ifdef __APPLE__' conditional.

grep -Rl 'include "CL/' * | xargs perl -pi.bak -e 's|include "CL/|include "OpenCL/|g'
make -j ${NUM_PROCS}
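
The '#ifdef __APPLE__' conditional mentioned above could be generated instead of rewriting the includes in place for macOS only. A minimal sketch, assuming the sources use quoted includes of the form `#include "CL/..."`; `portable_cl_includes` is a hypothetical helper, and it is not idempotent, so it should be run once on unmodified sources:

```python
import re

# Matches a whole line such as:   #include "CL/cl.h"
_CL_INCLUDE = re.compile(
    r'^([ \t]*)#[ \t]*include[ \t]+"CL/([^"]+)"[ \t]*$', re.MULTILINE
)

def portable_cl_includes(source):
    """Wrap each "CL/..." include in an __APPLE__ conditional."""
    def wrap(match):
        indent, header = match.group(1), match.group(2)
        return (
            f"{indent}#ifdef __APPLE__\n"
            f'{indent}#include "OpenCL/{header}"\n'
            f"{indent}#else\n"
            f'{indent}#include "CL/{header}"\n'
            f"{indent}#endif"
        )
    return _CL_INCLUDE.sub(wrap, source)

if __name__ == "__main__":
    print(portable_cl_includes('#include <stdio.h>\n#include "CL/cl.h"\n'))
```

Unlike the perl one-liner, the result still compiles on Linux, because the original include is kept in the `#else` branch.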

This is as far as I get:

$ make -j ${NUM_PROCS}
mkdir -p build
mkdir -p build
mkdir -p build
/opt/local/libexec/llvm-3.8/bin/clang++ -c -o build/hostside_opencl_funcs.o -std=c++11 -fPIC -g -O2 -I pwd /include -I pwd /src/EasyCL src/hostside_opencl_funcs.cpp
/opt/local/libexec/llvm-3.8/bin/clang++ -I/usr/lib/llvm-3.8/include -fPIC -fvisibility-inlines-hidden -ffunction-sections -fdata-sections -g -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT -D__STDC_LIMIT_MACROS -std=c++11 -fcxx-exceptions -c -o build/mutations.o -g -I/opt/local/libexec/llvm-3.8/include src/mutations.cpp
/opt/local/libexec/llvm-3.8/bin/clang++ -I/usr/lib/llvm-3.8/include -fPIC -fvisibility-inlines-hidden -ffunction-sections -fdata-sections -g -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT -D__STDC_LIMIT_MACROS -std=c++11 -fcxx-exceptions -c -o build/struct_clone.o -g -I/opt/local/libexec/llvm-3.8/include src/struct_clone.cpp
/opt/local/libexec/llvm-3.8/bin/clang++ -I/usr/lib/llvm-3.8/include -fPIC -fvisibility-inlines-hidden -ffunction-sections -fdata-sections -g -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT -D__STDC_LIMIT_MACROS -std=c++11 -fcxx-exceptions -c -o build/readIR.o -g -I/opt/local/libexec/llvm-3.8/include src/readIR.cpp
In file included from src/hostside_opencl_funcs.cpp:17:
/Users/erybski/git/tensorflow-cl/third_party/cuda-on-cl/include/cocl/cocl.h:91:16: warning: 'host' attribute ignored [-Wignored-attributes]
__attribute__((host)) inline unsigned long long atomicExch(volatile unsigned long long _p, unsigned long long val) {
^
src/hostside_opencl_funcs.cpp:194:33: error: call to member function 'in' is ambiguous
launchConfiguration.kernel->in(offset);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
/Users/erybski/git/tensorflow-cl/third_party/cuda-on-cl/src/EasyCL/CLKernel.h:101:15: note: candidate function
CLKernel *in(float value);
^
/Users/erybski/git/tensorflow-cl/third_party/cuda-on-cl/src/EasyCL/CLKernel.h:104:15: note: candidate function
CLKernel *in(int32_t value);
^
/Users/erybski/git/tensorflow-cl/third_party/cuda-on-cl/src/EasyCL/CLKernel.h:106:15: note: candidate function
CLKernel *in(int64_t value);
^
/Users/erybski/git/tensorflow-cl/third_party/cuda-on-cl/src/EasyCL/CLKernel.h:108:15: note: candidate function
CLKernel *in(uint64_t value);
^
/Users/erybski/git/tensorflow-cl/third_party/cuda-on-cl/src/EasyCL/CLKernel.h:110:15: note: candidate function
CLKernel *in(uint32_t value);
^
/Users/erybski/git/tensorflow-cl/third_party/cuda-on-cl/src/EasyCL/CLKernel.h:73:15: note: candidate function not viable: no known conversion from 'size_t' (aka 'unsigned long') to 'easycl::CLArray *' for 1st argument
CLKernel *in(CLArray *clarray1d) { return input(clarray1d); }
^
/Users/erybski/git/tensorflow-cl/third_party/cuda-on-cl/src/EasyCL/CLKernel.h:91:36: note: candidate function template not viable: requires 2 arguments, but 1 was provided
template CLKernel *in(int N, const T *data);
^
1 warning and 1 error generated.
make: *** [build/hostside_opencl_funcs.o] Error 1
make: *** Waiting for unfinished jobs....
src/struct_clone.cpp:245:12: warning: 11 enumeration values not handled in switch: 'HalfTyID', 'X86_FP80TyID', 'FP128TyID'... [-Wswitch]
switch(typeID) {
^
1 warning generated.

launchConfiguration.kernel->in((int64_t)offset);

This patch worked. Thank you.

After applying it and continuing the build, I hit size_t namespace errors:

$ make -j ${NUM_PROCS}
mkdir -p ๋นŒ๋“œ
mkdir -p ๋นŒ๋“œ
/opt/local/libexec/llvm-3.8/bin/clang++ -c -o build/hostside_opencl_funcs.o -std=c++11 -fPIC -g -O2 -I pwd /include -I pwd /src/EasyCL src/hostside_opencl_funcs.cpp
/opt/local/libexec/llvm-3.8/bin/clang++ -I/usr/lib/llvm-3.8/include -fPIC -fvisibility-inlines-hidden -ffunction-sections -fdata-sections -g -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT -D__STDC_LIMIT_MACROS -std=c++11 -fcxx-exceptions -o build/ir-to-opencl -g -I/opt/local/libexec/llvm-3.8/include src/ir-to-opencl.cpp build/struct_clone.o build/readIR.o src/ir-to-opencl-common.cpp build/mutations.o `/opt/local/bin/llvm-config-mp-3.8 --ldflags --system-libs --libs all`
/opt/local/libexec/llvm-3.8/bin/clang++ -c -o build/cocl_events.o -std=c++11 -fPIC -g -O2 -I`pwd`/src/CLBlast/include -I`pwd`/include -I`pwd`/src/EasyCL src/cocl_events.cpp
/opt/local/libexec/llvm-3.8/bin/clang++ -I/usr/lib/llvm-3.8/include -fPIC -fvisibility-inlines-hidden -ffunction-sections -fdata-sections -g -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT -D__STDC_LIMIT_MACROS -std=c++11 -fcxx-exceptions -o build/patch-hostside -g -I/opt/local/libexec/llvm-3.8/include src/patch-hostside.cpp build/readIR.o build/mutations.o build/struct_clone.o src/ir-to-opencl-common.cpp `/opt/local/bin/llvm-config-mp-3.8 --ldflags --system-libs --libs all`
In file included from src/hostside_opencl_funcs.cpp:17:
/Users/erybski/git/tensorflow-cl/third_party/cuda-on-cl/include/cocl/cocl.h:91:16: warning: 'host' attribute ignored [-Wignored-attributes]
__attribute__((host)) inline unsigned long long atomicExch(volatile unsigned long long *_p, unsigned long long val) {
^^
/opt/local/libexec/llvm-3.8/bin/clang++ -c -o build/cocl_blas.o -std=c++11 -fPIC -g -O2 -I`pwd`/src/CLBlast/include -I`pwd`/include -I`pwd`/src/EasyCL src/cocl_blas.cpp
1 warning generated.
/opt/local/libexec/llvm-3.8/bin/clang++ -c -o build/cocl_error.o -std=c++11 -fPIC -g -O2 -I`pwd`/src/CLBlast/include -I`pwd`/include -I`pwd`/src/EasyCL src/cocl_error.cpp
In file included from src/cocl_blas.cpp:15:
/Users/erybski/git/tensorflow-cl/third_party/cuda-on-cl/include/cocl/cocl_blas.h:8:9: error: no type named 'size_t' in namespace 'std'; did you mean simply 'size_t'?
typedef std::size_t cublasStatus_t;
^ ~ ~
size_t
/opt/local/libexec/llvm-3.8/bin/../lib/clang/3.8.1/include/stddef.h:62:23: note: 'size_t' declared here
typedef __SIZE_TYPE__ size_t;
^^
In file included from src/cocl_blas.cpp:15:
/Users/erybski/git/tensorflow-cl/third_party/cuda-on-cl/include/cocl/cocl_blas.h:17:5: error: no type named 'size_t' in namespace 'std'; did you mean simply 'size_t'?
std::size_t cublasCreate(cublasHandle_t *phandle);
^ ~ ~
size_t
/opt/local/libexec/llvm-3.8/bin/../lib/clang/3.8.1/include/stddef.h:62:23: note: 'size_t' declared here
typedef __SIZE_TYPE__ size_t;
^^
In file included from src/cocl_blas.cpp:15:
[...] error: no type named 'size_t' in namespace 'std'; did you mean simply 'size_t'?
std::size_t cublasDestroy(cublasHandle_t handle);
^ ~ ~
size_t
/opt/local/libexec/llvm-3.8/bin/../lib/clang/3.8.1/include/stddef.h:62:23: note: 'size_t' declared here
typedef __SIZE_TYPE__ size_t;
^^
In file included from src/cocl_blas.cpp:15:
[...] error: no type named 'size_t' in namespace 'std'; did you mean simply 'size_t'?
std::size_t cublasSgemm(cublasHandle_t blas, int transA, int transB, int M, int N, int K,
^ ~ ~
size_t
/opt/local/libexec/llvm-3.8/bin/../lib/clang/3.8.1/include/stddef.h:62:23: note: 'size_t' declared here
typedef __SIZE_TYPE__ size_t;
^^
In file included from src/cocl_blas.cpp:15:
[...] error: no type named 'size_t' in namespace 'std'; did you mean simply 'size_t'?
std::size_t cublasSetPointerMode(cublasHandle_t handle, cublasPointerMode_t mode);
^ ~ ~
size_t
/opt/local/libexec/llvm-3.8/bin/../lib/clang/3.8.1/include/stddef.h:62:23: note: 'size_t' declared here
typedef __SIZE_TYPE__ size_t;
^^
In file included from src/cocl_blas.cpp:15:
[...] error: no type named 'size_t' in namespace 'std'; did you mean simply 'size_t'?
std::size_t cublasGetPointerMode(cublasHandle_t handle, cublasPointerMode_t *mode);
^ ~ ~
size_t
/opt/local/libexec/llvm-3.8/bin/../lib/clang/3.8.1/include/stddef.h:62:23: note: 'size_t' declared here
typedef __SIZE_TYPE__ size_t;
^^
In file included from src/cocl_blas.cpp:15:
[...] error: no type named 'size_t' in namespace 'std'; did you mean simply 'size_t'?
std::size_t cublasSetStream(cublasHandle_t handle, cudaStream_t streamId);
^ ~ ~
size_t
/opt/local/libexec/llvm-3.8/bin/../lib/clang/3.8.1/include/stddef.h:62:23: note: 'size_t' declared here
typedef __SIZE_TYPE__ size_t;
^^
/opt/local/libexec/llvm-3.8/bin/clang++ -c -o build/cocl_memory.o -std=c++11 -fPIC -g -O2 -I`pwd`/src/CLBlast/include -I`pwd`/include -I`pwd`/src/EasyCL src/cocl_memory.cpp
/opt/local/libexec/llvm-3.8/bin/clang++ -c -o build/cocl_device.o -std=c++11 -fPIC -g -O2 -I`pwd`/src/CLBlast/include -I`pwd`/include -I`pwd`/src/EasyCL src/cocl_device.cpp
7 errors generated.
make: *** [build/cocl_blas.o] Error 1
make: *** Waiting for unfinished jobs....
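
All seven errors in the log say the same thing: the header names `std::size_t`, but at that point only the global `size_t` from clang's `stddef.h` is visible. One likely cause (an assumption, nothing in the thread confirms it) is that the header relies on some other include to bring in `<cstddef>`, which happens with one standard library but not with the one this MacPorts clang uses. The sketch below reproduces the shape of the failing typedef with hypothetical names (`status_t`, `make_status`) and includes `<cstddef>` directly, which is the usual remedy. It is not the actual cocl_blas.h.

```cpp
#include <cassert>
#include <cstddef>   // declares std::size_t; without it, std::size_t may be undeclared

// Hypothetical stand-in for a typedef such as cublasStatus_t in the log above.
typedef std::size_t status_t;

// Hypothetical function that uses the typedef as its return type:
// 0 means success, 1 means failure.
status_t make_status(bool ok) {
    return ok ? 0 : 1;
}
```

With the include in place, `make_status(true)` returns `0` and the typedef compiles regardless of which other headers happen to be pulled in first.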

Can you push the long log to a gist so the thread stays readable?

Question: how are you guys solving the address-space issue?

@hughperkins The SYCL spec describes in section 5.8 ("Address space deduction") how an implementation needs to deal with the different memory types. This is similar to previous work done for the PlayStation 3 and is described in these papers: Offload – Automating Code Migration to Heterogeneous Multicore Systems, or C++ on Accelerators: Supporting Single-Source SYCL and HSA Programming Models Using Clang.

Hope that helps.

@hughperkins Can I compile your tensorflow-opencl repo code for my ARM board? My ARM board has an Imagination GPU that supports opencl 1.2.

I stumbled on this thread while searching for tf/intel support.

I have an Intel MacBook Pro. How can I help? I don't know c/c++, but I can follow build/compile/test instructions and pass back (pastebin) results...

derek$ system_profiler SPDisplaysDataType
Graphics/Displays:

Intel Iris:

  Chipset Model: Intel Iris
  Type: GPU
  Bus: Built-In
  VRAM (Dynamic, Max): 1536 MB
  Vendor: Intel (0x8086)
  Device ID: 0x0a2e
  Revision ID: 0x0009
  Metal: Supported
  Displays:
    Color LCD:
      Display Type: Retina LCD
      Resolution: 2560 x 1600 Retina
      Retina: Yes
      Pixel Depth: 32-Bit Color (ARGB8888)
      Main Display: Yes
      Mirror: Off
      Online: Yes
      Automatically Adjust Brightness: Yes
      Built-In: Yes
    PL2202W:
      Resolution: 1680 x 1050 @ 60 Hz
      Pixel Depth: 32-Bit Color (ARGB8888)
      Display Serial Number: 05884C7A57014
      Mirror: Off
      Online: Yes
      Rotation: Supported
      Adapter Type: Apple Mini DisplayPort To VGA Adapter
      Automatically Adjust Brightness: No
      Adapter Firmware Version: 1.03

@hughperkins Thanks for your instructions!
I am trying to compile your cuda-on-cl on an ARM platform, following the cuda-on-cl guide.
My ARM board info:
arm64, gcc 4.9, clang and llvm 3.5, openCL 1.2

* Do I have to use the clang++-3.8 version? *
git clone --recursive https://github.com/hughperkins/cuda-on-cl
make
error:
clang++-3.8: Command not found
I edited the Makefile like this: CLANG=clang++ LLVM_CONFIG=llvm-config LLVM_INCLUDE=/usr/include/llvm
then ran make again:
error:
src/mutations.h:3:10: fatal error: 'llvm/IR/Module.h' file not found

Trying to run make run-test-cocl-cuda_sample:
make: cocl: Command not found

@hughperkins let me give it a try.

I got an error while testing keras with tensorflow.

keras$ KERAS_BACKEND=tensorflow pytest3

Output errors:

Invalid kernel name, code -46, kernel _ZN5Eigen8internal15EigenMetaKernelINS_15TensorEvaluatorIKNS_14TensorAssignOpINS_9TensorMapINS_6TensorIfLi1ELi1EiEELi16ENS_11MakePointerEEEKNS_18TensorCwiseUnaryOpINS0_12scalar_rightIffNS0_17scalar_product_opIffEEEEKNS4_INS5_IKfLi1ELi1EiEELi16ES7_EEEEEENS_9GpuDeviceEEEiEEvT_T0_
__internal__ build log: 
"/tmp/OCL11307T1.cl", line 3: error: variable with automatic storage duration
          cannot be stored in the named address space
      local float mem[1024];

Code:

inline float __shfl_down_3(float v0, int v1, int v2) {
    local float mem[1024];
    int tid = get_local_id(0);
    int warpid = tid % 32;
    int warpstart = tid - warpid;
    mem[tid] = v0;
    //barrier(CLK_LOCAL_MEM_FENCE);
    int warpsrc = warpid + v1;
    warpsrc = warpsrc >= 32 ? warpid : warpsrc;
    return mem[warpstart + warpsrc];
}
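
The build log rejects `local float mem[1024];` because OpenCL C only allows `local` variables to be declared at kernel-function scope, and `__shfl_down_3` is an ordinary inline helper. A common workaround (an assumption here, the thread does not say what was eventually done) is to declare the scratch buffer in the kernel and pass it to the helper as a `local float *` argument. Separately from that, the indexing the helper performs can be checked on the host. The C++ sketch below models it for a flat array of per-thread values; `shfl_down_model` is a hypothetical name, `delta` is assumed to be non-negative, and the bounds guard for a partial last warp is an addition that the original snippet does not have.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side model of the emulated shuffle-down: lane `tid` reads the value
// held by lane `tid + delta` of the same 32-wide warp, or keeps its own value
// when that source lane would fall outside the warp.
std::vector<float> shfl_down_model(const std::vector<float>& vals, int delta) {
    const int kWarpSize = 32;
    std::vector<float> out(vals.size());
    for (std::size_t tid = 0; tid < vals.size(); ++tid) {
        int warpid = static_cast<int>(tid % kWarpSize);   // lane index inside the warp
        std::size_t warpstart = tid - warpid;              // index of the warp's first lane
        int warpsrc = warpid + delta;                      // lane we want to read from
        if (warpsrc >= kWarpSize) warpsrc = warpid;        // out of range: read own value
        std::size_t src = warpstart + warpsrc;
        // Added guard (not in the original): a partial last warp reads its own value.
        out[tid] = src < vals.size() ? vals[src] : vals[tid];
    }
    return out;
}
```

For 64 values holding their own index and `delta = 1`, lane 0 receives the value of lane 1, while lanes 31 and 63 keep their own values because the source would cross a warp boundary. Note that the original snippet also has its `barrier` commented out, so on a real device the read may race with the write.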

Hello everyone, my name is ricardo. I am a C++ programmer with many years of C++ experience and little knowledge of Cuda, and I will be glad to contribute to this effort. How can I contribute to this work?

OK, I have an Odroid Xu3 with a Mali-T628 MP6 (OpenGL ES 3.0/2.0/1.1 and OpenCL 1.1 Full profile)
running on OS: LUbuntu 1404 64 bits.
I will do a complete installation and post the results for this platform.
About bugs: is there a bug list (something like Bugzilla) or a spreadsheet with a list of bugs?
Cheers!

What about using HIP?
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/blob/master/docs/markdown/hip_faq.md#how-does-hip-compare-with-opencl
https://github.com/RadeonOpenCompute/hcc
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/45
"Your wish is being granted. Eigen is being ported to AMD GPUs via HIP. The second part of your request is whether we can bring a standardized tool supporting FLOAT16 that ships with all our GFX8 GPUs; wish granted.
The development branch of the AMDGPU compiler now supports both Float16 and Int16 native instructions, instead of emulating FP16/Int16 with up-convert and down-convert instructions to convert from FP16/Int16 to Float and back.

These are f16 tests on Fiji hardware successfully executing a matrix multiplication with half types, both with conversion and with native instructions."
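
The quoted note contrasts native FP16 instructions with emulation through up-convert and down-convert steps. As a rough illustration of what that emulation means, the sketch below keeps values as 16-bit half-precision bit patterns, converts them up to `float` to do the arithmetic, and converts the result back down. All function names are hypothetical, and the conversion is deliberately simplified: it truncates the mantissa instead of rounding, flushes subnormals to zero, and maps overflow and NaN to infinity when converting down. It is not AMD's implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Down-convert: float -> IEEE 754 binary16 bit pattern (simplified, see text).
std::uint16_t float_to_half_bits(float f) {
    std::uint32_t x;
    std::memcpy(&x, &f, sizeof x);
    std::uint32_t sign = (x >> 16) & 0x8000u;
    // Re-bias the exponent from float (bias 127) to half (bias 15).
    std::int32_t exp = static_cast<std::int32_t>((x >> 23) & 0xFFu) - 127 + 15;
    std::uint32_t mant = (x >> 13) & 0x3FFu;                            // keep top 10 mantissa bits
    if (exp <= 0) return static_cast<std::uint16_t>(sign);              // too small: signed zero
    if (exp >= 31) return static_cast<std::uint16_t>(sign | 0x7C00u);   // overflow, inf, NaN: infinity
    return static_cast<std::uint16_t>(sign | (static_cast<std::uint32_t>(exp) << 10) | mant);
}

// Up-convert: binary16 bit pattern -> float.
float half_bits_to_float(std::uint16_t h) {
    std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    std::uint32_t exp = (h >> 10) & 0x1Fu;
    std::uint32_t mant = h & 0x3FFu;
    std::uint32_t x;
    if (exp == 0) {
        x = sign;                                        // zero; subnormals are flushed
    } else if (exp == 31) {
        x = sign | 0x7F800000u | (mant << 13);           // infinity or NaN
    } else {
        x = sign | ((exp - 15 + 127) << 23) | (mant << 13);
    }
    float f;
    std::memcpy(&f, &x, sizeof f);
    return f;
}

// Emulated half multiply: up-convert both operands, multiply in float, down-convert.
std::uint16_t half_mul_emulated(std::uint16_t a, std::uint16_t b) {
    return float_to_half_bits(half_bits_to_float(a) * half_bits_to_float(b));
}
```

For example, `half_mul_emulated(0x3E00, 0x4000)` multiplies 1.5 by 2.0 and yields `0x4200`, the half-precision pattern for 3.0. A native FP16 instruction would do the same multiply without the two conversions on every operation.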

Also, unrelated, but you should use syCL/openCL 2.0 instead of 1.2, because nvidia is already supported through CUDA, and openCL 2.0 is supported by both the AMD and Intel Windows drivers. AMD has also said that it will soon open-source an openCL 2.0 driver for Linux (which Intel could use, opensource magic), and Intel already has a Linux openCL 2.0 implementation that just needs to mature. If you ask Intel and AMD, maybe they could speed up the work, because tensorflow matters to their economic interests, and they have already said in this comment section that they want to help. Also, all the major ARM makers support openCL 2.0. This could open up a lot of opportunities for Android (which is in Google's economic interest), raspberry-like boards, smart TVs, etc.

And in the mid term we could eventually develop an opencl 1.2 fallback layer for unsupported hardware.
The implementation should also use openVX (which is now supported by all major hardware makers, and AMD has an open-source implementation) together with https://www.khronos.org/news/press/khronos-launches-dual-neural-network
And all of it with Spir-V (which can be used by Vulkan and openGL at the same time).
You could say that I am duplicating what has already been said, but synthesizing is important.
And finally, could tensorflow use HSA?

http://www.hsafoundation.com
HSA would be awesome on Android.

I don't know whether HIP would be useful or not. It is only supported on some AMD cards, so we need an OpenCL implementation anyway if we want to support all devices. It might still be worth it if the HIP implementation is notably faster. That may be the case, but I haven't seen many benchmarks (HIP vs. OpenCL) yet. Another reason might be MLOpen (which is written in HC) as a replacement for cudnn, but again I have no idea how fast it is or which features it supports.

TensorFlow would not use HSA directly because it is quite low-level. But HC (and HIP) are implemented on top of it, and you can also implement OpenCL on top of it (pocl does that).

Would the relooper algorithm be helpful here? http://mozakai.blogspot.ca/2012/05/reloop-all-blocks.html

@hughperkins Nice to see that you are making some progress with your compiler, but I think this is becoming off-topic for TensorFlow. You should start many smaller discussion threads on the GitHub page of your compiler project instead. I guess that would be more focused and productive.

There is also https://github.com/kripken/emscripten-fastcomp/blob/master/lib/Target/JSBackend/Relooper.cpp

The related paper is: https://github.com/kripken/emscripten/blob/master/docs/paper.pdf?raw=true

Initial OpenCL/SyCL support was merged into master with https://github.com/tensorflow/tensorflow/pull/5267

Congrats!

@keryell Btw, what happened to your triSYCL repository? It seems to be gone, and I can only find a reference to Khronos' Gitlab, which is not publicly accessible.

EDIT: I found your private clone. Only the one from amd is gone.

@bhack, does opencl-docker support the mac platform?

@alephman I don't have an OSX platform, but I think it could work with a small adjustment to the launch command.

@bhack @alephman: see my comment about mac above. If you point me to the build instructions, I'll have a go.

@olesalscheider: yes, triSYCL moved from AMD to Xilinx (https://github.com/Xilinx/triSYCL), but you are right, the version in my GitHub workspace at https://github.com/keryell/triSYCL works too.

We have not tried triSYCL on TensorFlow yet. There is already a big build-configuration job to do just to try it.

@keryell What is the triSYCL status?

Intel beignet opencl 2.0 support is almost done!
http://phoronix.com/scan.php?page=news_item&px=Beignet-Birthday-CL2

@bhack triSYCL is now mainly developed at Xilinx, and we are still adding more features. The Clang/LLVM-based outlining compiler is still in development, to provide a full single-source experience on a device. But the OpenCL compatibility mode, which is already implemented, has some value too: it simplifies communication between host and kernels, with the SYCL runtime doing lazy transfers according to the dependencies expressed by the accessors.

My mac is OpenCL-compatible, so how can I run my tensorflow with OpenCL? I just noticed that opencl is supported in tensorflow when I configured the new code.

@hughperkins There is no clinfo command on my mac. What can I do about that? But I can compile the test code here for opencl with clang, and it gives the following result:
clang -framework OpenCL dumpcl.c -o dumpcl && ./dumpcl
Device Intel(R) Core(TM) i5-5257U CPU @ 2.70GHz supports OpenCL 1.2
Device Intel(R) Iris(TM) Graphics 6100 supports OpenCL 1.2

Thank you @hughperkins, but I think I tried computecpp yesterday, and it seems the macbook system is not supported by computecpp yet. So waiting for new updates may be the only thing I can do (TT). BTW, my Iris 6100 is eighth generation, which is fine for OpenCL 1.2.

@hughperkins Yes, SYCL 1.2 is a priori for OpenCL 1.2 and SYCL 2.2 is a priori for OpenCL 2.2.
I said "a priori" because, if you do not use anything that requires the OpenCL compatibility mode of SYCL, SYCL does not really require OpenCL at all. SYCL is actually a very generic model for heterogeneous computing and can run on top of anything. But of course a real implementation may require OpenCL too.

Hi,

I have been learning and working with TensorFlow and Keras for a while, and I am interested in getting OpenCL support working on macOS... Is there any news on the work done around macOS?

I succeeded in compiling TensorFlow, but when I try to configure it for OpenCL it asks me for the computeCpp 1.2 location, and it seems there is no ComputeCpp for macOS.

Hello. I am by no means an expert in ML, Tensorflow, or even OpenCL, but I am an experienced Mac graphics developer who desperately wants faster Tensorflow performance on systems with integrated and AMD GPUs, using built-in libraries and simple dependencies. :)

How can I help?

Looking at the last compile failure on OS X in the travis log, @hughperkins: might running 'xcode-select --install' fix it? It should re-link the /usr/include directory. I had this issue myself when updating an Xcode beta to the release, and had problems compiling some C++ code.

It seems that the XLA compiler (https://www.tensorflow.org/versions/master/resources/xla_prerelease.html) will provide LLVM code generation from dataflow graphs. This means very easy access to spir-v and therefore to Vulkan's compute API. Once code generation is sorted out, I can't imagine Google not providing Vulkan compatibility, given the large number of unused integrated GPUs running on Android.

@hughperkins

Quickly: right now I am running Inception v3 on a custom C++/Object-C code base and passing decoded video frames to the network. I don't know enough about TF to know the low-level needs, but at a high level: load models, run a session, expect things to work. To be really honest, I think that means 100% compatibility. I know that is no help in prioritizing. Basically, the C++ image recognition using TF/InceptionV3 was my starting point.

cuda-on-cl running on Mac: I have checked out the repo and can help debug and run builds on my systems and verify results on a variety of hardware: Dual D700s, Nvidia Mac laptops and desktop systems.

Thanks for your detailed feedback. I will monitor the repo, try to follow along, and help as best I can.

Hugh, you might want to look at http://chrec.cs.vt.edu/cu2cl/ to learn how some functions map.

At my company StreamComputing we have various GPUs for build testing and benchmarking, which we use for our customer projects. I could hook your Github up to our Jenkins to do a weekly run.

Thanks for the answer. I will come back to this topic this week with specific scripts.

My use case is text/syntactic matching analysis, using Gensim and Keras/tensorflow in my experiments.

I am willing to help with testing.

I have a Windows PC with an AMD card,
an MBP with an AMD card,
an MB with an Intel integrated GPU.

Hey @hughperkins - I am going through the test set above this evening on an AMD R9 390 8GB. So far I have already got a different result: logistic_regression.py trains and doesn't return nan. So, good! It segfaults at the end, so I will investigate whether the script or the cl code is at fault.

Where should I push my results, so that they are most useful to you?
Could we get a standard "test script" that generates a standard set of results that volunteers can push to you (or set up on their local CIs, or whatever)?

py.test is as good a solution as any. It is just a pip away, and that is part of the tensorflow installation process anyway.

I have discovered a few interesting things since starting my tests, though, and they may not be debuggable using Python output alone:

  • Different calls to the same script may crash early, may "hang" (no output, no progress, no response to Ctrl-C, the process has to be pkill -9'd), or may crash late, either in the validation part or after the script completes successfully. Crashes (segfaults) may take down Xorg.
  • The results vary for seemingly no reason: I may call a script and have it segfault, then call it again and it works.
  • Hangs can occur in portions of code that were literally working moments ago. I had one hang occur within or after a training batch, after several hundred batches had completed successfully.

So it might be that there is unresolved state on the GPU side, and that a good segfault is needed to clear it out? I don't know much about the GPU model or OpenCL yet, so I can't contribute much here. But GPU debugging output may be needed to properly investigate what is happening.

Also, from your github I thought you were with AMD, but it seems you are a "rogue agent" doing this whole CUDA-on-CL thing on your own time. Thank you sincerely for spearheading this! Is there some way that I and others can contribute to your efforts, perhaps by crowdfunding a GPU for you? Or you could set up a Patreon; I would be happy to sign up for a monthly contribution to the project.

Concerning AMD GPUs, we are a partner of AMD. See my message from 8 days ago, which you might have missed:

At my company StreamComputing we have various GPUs for build testing and benchmarking, which we use for our customer projects. I could hook your Github up to our Jenkins to do a weekly run.

I wonder if there is any possibility of setting up a CI server that runs on each commit.

No problem. I will probably need write access to the project, so that Jenkins can write the log file into a build-log directory. I just spammed you so that we can discuss it.

Hi all,

As you have probably seen already, a bunch of SYCL work has been pushed to TensorFlow. We are not complete yet, and there is plenty to do, but we are making progress towards getting there.

If you are interested in contributing or are just curious about the current state, check the breakdown below.

Infrastructure
Google kindly donated two machines that are set up to periodically test @benoitsteiner's fork of TensorFlow (https://github.com/benoitsteiner/tensorflow-opencl).

Both have AMD GPUs:

CL_DEVICE_NAME : Hawaii
CL_DRIVER_VERSION : 1912.5 (VM)

and

CL_DEVICE_NAME : Fiji
CL_DRIVER_VERSION : 1912.5 (VM)

We at Codeplay are looking to dedicate machine(s) next year too, to improve OpenCL device diversity coverage.

We are looking for contributors on that front, if anyone is interested in providing a test build server for relevant platforms that we support.
Currently, the requirements are:
- Ubuntu 14.04
- OpenCL drivers that support SPIR (Intel CPU/GPU or AMD GPU)

@VincentSC maybe you could help out with that?

Tests
On the Fiji machine ( https://ci.tensorflow.org/job/tensorflow-opencl/127/consoleFull ) we are facing 164 failures.

On the Hawaii machine ( https://ci.tensorflow.org/job/tensorflow-opencl/129/consoleFull ) we are down to 56 failures.

We are looking into fixing the failing gradient tests and investigating the origin of the additional failures on the Fiji machine.

Eigen
Over the past few months we have been actively implementing features needed by TensorFlow, including Reshaping, Slicing, Basic Reduction, etc. We are currently implementing Contraction. A detailed breakdown can be found in the Eigen Tensor tab of https://docs.google.com/spreadsheets/d/1YbHn7dAFPPG_PgTtgCJlWhMGorUPYsF681TsZ4Y4LP0/edit#gid=0

TensorFlow
Many Coefficient-wise operations have been implemented, including Abs, Floor, IsFinite, Log, Pow, Mul, etc., as well as Tensor Manipulations such as Reshape, Shape, Identity, Fill, etc.
A detailed breakdown can be found in the TensorFlow Kernels tab of https://docs.google.com/spreadsheets/d/1YbHn7dAFPPG_PgTtgCJlWhMGorUPYsF681TsZ4Y4LP0/edit#gid=1719702219

Organisation
The spreadsheet above has several tabs that categorise the efforts of the project: Overall Plan, Eigen Tensor, TensorFlow Kernels, Models.

If you would like to get involved, please put your name next to the item you are working on, or add anything important that is missing.
Thanks,
Luke

Is this roadmap active?

@lukeiwanski Yes, no problem. Contact us via [email protected]

After reading through all this, I am guessing there is no solid solution yet for using OpenCL on macOS/OS X? I tried to compile Tensorflow C++ with OpenCL support (which, as someone pointed out, I assume requires ComputeCpp for SYCL 1.2).

I looked around but couldn't find where to download, compile or build the SYCL library. Is it here: https://www.codeplay.com/ ? I am really unsure how to proceed... thanks.

@dylib As far as I know there is still no ComputeCpp for macOS, which means OpenCL for macOS is not ready.

I still can't get it to work on Ubuntu 16.04 with an AMD card and the catalyst driver: https://github.com/tensorflow/tensorflow/issues/6497. Is there a howto?

I had to look at the output of /usr/local/computecpp/bin/computecpp_info before trying to use TF compiled with OpenCL support. In my case it shows:

  Device is supported                     : NO - Unsupported vendor
  CL_DEVICE_NAME                          : Pitcairn
  CL_DEVICE_VENDOR                        : Advanced Micro Devices, Inc.

So for now there are two choices for running TF on a GPU:
  • CUDA, which works well on a limited number of devices (limited by vendor) but is proprietary
  • computecpp, which works badly on a limited number of devices (limited by the computecpp developers) and is also proprietary
There is still no OpenCL support.

@inferrna There is an OpenCL-specific section in the overall TensorFlow documentation. It will be published on the tensorflow.org site soon.

@benoitsteiner What is the current state of opencl convolution support? Are you planning to leverage the existing kernels directly? What about matrix multiplication?

Any ETA?

How about using HIP to port the CUDA code to platform-agnostic code? https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/blob/master/docs/markdown/hip_porting_guide.md

It looks like AMD is working on that: https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/45#issuecomment-269827686

How about this? I think this package can work on Radeon GPUs.

https://github.com/RadeonOpenCompute/ROCm

@bhack From https://github.com/tensorflow/tensorflow/issues/6449#issuecomment-269245727

@lukeiwanski Will XLA also affect your effort?

The XLA and SYCL solutions are complementary and suit different situations. SYCL is designed to provide full programmability and customizability. XLA is for optimizing well-defined patterns in graphs.

My understanding of XLA is that it optimizes existing TensorFlow graphs at runtime using the LLVM compiler. It requires optimization passes to be implemented in the compiler for each algorithm used in the graph.
The SYCL approach is the only approach that delivers a CUDA level of programming, which is what developers need.

With SYCL we are aiming to provide support for all TensorFlow Ops and to ease the development of new operations.

This means that SYCL lets you write new high-performance operations very easily, while XLA can optimize a whole graph if it supports all the ops in that graph.

Can the XLA backend's LLVM IR be converted to SPIR-V with https://github.com/KhronosGroup/SPIRV-LLVM?

I don't see any reason why that wouldn't be possible.

@lukeiwanski Thank you. I was looking in particular at https://www.tensorflow.org/versions/master/experimental/xla/develping_new_backend

@k-hashimoto: here we are discussing porting TensorFlow to OpenCL, a standard from the Khronos Group, and actually more to OpenCL SYCL, the post-modern C++ single-source standard from the Khronos Group.
ROCm looks like yet another non-standard solution from some vendor.
If you are interested in proprietary solutions, there is already a CUDA version of TensorFlow that works well. :-)

Agreed: keep the conversation and the effort on OpenCL, and let the vendors implement whatever they have on top of that open standard.

:+1:

👍

:+1:

New here. I wanted to ask whether TensorFlow will support OpenCL in the future, and whether that would mean support for running TensorFlow on FPGAs.
Thanks

@atinzad: yes, if the OpenCL or SYCL version and the source code are supported by the FPGA environment. But since TensorFlow is probably the framework that has been ported the most, by various means, there may already be some part of it running on an FPGA somewhere...

What is the difference between the SYCL development effort and XLA targeting SPIR-V rather than PTX, in a mid-term vision?

What a great question. Probably the number of people involved? It would be very interesting to know!

"What is the difference between the SYCL development effort and XLA targeting SPIR-V rather than PTX, in a mid-term vision?"

@bhack That would have been a great discussion to have at yesterday's TensorFlow Dev Summit.

Are you asking about the resources available, or about the type of programmers needed to contribute?

If so, with the OpenCL/SYCL approach, C++ programmers and OpenCL C programmers can get up to speed quickly and contribute. The XLA approach requires compiler/LLVM experience.

XLA is an internal Google project, so by extension it has more resources behind it. On the other hand, their task is also much bigger. Writing a compiler is not an easy task.

Otherwise, if you are asking about the model:

As I mentioned earlier in https://github.com/tensorflow/tensorflow/issues/22#issuecomment-272908870, we see the two efforts as complementary approaches with different use cases. I still stand by that.

For instance, @tatatodd mentioned in his presentation that some of the ops will never have XLA as a target. I believe we can fill that gap.

Another thing to consider is new platforms. I will use mobile and embedded environments for the sake of this argument, since new chips tend to appear there more often than new GPUs do (the principle is the same).

If the silicon supports SYCL/OpenCL, you get TF support out of the box (some performance tuning might be required).

If the architecture is exotic and there is no LLVM backend for it yet, one has to be added for XLA (that might not happen too often, but it does happen). What happens more often is that the architecture changes a bit, and then new optimization passes have to be added, or existing ones modified, to gain the benefit. Tweaking kernel code is easier.

I haven't looked very deeply into XLA, but I assume XLA has to call the CUDA API somehow to run the PTX kernel code, so it would have to be ported to OpenCL or Vulkan to run SPIR-V kernels instead. I assume that would go through StreamExecutor, which is yet another framework to get familiar with, and probably quite a large effort.

In short, we are providing a unified, stable platform in a very fragmented and shifting ecosystem, one that both semiconductor companies and developers can target, whereas XLA would have to commit to supporting each target.

@benoitsteiner or @drpngx may be able to share more inside knowledge of XLA, as I am working from a lot of assumptions and conclusions based on conversations.

Oh, I have also created a Slack channel to make communication easier: https://tensorflowopencl.slack.com/shared_invite/MTQzNDQ0NzgzNzAyLTE0ODcyOTE1NjctMDZhM2RkODRlYg

Edit:
The Slack link is no longer valid. Please ping me if you'd like to join.

I think that is correct, and it will partly depend on which direction the semiconductor producers take.
"The backend emits the LLVM IR necessary to represent the XLA HLO computation in an efficient manner, and then invokes LLVM to emit native code from this LLVM IR." So the LLVM IR could be converted to SPIR-V. But the OpenCL SPIR-V dialect is different from Vulkan's. StreamExecutor is being pushed into LLVM parallel-libs, and in @henline's original description the original plan seems to have covered OpenCL.

/cc @dneto0

http://phoronix.com/scan.php?page=news_item&px=OpenCL-2.0-NVIDIA-Preps
Nvidia should soon support OpenCL 2.0 on both Linux and Windows. This is YUGE!

Performance-wise, though, it is likely to be slower than CUDA.

Also remember that the Nouveau folks are independently working on OpenCL with SPIR-V. The status page is a little outdated, but there are fresh commits.

OpenCL is not inherently slower than CUDA. It's just that Nvidia is effectively locking the market by crippling its OpenCL driver.
But Nvidia's lead is finally coming to an end, and even their immoral anticompetitive practices will not save them. With the impressive CUDA auto-translator HIP ( https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP ),
the upcoming Vega APUs and dGPUs, and ARM coming to Windows, Nvidia has no future. This is why the industry NEEDS to support OpenCL/SYCL/HIP/HSA very soon and massively.

Hello, does TensorFlow plan to support the new AMD Radeon Instinct? (http://instinct.radeon.com/en-us/)

Hi, is there any progress on TF-OpenCL support for FPGAs?

@alexivia https://github.com/iteong/tensorflow/blob/master/tensorflow/stream_executor/platform.h#L30 was removed some months ago, and the StreamExecutor roadmap is not clear.

@bhack Thanks for the quick response.
So does this mean there is no support, or that correct operation is not guaranteed?
Also, from what I read in this thread, the tests are mainly on AMD GPUs... Is anyone training networks on Nvidia GPUs with this OpenCL port?

StreamExecutor was renamed in LLVM parallel-libs and is now Acxxel.

CC @zheng-xq

Acxxel and StreamExecutor are separate projects. There are no current plans to merge them. I'll leave it to the TensorFlow folks to say whether they plan to switch.

So StreamExecutor and StreamExecutor in LLVM were not the same project either?

Correct, they are not the same project.

@jlebar Next time, get a creative unit for the naming ;) But it probably wasn't a lack of creativity, just an upstreaming effort for an internal tool that diverged from the one maintained in TF..

@bhack, we did change the name, precisely when we realized that it did not make sense to move StreamExecutor into LLVM wholesale. It's now called "Acxxel".

I'm sorry for the confusion, and I appreciate the feedback.
Definitely a learning process.

Yes, I am still a bit confused between StreamExecutor, SYCL in Eigen, and XLA (which actually has only a CUDA backend besides CPU, plus OpenCL on some slides).

Bump

Has anyone at Google talked to Apple or AMD to ease this? I guess the AMD people are so lost that they don't even know the problem exists, and they are still wondering why Nvidia has such a huge market share. I guess the Apple AI team would be more than happy to help here... if OpenCL hadn't been abandonware on their side since 2013 and, even worse, if their bosses weren't mad at Google.

What's the latest on this?

According to the TF 1.1 release notes, Mac GPU (Nvidia only) support has been deprecated. Let's hope this helps to improve the OpenCL approach.

You can also follow the PR status: https://github.com/tensorflow/tensorflow/pull/9117

Thanks! I have been following this issue for the last few months. I'm not confident about Apple's OpenCL commitment, given that they have been stuck on OpenCL 1.2 since 2013 (Apple does not provide SPIR 1.2 support yet).

If TensorFlow on OpenCL would help your work, let me know. To the extent that I can help advance the research and practice of deep learning, I'd like to help. My company has built an OpenCL backend for TensorFlow, tuned for a variety of GPUs, as part of our work on on-device inference. We have tested it on the major mobile and desktop GPU families, including common configurations on Windows and Mac. If there's enough interest, we may do some kind of public distribution. We also have Metal (Apple GPU) and LLVM (CPU) backends, along with a way to do zero-dependency deployment. The idea here is to give every device great support for deep learning.

@choongng - All of that sounds incredibly useful and helpful. My personal project https://github.com/Synopsis/ would benefit greatly from OpenCL on OS X, as well as Metal deployment for iOS and desktop. If it's possible for this to be brought into TensorFlow proper, I think it would be a tremendous boon for tons of developers.

Thank you.

@choongng

I think if your company publishes an OpenCL version, or more interestingly a Metal version, of TensorFlow, it will be great news for a lot of people. I'm currently in the process of building an eGPU with an NVidia card to get TensorFlow / Keras running on my MBP for my work...

For those interested ... go to the eGPU.io community

@choongng

I'd be very interested in seeing this, so I'd be very grateful if you could pursue it! Especially if it doesn't require the sketchy closed-source compilers TF has chosen for CL support..

I think it would be revolutionary ;)

@choongng Maybe it would help to join forces with these guys:
https://github.com/benoitsteiner/tensorflow-opencl

@cathalgarvey what open-source compiler do you propose to use, then? It is difficult to find an open-source, OpenCL-compliant solution that addresses many of the devices in the wild...
We need to bootstrap a solution somehow at some point...

I didn't say it was an easy fix. But OpenCL isn't the problem there. After all, CUDA is entirely proprietary, which is far worse than even the OpenCL option TensorFlow chose.

All that said, there are options for a CL-or-CUDA system if you're starting from scratch, including portable middleware runtimes, ArrayFire, and so on. TensorFlow is too tied to CUDA, though.

I find it frustrating that people are willing to write kernels in CUDA but balk at doing it for CL, even though it would reach more users and seriously grow the market ecosystem. An open platform has direct and indirect benefits for everyone, and could lead to big cost savings for everyone in the long run.

If SYCL is how that eventually happens, great. So why aren't some big names putting money into an open SYCL distribution instead of buying into fringe proprietary options, which rather defeat the purpose of an open standard?

What I want to ask in this context is the following:

Some deep learning frameworks, like TensorFlow, are somewhat tepidly exploring the use of OpenCL as an alternative to CUDA. Of course, CUDA is just the "language" that cuDNN was developed in, and cuDNN (if my understanding is correct) is what most deep learning frameworks actually use. In this context, I am not sure what the OpenCL version of cuDNN is.

AMD has also been talking about an open-source alternative to CUDA that they are continuously developing and call ROCm. They are also talking about MIOpen as the cuDNN equivalent (hand-crafted assembler libraries for common deep learning functions), which has not been released yet. The AMD approach is somewhat more holistic: we are not just exporting heavy compute to the GPU.

In this context, I am genuinely confused. How do OpenCL efforts like the ones listed above fit together? For NVIDIA GPUs it's easy: there is CUDA, and there is cuDNN written in CUDA. For non-NVIDIA hardware, or in this case AMD, it seems much less clear. When is HIP preferred? When is HCC preferred? When is OpenCL preferred? Any insights would be really appreciated!

@cathalgarvey there is a lot of politics behind all these huge software/hardware infrastructures... :-(
Even if we can dream of a clean-slate solution based on purely scientific criteria, I think we have to be pragmatic.
Google does not want to change the TensorFlow architecture too much. This is why the OpenCL-based architecture has to be very similar, which requires single-source C++ in the style of the "CUDA runtime" rather than the lower-level, non-single-source OpenCL C solution. In the Khronos realm, the single-source C++ version of OpenCL is called SYCL.
Let's discuss this when you drop by Dublin, for example, since you look to be based in Ireland too. :-)
In the meantime, feel free to contribute to https://github.com/triSYCL/triSYCL and to the TensorFlow & Eigen branches dealing with SYCL...

@keryell Do you know if XLA:GPU:OpenCL is also planned on SYCL?

Hi @benoitsteiner, regarding:

"There, in the OpenCL-specific section of the overall TensorFlow documentation. This will be published on the tensorflow.org site soon."

I searched tensorflow.org for OpenCL and couldn't find anything significant; everything seems to point right back here. By "soon", do you mean before ______? ( _insert funny sarcasm here_ ).

I was able to compile your repo (yay!), though I'm guessing something else is needed to produce a working TensorFlow OpenCL build for Mac. I tried to build the triSYCL compiler that was mentioned, but sadly failed.

@bhack Since I do not work for Google, I have no idea about the XLA details...

@dylib unfortunately, all of this is still work in progress...

@keryell Yes, I know.. I was just curious whether it had been discussed at some meetings.

OpenCL is radically different from CUDA. I would, however, definitely see this ported to HIP instead.
So +1 to all of you who suggested it.
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP

HIP allows developers to convert CUDA code to portable C++. The same source code can be compiled to run on NVIDIA or AMD GPUs.
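
To make "converting CUDA code to portable C++" a little more concrete, here is a minimal Python sketch of the renaming idea behind source-to-source translation. It is not the real HIP tooling: the function name `toy_hipify` and the small mapping table are made up for this illustration, and an actual translator has to handle much more than identifier renaming (kernel launch syntax, library calls, and so on).

```python
import re

# Hypothetical, deliberately tiny mapping from CUDA runtime names to their
# HIP counterparts. The real HIP tools cover far more of the API.
CUDA_TO_HIP = {
    "cuda_runtime.h": "hip/hip_runtime.h",
    "cudaMalloc": "hipMalloc",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpy": "hipMemcpy",
    "cudaFree": "hipFree",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
}

# Longest names first, so "cudaMemcpyHostToDevice" is tried before "cudaMemcpy".
_PATTERN = re.compile(
    "|".join(re.escape(name) for name in sorted(CUDA_TO_HIP, key=len, reverse=True))
)


def toy_hipify(cuda_source: str) -> str:
    """Rename the known CUDA identifiers in a source string to HIP ones."""
    return _PATTERN.sub(lambda match: CUDA_TO_HIP[match.group(0)], cuda_source)


if __name__ == "__main__":
    snippet = (
        "#include <cuda_runtime.h>\n"
        "cudaMalloc(&d_x, n * sizeof(float));\n"
        "cudaMemcpy(d_x, h_x, n * sizeof(float), cudaMemcpyHostToDevice);\n"
        "cudaFree(d_x);\n"
    )
    print(toy_hipify(snippet))
```

The point of the sketch is only that the HIP API mirrors the CUDA runtime API closely enough for much of a port to be mechanical; whether the translated code then performs well on a given GPU is a separate question.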

Not many people know about HIP.
You can find more information about TensorFlow and HIP here:
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/37
and
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/45

Side note:
I don't think we should fight or brag about Nvidia vs. AMD. These are respectable companies that make amazing hardware and software, period. We should instead focus on delivering TensorFlow to a larger user base.
Targeting many languages through bindings is already a good starting point, but we also need to target as much hardware as possible. (Even if cloud solutions are amazing, they are not always the answer.)

We have experience with HIP here at Stream. Let me take a look.

Agreed on avoiding "my company is better" arguments. I would like to know which GPUs TensorFlow should be targeting. It has to be pragmatic and useful. For example, Intel GPUs or embedded GPUs (Qualcomm, ARM, Imagination), Raspberry Pi: yes or no?

AMD Radeon Vega Frontier Edition

"We continue to aggressively improve our ROCm open software platform and machine learning libraries. We're also supporting open machine intelligence frameworks like Caffe (released in April). Later this quarter, we plan to offer support for Torch, and TensorFlow is in the works."

They have already released Caffe. I would be very interested to hear others on this thread share their experiences with building and testing it:

https://github.com/ROCmSoftwarePlatform/hipCaffe

์„ค์น˜๋ฅผ ์‹œ์ž‘ํ–ˆ์ง€๋งŒ clinfo ๊นŒ์ง€ CL์ด ํ•„์š”ํ•œ ๋ชจ๋“  ๊ฒƒ์ด ์ •์ง€๋˜๋Š” ์žฅ์• ๋ฌผ์— ๋ถ€๋”ช์ณค์Šต๋‹ˆ๋‹ค. ์ด๊ฒƒ์ด ์ผ๋ถ€ ์†Œํ”„ํŠธ์›จ์–ด ๋ฌธ์ œ ๋•Œ๋ฌธ์ธ์ง€ ์•„๋‹ˆ๋ฉด ๋‚ด ์นด๋“œ(R9 390)๊ฐ€ ๋‹จ์ˆœํžˆ ROCm์—์„œ ์ง€์›๋˜์ง€ ์•Š๋Š” ๊ฒƒ์ธ์ง€ ํ™•์‹คํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

@cathalgarvey I've been using the Caffe OpenCL branch on AMD GPUs and it works just fine. make runtest passed all tests except one.

Good to hear; may I ask about your HW/SW setup? E.g., which card you are using, which distro/version of Linux, and so on?

I previously had AMDGPU-pro, but uninstalled it when installing ROCm.
It's possible there's some leftover from that interfering on my side.

@cathalgarvey

  • Caffe OpenCL branch (tested commit c61d48746b2df1d237c64abc7416342ce98f3251)
  • OS: Ubuntu 16.04.2 LTS
  • Tested on Polaris (RX460), Fiji (Fury X) and Tonga (W7100)
  • Driver: AMDGPU-Pro driver for Linux 16.40 or above
  • ViennaCL
  • General dependencies: libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler libatlas-base-dev libblas-dev libgflags-dev libgoogle-glog-dev liblmdb-dev libboost-all-dev cmake python-numpy
  • cmake: cmake -DViennaCL_INCLUDE_DIR=<wherever you downloaded ViennaCL>/ViennaCL-<version> -DOPENCL_INCLUDE_DIRS=<wherever you downloaded ViennaCL>/ViennaCL-<version>/CL/ -DOPENCL_LIBRARIES=/opt/amdgpu-pro/lib/x86_64-linux-gnu/libOpenCL.so.1 ..
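
The cmake configuration in the list above can be captured in a small helper, shown here as a hedged Python sketch. The helper name `caffe_opencl_cmake_args` and the example ViennaCL path are assumptions for illustration; only the three -D options, the AMDGPU-Pro library path and the trailing `..` come from the list.

```python
from pathlib import PurePosixPath


def caffe_opencl_cmake_args(viennacl_dir, opencl_library):
    """Build the cmake argument list for the Caffe OpenCL branch.

    viennacl_dir: directory where ViennaCL was unpacked (placeholder path).
    opencl_library: full path to the vendor's libOpenCL shared object.
    """
    viennacl = PurePosixPath(viennacl_dir)
    return [
        "cmake",
        f"-DViennaCL_INCLUDE_DIR={viennacl}",
        # The list above points the OpenCL headers at ViennaCL's bundled CL/ folder.
        f"-DOPENCL_INCLUDE_DIRS={viennacl / 'CL'}/",
        f"-DOPENCL_LIBRARIES={opencl_library}",
        "..",  # the source tree is the parent of the current directory
    ]


if __name__ == "__main__":
    args = caffe_opencl_cmake_args(
        "/opt/src/ViennaCL-x.y.z",  # hypothetical download location
        "/opt/amdgpu-pro/lib/x86_64-linux-gnu/libOpenCL.so.1",
    )
    print(" ".join(args))
```

Keeping the arguments as a list rather than one string makes it straightforward to pass them to a process launcher without shell quoting problems.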

Yes, in addition to the OpenCL branch above, naibaf7 will (very soon) be publishing a book about using it for real-time inference on commodity hardware with AMD and HD Graphics.

Ah; I was talking about hipCaffe, not the OpenCL branch:

https://github.com/ROCmSoftwarePlatform/hipCaffe

Installing ROCm to build/test hipCaffe required me to uninstall AMDGPU-pro, so perhaps I'll try the vanilla branch again. It's poorly documented, unfortunately.. I suppose I'll try a blind "make" and see.

So, I'm still interested in hearing others' experiences with the AMD ROCm/HIP stack. If they're working on a TensorFlow fork it would be great, provided it actually works on more than 3/4 models of AMD card in the wild.

@cathalgarvey I do hope they are working on a HIP backend, not a full fork. A fork would be sad and would only split the effort.
There are enough tools already :/

@YvanDaSilva AMD's efforts are a bit poorly coordinated at the moment (yes, all forks). Also, it doesn't seem to work that well on a wide variety of devices yet, unlike the OpenCL branch of Caffe, for example...

@naibaf7 I absolutely agree.
To be honest, they really seem to lack human resources, and they are working on all fronts.
By the way: I didn't know ETH had neuroinformatics ;) nice!

@cathalgarvey Can you elaborate on the ROCm/HIP stack for a layman like me? I've been playing with AMDGPU-pro and AMDGPU on my Sea Islands cards, so I'm sure I could post some useful results.

@YvanDaSilva
They sponsored my original Caffe OpenCL project, but unfortunately things were not well coordinated, so AMD Research and an independent person at AMD also worked on OpenCL ports in parallel. The former AMD Research team is now defunct, and most of them actually work for Tesla (on the self-driving car project) now... so an unfortunate chain of events.
I'm still collaborating and in contact with them, though. Vega is going to be interesting :)

@naibaf7 Nice, lucky you! I wish there had been such studies when I was at the Heig-vd; I would certainly have continued to an MSc.

Yeah... That's what I figured. So much work, and so few human resources available in these fields.

All that sounds great, but let's refocus the discussion on getting TensorFlow to work with OpenCL SYCL, and not only on vendor-specific solutions... :-)
I hope ROCm and HIP have their own GitHub projects for discussing their own issues...
@naibaf7: at least I am still in the OpenCL realm. Join the club again! :-)

@keryell I think the discussion of HIP is valid if there is a HIP port of TensorFlow in the works. After all, the official TensorFlow-on-CL solution is to use a proprietary SYCL framework with sharply limited platform and kernel support, so it's not really better than the "vendor-specific" HIP solutions that offer new ways out of CUDA.

HIP may be mostly AMD's doing at the moment, but AFAIK it's an open standard? Perhaps I'm mistaken. But if so, and if AMD can deliver a tensorflow-on-HIP port, it would immediately be more open than the official tensorflow-on-SYCL port.

HIP is a subset of CUDA, so it is as open as CUDA.

OK, fine; HIP-the-API is a subset of CUDA-the-API, but unless NVidia is craven enough to start channeling Oracle, I doubt that will be an issue. I was referring to the runtime/compilers for HIP, of which I think AMD's are ~open.

Edit: Sorry if the above came across as rude; I'm just trying to clarify my position!

Will Vulkan and OpenCL be merged?

@cathalgarvey the discussion is clearly valid, but not here...
You are on GitHub here, on the issue discussing the port of TensorFlow & Eigen using the standards from the Khronos Group.
This is not Twitter or your Facebook wall... :-)
So please contribute some commits to these projects! :-)

There is a new version of the setup guide for compiling TensorFlow with ComputeCpp, Codeplay's implementation of SYCL, so that OpenCL devices can be used. We would appreciate any feedback you can give us on it. https://www.codeplay.com/products/computesuite/computecpp/guides/how-to-setup-tensorflow-with-computecpp

Do you have any idea what the success rate is for getting this working on untested AMD GPUs? I'm specifically interested in whether it has been tested on the AMD Radeon Pro 460, @rodburns. I would be happy to spend a few hours getting Ubuntu running on my MacBook if there is any hope for an untested GPU.

@samhains we haven't tested this, but you could give it a try. You will need to use some older AMD drivers with Ubuntu that support the SPIR extension. I haven't yet been able to figure out which drivers those are.

@samhains If the Codeplay route fails to deliver, don't miss tf-coriander, which has finally reached a state of practical use on Ubuntu/Mac.

I'm currently testing it on convnets, bidirectional RNNs, etc., and everything seems to be working great. It runs on "vanilla" OpenCL 1.2, so it should enable TensorFlow on a huge range of relatively old hardware.

The catch, for now, is that it is based on TensorFlow 0.11.

@rodburns. I tried following the steps listed at https://www.codeplay.com/products/computesuite/computecpp/guides/how-to-setup-tensorflow-with-computecpp
I get the following error:
ERROR: /home/sayantan/.cache/bazel/_bazel_sayantan/6f05f78a1e215999d72e42c1e87a8c1d/external/protobuf/BUILD:609:1: undeclared inclusion(s) in rule '@protobuf/.google/_apithon_impl':
In fact, I get the same error if I try to compile TensorFlow from source. I have compiled it before, so I'm not sure what changed.

@rahasayantan What is being included? Do you also get it when compiling without --config=sycl?

@lukeiwanski: The issue, as I understand it, is that Bazel is trying to compile Protobuf and is not finding or downloading the subdirectories. I did a pull with recursive submodules, but it still has the same issue. It has the same issue without --config=sycl as well. In fact, I face the same issue when I do a git pull from the main tensorflow project. I don't think this is linked to OpenCL; it is some issue with the way I am doing the pull. When I manually download the project zip from your repo without git and compile, it compiles properly, but then I get a segmentation fault. I have already raised this issue on your Git project and we are discussing it there. I will post updates related to the segmentation fault on that thread (no point duplicating things). Thanks for your response.

An open-source triSYCL is arriving. See https://github.com/triSYCL/triSYCL/pull/45

I am new here. I'm very interested in TF supporting OpenCL. How do I get updates from this thread?

Hmm... interesting, but why? Why did TensorFlow choose CUDA rather than OpenCL in the first place? Some commercial reason, I guess?

Hi @tensorflower-gardener,

@hughperkins has created Coriander, which can run NVIDIA® CUDA™ code on OpenCL 1.2 devices. You might want to take a look to see whether it suits your need to connect TF to OpenCL 1.2 devices. If you plan to use his work, kindly attribute his name and his contribution.

Hopes of ever seeing OpenCL support for Mac seem to have gone from little to tf.zero. I just read that TensorFlow on Mac will apparently no longer have any GPU support (1.2+):

Note: As of version 1.2, TensorFlow no longer provides GPU support on Mac OS X.

wtf

https://www.tensorflow.org/install/install_mac

TF-Coriander is tested on Mac, so if and when it reaches version parity you ought to be able to use that.

This makes me sad, because now, with an eGPU and an Nvidia 980 Ti inside, we have the driver working and CUDA working.

I haven't had the time yet to try TensorFlow in my configuration.

The web driver and CUDA toolkit are installed on my computer, and the CUDA samples work well.

https://youtu.be/JN9fDmqf010

@cathalgarvey You said you're testing convnets on tf-coriander, but it doesn't seem like convnets are working yet. Can you please clarify whether you managed to get convnets to run on the GPU using tf-coriander?

Why does tensorflow no longer support GPUs on OS X? I was planning on using TensorFlow with an eGPU setup I have on order.

@justinrmiller They claim they can't test it on Mac OS anymore and therefore decided to stop the support. However, I'm having a hard time believing that. With the advertising of eGPUs on High Sierra and with the new Nvidia drivers, this will no longer be the case.

@tscholak Yeah, exactly. I was going to use my new eGPU enclosure to ditch my Windows box for good.

Bear in mind that although Nvidia cards work in eGPU enclosures, Apple will only officially support the RX580 in their dev kit, so the need for OpenCL will not go away.

OpenCL on Mac is 1.2, which means that there seems to be no active driver
development. I think adding Metal support to TF is a painstaking process
(enabling Eigen and stream executor), but doable.

I'm very sad about the abandonment of GPU support for macOS.

I'm still looking for OpenCL support for GPUs on macOS, because Apple obviously won't switch to Nvidia GPUs anytime soon.

TensorFlow is my engine of choice. Using GPU acceleration locally on my MacBook Pro or a future iMac Pro would be awesome.

For Microsoft it would make sense to sabotage Apple, but since Google has no desktop OS, they are only hurting themselves.

Honestly, someone smarter than I am should look into integrating Mac OS 10.13's MPS (Metal Performance Shaders), which support a large set of neural network primitives out of the box. This would allow up-to-date, high-performance GPU use for mobile and desktop iOS and macOS TensorFlow inference deployment.

As I understand it, you can't train with Apple's primitives (they don't provide anything for that), but with TensorFlow support maybe you could? I imagine that would be a boon for folks on the Apple platform.

I don't think Google will provide this internally, and I don't have anywhere near the requisite skills to attempt it myself. I'm posting this idea so that folks more talented than I am might take it on.

:)

Apple is aimed solely at selling Apple devices. Google is aimed at getting people to hire Google's massive services.

If you want to do AI (learning) with a single device, like one Apple laptop, you will be doing "superficial learning" instead of "deep learning", so you had better give up on doing anything but tutorials. Inference with a trained model for a single user on a single device (even a phone with not too many cores) could be neat to do through GPUs, but it is perfectly doable with CPUs alone.

On the other hand, GPUs are absolutely needed if you are going to feed extremely large datasets for learning, or serve trained inference to extremely large groups of concurrent customers.

Even so, doing it at such a scale is not that easy because of the network issues. Just have a look at the physical architecture of the TPU Pods. It is at the antipodes of a laptop (several GPUs per memory-overloaded multi-core server, with dedicated optical fiber for inter-server communication).

I have a MacBook Pro. It is a nice terminal for getting to the cloud :-D

I see that TF on Metal could extend to iOS as well. If anyone is interested in picking it up, I recommend adding Metal support to Eigen first (you can use OpenCL as a reference).

@rogerpasky For school I had to use TensorFlow for training models, not only for evaluating a model. And I will have to do this again in the near future. For students like me, training on a GPU is a must, as it saves a lot of time. It's not just a matter of serving multiple concurrent users.

@rogerpasky It is about the ability to develop models and solutions locally on a Mac.

@rogerpasky Respectfully disagree. While cloud-based multi-GPU solutions work great for internet services, I'm targeting professional video production pipelines where inference is run on hours and hours of ProRes and uncompressed HD, 2K and 4K footage, which a) no production house is going to upload to a cloud, b) they don't want Google or anyone else to have their data, c) they have rooms full of multi-GPU capable systems (Mac and Windows) locally which they would like to leverage, and d) while inference on a single image is fine on CPU, running entire movies for inference through multiple graphs 100% sees an increase in performance using something like MPS versus CPU. Because the community has refused to support or embrace standards and instead uses Nvidia-only code, real-world use cases get pigeonholed, and it's really a shame.

This isn't an idle request from a hobbyist running tutorials. GPU inference is important, as is supporting diverse GPU/CPU families for diverse workloads on real-world hardware. I really hope Google takes this seriously, because it would be great to be able to stick with a single library like TF.

Thank you for hearing me out. I'm not trying to rant, but to provide an alternate point of view to the community.

@pldelisle, @tscholak, @vade Please don't get me wrong. I'd love to have it, and if you search this thread you'll see I joined as a supporter, but as far as I've been following it I reached the conclusion I wrote, not just because I think so (I guess a MacBook would melt down if it were trained with thousands of videos :-D), but because of the actual industry facts. Don't wait for it to be solved in a short time frame (IMHO, it won't ever be solved, because Apple and Google have hated each other since the iPhone/Android issue).

@rogerpasky There already was support for Nvidia GPUs on Mac OS. It was just removed in 1.2.

Note: As of version 1.2, TensorFlow no longer provides GPU support on Mac OS X.

I've cancelled my order for an eGPU (Sonnet's) and will just dual boot Linux on my gaming rig, but it's kind of bad to simply stop supporting something people were using. I was really hoping to do this (model training) on my Mac with an eGPU, but I guess that won't happen now: https://github.com/lengstrom/fast-style-transfer

@rogerpasky Er, you do know CoreML supports importing TensorFlow models by way of Keras? Apple doesn't 'hate' Google. Business is business. One of Apple's suppliers is Samsung. Read into that for a moment. Google, Apple and Samsung are businesses and will do what makes money. As a side note, my MacBook Pro hasn't melted from running inference on thousands of movies. I suspect CUDA was super convenient to adopt, and continued support from Nvidia and missed opportunities from AMD got us to where we are. I don't think it's nefarious, just the cost of making a change versus performance deltas versus the cost of staying the course.

I suspect some genius will come along to help solve the issue.

I created a Google Group for collaborative discussion about bringing deep learning to new places like OpenCL, Mac, iOS, CoreML, Vulkan, etc. If you'd like to help make these happen, please join and post a note with your use case or the part of the problem you're working on. There are already people working really hard on efforts to bring TF to more platforms, including MIOpen, Codeplay's work, TF-Coriander, and an internal project at my company (Vertex.AI). Since these efforts are all closely related, it would be great to get developers and users all in one place.

https://groups.google.com/forum/#!forum/deep-learning-everywhere

@benoitsteiner @hughperkins @cathalgarvey
@rogerpasky @vade @tscholak @pldelisle @adityaatluri @chocol4te @justinrmiller

@justinrmiller I have an eGPU on Sierra (a Titan Xp in a Sonnet enclosure) running TensorFlow 1.2.1 (CUDA 8, cuDNN 6). If you have any trouble, let me know.

tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: TITAN Xp, pci bus id: 0000:4f:00.0)

In [5]: tf.__version__
Out[5]: '1.2.1'

@danbarnes333 That's awesome! Thanks for the info!

@danbarnes333 How did you get tf 1.2 to build with cuDNN 6? Did you use LLVM? GCC? I only managed to get it to build with cuDNN 5...

FYI, https://machinelearning.apple.com/

@tscholak I won't post it here, to keep this thread on OpenCL, but I have summarised the steps here.

@choongng I joined the Google Group but it seems to be quiet. So I'll rant here ;-)

  1. Machine learning / high performance / GPU computing is a fiercely competitive market. NVidia, like it or not, dominates the market and keeps its cards and software close to the vest. If you have a budget and a deadline, you're more or less stuck with NVidia for now.

  2. I've got an old-ish AMD card ("Bonaire") and zero budget; I'm a hobbyist. As of yesterday I have caffe running with the proprietary AMD OpenCL 2 implementation on Arch Linux, and this morning I got AMD's open source MIOpen running the same way. That will let me train some models. The Bonaire peaks at around 1800 GFLOPS single precision. So if TensorFlow won't run with OpenCL on the Bonaire, I won't be using TensorFlow.

  3. If a budget should magically appear, I would buy an Intel CPU and an NVidia card and run vendor-supported proprietary software. I'm done doing unpaid QA for vendors like Google, Red Hat, Canonical and AMD.

    It has taken me three months (and three distros: Fedora 25, Ubuntu 16.04 LTS and Arch) to get something out of a GPU I've had for three years. There are unfixed bugs in Fedora's bug tracker with my name on them. The same goes for Ubuntu and Freedesktop.org. Most of the people who would be fixing them aren't getting paid either, or they're getting paid to do something else.

    Yes, AMD's new CPUs are impressive, and yes, most of their software is open source, but budgets and deadlines change things. Support is key. Support is everything!

@znmeb I did not even know you could use pre-GCN hardware for TF.
With my Tahiti I only have support for one distro (ubuntu 14.01.x), as the AMD proprietary drivers only work with older Linux kernels for GCN1. (I get TF + OpenCL via SYCL (untested on 7970).)

Where I work, the entire R&D department runs team green. They all have PhDs, yet practically none of them has written a single line of CUDA (or OCL). But the tooling is there to accelerate their Keras workloads. I am kind of an oddity, trying to squeeze a second life out of recycled mining GPUs.

tl;dr: support for anything other than team green will only show up if AMD GPU market share shows up.
It is a chicken-and-egg problem. I have hopes for Vega ... but yeah ... it ain't no 1080Ti killer.

@acoye FWIW, here's the GitHub post that got me going this weekend after thrashing and Googling since April: https://github.com/BVLC/caffe/issues/5804#issuecomment-318789942 . See also https://github.com/cdeterman/gpuR/issues/77#issuecomment-318814154 . That was my original issue: trying to use my Bonaire to accelerate linear algebra in R.

@acoye
You can move on to the latest Linux distros and use a recent custom-compiled kernel such as 4.11/4.12 with the AMDGPU drivers enabled, RADEON disabled, and CONFIG_DRM_AMDGPU_SI=Y and/or CONFIG_DRM_AMDGPU_CIK=Y set in the kernel configuration, plus the AMD firmware for the 7970 (Tahiti) in the initramfs => the newest AMDGPU-PRO OpenCL will then work on any GCN card. Forget about FGLRX (on older Linux distros) and Clover via the RADEON drivers; both are sub-par.
Forget about pre-GCN cards as well. I tested them with OpenCL on Windows for Caffe, and the performance is not worth the effort for such old cards. All AMD cards from after 2012 should be GCN anyway.

@naibaf7 I spent a few hours yesterday trying to get AMD's open source stack working. I got MIOpen and its dependencies, but hcc is still missing some bits. I may need to do a custom kernel build to get everything. I don't much care about porting CUDA code or running compiled C++ on the GPU. ;-)

I also saw something on their website about programming it in assembler. That might interest me, because it's easy to go from assembler to FORTH. ;-)

@znmeb Yeah, I am also trying to get some MIOpen and TensorFlow stuff working on my RX 480, but I don't want to destroy my main development rig, so instead I use IOMMU virtualization and an Ubuntu 16.04 virtual machine that can use the RX 480. The AMD drivers are very friendly to virtualization (unlike the nVidia drivers for the gaming cards; only the Quadro drivers are).

@znmeb All you have to do is sudo apt-get install rocm miopen-hip

@adityaatluri It's in the Arch User Repository, but it doesn't install. It doesn't install from the GitHub source either. It looks like something simple: it can't find a few dependencies.

@znmeb Can you create an issue here (https://github.com/RadeonOpenCompute/ROCm/issues) so that we can discuss it there? Thanks!

@adityaatluri Sure. I'm heading out for dinner, but I'll file it when I get back.

@ebrevdo Is there any way to use tensorflow GPU on a Mac with an AMD processor?

My company has been working on OpenCL deep learning for a while and we have some early results to show. We are focusing on Keras in the near term, but we have also built (very) experimental TensorFlow support and will revisit that after our initial release. More details, including initial throughput numbers on AMD: http://vertex.ai/blog/bringing-deep-learning-to-opencl

Cool!

Tiny nitpick: AFAIK, MIOpen is not AMD-specific, as it can link to OpenCL as well as to ROCm. The latter is probably faster, but still. MIOpen is a huge step forward for "Open Source Neural Networks On GPU", and AMD deserve huge credit for it if it works well on OpenCL.

@cathalgarvey Thanks for the correction. I based my comments on the system requirements in the MIOpen documentation (https://rocmsoftwareplatform.github.io/MIOpen/doc/html/install.html#prerequisites), but I'm happy to update if there's a better link.

Wait, I've been reading this thread/issue for 10 minutes now. I got halfway through and skipped the rest. Are AMD GPUs supported yet?

Using a finicky closed-source thing that only works on one very old combination of kernel/OS (Codeplay): yes.

Using an old version of tensorflow, without support for some nonlinearities yet (tf-coriander): yes.

Really: not officially. AMD are working on a port to HIP, though, so I'd expect progress within 3 months or so. Other frameworks already work well thanks to their efforts.

FWIW, I believe recent versions of PyGpu can use either CUDA or OpenCL. I have all the software installed on my Arch box but I haven't tested it yet.

@abrad1212 Yes, this issue has been around for a while now. The effort is massive, and a lot of people are trying to "get it working", as @cathalgarvey mentioned.

A bit of an update from our side. You should be able to use ComputeCpp 0.3.0 on the AMDGPU-pro driver stack for Ubuntu 16.04. The instructions can be found at http://deep-beta.co.uk/tensorflow-1-3-on-ubuntu-16

We are also currently focusing on performance improvements for different models. There is a lot to do, but we are getting there.

@lukeiwanski What is your approach to benchmarking? We time the models included with Keras and normalize against TF+cuDNN+K80, because that is a common and well-optimized configuration. Our methodology is similar to Max Woolf's (http://minimaxir.com/2017/06/keras-cntk/). It's not much code, but we'd be happy to share it. We have some throughput numbers on our web site (http://vertex.ai); our code is very slightly faster than TF 1.2 on Xception inference, and it would be interesting to compare more approaches side by side.
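The "time the model, then normalize against a reference configuration" idea described above can be sketched in a few lines. This is not Vertex.AI's benchmarking code, only an illustration under stated assumptions: run_batch stands in for whatever performs one inference batch, and the names measure_throughput and normalize, as well as the numbers in the usage part, are made up for the example.

```python
import time


def measure_throughput(run_batch, batch_size, repeats=5):
    """Time `run_batch()` `repeats` times and return items processed per second."""
    run_batch()  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        run_batch()
    elapsed = time.perf_counter() - start
    return (batch_size * repeats) / elapsed


def normalize(throughputs, baseline_key):
    """Express each throughput as a multiple of the baseline configuration."""
    baseline = throughputs[baseline_key]
    return {name: value / baseline for name, value in throughputs.items()}


if __name__ == "__main__":
    # Placeholder numbers, not measurements from the thread.
    results = {"TF+cuDNN+K80": 100.0, "candidate": 80.0}
    print(normalize(results, "TF+cuDNN+K80"))
```

Reporting ratios against one fixed baseline keeps results comparable across machines, which is presumably why a common configuration such as TF+cuDNN+K80 was chosen as the reference.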

Are there any Windows solutions? I would install Ubuntu on my PC, but I currently don't have enough space to do so.

Ubuntu 14.04
TensorFlow master branch
Built with OpenCL support, with only the Intel OpenCL CPU runtime installed.
Python 2.7
Followed the https://developer.codeplay.com/computecppce/latest/getting-started-with-tensflow guide.
Ran python classify_image.py
It does not seem to call the OpenCL driver. (I added my own OpenCL ICD wrapper and saw nothing.)
Is there any configuration that needs to be added in the Python code?
Like sess.graph.device('/cpu0')

But if I use the Eigen SYCL guide, it can run on the CPU with OpenCL support. (The code in this guide is also a little out of date and needs some modification.)
https://developer.codeplay.com/computecppce/latest/getting-started-with-eigen

Can anyone help check how the tensorflow Python interface can also run with OpenCL support?

And building tensorflow with this option set does not really generate the tensorflow binary: --config=sycl
Just build tensorflow with this command:
bazel build -c opt //tensorflow/tools/pip_package:build_pip_package

Maybe I forgot --config=sycl when building.
I will try the build command and verify whether it can call the OpenCL library. I will post here once I have a result.
bazel build -c opt tensorflow/tools/pip_package:build_pip_package

@joe8086 If you modify the tf.Session creation as below, it will show a log in the terminal. Does it mention SYCL anywhere?
tf.Session(config=tf.ConfigProto(log_device_placement=True))
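Once the log has been captured to a file or a string, checking whether it mentions SYCL can be automated. A minimal sketch, assuming the log is available as plain text; the helper name lines_mentioning is made up, and the two sample lines are modelled on the classify_image.py log posted further down this thread rather than on device placement output.

```python
def lines_mentioning(log_text, needle="sycl"):
    """Return the log lines that contain `needle`, ignoring case."""
    needle = needle.lower()
    return [line for line in log_text.splitlines() if needle in line.lower()]


# Sample text for illustration only; real output depends on the build.
SAMPLE_LOG = """\
I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
W ./tensorflow/core/common_runtime/sycl/sycl_device.h:49] No OpenCL GPU found that is supported by ComputeCpp, trying OpenCL CPU
"""

if __name__ == "__main__":
    hits = lines_mentioning(SAMPLE_LOG)
    print("SYCL mentioned" if hits else "no SYCL lines found")
```

These messages are normally written to stderr, so the script's stderr has to be redirected when capturing the log.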

For the Eigen guide, do you have any specific feedback on where it is out of date?

@rodburns Thanks.
My mistake was building tensorflow without the --config=sycl option.
After adding this option, with the code from this branch https://github.com/lukeiwanski/tensorflow.git
I can see tensorflow running with the OpenCL backend.

For the Eigen guide, the main errors are:
1, it does not give the correct include files.
2, for array, Tensor and TensorMap, it does not give the correct template parameters.
3, for static_cast, it does not give the data type.

Some more information that may help this discussion:
1, main tensorflow cannot build correctly with --config=sycl.
2, with CPU OpenCL, the run takes about 4x~8x as long as the normal CPU implementation in my environment.

time python classify_image.py
2017-09-07 16:56:29.076054: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
2017-09-07 16:56:29.077967: W ./tensorflow/core/common_runtime/sycl/sycl_device.h:49] No OpenCL GPU found that is supported by ComputeCpp, trying OpenCL CPU
2017-09-07 16:56:29.159775: I ./tensorflow/core/common_runtime/sycl/sycl_device.h:66] Found following OpenCL devices:
2017-09-07 16:56:29.159825: I ./tensorflow/core/common_runtime/sycl/sycl_device.h:68] id: 0, type: CPU, name: Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz, vendor: Intel(R) Corporation, profile: FULL_PROFILE
2017-09-07 16:56:30.213375: W ./tensorflow/core/framework/op_def_util.cc:333] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.89107)
indri, indris, Indri indri, Indri brevicaudatus (score = 0.00779)
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens (score = 0.00296)
custard apple (score = 0.00147)
earthstar (score = 0.00117)

real 1m44.473s
user 2m8.980s
sys 1m20.024s
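To compare runs like this one, the three figures printed by time can be converted to seconds. A minimal sketch; the helper name to_seconds is made up, and it assumes the shell's usual XmY.ZZZs format. It only relates CPU time to wall-clock time for this single run; reproducing the 4x~8x comparison would also need a timing from the plain CPU build, which the post does not include.

```python
import re


def to_seconds(stamp):
    """Convert a shell `time` value such as '1m44.473s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError("unexpected time format: %r" % stamp)
    return int(match.group(1)) * 60 + float(match.group(2))


# Figures copied from the `time` output above.
TIMES = {"real": "1m44.473s", "user": "2m8.980s", "sys": "1m20.024s"}

if __name__ == "__main__":
    secs = {name: to_seconds(value) for name, value in TIMES.items()}
    cpu_time = secs["user"] + secs["sys"]
    print("wall clock: %.3f s" % secs["real"])
    print("CPU time:   %.3f s" % cpu_time)
    print("CPU time / wall clock: %.2f" % (cpu_time / secs["real"]))
```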

Guys, I'm not going to read this whole thread, but if someone could answer my question that would be great! Can I use TensorFlow with an AMD GPU yet? If so, on what operating system, and can I do it with an RX Vega? Thanks!

@M3L0NM4N Hmm ... I haven't been following the thread, but it looks like there is now testable OpenCL code, at least on CPU OpenCL. I have an older AMD GPU ("Bonaire") and I have OpenCL running on both the GPU and the CPU, so I can test this. I might take a shot at it over the weekend. I really want OpenCL TensorFlow on my GPU.

tf-coriander works: https://github.com/hughperkins/tf-coriander

Is there any tensorflow 1.3 GPU/OpenCL support on macOS?

Latest news: I have successfully built TensorFlow 1.3.1 with OpenCL from the GitHub source. There are quite a few missing pieces in the documentation, and I haven't tried to run anything on the GPU yet, but it is at least working on the non-OpenCL CPU. BTW, I do not have CPU OpenCL installed, just GPU OpenCL.

Does anyone have any test cases for TensorFlow with an OpenCL GPU? I'll have to build one for myself eventually, but I was hoping for a quick check.

@znmeb Yes, there is a test app in the issue I reported. https://github.com/hughperkins/tf-coriander/issues/64

Could you please let me know if it works in your case?

@unoexperto Yeah, it works (it doesn't crash), but there's no indication of whether or not it found OpenCL.

python ./hello-tensorflow.py
b'Hello, TensorFlow!'

I think the best course of action here is to file a separate issue to request documentation, since it's clear (when you run ./configure while building from source) that there is code for OpenCL. That's how I found it, anyhow.

@znmeb I'm doubtful it found a GPU device in your case, because in mine it printed debug info at the beginning about selecting the GPU device. Perhaps you can recompile with a printf to the console added somewhere in tensorflow/core/common_runtime/gpu/gpu_device.cc.

@unoexperto I joined the discussion Google Group and posted a request for documentation. I'm going to wait and see whether anyone responds before I put more effort into this.

@znmeb What instructions are you following? Have you run clinfo? Have you run computecpp_info? Does that indicate that your OpenCL drivers are installed as expected? The instructions for Ubuntu 14.04 are at https://developer.codeplay.com/computecppce/latest/getting-started-with-tensflow and if you are using 16.04 there are some experimental instructions at http://deep-beta.co.uk/tensorflow-1-3-on-ubuntu-16-04-lts/

@rodburns clinfo and clpeak both run. I haven't done this recently, but when I build caffe from source and run the tests, it definitely hits the GPU. So I'm pretty sure the OpenCL / GPU drivers / libraries are working.

I'm on Arch Linux. The kernel is their LTS, linux-lts 4.9.52-1. If it matters, the "Bonaire" peaks at about 1.7 TFLOPS in 32-bit mode and is in the "Sea Island" family of AMD GPUs.

bin/computecpp_info 
********************************************************************************

ComputeCpp Info (CE 0.3.2)

********************************************************************************

Toolchain information:

GLIBC version: 2.26
GLIBCXX: 20160609
This version of libstdc++ is supported.

********************************************************************************


Device Info:

Discovered 1 devices matching:
  platform    : <any>
  device type : <any>

--------------------------------------------------------------------------------
Device 0:

  Device is supported                     : UNTESTED - Untested OS
  CL_DEVICE_NAME                          : Bonaire
  CL_DEVICE_VENDOR                        : Advanced Micro Devices, Inc.
  CL_DRIVER_VERSION                       : 2442.7
  CL_DEVICE_TYPE                          : CL_DEVICE_TYPE_GPU 

If you encounter problems when using any of these OpenCL devices, please consult
this website for known issues:
https://computecpp.codeplay.com/releases/v0.3.2/platform-support-notes

Is anyone collecting test logs? It says my device is untested, so I'll be testing it. ;-)
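For anyone who does want to collect such reports, here is a minimal Python sketch that pulls the per-device fields out of computecpp_info output like the listing above. It assumes the text layout shown in this post (CE 0.3.2); the function name parse_computecpp_info and the regular expressions are illustrative and not part of ComputeCpp.

```python
import re

# Only the field names that appear in the output above are recognised.
FIELD_RE = re.compile(r"\s*(Device is supported|CL_[A-Z_]+)\s*:\s*(.*?)\s*$")
HEADER_RE = re.compile(r"\s*Device (\d+):\s*$")


def parse_computecpp_info(text):
    """Return one dict per 'Device N:' block found in computecpp_info output."""
    devices = []
    current = None
    for line in text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = {"index": int(header.group(1))}
            devices.append(current)
            continue
        field = FIELD_RE.match(line)
        if current is not None and field:
            current[field.group(1)] = field.group(2)
    return devices


# Sample copied from the listing above (shortened).
sample = """\
Device Info:

Discovered 1 devices matching:
  platform    : <any>
  device type : <any>

Device 0:

  Device is supported                     : UNTESTED - Untested OS
  CL_DEVICE_NAME                          : Bonaire
  CL_DEVICE_VENDOR                        : Advanced Micro Devices, Inc.
  CL_DRIVER_VERSION                       : 2442.7
  CL_DEVICE_TYPE                          : CL_DEVICE_TYPE_GPU
"""

for dev in parse_computecpp_info(sample):
    is_gpu = dev.get("CL_DEVICE_TYPE") == "CL_DEVICE_TYPE_GPU"
    print(dev["index"], dev.get("CL_DEVICE_NAME"),
          "GPU" if is_gpu else "not a GPU",
          "| support:", dev.get("Device is supported"))
```

In practice the text would come from running bin/computecpp_info and capturing its standard output; other ComputeCpp versions may print different fields.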

I can't manage to build TensorFlow for SYCL/OpenCL!

Configuration:
Ubuntu 16.04
TensorFlow r1.3
OpenCL 2.0
ComputeCpp CE 0.3.2 (computecpp_info OK)
Intel HD Graphics 620
Bazel 0.5.4

Installation instructions (OpenCL Intel / ComputeCpp build):
https://software.intel.com/en-us/articles/opencl-drivers#philinux
https://www.codeplay.com/portal/03-30-17-setting-up-tensorflow-with-opencl-using-sycl

Error:

ERROR: /home/erwang/workspace/ia/tf_original/tensorflow/tensorflow/core/kernels/BUILD:1695:1: C++ compilation of rule '//tensorflow/core/kernels:adjust_contrast_op' failed (Exit 1)
In file included from tensorflow/core/kernels/adjust_contrast_op.cc:19:
In file included from ./tensorflow/core/kernels/adjust_contrast_op.h:18:
In file included from ./third_party/eigen3/unsupported/Eigen/CXX11/Tensor:1:
In file included from external/eigen_archive/unsupported/Eigen/CXX11/Tensor:14:
In file included from external/eigen_archive/Eigen/Core:299:
In file included from external/local_config_sycl/crosstool/../sycl/include/SYCL/sycl.hpp:20:
In file included from external/local_config_sycl/crosstool/../sycl/include/SYCL/sycl_interface.h:54:
external/local_config_sycl/crosstool/../sycl/include/SYCL/multi_pointer.h:342:3: error: multiple overloads of 'global_ptr' instantiate to the same signature 'void (pointer_t)' (aka 'void (__attribute__((address_space(1))) float *)')

Training models on my CPU takes ages; I really need OpenCL/GPU acceleration...

@ErwanGalline We are in the process of upstreaming changes to Eigen ( https://bitbucket.org/benoitsteiner/opencl/pull-requests/16/changes-required-for-new-computecpp-ce/diff#comment-None ) that will fix the issue you are seeing.

We are also preparing to upstream performance improvements to Eigen. This is a bit tricky and needs coordination with @benoitsteiner to avoid a stream of merge conflicts, but we are getting there.

For AMD users, I would suggest trying my fork: https://github.com/lukeiwanski/tensorflow/tree/dev/amd_gpu
Setup instructions for Ubuntu 16.04 can be found here: http://deep-beta.co.uk/tensorflow-1-3-on-ubuntu-16-04-lts/
All the changes will be upstreamed to TensorFlow once the Eigen changes mentioned earlier are in place.

Hope that helps.

@lukeiwanski Does your fork only support AMD R9 Nano / AMD FirePro GPUs?

@lukeiwanski Is there a test case I can use to verify that I'm using the GPU? I can monitor it with radeontop, but I'd like something that uses TensorFlow itself.

@ZixuanLiang No, not only those.
We currently test on AMD (R9 380, R9 Nano, FirePro). We know that the Intel GPU exposes some driver bugs, but fixes are coming. We have also announced Renesas R-Car, and we expect more to follow.

I believe Xilinx is upstreaming support for triSYCL ( https://github.com/tensorflow/tensorflow/pull/12882 ), so FPGAs (?). @keryell should know more about that.

@znmeb bazel test -c opt --config=sycl --test_output=all //tensorflow/python/kernel_tests:basic_gpu_test should be a fair verification. The output should look something like this:
INFO: From Testing //tensorflow/python/kernel_tests:basic_gpu_test:
==================== Test output for //tensorflow/python/kernel_tests:basic_gpu_test:
2017-10-05 10:53:52.727745: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2017-10-05 10:53:53.059908: I ./tensorflow/core/common_runtime/sycl/sycl_device.h:66] Found following OpenCL devices:
2017-10-05 10:53:53.059926: I ./tensorflow/core/common_runtime/sycl/sycl_device.h:68] id: 0, type: GPU, name: Tonga, vendor: Advanced Micro Devices, Inc., profile: FULL_PROFILE
.....
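If you would rather check the test output programmatically than by eye, a small Python sketch along these lines can pick out the device line. It assumes the log format shown above (the "id: ..., type: ..., name: ..." line printed from sycl_device.h); the names sycl_devices and found_gpu are made up for this example and are not TensorFlow APIs.

```python
import re

# Matches the per-device line printed from sycl_device.h in the log above.
DEVICE_RE = re.compile(
    r"id: (?P<id>\d+), type: (?P<type>\w+), name: (?P<name>.+?), "
    r"vendor: (?P<vendor>.+?), profile: (?P<profile>\w+)"
)


def sycl_devices(log_text):
    """Return a list of dicts, one per OpenCL device reported in the log."""
    return [m.groupdict() for m in DEVICE_RE.finditer(log_text)]


def found_gpu(log_text):
    """True if at least one reported device has type GPU."""
    return any(dev["type"] == "GPU" for dev in sycl_devices(log_text))


# Two lines taken from the expected output above.
log = (
    "I ./tensorflow/core/common_runtime/sycl/sycl_device.h:66] "
    "Found following OpenCL devices:\n"
    "I ./tensorflow/core/common_runtime/sycl/sycl_device.h:68] "
    "id: 0, type: GPU, name: Tonga, vendor: Advanced Micro Devices, Inc., "
    "profile: FULL_PROFILE\n"
)
print(sycl_devices(log))
print("GPU found:", found_gpu(log))
```

A run that silently fell back to the CPU would not print a GPU device line, so found_gpu would return False on its log.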

@lukeiwanski Thanks, I'll try it on an AMD GPU.

@lukeiwanski The build and test seem to be working on my Bonaire. I'm using Python 3.6, though, and the instructions use Python 2.7. Do I need to use 2.7, or will 3.6 work?

@znmeb Following https://github.com/tensorflow/tensorflow/issues/6533#issuecomment-273852647 it looks like Python 3.6 should work.

@lukeiwanski Is there currently a ComputeCpp version that can build TF?
I tried various versions between 0.3.2 and 0.1.4 and none of them worked. They all ended with the "multiple overloads of 'global_ptr' instantiate to the same signature" error.
Btw, I can't find the TensorDeviceSycl.h file in the TF sources. Has it been renamed? Is it possible to apply the patch to the current sources?

Thanks in advance.

@eLvErDe ComputeCpp 0.3.2 can build https://github.com/lukeiwanski/tensorflow/tree/dev/amd_gpu

Upstream is missing the Eigen patch that fixes it. See https://github.com/tensorflow/tensorflow/issues/22#issuecomment-334154564

Any idea how to inject this Eigen patch during the bazel build? Maybe we should bump the Eigen tgz version somewhere to get the fixed one?

Thanks, Adam.

https://github.com/lukeiwanski/tensorflow/commit/8468d65e87e083337f18616f75ac56d3296d6ab1

Would this commit be enough to get it building?

Yes, you should be able to cherry-pick that.

Sadly, that's clearly not sufficient. Here are some of the next build failures:

external/eigen_archive/Eigen/src/Core/util/BlasUtil.h:63:63: error: no type named 'ReturnType' in 'Eigen::ScalarBinaryOpTraits<cl::sycl::vec<float, 4>, std::complex<float>, Eigen::internal::scalar_product_op<cl::sycl::vec<float, 4>, std::complex<float> > >'
  typedef typename ScalarBinaryOpTraits<LhsScalar,RhsScalar>::ReturnType Scalar;
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
external/eigen_archive/Eigen/src/Core/util/BlasUtil.h:69:34: error: invalid operands to binary expression ('const cl::sycl::vec<float, 4>' and 'const std::complex<float>')
  { return conj_if<ConjLhs>()(x) *  conj_if<ConjRhs>()(y); }
           ~~~~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~~~~~~~~~~~~~~

@eLvErDe There are a few commits that you have to apply to get it compiling.
I would suggest using the tip of dev/amd_gpu or, if you don't want to change your current branch, merging dev/amd_gpu into it.

Actually, I'm working on unofficial Debian/Ubuntu packages, so I'm trying to stay close to the official 1.3.1 release. I can live without OpenCL support, but I'd be ready to enable it as soon as it's correctly supported. Maybe I'll update the packages against your branch for testing purposes, but that's enough for today ;)

My mining rigs have about 10 different kinds of AMD GPUs (from the 7970 to the RX 480, running Ubuntu 16.04 and amdgpu-pro). Let me know if I can contribute by testing anything.

"Let me know if I can contribute by testing anything."
How about https://github.com/ROCmSoftwarePlatform/hipCaffe
https://github.com/ROCmSoftwarePlatform/hipeigen

@lukeiwanski Does your fork also support AMD GPUs on macOS?

Hi,
I was building the TensorFlow API on Ubuntu 16.04 x64 for an Android device with its GPU (Mali-T720) enabled.

My OS info:
Ubuntu 16.04 x64
Computer GPU: NVIDIA 1080Ti
CUDA 8.0
CUDNN 5.1 (though I don't use CUDA or cuDNN for this build)
Bazel 0.5.2
ComputeCpp CE 0.3.2

My build.sh is:
'
bazel build -c opt --config=sycl //tensorflow/contrib/android:libtensorflow_cc.so --cxxopt="-std=c++11" --cxxopt="-DTENSORFLOW_DISABLE_META" --verbose_failures --crosstool_top=//external:android/crosstool --host_crosstool_top=@bazel_tools//tools/cpp:toolchain --cpu=armeabi-v7a
'
Before building, I export LD_LIBRARY_PATH=my_sycl_lib_path=$LD_LIBRARY_PATH. Building without ' --config=sycl ' is fine and I get a correct libtensorflow_cc.so, but with ' --config=sycl ' the build ends with a missing -lComputeCpp error.

The full log follows:

ERROR: /home/e0024/workspace/tensorflow/tensorflow/contrib/android/BUILD:102:1: Linking of rule '//tensorflow/contrib/android:libtensorflow.so' failed: link_dynamic_library.sh failed: error executing command
(cd /home/e0024/.cache/bazel/_bazel_e0024/783dad02ec856015f56356584726dd10/execroot/org_tensorflow && \
exec env - \
COMPUTECPP_TOOLKIT_PATH=/home/e0024/workspace/source/computeCppForSYCL1.2 \
HOST_CXX_COMPILER=/usr/bin/g++ \
HOST_C_COMPILER=/usr/bin/gcc \
LD_LIBRARY_PATH=/home/e0024/workspace/source/computeCppForSYCL1.2/lib:/home/e0024/workspace/caffe/build/lib:/home/e0024/workspace/cudnn/lib64: \
PATH=/home/e0024/bin:/home/e0024/.local/bin:/home/e0024/workspace/Anaconda2/bin:/opt/cuda:/home/e0024/workspace/source/protoc-3.3.0-linux-x86_64/bin:/home/e0024/workspace/bazel/output:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin \
PWD=/proc/self/cwd \
PYTHON_BIN_PATH=/home/e0024/workspace/Anaconda2/bin/python \
PYTHON_LIB_PATH=/home/e0024/workspace/Anaconda2/lib/python2.7/site-packages \
TF_NEED_CUDA=0 \
TF_NEED_OPENCL=1 \
external/bazel_tools/tools/cpp/link_dynamic_library.sh ignored ignored ignored external/androidndk/ndk/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/arm-linux-androideabi-gcc -shared -o bazel-out/arm-linux-androideabi-4.9-v7a-gnu-libstdcpp-opt/bin/tensorflow/contrib/android/libtensorflow.so '-Wl,-rpath,$ORIGIN/../../../_solib_armeabi-v7a/_U@local_Uconfig_Usycl_S_Ssycl_Csyclrt___Uexternal_Slocal_Uconfig_Usycl_Ssycl_Slib' -Lbazel-out/arm-linux-androideabi-4.9-v7a-gnu-libstdcpp-opt/bin/_solib_armeabi-v7a/_U@local_Uconfig_Usycl_S_Ssycl_Csyclrt___Uexternal_Slocal_Uconfig_Usycl_Ssycl_Slib -Wl,-whole-archive bazel-out/arm-linux-androideabi-4.9-v7a-gnu-libstdcpp-opt/bin/tensorflow/c/libc_api.a -Wl,-no-whole-archive -Wl,-whole-archive bazel-out/arm-linux-androideabi-4.9-v7a-gnu-libstdcpp-opt/bin/tensorflow/core/libandroid_tensorflow_lib.lo -Wl,-no-whole-archive -Wl,-whole-archive bazel-out/arm-linux-androideabi-4.9-v7a-gnu-libstdcpp-opt/bin/tensorflow/core/kernels/libandroid_tensorflow_kernels.lo -Wl,-no-whole-archive -Wl,-whole-archive bazel-out/arm-linux-androideabi-4.9-v7a-gnu-libstdcpp-opt/bin/tensorflow/core/libandroid_tensorflow_lib_lite.lo -Wl,-no-whole-archive -Wl,-whole-archive bazel-out/arm-linux-androideabi-4.9-v7a-gnu-libstdcpp-opt/bin/tensorflow/core/libprotos_all_cc.a -Wl,-no-whole-archive -Wl,-whole-archive bazel-out/arm-linux-androideabi-4.9-v7a-gnu-libstdcpp-opt/bin/external/protobuf/libprotobuf.a -Wl,-no-whole-archive -Wl,-whole-archive bazel-out/arm-linux-androideabi-4.9-v7a-gnu-libstdcpp-opt/bin/external/protobuf/libprotobuf_lite.a -Wl,-no-whole-archive -lComputeCpp external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/libs/armeabi-v7a/libgnustl_static.a external/androidndk/ndk/sources/cxx-stl/gnu-libstdc++/4.9/libs/armeabi-v7a/libsupc++.a -landroid -llog -lm -z defs -s -Wl,--gc-sections '-Wl,-soname=libtensorflow.so' -Wl,--version-script tensorflow/c/version_script.lds -lz
-static-libgcc -no-canonical-prefixes '-march=armv7-a' -Wl,--fix-cortex-a8 '--sysroot=external/androidndk/ndk/platforms/android-14/arch-arm'): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.
external/androidndk/ndk/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/../lib/gcc/arm-linux-androideabi/4.9/../../../../arm-linux-androideabi/bin/ld: warning: skipping incompatible bazel-out/arm-linux-androideabi-4.9-v7a-gnu-libstdcpp-opt/bin/_solib_armeabi-v7a/_U@local_Uconfig_Usycl_S_Ssycl_Csyclrt___Uexternal_Slocal_Uconfig_Usycl_Ssycl_Slib/libComputeCpp.so while searching for ComputeCpp
external/androidndk/ndk/toolchains/arm-linux-androideabi-4.9/prebuilt/linux-x86_64/bin/../lib/gcc/arm-linux-androideabi/4.9/../../../../arm-linux-androideabi/bin/ld: error: cannot find -lComputeCpp
collect2: error: ld returned 1 exit status
Target //tensorflow/contrib/android:libtensorflow.so failed to build
INFO: Elapsed time: 617.736s, Critical Path: 54.66s

Hmm... I want to build the TensorFlow API for Arm with the GPU (Mali-T720) enabled.
I'd appreciate it if anyone could leave some experience or suggestions here. Thanks a lot.

Come to my talk at Arm TechCon next week, @laMia482! http://schedule.armtechcon.com/session/running-tensorflow-machine-learning-on-arm-embedded-hardware/850230

You need Mali drivers with SPIR-V support, which probably aren't easily available yet. You also need a ComputeCpp runtime for Android with Arm CPU support and SPIR-V support, which isn't available (yet) either. So you'll have to be just a _little_ bit patient.

We (Vertex.AI) have just open-sourced PlaidML, our deep learning stack with support for running Keras on OpenCL. TensorFlow support is coming, and help there would be welcome. And yes, Mac support is on the way (Windows too). http://vertex.ai/blog/announcing-plaidml @ggaabe

@choongng I wanted to give it a try but failed.
pip search plaidml returns

plaidml (0.1.0rc3)        - PlaidML machine learning accelerator

But pip install plaidml or pip install plaidml==0.1.0rc3
reports

Could not find a version that satisfies the requirement plaidml (from versions: )
No matching distribution found for plaidml

@hy9be I think it would be more appropriate to raise the issue in the plaidml repository rather than here, since this issue is about supporting OpenCL in TensorFlow. Also, looking at the installation instructions there, your pip install command may not be correct.

Thank you, @andrewrichards, for your attention and your session talk.

For now, though, what do I (a graduate student) need to do to build an app using TensorFlow on an Android device with the GPU (Mali-T720) enabled, that is, to obtain a Mali driver with SPIR-V support and a ComputeCpp runtime for Android with Arm CPU support and SPIR-V support?

Since I downloaded ComputeCpp (Ubuntu 16.04 x64, with bin/ doc/ include/ lib/) from the Codeplay homepage, yesterday I ran:
bazel build -c opt --config=sycl //tensorflow/contrib/android:libtensorflow_cc.so --cxxopt="-std=c++11" --cxxopt="-DTENSORFLOW_DISABLE_META" --verbose_failures --crosstool_top=//external:android/crosstool --host_crosstool_top=@bazel_tools//tools/cpp:toolchain --cpu=armeabi-v7a
The error said libComputeCpp.so is incompatible, so I think I need a ComputeCpp for Android with Arm CPU support and SPIR-V support, but I couldn't find any source code for building an Android ComputeCpp; there are only samples on GitHub.

You also said that ComputeCpp for Android is not available right now. Is there a plan to support Android devices, and if it is supported, how can I get it?

For AMD GPU and Linux users: AMD recently released a HIP port of TensorFlow. You might be interested.

I haven't tested it, though.

I can test it; stay tuned. It looks like it's failing CI, though.

Indeed, it's failing. It's still at an early stage, I guess.

I tested it and got a segfault in the MNIST example right away.
I don't know what I'm doing wrong here.

$ python ./convolutional.py 
I tensorflow/stream_executor/dso_loader.cc:130] Couldn't open CUDA library libhipblas.so. LD_LIBRARY_PATH: :/home/masa/project/rendering/RadeonProRender-Baikal/Bin/Release/x64:/usr/local/lib64:/opt/CodeXL_2.5-25:/usr/lib/x86_64-linux-gnu/:/opt/CodeXL_2.5-25/RuntimeLibs/QT/
I tensorflow/stream_executor/cuda/cuda_blas.cc:2305] Unable to load HIPBLAS DSO.
I tensorflow/stream_executor/dso_loader.cc:130] Couldn't open CUDA library libhipfft.so. LD_LIBRARY_PATH: :/home/masa/project/rendering/RadeonProRender-Baikal/Bin/Release/x64:/usr/local/lib64:/opt/CodeXL_2.5-25:/usr/lib/x86_64-linux-gnu/:/opt/CodeXL_2.5-25/RuntimeLibs/QT/
I tensorflow/stream_executor/cuda/cuda_fft.cc:344] Unable to load cuFFT DSO.
I tensorflow/stream_executor/dso_loader.cc:139] successfully opened CUDA library libhip_hcc.so locally
I tensorflow/stream_executor/dso_loader.cc:130] Couldn't open CUDA library libhiprng.so. LD_LIBRARY_PATH: :/home/masa/project/rendering/RadeonProRender-Baikal/Bin/Release/x64:/usr/local/lib64:/opt/CodeXL_2.5-25:/usr/lib/x86_64-linux-gnu/:/opt/CodeXL_2.5-25/RuntimeLibs/QT/
I tensorflow/stream_executor/cuda/cuda_rng.cc:338] Unable to load cuRAND DSO.
I tensorflow/stream_executor/dso_loader.cc:139] successfully opened CUDA library libMIOpen.so locally
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
W tensorflow/stream_executor/cuda/cuda_driver.cc:633] creating context when one is currently active; existing: 0x7f94fa357e90
I tensorflow/core/common_runtime/gpu/gpu_device.cc:892] Found device 0 with properties: 
name: Fiji [Radeon R9 FURY / NANO Series]
major: 2 minor: 0 memoryClockRate (GHz) 1
pciBusID 1๏ฟฝ๏ฟฝ๏ฟฝ๏ฟฝ
Total memory: 4.00GiB
Free memory: 3.75GiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:913] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:972] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Fiji [Radeon R9 FURY / NANO Series], pci bus id: 1๏ฟฝ๏ฟฝ๏ฟฝ๏ฟฝ)
Initialized!
I tensorflow/core/kernels/conv_ops.cc:604] running auto-tune for Convolve
Invoking clang-ocl on "/tmp/miopen-MIOpenUtilKernels.cl-c377-1df5-8b6a-884c/MIOpenUtilKernels.cl"
/opt/rocm/bin/clang-ocl -DNUM_CH_PER_WG=1 -DNUM_IM_BLKS_X=1 -DNUM_IM_BLKS=4 -DLOCAL_MEM_SIZE=432 -DSTRIDE_GT_1=0 -DTILE_SZ_X=32 -DTILE_SZ_Y=8 -DUSE_IM_OFF_GUARD=1 -mcpu=gfx803 -Wno-everything MIOpenUtilKernels.cl -o /tmp/miopen-MIOpenUtilKernels.cl-c377-1df5-8b6a-884c/MIOpenUtilKernels.cl.o
writing gemm kernel to "/tmp/miopen-tinygemm.cl-836e-c4d4-abd3-b292/tinygemm.cl"
Invoking clang-ocl on "/tmp/miopen-tinygemm.cl-836e-c4d4-abd3-b292/tinygemm.cl"
/opt/rocm/bin/clang-ocl -mcpu=gfx803 -Wno-everything tinygemm.cl -o /tmp/miopen-tinygemm.cl-836e-c4d4-abd3-b292/tinygemm.cl.o
GCN assember path: /opt/rocm/opencl/bin/x86_64/clang
Arugment: --version 
Invoking clang-ocl on "/tmp/miopen-MIOpenConvDirUniC.cl-f5fc-85f4-7079-a024/MIOpenConvDirUniC.cl"
/opt/rocm/bin/clang-ocl -DMLO_HW_WAVE_SZ=64 -DMLO_DIR_FORWARD=1 -DMLO_FILTER_SIZE0=5 -DMLO_FILTER_SIZE1=5 -DMLO_FILTER_PAD0=2 -DMLO_FILTER_PAD1=2 -DMLO_N_OUTPUTS=32 -DMLO_N_INPUTS=1 -DMLO_BATCH_SZ=64 -DMLO_OUT_WIDTH=28 -DMLO_OUT_HEIGHT=28 -DMLO_OUT_BATCH_STRIDE=25088 -DMLO_OUT_CHANNEL_STRIDE=784 -DMLO_OUT_STRIDE=28 -DMLO_IN_WIDTH=28 -DMLO_IN_HEIGHT=28 -DMLO_IN_BATCH_STRIDE=784 -DMLO_IN_CHANNEL_STRIDE=784 -DMLO_IN_STRIDE=28 -DMLO_IN_TILE0=28 -DMLO_IN_TILE1=8 -DMLO_OUT_TILE0=28 -DMLO_OUT_TILE1=8 -DMLO_GRP_TILE0=16 -DMLO_GRP_TILE1=8 -DMLO_ACTIVE_ALUS=112 -DMLO_N_ALUTILES_PERSTACK=2 -DMLO_OUT_PIX_TILE0=2 -DMLO_OUT_PIX_TILE1=2 -DMLO_N_STACKS=1 -DMLO_N_OUT_TILES=8 -DMLO_N_OUT_TILES_PERSTACK=16 -DMLO_N_IN_TILES_PERSTACK=1 -DMLO_N_READ_PROCS=128 -DMLO_CONV_BIAS=0 -DMLO_ALU_VTILE0=14 -DMLO_ALU_VTILE1=4 -mcpu=gfx803 -Wno-everything MIOpenConvDirUniC.cl -o /tmp/miopen-MIOpenConvDirUniC.cl-f5fc-85f4-7079-a024/MIOpenConvDirUniC.cl.o
Invoking clang-ocl on "/tmp/miopen-MIOpenConvFFT.cl-2fbf-2ba2-0088-ebfc/MIOpenConvFFT.cl"
/opt/rocm/bin/clang-ocl -DCFF_TRANSP_WT_MOD16=1 -DCFF_CGEMM_CHOICE_0=1 -DCFF_IMG_SZ_28_28 -DCFF_IMG_H=28 -DCFF_IMG_W=28 -DCFF_BATCH=64 -DCFF_NFILTER=32 -DCFF_CHANNELS=1 -DCFF_HALFW=1148928 -mcpu=gfx803 -Wno-everything MIOpenConvFFT.cl -o /tmp/miopen-MIOpenConvFFT.cl-2fbf-2ba2-0088-ebfc/MIOpenConvFFT.cl.o
Segmentation fault (core dumped)
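The log above is long, so here is a minimal Python sketch that summarises which shared libraries the loader opened and which it could not find. It assumes the two dso_loader.cc message formats visible in this log; the function name summarize_dso_loading is made up for this example. It only reports what the log says and does not establish the cause of the segfault.

```python
import re

# Message formats taken from the dso_loader.cc lines in the log above.
FAILED_RE = re.compile(r"Couldn't open CUDA library (\S+)\. ")
OPENED_RE = re.compile(r"successfully opened CUDA library (\S+) locally")


def summarize_dso_loading(log_text):
    """Return the libraries the loader opened and those it failed to open."""
    return {
        "opened": OPENED_RE.findall(log_text),
        "failed": FAILED_RE.findall(log_text),
    }


# Shortened excerpt of the log above (LD_LIBRARY_PATH values trimmed).
log = """\
I tensorflow/stream_executor/dso_loader.cc:130] Couldn't open CUDA library libhipblas.so. LD_LIBRARY_PATH: :/usr/local/lib64
I tensorflow/stream_executor/dso_loader.cc:130] Couldn't open CUDA library libhipfft.so. LD_LIBRARY_PATH: :/usr/local/lib64
I tensorflow/stream_executor/dso_loader.cc:139] successfully opened CUDA library libhip_hcc.so locally
I tensorflow/stream_executor/dso_loader.cc:130] Couldn't open CUDA library libhiprng.so. LD_LIBRARY_PATH: :/usr/local/lib64
I tensorflow/stream_executor/dso_loader.cc:139] successfully opened CUDA library libMIOpen.so locally
"""

summary = summarize_dso_loading(log)
print("opened:", summary["opened"])
print("failed:", summary["failed"])
```

On this log the summary lists libhipblas.so, libhipfft.so and libhiprng.so as not found, which may be worth mentioning when reporting the crash.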

@masahi - Make sure you have the rocm 1.6.4 base installed.

@bensander Thanks, I'll upgrade.

@bensander Is there anything else I need from the AMD stack? All I have now is AMD's proprietary OpenCL library running on the open-source "amdgpu" driver.

@masahi - If you install rocm and rocm-libs (i.e. "apt-get install rocm rocm-libs"), that should be all you need. The rocm_docs in the repo has full instructions, including expected results.

@bensander How do I know whether I'm running rocm 1.6.4 correctly (and not 1.6.3)?

@masahi Just a guess: you should ask that question somewhere more closely related to your issue, such as the AMD or ROCm projects, rather than here...

@keryell You're right, I'm going off topic. I'll stop here.
Anyway, I couldn't get hiptensorflow working on my system. I'll try again later with a clean Ubuntu install.

@masahi - Just open an issue over there and we'll get you set up.

Hi, I just want to mention that I was able to get hiptensorflow working, thanks to @bensander and other folks at AMD. I can run all the examples in their quick start guide.

Thanks

For those who want to try TensorFlow on AMD hardware using ROCm, I wrote a blog post describing how to run Fast.ai notebooks using an AMD Fury Nano.
http://briansp2020.github.io/2017/11/05/fast_ai_ROCm/

👍 Looking forward to it!

ROCm 1.7 is on the way, with what sounds like proper TensorFlow support!

https://www.phoronix.com/scan.php?page=news_item&px=AMD-ROCm-1.7-Released

TensorFlow port for AMD GPUs:
https://github.com/ROCmSoftwarePlatform/hiptensorflow/blob/hip/README.ROCm.md

It works well for me. My hardware setup:
GPU: AMD Radeon RX 480
CPU: Intel Xeon 2603 v3
MB: Supermicro x10srl-f

The key is that the motherboard and CPU have to support PCIe v3.

Its performance is similar to that of an Nvidia 980Ti.

I can't even get the "supported" AMD drivers to work on my "supported" Ubuntu 16.04 LTS installation. Planned obsolescence?

znmeb, what is your AMD GPU? If you have dual GPUs, disable the unsupported one in the BIOS.

I couldn't read the whole thread... What is the current status of TensorFlow on OpenCL on macOS (Sierra and later)? Specifically, I have an Intel Iris GPU and was hoping I could build TF with OpenCL support from source.
Also, tf-coriander seems to run fine, at version 1.2.

@varun19299 FWIW, there is an Intel SDK for OpenCL. I have it on my ancient Sandy Bridge laptop, but I'm sure it will work on your machine too. https://software.intel.com/en-us/intel-opencl

Is this currently in a usable state on non-Ubuntu Linux systems? The roadmap page simply links here.

@pfc Is what currently usable on non-Ubuntu Linux? TensorFlow using OpenCL in general? Or TensorFlow using OpenCL on an AMD GPU? I'll assume the latter, since it's the only reason you'd want to run TensorFlow using OpenCL. For an NVidia GPU you'd use the NVidia drivers/libraries, and for CPU-only there's nothing to gain from OpenCL.

I had this working a few weeks ago on Arch Linux, using the proprietary ComputeCpp SYCL library and an AMD "Bonaire" (Sea Islands architecture) GPU. There's a new ComputeCpp release that I need to test, but I'm guessing it will work.

It turns out that the AMDGPU Pro proprietary libraries you need to make this work don't run on Ubuntu 16.04.3. The upgrade from 16.04.2 brought in a newer Linux kernel and X Server, and AMD has yet to ship something that works with it. See http://support.amd.com/en-us/kb-articles/Pages/AMDGPU-PRO-Driver-Compatibility-Advisory-with-Ubuntu-16.04.2-and-16.04.3.aspx for details. I have been unable to make AMD OpenCL work on Ubuntu.

There's an experimental AMD version of TensorFlow that uses a compiler to translate CUDA code to OpenCL code, but I haven't tested that either. Without a supported driver it's useless.

https://github.com/ROCmSoftwarePlatform/hiptensorflow/tree/hip/rocm_docs is the officially supported way to run TensorFlow on AMD hardware.

@bensander Does the ROCm runtime work on Ubuntu 16.04.3? I haven't been able to get it working.

P.S.: Do you have any insight into if/when the AMDGPU-Pro setup will work on Ubuntu 16.04.3? I need that for another project.

Hmm, I don't (and wouldn't) run Ubuntu anywhere, but I do have a CentOS 7 box with the repos and a GTX1080TI in it, running kernel 4.14.x and the latest Nvidia beta driver, so I could help test it there at some point today if that helps?

--
Sam McLeod

@sammcj Why would you want to run an NVidia GPU with OpenCL when there are perfectly good CUDA libraries for it?

Just to help test it for you!

No worries if you don't need a hand testing; I just thought I'd offer. I haven't even tried that machine with CUDA, TBH. I've only tried it on macOS, where I can't currently use OpenCL through Docker.

--
Sam McLeod

@znmeb I was going to try ComputeCpp SYCL, but they only provide the Ubuntu installer (I'm also on Arch) and the AUR install script is broken. It's good to hear that it can work. If I get desperate enough, I may try it.
@bensander That looks like exactly what I need to get AMD support, but I'm worried that this code has not been backported to TF and that it was last updated over 2 months ago, given that my code targets TF 1.4.0.
At the moment it seems that TensorFlow basically ties you to Nvidia, at least for us "mortal" programmers. The lack of documentation and of an updated roadmap doesn't help. I wouldn't mind helping in any way I can, but so far I've had little success getting things working.

@pfc I got ComputeCpp SYCL working on Arch. There was a binary tarball on their website when I did it.

In this news item about the release of SYCL 1.2.1
https://www.roboticstomorrow.com/news/2017/12/06/the-khronos-group-releases-finalized-sycl-121-/11107/
it says:
_The new specification incorporates significant experience gained from three separate implementations and feedback from developers of machine learning frameworks such as TensorFlow, which now supports SYCL alongside the original CUDA accelerator back-end._

Does that mean it is now possible to "easily" run TensorFlow on AMD GPUs that support OpenCL 1.2, on which SYCL is built?

"Easily" in the sense that some of the low-level software/drivers/libraries for AMD hardware are where most of the broken stuff is, not the hardware, TensorFlow, the OpenCL standards, or SYCL. ;-) If you have working AMD GPU drivers and working OpenCL libraries, you have TensorFlow on AMD GPUs.

My working setup for an AMD Bonaire (Sea Islands architecture):

  • Arch Linux with the amdgpu kernel module loaded and the radeon kernel module blacklisted
  • The Arch User Repository package opencl-amd
  • The ComputeCpp library
  • TensorFlow built from source on my workstation using @lukeiwanski's fork:

https://github.com/tensorflow/tensorflow/issues/22#issuecomment-334154564

I'm a bit surprised by what you said: "If you have working AMD GPU drivers and working OpenCL libraries, you have TensorFlow on AMD GPUs". My understanding was that the "official" TensorFlow version does not run on OpenCL (CUDA only). It seems I got confused.
I was very happy to find the PlaidML project, which at least allows some Keras code to run on my iMac with an AMD Radeon HD 6970. (https://groups.google.com/forum/#!topic/plaidml-dev/ksFMgxjgKrM ) AFAIK you have also tried that framework.
I will try running TensorFlow in the Ubuntu VirtualBox where Tensorflow is already running (CPU only).

@PALYGAP I don't think VirtualBox exports OpenCL from a Mac host into a Linux guest, and Ubuntu 16.04.3 doesn't work right now. I don't have a Mac, so I have no way of testing this.

Has anyone successfully gotten TensorFlow working on AMD via OpenCL?

@mohnkhan I have the @lukeiwanski fork working (Arch Linux); see https://github.com/tensorflow/tensorflow/issues/22#issuecomment-349877056 . I'm waiting for some more AMDGPU-Pro work before I publish a blog post; see https://github.com/corngood/archlinux-amdgpu/pull/54 .

@znmeb Thanks for the input.

@mohnkhan BTW, AMD is building an alternative path that is fully open source: translating CUDA code to OpenCL code with a compiler toolchain. I'm not sure what the status of that is for older cards like mine, though.

If you are going to write an article, I guess it wouldn't hurt to also explain this (it took me 3 hours to get the whole picture).

And this is what eventually leads you straight to the yearned-for OpenCL 1.2 (w/ cl_khr_spir ext).

HIP, instead, is yet another backend that sits opposite to SYCL, and it targets only ROCm (or also CUDA if you have an nvidia gpu, but that is another story).

"AMD is building an alternative path that is fully open source: translating CUDA code to OpenCL code with a compiler toolchain."

No. You are talking about HIP. And.. that's actually it: that is what you eventually convert your code to. It is not OpenCL.
HIP then runs on ROCm, as I was saying.
ROCm is also what runs OpenCL (on supported cards), but I'd stress to everyone that the relations only go forward from ROCm, never "between sub-layers".

What you are perhaps thinking of could be coriander.

"I'm not sure what the status of that is for older cards like mine."

Summed up here: either the full-fledged AMDGPU-PRO, or the amdgpu-pro-opencl-only driver as you are doing now... or keep waiting until the end of the decade for someone to finally make clover usable.

Also, fglrx... but since that is hard to recommend even for pre-GCN cards, it's probably better to draw a veil over it.

@mirh

  1. I'm not concerned with pre-GCN cards. Mine is a Sea Islands and I'm not planning to acquire anything older. Then again, I'm not planning to acquire another AMD GPU either. ;-)
  2. I don't know whether ROCm will run on my workstation. There's no open source hardware tester that can give me a yes-or-no answer. I opened an issue about that and received no response.
  3. SPIR-V is a compiler target. I took a look at it and threw up my hands, since I don't have the budget to hire a compiler writer.

So that leaves SYCL ... or throwing up my other two hands and doing everything with Keras, which has TensorFlow, Theano (which is being frozen), CNTK, or PlaidML back ends. From a purely engineering-economics point of view, Keras / PlaidML is a big winner, provided I can get TensorBoard somehow.

@mirh Thanks for the good summary with all the links. I think you have not wasted your 3 hours... :-)

"I don't know whether ROCm will run on my workstation. There's no open source hardware tester that can give me a yes-or-no answer. I opened an issue about that and received no response."

As I've told you quite a few times, no, it won't work.
Pre-GCN-3rd-gen GPUs simply lack the hardware for ROCm to perform, or even to work at all.

SPIR(-V).. I'm not sure what you are talking about. It's not your job to care about that. Computecpp generates it from SYCL "commands", and after that it's all (opencl) driver business.

You have what I'm tentatively calling amdgpu-pro-opencl-only, so I'm not sure what the problem is.
EDIT: it would also be cool to have some sort of ETA for Luke's code to land.

@znmeb and everyone

I have (L)Ubuntu 17.10 with kernel 4.14.x and the OpenCL library parts of the AMDGPU Pro 17.40 driver running, and I can run OpenCL applications such as clinfo or Boinc (e.g. Enigma@Home, Milkyway@Home) without issue on my AMD A12-9800E APU.

I can also successfully compile and use the tensorflow (currently version 1.4.1) CPU version. But I fail to compile the OpenCL version of tensorflow. I use computecpp 0.5 (the current version that can be downloaded without registering) together with vanilla tensorflow 1.4.1 and with the "dev/amd_gpu" branch of @lukeiwanski's fork.

So could somebody who has successfully compiled the OpenCL version of tensorflow please share which version of the computecpp library and which branch of which tensorflow git they are using?

Thank you

@AlphasCodes I don't have anything running on Ubuntu; all my working stuff is on Arch. I do have the machine dual-booted with Ubuntu 16.04.3, but the AMD proprietary libraries don't work there yet. As far as I know they are not supported on 17.10, but if you've got the OpenCL piece working on 17.10, I might add a third boot. I have plenty of disk. ;-)

What kind of errors are you getting? If they are build errors, you might have a Bazel incompatibility. Bazel is constantly moving forward, like TensorFlow, and sometimes one gets ahead of the other.

What do you mean, "not supported"?

This.
As for Ubuntu, only 16.04.3 is said to be supported.
EDIT: the 'complete' AMDGPU-PRO driver requires kernel 4.9. That was likely the problem.

If anyone is interested, the port of AMDGPU-Pro Driver 17.40 to Arch is ongoing and very active on GitHub ( https://github.com/corngood/archlinux-amdgpu/pull/54 ).

We really should close this issue since, as @mirh pointed out, TensorFlow uses SYCL, not OpenCL. Maybe we should open another one, "TensorFlow on AMD cards"?

No, it's totally legit.
You want tensorflow to eventually run on opencl devices. That's the aim. Legit, end of story.
Saying that it actually uses SYCL was only a technical nitpick on my part, because all these acronyms of magically random technologies were driving me mad.
EDIT: I'd also like to thank all the codeplay guys for their tremendous work.

If you want something specifically crafted for amd, I'd recommend checking out hiptensorflow. It is ROCm-only, though. And please, let's leave this argument behind.

OK. I don't know if I'll have enough time to redo the build and provide the compile errors before the weekend. But I added my existing documentation to my new github repository.

See https://github.com/AlphasCodes/DeepLearning for details (my hardware/software setup + AMD OpenCL setup + Tensorflow setup).

@mirh To try to clarify "[...] all these acronyms of magically random technologies [that] drove [you] mad [...]":

In the Khronos Group realm, OpenCL is the low-level, non-single-source API and SYCL is the high-level, single-source C++ domain-specific embedded language (DSeL). SYCL is expected to be built on top of OpenCL, so by transitivity, when you use SYCL you are often using OpenCL.

Since TensorFlow uses Eigen, which uses a single-source C++ approach with single-source CUDA, SYCL was chosen when it was later ported to OpenCL, because SYCL is the Khronos Group's standard way to have single-source C++.

But if you think about CUDA, it is even more subtle.

Almost everybody uses the high-level, single-source version of CUDA, which is actually named the "CUDA Runtime API". This is somewhat similar to SYCL.
But there is also a lesser-known low-level, non-single-source version of CUDA called the "CUDA Driver API", which is similar to OpenCL and is used, for example, by the "CUDA Runtime API" implementation itself.

Since this is a kind of FAQ, I clarified https://en.wikipedia.org/wiki/SYCL and https://en.wikipedia.org/wiki/CUDA a little.

ComputeCpp, the SYCL implementation you are using with TensorFlow, does not yet support Ubuntu 17.10. You would need to stick to Ubuntu 16.04, which is the current LTS. Instructions and prerequisites are at https://developer.codeplay.com/computecppce/latest/getting-started-with-tensflow

As an aside, OpenCL support for TensorFlow does not mean just AMD device support. The SYCL integration also enables other OpenCL devices. As part of the work we are doing with TensorFlow, support for ARM and Intel GPUs will become available once the latest drivers from these companies are available. We are also working to enable support for Renesas accelerator processors for the R-Car platform.

@rodburns Thanks! I have this working on Arch Linux (4.14.4 kernel) with the opencl-amd library from the Arch User Repository. The card is a Bonaire (GCN 2.0). I'll run the tests on that page to verify that it's doing what it should.

GCN 2nd gen (aka 1.1), if anything; 2.0 doesn't exist.
(sorry for stooping to such pedantry)

SUCCESS!

The latest "dev/amd_gpu" branch commits in @lukeiwanski's fork fixed my Tensorflow OpenCL compile issue. I assume it was the SYCL 1.2.1 related commits.

I successfully compiled a Tensorflow OpenCL version and can use it. See my Tensorflow Setup documents for details.

I also added a benchmarks page where, in the future, you will find some benchmarks of my setup under different Tensorflow configurations (non CPU optimized, CPU optimized, OpenCL).

AMDGPU Pro driver version 17.50 also works for me. I updated the related AMD OpenCL Setup document.

Thank you to all contributors.

I ran some benchmarks, and it seems the iGPU is slower than the 4 available CPU threads, except in the matmul_bench.py benchmark.

The initialization of an OpenCL Tensorflow run is also much slower than that of a CPU-only Tensorflow run: something like 5 seconds for CPU vs 1-2 minutes for OpenCL.

Can anyone confirm such results?

OK, I did some more troubleshooting.

  • I used the Tensorflow MNIST example; see the Validate a Tensorflow Setup document.
  • I used "sudo cat /sys/kernel/debug/dri/0/amdgpu_pm_info" to check/watch the iGPU clock/load and "top" to check the CPU load.
  • The initialization phase up to step 0 took about 6 minutes; iGPU load was about 0%, the iGPU clock was at 300 MHz (the minimum available clock), and the python process CPU usage was about 200% (= 2 threads).
  • Starting with step 0, iGPU load was about 90%, the iGPU clock kept switching between 654 MHz - 720 MHz - 800 MHz - 900 MHz (the maximum available clock), and the python process CPU usage was about 100% (= 1 CPU thread).

I am still trying to get things to compile on Arch.

What I used yesterday.
After 14 hours (yes, my potato is very slow) I got this binary, if you want to try it.

๋‚˜๋Š” ๋ฌด์Šจ ์ผ์ด ์ผ์–ด๋‚˜๊ณ  ์žˆ๋Š”์ง€ ์•Œ์•„ ๋‚ด๋ ค๊ณ  ๋…ธ๋ ฅํ–ˆ์ง€๋งŒ ๋ถˆํ–‰ํžˆ๋„ ๋‚˜๋Š” ํ•  ์ˆ˜ ์—†์—ˆ์Šต๋‹ˆ๋‹ค. ๋‹ค์Œ์— ๋Œ€ํ•ด ์•„๋Š” ์‚ฌ๋žŒ์ด ์†๋„๋ฅผ ๋†’์ด๋Š” ๋ฐ ๋„์›€์„ ์ฃผ์‹œ๋ฉด ๊ฐ์‚ฌํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค!

Most of the discussion above was about running Tensorflow with OpenCL acceleration on AMD chips. Am I correct in saying this? If I want GPU-accelerated tensorflow using my integrated graphics (Intel HD 5000), which supports opencl, what should my approach be?

Thanks in advance!

@znmeb Hi Ed, thanks for replying. I have OpenCL downloaded and running on my system. But my question was: how do I compile tensorflow so that it actually uses the OpenCL libraries?

https://developer.codeplay.com/computecppce/latest/getting-started-with-tensflow
And, most importantly for Intel: https://github.com/codeplaysoftware/computecpp-sdk/issues/78#issuecomment-352411192

@AlphasCodes Thanks for posting your results. Regarding initialization time: the way OpenCL works is that the code is compiled before execution, so the startup time is the compilation process.

@brainwave For Intel devices, there is a thread with @mirh that explains how to remove the restrictions on which devices it runs on. These device types were restricted because we have seen issues with the Intel drivers, but we hope updated drivers that improve support will soon be available for Intel devices. In the meantime, you can recompile TensorFlow with the change to test your own Intel hardware. We are looking at removing the device restrictions from the codebase.

@AlphasCodes Guys, I apologize for a perhaps naive question, but why is this build AMD GPU only? Isn't OpenCL supposed to be a standard? Do I understand correctly that it won't work on my Intel Carbon X1 with OpenCL 2.0 drivers installed?

If you read the issue that was linked twice, you'd see there is nothing in it about amd gpus.
Intel is currently excluded, but that has nothing to do with wanting to force users, and there is a temporary workaround. Discuss it there, if you really must.

When I use the amd_gpu branch with a jupyter notebook, there seems to be a leftover thread. Python still uses 100% of one CPU even after the computation has finished. Restarting the kernel ends the stray thread. Does anybody else experience this?

@brainwave @unoexperto
Sorry, I can't help with Intel OpenCL because I only have AMD OpenCL hardware.

@desperadoduck
I don't use jupyter yet. I use a plain bash shell and a virtual Python 3 environment (see my Python 3 + Tensorflow setup). But I cannot reproduce the issue. There is no CPU usage by a python process after the computation has completed.

@rodburns
Thanks for the information. Is it possible to speed up the initial compile time, e.g. by using all available CPU threads instead of only 50%?

@brainwave @rodburns
For Intel GPUs (Gen9) under Linux, we have seen significantly better DNN performance with Intel's open source Beignet implementation than with the closed source one when benchmarking common vision nets on PlaidML. Beignet is also easier to install.

Does it support Intel graphics HD 615 (7th gen CPU) on ubuntu17.10?

The opencl driver SRB5.0 for linux64 runs well on ubuntu17.10.

And this has not been updated for a long time:
https://bitbucket.org/mehdi_goli/opencl/branch/IntelGPU

For the love of god, can't you read just 2 (two!) posts above?
Discuss the lack of intel gpu (or amd cpu) support at https://github.com/codeplaysoftware/computecpp-sdk/issues/78

@znmeb The goal is to make full use of the various computing resources (e.g. cpu, gpu, DSP, any other coprocessor).
In practice, it depends on the hardware vendors' support: driver and OS.
As far as I know, you can't enable an Intel GPU and an nvidia GPU for video at the same time, due to a limitation of the video driver. (You may be able to switch between them.)
However, opencl can use them at the same time. To it, they are both "devices".

@choongng That's interesting to know. We did some work to help enable Beignet, but activity on that project seems to have gone a bit quiet.

@znmeb Yes, no GPU is likely to perform much better on a small problem. Glad you are making some progress, though!

@unoexperto ComputeCpp with TensorFlow can be used on any hardware that supports SPIR OpenCL intermediate instructions, which includes Intel GPUs. However, as described in the thread here, we intentionally prevented it from running there because we didn't think the current drivers were working at the time. You can remove that restriction, since it sounds like some users have it working with different Intel drivers. We are also working on enabling this for ARM and Renesas processors that have OpenCL drivers.

@sxpc722 That should work, then. By the way, the new machine runs Windows 10 and I am not planning to dual-boot it with Linux until I absolutely have to! I'm sick of chasing down driver and library bugs for vendors (looking at you, AMD). In fact, I may put a Windows partition on my workstation for the same AMD reason. ;-)

It has been 14 days with no activity and this issue has an assignee. Please update the label and/or status accordingly.

According to my tests, Tensorflow AMD OpenCL performance is very slow. So I did some basic tests with other Deep Learning frameworks. You will find my setup and benchmarks on my GitHub page here.

Long story short: the other Deep Learning frameworks are currently about 10x faster than Tensorflow AMD OpenCL.

@AlphasCodes @znmeb I know the TF team prefers to keep this thread TF-only. We are happy to host PlaidML-specific conversation over on the PlaidML project. That said, we do hope to eventually support TensorFlow itself as well as non-OpenCL platforms (e.g. Apple's Metal for iOS, which currently exists in prototype form).

https://github.com/plaidml/plaidml

@choongng Thanks for the information. I edited my message accordingly.

@znmeb The AMD A12-9800E iGPU should be GCN v3.

The main reason I'm doing these benchmarks/tests is to find an answer to my question: "Stay with AMD or switch to Nvidia for my Deep Learning adventure?"

And the answer is: I really like AMD's open source approach, but I will likely switch to Nvidia because of two factors. First, the Deep Learning software stack (e.g. Tensorflow) is much more mature on Nvidia. Second, the graphics cards on offer for my very specific needs (must fit into a Dan A4 SFX case and must be very quiet or almost noiseless under full load for hours) are very limited or even nonexistent on the AMD side.

Are Intel GPUs supported? I think my Iris Pro could speed up long training runs at least a bit.

Discuss the lack of intel gpu (or amd cpu) support here: codeplaysoftware/computecpp-sdk#78

https://github.com/codeplaysoftware/computecpp-sdk/issues/82

I'm trying to get a sense of the state of this issue. Am I right to say that this repo:

https://github.com/lukeiwanski/tensorflow

...built with ComputeCpp, is currently the best option for building Tensorflow with general AMD GPU support? And if so, is there any benchmark evidence that this build provides a speedup over CPU?

It depends on what you mean by "general AMD GPU support". If you mean really old dGPUs or APUs, I don't know. But if you have something newer (2nd gen GCN or later), hipTensorFlow (v1.0.1) running on ROCm was working quite well.

@briansp2020 Ah yes, I have seen AMD's work on ROCm. Unfortunately, they only support Linux, and support for any other OS doesn't even seem to be on their roadmap. I'm hoping for something that supports Windows.

@mjmax Is there any GPU-accelerated tensorflow package available for Windows? I thought Linux was the only choice if you want GPU-accelerated deep learning. If TensorFlow were ported to OpenCL, would that make it easier to port to Windows? I'm not sure why TensorFlow would not be available on Windows with GPU acceleration when CUDA is supported there.

I guess this is now off topic, but if anyone knows of a GPU-accelerated TensorFlow and/or PyTorch for Windows, I'd like to know about it as well...

@briansp2020 As far as I know, Tensorflow already supports Nvidia GPU acceleration on Windows.

CL tensorflow is already a mess on Linux. Don't expect anything soon.
If you want to accelerate things there, there is only plaidML.
(And please, we are already at 500 comments. Let's try to post only when it is really, really necessary.)

@mirh OpenCL Caffe does work on Windows. Sure, it is not TensorFlow in terms of features, but it is pretty solid for software that has to be deployed everywhere.

What about replacing the openCL port with the HIP port backed by AMD?

https://github.com/ROCmSoftwarePlatform/hiptensorflow

Haha! @LifeIsStrange Life is very strange indeed... Are you working for AMD's HiP marketing team? :-)
Please look at the subject of this issue: "OpenCL support".

This means it is about the Khronos standard https://en.wikipedia.org/wiki/OpenCL (and the other standard from the OpenCL Khronos working group, SYCL, appears at the end of the "Overview" section).

Of course there is a world outside of this issue, but it is... outside! :-)

Please try not to inconsiderately increase the entropy of the universe by posting random posts on this already too-lengthy discussion... :-)
By the way, this comment applies to some other posters here, not only to you.
This is a GitHub issue for solving a technical problem: getting TensorFlow to run on devices supporting the OpenCL standard. It is not a FaceBook page about how people like or dislike tool A or B. :-)
But please feel free to send some git commits related to this issue that we can look at...

There is a fork of TensorFlow that supports OpenCL: https://github.com/hughperkins/tf-coriander

And of course there is @benoitsteiner's work: https://github.com/benoitsteiner/tensorflow-opencl

IMHO, it is ridiculous that mainstream TF still hasn't merged their work.

Is the focus here on getting it to run as long as it is OpenCL, or on making it actually run faster? I'd prefer that there be no holy war, and that we focus on getting it to run fast on several GPUs. LifeIsStrange's focus is on getting it to work on AMD GPUs, and there HIP makes good sense. For others the focus is on making it work on Intel GPUs or Android, and there OpenCL makes much more sense. GPU languages are a mess, so please stay practical.

Reading some of the comments here, performance is an issue with the OpenCL ports. Unfortunately, I cannot see many benchmarks around. Are there more benchmarks than this one? https://github.com/AlphasCodes/DeepLearning/blob/master/Tensorflow_Benchmarks.md

As I understand it, benchmarking is hard when you compare CUDA with OpenCL, because you have to use different hardware. Allegedly, nVidia deliberately made, or allowed, its OpenCL implementation to be somewhat broken, so benchmarking on the same hardware will always make CUDA look great.



Comparing only 2 numbers is not informative. Who cares whether OpenCL on NVidia runs at half speed if it runs at 4x speed on other GPUs?

I think we need these benchmarks:

  1. CUDA on NV GPUs (reference benchmarks)
  2. https://github.com/hughperkins/tf-coriander on AMD, Nvidia and Intel GPUs
  3. https://github.com/benoitsteiner/tensorflow-opencl on AMD, Nvidia and Intel GPUs
  4. https://github.com/lukeiwanski/tensorflow on AMD, Nvidia and Intel GPUs

The reference benchmarks are easy to find. We have some high-end GPUs here, so we only need a place to put the numbers (with links to the build docs).

OpenCL support must become a reality.

CUDA is too limited, and nvidia doesn't want to share it.
CUDA only works on Nv GPUs.
That is a dead end for TensorFlow.
If another "TensorFlow" comes out with broader support than TensorFlow,
and TensorFlow still only supports CUDA on Windows,
you have to realize that TensorFlow is not the only choice.

Why is OpenCL better than HIP? I think OpenCL has failed to gain traction, and supporting OpenCL at this point is probably counterproductive and a waste of resources for the whole community/industry. I'd rather see TensorFlow support HIP directly and let the compiler/tools/libraries take care of portability.

Isn't it better for software to support one language/programming model?

Software should support whatever it has to support to cover every use case.
HIP is all bells and whistles (at least on paper) if you have supported hardware. But there is more to this world than "the newest amd and nvidia cards".

Now please, for the love of god, complain here about any problem with that.
And here for everybody else interested in the continuation of this issue.

๋‚˜๋Š” SPIR-V๊ฐ€ ํ•˜๋“œ์›จ์–ด ๊ฐ„ ๋Œ€์•ˆ์œผ๋กœ CUDA๋ฅผ ์ง์ ‘ ๋Œ€์ฒดํ•  ๊ฒƒ์ด๋ผ๊ณ  ์ƒ๊ฐํ–ˆ์Šต๋‹ˆ๋‹ค.
http://alphanew.net/index.php?section=alphanew&site=overview&lang=eng&newsID=111

Google์ด ์—ฌ์ „ํžˆ CUDA์— ์˜์กดํ•˜๋Š” ์ด์œ ๋Š” ๋ฌด์—‡์ž…๋‹ˆ๊นŒ?

๋„์›€์ด ๋ ๊นŒ์š”?

OpenCL random number generation (Thomas Wang's hash):

// Thomas Wang's 32-bit integer hash
uint wang_hash(uint seed)
{
    seed = (seed ^ 61) ^ (seed >> 16);
    seed *= 9;
    seed = seed ^ (seed >> 4);
    seed *= 0x27d4eb2d;
    seed = seed ^ (seed >> 15);
    return seed;
}

// seed slot `id` with the hash of the work-item id
void wang_rnd_0(__global unsigned int * intSeeds, int id)
{
    intSeeds[id] = wang_hash(id);
}

// advance slot `id` and return a float in [0, 1]
float wang_rnd(__global unsigned int * intSeeds, int id)
{
    uint maxint = 0;
    maxint--;    // wraps around to 0xFFFFFFFF
    uint rndint = wang_hash(intSeeds[id]);
    intSeeds[id] = rndint;
    return ((float)rndint) / (float)maxint;
}

// initialize each thread's own random number seed
__kernel void rnd_0(__global unsigned int * intSeeds)
{
    int id = get_global_id(0);
    wang_rnd_0(intSeeds, id);
}

// get a new random value by each thread
__kernel void rnd_1(__global unsigned int * intSeeds)
{
    int id = get_global_id(0);
    float randomFloat = wang_rnd(intSeeds, id);
    // randomFloat is not used further in this snippet
}
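
To sanity-check the kernels above without a GPU, the same hash can be mirrored on the host. The following is a minimal Python sketch, not part of the original post: the names `init_seeds` and `next_uniform` are made up for illustration, and the explicit `& MASK32` emulates the 32-bit wraparound that an OpenCL `uint` provides implicitly.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wraparound


def wang_hash(seed: int) -> int:
    """Thomas Wang's 32-bit integer hash, mirroring the OpenCL version."""
    seed &= MASK32
    seed = (seed ^ 61) ^ (seed >> 16)
    seed = (seed * 9) & MASK32
    seed ^= seed >> 4
    seed = (seed * 0x27D4EB2D) & MASK32
    seed ^= seed >> 15
    return seed


def init_seeds(count: int) -> list:
    """Host-side equivalent of kernel rnd_0: one seed per work-item id."""
    return [wang_hash(i) for i in range(count)]


def next_uniform(seeds: list, index: int) -> float:
    """Host-side equivalent of wang_rnd: advance one seed, return a value in [0, 1]."""
    seeds[index] = wang_hash(seeds[index])
    return seeds[index] / MASK32
```

Comparing `init_seeds(n)` against the buffer read back after running `rnd_0` over `n` work-items is one way to confirm that a given OpenCL driver compiles the kernel as intended.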

OpenCL SHA3 hashing (I forgot who wrote this):

https://gist.github.com/tugrul512bit/c8170f74846e36e350607664f12c525c

Please remove the assignee, as this issue is inviting external contributions. Otherwise, remove the contributions welcome label. Thank you.

It is in Google's interest to support OpenCL.
By making the specific hardware of a specific company/brand/vendor a dependency of your software, you force yourself to pay more for hardware; market competition lowers costs.
Google has been about commodity hardware from the very beginning. That was, and still is, crucial to Google's success (market dominance): lower data-center operating costs are what made offerings such as Gmail (storage space) and Google Photos (storage space and auto-tagging) possible.

@wesamco No, it is not necessarily in Google's interest. They make their own hardware, called "TensorBoard", IIRC. They can bypass OpenCL and CUDA/cuDNN and have the board run raw TensorFlow code.

raw TensorFlow code.

There is no such thing. It is not like unprocessed food. TPUs need their own DNN library that handles the different types of calls.

It seems it is time to compress the above discussion into one list again:

  • CodePlay is working on a SYCL backend.
  • Hugh Perkins is working on tf-coriander.
  • AMD is working on a HIP backend.
  • PlaidML only supports CPUs at the moment.
  • The status of support for Intel GPUs is unclear.

So choose a project you like and start supporting it. Maybe each of the groups can give a status update on their project?

Do understand that OpenCL has been transformed from a full language into a language definition/hardware specification that is represented in SPIR-V (kernels), which can then run on top of a platform such as the OpenCL drivers and, later, also on Vulkan drivers (platforms). So by supporting SYCL, you also support OpenCL.

Perfect summary, but PlaidML does run on GPUs too.
It is just that, at the moment, it is a backend for Keras, not TensorFlow. So it is kind of off-topic here.

Hi all,
@VincentSC thanks for the great summary of the different efforts!

So choose a project you like and start supporting it. Maybe each of the groups can give a status update on their project?

The SYCL approach now supports a variety of platforms/devices. The ones I can mention are:

  • AMD GPUs (FirePro W8100, R9 Nano and R9 380 series); instructions are available here or here
  • ARM Mali (HiKey 960); instructions are available here
  • Intel GPUs (SkyLake series) with the Intel NEO OpenCL driver

When it comes to AMD, the GPUs mentioned above currently use the AMDGPU-Pro driver 17.40-xxx with legacy OpenCL enabled.
I don't see any obvious reason why other series would not work (assuming SPIR/SPIR-V is supported, that is).

The main platform we are focusing on is Linux. However, we have ongoing efforts to enable Windows in the future. We have no plans to support OSX in the near future. I know, sad face.

Our focus is on improving performance for CNNs. Current performance is unoptimized and nowhere near where we expect it to end up. That said, we are already beating CPU performance for most models on different targets.

To speed up the development cycle and reduce the overall compilation time of TensorFlow (as well as improve portability), we are working on Eigen, BLAS and DNN libraries.
These libraries aim to solve the performance issue and to build up an ecosystem of portable libraries that can easily be integrated with complex projects like TensorFlow.

Below are the performance graphs we can share at present. They are taken from my fork https://github.com/lukeiwanski/tensorflow/tree/dev/amd_gpu (271093b21cc5ca38e8699e154b5cada96bd7ac0d).
The benchmark used is https://github.com/tensorflow/benchmarks

[Figure: cpuvssycl]
The graph is normalized to the Intel i7-4790K results.
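
Normalizing to a baseline means dividing each device's throughput by the baseline device's throughput for the same model, so the baseline reads 1.0 and every other entry reads as a speedup factor. A minimal Python sketch, assuming throughput is measured in images per second; the function name and the numbers are invented placeholders, not values from the graph:

```python
from typing import Dict


def normalize_to_baseline(throughput: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Divide every entry by the baseline entry so the baseline becomes 1.0."""
    base = throughput[baseline]
    if base <= 0:
        raise ValueError("baseline throughput must be positive")
    return {device: value / base for device, value in throughput.items()}


# Placeholder numbers for illustration only; these are NOT measurements.
example = {"cpu_baseline": 10.0, "sycl_device": 25.0}
relative = normalize_to_baseline(example, "cpu_baseline")
```

With this convention, any bar above 1.0 in such a graph is faster than the CPU baseline and any bar below 1.0 is slower.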

We are slowly upstreaming our changes to Eigen; once that happens, we will follow with TensorFlow.

Hope that helps,
Luke

For deep learning inference on mobile devices with GPU/OpenCL support, you can check out MACE, which is optimized for Adreno, Mali and PowerVR GPUs. Here are some benchmark results.

@keryell @benoitsteiner, which versions of TensorFlow and triSYCL are required for integration? I am having trouble building TensorFlow (1.9) with the latest triSYCL release.

Unfortunately, the latest TensorFlow uses more advanced features than the current triSYCL can cope with, so you have to use ComputeCpp, currently the only fully compliant SYCL implementation...

TensorFlow is backed by Google Brain, and Google has a partnership with NVIDIA. I don't think we should expect TensorFlow to support OpenCL.
A big effort from the OpenCL community is needed.

OpenCL support, please!

OpenCL is more suitable for us too.

@Makhaon Me too. I can't afford to buy a machine with an NVIDIA graphics card.

In addition to the two posts above: AMD's Vega GPUs (including the ones inside Raven Ridge APUs) can now do FP16 at twice the FLOPS, so if TF could support them through OpenCL, it would really help people on a smaller budget. Many of those people will be students, and if we get them to use TF as the starting point of their DNN journey, they will probably stick with TF and tell others about it. That is a good way to help this project grow.

I think this thread is mostly meaningless for developers (too much noise, and I am about to add some more ;-) but I think many comments are missing the point.
If you want to run TensorFlow on AMD cards, OpenCL IS NOT what you are looking for. Please head over to https://github.com/ROCmSoftwarePlatform/ and install the ROCm stack. AFAIK AMD's current strategy for TensorFlow/PyTorch is based on ROCm instead of OpenCL.

Generic OpenCL required too much maintenance and did not give AMD enough of a performance benefit to be worthwhile. Therefore this ticket is only interesting if you are running, for example, an ARM platform that uses OpenCL only.

(Disclaimer: I am just an outsider with no real insight into TensorFlow development, so the information above may be completely wrong or misleading. Feel free to bash me if you know better.)

What about LLVM with the new GPU offload? That would put a considerable level of abstraction between TensorFlow and the CUDA-specific code.

What about all of you reading just 10 posts up and noticing that there is already a fork by lukeiwanski/codeplaysoftware you can try?
(Also, hats off to Xiaomi for, for once, contributing a serious kind of open-source effort.)

@FelixSchwarz Just so you are aware, ROCm uses OpenCL: it is AMD's userspace OpenCL driver on Linux (which is why it does not support Windows). In case you are not familiar with how AMD's driver ecosystem on Linux works: there are the kernel-side drivers AMDGPU and AMDKFD (which is now being merged into AMDGPU), and then the userspace drivers RadeonSI (for OpenGL), RadV/AMDVLK (for Vulkan) and ROCm (for OpenCL).

Judging by the dynamics of this bug and the other forks, Google has zero interest in this and will never implement it in the official repository. I would vote for closing (or locking) this issue so as not to give everyone false hope.

The issue should stay here, if only to point here all the people who would otherwise inevitably open it again.

On Sat, Sep 15, 2018, 09:45 Anton Kochkov wrote:

Judging by the dynamics of this bug and the other forks, Google has zero interest in this and will never implement it in the official repository. I would vote for closing (or locking) this issue so as not to give everyone false hope.

There is a TensorRT that supports the Movidius Pi Hat. And that Movidius Pi Hat is Google's $45 "AIY Vision Kit". Google links to Target to buy it.

Does this have no ties to CUDA or NVIDIA? It says it uses an Intel chip. At its heart, maybe the chip is an FPGA? Does anyone know more about it?

I know quite a bit about the big Movidius unit. It is inference only, and it runs precompiled TensorFlow or Caffe models. IIRC they are all in 16-bit mode.

The Movidius chip itself is much more powerful, but you have to be a qualified partner to get the SDK.

Here are links for anyone else trying to use TensorFlow with OpenCL:

https://github.com/hughperkins/tf-coriander
https://github.com/ChiahungTai/tensorflow-cl
https://github.com/guoyejun/tensorflow-cl
https://github.com/honggui/tensorflow-cl
https://github.com/benoitsteiner/tensorflow-opencl
https://github.com/lukeiwanski/tensorflow (repository is outdated)
https://github.com/codeplaysoftware/tensorflow
Also worth checking:

https://documen.tician.de/pyopencl/
https://pypi.org/project/DeepCL/
https://www.khronos.org/sycl/

Feel free to add working projects.

Is there any update? This issue is over 3 years old.

Yes, there is; just look at the last handful of posts.

@filips123 No, there are no updates, and there will not be any in the foreseeable future. The probability of that is lower than that of an alien invasion or of finding a way to travel back in time.

This Intel initiative, PlaidML, works reasonably well and is worth checking out.
https://github.com/plaidml/plaidml
It runs on OpenCL or Metal on the Mac. It works with MacBook Pro AMD GPUs, which is what I was looking for.
Meanwhile, could you help vote for PyTorch support in PlaidML? https://github.com/plaidml/plaidml/issues/63

PlaidML is certainly all nice and dandy (I, for one, somehow got more performance from an NVIDIA GPU on OpenCL than from TF's own CUDA).
But it is a backend for Keras? A complete replacement for TensorFlow, which, you know, is the repo we are discussing this in?
(As far as I understand, the latest TF versions can export models directly to Keras? So there is that..)

Anyway, for the fourth damn time: if you want a recent solution on OpenCL that is still being actively developed (and that also has an actual chance of being merged here for real one day), there is just the Codeplay stack.
Again:
https://developer.codeplay.com/computecppce/latest/tensorflow-overview
https://github.com/Rbiessy/tensorflow/tree/dev/amd_gpu

My apologies, I had not realized there was no TensorFlow support. My assuming brain thought that Keras GPU support == TensorFlow support.

plaidML is super cool. It works with Keras.
Of course, I had to port some TF code to pure Keras to make it work on the plaidML backend (for example tf.image.ssim).
But the result: my code works on both NVIDIA and AMD cards.
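
For readers who want to try the same route: multi-backend Keras chooses its backend from the `KERAS_BACKEND` environment variable at import time, and PlaidML's documentation gives `plaidml.keras.backend` as the value to use. A minimal Python sketch, assuming the `plaidml-keras` package is installed and configured; the helper name `select_keras_backend` is hypothetical:

```python
import os


def select_keras_backend(backend: str = "plaidml.keras.backend") -> str:
    """Set KERAS_BACKEND and return the previous value ('' if it was unset).

    This must run before `import keras`, because Keras reads the
    variable only once, when it is first imported.
    """
    previous = os.environ.get("KERAS_BACKEND", "")
    os.environ["KERAS_BACKEND"] = backend
    return previous


previous = select_keras_backend()

# After this point a plain `import keras` would load the PlaidML backend,
# provided plaidml-keras is installed. It is left commented out so that
# the sketch runs on machines without it.
# import keras
```

Code that calls TensorFlow-only functions directly (such as the `tf.image.ssim` case mentioned above) still has to be rewritten in terms of Keras backend operations, since switching the backend only affects calls that go through Keras.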

plaidML is also heaven for researchers. It automatically generates gradients for any function you write in the "Tile" language, and it runs on your GPU at 80% of TensorFlow's speed.

So I cannot understand why ML researchers still use PyTorch. Let's boost ML science with Intel's plaidML?

@iperov Care to know why practically no one uses PlaidML?

  1. It runs pitifully slowly on AMD's OpenCL implementations compared to TensorFlow's CUDA backend, so there goes at least half of the reason to use it. Performance is so bad that TensorFlow on CPUs is competitive with, or even outright beats, their hardware running PlaidML.

  2. Nobody is interested in maintaining a specialized Tile programming language that only someone like a pure maths professor would concoct.

  3. This pretty much ties into #2, but ever since Intel bought Vertex.AI, they no longer care about PlaidML. Intel's solution for GPU-accelerated machine learning is a new compiler specifically for deep learning, now known as nGraph, which targets TensorFlow, PyTorch or other deep learning frameworks as a backend for them. With nGraph in place, there is no reason for them to keep developing PlaidML as an intermediary.

People use PyTorch for other reasons, such as maintainability or other features. So, to sum up, PlaidML is Intel's tool, and they probably do not intend for it to play any role in the final parts of their plans. nGraph's current Intel GPU backend is based on OpenCL 2.1, for which only Intel has a conformant implementation, so Intel is looking out for itself rather than purely for the betterment of machine learning. As Intel develops nGraph further, I cannot see them keeping the GPU backend based on OpenCL 2.1 alone, since many deep learning frameworks have templated kernels that are incompatible with the separate-source programming models of OpenCL, Metal or Vulkan; it is probably there for experimentation only. Intel's final GPU backend will probably be based either on SYCL 2.2 or on something entirely different, like OpenMP, and maybe they will even bring a vendor-specific solution.

As for AMD, who cares? OpenCL is irrelevant to them, and they are finally showing some results with their work on HIP ...

What about all the GPUs inside ARM machines, like mobile phones, Raspberry Pi, ODROID and so on?
Don't they support OpenCL?
Google should care about getting TensorFlow onto the GPU on Android.
The biggest libraries for neural network training run only on NVIDIA GPUs, which makes NVIDIA GPUs more and more expensive (because people and companies buy them just for professional neural network training). Google will then lose more money that way.

@Degerz Which planet did you come from?
How can you compare tf-CPU with an AMD GPU?
An AMD GPU on plaidML is 30x faster than tf-CPU.

  1. It runs pitifully slowly on AMD's OpenCL implementations compared to TensorFlow's CUDA backend, so there goes at least half of the reason to use it.

In my deepfakes tests OpenCL is only 20% slower, and on some mini networks OpenCL is 20% FASTER.

My project DeepFaceLab has many users who have been waiting for AMD support. So many people were delighted when deepfakes could finally be trained on AMD cards.
Also, plaidML is the only Keras backend that supports AMD/IntelHD out of the box.
If a new AMD backend for Keras appears, my project will of course switch to it.
PyTorch has no future.

What is there to maintain in plaidML? Ops are auto-differentiable; there is nothing to maintain.

Tile programming language that only someone like a pure maths professor would concoct

Machine learning was invented by mathematics professors, wasn't it?

@talregev What about ARM or Broadcom? The former probably has a subpar OpenCL implementation, and the latter does not even officially provide OpenCL drivers! It is not Google's responsibility to create and maintain a competent compute stack for hardware vendors ...

@iperov You do realize that training neural nets with embedding layers on PlaidML is painful, right? PlaidML also has a bunch of other limitations, such as not being well suited to DenseNets, or the fact that its computation graphs are static. And does PlaidML even work well with RNNs?

As for your project, don't worry about it. You will be moving on to something better, like TensorFlow, since AMD will soon offer a native GPU backend for it once MIOpen is upstreamed, and that will leave PlaidML in the dust in terms of performance. Who cares about Intel iGPUs anyway? If Intel is truly committed to delivering high-performance deep learning on its future discrete graphics hardware, it will offer a single-source option just as the others (AMD/HIP and NVIDIA/CUDA) did before it ...

PyTorch has no future.

Envy much? PyTorch is ~10x more popular than PlaidML, the newest techniques in DL are easily implemented in PyTorch, it has tons of different contributors, and it is actively developed by Facebook, all while Intel has not contributed to PlaidML in nearly a month.

What is there to maintain in plaidML? Ops are auto-differentiable; there is nothing to maintain.

So I take it from you that PlaidML should never receive any new fixes or new features going forward? If you don't see the value in improving code, there is no point in trying to convince you to acknowledge PlaidML's glaring flaws ...

Machine learning was invented by mathematics professors, wasn't it?

That does not mean we have to take up whatever programming language they make up, especially in the case of Tile, where elegance is clearly favoured over readability. It is no wonder that so many potential contributors are scared away from contributing ...

Jesus, I wish you would STFU and get back to work instead. I will have to unsubscribe from this ticket, because getting emails about flame wars is unbearable. Too bad the maintainers do not mute the thread.

@gunan @caisq @sanjoy Could you please do something about it?
