Numpy: percentile methods refactor

Created on 12 Mar 2018  ·  53 comments  ·  Source: numpy/numpy

As exemplified on the Wikipedia page: https://en.wikipedia.org/wiki/Percentile#The_nearest-rank_method

00 - Bug 01 - Enhancement high

All 53 comments

Think this already exists? Using the Wikipedia example:

>>> np.percentile([15, 20, 35, 40, 50], [5, 30, 40, 50, 100], interpolation='lower')
array([15, 20, 20, 35, 50])

It doesn't. Look at example 2 on the Wikipedia page:

>>> np.percentile([3, 6, 7, 8, 8, 10, 13, 15, 16, 20], [25,50,75,100], interpolation='lower')
array([ 7,  8, 13, 20])

when it should be [7,8,15,20].

It similarly fails on the third example.
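To see where the 13 versus 15 discrepancy comes from, it helps to compare which sorted element each rule selects. The sketch below is a minimal pure-Python illustration, not NumPy's code. It assumes that interpolation='lower' takes the floor of the fractional index p/100 * (n - 1), and that the nearest-rank rule takes the ceiling of p/100 * n as a 1-based rank. The helper names are made up for this example.

```python
import math

data = sorted([3, 6, 7, 8, 8, 10, 13, 15, 16, 20])
n = len(data)

def lower_index(p, n):
    # Assumed rule behind interpolation='lower': floor of the
    # fractional 0-based index p/100 * (n - 1).
    return math.floor(p * (n - 1) / 100)

def nearest_rank_index(p, n):
    # Nearest-rank rule: 1-based rank ceil(p/100 * n), at least 1,
    # converted to a 0-based index.
    return max(math.ceil(p * n / 100), 1) - 1

for p in (25, 50, 75, 100):
    print(p, data[lower_index(p, n)], data[nearest_rank_index(p, n)])
```

The two rules differ only at the 75th percentile here: the first scales by n - 1 = 9 and the second by n = 10, so they land on different elements.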

Nearest rank sounds a lot like "nearest"? Though there is always another point about how exactly the boundaries work.
Edit: that is, where exactly are 0 and 100 considered to be, at the data point or before the data point? (That is IIRC; in any case there are a lot of annoying complexities here.)

I don't feel like reading it, but I think the difference might be the C parameter further down, so if someone who knows this wants to add it...

Frankly, I think adding the C parameter would actually be good. But mostly, better documentation would be nice, and someone who really knows this stuff is needed...

I don't know whether this has anything to do with the C parameter, although I agree that an option to choose it could be desirable.

I found another thread that inadvertently brought up this issue (Dec 2016). The algorithm I am looking for (the one Wikipedia calls nearest rank) appears to be the one discussed in the commonly cited paper by Hyndman and Fan (H&F).

Here is how it looks against the other options provided by numpy that intuitively seem to compute something similar (i.e. 'lower', 'nearest'):

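[Figure: percentile curves for the H&F nearest-rank method compared with numpy's 'lower' and 'nearest' options]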

To me it looks exactly like the C parameter at first sight. The 'nearest' curve is more stretched than the H&F curve, which is expected since numpy uses 1 and H&F apparently uses 0.

If you want proof: repeat the whole thing with the same values repeated 1000 times; my guess is that they will converge.
Edit: or maybe not; I don't have the patience or time to really think about it. But I still think it is the C parameter that Wikipedia mentions, so please prove me wrong :)

A graph like that would be a great addition to the percentile docs.

Edit: preferably one showing the open/closed-ness of the discontinuities.

Note for readers: to keep this thread manageable, I have marked all discussion about adding this graph to the docs as "resolved". The graph is now at the bottom of https://numpy.org/devdocs/reference/generated/numpy.percentile.html.

@eric-wieser I don't mind making that graph. I will come back with it later today; can I post it here?

@seberg To be honest, I am not sure how the interpolation is computed based on the C parameter. What makes me think it is unrelated is that the C parameter is only discussed in the linear interpolation section (Wikipedia), and both Wikipedia and the Hyndman & Fan paper discuss the algorithm I requested in sections separate from interpolation.

I don't know whether there is any interpolation parameter that always gives the same results as the algorithm I am interested in.

Even if there is, should this be the way to get to it? Changing a 'weird' parameter to get the most common definition of percentile does not seem like the best way to implement it.

@ricardoV94 Maybe, but you can't just change the defaults, no matter how bad they are. We could expose something like method="H&K" to override both parameters at once.

The C parameter is where you define 0% and 100% to be with respect to the data points (on the data point or not, etc.). As the parameter C on Wikipedia, it may well apply only to interpolation, but I am pretty sure the same issue causes the difference here. C is of course a dubious name; a proper name might be something like range='min-max' or range='extrapolated', or something completely different. As I said, redo the plots with many data points (possibly with tiny noise) and I think you will see them converge, since the range definition matters less.
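As a rough illustration of what C changes, the sketch below uses the rank formula given in the linear interpolation section of the Wikipedia page, x = (N + 1 - 2C) * p + C, and interpolates between neighbouring sorted values. The function name and the clamping of the rank to [1, N] are choices made for this example; it is not how numpy implements anything.

```python
import math

def percentile_with_c(data, p, C=1.0):
    """Linear-interpolation percentile with rank x = (N + 1 - 2C) * p + C.

    p is a fraction in [0, 1]; C = 1, 0.5 and 0 are the three variants
    described on the Wikipedia page.
    """
    values = sorted(data)
    n = len(values)
    x = (n + 1 - 2 * C) * p + C      # fractional 1-based rank
    x = min(max(x, 1.0), float(n))   # clamp to the data range
    lo = math.floor(x)
    frac = x - lo
    hi = min(lo + 1, n)
    return values[lo - 1] + frac * (values[hi - 1] - values[lo - 1])

sample = [15, 20, 35, 40, 50]
for C in (1.0, 0.5, 0.0):
    print(C, percentile_with_c(sample, 0.40, C))
```

With C = 1 the smallest and largest values sit exactly at the 0th and 100th percentile; smaller C pushes those end points outside the data range, which is why the 40th percentile differs between the three variants.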

@seberg I am fine with method="H&K" or method="classic". interpolation="none" could also make sense.

I am not sure what the mechanism is to include images in the documentation, or whether there is any precedent for doing this.

I know you can run matplotlib code within the docs, which is how we do it elsewhere. This also ensures it is kept in sync with reality.

Okay, in that case I will think about the best code-generated image.

The most problematic part is the open and closed markers for the discontinuities, since matplotlib has no built-in function for that (afaik). Hard-coding them would make no sense in this case.

Perhaps skip those for now. It would be nice if matplotlib had some automatic support for them.

Hopefully someone can come up with a better suggestion that is still graceful for the discontinuities.

import numpy as np
import matplotlib.pyplot as plt

a = [0,1,2,3]
p = np.arange(101)

plt.step(p, np.percentile(a, p, interpolation='linear'), label='linear')
plt.step(p, np.percentile(a, p, interpolation='higher'), label='higher', linestyle='--')
plt.step(p, np.percentile(a, p, interpolation='lower'), label='lower', linestyle='--')
plt.step(p, np.percentile(a, p, interpolation='nearest'), label='nearest', linestyle='-.',)
plt.step(p, np.percentile(a, p, interpolation='midpoint'), label='midpoint', linestyle='-.',)

plt.title('Interpolation methods for list: ' + str(a))
plt.xlabel('Percentile')
plt.ylabel('List item returned')
plt.yticks(a)
plt.legend()

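[Figure: output of the plotting code above, "Interpolation methods for list: [0, 1, 2, 3]"; x axis: Percentile, y axis: List item returned]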

I'd argue that interpolation='linear' should be a regular line, not a stepped one, but otherwise that looks good. Can you make a PR adding it to the docs?

In fact, step generally causes misleading artifacts, so I'd be inclined to avoid it. linspace(0, 100, 60) would also produce more accurate intermediate coordinates.

I don't know how to make a PR.

Feel free to add it with your account, suggesting or arguing for any changes.

I guess you can change C with something like this (test it on something). Call the function on your percentiles, then plug the result into the numpy version (which uses C=1, so this is a no-op for now except that it corrects out-of-bound percentiles).

def scale_percentiles(p, num, C=0):
     """
     p : float
          percentiles to be used (within 0 and 100 inclusive)
     num : int
         number of data points.
     C : float
         parameter C, should be 0, 0.5 or 1. Numpy uses 1, matlab 0.5, H&K is 0.
     """
     p = np.asarray(p)
     fact = (num-1.+2*C)/(num-1)
     p *= fact
     p -= 0.5 * (fact-1) * 100
     p[p < 0] = 0
     p[p > 100] = 100
     return p

And voilà, with "nearest" you get your "H&F" and with linear you get the plot from Wikipedia (unless I got something wrong, but I am pretty sure I got it right).

As I said, the difference is where you place the data points on 0-100 (evenly) with respect to the end points. For C=1 you put min(data) at the 0th percentile, and so on. I have no clue about "what makes more sense"; it probably depends a bit on the general view. The names inclusive for 1 and exclusive for 0 make some sense, I guess (when you think about the total range of percentiles, since the exclusive possible range lies outside the data range). C=1/2 is also exclusive in that sense, though.

I would be in favor of adding the C parameter, but I would want someone to come up with a descriptive name if possible. I also would not mind something like a "method" to make the best defaults obvious (combinations of interpolation + C). Or you basically decide that most combinations are never used and not useful.

In the end my problem is this: I want a statistician to tell me which methods have consensus (R has some material, but the last time someone came around here it was just a copy-paste of the R documentation or similar, without putting it into the numpy context at all. Needless to say, that was useless to a general audience; citing papers would have been more helpful).

I don't want to read that H&F paper (honestly, it also does not look very smooth to read), but I think you could also look at it from a support point of view. The numpy "nearest" (or any other) version does not have identical support (in percentiles) for each data point; H&F has equal support for "nearest", and maybe for midpoint it would be C=1/2. Not sure.
I keep repeating myself: I don't know whether such a support argument (against C=1 as numpy uses it) is actually a real reason.

Edit: midpoint has equal support in numpy (for the area between data points, not for the data point itself), so with "C=1".
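The support argument can be checked numerically. The sketch below counts, on a grid of 100 percentiles, how many map to each of four data points under two rules. It assumes that numpy's 'nearest' rounds the fractional index p/100 * (n - 1), and it uses the nearest-rank rule ceil(p/100 * n) for comparison. The helper names are hypothetical and the grid avoids exact ties.

```python
import math
from collections import Counter

n = 4  # e.g. a = [0, 1, 2, 3]
grid = [k + 0.5 for k in range(100)]  # percentiles 0.5, 1.5, ..., 99.5

def numpy_nearest_index(p, n):
    # Assumption: 'nearest' rounds the fractional index p/100 * (n - 1).
    return math.floor(p / 100 * (n - 1) + 0.5)

def nearest_rank_index(p, n):
    # Nearest-rank rule: 1-based rank ceil(p/100 * n), as a 0-based index.
    return max(math.ceil(p / 100 * n), 1) - 1

support_numpy = Counter(numpy_nearest_index(p, n) for p in grid)
support_rank = Counter(nearest_rank_index(p, n) for p in grid)
print([support_numpy[i] for i in range(n)])
print([support_rank[i] for i in range(n)])
```

Under the assumed 'nearest' rule the two end points each receive roughly half the share of an interior point, while nearest rank gives every data point an equal quarter of the percentile range.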

@seberg It does not seem to work for me. Can you post code that shows it working?

Well, I got the sign wrong in the code up there, so it was the opposite (C=0 was the no-op, not C=1):

def scale_percentiles(p, num, C=0):
     """
     p : float
          percentiles to be used (within 0 and 100 inclusive)
     num : int
         number of data points.
     C : float
         parameter C, should be 0, 0.5 or 1. Numpy uses 1, matlab 0.5, H&F is 0.
     """
     p = np.asarray(p)
     fact = (num+1.-2*C)/(num-1)
     p *= fact
     p -= 0.5 * (fact-1) * 100
     p[p < 0] = 0
     p[p > 100] = 100
     return p
plt.figure()
plt.plot(np.percentile([0, 1, 2, 3], scale_percentiles(np.linspace(0, 100, 101), 5, C=0), interpolation='nearest'))
plt.plot(np.percentile([0, 1, 2, 3], scale_percentiles(np.linspace(0, 100, 101), 5, C=1), interpolation='nearest'))
plt.figure()
plt.plot(np.percentile([15, 20, 35, 40, 50], scale_percentiles(np.linspace(0, 100, 101), 5, C=1), interpolation='linear'))
plt.plot(np.percentile([15, 20, 35, 40, 50], scale_percentiles(np.linspace(0, 100, 101), 5, C=0.5), interpolation='linear'))
plt.plot(np.percentile([15, 20, 35, 40, 50], scale_percentiles(np.linspace(0, 100, 101), 5, C=0), interpolation='linear'))

@seberg Close, but not there yet. For a = [0,1,2,3] and percentiles = [25, 50, 75, 100], np.percentile(a, scale_percentiles(percentiles, len(a), C=0), interpolation='nearest') returns [0, 2, 3, 3], when it should return [0,1,2,3].

I had to give the percentiles list dtype=np.float, otherwise your function raises an error, but I don't think that is the issue.
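The error mentioned here is presumably caused by the in-place `p *= fact` inside scale_percentiles: np.asarray on a list of integers yields an integer array, and NumPy refuses to write a float result back into it. The sketch below reproduces that behaviour in isolation and shows the float conversion that avoids it. It uses the builtin float, since the np.float alias was removed in NumPy 1.24, and the factor 1.5 is an arbitrary stand-in.

```python
import numpy as np

percentiles = [25, 50, 75, 100]

p_int = np.asarray(percentiles)        # integer dtype
try:
    p_int *= 1.5                       # in-place float scaling
    in_place_ok = True
except TypeError:
    # NumPy refuses to cast the float result back into an int array.
    in_place_ok = False

p_float = np.asarray(percentiles, dtype=float)  # converted copy, safe to modify
p_float *= 1.5
print(in_place_ok, p_float)
```

Converting with dtype=float also creates a copy of the list, so the caller's percentiles are not modified by the in-place operations.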

The function for the classical method is simple:
percentile / 100 * N -> if it is a whole number, that is the index; if not, use the ceiling as the index.

Despite this, the C argument seems to work as expected, so it could be implemented if people want to use it for interpolation. I would still like a method='classic' or interpolation='none' that works like the Wikipedia one.

For debugging, this is my ugly non-numpy implementation of the classical method:

def percentile (arr, p):
    arr = sorted(arr)

    index = p /100 * len(arr)

    # If index is a whole number, and larger than zero, subtract one unit (due to 0-based indexing)
    if index%1 < 0.0001 and index//1 > 0:
        index -= 1

    return arr[int(index)]

And a more numpythonic one:

def indexes_classic(percentiles, set_size):
    percentiles = np.asarray(percentiles)

    indexes = percentiles / 100* set_size
    indexes[np.isclose(indexes%1, 0)] -= 1
    indexes = np.asarray(indexes, dtype=int)
    indexes[indexes < 0] = 0
    indexes[indexes > 100] = 100

    return indexes
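Both implementations above decide whether the index is a whole number with a tolerance (0.0001 or np.isclose), because p / 100 * N is not always exact in floating point. As an alternative sketch, the rank can be computed with exact rational arithmetic from the standard library; the function names are made up for this example.

```python
import math
from fractions import Fraction

def rank_float(p, n):
    # Naive floating-point rank: ceil(p / 100 * n)
    return math.ceil(p / 100 * n)

def rank_exact(p, n):
    # Same rank computed with exact rational arithmetic
    return math.ceil(Fraction(p) / 100 * n)

print(rank_float(7, 100), rank_exact(7, 100))
```

For the 7th percentile of 100 values the floating-point product lands slightly above 7, so the naive ceiling is off by one, while the exact version returns rank 7. This only helps when p itself is given exactly, for example as an integer or a Fraction.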

Those differences sound like floating point / rounding issues (which you seem to be aware of), and maybe my guess of C=0 was wrong and you want C=0.5. My point was to say where the difference comes from (the "C parameter", IMO, although there are probably good reasons to dislike many of the combinations). It was not to provide or implement a workaround.

As for the "classic" method, I frankly don't care much what classic is supposed to be. For all I know, classic just means "rather many people use it".

Solution-wise, my first impression is that "classic", or whatever name, just adds another confusing option with an unclear name. I hope this discussion can go in the direction of actually making all the good (common) options available to users in a clean and transparent way, ideally one that people can actually understand.

We can add one more method, but frankly I only half like it. When we last added more methods (I don't remember exactly what changed), I already delayed and hoped that someone would jump up and figure out what we should have. Needless to say, that never really happened. And now I am trying to point out the differences and see how it might fit with what we currently have.

So my impression is that (leaving aside possible issues with rounding and exact percentile matches) we have many "interpolation" options (probably too many) and would need the "C parameter", or whatever you want to call it, to be able to do almost everything. I would be really happy if someone could tell me how all the (common) "methods" out there fall into these categories. It seems that even more than C=0, 0.5, 1 exist, and maybe some lie outside those options...

Maybe I am going down the wrong lane, but adding "Method1" with an unclear name that does not tell anyone how it differs from the other methods does not seem helpful (except for someone who already knows the name "Method1" and is looking for it). And please don't say that "classic" is the one obvious choice, because there is too much variation among implementations.

Another way might be to deprecate "interpolation", but having a list of methods is also much less nice than hinting at "linear interpolation" to say that it is not stepped behaviour, etc. And if we go that way, I still want a reasonable overview.

You don't have to do it, but if we want to add a new method, we need a way to add it that is clear and does not confuse everyone even more!

Let me sum it up then:

1) Right now numpy offers only one useful method, interpolation='linear'; the others are small variations around it that nobody seems to actually use. Other packages have many more relevant options.

2) Adding the other values, C=0 or C=0.5, makes sense to me. All the interpolation methods can work in combination with them, although they will probably never be used.

3) If one of the combinations of interpolation method and C argument manages to replicate the classical method (the reference, Wikipedia, and my personal experience agree that this is the most commonly taught method), then I am happy with that. The docs can state that this combination produces the classical non-interpolating method. I am not sure whether it is just due to float precision issues, but I appreciate your effort to tackle this in a more integrated manner!

4) If no combination achieves the same result, then I think a different method would make sense. Calling it interpolation='none' would probably be the least confusing.

In sum, the current options of numpy.percentile seem both rather confusing and limited. The paper mentioned above offers a good overview of other useful methods. Together with the Wikipedia page, it could serve as a starting point for designing a more exhaustive and useful set of options for numpy.percentile. Hopefully someone would like to work on this.

Does the current "nearest" make sense in some or any cases? If the spacing method ("C") or whatever makes such a big difference for linear interpolation / fractional results, I am surprised that nobody ever did it for non-fractional approximations. Is constant support all that important, and is it a reason to drop the inverse-CDF argument for the interpolation methods?

A combination is useless if it is not understandable and the commonly used ones are not easy to find. For interpolation, many options seem to exist (e.g. http://mathworld.wolfram.com/Quantile.html Q4 to Q9; I think the R documentation is practically identical, but it is probably not complete, like matlab...), though I have no clue whether they all actually make sense.

The thing is that "interpolation" points to what to do between exactly defined points, but there are several (weirdly many) ways to place those points, at least when using "linear interpolation", so adding to it seems like the wrong approach. You wanted "nearest rank", which sounds a lot like interpolation="nearest", but the choice of the exact "plotting position" seems to be "non-standard", so it is impossible to guess and thus a bad choice.

In that case I would even prefer to aggressively deprecate it all (except linear, probably).

But if we deprecate, I want to get it 100% right, and that may need a bit more clarity about what exists, what should exist, and what definitely should not exist.

I completely agree.

@ricardoV94: Do you have an opinion on the definition of linear for the weighted quantile case proposed in #9211? There are some graphs in the same style there.

Perhaps @ricardoV94 can comment on that.

You could also try pinging josef-pkt on that PR and hope he has a quick comment on whether he thinks it is a good idea / correct.

If anyone wants to pick it up from here, I wrote a non-optimized function that computes the 9 percentile/quantile estimation methods described by Hyndman and Fan (1996) and also used in R.

Method 1 corresponds to the 'classical nearest rank method' discussed on Wikipedia. Method 7 is equivalent to the current Numpy implementation (interpolation='linear'). The remaining Numpy interpolation methods are not included (they do not seem to be useful anyway).

def percentile(x, p, method=7):
    '''
    Compute the qth percentile of the data.

    Returns the qth percentile(s) of the array elements.

    Parameters
    ----------
    x : array_like
        Input array or object that can be converted to an array.
    p : float in range of [0,100] (or sequence of floats)
        Percentile to compute, which must be between 0 and 100 inclusive.
    method : integer in range of [1,9]
        This optional parameter specifies one of the nine sampling methods 
        discussed in Hyndman and Fan (1996). 

        Methods 1 to 3 are discontinuous:
        * Method 1: Inverse of empirical distribution function (oldest
        and most studied method).
        * Method 2: Similar to type 1 but with averaging at discontinuities.
        * Method 3: SAS definition: nearest even order statistic.

        Methods 4 to 9 are continuous and equivalent to a linear interpolation 
        between the points (pk,xk) where xk is the kth order statistic. 
        Specific expressions for pk are given below:
        * Method 4: pk=k/n. Linear interpolation of the empirical cdf.
        * Method 5: pk=(k-0.5)/n. Piecewise linear function where the knots
        are the values midway through the steps of the empirical cdf
        (Popular amongst hydrologists, used by Mathematica?).
        * Method 6: pk=k/(n+1), thus pk=E[F(xk)]. The sample space is divided
        in n+1 regions, each with probability of 1/(n+1) on average
        (Used by Minitab and SPSS).
        * Method 7: pk=(k-1)/(n-1), thus pk=mode[F(xk)]. The sample space
        is divided into n-1 regions (This is the default method of
        Numpy, R, S, and MS Excel).
        * Method 8: pk=(k-1/3)/(n+1/3), thus pk≈median[F(xk)]. The resulting
        estimates are approximately median-unbiased regardless of the
        distribution of x (Recommended by Hyndman and Fan (1996)).
        * Method 9: pk=(k-3/8)/(n+1/4), thus pk≈F[E(xk)] if x is normal (?).
        The resulting estimates are approximately unbiased for the expected
        order statistics if x is normally distributed (Used for normal QQ plots).

        References:
        Hyndman, R. J. and Fan, Y. (1996) Sample quantiles in statistical packages, 
        American Statistician 50, 361--365.
        Schoonjans, F., De Bacquer, D., & Schmid, P. (2011). Estimation of population
        percentiles. Epidemiology (Cambridge, Mass.), 22(5), 750.

        '''

    method = method-1    
    x = np.asarray(x)
    x.sort()
    p = np.array(p)/100

    n = x.size  
    m = [0, 0, -0.5, 0, 0.5, p, 1-p, (p+1)/3, p/4+3/8][method]

    npm = n*p+m
    j = np.floor(npm).astype(int)
    g = npm-j

    # Discontinuous functions
    if method < 3:
        yg0 = [0, 0.5, 0][method]
        y = np.ones(p.size)
        if method < 2:
            y[g==0] = yg0
        else:
            y[(g==0) & (j%2 == 0)] = yg0      
    # Continuous functions
    else:
        y = g

    # Adjust indexes to work with Python
    j_ = j.copy()
    j[j<=0] = 1
    j[j > n] = n
    j_[j_ < 0] = 0
    j_[j_ >= n] = n-1 

    return (1-y)* x[j-1] + y*x[j_]

The continuous methods can also be implemented more efficiently, like this:

def percentile_continuous(x, p, method=7):
    '''
    Compute the qth percentile of the data.

    Returns the qth percentile(s) of the array elements.

    Parameters
    ----------
    x : array_like
        Input array or object that can be converted to an array.
    p : float in range of [0,100] (or sequence of floats)
        Percentile to compute, which must be between 0 and 100 inclusive.
    method : integer in range of [4,9]
        This optional parameter specifies one of the 6 continuous sampling
        methods discussed in Hyndman and Fan (1996). 
        '''

    x = np.asarray(x)
    x.sort()
    p = np.asarray(p)/100
    n = x.size

    if method == 4:
        r = p * n
    elif method == 5:
        r = p * n + .5
    elif method == 6:
        r = p * (n+1)
    elif method == 7:
        r = p * (n-1) + 1
    elif method == 8:
        r = p * (n+1/3) + 1/3
    elif method == 9:
        r = p * (n+1/4) + 3/8

    index = np.floor(r).astype(int)

    # Adjust indexes to work with Python
    index_ = index.copy()
    index[index_ <= 0] = 1
    index[index_  > n] = n
    index_[index_ < 0] = 0
    index_[index_ >= n] = n-1

    i = x[index - 1]
    j = x[index_]

    return i + r%1* (j-i)

Does anybody want to take it from here? I am not qualified to do so.

As mentioned in a previous post, numpy's current default quantile implementation appears to match R's.

R :

> quantile(c(15, 20, 35, 40, 50), probs=c(0.05, 0.3, 0.4, 0.5, 1))
  5%  30%  40%  50% 100% 
  16   23   29   35   50 
> quantile(c(3, 6, 7, 8, 8, 10, 13, 15, 16, 20), probs=c(0.25, 0.5, 0.75, 1))
  25%   50%   75%  100% 
 7.25  9.00 14.50 20.00
> quantile(c(3, 6, 7, 8, 8, 9, 10, 13, 15, 16, 20), probs=c(0.25, 0.5, 0.75, 1))
 25%  50%  75% 100% 
 7.5  9.0 14.0 20.0 

np.quantile :

>>> np.quantile([15, 20, 35, 40, 50], q=[0.05, 0.3, 0.4, 0.5, 1])
array([16., 23., 29., 35., 50.])
>>> np.quantile([3, 6, 7, 8, 8, 10, 13, 15, 16, 20], q=[0.25, 0.5, 0.75, 1])
array([ 7.25,  9.  , 14.5 , 20.  ])
>>> np.quantile([3, 6, 7, 8, 8, 9, 10, 13, 15, 16, 20], q=[0.25, 0.5, 0.75, 1])
array([ 7.5,  9. , 14. , 20. ])

Of course, neither reproduces the examples given on Wikipedia:
https://en.wikipedia.org/wiki/Percentile

Indeed, if you go to the R help page for quantile (https://www.rdocumentation.org/packages/stats/versions/3.5.0/topics/quantile), you will see that the R default method (type 7) sets the boundary conditions the same way np.quantile does: p_k = (k-1) / (n-1), where n is the sample size, k = 1 denotes the smallest value, and k = n the largest. This means that the smallest value in the sorted array is pinned at quantile = 0 and the largest at quantile = 1.
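The type 7 rule can be read as linear interpolation between the knots (p_k, x_k). The following pure-Python sketch builds those knots explicitly and interpolates between them. It is only an illustration of the formula p_k = (k-1)/(n-1) quoted here, with a made-up function name, and it assumes at least two data points.

```python
def type7_quantile(data, q):
    """Interpolate linearly between the knots (p_k, x_k), p_k = (k-1)/(n-1)."""
    x = sorted(data)
    n = len(x)
    knots = [(k - 1) / (n - 1) for k in range(1, n + 1)]  # p_1 = 0, p_n = 1
    for k in range(1, n):
        left, right = knots[k - 1], knots[k]
        if left <= q <= right:
            t = (q - left) / (right - left)
            return x[k - 1] + t * (x[k] - x[k - 1])
    raise ValueError("q must lie in [0, 1]")

sample = [15, 20, 35, 40, 50]
print([type7_quantile(sample, q) for q in (0.05, 0.3, 0.4, 0.5, 1)])
```

Up to floating-point rounding, this gives the same values as the default R and np.quantile outputs shown in this thread for the first example.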

Also as mentioned in a previous post, you can reproduce the 3 Wikipedia examples with type 1:

> quantile(c(15, 20, 35, 40, 50), probs=c(0.05, 0.3, 0.4, 0.5, 1), type=1)
  5%  30%  40%  50% 100% 
  15   20   20   35   50 
> quantile(c(3, 6, 7, 8, 8, 10, 13, 15, 16, 20), probs=c(0.25, 0.5, 0.75, 1), type=1)
 25%  50%  75% 100% 
   7    8   15   20 
> quantile(c(3, 6, 7, 8, 8, 9, 10, 13, 15, 16, 20), probs=c(0.25, 0.5, 0.75, 1), type=1)
 25%  50%  75% 100% 
   7    9   15   20 

This raises some interesting questions:

1.) Should the default of np.quantile track the default of R's quantile?
2.) Should np.quantile switch to the type 1 algorithm?

Since even Wikipedia itself agrees that there is no standard definition of percentile, I think that as long as the algorithm is sound and the user is aware of how it works, neither (1) nor (2) matters that much. I am more in favor of (1), because Python and R are two of the most popular data analysis platforms and it would be nice if they could sanity-check each other. Given that, I think (2) is unnecessary.

Yes, both R and Numpy default to method 7, and that should be kept. The question is whether to add the other methods.

If anyone is interested, I have put together a standalone module with the 9 percentile methods. Feel free to use it, or adapt it to Numpy if you know how.

Thanks @ricardoV94.

So, just for kicks, I ran a poll of R users. Of the 20 people who responded, all 20 use only the default method of quantile. They range from master's students in public health to PhD researchers in statistics.

Personally, I am not sure it is worth the effort for numpy to support 9 different ways of computing quantiles. I think most users will just use the default.

For what it's worth, there is the scipy.stats.mstats.mquantiles function, which supports 6 of the 9 methods (the continuous ones), and its documentation states the links with the R implementation very explicitly.

@albertcthomas Ah, good to know. Ideally, though, I think we would hide this complexity a bit in numpy. And IIRC we mostly need to fix the non-continuous versions, because those basically do not provide the most common methods.

Yes, indeed, numpy does not necessarily need to support these methods if they are implemented in the scipy stats module.

Personally, I would be in favor of having a method that computes quantiles from the generalized inverse of the cumulative distribution function. The fact that such a method is not available is what brought me to this issue :).

@albertcthomas If you have any hints or knowledge about this, please say so! We are a bit stuck because it is not clear what good defaults actually are, and I think that is a pretty annoying issue.

Mostly, we probably need a few good defaults, and that probably means implementing 2-3 methods (overhauling the non-continuous ones completely). I am fine with supporting more, or more complex, things, but I would love it if we could settle on a few "typical/good" ones.

I would say the linear method (the current default) and the inverse of the cumulative distribution function (which is what I was looking for when I created this issue) would suffice. Basically, this lets one choose whether they want interpolation or not.

And the other alternatives currently implemented should definitely be removed.

The inverse of the cumulative distribution function should definitely be added. It is one of the most popular estimators of a quantile from a given sample of observations in statistics.
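For readers unfamiliar with the term, the generalized inverse of the empirical CDF returns the smallest sample value x for which the fraction of observations less than or equal to x is at least p. The sketch below is a direct, unoptimized pure-Python reading of that definition; the function name is made up, and it is not a proposal for how numpy should implement it.

```python
def inverse_ecdf(data, p):
    """Smallest sample value x whose empirical CDF F_n(x) is at least p."""
    values = sorted(data)
    n = len(values)
    for count, x in enumerate(values, start=1):
        # count / n equals F_n(x) at the last occurrence of x; at earlier
        # duplicates it is smaller, which only delays the match.
        if count / n >= p:
            return x
    return values[-1]

example = [3, 6, 7, 8, 8, 9, 10, 13, 15, 16, 20]
print([inverse_ecdf(example, p) for p in (0.25, 0.5, 0.75, 1)])
```

On the three Wikipedia examples this returns the same values as the R type=1 output quoted in this thread, which is consistent with type 1 being described as the inverse of the empirical distribution function.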

"And the other alternatives currently implemented should definitely be removed."

@ricardoV94 Are you saying this because none of the alternatives is referenced on Wikipedia or in Hyndman and Fan's paper?

Yes, they are not implemented in other packages.

I don't see why anyone would want to use those methods, and their names are also potentially misleading.

Thanks! Why not open a PR to add the inverse of the cumulative distribution function as a method available in np.percentile, while keeping this issue open in case we want to keep discussing the alternatives (except the current default, which should remain the default)? How are deprecations handled in numpy?

Some more information here: Python 3.8 adds statistics.quantiles. We should look at adding an equivalent mode to np.quantile.
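For reference, this is roughly what the standard-library function looks like in use (Python 3.8 or later). It returns the n - 1 cut points that split the data into n groups, and its method argument takes 'exclusive' (the default) or 'inclusive'.

```python
import statistics

data = [3, 6, 7, 8, 8, 10, 13, 15, 16, 20]

# method='inclusive' treats the data as including the population
# minimum and maximum; method='exclusive' is the default.
inclusive = statistics.quantiles(data, n=4, method='inclusive')
exclusive = statistics.quantiles(data, n=4, method='exclusive')
print(inclusive, exclusive)
```

On this sample the 'inclusive' quartiles agree with the np.quantile output quoted earlier in this thread, while 'exclusive' places the first and third quartiles further out.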

The way forward here is probably to add a method kwarg mirroring the statistics one, and possibly add 0-2 more (in which case it would be good to ping the original authors over at Python).

I am not sure whether the defaults match between ours and theirs; it would be a shame if they don't, but it still seems like the best idea (and pretty much what we had in mind anyway). Adding 0-2 new "methods" would also be OK, in which case it would be good to ping the Python statistics people about the actual names.

PRs are very welcome. I would like this to move forward, but I won't do it in the near future.

@eric-wieser You have a couple of related PRs.

I am pushing this off to 1.19 so that it is not a blocker. But that does not mean it can't be fixed for 1.18 :)

@charris: Which PRs do you have in mind?

Unfortunately, I don't think there is one in this direction yet.
