Numpy: reduceat cornercase (Trac #236)

Created on 19 Oct 2012  ·  49 comments  ·  Source: numpy/numpy

_Original ticket http://projects.scipy.org/numpy/ticket/236 on 2006-08-07 by trac user martin_wiechert, assigned to unknown._

.reduceat does not handle repeated indices correctly. When an index is repeated, the neutral element of the operation should be returned. In the example below, [0, 10] is expected, not [1, 10].

In [1]:import numpy

In [2]:numpy.version.version
Out[2]:'1.0b1'

In [3]:a = numpy.arange(5)

In [4]:numpy.add.reduceat(a, (1,1))
Out[4]:array([ 1, 10])
01 - Enhancement 23 - Wish List numpy.core


_@teoliphant wrote on 2006-08-08_

Unfortunately, perhaps, the reduceat method of NumPy follows the behavior of the reduceat method of Numeric for this corner case.

There is no facility for returning the "identity" element of the operation in cases of index equality. The defined behavior is to return the element given by the first index if the slice returns an empty sequence. Therefore, the documented and actual behavior of reduceat in this case is to construct

[a[1], add.reduce(a[1:])]

This is a feature request.
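The difference between the current result and the requested one can be reproduced in a few lines. This is a minimal sketch that uses only np.add.reduceat and np.add.reduce; the variable names are illustrative.

```python
import numpy as np

a = np.arange(5)

# Current behaviour: a repeated index yields the element at that index.
actual = np.add.reduceat(a, [1, 1])

# What the ticket asks for: reduce each slice, so that the empty slice
# a[1:1] gives the identity of np.add (0).
requested = np.array([np.add.reduce(a[1:1]), np.add.reduce(a[1:])])

print(actual)     # [ 1 10]
print(requested)  # [ 0 10]
```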

_trac user martin_wiechert wrote on 2006-08-08_

See also ticket #835.

Milestone changed to 1.1 by @alberts on 2007-05-12.

Milestone changed to Unscheduled by @cournape in March 2009.

I think this is closely connected to #835: if one of the indices is len(a), reduceat cannot output the element at that index, which is needed if the index len(a) appears or is repeated at the end of the indices.

Some solutions:

  • an option to reduceat to not set any value in the output where end - start == 0
  • an option to set the output to a given fixed value where end - start == 0
  • a where parameter, like in ufunc(), which masks which outputs should be calculated at all.

Has there been any more thought on this issue? I would be interested in having the option to set the output to the identity value (if it exists) where end - start == 0.

I strongly support the change of the reduceat behaviour as suggested in this long-standing open issue. It looks like a clear bug or an obvious design mistake that hinders the usefulness of this great Numpy construct.

reduceat should behave consistently for all indices. Namely, for every index i, the i-th element of ufunc.reduceat(a, indices) should be ufunc.reduce(a[indices[i]:indices[i+1]]).

This should also be true for the case indices[i] == indices[i+1]. I cannot see any sensible reason why, in this case, reduceat should return a[indices[i]] instead of ufunc.reduce(a[indices[i]:indices[i+1]]).
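The rule stated above can be written as a slow reference implementation. This is only a sketch: reduceat_consistent is a hypothetical helper name, it handles 1-D input only, and it loops in Python rather than in C.

```python
import numpy as np

def reduceat_consistent(ufunc, a, indices):
    """Hypothetical helper: reduce a[indices[i]:indices[i+1]] for every i.

    The last segment runs to the end of `a`. Empty segments give the
    ufunc's identity (0 for np.add) instead of a[indices[i]].
    """
    a = np.asarray(a)
    stops = list(indices[1:]) + [len(a)]
    return np.array([ufunc.reduce(a[start:stop])
                     for start, stop in zip(indices, stops)])

a = np.arange(5)
print(np.add.reduceat(a, [1, 1, 3]))              # [1 3 7]
print(reduceat_consistent(np.add, a, [1, 1, 3]))  # [0 3 7]
```

The two results differ only in the entry that belongs to the empty segment a[1:1].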

See also a similar comment by Pandas creator Wes McKinney.

Wow, this is indeed terrible and broken.

We'd need some discussion on the mailing list, but I at least would be totally in favor of making that issue a FutureWarning in the next release and fixing the behavior a few releases later. We'd need someone to take the lead on starting that discussion and writing the patch. Perhaps that's you?

Thanks for the supportive response. I can start a discussion if this helps, but unfortunately I am not up to patching the C code.

What do you intend for ufuncs without an identity, such as np.maximum?

For such functions, an empty reduction should be an error, as it already is when you use .reduce() instead of .reduceat().
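The distinction between ufuncs with and without an identity can be checked directly with .reduce() on an empty array. A small sketch, assuming nothing beyond NumPy itself:

```python
import numpy as np

empty = np.array([], dtype=float)

# np.add has an identity (0), so reducing an empty array is well defined.
print(np.add.identity)        # 0
print(np.add.reduce(empty))   # 0.0

# np.maximum has no identity, so the same call raises ValueError.
print(np.maximum.identity)    # None
try:
    np.maximum.reduce(empty)
except ValueError as exc:
    print("ValueError:", exc)
```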

Indeed, the behaviour should be driven by consistency with ufunc.reduce(a[indices[i]:indices[i+1]]), which is what every user would expect. So this does not require new design decisions. It really looks like a long-standing bug fix to me, unless anybody can justify the current inconsistent behaviour.

@njsmith I am unable to sign up to the Numpy list. I sent my address to https://mail.scipy.org/mailman/listinfo/numpy-discussion but I never received any "email requesting confirmation". I am not sure whether one needs special requirements to subscribe...

@divenex: did you check your spam folder? (I always forget to do that...) Otherwise I'm not sure what could be going wrong. There definitely shouldn't be any special requirements to subscribe beyond "has an email address". If you still can't get it to work, then speak up and we'll try to track down the relevant sysadmin. We definitely want to know if it's broken.

A version of reduceat that is consistent with ufunc.reduce(a[indices[i]:indices[i+1]]) would be really, really nice. It would be so much more useful! Either an argument to select the behavior or a new function (reduce_intervals? reduce_segments? ...?) would avoid breaking backwards compatibility.

I'd perhaps be tempted to deprecate np.ufunc.reduceat altogether. It seems more useful to be able to specify a set of start and end indices, to avoid cases where indices[i] > indices[i+1]. Also, the at in the name suggests a much greater similarity to at than actually exists.

What I'd propose as a replacement is np.reducebins (originally np.piecewise_reduce), possibly pure-python, which basically does:

def reducebins(func, arr, start=None, stop=None, axis=-1, out=None):
    """
    Compute (in the 1d case) `out[i] = func.reduce(arr[start[i]:stop[i]])`

    If only `start` is specified, this computes the same reduction as `reduceat` does:

        `out[i]  = func.reduce(arr[start[i]:start[i+1]])`
        `out[-1] = func.reduce(arr[start[-1]:])`

    If only `stop` is specified, this computes:

        `out[0] = func.reduce(arr[:stop[0]])`
        `out[i] = func.reduce(arr[stop[i-1]:stop[i]])`

    """
    # convert to 1d arrays
    if start is not None:
        # np.array(..., copy=False) raises on NumPy >= 2.0 when a copy is needed
        start = np.array(start, ndmin=1, dtype=np.intp)
        assert start.ndim == 1
    if stop is not None:
        stop = np.array(stop, ndmin=1, dtype=np.intp)
        assert stop.ndim == 1

    # default arguments that do useful things
    if start is None and stop is None:
        raise ValueError('At least one of start and stop must be specified')
    elif stop is None:
        # start only means reduce from one index to the next, and the last to the end
        stop = np.empty_like(start)
        stop[:-1] = start[1:]
        stop[-1] = arr.shape[axis]
    elif start is None:
        # stop only means reduce from the start to the first index, and one index to the next
        start = np.empty_like(stop)
        start[1:] = stop[:-1]
        start[0] = 0
    else:
        # TODO: possibly confusing?
        start, stop = np.broadcast_arrays(start, stop)

    # allocate output - not clear how to do this safely for subclasses
    if out is None:
        sh = list(arr.shape)
        sh[axis] = len(stop)
        sh = tuple(sh)
        out = np.empty(shape=sh)

    # below assumes axis=0 for brevity here
    for i, (si, ei) in enumerate(zip(start, stop)):
        func.reduce(arr[si:ei,...], out=out[i, ...], axis=axis)
    return out

This has the nice properties that:

  • np.add.reduce(arr) is the same as np.piecewise_reduce(np.add, arr, 0, len(arr))
  • np.add.reduceat(arr, inds) is the same as np.piecewise_reduce(np.add, arr, inds)
  • np.add.accumulate(arr) is the same as np.piecewise_reduce(np.add, arr, 0, np.arange(1, len(arr) + 1))

Now, does this want to go through the __array_ufunc__ machinery? Most of what needs to be handled should already be covered by func.reduce. The only issue is the np.empty line, which is a problem that np.concatenate shares.
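The three properties can be checked with a compact 1-D stand-in for the proposed function. This is a sketch, not the proposal itself: reducebins_1d is a hypothetical name, it ignores axis and out, and the stop indices for the accumulate case run from 1 to len(arr) so that each window includes its last element.

```python
import numpy as np

def reducebins_1d(func, arr, start=None, stop=None):
    """1-D stand-in for the proposed function (not a NumPy API)."""
    arr = np.asarray(arr)
    if start is None and stop is None:
        raise ValueError("At least one of start and stop must be specified")
    if stop is None:
        # start only: reduce from one index to the next, the last to the end
        start = np.atleast_1d(start)
        stop = np.append(start[1:], len(arr))
    elif start is None:
        # stop only: the first bin starts at 0
        stop = np.atleast_1d(stop)
        start = np.append(0, stop[:-1])
    else:
        start, stop = np.broadcast_arrays(start, stop)
        start, stop = np.atleast_1d(start), np.atleast_1d(stop)
    return np.array([func.reduce(arr[s:e]) for s, e in zip(start, stop)])

arr = np.arange(1, 6)
inds = [0, 2, 3]

print(reducebins_1d(np.add, arr, 0, len(arr)))                    # [15]
print(reducebins_1d(np.add, arr, inds))                           # [3 3 9]
print(reducebins_1d(np.add, arr, 0, np.arange(1, len(arr) + 1)))  # [ 1  3  6 10 15]
```

With strictly increasing indices, the second result equals np.add.reduceat(arr, inds); the two only diverge for repeated or decreasing indices.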

That sounds like a nice solution to me from an API perspective. Even just being able to specify two sets of indices to reduceat would suffice. From an implementation perspective? Well, it is not very hard to change the current PyUFunc_Reduceat to support two sets of inds, if that provides benefit. If we really see the advantage in supporting the accumulate-like use case efficiently, it would not be hard to do that as well.

Marten proposed something similar to this in a similar discussion from ~1 year ago, but he also mentioned the possibility of adding a 'step' option:

http://numpy-discussion.10968.n7.nabble.com/Behavior-of-reduceat-td42667.html

Things I like (so +1 if anyone is counting) from your proposal:

  • Creating a new function, rather than trying to salvage the existing one.
  • Making the start and end indices arguments explicit, rather than magically figuring them out from a multidimensional array.
  • The defaults for the None indices are very neat.

Things I think are important to think hard about for this new function:

  • Should we make 'step' an option? (I'd say yes)
  • Does it make sense for the indices arrays to broadcast, or must they be 1D?
  • Should this be a np function, or a ufunc method? (I think I prefer it as a method)

And from the bike shedding department, I like better:

  • Giving it a more memorable name, but I have no proposal.
  • Using 'start' and 'stop' (and 'step' if we decide to go for it) for consistency with np.arange and Python's slice.
  • Dropping the _indices from the kwarg names.

Jaime


> Use 'start' and 'stop'

Done

> Should we make 'step' an option

Seems like a pretty narrow use case.

> Does it make sense for the indices arrays to broadcast, or must they be 1D?

Updated. More than 1d is obviously bad, but I think we should allow 0d and broadcasting, for cases like accumulate.

> Should this be a np function, or a ufunc method? (I think I prefer it as a method)

Every ufunc method is one more thing for __array_ufunc__ to handle.

The main motivation for reduceat is to avoid a loop over reduce, for maximum speed. So I am not entirely sure that a wrapper of a for loop over reduce would be a very useful addition to Numpy. It would go against the main purpose of reduceat.

Moreover, the logic for the existence and API of reduceat, as a fast vectorized replacement for a loop over reduce, is clean and useful. I would not deprecate it, but rather fix it.

Regarding reduceat speed, let's consider a simple example, similar to some real-world cases I have in my own code where I use reduceat:

n = 10000
arr = np.random.random(n)
inds = np.random.randint(0, n, n//10)
inds.sort()

%timeit out = np.add.reduceat(arr, inds)
10000 loops, best of 3: 42.1 µs per loop

%timeit out = piecewise_reduce(np.add, arr, inds)
100 loops, best of 3: 6.03 ms per loop

This is a time difference of more than 100x, and it illustrates the importance of preserving the efficiency of reduceat.
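The comparison can be rerun without IPython by using the timeit module. This is a sketch under stated assumptions: loop_reduce is an illustrative stand-in for the pure-Python proposal, the indices are made unique so that both approaches compute the same segments, and the measured ratio will depend on the machine and NumPy version.

```python
import timeit
import numpy as np

def loop_reduce(func, arr, inds):
    """Illustrative Python-level loop: one func.reduce call per segment."""
    stops = np.append(inds[1:], len(arr))
    return np.array([func.reduce(arr[s:e]) for s, e in zip(inds, stops)])

rng = np.random.default_rng(0)
n = 10_000
arr = rng.random(n)
# Unique, sorted indices so both approaches agree on every segment.
inds = np.unique(rng.integers(0, n, n // 10))

t_reduceat = timeit.timeit(lambda: np.add.reduceat(arr, inds), number=100)
t_loop = timeit.timeit(lambda: loop_reduce(np.add, arr, inds), number=100)
print(f"reduceat: {t_reduceat / 100 * 1e6:.1f} µs per call")
print(f"loop:     {t_loop / 100 * 1e6:.1f} µs per call")
```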

In summary, I would prioritize fixing reduceat over introducing new functions.

Having start_indices and end_indices, although useful in some cases, is often redundant. I would see it as a possible addition, but not as a fix for the current inconsistent behaviour of reduceat.

I don't think that allowing the start and stop indices to come from different arrays
would make a big difference to efficiency if implemented in C.


> This is a time difference of more than 100x and illustrates the importance of preserving reduceat efficiency.

Thanks for this. I guess I underestimated the overhead associated with the first stage of a reduce call (which only happens once for reduceat).

That is not an argument against a free function, but it is certainly an argument against implementing it in pure Python.

> but not as a fix for the current reduceat inconsistent behaviour.

The problem is that it is tricky to change the behaviour of code that has been around for so long.

Another possible extension: when indices[i] > indices[j], compute the inverse:

    for i, (si, ei) in enumerate(zip(start, stop)):
        if si <= ei:
            # forward segment: ordinary reduction
            func.reduce(arr[si:ei,...], out=out[i, ...], axis=axis)
        else:
            # reversed segment: reduce it forwards, then apply the inverse
            # (func.inverse is part of this proposal, not an existing ufunc attribute)
            func.reduce(arr[ei:si,...], out=out[i, ...], axis=axis)
            func.inverse(func.identity, out[i, ...], out=out[i, ...])

Where np.add.inverse = np.subtract and np.multiply.inverse = np.true_divide. This results in the nice property that

func.reduce(func.reduceat(x, inds_from_0)) == func.reduce(x)

For example

a = [1, 2, 3, 4]
inds = [0, 3, 1]
result = np.add.reduceat(a, inds) # [6, -5, 9] == [(1 + 2 + 3), -(3 + 2), (2 + 3 + 4)]
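For np.add specifically, the signed result in the example can be obtained today from prefix sums, since a reversed segment is just the negative of the forward one. This is a sketch of that special case only, not an implementation of the proposed inverse mechanism.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
inds = np.array([0, 3, 1])

# Prefix sums with a leading 0: csum[k] == a[:k].sum()
csum = np.concatenate(([0], np.cumsum(a)))

starts = inds
stops = np.append(inds[1:], len(a))   # the last segment runs to the end

# For np.add, "reduce then invert" over a reversed segment is the negative
# of the forward sum, which the difference below gives for free.
signed = csum[stops] - csum[starts]
print(signed)                      # [ 6 -5  9]

# Because inds starts at 0, the pieces add up to the full reduction.
print(signed.sum() == a.sum())     # True
```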

> The problem is that it is tricky to change the behaviour of code that has been around for so long.

This is partially why, in the e-mail thread, I suggested giving special meaning to a 2-D array of indices in which the extra dimension is 2 or 3: it is then (effectively) interpreted as a stack of slices. But I realise this is also somewhat messy, and of course one might as well have a reduce_by_slice, slicereduce, or reduceslice method.

p.s. I do think anything that works on many ufuncs should be a method, so that it can be passed through __array_ufunc__ and be overridden.

Actually, a different suggestion that I think is much better: rather than salvaging reduceat, why not add a slice argument (or start, stop, step) to ufunc.reduce!? As @eric-wieser noted, any such implementation means we can just deprecate reduceat altogether, as it would just be

add.reduce(array, slice=slice(indices[:-1], indices[1:]))

(where we are now free to make the behaviour match what is expected for an empty slice)

Here, one would broadcast the slice if it were 0-d, and one might even consider passing in tuples of slices if a tuple of axes was used.

EDIT: changed the above to slice(indices[:-1], indices[1:]) to allow extension to a tuple of slices (slice can hold arbitrary data, so this would work fine).

I would still find a fix to reduceat, making it a proper 100% vectorized version of reduce, the most logical design solution. Alternatively, to avoid breaking code (but see below), an equivalent method named something like reducebins could be created, which would simply be a corrected version of reduceat. In fact, I agree with @eric-wieser that the name reduceat conveys more connection to the at function than there really is.

I do understand the need not to break code. But I must say that I find it hard to imagine that much code depends on the old behavior, given that it simply does not make logical sense, and I would simply call it a long-standing bug. I would expect that code using reduceat either made sure that indices contained no duplicates, to avoid a nonsense result from reduceat, or fixed the output as I did, using out[:-1] *= np.diff(indices) > 0. Of course, I would be interested in a use case where the old behavior/bug was used as intended.
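The workaround mentioned above can be seen on a small array. This sketch assumes np.add, whose identity is 0; multiplying by the mask would not give the right value for a ufunc with a different identity.

```python
import numpy as np

a = np.arange(1, 6)                 # [1 2 3 4 5]
indices = np.array([0, 2, 2, 4])    # index 2 is repeated: one empty segment

out = np.add.reduceat(a, indices)
print(out)                          # [3 3 7 5]

# Zero the entries whose segment is empty. Every entry except the last has
# a following index to compare against, hence out[:-1].
out[:-1] *= np.diff(indices) > 0
print(out)                          # [3 0 7 5]
```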

I am not fully convinced by @mhvk's slice solution, because it introduces a non-standard usage of the slice construct. Moreover, it would be inconsistent with the current design idea of reduce, which is to _"reduce a's dimension by one, by applying ufunc along one axis."_

I also do not see a compelling use case for both start and end indices. In fact, I see the nice design logic of the current reduceat method as conceptually similar to np.histogram, where bins, which _"defines the bin edges"_, are replaced by indices, which also represent bin edges, but in index space rather than value space. reduceat applies a function to the elements contained inside each pair of bin edges. The histogram is an extremely popular construct, but it does not need, and in Numpy does not include, an option to pass two vectors of left and right edges. For the same reason, I doubt there is a strong need for both edges in reduceat or its replacement.

> The main motivation for reduceat is to avoid a loop over reduce, for maximum speed. So I am not entirely sure that a wrapper of a for loop over reduce would be a very useful addition to Numpy. It would go against the main purpose of reduceat.

I agree with @divenex here. The fact that reduceat requires indices to be sorted and non-overlapping is a reasonable constraint to ensure that the loop can be computed in a cache-efficient manner with a single pass over the data. If you want overlapping bins, there are almost certainly better ways to compute the desired operation (e.g., rolling window aggregations).

I also agree that the cleanest solution is to define a new method such as reducebins with a fixed API (and deprecate reduceat), and not to try to squeeze it into reduce, which already does something different.

Hi everyone,

I want to nip in the bud the discussion that this is a bug. This is documented behaviour from the docstring:

For i in ``range(len(indices))``, `reduceat` computes
``ufunc.reduce(a[indices[i]:indices[i+1]])``, which becomes the i-th
generalized "row" parallel to `axis` in the final result (i.e., in a
2-D array, for example, if `axis = 0`, it becomes the i-th row, but if
`axis = 1`, it becomes the i-th column).  There are three exceptions to this:

* when ``i = len(indices) - 1`` (so for the last index),
  ``indices[i+1] = a.shape[axis]``.
* if ``indices[i] >= indices[i + 1]``, the i-th generalized "row" is
  simply ``a[indices[i]]``.
* if ``indices[i] >= len(a)`` or ``indices[i] < 0``, an error is raised.

As such, I oppose any attempt to change the behaviour of reduceat.
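The documented exceptions quoted from the docstring can be observed directly. A short sketch with illustrative values:

```python
import numpy as np

a = np.arange(8)

# indices[1] = 4 is followed by the smaller index 1, so the second entry is
# simply a[4]; the other entries are ordinary slice reductions.
print(np.add.reduceat(a, [0, 4, 1, 5]))   # [ 6  4 10 18]

# An index equal to len(a) is out of bounds and raises an error.
try:
    np.add.reduceat(a, [0, len(a)])
except IndexError as exc:
    print("IndexError:", exc)
```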

A quick github search shows many, many uses of the function. Is everyone here certain that they all use only strictly increasing indices?

Regarding the behaviour of a new function, I would argue that without separate start/stop arrays, the functionality is severely hampered. There are many situations where one would want to measure values in overlapping windows that are not regularly arrayed (so rolling windows would not work), for example, regions of interest determined by some independent method. And @divenex has shown that the performance difference over Python iteration can be massive.

> There are many situations where one would want to measure values in overlapping windows that are not regularly arrayed (so rolling windows would not work).

Yes, but you wouldn't want to use a naive loop such as the one implemented by reduceat. You'd want to implement your own rolling window calculation, storing intermediate results in some way so that it can be done in a single linear pass over the data. But now we're talking about an algorithm that is much more complicated than reduceat.

@shoyer I can envision cases where only some of the ROIs are overlapping. In such cases, writing a custom algorithm would be huge overkill. Let's not forget that our main user base is scientists, who are typically time-poor and need a "good enough" solution, not the absolute optimum. The low constant factors associated with np.reduceat's complexity mean that it would be hard or impossible to get a better solution with pure Python code, which is most often the only code users are willing to write.

@jni Sure, reducing into groups with arbitrary starts and stops could be useful. But it feels like a significant increase in scope to me, and something better suited to another method rather than to a replacement for reduceat (which we certainly want to deprecate, even if we never remove it).

> Sure, reducing into groups with arbitrary starts and stops could be useful. But it feels like a significant increase in scope to me

This seems pretty trivial to me. Right now, we have code that essentially does ind1 = indices[i], ind2 = indices[i + 1]. Changing that to use two different arrays instead of the same one should take very little effort.

And the single-pass behaviour when passed contiguous ranges should be almost exactly as fast as it is right now. The only overhead is one more argument to the nditer.

> This seems pretty trivial to me.

Exactly. Furthermore, it's functionality that users have with reduceat (by using every other index), but would lose with a new function that doesn't support overlap.
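The "every other index" technique referred to above interleaves start and stop indices and keeps the even-numbered outputs. A minimal sketch with illustrative window bounds; it relies on every start being smaller than its stop and every stop being smaller than len(x).

```python
import numpy as np

x = np.arange(8)

# Interleave starts and stops: [s0, e0, s1, e1, ...]. The even-numbered
# outputs are the reductions over x[s:e]; the odd ones are discarded.
starts = np.array([0, 1, 2, 3])
stops = np.array([4, 5, 6, 7])
pairs = np.column_stack((starts, stops)).ravel()   # [0 4 1 5 2 6 3 7]

window_sums = np.add.reduceat(x, pairs)[::2]
print(window_sums)   # [ 6 10 14 18]
```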

Furthermore, a two-index form could emulate the old (bizarre) behavior:

def reduceat(func, arr, inds):
    deprecation_warning()  # placeholder for emitting a DeprecationWarning
    starts = np.asarray(inds)
    stops = np.empty_like(starts)
    stops[:-1] = starts[1:]
    stops[-1] = len(arr)
    np.add(stops, 1, where=stops == starts, out=stops)  # reintroduce the "bug" that we would have to keep
    return reducebins(func, arr, starts, stops)

Meaning we don't have to maintain two very similar implementations.

I am not strongly against starts and stops indices for the new reducebins, although I still cannot see an obvious example where both are needed. It feels like generalizing np.histogram by adding starting and ending bin edges.

Ultimately this is fine as long as the main usage is not affected and one can still call reducebins(arr, indices) with a single array of indices and without a speed penalty.

Of course there are many situations where one needs to operate on non-overlapping bins, but in that case I would generally expect the bins not to be defined by pairs of edges alone. An available function for this kind of scenario is Scipy's ndimage.labeled_comprehension, together with related functions such as ndimage.sum.

But this seems quite different from the scope of reducebins.

So, what would be a natural use case for starts and stops in reducebins?

> So, what would be a natural use case for starts and stops in reducebins?

Achievable by other means, but a moving average of length k would be reducebins(np.add, arr, arange(n-k), k + arange(n-k)). I suspect that, ignoring the cost of allocating the indices, performance would be comparable to an as_strided approach.

Uniquely, reducebins would allow a moving average of varying duration, which is not possible with as_strided.
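The contrast between fixed and varying windows can be sketched with existing NumPy tools. Assumptions: sliding_window_view (NumPy 1.20 or later) stands in for the as_strided approach, a Python loop stands in for the proposed reducebins, and the example computes moving sums over all n - k + 1 full windows (divide by the window length for an average).

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

arr = np.arange(10.0)
n, k = len(arr), 3

# Fixed-length windows: a strided view, then one reduction over the last axis.
strided = sliding_window_view(arr, k).sum(axis=-1)

# The same windows written as explicit start/stop pairs.
starts = np.arange(n - k + 1)
stops = starts + k
explicit = np.array([np.add.reduce(arr[s:e]) for s, e in zip(starts, stops)])
print(np.array_equal(strided, explicit))   # True

# Windows of varying length have no single strided view, but start/stop
# pairs describe them directly.
var_starts = np.array([0, 2, 3])
var_stops = np.array([2, 7, 10])
var_sums = [float(np.add.reduce(arr[s:e])) for s, e in zip(var_starts, var_stops)]
print(var_sums)   # [1.0, 20.0, 42.0]
```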

Another use case: disambiguating between including the end or the start in the one-argument form.

For instance:

a = np.arange(10)
reducebins(np.add, start=[2, 4, 6]) == [2 + 3, 4 + 5, 6 + 7 + 8 + 9]  # what `reduceat` does
reducebins(np.add, stop=[2, 4, 6])  == [0 + 1, 2 + 3, 4 + 5]          # also useful
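Both readings of a single list of edges can be reproduced with the existing reduceat, which makes the ambiguity concrete. This is a sketch that assumes strictly increasing edges; the stop-only form is emulated by truncating the input and shifting the edges.

```python
import numpy as np

a = np.arange(10)
edges = [2, 4, 6]

# Edges as starts: each bin runs to the next edge, the last one to the end.
as_starts = np.add.reduceat(a, edges)
print(as_starts)   # [ 5  9 30]

# Edges as stops: the first bin starts at 0 and nothing past the last edge
# is included.
as_stops = np.add.reduceat(a[:edges[-1]], [0] + edges[:-1])
print(as_stops)    # [1 5 9]
```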

> Another use case: disambiguating between including the end or the start in the one-argument form.

I don't quite understand this one. Can you include the input tensor here? Also: what would be the default values for start/stop?

Anyway, I'm not strongly against the separate arguments, but it's not as clean a replacement. I would love to be able to say "Don't use reduceat, use reducebins instead", but that is (slightly) harder when the interface looks different.

Actually, I just realised that even a start/stop option does not cover the use case of empty slices, which is one that has been useful to me in the past: when my properties/labels correspond to rows in a CSR sparse matrix, and I use the values of indptr to do the reduction. With reduceat, I can ignore the empty rows. Any replacement will require additional bookkeeping. So, whatever replacement you come up with, please leave reduceat around.

In [2]: A = np.random.random((4000, 4000))
In [3]: B = sparse.csr_matrix((A > 0.8) * A)
In [9]: %timeit np.add.reduceat(B.data, B.indptr[:-1]) * (np.diff(B.indptr) > 1)
1000 loops, best of 3: 1.81 ms per loop
In [12]: %timeit B.sum(axis=1).A
100 loops, best of 3: 1.95 ms per loop
In [16]: %timeit np.maximum.reduceat(B.data, B.indptr[:-1]) * (np.diff(B.indptr) > 0)
1000 loops, best of 3: 1.8 ms per loop
In [20]: %timeit B.max(axis=1).A
100 loops, best of 3: 2.12 ms per loop
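The same pattern can be followed on a tiny hand-built CSR layout without SciPy. The data and indptr arrays below are illustrative; note that if the final rows were empty, indptr[:-1] would contain len(data) and reduceat would raise, which is the situation discussed earlier in the thread for indices equal to len(a).

```python
import numpy as np

# CSR-style storage for a 4-row matrix whose row 1 is empty:
# row 0 -> [1, 2], row 1 -> [], row 2 -> [3], row 3 -> [4, 5]
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
indptr = np.array([0, 2, 2, 3, 5])

raw_sums = np.add.reduceat(data, indptr[:-1])
print(raw_sums)          # [3. 3. 3. 9.]

# The empty row picked up data[2]; mask it out using the row lengths.
row_sums = raw_sums * (np.diff(indptr) > 0)
print(row_sums)          # [3. 0. 3. 9.]
```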

Incidentally, the empty sequence conundrum can be solved the same way that Python does it: by providing an initial value. This could be a scalar or an array of the same shape as indices.
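The Python convention referred to here, and the corresponding initial keyword that ufunc.reduce accepts in current NumPy, look as follows. This sketch only covers reduce; whether reduceat or a replacement should take a similar argument is the open question in this thread.

```python
import functools
import operator
import numpy as np

# Python: an empty sequence needs an explicit initial value.
print(functools.reduce(operator.add, [], 0))               # 0

# NumPy's ufunc.reduce accepts an `initial` keyword in the same spirit,
# which also makes empty reductions work for ufuncs without an identity.
empty = np.array([], dtype=float)
print(np.maximum.reduce(empty, initial=-np.inf))            # -inf
print(np.add.reduce(np.array([1.0, 2.0]), initial=10.0))    # 13.0
```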

์˜ˆ, ์ฒซ ๋ฒˆ์งธ ์ดˆ์ ์ด ๋นˆ ์กฐ๊ฐ์„ ํ•ด๊ฒฐํ•˜๋Š” ๋ฐ ์žˆ์–ด์•ผํ•œ๋‹ค๋Š” ๋ฐ ๋™์˜ํ•ฉ๋‹ˆ๋‹ค.
์ผ€์ด์Šค. start = end์˜ ๊ฒฝ์šฐ ์ถœ๋ ฅ์„ ์„ค์ •ํ•˜๋Š” ๋ฐฉ๋ฒ•์ด ์žˆ์Šต๋‹ˆ๋‹ค.
์š”์†Œ๋ฅผ ID๋กœ ๋ณ€๊ฒฝํ•˜๊ฑฐ๋‚˜ ์ถœ๋ ฅ ์š”์†Œ๋ฅผ
๋ฐฐ์—ด์„ ์ง€์ •ํ–ˆ์Šต๋‹ˆ๋‹ค. ํ˜„์žฌ์˜ ๋ฌธ์ œ๋Š” ๋ฎ์–ด ์“ด๋‹ค๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค.
๊ด€๋ จ์—†๋Š” ๋ฐ์ดํ„ฐ

๋‚˜๋Š” ๊ทธ์˜ ๋งˆ์ง€๋ง‰ ์ฝ”๋ฉ˜ํŠธ์— ๋Œ€ํ•ด @shoyer ์™€ ์™„์ „ํžˆ ํ•จ๊ป˜ํ•ฉ๋‹ˆ๋‹ค.

out=ufunc.reducebins(a, inds) ๋ฅผ out[i]=ufunc.reduce(a[inds[i]:inds[i+1]]) ๋ฅผ ์ œ์™ธํ•œ ๋ชจ๋“  i ์— ๋Œ€ํ•ด reduceat ๋” ์ด์ƒ ์‚ฌ์šฉํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

starts ๋ฐ ends ์ธ๋ฑ์Šค์— ๋Œ€ํ•œ ํ˜„์žฌ ์‚ฌ์šฉ ์‚ฌ๋ก€๋Š” as_strided ๋˜๋Š” ์ปจ๋ณผ ๋ฃจ์…˜๊ณผ ๊ฐ™์€ ๋Œ€์ฒด ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ๋ณด๋‹ค ์ž์—ฐ์Šค๋Ÿฝ๊ณ  ํšจ์œจ์ ์œผ๋กœ ๊ตฌํ˜„ ๋  ๊ฐ€๋Šฅ์„ฑ์ด ๋†’์Šต๋‹ˆ๋‹ค.
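
The proposed definition can be written down as a slow pure-Python reference. This is only a sketch: reducebins does not exist in NumPy, and the function below is a hypothetical stand-in that assumes 1-D input.

```python
import numpy as np

def reducebins(ufunc, a, inds):
    """Hypothetical reference: one output per consecutive pair of indices."""
    a = np.asarray(a)
    return np.array([ufunc.reduce(a[inds[i]:inds[i + 1]])
                     for i in range(len(inds) - 1)])

a = np.arange(5)
# Repeated index 1 gives an empty bin; 5 is an explicit end index
out = reducebins(np.add, a, [1, 1, 5])
```

For the array from the original ticket this yields [0, 10], the result the ticket expected, provided the end of the array is passed as an explicit last index.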

@shoyer:

I don't quite understand this. Could you include the input tensor here? Also: what would be the default values for start/stop?

Updated with the input. For the default values, see the reduce_bins implementation in the comment that started this. I've added a docstring there too. That implementation is feature-complete but slow, because it is written in Python.

but that's (slightly) harder when the interface looks different.

When only a single start argument is passed, the interface is identical (ignoring the identity case that we set out to fix in the first place). These three lines mean the same thing:

np.add.reduceat(arr, inds)
reduce_bins(np.add, arr, inds)
reduce_bins(np.add, arr, start=inds)

(The method/function distinction is not something I care too much about, and I can't define a new ufunc method as a prototype in Python!)
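
For readers without the referenced prototype at hand, a rough sketch of what such a function could look like. This is not the implementation from the earlier comment: the defaults for a missing start or stop are assumptions (contiguous bins), and empty bins are filled with the ufunc identity.

```python
import numpy as np

def reduce_bins(ufunc, arr, start=None, stop=None):
    """Sketch only; the defaults below are an assumption, not the thread's code."""
    arr = np.asarray(arr)
    if start is None and stop is None:
        raise ValueError("pass start, stop, or both")
    if stop is None:
        start = np.asarray(start)
        # assumed default: each bin ends where the next starts; the last runs to the end
        stop = np.append(start[1:], len(arr))
    elif start is None:
        stop = np.asarray(stop)
        # assumed default: each bin starts where the previous stops; the first starts at 0
        start = np.append(0, stop[:-1])
    return np.array([ufunc.reduce(arr[lo:hi]) for lo, hi in zip(start, stop)])

arr = np.arange(10)
inds = [2, 4, 6]
from_sketch = reduce_bins(np.add, arr, inds)
from_numpy = np.add.reduceat(arr, inds)
```

For strictly increasing indices both calls give [5, 9, 30]; the two only differ when a bin is empty.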


@jni:

Actually, I just realised that even a start/stop option doesn't cover the use case of empty slices, which has been useful to me in the past.

You're wrong, it does, in exactly the same way that ufunc.reduceat already does. It's also possible simply by passing start[i] == end[i].

the empty-sequence conundrum can be solved by providing an initial value.

Yes, we've already covered this, and ufunc.reduce already does it by filling with ufunc.identity. This isn't hard to add to the existing ufunc.reduceat, especially if #8952 is merged. But as you said yourself, the current behaviour is _documented_, so we should not change it.
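
The identity filling mentioned here can be checked directly with existing NumPy calls; the snippet below is just an illustration of that behaviour.

```python
import numpy as np

empty = np.array([], dtype=float)

add_identity = np.add.identity             # identity of addition
sum_of_empty = np.add.reduce(empty)        # filled with that identity
prod_of_empty = np.multiply.reduce(empty)  # filled with the multiplicative identity

# np.maximum has no identity, so reducing an empty array raises
try:
    np.maximum.reduce(empty)
    max_raised = False
except ValueError:
    max_raised = True
```

So ufuncs with an identity have a natural value for an empty bin, while ufuncs such as np.maximum would need an explicit initial value.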


@divenex

๋งˆ์ง€๋ง‰์„ ์ œ์™ธํ•œ ๋ชจ๋“  i์— ๋Œ€ํ•ด out = ufunc.reducebins (a, inds)๋ฅผ out [i] = ufunc.reduce (a [inds [i] : inds [i + 1]])๋กœ ์ •์˜ํ•˜๊ฒ ์Šต๋‹ˆ๋‹ค.

๊ทธ๋ž˜์„œ len(out) == len(inds) - 1 ? ์ด๊ฒƒ์€ reduceat ์˜ ํ˜„์žฌ ๋™์ž‘๊ณผ ๋‹ค๋ฅด ๋ฏ€๋กœ ์—ฌ๊ธฐ์—์„œ ์ „ํ™˜์— ๋Œ€ํ•œ


๋ชจ๋‘ :์ด ํ† ๋ก ์„ ์ฝ๊ธฐ ์–ด๋ ต๊ฒŒ ๋งŒ๋“ค์—ˆ ๊ธฐ ๋•Œ๋ฌธ์— ์ด์ „ ๋Œ“๊ธ€์„ ๊ฒ€ํ† ํ•˜๊ณ  ์ธ์šฉ ๋œ ์ด๋ฉ”์ผ ๋‹ต์žฅ์„ ์‚ญ์ œํ–ˆ์Šต๋‹ˆ๋‹ค.

@ eric-wieser ์ข‹์€ ์ง€์ ์ž…๋‹ˆ๋‹ค. ์œ„์˜ ๋ฌธ์žฅ์—์„œ ๋งˆ์ง€๋ง‰ ์ธ๋ฑ์Šค์˜ ๊ฒฝ์šฐ reducebins ์˜ ๋™์ž‘์ด ํ˜„์žฌ reduceat ์—์„œ์™€ ๋‹ค๋ฅผ ๊ฒƒ์ž„์„ ์˜๋ฏธํ–ˆ์Šต๋‹ˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜์ด ๊ฒฝ์šฐ ๋งˆ์ง€๋ง‰ ๊ฐ’์ด ๊ณต์‹์ ์œผ๋กœ ์˜๋ฏธ๊ฐ€ ์—†๊ธฐ ๋•Œ๋ฌธ์— ๊ฐ’์ด ๋ฌด์—‡์ธ์ง€ ํ™•์‹คํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

ํ˜ธํ™˜์„ฑ ๋ฌธ์ œ๋ฅผ ๋ฌด์‹œํ•˜๊ณ  ์ถœ๋ ฅ reducebins (1D์—์„œ)์˜ ํฌ๊ธฐ๋ฅผ ๊ฐ€์ ธ์•ผ inds.size-1 ๋ฐ”๋กœ ๊ทธ ๋™์ผํ•œ ์ด์œ ๋กœ, np.diff(a) ํฌ๊ธฐ ๊ฐ–๋Š”๋‹ค a.size-1 ๋ฐ np.histogram(a, bins) ํฌ๊ธฐ๋Š” bins.size-1 ์ž…๋‹ˆ๋‹ค. ๊ทธ๋Ÿฌ๋‚˜ ์ด๊ฒƒ์€ reduceat ๋Œ€ํ•œ ๋“œ๋กญ ์ธ ๊ต์ฒด๋ฅผ ์›ํ•˜๋Š” ์š•๊ตฌ์— ์œ„๋ฐฐ๋ฉ๋‹ˆ๋‹ค.
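
The size convention being referred to is easy to confirm with the two existing functions; the sample values below are arbitrary.

```python
import numpy as np

a = np.array([0.5, 1.5, 1.7, 3.2])
bins = np.array([0, 1, 2, 3, 4])

# n bin edges delimit n - 1 bins
counts, edges = np.histogram(a, bins)

# n values have n - 1 consecutive differences
diff_size = np.diff(a).size
```

Here counts has 4 entries for 5 edges and np.diff(a) has 3 entries for 4 values.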

I don't think there is a convincing argument that a.size-1 is the right answer. Including index 0 and/or index n seems like pretty reasonable behaviour as well. All of them seem handy in some circumstance, but I think it is very important to have a drop-in replacement.

There's also another argument for stop/start hiding here: it lets you build the diff-like behaviour if you want it, at very little cost, while still keeping the reduceat behaviour:

a = np.arange(10)
inds = [2, 4, 6]
reduce_bins(a, start=inds[:-1], stop=inds[1:])  #  [2 + 3, 4 + 5]

# or less efficiently:
np.add.reduceat(a, inds)[:-1]
reduce_bins(a, start=inds)[:-1]
reduce_bins(a, stop=inds)[1:]
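
The diff-like result can already be reproduced with what NumPy has today. The sketch below uses plain slicing for the explicit bins, since reduce_bins is only a proposal, and compares it with trimming the output of reduceat.

```python
import numpy as np

a = np.arange(10)
inds = [2, 4, 6]

# Explicit bins a[2:4] and a[4:6], written with plain slices
explicit = np.array([a[lo:hi].sum() for lo, hi in zip(inds[:-1], inds[1:])])

# Existing API: the last entry reduces a[6:], so it is dropped
trimmed = np.add.reduceat(a, inds)[:-1]
```

Both give [5, 9]. The trimmed version still computes the discarded last bin, which is the inefficiency the comment above points at.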

@eric-wieser I would be OK with required start and stop arguments, but I do not like making one of them optional. It is not obvious that providing only start means out[i] = func.reduce(arr[start[i]:start[i+1]]) rather than out[i] = func.reduce(arr[start[i]:]).
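
The two readings of a start-only call give different results, as this small sketch with arbitrary values shows.

```python
import numpy as np

arr = np.arange(6)
start = [1, 3, 4]

# Reading 1: each bin runs up to the next start; the last runs to the end
contiguous = [int(arr[s:e].sum()) for s, e in zip(start, start[1:] + [len(arr)])]

# Reading 2: each bin runs from its start to the end of the array
to_end = [int(arr[s:].sum()) for s in start]

# What reduceat does today for strictly increasing indices
current = np.add.reduceat(arr, start).tolist()
```

Reading 1 gives [3, 3, 9], matching the current reduceat, while reading 2 gives [15, 12, 9].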

My preferred API for reducebins is like reduceat, but without the confusing "exceptions" noted in the docstring. Namely, just:

For i in range(len(indices)), reduceat computes ufunc.reduce(a[indices[i]:indices[i+1]]), which becomes the i-th generalized "row" parallel to axis in the final result (i.e., in a 2-D array, for example, if axis = 0, it becomes the i-th row, but if axis = 1, it becomes the i-th column).

I could go either way on the third "exception", which requires non-negative indices ( 0 <= indices[i] <= a.shape[axis] ); I view it as more of a sanity check than an exception. But possibly that one could go, too. I can see how negative indices might be useful to someone, and it is not hard to do the math to normalize such indices.

๋์— ์ธ๋ฑ์Šค๋ฅผ ์ž๋™์œผ๋กœ ์ถ”๊ฐ€ํ•˜์ง€ ์•Š๋Š” ๊ฒƒ์€ np.histogram ์˜ ๊ฒฐ๊ณผ์™€ ๊ฐ™์ด ๊ฒฐ๊ณผ ๊ธธ์ด๊ฐ€ len(a)-1 ์—ฌ์•ผ ํ•จ์„ ์˜๋ฏธํ•ฉ๋‹ˆ๋‹ค.

@jni ํฌ์†Œ ํ–‰๋ ฌ์—์„œ ์ฐพ์€ ๋ฐฐ์—ด์—์„œ ์‹ค์ œ๋กœ ๊ณ„์‚ฐํ•˜๋ ค๋Š” ์˜ˆ๋ฅผ ๋“ค์–ด ์ฃผ์‹œ๊ฒ ์Šต๋‹ˆ๊นŒ? ๊ฐ€๊ธ‰์ ์ด๋ฉด ๋ฌด์ž‘์œ„๊ฐ€ ์•„๋‹Œ ์ˆซ์ž์™€ ์ž์ฒด ํฌํ•จ (scipy.sparse์— ์˜์กดํ•˜์ง€ ์•Š์Œ)์ด ํฌํ•จ ๋œ ๊ตฌ์ฒด์ ์ธ ์˜ˆ๋ฅผ ์‚ฌ์šฉํ•˜๋Š” ๊ฒƒ์ด ์ข‹์Šต๋‹ˆ๋‹ค.

It is not obvious that providing only start means out[i] = func.reduce(arr[start[i]:start[i+1]]) rather than out[i] = func.reduce(arr[start[i]:]), which is what I would have guessed.

The reading I was going for is "each bin starts at these positions", with the implication that all bins are contiguous unless explicitly specified otherwise. Perhaps I should try to draft a more complete docstring. I think I can see a strong argument for forbidding passing neither argument, so I'll remove that from my proposed function.

which requires non-negative indices (0 <= indices[i] < a.shape[axis])

Note that there is also a bug here (#835): the upper bound should be inclusive, since these are slices.

์—ฌ๊ธฐ์—๋„ ๋ฒ„๊ทธ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. ์Šฌ๋ผ์ด์Šค์ด๋ฏ€๋กœ ์ƒํ•œ๊ฐ’์ด ํฌํ•จ๋˜์–ด์•ผํ•ฉ๋‹ˆ๋‹ค.

๊ณ ๋งˆ์›Œ์š”.

reduceat ํ•จ์ˆ˜ ์ž์ฒด๊ฐ€ ์•„๋‹™๋‹ˆ๋‹ค.;)

:\doc\neps\groupby_additions.rst ์—๋Š” reduceby ํ•จ์ˆ˜์— ๋Œ€ํ•œ (IMO ์—ด๋“ฑํ•œ) ์ œ์•ˆ์ด ํฌํ•จ๋˜์–ด ์žˆ์Šต๋‹ˆ๋‹ค.

์ด ํŽ˜์ด์ง€๊ฐ€ ๋„์›€์ด ๋˜์—ˆ๋‚˜์š”?
0 / 5 - 0 ๋“ฑ๊ธ‰