Scikit-learn: Multiclass roc_auc score support

Created on 19 Jun 2014  ·  47 Comments  ·  Source: scikit-learn/scikit-learn

Low priority feature request: support for multiclass roc_auc score calculation in sklearn.metrics using the one-against-all methodology would be incredibly useful.

New Feature

All 47 comments

I'm not certain what that means. Do you have a reference for it?

Here's a pretty decent explanation, along with references: https://www.cs.bris.ac.uk/~flach/ICML04tutorial/ROCtutorialPartIII.pdf

Hmm, what is the recommended scorer while multiclass AUC is not implemented?

support for multiclass roc_auc score calculation in sklearn.metrics using the one-against-all methodology would be incredibly useful

Are you talking about what those slides consider an approximation to the volume under the surface, in which a frequency-weighted average of the per-class AUCs is taken? This would seem to be identical to using the current roc_auc_score with a binarized representation and average='weighted' . ( @arjoly , why do these curve-based scores disallow multiclass?)

Otherwise, those slides, and most references I can find on "multi-class ROC", focus on multiclass calibration of OvR, not on an evaluation metric. Is this what you're interested in? I have no idea how widespread this technique is, whether it's worth having in scikit-learn, or whether the greedy optimisation should be improved upon.
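A minimal sketch of the binarized, frequency-weighted approach described in the comment above. The labels and scores are toy values invented for illustration, and the variable names are hypothetical; the sketch only relies on label_binarize and the existing multilabel support in roc_auc_score.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import label_binarize

# Toy data (made up for illustration): 6 samples, 3 classes with unequal support.
y_true = np.array([0, 0, 0, 1, 1, 2])
y_score = np.array([
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.3, 0.5, 0.2],
    [0.2, 0.6, 0.2],
    [0.5, 0.3, 0.2],
    [0.1, 0.2, 0.7],
])

# One indicator column per class, so each column is a one-vs-rest binary problem.
y_bin = label_binarize(y_true, classes=[0, 1, 2])

# Per-class one-vs-rest AUCs, then their average weighted by class frequency.
per_class = roc_auc_score(y_bin, y_score, average=None)
weighted_auc = roc_auc_score(y_bin, y_score, average="weighted")
```

Here weighted_auc should match np.average(per_class, weights=np.bincount(y_true)), which is the frequency-weighted average the slides describe.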

( @arjoly , why do these curve-based scores disallow multiclass?)

Whenever one class is missing from y_true, it's not possible to compute the score. I didn't want to add magic for class inference and get users into trouble.

It's possible that we're not appropriately handling the case of y_pred
having a label that y_true does not. That label probably shouldn't
contribute to anything like a macro-average (in accordance with Weka,
too), or to a ROC score.

@jnothman @arjoly there has been a lot of progress on the averaging front. How hard is it to implement this now?

Maybe it could be similar to this R function from the pROC package:
http://www.inside-r.org/packages/cran/pROC/docs/multiclass.roc

Hi, I implemented a draft of the macro-averaged ROC/AUC score, but I am unsure whether it will fit sklearn.

Here is the code:

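from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import LabelBinarizer

def multiclass_roc_auc_score(truth, pred, average="macro"):
    # truth and pred are 1-D arrays of class labels; pred holds hard
    # predictions (e.g. from predict), not probabilities.
    lb = LabelBinarizer()
    lb.fit(truth)

    # One indicator column per class seen in truth.
    truth = lb.transform(truth)
    pred = lb.transform(pred)

    return roc_auc_score(truth, pred, average=average)

Could it be as simple as this?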

@fbrundu if this is the standard meaning. It's certainly one possible interpretation.

There's a nice summary here:
http://people.inf.elte.hu/kiss/13dwhdm/roc.pdf

The pROC package implements Hand and Till:
http://link.springer.com/article/10.1023/A:1010920819831

Hand and Till's version seems to be generally accepted, and I vote we implement that.
There's also a version by Provost and Domingos, which I should probably be rooting for given that Provost is currently my director, but it hasn't caught on.
Provost-Domingos is what @fbrundu proposed, only with average='weighted' .

TL;DR: a PR for Hand and Till is welcome. Optionally Provost and Domingos as well, with an option to change the averaging.

Hi, has there been any progress on implementing this?
What I've seen in most other libraries (e.g. WEKA) is that they use the weighted average. I think this is what @fbrundu proposed, using average='micro' ?

@joaquinvanschoren R uses Hand and Till.

@amueller I can work on this :)

@kchen17 thanks!

We discussed this at OpenML quite a bit. For multiclass AUC there is no guarantee that one approach (macro-averaging, micro-averaging, weighted averaging, ...) is better than another. In R you can find at least 5 different approaches (all now available in MLR as well).
When implementing this in scikit-learn, it would be great if there were at least the possibility to choose the one that makes the most sense for your application, even if Hand-Till is used as the default. Hand-Till is a non-weighted approach; it does not take label imbalance into account.

I'm happy to have multiple versions. Non-weighted and "does not take label imbalance into account" are two different things ;) Do you have a list and references?

What is micro-average in this case?

Note that we already have micro- and macro-averaged ROC AUC for multiclass problems implemented in this example:

http://scikit-learn.org/stable/auto_examples/model_selection/plot_roc.html#multiclass-settings

Actually, I think that documentation is incorrect and should say
multi-label...

In micro-averaging, the true positive rate (TPR) is calculated by taking the sum of all TPs over all classes and dividing by the sum of all TPs and FNs over all classes, i.e. for a 3-class problem:
TPR = (TP1+TP2+TP3)/(TP1+TP2+TP3+FN1+FN2+FN3)

Example confusion matrix:
[[1,2,3],
[4,5,6],
[7,8,9]]
TPR = (1+5+9)/(1+5+9+(2+3)+(4+6)+(7+8))
Do the same thing for the false positive rate and you can compute the AUC.

Macro-averaging computes the TPR for each class separately and averages them (optionally weighted by the number of examples in that class; the formula below is the unweighted form):
TPR = (1/3)* (TP1/(TP1+FN1) + TP2/(TP2+FN2) + TP3/(TP3+FN3))

With the same example:
TPR = (1/3)* (1/(1+(2+3)) + 5/(5+(4+6)) + 9/(9+(7+8)))
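The arithmetic above can be checked with a short, dependency-free sketch. It assumes, as the formulas imply, that rows of the confusion matrix are true classes and columns are predicted classes; the variable names are hypothetical.

```python
# Confusion matrix from the example: rows = true class, columns = predicted class
# (an assumption consistent with the FN terms used in the formulas).
confusion = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

n_classes = len(confusion)
# True positives sit on the diagonal; false negatives are the rest of each row.
tp = [confusion[k][k] for k in range(n_classes)]
fn = [sum(confusion[k]) - confusion[k][k] for k in range(n_classes)]
support = [t + f for t, f in zip(tp, fn)]  # number of true examples per class

# Micro-average: pool the counts over all classes, then take one ratio.
micro_tpr = sum(tp) / (sum(tp) + sum(fn))

# Unweighted macro-average: one ratio per class, then a plain mean.
macro_tpr = sum(t / (t + f) for t, f in zip(tp, fn)) / n_classes

# Support-weighted macro-average: per-class ratios weighted by class size.
weighted_tpr = sum(s * t / (t + f) for s, t, f in zip(support, tp, fn)) / sum(support)
```

For TPR specifically, the support-weighted macro-average collapses to the micro-average, because each class weight cancels the denominator of its own ratio. The same is not true of the false positive rate, so the resulting AUCs can still differ.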

Maybe this helps (it uses precision, but the idea is the same):
http://stats.stackexchange.com/questions/156923/should-i-make-decisions-based-on-micro-averaged-or-macro-averaged-evaluation-mea

I would personally never use an unweighted macro-average, but I'll see if I can find the papers that studied this.

Paper:
https://www.math.ucdavis.edu/~saito/data/roc/ferri-class-perf-metrics.pdf

This is what is supported in R (with some additional literature):
https://mlr-org.github.io/mlr-tutorial/devel/html/measures/index.html

Hello! I was able to start looking into this issue last week, and I wanted to post a quick update and a few questions to make sure I'm heading in the right direction.

  • So far: I started with an implementation of a function multiclass_roc_auc_score that, by default, has an average parameter set to None. This default uses the Hand-Till algorithm (which, as discussed, does not take label imbalance into account).
  • Would the method accept the same parameters as roc_auc_score ?
  • Branching off from that, the difference would be that y_true could have more than 2 classes of labels. Hand-Till would involve finding all possible pairs of labels, computing roc_auc_score for each of those pairs, and then taking the mean of these.

Let me know what corrections/suggestions you may have!
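A rough, dependency-free sketch of the pairwise averaging described in the bullets above. It is an illustration under stated assumptions, not the code from the PR: binary_auc and hand_till_auc are hypothetical helper names, y_score[s][k] is assumed to be the score of sample s for classes[k], and every class is assumed to appear in y_true.

```python
from itertools import combinations

def binary_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

def hand_till_auc(y_true, y_score, classes):
    """Unweighted mean over all class pairs of the two directed pairwise AUCs."""
    pair_aucs = []
    for a, b in combinations(range(len(classes)), 2):
        rows_a = [s for s, y in zip(y_score, y_true) if y == classes[a]]
        rows_b = [s for s, y in zip(y_score, y_true) if y == classes[b]]
        # A(a|b): how well the class-a score separates class a from class b.
        auc_a = binary_auc([s[a] for s in rows_a], [s[a] for s in rows_b])
        # A(b|a): how well the class-b score separates class b from class a.
        auc_b = binary_auc([s[b] for s in rows_b], [s[b] for s in rows_a])
        pair_aucs.append((auc_a + auc_b) / 2)
    return sum(pair_aucs) / len(pair_aucs)

# Toy data (made up): every sample gets its highest score on its true class.
y_true = [0, 1, 2, 0, 1, 2]
y_score = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
    [0.7, 0.2, 0.1],
    [0.2, 0.7, 0.1],
    [0.2, 0.1, 0.7],
]
m = hand_till_auc(y_true, y_score, classes=[0, 1, 2])
```

If a class is missing from y_true, binary_auc divides by zero. That is the same obstacle raised earlier in the thread: the score is undefined whenever a class has no true examples.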

In general, I think we avoid creating another function if reusing roc_auc_score is reasonably feasible. I think leaving the default as 'macro' is acceptable.

One key thing to think about is how to test these changes, including changing the traits of roc_auc_score in metrics/tests/test_common.py.

Yes, we should update the docs.

@joaquinvanschoren interestingly, that paper didn't discuss any of the multiclass AUC papers mentioned above, in particular not the Fawcett paper from 2005. Hm, I guess it's a renormalization of 1-vs-1 multiclass?

So currently we only have multi-label, and we want to add multiclass with 1vs1 and 1vsRest, each with weighted and unweighted variants.
I don't really understand how the sample and micro averaging work for AUC.

So... I propose we add a multi-class parameter to AUC that can be ovo or ovr and that takes the weighting parameter into account. I'm not sure we want to allow sample and micro , as that doesn't really make sense to me.

@arjoly so micro and sample operate on the rows rather than the columns of the matrix? Are there papers on that? I didn't find it in the ROC literature.

The problem with that is that, to make the Hand-Till measure the default, we would have to do weighted-average OvO, and we can't really change the weighting option. So maybe we do OvR by default, explain in the narrative that OvO with weighting is also a good choice, and add a reference?

The summary of the paper that @joaquinvanschoren cited also says that all the AUC versions give about the same results.

@amueller : I had a chance to read through your comment again:

The problem with that is that, to make the Hand-Till measure the default, we would have to do weighted-average OvO, and we can't really change the weighting option. So maybe we do OvR by default, explain in the narrative that OvO with weighting is also a good choice, and add a reference?

I was going to modify roc_auc_score to incorporate a multiclass=['ovo', 'ovr'] parameter as per your response. If OvR is the default ( roc_auc_score(y_true, y_score, multiclass="ovo" ... ) ) but Hand & Till is OvO, what should I do to handle the OvR part of the implementation? (I.e., if I detect that y_true is multiclass, should I raise an error while "ovr" is not implemented and instruct users to pass "ovo"?)

Sorry, I was expecting you to implement both ovo and ovr ;)

@amueller : Noted, and that will be incorporated as well! I also wanted to ask: is there any advice on how to detect the difference between multiclass and multilabel? At first I only checked the dimensions of y_score , but I quickly realized this would not be sufficient. (I.e., is it just a matter of checking that the labels are only 0s and 1s?)

Multilabel means that multiple labels are predicted at once: you get a
vector of predictions per instance. Multiclass means that you get a single
prediction, but that prediction can take more than 2 values (it is not
binary).

Sometimes people solve the multiclass case by binarizing the output, hence
you get multiple binary values per instance (hence multilabel), and this
often causes confusion.

Hi, I hope type_of_target can serve the purpose of discriminating between multi-label and multi-class output. HTH

Using type_of_target is a good idea. In scikit-learn, though, the dimensionality of y is actually an indicator of whether we want to do multi-label or multi-target. If you binarize the output as @joaquinvanschoren suggests, scikit-learn will always assume multi-label.

type_of_target is fine for distinguishing between y_trues, @amueller.
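A small sketch of how type_of_target separates the cases discussed above, using toy arrays invented for illustration (the variable names are hypothetical).

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# 1-D integer labels with more than two distinct values.
y_multiclass = np.array([0, 2, 1, 2])
# 1-D labels with exactly two distinct values.
y_binary = np.array([0, 1, 1, 0])
# 2-D 0/1 indicator matrix, e.g. the result of binarizing y_multiclass.
y_indicator = np.array([
    [1, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 1],
])

kind_multiclass = type_of_target(y_multiclass)
kind_binary = type_of_target(y_binary)
kind_indicator = type_of_target(y_indicator)
```

Note that the binarized matrix is reported as a multilabel indicator even though each row has exactly one 1, which matches the remark above that scikit-learn assumes multi-label once the output has been binarized.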

Hi all, I just wanted to let you know that I submitted a "preliminary" PR. I'm interested in hearing feedback on the implementation (e.g. I'm sure there are ways to leverage numpy/etc. better than I am doing right now), along with best practices for adding new tests, documentation wording, etc.

Thank you for all the help so far!

Any progress on adding multiclass support for AUC?

@joaquinvanschoren : I'm working on revisions after a code review by @jnothman in #7663. I will likely submit another update next week, once I'm finished with midterms.

Hi @kathyxchen , @jnothman ,

Any updates on the PR?

Just checking in to see if there is any progress on adding multiclass support for AUC?

We have trouble determining what is both an accepted and a principled
formulation of ROC AUC for multiclass. See
https://github.com/scikit-learn/scikit-learn/pull/7663#issuecomment-307566895
and below.

So folks. Is there any progress with the multiclass AUC score? I found the official documentation code with the iris dataset very confusing, because that method shows that my model predicts random numbers pretty well.

This is almost done. We need to decide on an API detail before merging: https://github.com/scikit-learn/scikit-learn/pull/12789#discussion_r295693965

@trendsearcher can you please provide an example? It's merged now, but I'd like to see the issue you experienced.

Glad to help. How can I give an example (there is a lot of code and it may not be
intuitive)? May I write it in plain text?

@fbrundu Thank you for sharing! I tried your code, but when I call the function I meet a problem saying "Multioutput target data is not supported with label binarization". I then removed the line "pred=lb.transform(pred)" from the function, but I meet another problem: "Found input variables with inconsistent numbers of samples: [198, 4284]".

May I ask if you could help me solve this? Thank you!

@Junting-Wang

 I meet a problem saying "Multioutput target data is not supported with label binarization". 

You should use predict instead of predict_proba.

@fbrundu is your implementation correct? I use it and it works.
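For readers arriving after the merge mentioned above, here is a hedged sketch of calling the built-in multiclass support. It assumes scikit-learn 0.22 or later, where, to the best of my knowledge, roc_auc_score accepts a multi_class argument ('ovr' or 'ovo'); the thread itself only discusses a proposed multiclass parameter. The dataset, classifier and split are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
# Class probabilities, shape (n_samples, n_classes); no label binarization needed.
proba = clf.predict_proba(X_test)

# One-vs-rest with an unweighted mean over classes.
auc_ovr = roc_auc_score(y_test, proba, multi_class="ovr", average="macro")
# One-vs-one with an unweighted mean over class pairs (the Hand-Till style).
auc_ovo = roc_auc_score(y_test, proba, multi_class="ovo", average="macro")
# One-vs-one with pairs weighted by class prevalence.
auc_ovo_weighted = roc_auc_score(y_test, proba, multi_class="ovo", average="weighted")
```

Unlike the helper quoted earlier in the thread, which binarizes hard labels from predict, this call takes the probability matrix from predict_proba directly.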
