Xgboost: [R] Best iteration index from early stopping is discarded when the model is saved to disk

Created on 15 Jan 2020  ·  33 comments  ·  Source: dmlc/xgboost

๋‹ค์Œ ๊ฐ’์€ xgboost::xgb.train ๋‹ค์Œ์— ์˜ˆ์ธก๋ฉ๋‹ˆ๋‹ค.
247367.2 258693.3 149572.2 201675.8 250493.9 292349.2 414828.0 296503.2 260851.9 190413.3

์ด ๊ฐ’์€ ์ด์ „ ๋ชจ๋ธ์˜ xgboost::xgb.save ๋ฐ xgboost::xgb.load ๋‹ค์Œ์— ์˜ˆ์ธก๋ฉ๋‹ˆ๋‹ค.
247508.8 258658.2 149252.1 201692.6 250458.1 292313.4 414787.2 296462.5 260879.0 190430.1

๊ทธ๋“ค์€ ๊ฐ€๊น์ง€๋งŒ ๋™์ผํ•˜์ง€๋Š” ์•Š์Šต๋‹ˆ๋‹ค. ์ด ๋‘ ์˜ˆ์ธก์˜ ์ฐจ์ด๋Š” 25,000๊ฐœ ์ƒ˜ํ”Œ ์„ธํŠธ์—์„œ -1317.094 ์—์„œ 1088.859 ์ž…๋‹ˆ๋‹ค. ์‹ค์ œ ๋ ˆ์ด๋ธ”๊ณผ ๋น„๊ตํ•  ๋•Œ ์ด ๋‘ ์˜ˆ์ธก์˜ MAE/RMSE๋Š” ํฌ๊ฒŒ ๋‹ค๋ฅด์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

๋”ฐ๋ผ์„œ MAE/RMSE๊ฐ€ ํฌ๊ฒŒ ๋‹ค๋ฅด์ง€ ์•Š๊ธฐ ๋•Œ๋ฌธ์— ๋กœ๋“œ/์ €์žฅ ์ค‘ ๋ฐ˜์˜ฌ๋ฆผ ์˜ค๋ฅ˜์™€ ๊ด€๋ จ์ด ์žˆ๋‹ค๊ณ  ์ƒ๊ฐํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋ž˜๋„ ๋ชจ๋ธ์„ ์ €์žฅํ•˜๋Š” ๋ฐ”์ด๋„ˆ๋ฆฌ๊ฐ€ ๋ฐ˜์˜ฌ๋ฆผ ์˜ค๋ฅ˜๋ฅผ ๋ฐœ์ƒ์‹œํ‚ค์ง€ ์•Š์•„์•ผ ํ•˜๊ธฐ ๋•Œ๋ฌธ์— ์ด๊ฒƒ์ด ์ด์ƒํ•˜๋‹ค๊ณ  ์ƒ๊ฐํ•ฉ๋‹ˆ๊นŒ?

Does anyone have a clue?

PS: Uploading and documenting the full training process does not seem important here. If needed, I can provide details or put together a simulation with dummy data to prove the point.

Blocking bug

Most helpful comment

์ˆ˜์ˆ˜๊ป˜๋ผ๊ฐ€ ํ’€๋ ธ์Šต๋‹ˆ๋‹ค. ์ง„์งœ ์›์ธ์„ ์•Œ์•„๋ƒˆ์Šต๋‹ˆ๋‹ค. ๋ชจ๋ธ์ด ๋””์Šคํฌ์— ์ €์žฅ๋˜๋ฉด ์กฐ๊ธฐ ์ค‘์ง€์— ๋Œ€ํ•œ ์ •๋ณด๊ฐ€ ์‚ญ์ œ๋ฉ๋‹ˆ๋‹ค. ์˜ˆ์ œ์—์„œ XGBoost๋Š” 6381๊ฐœ์˜ ๋ถ€์ŠคํŒ… ๋ผ์šด๋“œ๋ฅผ ์‹คํ–‰ํ•˜๊ณ  6378๊ฐœ์˜ ๋ผ์šด๋“œ์—์„œ ์ตœ์ƒ์˜ ๋ชจ๋ธ์„ ์ฐพ์Šต๋‹ˆ๋‹ค. ๋ฉ”๋ชจ๋ฆฌ์˜ ๋ชจ๋ธ ๊ฐœ์ฒด์—๋Š” ์ œ๊ฑฐ๋œ ํŠธ๋ฆฌ๊ฐ€ ์—†๊ธฐ ๋•Œ๋ฌธ์— 6378๊ฐœ์˜ ํŠธ๋ฆฌ๊ฐ€ ์•„๋‹ˆ๋ผ 6381๊ฐœ์˜ ํŠธ๋ฆฌ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. ์–ด๋–ค ๋ฐ˜๋ณต์ด ๊ฐ€์žฅ ์ข‹์•˜๋Š”์ง€ ๊ธฐ์–ตํ•˜๋Š” ์ถ”๊ฐ€ ํ•„๋“œ best_iteration ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค.

> fit$best_iteration
[1] 6378

This extra field is silently dropped when the model is saved to disk. So predict() with the original model uses 6378 trees, whereas predict() with the restored model uses 6381 trees.

> x <- predict(fit, newdata = dtrain2, predleaf = TRUE)
> x2 <- predict(fit.loaded, newdata = dtrain2, predleaf = TRUE)
> dim(x)
[1] 5000 6378
> dim(x2)
[1] 5000 6381

All 33 comments

๋ฐ”์ด๋„ˆ๋ฆฌ ๋˜๋Š” json ๋ชจ๋‘์— ๋Œ€ํ•ด ๋ฐ˜์˜ฌ๋ฆผ ์˜ค๋ฅ˜๊ฐ€ ์—†์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ๋‹คํŠธ๋ฅผ ์‚ฌ์šฉํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๊นŒ?

No, I am not:

params <- list(objective = 'reg:squarederror',
               max_depth = 10, eta = 0.02, subsample = 0.5,
               base_score = median(xgboost::getinfo(xgb.train, 'label'))
)

xgboost::xgb.train(
  params = params, data = xgb.train,
  watchlist = list('train' = xgb.train, 'test' = xgb.test),
  nrounds = 10000, verbose = TRUE, print_every_n = 25,
  eval_metric = 'mae',
  early_stopping_rounds = 3, maximize = FALSE)

์ด ํ˜„์ƒ์ด ๋ฐœ์ƒํ•˜๋Š” ๋”๋ฏธ ๋ฐ์ดํ„ฐ๋ฅผ ์ œ๊ณตํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๊นŒ?

Here you go (quick & dirty):

N <- 100000
set.seed(2020)
X <- data.frame('X1' = rnorm(N), 'X2' = runif(N), 'X3' = rpois(N, lambda = 1))
Y <- with(X, X1 + X2 - X3 + X1*X2^2 - ifelse(X1 > 0, 2, X3))

params <- list(objective = 'reg:squarederror',
               max_depth = 5, eta = 0.02, subsample = 0.5,
               base_score = median(Y)
)

dtrain <- xgboost::xgb.DMatrix(data = data.matrix(X), label = Y)

fit <- xgboost::xgb.train(
  params = params, data = dtrain,
  watchlist = list('train' = dtrain),
  nrounds = 10000, verbose = TRUE, print_every_n = 25,
  eval_metric = 'mae',
  early_stopping_rounds = 3, maximize = FALSE
)

pred <- stats::predict(fit, newdata = dtrain)

xgboost::xgb.save(fit, 'booster.raw')
fit.loaded <- xgboost::xgb.load('booster.raw')

pred.loaded <- stats::predict(fit.loaded, newdata = dtrain)

identical(pred, pred.loaded)
pred[1:10]
pred.loaded[1:10]

sqrt(mean((Y - pred)^2))
sqrt(mean((Y - pred.loaded)^2))

On my machine, identical(pred, pred.loaded) is FALSE (i.e., it should be TRUE). Here is the output of the last commands:

> identical(pred, pred.loaded)
[1] FALSE
> pred[1:10]
 [1] -4.7971768 -2.5070562 -0.8889422 -4.9199696 -4.4374819 -0.2739395 -0.9825708  0.4579227  1.3667605 -4.3333349
> pred.loaded[1:10]
 [1] -4.7971768 -2.5070562 -0.8889424 -4.9199696 -4.4373770 -0.2739397 -0.9825710  0.4579227  1.3667605 -4.3333349
> 
> sqrt(mean((Y - pred)^2))
[1] 0.02890702
> sqrt(mean((Y - pred.loaded)^2))
[1] 0.02890565

์˜ˆ์ธก์ด ๋•Œ๋•Œ๋กœ ์•ฝ๊ฐ„ ๋‹ค๋ฅด๋‹ค๋Š” ๊ฒƒ์„ ์•Œ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์ปดํ“จํ„ฐ์—์„œ ์˜ˆ์ œ ์ฝ”๋“œ๋ฅผ ๋‹ค์‹œ ์‹คํ–‰ํ•˜๊ณ  ๋™์ผํ•œ ๋ฌธ์ œ๊ฐ€ ์žˆ๋Š”์ง€ ํ™•์ธํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๊นŒ?

R ๋ฐ xgboost์— ๋Œ€ํ•œ ์ถ”๊ฐ€ ์ •๋ณด:

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] compiler_3.6.1    magrittr_1.5      Matrix_1.2-17     tools_3.6.1       yaml_2.2.0        xgboost_0.90.0.2  stringi_1.4.3     grid_3.6.1       
 [9] data.table_1.12.4 lattice_0.20-38 

๋˜ํ•œ ๋‹ค์Œ ์‚ฌํ•ญ์— ์œ ์˜ํ•˜์‹ญ์‹œ์˜ค.

> identical(fit$raw, fit.loaded$raw)
[1] TRUE

์Šคํฌ๋ฆฝํŠธ ์ฃผ์…”์„œ ๊ฐ์‚ฌํ•ฉ๋‹ˆ๋‹ค. ๊ทธ๋ƒฅ ์—…๋ฐ์ดํŠธ, ๋‚˜๋Š” ๋‹ค์Œ์„ ์‚ฌ์šฉํ•˜์—ฌ json๊ณผ ๋ฐ”์ด๋„ˆ๋ฆฌ ํŒŒ์ผ์— ์ €์žฅํ•˜์—ฌ ์‹คํ–‰ํ–ˆ์Šต๋‹ˆ๋‹ค.

xgboost::xgb.save(fit, 'booster.json')
fit.loaded <- xgboost::xgb.load('booster.json')

xgboost::xgb.save(fit.loaded, 'booster-1.json')

The hashes of booster.json and booster-1.json (via sha256sum ./booster.json) are exactly identical, so my guess is that there is a discrepancy somewhere due to floating-point arithmetic.
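As a sanity check that does not rely on an external hash tool, the two files can also be compared byte-for-byte from R; a minimal base-R sketch using the file names above:

f1 <- readBin('booster.json',   what = 'raw', n = file.info('booster.json')$size)
f2 <- readBin('booster-1.json', what = 'raw', n = file.info('booster-1.json')$size)
identical(f1, f2)   # TRUE here, consistent with the matching sha256 hashes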

์›์ธ์„ ๋ชจ๋ฅธ ์ฑ„ ์ด์Šˆ๋ฅผ ์ข…๋ฃŒํ•˜๋Š” ์ด์œ ๋Š” ๋ฌด์—‡์ž…๋‹ˆ๊นŒ?

@trivialfis Did you get TRUE for identical(pred, pred.loaded)? The OP is asking why the predictions don't match even though the two models have identical binary signatures.

์ง์ ‘ ์žฌํ˜„ํ•ด ๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

Sorry. The cause I found is the prediction cache. After loading the model, the prediction values come from the actual computation instead of the cached values.

So my guess is that there is a discrepancy somewhere due to floating-point arithmetic.

So the prediction cache interacts with floating-point arithmetic in a destructive way?

@hcho3 It is an issue I found while implementing the new serialization (pickling) support. I believe the cache plays an important role here. So first, reduce the number of trees to 1000 (which is still quite large and should be enough for a demo).

์บ์‹œ๋ฅผ ๋ฐฉํ•ดํ•˜์ง€ ์•Š๋„๋ก ์˜ˆ์ธกํ•˜๊ธฐ ์ „์— DMatrix๋ฅผ ๋‹ค์‹œ ๊ตฌ์„ฑํ•ฉ๋‹ˆ๋‹ค.

dtrain_2 <- xgboost::xgb.DMatrix(data = data.matrix(X), label = Y)

pred <- stats::predict(fit, newdata = dtrain_2)

This passes the identical test. Otherwise it fails.

๋” ๋งŽ์€ ๋‚˜๋ฌด์— ๋“ค์–ด๊ฐ€๋Š” ๋™์ผํ•œ ํ…Œ์ŠคํŠธ์—๋Š” ์—ฌ์ „ํžˆ ์ž‘์€ ์ฐจ์ด๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค(2000๊ทธ๋ฃจ์˜ ๋‚˜๋ฌด์— ๋Œ€ํ•ด 1e-7). ํ•˜์ง€๋งŒ ๋ฉ€ํ‹ฐ ์“ฐ๋ ˆ๋“œ ํ™˜๊ฒฝ์—์„œ๋„ ๋น„ํŠธ ๋‹จ์œ„๋กœ ๋™์ผํ•œ ๊ฒฐ๊ณผ๋ฅผ ์ƒ์„ฑํ•ด์•ผ ํ•ฉ๋‹ˆ๊นŒ?

๋ถ€๋™ ์†Œ์ˆ˜์  ํ•ฉ๊ณ„๋Š” ์—ฐ๊ด€๋˜์ง€ ์•Š์œผ๋ฏ€๋กœ ์›ํ•˜๋Š” ๊ฒฝ์šฐ ๊ณ„์‚ฐ ์ˆœ์„œ๋ฅผ ๊ฐ•๋ ฅํ•˜๊ฒŒ ๋ณด์žฅํ•˜๊ธฐ ์œ„ํ•ด ํ•  ์ผ ํ•ญ๋ชฉ์œผ๋กœ ๋งŒ๋“ค ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

Actually, strictly guaranteeing the order will not be enough (it would help a lot, but there would still be discrepancies). Floating-point values in CPU FPU registers can carry higher precision before being stored back to memory (hardware implementations may use higher precision for intermediate values: https://en.wikipedia.org/wiki/Extended_precision). My point is that when the result for 1000 trees is exactly reproducible within 32-bit floating point, it is unlikely to be a programming bug.

๋ถ€๋™ ์†Œ์ˆ˜์  ํ•ฉ๊ณ„๊ฐ€ ์—ฐ๊ด€๋˜์ง€ ์•Š๋Š”๋‹ค๋Š” ๋ฐ ๋™์˜ํ•ฉ๋‹ˆ๋‹ค. ์Šคํฌ๋ฆฝํŠธ๋ฅผ ์ง์ ‘ ์‹คํ–‰ํ•˜๊ณ  ๊ทธ ์ฐจ์ด๊ฐ€ ๋ถ€๋™ ์†Œ์ˆ˜์  ์‚ฐ์ˆ ์— ๊ธฐ์ธํ•  ๋งŒํผ ์ถฉ๋ถ„ํžˆ ์ž‘์€์ง€ ํ™•์ธํ•ฉ๋‹ˆ๋‹ค.

์ผ๋ฐ˜์ ์œผ๋กœ ๋‘ ๊ฐœ์˜ float ๋ฐฐ์—ด์ด ์„œ๋กœ ๊ฑฐ์˜ ๊ฐ™์€์ง€ ํ…Œ์ŠคํŠธํ•˜๊ธฐ ์œ„ํ•ด ์ผ๋ฐ˜์ ์œผ๋กœ np.testing.assert_almost_equal ๋ฅผ decimal=5 ์™€ ํ•จ๊ป˜ ์‚ฌ์šฉํ•ฉ๋‹ˆ๋‹ค.

์˜ˆ. ์ž์„ธํ•œ ์„ค๋ช… ์—†์ด ์ข…๋ฃŒ๋œ ์  ์‚ฌ๊ณผ๋“œ๋ฆฝ๋‹ˆ๋‹ค.

@hcho3 ์—…๋ฐ์ดํŠธ๊ฐ€ ์žˆ์Šต๋‹ˆ๊นŒ?

์•„์ง ํ•ด๊ฒฐํ•˜์ง€ ๋ชปํ–ˆ์Šต๋‹ˆ๋‹ค. ์ด๋ฒˆ ์ฃผ์— ์‚ดํŽด๋ณด๊ฒ ์Šต๋‹ˆ๋‹ค.

@trivialfis I was able to reproduce the bug. I ran the provided script and got FALSE for identical(pred, pred.loaded). I also tried creating a new DMatrix dtrain_2 as you suggested, but the test still returned FALSE.

@DavorJ ์Šคํฌ๋ฆฝํŠธ์˜ ์ถœ๋ ฅ:

[1] FALSE     # identical(pred, pred.loaded)
 [1] -4.7760534 -2.5083885 -0.8860036 -4.9163256 -4.4455137 -0.2548684
 [7] -0.9745615  0.4646015  1.3602829 -4.3288369     # pred[1:10]
 [1] -4.7760534 -2.5083888 -0.8860038 -4.9163256 -4.4454765 -0.2548686
 [7] -0.9745617  0.4646015  1.3602829 -4.3288369     # pred.loaded[1:10]
[1] 0.02456085   # MSE on pred
[1] 0.02455945   # MSE on pred.loaded

Output of the script modified with dtrain_2 <- xgboost::xgb.DMatrix(data = data.matrix(X), label = Y):

[1] FALSE     # identical(pred, pred.loaded)
 [1] -4.7760534 -2.5083885 -0.8860036 -4.9163256 -4.4455137 -0.2548684
 [7] -0.9745615  0.4646015  1.3602829 -4.3288369     # pred[1:10]
 [1] -4.7760534 -2.5083888 -0.8860038 -4.9163256 -4.4454765 -0.2548686
 [7] -0.9745617  0.4646015  1.3602829 -4.3288369     # pred.loaded[1:10]
[1] 0.02456085   # MSE on pred
[1] 0.02455945   # MSE on pred.loaded

๋”ฐ๋ผ์„œ ๋‹ค๋ฅธ ์ผ์ด ์ง„ํ–‰๋˜๊ณ  ์žˆ์–ด์•ผ ํ•ฉ๋‹ˆ๋‹ค.

I also tried a round-trip test:

xgboost::xgb.save(fit, 'booster.raw')
fit.loaded <- xgboost::xgb.load('booster.raw')
xgboost::xgb.save(fit.loaded, 'booster.raw.roundtrip')

๋‘ ๋ฐ”์ด๋„ˆ๋ฆฌ ํŒŒ์ผ booster.raw ๋ฐ booster.raw.roundtrip ๋Š” ๋™์ผํ–ˆ์Šต๋‹ˆ๋‹ค.

The maximum difference between pred and pred.loaded is 0.0008370876.

A smaller example that runs faster:

library(xgboost)

N <- 5000
set.seed(2020)
X <- data.frame('X1' = rnorm(N), 'X2' = runif(N), 'X3' = rpois(N, lambda = 1))
Y <- with(X, X1 + X2 - X3 + X1*X2^2 - ifelse(X1 > 0, 2, X3))

params <- list(objective = 'reg:squarederror',
               max_depth = 5, eta = 0.02, subsample = 0.5,
               base_score = median(Y)
)

dtrain <- xgboost::xgb.DMatrix(data = data.matrix(X), label = Y)

fit <- xgboost::xgb.train(
  params = params, data = dtrain,
  watchlist = list('train' = dtrain),
  nrounds = 10000, verbose = TRUE, print_every_n = 25,
  eval_metric = 'mae',
  early_stopping_rounds = 3, maximize = FALSE
)

pred <- stats::predict(fit, newdata = dtrain)

invisible(xgboost::xgb.save(fit, 'booster.raw'))
fit.loaded <- xgboost::xgb.load('booster.raw')
invisible(xgboost::xgb.save(fit.loaded, 'booster.raw.roundtrip'))

pred.loaded <- stats::predict(fit.loaded, newdata = dtrain)

identical(pred, pred.loaded)
pred[1:10]
pred.loaded[1:10]
max(abs(pred - pred.loaded))

sqrt(mean((Y - pred)^2))
sqrt(mean((Y - pred.loaded)^2))

Output:

[1] FALSE
 [1] -2.4875379 -0.9452241 -6.9658904 -2.9985323 -4.2192593 -0.8505422
 [7] -0.3928839 -1.6886091 -1.3611379 -3.1278882
 [1] -2.4875379 -0.9452239 -6.9658904 -2.9985323 -4.2192593 -0.8505420
 [7] -0.3928837 -1.6886090 -1.3611377 -3.1278882
[1] 0.0001592636
[1] 0.01370754
[1] 0.01370706

ํ•œ ๋ฒˆ์˜ ์ถ”๊ฐ€ ์™•๋ณต์„ ์‹œ๋„ํ–ˆ๋Š”๋ฐ ์ด์ œ ์˜ˆ์ธก์ด ๋” ์ด์ƒ ๋ณ€๊ฒฝ๋˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

library(xgboost)

N <- 5000
set.seed(2020)
X <- data.frame('X1' = rnorm(N), 'X2' = runif(N), 'X3' = rpois(N, lambda = 1))
Y <- with(X, X1 + X2 - X3 + X1*X2^2 - ifelse(X1 > 0, 2, X3))

params <- list(objective = 'reg:squarederror',
               max_depth = 5, eta = 0.02, subsample = 0.5,
               base_score = median(Y)
)

dtrain <- xgboost::xgb.DMatrix(data = data.matrix(X), label = Y)

fit <- xgboost::xgb.train(
  params = params, data = dtrain,
  watchlist = list('train' = dtrain),
  nrounds = 10000, verbose = TRUE, print_every_n = 25,
  eval_metric = 'mae',
  early_stopping_rounds = 3, maximize = FALSE
)

pred <- stats::predict(fit, newdata = dtrain)

invisible(xgboost::xgb.save(fit, 'booster.raw'))
fit.loaded <- xgboost::xgb.load('booster.raw')
invisible(xgboost::xgb.save(fit.loaded, 'booster.raw.roundtrip'))
fit.loaded2 <- xgboost::xgb.load('booster.raw.roundtrip')

pred.loaded <- stats::predict(fit.loaded, newdata = dtrain)
pred.loaded2 <- stats::predict(fit.loaded2, newdata = dtrain)

identical(pred, pred.loaded)
identical(pred.loaded, pred.loaded2)
pred[1:10]
pred.loaded[1:10]
pred.loaded2[1:10]
max(abs(pred - pred.loaded))
max(abs(pred.loaded - pred.loaded2))

sqrt(mean((Y - pred)^2))
sqrt(mean((Y - pred.loaded)^2))
sqrt(mean((Y - pred.loaded2)^2))

Result:

[1] FALSE
[1] TRUE
 [1] -2.4875379 -0.9452241 -6.9658904 -2.9985323 -4.2192593 -0.8505422
 [7] -0.3928839 -1.6886091 -1.3611379 -3.1278882
 [1] -2.4875379 -0.9452239 -6.9658904 -2.9985323 -4.2192593 -0.8505420
 [7] -0.3928837 -1.6886090 -1.3611377 -3.1278882
 [1] -2.4875379 -0.9452239 -6.9658904 -2.9985323 -4.2192593 -0.8505420
 [7] -0.3928837 -1.6886090 -1.3611377 -3.1278882
[1] 0.0001592636
[1] 0
[1] 0.01370754
[1] 0.01370706
[1] 0.01370706

๋”ฐ๋ผ์„œ ์˜ˆ์ธก ์บ์‹œ๊ฐ€ ์‹ค์ œ๋กœ ๋ฌธ์ œ์ผ ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค.

์˜ˆ์ธก ์บ์‹ฑ์„ ๋น„ํ™œ์„ฑํ™”ํ•œ ์ƒํƒœ์—์„œ ์Šคํฌ๋ฆฝํŠธ๋ฅผ ๋‹ค์‹œ ์‹คํ–‰ํ–ˆ์Šต๋‹ˆ๋‹ค.

diff --git a/src/predictor/cpu_predictor.cc b/src/predictor/cpu_predictor.cc
index ebc15128..c40309bc 100644
--- a/src/predictor/cpu_predictor.cc
+++ b/src/predictor/cpu_predictor.cc
@@ -259,7 +259,7 @@ class CPUPredictor : public Predictor {
     // delta means {size of forest} * {number of newly accumulated layers}
     uint32_t delta = end_version - beg_version;
     CHECK_LE(delta, model.trees.size());
-    predts->Update(delta);
+    //predts->Update(delta);

     CHECK(out_preds->Size() == output_groups * dmat->Info().num_row_ ||
           out_preds->Size() == dmat->Info().num_row_);

(์˜ˆ์ธก ์บ์‹ฑ์„ ๋น„ํ™œ์„ฑํ™”ํ•˜๋ฉด ํ›ˆ๋ จ ์†๋„๊ฐ€ ๋งค์šฐ ๋Š๋ ค์ง‘๋‹ˆ๋‹ค.)

Output:

[1] FALSE
[1] TRUE
 [1] -2.4908853 -0.9507379 -6.9615889 -2.9935317 -4.2165089 -0.8543566
 [7] -0.3940181 -1.6930715 -1.3572118 -3.1403396
 [1] -2.4908853 -0.9507380 -6.9615889 -2.9935317 -4.2165089 -0.8543567
 [7] -0.3940183 -1.6930716 -1.3572119 -3.1403399
 [1] -2.4908853 -0.9507380 -6.9615889 -2.9935317 -4.2165089 -0.8543567
 [7] -0.3940183 -1.6930716 -1.3572119 -3.1403399
[1] 0.0001471043
[1] 0
[1] 0.01284297
[1] 0.01284252
[1] 0.01284252

๋”ฐ๋ผ์„œ ์˜ˆ์ธก ์บ์‹œ๋Š” ํ™•์‹คํžˆ ์ด ๋ฒ„๊ทธ์˜ ์›์ธ์ด ์•„๋‹™๋‹ˆ๋‹ค .

๋ฆฌํ”„ ์˜ˆ์ธก๋„ ๋‹ค์–‘ํ•ฉ๋‹ˆ๋‹ค.

invisible(xgboost::xgb.save(fit, 'booster.raw'))
fit.loaded <- xgboost::xgb.load('booster.raw')
invisible(xgboost::xgb.save(fit.loaded, 'booster.raw.roundtrip'))
fit.loaded2 <- xgboost::xgb.load('booster.raw.roundtrip')

x <- predict(fit, newdata = dtrain2, predleaf = TRUE)
x2 <- predict(fit.loaded, newdata = dtrain2, predleaf = TRUE)
x3 <- predict(fit.loaded2, newdata = dtrain2, predleaf = TRUE)

identical(x, x2)
identical(x2, x3)

Output:

[1] FALSE
[1] TRUE

์ˆ˜์ˆ˜๊ป˜๋ผ๊ฐ€ ํ’€๋ ธ์Šต๋‹ˆ๋‹ค. ์ง„์งœ ์›์ธ์„ ์•Œ์•„๋ƒˆ์Šต๋‹ˆ๋‹ค. ๋ชจ๋ธ์ด ๋””์Šคํฌ์— ์ €์žฅ๋˜๋ฉด ์กฐ๊ธฐ ์ค‘์ง€์— ๋Œ€ํ•œ ์ •๋ณด๊ฐ€ ์‚ญ์ œ๋ฉ๋‹ˆ๋‹ค. ์˜ˆ์ œ์—์„œ XGBoost๋Š” 6381๊ฐœ์˜ ๋ถ€์ŠคํŒ… ๋ผ์šด๋“œ๋ฅผ ์‹คํ–‰ํ•˜๊ณ  6378๊ฐœ์˜ ๋ผ์šด๋“œ์—์„œ ์ตœ์ƒ์˜ ๋ชจ๋ธ์„ ์ฐพ์Šต๋‹ˆ๋‹ค. ๋ฉ”๋ชจ๋ฆฌ์˜ ๋ชจ๋ธ ๊ฐœ์ฒด์—๋Š” ์ œ๊ฑฐ๋œ ํŠธ๋ฆฌ๊ฐ€ ์—†๊ธฐ ๋•Œ๋ฌธ์— 6378๊ฐœ์˜ ํŠธ๋ฆฌ๊ฐ€ ์•„๋‹ˆ๋ผ 6381๊ฐœ์˜ ํŠธ๋ฆฌ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค. ์–ด๋–ค ๋ฐ˜๋ณต์ด ๊ฐ€์žฅ ์ข‹์•˜๋Š”์ง€ ๊ธฐ์–ตํ•˜๋Š” ์ถ”๊ฐ€ ํ•„๋“œ best_iteration ๊ฐ€ ์žˆ์Šต๋‹ˆ๋‹ค.

> fit$best_iteration
[1] 6378

This extra field is silently dropped when the model is saved to disk. So predict() with the original model uses 6378 trees, whereas predict() with the restored model uses 6381 trees.

> x <- predict(fit, newdata = dtrain2, predleaf = TRUE)
> x2 <- predict(fit.loaded, newdata = dtrain2, predleaf = TRUE)
> dim(x)
[1] 5000 6378
> dim(x2)
[1] 5000 6381
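Until this is fixed, one workaround is to capture the best iteration before saving and pass it explicitly when predicting with the reloaded booster. A minimal sketch, assuming the objects from the scripts above and a plain regression booster (one tree per round, so the best iteration equals the tree count):

best_iter <- fit$best_iteration          # remember the early-stopping result; it is lost by xgb.save()/xgb.load()

xgboost::xgb.save(fit, 'booster.raw')
fit.loaded <- xgboost::xgb.load('booster.raw')

# cap the reloaded booster at the same number of trees the in-memory model would use
pred.loaded <- predict(fit.loaded, newdata = dtrain2, ntreelimit = best_iter)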

@trivialfis I'm inclined to physically remove the trees. If training stops at round 6381 and the best iteration is at round 6378, users will expect the final model to have 6378 trees.

@hcho3 I think it is a similar issue to https://github.com/dmlc/xgboost/issues/4052.

best_iteration should be stored in Learner::attributes_, which is accessible via xgboost::xgb.attr.
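Booster attributes set with xgb.attr() are stored with the serialized model today, so a stop-gap along the lines of that suggestion could look like the sketch below (attribute handling is illustrative; xgb.attr() returns character values, hence the as.integer()):

xgboost::xgb.attr(fit, 'best_iteration') <- fit$best_iteration   # persist alongside the trees
xgboost::xgb.save(fit, 'booster.raw')

fit.loaded <- xgboost::xgb.load('booster.raw')
best_iter  <- as.integer(xgboost::xgb.attr(fit.loaded, 'best_iteration'))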

@hcho3 , ์ข‹์€ ๋ฐœ๊ฒฌ!

See also the documentation of xgboost:::predict.xgb.Booster():

[screenshot of the documentation omitted]

If I understand correctly, the documentation is not entirely accurate then? Based on the documentation, I expected predictions to already use all the trees. Unfortunately, I did not verify this.

@DavorJ When early stopping is enabled, predict() uses the best_iteration field to obtain predictions.

@trivialfis ์ƒํ™ฉ์€ Python ์ธก์—์„œ ๋” ๋‚˜์ฉ๋‹ˆ๋‹ค. xgb.predict() ๋Š” ์กฐ๊ธฐ ์ค‘์ง€์˜ ์ •๋ณด๋ฅผ ์ „ํ˜€ ์‚ฌ์šฉํ•˜์ง€ ์•Š๊ธฐ ๋•Œ๋ฌธ์ž…๋‹ˆ๋‹ค.

import xgboost as xgb
import numpy as np
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split

X, y = load_boston(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

params = {'objective': 'reg:squarederror'}

bst = xgb.train(params, dtrain, 100, [(dtrain, 'train'), (dtest, 'test')],
                early_stopping_rounds=5)

x = bst.predict(dtrain, pred_leaf=True)
x2 = bst.predict(dtrain, pred_leaf=True, ntree_limit=bst.best_iteration)
print(x.shape)
print(x2.shape)

pred = bst.predict(dtrain)
pred2 = bst.predict(dtrain, ntree_limit=bst.best_iteration)

print(np.max(np.abs(pred - pred2)))

Output:

Will train until test-rmse hasn't improved in 5 rounds.
[1]     train-rmse:12.50316     test-rmse:11.92709
...
[25]    train-rmse:0.56720      test-rmse:2.56874
[26]    train-rmse:0.54151      test-rmse:2.56722
[27]    train-rmse:0.51842      test-rmse:2.56124
[28]    train-rmse:0.47489      test-rmse:2.56640
[29]    train-rmse:0.45489      test-rmse:2.58780
[30]    train-rmse:0.43093      test-rmse:2.59385
[31]    train-rmse:0.41865      test-rmse:2.59364
[32]    train-rmse:0.40823      test-rmse:2.59465
Stopping. Best iteration:
[27]    train-rmse:0.51842      test-rmse:2.56124
(404, 33)
(404, 27)
0.81269073

์‚ฌ์šฉ์ž๋Š” predict() ํ˜ธ์ถœํ•  ๋•Œ bst.best_iteration ๋ฅผ ๊ฐ€์ ธ์™€ ntree_limit ์ธ์ˆ˜๋กœ ์ „๋‹ฌํ•ด์•ผ ํ•ฉ๋‹ˆ๋‹ค. ์ด๊ฒƒ์€ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ•˜๊ธฐ ์‰ฝ๊ณ  ๋ถˆ์พŒํ•œ ๋†€๋ผ์›€์„ ์ค๋‹ˆ๋‹ค.

์ˆ˜์ •์„ ์œ„ํ•œ ๋‘ ๊ฐ€์ง€ ์˜ต์…˜์ด ์žˆ์Šต๋‹ˆ๋‹ค.

  1. Physically remove the trees past best_iteration.
  2. Keep the best_iteration information when serializing the model and have predict() use it.

@hcho3 I have a half-baked idea about this, related to the process_type = update option and forests.

Background

๋ฌธ์ œ์˜ ์งง์€ ์š”์ ์„ ๋˜ํ’€์ด ์œ„ํ•ด ์šฐ๋ฆฌ์™€ ํ•จ๊ป˜์ด update ๊ฒฝ์šฐ, num_boost_round ์‚ฌ์šฉ update ์ด๋ฏธ ์กด์žฌํ•˜๋Š” ๋‚˜๋ฌด์˜ ์ˆ˜๋ณด๋‹ค ์ ์€์ด๋‹ค, ์—…๋ฐ์ดํŠธ๋˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค ๊ทธ ๋‚˜๋ฌด๊ฐ€ ์ œ๊ฑฐ๋ฉ๋‹ˆ๋‹ค .

As a brief introduction to the forest-related issue: the predict function needs a specific number of trees rather than an iteration, so best_iteration does not apply to forests. That is why Python has something called best_ntree_limit, which I find very confusing. I explicitly replaced ntree_limit with iteration_range in inplace_predict to avoid this attribute.

Idea

slice ๋ฐ concat ๋ฉ”์„œ๋“œ๋ฅผ booster ์— ์ถ”๊ฐ€ํ•˜๊ณ  ์‹ถ์Šต๋‹ˆ๋‹ค. ์ด ๋ฉ”์„œ๋“œ๋Š” ๋‚˜๋ฌด๋ฅผ 2๊ฐœ์˜ ๋ชจ๋ธ๋กœ ์ถ”์ถœํ•˜๊ณ  2๊ฐœ์˜ ๋ชจ๋ธ์—์„œ 1๊ฐœ์˜ ๋‚˜๋ฌด๋ฅผ ์—ฐ๊ฒฐํ•ฉ๋‹ˆ๋‹ค. ์ด ๋‘ ๊ฐ€์ง€ ๋ฐฉ๋ฒ•์ด ์žˆ๋‹ค๋ฉด :

  • base_margin_ is no longer needed, which I believe is more intuitive for other users.
  • ntree_limit in prediction is no longer needed; just slice the model and run prediction on the slice.
  • The update process becomes self-contained: it updates the trees in a slice all at once, without num_boost_rounds.

Going further

๋˜ํ•œ ์ด๊ฒƒ์ด ์–ด๋–ป๊ฒŒ ๋“  ๋‹ค์ค‘ ๋Œ€์ƒ ๋‚˜๋ฌด์™€ ์—ฐ๊ฒฐ๋˜์–ด ์žˆ๋‹ค๊ณ  ์ƒ๊ฐํ•ฉ๋‹ˆ๋‹ค. ๋ฏธ๋ž˜์— ๋‹ค์ค‘ ํด๋ž˜์Šค ๋‹ค์ค‘ ๋Œ€์ƒ ํŠธ๋ฆฌ๋ฅผ ์ง€์›ํ•  ์ˆ˜ ์žˆ๋Š” ๊ฒƒ์ฒ˜๋Ÿผ ๊ฐ ํด๋ž˜์Šค ๋˜๋Š” ๊ฐ ๋Œ€์ƒ์— ๋Œ€ํ•ด output_groups ๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ˆฒ ๋ฐ ๋ฒกํ„ฐ ์žŽ๊ณผ ์Œ์„ ์ด๋ฃจ๋Š” ๋“ฑ ํŠธ๋ฆฌ๋ฅผ ์ •๋ ฌํ•˜๋Š” ์—ฌ๋Ÿฌ ๋ฐฉ๋ฒ•์ด ์žˆ์„ ๊ฒƒ์ž…๋‹ˆ๋‹ค. ntree_limit ๋กœ๋Š” ์ถฉ๋ถ„ํ•˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค.

See also #5531.

๊ทธ๋Ÿฌ๋‚˜ ์•„์ด๋””์–ด๋Š” ๋งค์šฐ ์ดˆ๊ธฐ์— ์žˆ์–ด์„œ ๊ณต์œ ํ•  ์ž์‹ ์ด ์—†์—ˆ์Šต๋‹ˆ๋‹ค. ์ด์ œ ์šฐ๋ฆฌ๋Š” ์ด ๋ฌธ์ œ์— ๋Œ€ํ•ด ์ด์•ผ๊ธฐํ•˜๊ณ  ์žˆ์Šต๋‹ˆ๋‹ค. ์•„๋งˆ๋„ ์ด์— ๋Œ€ํ•œ ์ •๋ณด๋ฅผ ์–ป์„ ์ˆ˜ ์žˆ์„ ๊ฒƒ์ž…๋‹ˆ๋‹ค.

1.1 ํƒ€์ž„๋ผ์ธ์ด ์ฃผ์–ด์ง€๋ฉด ์‚ฌ์šฉ์ž๊ฐ€ ์˜ˆ์ธก์—์„œ ์ด ์ตœ์ƒ์˜ ๋ฐ˜๋ณต์„ ์ˆ˜๋™์œผ๋กœ ์บก์ฒ˜ํ•˜๊ณ  ์‚ฌ์šฉํ•ด์•ผ ํ•˜๋Š” ๋ฐฉ๋ฒ•์„ ๋ช…ํ™•ํžˆ ํ•˜๊ธฐ ์œ„ํ•ด ๋ฌธ์„œ๋ฅผ ํ™•์žฅํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๊นŒ?
๋ฆด๋ฆฌ์Šค ์ •๋ณด์˜ ์•Œ๋ ค์ง„ ๋ฌธ์ œ์— ์ถ”๊ฐ€ํ•˜์‹œ๊ฒ ์Šต๋‹ˆ๊นŒ?

@trivialfis That sounds interesting to me, as long as it doesn't make the configuration any more complicated.

@hcho3 ์ด ์ œ์•ˆํ•œ ๋Œ€๋กœ ๋ชจ๋ธ์—์„œ ์ถ”๊ฐ€ ํŠธ๋ฆฌ๋ฅผ ์‚ญ์ œํ•˜๋Š” ๊ฒƒ์€ ์‹ค์ œ ๋ชจ๋ธ ๊ธธ์ด์™€ ์ด๋ก ์ ์ธ ๋ชจ๋ธ ๊ธธ์ด๋ฅผ ๋™์‹œ์— ๊ฐ€์งˆ ๋•Œ ๋ถˆ์ผ์น˜๋ฅผ ์ฒ˜๋ฆฌํ•  ํ•„์š”๊ฐ€ ์—†๊ธฐ ๋•Œ๋ฌธ์— ๋งค๋ ฅ์ ์ž…๋‹ˆ๋‹ค.

์ด ํŽ˜์ด์ง€๊ฐ€ ๋„์›€์ด ๋˜์—ˆ๋‚˜์š”?
0 / 5 - 0 ๋“ฑ๊ธ‰
