Evalml: SHAP tests fail with Elastic Net Classifier estimator

Created on May 17, 2021  ·  4 comments  ·  Source: alteryx/evalml

In this PR, we change the initialization parameters of the ENC (Elastic Net Classifier) so that the shap tests (test_shap) in test_algorithms.py pass. Leaving the init parameters at alpha=0.0001 and l1_ratio=0.15 raised a ZeroDivisionError while computing shap values, which is likely connected to this.

Filing this issue to figure out why the test fails and to find a good way to prevent this error.

All 4 comments

I think this is a shap problem. I filed https://github.com/slundberg/shap/issues/2000 for discussion. In the short term, I think we can do either of the following:

  • Use link="identity" in KernelExplainer for linear models
  • Use LinearExplainer with the logit link: explainer = shap.LinearExplainer(classifier, X, link=shap.links.logit)
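The thread does not spell out why switching the link would avoid the error. The sketch below shows one way a logit link can raise a ZeroDivisionError while an identity link cannot. It assumes the failure comes from evaluating log(p / (1 - p)) on a predicted probability that has saturated to exactly 1.0; that cause is an assumption and is not confirmed here. The functions identity_link and logit_link are hypothetical stand-ins written in plain Python, not shap's own implementation.

```python
import math


def identity_link(p: float) -> float:
    """Identity link: return the model output unchanged."""
    return p


def logit_link(p: float) -> float:
    """Logit link: map a probability to log-odds, log(p / (1 - p))."""
    return math.log(p / (1.0 - p))


# A probability strictly between 0 and 1 works under both links.
print(identity_link(0.75), logit_link(0.75))

# A probability saturated to exactly 1.0 is fine under the identity link...
print(identity_link(1.0))

# ...but the logit divides by (1 - p) == 0 (assumed failure mode).
try:
    logit_link(1.0)
except ZeroDivisionError as exc:
    print("logit link failed:", exc)
```

If this is the cause, explaining the model in probability space (identity link) sidesteps the division, at the cost of explanations that are no longer expressed in log-odds.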

Additional test that failed:

Adding this test to evalml/tests/model_understanding_tests/prediction_explaination_tests/test_explainers.py on main, which uses alpha=0.5 and l1_ratio=0.5:

import pytest

from evalml.model_understanding.prediction_explanations import explain_predictions
from evalml.pipelines import BinaryClassificationPipeline


@pytest.mark.parametrize("estimator", ["Extra Trees Classifier", "Elastic Net Classifier"])
def test_elastic_net(estimator, fraud_100):
    pytest.importorskip('imblearn', reason='Skipping test because imblearn not installed')
    X, y = fraud_100
    pipeline = BinaryClassificationPipeline(component_graph=["Imputer", "One Hot Encoder", "DateTime Featurization Component", estimator])
    pipeline.fit(X=X, y=y)
    pipeline.predict(X)
    # Bind the result to `report`, the name the assertions use, and request a
    # DataFrame so that column access and isnull() work.
    report = explain_predictions(pipeline, X, y, indices_to_explain=[0], top_k_features=4, output_format="dataframe")
    assert report['feature_names'].isnull().sum() == 0
    assert report['feature_values'].isnull().sum() == 0

Test failure:
[image: test failure output]

The test still fails after changing alpha and l1_ratio.

Sounds good @bchen1116, let's go with that.

Closing with this PR
