In this PR, we change the init params for the Elastic Net Classifier (ENC) in order to pass the shap tests (`test_shap`) in the `test_algorithms.py` file. Leaving the init params as `alpha=0.0001` and `l1_ratio=0.15` caused a `ZeroDivisionError` while calculating shap values, which is likely linked to this change. Filing this issue to determine why the test fails and to figure out a good way to avoid this error.
I think this is a shap issue. I filed https://github.com/slundberg/shap/issues/2000 for discussion. In the short term, I think we can use `KernelExplainer` for linear models instead of the current `explainer = shap.LinearExplainer(classifier, X, link=shap.links.logit)`.
A further test that failed: on main, using `alpha=0.5, l1_ratio=0.5` (which is what we had prior to this change), if we add this test to `evalml/tests/model_understanding_tests/prediction_explaination_tests/test_explainers.py`:
```python
import pytest

from evalml.model_understanding.prediction_explanations import explain_predictions
from evalml.pipelines import BinaryClassificationPipeline


@pytest.mark.parametrize("estimator", ["Extra Trees Classifier", "Elastic Net Classifier"])
def test_elastic_net(estimator, fraud_100):
    pytest.importorskip("imblearn", reason="Skipping test because imblearn not installed")
    X, y = fraud_100
    pipeline = BinaryClassificationPipeline(
        component_graph=["Imputer", "One Hot Encoder", "DateTime Featurization Component", estimator]
    )
    pipeline.fit(X=X, y=y)
    pipeline.predict(X)
    report = explain_predictions(pipeline, X, y, indices_to_explain=[0], top_k_features=4,
                                 output_format="dataframe")
    assert report['feature_names'].isnull().sum() == 0
    assert report['feature_values'].isnull().sum() == 0
```
The test fails. If we change `alpha` and `l1_ratio`, it still fails.
Ok @bchen1116, go ahead with it.
Closing with this PR.