In this PR, we change the init params for the Elastic Net Classifier (ENC) in order to pass the shap tests (`test_shap`) in the `test_algorithms.py` file. Leaving the init params as `alpha=0.0001` and `l1_ratio=0.15` caused a `ZeroDivisionError` while calculating shap values, which is likely linked to this change. Filing this issue to determine why the test fails and to figure out a good way to avoid this error.
I think this is a shap issue. I filed https://github.com/slundberg/shap/issues/2000 for discussion. In the short term, I think we can use `KernelExplainer` for linear models instead of the current `explainer = shap.LinearExplainer(classifier, X, link=shap.links.logit)`.
A further test that failed: on main, using `alpha=0.5, l1_ratio=0.5` (which is what we had prior to this change), if we add this test to `evalml/tests/model_understanding_tests/prediction_explaination_tests/test_explainers.py`:
```python
import pytest

from evalml.model_understanding.prediction_explanations import explain_predictions
from evalml.pipelines import BinaryClassificationPipeline


@pytest.mark.parametrize("estimator", ["Extra Trees Classifier", "Elastic Net Classifier"])
def test_elastic_net(estimator, fraud_100):
    pytest.importorskip("imblearn", reason="Skipping test because imblearn not installed")
    X, y = fraud_100
    pipeline = BinaryClassificationPipeline(
        component_graph=["Imputer", "One Hot Encoder", "DateTime Featurization Component", estimator]
    )
    pipeline.fit(X=X, y=y)
    pipeline.predict(X)
    report = explain_predictions(pipeline, X, y, indices_to_explain=[0], top_k_features=4,
                                 output_format="dataframe")
    assert report['feature_names'].isnull().sum() == 0
    assert report['feature_values'].isnull().sum() == 0
```
The test fails. If we change `alpha` and `l1_ratio`, it still fails.
Ok @bchen1116, go ahead with it.
Closing with this PR.