A new feature we could add for model understanding is prediction explanation. This would answer the question "why did my model predict x?" by showing users which input features had the greatest impact on that prediction. This sort of feature is useful for debugging models from a data-setup perspective: users could examine predictions they've categorized as "bad" and then alter or eliminate the features that contributed most to those predictions.
Some resources:
We should look into using SHAP (SHapley Additive exPlanations)
https://github.com/slundberg/shap
I discovered this library via this notebook:
https://github.com/d6t/d6t-python/blob/master/blogs/blog-20200426-shapley.ipynb
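As a toy illustration of the idea behind SHAP (this is a from-scratch sketch, not the shap library's API): a Shapley value attributes the difference between a model's prediction for an instance and a baseline prediction across the input features, by averaging each feature's marginal contribution over all subsets of the other features. The model, instance, and baseline below are hypothetical examples.

```python
from itertools import combinations
from math import factorial

def shapley_values(f, x, baseline):
    """Exact Shapley feature attributions for the prediction f(x).

    For each feature i, averages the marginal effect of switching
    feature i from baseline[i] to x[i] over all subsets S of the
    remaining features, weighted by |S|! * (n-|S|-1)! / n!.
    Exponential in the number of features, so only viable for toys;
    the shap library approximates this efficiently.
    """
    n = len(x)
    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            for s in combinations(others, size):
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [x[j] if (j in s or j == i) else baseline[j] for j in range(n)]
                without_i = [x[j] if j in s else baseline[j] for j in range(n)]
                phi += weight * (f(with_i) - f(without_i))
        phis.append(phi)
    return phis

# Toy linear model: for linear models, feature i's Shapley value
# reduces to coef[i] * (x[i] - baseline[i]).
f = lambda z: 2 * z[0] + 3 * z[1] + 1 * z[2]
x, baseline = [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]
phis = shapley_values(f, x, baseline)  # -> [2.0, 3.0, 1.0]
# Efficiency property: the attributions sum to f(x) - f(baseline).
assert abs(sum(phis) - (f(x) - f(baseline))) < 1e-9
```

The "efficiency" check at the end is the additive guarantee that makes these attributions directly interpretable as "this feature pushed the prediction up/down by this much."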
Design Document: https://alteryx.quip.com/D4ZwAOUklY5X/Prediction-Explanations
@freddyaboulton and I met to discuss this yesterday. This is ready for implementation. Below is the implementation plan from the design doc:
Tasks
Phase 1
Note: until all this is complete, we should keep the implementation private for the July release, i.e. _explain_prediction
Overall estimate: 9 days
Phase 2
Overall estimate: 5 days
Key Dates
July release: July 28, 2020.
Goal
Merge Phase 1 by Tues August 4th
Merge Phase 2 by Tues August 11th
Stretch Goal
Merge Phase 1 by Tues July 28th (July release)
Merge Phase 2 by Tues Aug 4th
Hey @freddyaboulton, to date we've been keeping epics in the Epic pipeline and instead moving the individual issues through the pipeline. Could you please follow that pattern here as well? If that feels weird or incorrect to you, I'm happy to discuss changing our process for how we organize epics. It's pretty simplistic at the moment.
@dsherry My mistake! Keeping epics in the epic pipeline makes sense to me 👍
@freddyaboulton from my perspective, we should finish reviewing the shap qualitative analysis you did (which is super helpful!!), resolve those discussions and perhaps make some fixes/updates. But what I see in there already feels good enough to make public for July!
To confirm: explain_predictions is now public and in the API docs, and we added a user guide, correct? Meaning it will be part of the July release? So great!!
@freddyaboulton can this epic be closed?
@dsherry I think once we get #1107 merged we can close this!