Evalml: Disallow recall as an automl objective

Created on 10 Mar 2020 · 5 Comments · Source: alteryx/evalml

Problem
A model which always predicts true has a perfect recall score. By allowing automl to optimize for recall, we're encouraging it to produce a trivial model.
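
To make the failure mode concrete, here is a minimal sketch in plain Python. It does not use any evalml API; the `recall` helper and the labels are made up for illustration.

```python
def recall(y_true, y_pred):
    """Recall = true positives / (true positives + false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp / (tp + fn)


# Made-up labels: 2 positives out of 10 rows.
y_true = [True, False, False, True, False, False, False, False, False, False]

# A "model" that ignores its input and always predicts true.
always_true = [True] * len(y_true)

# No positive is ever missed, so there are no false negatives.
trivial_recall = recall(y_true, always_true)
print(trivial_recall)  # 1.0
```

Because false positives never enter the formula, nothing penalizes the model for flagging every row.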

Helpful reference here.

Proposal
Delete the recall objective.

In general, we should limit the set of automl objectives to be those which we feel are valuable, and where optimizing for those objectives will produce good models.

I think we should also add more binary classification objectives. #457 includes a proposal for one which could be good for imbalanced classes.

Questions
* Can a similar argument be made for precision? Or is there value in optimizing for that?
* Can a similar argument be made for accuracy (#294)?

@angela97lin @kmax12 FYI

enhancement

dsherry

👀1

Most helpful comment

I think precision and accuracy are fine in the sense they won't give you a trivial model.

We don't necessarily want to delete recall as an objective; automl search just shouldn't optimize for it. For example, I might want to optimize for F1 but still see my recall score alongside it.

kmax12 on 10 Mar 2020

👍2
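
As a rough illustration of the comment above, the sketch below scores the same always-true predictor on recall, precision, and accuracy. It is plain Python with made-up labels and a hypothetical `confusion_counts` helper, not evalml code.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for boolean labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    return tp, fp, fn, tn


# Made-up labels: 2 positives out of 10 rows.
y_true = [True, False, False, True, False, False, False, False, False, False]
y_pred = [True] * len(y_true)  # trivial always-true predictor

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
recall_score = tp / (tp + fn)
precision_score = tp / (tp + fp)
accuracy_score = (tp + tn) / len(y_true)

print(recall_score, precision_score, accuracy_score)  # 1.0 0.2 0.2
```

On this data the trivial predictor only looks good on recall; precision and accuracy both count the eight false positives against it.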

All 5 comments

@kmax12 yeah, right, we don't want to delete the code which computes recall, and we want to still support computing recall as a score on a pipeline, but we do want to disallow it as a supported optimization objective in automl.

This reminds me of the ongoing discussion around the binary classification plotting/info methods for ROC and confusion matrix (#427, #365). Those aren't metrics we can optimize for in automl, and they're also not single-number scores, but under our API, the easiest way to define them was to add them as instances of ObjectiveBase.

We currently have a number of things which can be computed using pipelines:

1. Predictions
2. Objective function scores for automl
3. Scoring metrics, after automl
4. Plotting data (example for binary classification: ROC curve, confusion matrix)
5. Feature importances

I think to date we've been trying to use ObjectiveBase to represent 2, 3, and 4. In other words, we're missing a clear API to define scoring methods and plotting methods, separate from the automl process.

I think the next step here should be to design those APIs. Looks like I already filed that as #392. I'll update this ticket to be blocked on that.

dsherry on 11 Mar 2020

👍1

For the objectives API rework, we've moved ROC and Confusion Matrix to PlotMetrics for now. (Less design went into this; it was just the easiest way to separate these two from the rest of the objectives without breaking things.) We've also added can_optimize_threshold as an attribute of BinaryClassificationObjective: if fit() is called with an objective that has can_optimize_threshold=True, we optimize for that objective; otherwise we optimize for Accuracy.

Thoughts on this, and on how it might align with some of the questions raised here? Would it be unclear if a user called fit() with Recall but we optimized for Accuracy instead?

angela97lin on 11 Mar 2020
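
The fallback rule described in the comment above can be sketched roughly as follows. Only the names `BinaryClassificationObjective`, `can_optimize_threshold`, and Accuracy come from the thread; the class bodies, the flag values assigned to each objective, and the `objective_used_by_fit` helper are assumptions for illustration and are not evalml's actual implementation.

```python
class BinaryClassificationObjective:
    """Hypothetical stand-in for the base class named in the thread."""
    name = "base"
    can_optimize_threshold = False


class Accuracy(BinaryClassificationObjective):
    name = "Accuracy"


class F1(BinaryClassificationObjective):
    name = "F1"
    can_optimize_threshold = True  # assumed value, for illustration only


class Recall(BinaryClassificationObjective):
    name = "Recall"
    can_optimize_threshold = False  # assumed value, for illustration only


def objective_used_by_fit(requested):
    """Return the objective that would be optimized under the described rule."""
    if requested.can_optimize_threshold:
        return requested
    # Silent fallback: the caller asked for one objective but gets Accuracy.
    return Accuracy()


chosen_for_f1 = objective_used_by_fit(F1()).name
chosen_for_recall = objective_used_by_fit(Recall()).name
print(chosen_for_f1, chosen_for_recall)  # F1 Accuracy
```

The second result is the case the comment asks about: the user passes Recall, and the search quietly optimizes something else.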

@angela97lin yes, I think moving ROC/confusion out from ObjectiveBase was a positive step! I think #392 should track going further. Let's continue the conversation about how to update the API on #392 instead. That way, this issue can just track updating recall once we've made a decision about how to handle this stuff more generally.

Also I think binary classification threshold optimization is a separate subject, and thankfully one that your ongoing work in #346 is handling 100%!

dsherry on 11 Mar 2020

Summarizing discussion with @eccabay and @jeremyliweishih earlier, the options for supporting this are:

1. Delete the recall objectives entirely.
2. Delete the entries for the recall objectives in objectives/utils.py OPTIONS, and confirm that doing so disallows those objectives in automl.

dsherry on 19 May 2020
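
A minimal sketch of the second option, assuming the registry is a plain name-to-objective mapping. The keys, the placeholder values, and the `get_objective` helper below are hypothetical; the real contents of objectives/utils.py may differ.

```python
# Hypothetical registry; stands in for OPTIONS in objectives/utils.py.
OPTIONS = {
    "f1": "F1 objective placeholder",
    "precision": "Precision objective placeholder",
    "recall": "Recall objective placeholder",
    "recall_micro": "Recall (micro) objective placeholder",
}

# Drop every recall entry so automl can no longer look them up by name.
automl_options = {
    key: value for key, value in OPTIONS.items() if not key.startswith("recall")
}


def get_objective(name, options):
    """Look up an objective by name, rejecting anything not registered."""
    try:
        return options[name.lower()]
    except KeyError:
        raise ValueError(f"{name!r} is not a supported automl objective") from None


# Confirm that the removal actually disallows recall.
try:
    get_objective("recall", automl_options)
    recall_rejected = False
except ValueError:
    recall_rejected = True

print(sorted(automl_options), recall_rejected)  # ['f1', 'precision'] True
```

The recall classes themselves stay importable under this option, so recall can still be computed as a score on a pipeline.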
