Evalml: Error when using recall as primary automl objective

Created on 12 Mar 2021  ·  4 Comments  ·  Source: alteryx/evalml

Dear All,
In a binary classification problem, where the cost of FN is higher than the cost of FP, when trying to use recall as an objective I get the following error:

ValueError: recall is not allowed in AutoML! Use evalml.objectives.utils.get_core_objective_names() to get all objective names allowed in automl.

I went through the allowed names and I can see recall is there.

Any suggestions?

Many thanks in advance
Nikos

Most helpful comment

Hello @npapan69 !

We discourage using recall as a primary objective in automl search because a trivial pipeline that always predicts the positive class will usually produce a perfect recall score. So automl is incentivized to find trivial pipelines. See issue #476. That's why it's not in our list of core objectives, like the error says:

image
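To make the point about trivial pipelines concrete, here is a minimal sketch in plain Python. It does not use evalml; the `recall` and `precision` helpers and the toy labels are illustrative assumptions, not library code. A model that predicts the positive class for every row reaches a recall of 1.0 whenever the data contains at least one positive, while its precision is only the fraction of positives in the data.

```python
def recall(y_true, y_pred):
    """Recall = TP / (TP + FN) for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def precision(y_true, y_pred):
    """Precision = TP / (TP + FP) for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp)


# Toy labels (hypothetical): 3 positives, 7 negatives.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

# A "trivial pipeline": predict the positive class for every row.
always_positive = [1] * len(y_true)

trivial_recall = recall(y_true, always_positive)
trivial_precision = precision(y_true, always_positive)
print("recall:", trivial_recall)
print("precision:", trivial_precision)
```

Because no positive is ever missed, the false-negative count is zero and recall cannot go any higher, which is why a search that optimizes recall alone has no reason to prefer a real model over this one.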

However, you still can use recall as an automl objective! Just pass it as an instance rather than a string:

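from evalml.demos import load_breast_cancer
from evalml.automl import AutoMLSearch
from evalml.objectives import Recall

# Load a small binary classification demo dataset.
X, y = load_breast_cancer()
# Pass a Recall instance (not the string "recall") so the core-objective name check does not apply.
automl = AutoMLSearch(X, y, problem_type="binary", objective=Recall())

automl.search()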

Output:

image

All 4 comments

Thanks @freddyaboulton !

I had one more suggestion: if you want automl to know that the cost of false negatives on your problem is higher than the cost of false positives, you could try using the CostBenefitMatrix objective:

import evalml
from evalml.automl import AutoMLSearch
# Assumption: the values are payoffs where higher is better, so costs are negative.
# Here a false negative costs ten times as much as a false positive.
obj = evalml.objectives.CostBenefitMatrix(true_positive=1.0, true_negative=1.0, false_positive=-1.0, false_negative=-10.0)
automl = AutoMLSearch(X, y, problem_type="binary", objective=obj)
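The idea behind a cost-benefit objective can be sketched in plain Python: give each cell of the confusion matrix a payoff and average it over all rows. This illustrates the concept only and is not evalml's implementation. The `average_payoff` helper, the toy labels, and the payoff values are assumptions chosen for the example, with costs written as negative payoffs and a false negative made ten times as costly as a false positive, as in the original question.

```python
def average_payoff(y_true, y_pred, payoffs):
    """Average per-row payoff given a payoff for each confusion-matrix cell."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            total += payoffs["true_positive"]
        elif t == 0 and p == 0:
            total += payoffs["true_negative"]
        elif t == 0 and p == 1:
            total += payoffs["false_positive"]
        else:  # t == 1 and p == 0
            total += payoffs["false_negative"]
    return total / len(y_true)


# Hypothetical payoffs: higher is better, costs are negative.
payoffs = {
    "true_positive": 1.0,
    "true_negative": 1.0,
    "false_positive": -1.0,
    "false_negative": -10.0,
}

y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]           # 3 positives, 7 negatives
always_positive = [1] * len(y_true)               # trivial pipeline
one_false_positive = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]
one_false_negative = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0]

score_trivial = average_payoff(y_true, always_positive, payoffs)
score_one_fp = average_payoff(y_true, one_false_positive, payoffs)
score_one_fn = average_payoff(y_true, one_false_negative, payoffs)
print(score_trivial, score_one_fp, score_one_fn)
```

Under these payoffs the always-positive predictor no longer wins, because every false positive now costs something, and a single missed positive hurts far more than a single false alarm.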

Thanks for filing the issue @npapan69 ! I'm closing it since @dsherry and @freddyaboulton chimed in. Please reach out if you have any additional questions!

Thank you, Tyler.

Nickolas Papanikolaou, Ph.D.
Head of Computational Clinical Imaging Group, Champalimaud Foundation
