Evalml: AutoMLSearch: update objective string to use human-readable format

Created on 19 Aug 2020 · 9 Comments · Source: alteryx/evalml

get_objective(objective) only works if the input is an Objective instance or its snake-case string key.
It does not work if the objective string is something like "log loss binary" or "Log Loss Binary", even though the latter is the name of the Objective.

enhancement

gsheni

👍3

All 9 comments

@gsheni But what if the objective is not listed in Options (e.g. Recall)? Do you still want it to work in that case? Not every objective is listed in Options because we don't want some objectives to be used in AutoMLSearch.

freddyaboulton on 19 Aug 2020

So I think it's 2 things:

- I need a way to get all objectives.
- I need a way to go from objective name to the actual Python object.
  The name could be the lowercase name, the snake-case name, or the capitalized name.

gsheni on 19 Aug 2020

get_objectives --> should return all Objectives in EvalML, just like the name of the function implies.
get_automl_objectives --> should return the valid Objectives for AutoML search.

gsheni on 19 Aug 2020

Got it, and just to be clear on the API:

def get_objectives(name: str) -> ObjectiveBase:
    pass

def get_automl_objectives(name: str) -> ObjectiveBase:
    pass

And the name can be lowercase, snake case, or the "official name" (objective.name).

freddyaboulton on 19 Aug 2020

sorry I meant singular
get_objective
get_automl_objective

gsheni on 19 Aug 2020

👍1

def get_objective(objective):
    if objective is None:
        raise TypeError("Objective parameter cannot be NoneType")
    if isinstance(objective, ObjectiveBase):
        return objective
    if objective in OPTIONS:
        return OPTIONS[objective]

    name_to_objective = {}
    for key, value in OPTIONS.items():
        name_to_objective[value.name] = value
        name_to_objective[value.name.lower()] = value
        name_to_objective[value.name.upper()] = value
        name_to_objective[key] = value
        name_to_objective[key.lower()] = value
        name_to_objective[key.upper()] = value
    if objective in name_to_objective:
        return name_to_objective[objective]

    raise ObjectiveNotFoundError("Could not find the specified objective.")

probably a bit overkill, but it works
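
One possible way to trim the lookup above is to normalize both the registered names and the user input once, instead of storing every case variant. The following is a minimal sketch, not the EvalML implementation: ObjectiveBase, ObjectiveNotFoundError, the two objective classes, and the OPTIONS dict are simplified stand-ins defined here so the snippet runs on its own, and _normalize and _build_lookup are hypothetical helpers.

```python
class ObjectiveNotFoundError(Exception):
    """Stand-in for the exception raised in the snippet above."""


class ObjectiveBase:
    """Stand-in base class; only the `name` attribute matters here."""
    name = None


class LogLossBinary(ObjectiveBase):
    name = "Log Loss Binary"


class F1(ObjectiveBase):
    name = "F1"


# Assumed shape of the OPTIONS dict: snake-case key -> objective instance.
OPTIONS = {"log_loss_binary": LogLossBinary(), "f1": F1()}


def _normalize(text):
    """Lowercase the text and treat underscores and whitespace runs as one space."""
    return " ".join(text.replace("_", " ").lower().split())


def _build_lookup(options):
    """Map the normalized key and the normalized name of each objective to it."""
    lookup = {}
    for key, value in options.items():
        lookup[_normalize(key)] = value
        lookup[_normalize(value.name)] = value
    return lookup


def get_objective(objective):
    if objective is None:
        raise TypeError("Objective parameter cannot be NoneType")
    if isinstance(objective, ObjectiveBase):
        return objective
    if not isinstance(objective, str):
        raise TypeError("Objective must be an ObjectiveBase instance or a string")
    try:
        return _build_lookup(OPTIONS)[_normalize(objective)]
    except KeyError:
        raise ObjectiveNotFoundError("Could not find the specified objective.") from None
```

With this sketch, "Log Loss Binary", "log loss binary", and "LOG_LOSS_BINARY" all resolve to the same object. It is more permissive than the snippet above, since mixed-case input such as "Log loss binary" also matches.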

gsheni on 19 Aug 2020

Background
This relates to #580 which is some more tech debt surrounding objectives.

We currently have two sets of names for objectives. The first is defined in this global OPTIONS dict, which is used by the get_objective/get_objectives methods to convert the user-defined AutoMLSearch objective string input into the right ObjectiveBase class. The second is the name field defined in each objective class, as seen here.

We don't need two sets of names for objectives. This is tech debt.

Goal

Have one set of names for our objectives
Use human-readable names. No more snake-case

Proposal

Update the OPTIONS dict to get its keys from each objective class's name field, instead of manually defined snake-case strings
Update the AutoMLSearch API ref (and possibly also our automl search user guide) to describe what's required for the "objective" input to AutoMLSearch
Mention this as a breaking change in our release notes, because the snake-case format will be gone

For consideration by whoever picks this up: the proposal above would get this done. But see if you can find a way to mirror the pattern we've taken with components. There, we use get_importable_subclasses to define getter methods for all components. I think we should delete OPTIONS in favor of this pattern, and update get_objective/get_objectives accordingly.

Also, it may be simpler if we require that the capitalization matches. We don't currently require that, but accepting any capitalization is unnecessary complexity and is confusing.
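
To illustrate the direction proposed here, the sketch below derives the name-to-objective mapping from the objective classes themselves, keyed by each class's name field and matched exactly. It is only an assumption-laden illustration: it uses Python's built-in __subclasses__() as a stand-in for get_importable_subclasses (whose signature is not shown in this thread), and all classes and the _all_subclasses helper are simplified or hypothetical.

```python
class ObjectiveNotFoundError(Exception):
    """Stand-in for the library's exception."""


class ObjectiveBase:
    """Stand-in base class; concrete objectives set a human-readable `name`."""
    name = None


class BinaryClassificationObjective(ObjectiveBase):
    """Intermediate base class with no name, so it is never registered."""


class LogLossBinary(BinaryClassificationObjective):
    name = "Log Loss Binary"


class Recall(BinaryClassificationObjective):
    name = "Recall"


def _all_subclasses(cls):
    """Yield every direct and indirect subclass of cls (hypothetical helper)."""
    for sub in cls.__subclasses__():
        yield sub
        yield from _all_subclasses(sub)


def get_objectives():
    """Return {name: class} for every named objective, replacing a manual OPTIONS dict."""
    return {
        sub.name: sub
        for sub in _all_subclasses(ObjectiveBase)
        if sub.name is not None
    }


def get_objective(objective):
    """Resolve an instance or an exact, case-sensitive objective name."""
    if objective is None:
        raise TypeError("Objective parameter cannot be NoneType")
    if isinstance(objective, ObjectiveBase):
        return objective
    objective_class = get_objectives().get(objective)
    if objective_class is None:
        raise ObjectiveNotFoundError("Could not find the specified objective.")
    return objective_class()
```

In this sketch get_objective("Log Loss Binary") succeeds while get_objective("log_loss_binary") raises, which is the breaking change the proposal mentions for the release notes.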

dsherry on 27 Aug 2020

@dsherry I can take this on! I like the idea of deleting Options in favor of get_importable_subclasses. I think we would need to pass in an allowed_in_automl_search flag to get_objective to keep the current behavior of not allowing all objectives to be used in AutoMLSearch (unless we want to get rid of that restriction).

freddyaboulton on 27 Aug 2020

👍1

@freddyaboulton ah good point about that...

I suggest we don't add a flag for that to get_objective. I think we should put validation logic into AutoMLSearch instead, where we read in and parse the objective.

        if objective == 'auto':
            objective = self._DEFAULT_OBJECTIVES[self.problem_type.value]
        objective_obj = get_objective(objective)
        self._validate_objective(objective_obj)
        self.objective = objective

If we do that validation at a lower level, I think it's too confusing.
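
A rough, self-contained sketch of that split is below: get_objective resolves any known objective, and the AutoML-specific restriction lives in the search class. Everything in it is a stand-in: AutoMLSearchSketch, the _OBJECTIVES_NOT_ALLOWED set, and the plain-string problem type are assumptions for illustration, and storing the resolved object on self.objective (rather than the raw input, as in the snippet above) is a guess at the intent.

```python
class ObjectiveBase:
    """Stand-in base class."""
    name = None


class LogLossBinary(ObjectiveBase):
    name = "Log Loss Binary"


class Recall(ObjectiveBase):
    name = "Recall"


_OBJECTIVES = {cls.name: cls for cls in (LogLossBinary, Recall)}


def get_objective(objective):
    """Resolve any known objective; no AutoML-specific restriction at this level."""
    if isinstance(objective, ObjectiveBase):
        return objective
    if objective not in _OBJECTIVES:
        raise ValueError("Could not find the specified objective.")
    return _OBJECTIVES[objective]()


class AutoMLSearchSketch:
    # Assumed defaults, keyed by a plain string instead of an enum value.
    _DEFAULT_OBJECTIVES = {"binary": "Log Loss Binary"}
    # Hypothetical set of objective names rejected as the primary search objective.
    _OBJECTIVES_NOT_ALLOWED = {"Recall"}

    def __init__(self, problem_type, objective="auto"):
        self.problem_type = problem_type
        if objective == "auto":
            objective = self._DEFAULT_OBJECTIVES[problem_type]
        objective_obj = get_objective(objective)
        self._validate_objective(objective_obj)
        self.objective = objective_obj

    def _validate_objective(self, objective_obj):
        """Reject objectives that are valid in general but not for AutoML search."""
        if objective_obj.name in self._OBJECTIVES_NOT_ALLOWED:
            raise ValueError(
                f"{objective_obj.name} cannot be used as the AutoMLSearch objective."
            )
```

Here get_objective("Recall") still returns a Recall instance for other callers, while AutoMLSearchSketch("binary", objective="Recall") raises at construction time.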

dsherry on 27 Aug 2020

👍1
