Evalml: AutoMLSearch: update objective string to use human-readable format

Created on 19 Aug 2020  ·  9 Comments  ·  Source: alteryx/evalml

  • get_objective(objective) only works if the input is an Objective instance or the snake-case string for an objective.
  • It does not work if the objective string is something like "log loss binary" or "Log Loss Binary", even though that string comes from the name of the Objective.
enhancement

All 9 comments

@gsheni But what if the objective is not listed in OPTIONS (e.g., Recall)? Do you still want it to work in that case? Not every objective is listed in OPTIONS because we don't want some objectives to be used in AutoMLSearch.

So I think it's 2 things:

  • I need a way to get all objectives.
  • I need a way to go from objective name to actual Python Object.

    • The name could be the lowercase name, the snake-case name, or the capitalized name

get_objectives --> should return all Objectives in EvalML, just as the name of the function implies.
get_automl_objectives --> should return valid Objectives for AutoML search.

Got it, and just to be clear on the api:

def get_objectives(name: str) -> ObjectiveBase:
    pass

def get_automl_objectives(name: str) -> ObjectiveBase:
    pass

And the name can be lowercase, snakecase, or the "official name" (objective.name)

sorry I meant singular
get_objective
get_automl_objective

def get_objective(objective):
    if objective is None:
        raise TypeError("Objective parameter cannot be NoneType")
    if isinstance(objective, ObjectiveBase):
        return objective
    if objective in OPTIONS:
        return OPTIONS[objective]

    name_to_objective = {}
    for key, value in OPTIONS.items():
        name_to_objective[value.name] = value
        name_to_objective[value.name.lower()] = value
        name_to_objective[value.name.upper()] = value
        name_to_objective[key] = value
        name_to_objective[key.lower()] = value
        name_to_objective[key.upper()] = value
    if objective in name_to_objective:
        return name_to_objective[objective]

    raise ObjectiveNotFoundError("Could not find the specified objective.")
  • probably a bit overkill, but it works
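
A more compact variant of the same idea, as a rough sketch only: instead of registering every casing of every key, normalize both the registry keys and the user's string to one canonical form before comparing. The classes, the two-entry OPTIONS dict, and the _normalize helper below are stand-ins invented for illustration, not EvalML's actual definitions.

```python
class ObjectiveNotFoundError(Exception):
    """Stand-in for EvalML's exception of the same name."""


class ObjectiveBase:
    """Minimal stand-in for EvalML's objective base class."""
    name = None


class LogLossBinary(ObjectiveBase):
    name = "Log Loss Binary"


class Recall(ObjectiveBase):
    name = "Recall"


# Hypothetical registry keyed by snake-case strings, mirroring OPTIONS.
OPTIONS = {"log_loss_binary": LogLossBinary(), "recall": Recall()}


def _normalize(name):
    """Collapse case, underscores, and repeated whitespace into one key."""
    return " ".join(name.replace("_", " ").lower().split())


def get_objective(objective):
    if objective is None:
        raise TypeError("Objective parameter cannot be NoneType")
    if isinstance(objective, ObjectiveBase):
        return objective
    if not isinstance(objective, str):
        raise TypeError("Objective must be a string or an ObjectiveBase instance")

    # Index each objective under its normalized key and normalized name.
    lookup = {}
    for key, value in OPTIONS.items():
        lookup[_normalize(key)] = value
        lookup[_normalize(value.name)] = value

    try:
        return lookup[_normalize(objective)]
    except KeyError:
        raise ObjectiveNotFoundError(
            "Could not find the specified objective."
        ) from None
```

With this, "log_loss_binary", "log loss binary", "Log Loss Binary", and "LOG_LOSS_BINARY" all resolve to the same object, at the cost of also accepting mixed spellings such as "Log_Loss binary".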

Background
This relates to #580 which is some more tech debt surrounding objectives.

We currently have two sets of names for objectives. The first is defined in this global OPTIONS dict, which is used by the get_objective/get_objectives methods to convert the user-defined AutoMLSearch objective string input into the right ObjectiveBase class. The second is the name field defined in each objective class, as seen here.

We don't need two sets of names for objectives. This is tech debt.

Goal

  • Have one set of names for our objectives
  • Use human-readable names. No more snake-case

Proposal

  • Update the OPTIONS dict to get its keys from each objective class's name field, instead of manually defined snake-case strings
  • Update the AutoMLSearch API ref (and possibly also our automl search user guide) to describe what's required for the "objective" input to AutoMLSearch
  • Mention this as a breaking change in our release notes, because the snake-case format will be gone
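
The first bullet of the proposal could look roughly like the sketch below. The classes and the _AUTOML_OBJECTIVES list are placeholders assumed for illustration; the real objective classes and the way OPTIONS is populated in EvalML may differ.

```python
class ObjectiveBase:
    """Minimal stand-in for EvalML's objective base class."""
    name = None


class LogLossBinary(ObjectiveBase):
    name = "Log Loss Binary"


class F1(ObjectiveBase):
    name = "F1"


# Hypothetical list of the objective classes allowed in AutoMLSearch.
_AUTOML_OBJECTIVES = [LogLossBinary, F1]

# Keys come from each class's name field, so there is only one set of names
# and no hand-maintained snake-case strings.
OPTIONS = {cls.name: cls() for cls in _AUTOML_OBJECTIVES}
```

After this change a lookup such as OPTIONS["Log Loss Binary"] succeeds, while the old "log_loss_binary" key no longer exists, which is why the release notes would need to flag it as breaking.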

For consideration by whoever picks this up: the proposal above would get this done. But see if you can find a way to mirror the pattern we've taken with components. There, we use get_importable_subclasses to define getter methods for all components. I think we should delete OPTIONS in favor of this pattern, and update get_objective/get_objectives accordingly.

Also, it may be simpler if we require that the capitalization matches. We don't currently require that, but supporting multiple capitalizations is unnecessary complexity and is confusing.
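
A rough sketch of what dropping OPTIONS in favor of subclass discovery might look like, with exact-case matching on the name field. To stay self-contained it does not call EvalML's get_importable_subclasses; the _all_subclasses helper is a simplified stand-in written for this example, and all classes are placeholders.

```python
class ObjectiveNotFoundError(Exception):
    """Stand-in for EvalML's exception of the same name."""


class ObjectiveBase:
    """Minimal stand-in for EvalML's objective base class."""


class BinaryClassificationObjective(ObjectiveBase):
    """Intermediate class with no name of its own; it should be skipped."""


class LogLossBinary(BinaryClassificationObjective):
    name = "Log Loss Binary"


class Recall(BinaryClassificationObjective):
    name = "Recall"


def _all_subclasses(base):
    """Recursively collect every subclass of base (hypothetical helper)."""
    found = []
    for sub in base.__subclasses__():
        found.append(sub)
        found.extend(_all_subclasses(sub))
    return found


def get_objectives():
    """Map each objective's name to its class, with no OPTIONS dict."""
    return {
        vars(cls)["name"]: cls
        for cls in _all_subclasses(ObjectiveBase)
        if "name" in vars(cls)  # only classes that define their own name
    }


def get_objective(objective):
    if isinstance(objective, ObjectiveBase):
        return objective
    objectives = get_objectives()
    if objective not in objectives:  # exact, case-sensitive match
        raise ObjectiveNotFoundError("Could not find the specified objective.")
    return objectives[objective]()
```

Here get_objective("Recall") returns a Recall instance, while get_objective("recall") raises ObjectiveNotFoundError because the capitalization must match.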

@dsherry I can take this on! I like the idea of deleting Options in favor of get_importable_subclasses. I think we would need to pass in an allowed_in_automl_search flag to get_objective to keep the current behavior of not allowing all objectives to be used in AutoMLSearch (unless we want to get rid of that restriction).

@freddyaboulton ah good point about that...

I suggest we don't add a flag for that to get_objective. I think we should put validation logic into AutoMLSearch instead, where we read in and parse the objective.

        if objective == 'auto':
            objective = self._DEFAULT_OBJECTIVES[self.problem_type.value]
        objective_obj = get_objective(objective)
        self._validate_objective(objective_obj)
        self.objective = objective

If we do that validation at a lower level, I think it's too confusing.
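
To make the split concrete, here is a minimal sketch in which get_objective resolves any objective and the AutoML class alone decides which ones it accepts. Everything here is assumed for illustration: the AutoMLSearchSketch class, the _DISALLOWED_OBJECTIVES tuple, and the body of _validate_objective are hypothetical, and unlike the snippet above this sketch takes problem_type as a plain string and stores the resolved object rather than the original input.

```python
class ObjectiveBase:
    """Minimal stand-in for EvalML's objective base class."""
    name = None


class LogLossBinary(ObjectiveBase):
    name = "Log Loss Binary"


class Recall(ObjectiveBase):
    name = "Recall"


# Hypothetical name-to-class registry; get_objective knows nothing about AutoML.
_OBJECTIVES = {cls.name: cls for cls in (LogLossBinary, Recall)}


def get_objective(objective):
    if isinstance(objective, ObjectiveBase):
        return objective
    return _OBJECTIVES[objective]()


class AutoMLSearchSketch:
    _DEFAULT_OBJECTIVES = {"binary": "Log Loss Binary"}
    # Hypothetical: objectives that exist but are not allowed in a search.
    _DISALLOWED_OBJECTIVES = (Recall,)

    def __init__(self, problem_type, objective="auto"):
        if objective == "auto":
            objective = self._DEFAULT_OBJECTIVES[problem_type]
        objective_obj = get_objective(objective)
        self._validate_objective(objective_obj)
        self.objective = objective_obj

    def _validate_objective(self, objective_obj):
        # The restriction lives here, not in get_objective.
        if isinstance(objective_obj, self._DISALLOWED_OBJECTIVES):
            raise ValueError(
                f"{objective_obj.name} cannot be used as an AutoMLSearch objective."
            )
```

With this layout, get_objective("Recall") still works for anyone who wants the objective on its own, and only AutoMLSearchSketch("binary", objective="Recall") is rejected.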
