Evalml: [Discuss] Are AutoClassifier.fit() and AutoRegressor.fit() the best API?

Created on 5 Nov 2019  ·  10 Comments  ·  Source: alteryx/evalml

After you call fit, you need to select the model you want to use before calling predict. In that way, it's not like other objects that have a fit method. fit is more like search in this case.
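To make the distinction concrete, here is a minimal sketch in Python. Everything in it (ToySearch, MeanModel, LastValueModel, best_model) is a hypothetical stand-in invented for illustration and is not evalml's API. It only shows the shape of the problem: the "fit-like" call evaluates several candidates, and the caller still has to pick one before predicting.

```python
from statistics import mean


class MeanModel:
    """Toy model (hypothetical): always predicts the mean training target."""

    def fit(self, X, y):
        self.value_ = mean(y)
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


class LastValueModel:
    """Toy model (hypothetical): always predicts the last training target."""

    def fit(self, X, y):
        self.value_ = y[-1]
        return self

    def predict(self, X):
        return [self.value_ for _ in X]


class ToySearch:
    """Hypothetical search object: tries every candidate and ranks them."""

    def __init__(self, candidates):
        self.candidates = candidates
        self.rankings = []  # list of (mse, model), best first

    def search(self, X, y):
        scored = []
        for model in self.candidates:
            model.fit(X, y)
            preds = model.predict(X)
            # Mean squared error on the training data, for simplicity only.
            mse = mean((p - t) ** 2 for p, t in zip(preds, y))
            scored.append((mse, model))
        # Sort on the score alone so ties never compare model objects.
        self.rankings = sorted(scored, key=lambda pair: pair[0])
        return self

    def best_model(self):
        return self.rankings[0][1]


X = [[0], [1], [2]]
y = [1.0, 2.0, 6.0]

searcher = ToySearch([MeanModel(), LastValueModel()]).search(X, y)
# The search object does not predict; a model has to be selected first.
chosen = searcher.best_model()
predictions = chosen.predict([[5]])
```

Note that ToySearch has no predict method at all, which is the sense in which the call behaves like a search rather than a fit.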

All 10 comments

The way I see it, there are three decisions to make

  1. What do we call the class
  2. What is the recommended name of the variable that holds the initialized class
  3. What is the method that kicks things off

For 1 and 2 here are the options that I came up with

import evalml

clf = evalml.AutoClassifierSearch()
auto = evalml.AutoClassifierSearch()
automl = evalml.AutoClassifierSearch()
searcher = evalml.AutoClassifierSearch()

clf = evalml.AutoClassifier()
auto = evalml.AutoClassifier()
automl = evalml.AutoClassifier()
searcher = evalml.AutoClassifier()

For 2 and 3 here are the options I came up with

clf.search()
auto.search()
automl.search()
searcher.search()

clf.run()
auto.run()
automl.run()
searcher.run()

I think I like

import evalml
automl = evalml.AutoClassifierSearch()
automl.search()

My reasoning is

  1. The class evalml.AutoClassifierSearch is descriptive. It is a search, not a classifier itself.
  2. The method tells you specifically what is happening when you call it. run() is very general.
  3. The variable name is broad and doesn't conflict with conventions of using clf with tools like sklearn.

The alternative I like is

import evalml
search = evalml.AutoClassifierSearch()
search.run()

This uses a more descriptive variable name, so using run() would be okay. The pro to this is that it doesn't introduce the extra automl term to our examples. My concern would be that a user could name search whatever they wanted and then my_custom_variable_name.run() isn't very descriptive.
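A small Python sketch of that concern, using a hypothetical stand-in class rather than the real evalml one (the run() alias and the calls list exist only for this illustration):

```python
class AutoClassifierSearch:
    """Hypothetical stand-in for illustration; not the evalml class."""

    def __init__(self):
        self.calls = []  # records which work was actually performed

    def search(self):
        # Placeholder for the real work of searching over pipelines.
        self.calls.append("search")
        return self

    def run(self):
        # Generic alias: does the same work, but the name says less.
        return self.search()


my_custom_variable_name = AutoClassifierSearch()

# With an arbitrary variable name, run() gives the reader no hint of what runs.
my_custom_variable_name.run()

# search() still reads as "a search happens here", whatever the variable is called.
my_custom_variable_name.search()
```

Both calls do the same thing; the only difference is how much the call site tells a reader who has not seen the variable's definition.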

Any thoughts? Any variable naming approaches I missed? Even if you don't think they are better, might as well throw them out there for us to consider.

I like AutoClassifierSearch() for the same reasons you stated above, but I'm not quite a fan of calling the variable automl, as it seems too broad, or search, as it's a verb. What do you think about autosearch.run()? Someone proposed searcher in our meeting and that works as well, but it doesn't roll off the tongue quite as nicely.

I quite like AutoClassifierSearch() and using search() as I agree that run() could be way too generic if the user defines their own variable name.

For the recommended name of the variable that holds the initialized class, I'm a fan of automl, searcher or autosearch/autosearcher over search, even if we did go with run(). (I did a brief Google search, and automl or aml pops up for some other libraries out there, for what it's worth.)

@jeremyliweishih I'm not sure I understand your point about search(). I think the goal would be to pick a descriptive verb for that method. Can you clarify what you were saying?

Hopefully I'm not misunderstanding but I think @jeremyliweishih is saying search as the recommended name of the variable is a little weird because it's a verb... at least, that was also the feeling I got :D

@kmax12 I meant that comment about the recommended name of the variable: search = AutoClassifierSearch(). To clarify I think AutoClassifierSearch() for the class name would be great and autosearch could work for the variable name. autosearch = AutoClassifierSearch(). Maybe another three letter variable name could work too (aml, asc, ats).

@jeremyliweishih I understand what you were saying now. I agree.

Based on the above, I think we all like renaming to AutoClassifierSearch and AutoRegressorSearch.

In terms of the variable name, we're down to

autosearch = AutoClassifierSearch()
automl = AutoClassifierSearch()
aml = AutoClassifierSearch()

I personally like automl since it is descriptive. aml is similar if you know what it stands for, but I don't think it's that intuitive. It is what I've seen H2O use in their docs (are there other libraries?). I'm generally not a fan of shortening variable names without a really good reason. The only advantage I see to aml is that it is 3 characters instead of 6.

autosearch makes sense to me since it's basically the class name. However, it is longer than automl and I'm not sure it adds clarity. If we were to go the autosearch route, I think the method should be run, since autosearcher = AutoClassifierSearch(); autosearcher.search() just seems like overkill on the search haha.

So I see two options based on the above convo

automl = AutoClassifierSearch()
automl.search()

or

autosearch = AutoClassifierSearch()
autosearch.run()

Think either would work but I like .search() over .run()!

I like all the ideas you've thrown out. I'm a particular fan of:

automl = AutoClassificationSearch()
automl = AutoRegressionSearch()
automl.search()

Not to complicate an already lengthy discussion with more options, but: should we say AutoRegressorSearch or AutoRegressionSearch? I'm partial to the latter. I don't hear the term "regressor" nearly as much as I hear "regression model," and similarly for classification.

Yep, I agree. Regressor can also refer to the input features, variables, etc, so regression is better.

In addition, AutoClassification and AutoRegression are more analogous to "auto machine learning", since machine learning, classification, and regression are all processes.

Unless there's any final opposition, we'll go with

automl = AutoClassificationSearch()
automl = AutoRegressionSearch()
automl.search()