In #252, @angela97lin fixed a bug which went undiscovered because our unit test coverage of automl fit was configured to not raise errors (specifically, the parameter raise_errors currently defaults to false). Her PR updated the unit tests to set raise_errors to true in each call to automl fit in the tests.
I wanted to create a ticket to discuss this a bit further. I'm confused about the raise_errors parameter. Why does it exist? Why does it only apply to automl fit? If it is a necessary parameter, is there a better design, in particular one which would discourage bugs like this? I'm concerned that if we rely on the team to remember to set raise_errors to true in the unit tests, we could run into similar issues again.
Ideas: I know @kmax12 mentioned using an environment variable for this. If this parameter is necessary but my unit testing concern is valid, that could be a good solution. We could also update all the tests to use a test fixture which makes sure raise_errors is set appropriately, either via an env var, some other global config, or by wrapping the automl fit method.
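If we went the fixture route, it could look something like the sketch below. To be clear, this is only an illustration: the evalml.automl import path, the AutoBase class name, and the assumption that raise_errors is accepted as a keyword argument to fit are all guesses on my part, not the actual API.

```python
import pytest


@pytest.fixture(autouse=True)
def force_raise_errors(monkeypatch):
    # Placeholder import path and class name; point this at wherever the
    # automl fit method actually lives.
    from evalml.automl import AutoBase

    original_fit = AutoBase.fit

    def fit_with_errors(self, *args, **kwargs):
        # Default raise_errors to True for every test, while still letting an
        # individual test opt out by passing raise_errors=False explicitly.
        kwargs.setdefault("raise_errors", True)
        return original_fit(self, *args, **kwargs)

    monkeypatch.setattr(AutoBase, "fit", fit_with_errors)
```

Because the fixture is autouse, nobody has to remember to set the flag in new tests; forgetting it simply can't happen.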
I think you raise a lot of valid points here! My gut assumption / guess with the raise_errors flag is that if just one pipeline fails, a user may not want their entire run of Auto(*) to fail. So instead, we (quietly) don't raise the error and instead just set all of the scores to NaN for that pipeline.
I agree, though: it would be too easy to slip up, forget to set this flag in newer tests, and miss bugs like this one again, and this warrants further discussion!
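For context, here's a rough sketch of the behavior I'm describing (the helper name and the score signature are made up here, this isn't the actual implementation):

```python
import numpy as np


def _evaluate_pipeline(pipeline, X, y, objectives, raise_errors=False):
    """Sketch of the described behavior: re-raise or quietly record NaN scores."""
    try:
        pipeline.fit(X, y)
        # Assumed to return a dict of {objective: value}.
        return pipeline.score(X, y, objectives)
    except Exception:
        if raise_errors:
            # Surface the failure immediately, which is what we want in unit tests.
            raise
        # Quietly record NaN for every objective so the rest of the search can continue.
        return {objective: np.nan for objective in objectives}
```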
This one came up in the recent usability blitz. The decision we reached was to keep raise_errors but default it to true.
@angela97lin is working on a PR for this: #638
@angela97lin can you please move this to in progress since you have a PR open for it?