Evalml: Warning messages in unit test: "invalid value encountered in double_scalars" and others

Created on 9 Jan 2020 · 3 comments · Source: alteryx/evalml

Problem
Running locally (on Python 3.8, but I've seen similar warnings on other Python versions):

(featurelabs) ➜  evalml git:(master) pytest -v evalml/tests/automl_tests/test_autobase.py::test_pipeline_limits
====================================================================== test session starts ======================================================================
platform darwin -- Python 3.8.0, pytest-4.4.1, py-1.8.0, pluggy-0.13.1 -- /Users/dsherry/anaconda/envs/featurelabs/bin/python
cachedir: .pytest_cache
rootdir: /Users/dsherry/development/aie/featurelabs/evalml, inifile: setup.cfg
plugins: xdist-1.26.1, cov-2.6.1, nbval-0.9.3, forked-1.1.3
collected 1 item

evalml/tests/automl_tests/test_autobase.py::test_pipeline_limits PASSED       [100%]

======================================================================= warnings summary ========================================================================
evalml/tests/automl_tests/test_autobase.py::test_pipeline_limits
evalml/tests/automl_tests/test_autobase.py::test_pipeline_limits
...
  /Users/dsherry/anaconda/envs/featurelabs/lib/python3.8/site-packages/sklearn/metrics/classification.py:1436: UndefinedMetricWarning:

  Precision is ill-defined and being set to 0.0 due to no predicted samples.
...
  /Users/dsherry/anaconda/envs/featurelabs/lib/python3.8/site-packages/sklearn/metrics/classification.py:1436: UndefinedMetricWarning:

  F-score is ill-defined and being set to 0.0 due to no predicted samples.
...
  /Users/dsherry/anaconda/envs/featurelabs/lib/python3.8/site-packages/sklearn/metrics/classification.py:872: RuntimeWarning:

  invalid value encountered in double_scalars
...
  /Users/dsherry/development/aie/featurelabs/evalml/evalml/automl/auto_base.py:307: RuntimeWarning:

  invalid value encountered in double_scalars

Three of these warnings come from sklearn and one from our code. I seem to get a slightly different combination/order of warnings every time I run the test.

More info
Here's the line the last warning is coming from, in AutoBase._add_result:
high_variance_cv = (scores.std() / scores.mean()) > .2

I suspect scores is empty or all zeros, but why? That's the next thing to look into. Perhaps we're scoring the model against empty test data?
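To make that concrete, here's a minimal sketch (my own reproduction, assuming scores is a numpy array of CV scores; this is not taken from the test itself) of what happens when every score is 0.0:

import warnings
import numpy as np

# If every cross-validation score is 0.0, std()/mean() is 0.0/0.0. numpy flags
# that as "RuntimeWarning: invalid value encountered in double_scalars"
# (newer numpy versions word the message slightly differently) and returns nan.
scores = np.array([0.0, 0.0, 0.0])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    high_variance_cv = (scores.std() / scores.mean()) > .2

print(high_variance_cv)                  # False -- nan > .2 is False
print([str(w.message) for w in caught])  # the RuntimeWarning shows up here

So the check silently evaluates to False instead of complaining that the scores look degenerate.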

My suspicion is that this dataset is too small or too uniform, and that the models trained on it are all predicting the same value or something like that. If I'm right, this reinforces that we need guard rails to detect this problem when the user uploads their data, and that mocking in the unit tests to avoid actual fitting (#275) is important (even if this particular test isn't mockable).

I encountered these warnings while debugging bug #167, so it's possible this is related to that.

Next steps
We should determine why these warnings are showing up. If it's a problem with the test setup, let's change the test to fix or avoid it. Otherwise, it could be a bug. Either way, we shouldn't be printing warnings like this under normal use.

bug

All 3 comments

This doesn't seem to show up in master after #445 was merged in. The test run for 3.8 can be seen here. @dsherry I'm not too sure why it disappeared once objectives were merged in, but should I close this for now?

@jeremyliweishih hm, weird! Yeah, I don't see that particular double_scalars warning anymore. Perhaps #445 shuffled the unit tests around in just the right way.

I do see this in the circleci job you linked to:

=============================== warnings summary ===============================
evalml/utils/gen_utils.py:98
  /home/circleci/evalml/evalml/utils/gen_utils.py:98: RuntimeWarning: invalid value encountered in true_divide
    conf_mat = conf_mat.astype('float') / conf_mat.sum(axis=0)

test_python/lib/python3.8/site-packages/numpy/core/_methods.py:38
  /home/circleci/evalml/test_python/lib/python3.8/site-packages/numpy/core/_methods.py:38: ComplexWarning: Casting complex values to real discards the imaginary part
    return umr_sum(a, axis, dtype, out, keepdims, initial, where)

Let's get rid of those, yeah? Could be covering up bugs.
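For the first one, my guess (not verified against that job) is that some class is never predicted, so one column of the confusion matrix sums to zero and the normalization divides 0 by 0. A minimal sketch of that guess:

import numpy as np

# Hypothetical confusion matrix where the second class is never predicted,
# so its column sum is 0 and the normalization computes 0/0 for that column.
conf_mat = np.array([[3, 0],
                     [2, 0]])
normalized = conf_mat.astype('float') / conf_mat.sum(axis=0)
# numpy emits "RuntimeWarning: invalid value encountered in true_divide"
# and the second column comes back as nan.
print(normalized)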

Suggestion for the first: wrap evalml/evalml/utils/gen_utils.py:98 in a try/except RuntimeWarning (with warnings escalated to errors so the except actually triggers) and assert False with the caught exception, then run that on circleci and see where it breaks. For the second one, I'm not sure. Maybe there's a way to have unit tests fail if they throw warnings?
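A minimal sketch of the first idea (my own, using the standard library warnings module and a hypothetical input with a zero column; none of this is in evalml today):

import warnings
import numpy as np

# Hypothetical input where one class is never predicted (zero column sum).
conf_mat = np.array([[3, 0], [2, 0]])

# Escalate the RuntimeWarning to an error around the suspect line so the
# failure points at the exact input that triggers the bad division.
try:
    with warnings.catch_warnings():
        warnings.simplefilter("error", RuntimeWarning)
        conf_mat = conf_mat.astype('float') / conf_mat.sum(axis=0)
except RuntimeWarning as e:
    assert False, e   # fail loudly, like the suggestion above

For the second idea, pytest can turn warnings into test failures without touching the code: either run pytest with -W error::RuntimeWarning, or add filterwarnings = error::RuntimeWarning under the [tool:pytest] section of setup.cfg.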

@dsherry @jeremyliweishih I talked to Jeremy about this, but I think the PR I'm currently working on takes care of the second warning! :) (#638)
