Scikit-learn: Error during grid search on a pipeline when a transformer step is None

Created on 2020-11-12  ·  3 comments  ·  Source: scikit-learn/scikit-learn

Describe the bug

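When performing a grid search on a pipeline that has a None step, an AttributeError is raised. The snippet below ran successfully with scikit-learn==0.23.2 but no longer works with 0.24.dev0.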

Steps/Code to Reproduce

from sklearn.datasets import load_iris
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

iris = load_iris()
X, y = iris.data, iris.target

pipe = Pipeline([("setup", None), ("svc", SVC(kernel="linear", random_state=0))])

param_grid = [
    {"svc__C": [0.1, 0.1]},
    {"setup": [StandardScaler()]},
]

gs = GridSearchCV(pipe, param_grid=param_grid, return_train_score=True, cv=3)
gs.fit(X, y)

Expected Results

The GridSearchCV.fit call completes successfully.

Actual Results

The following error is raised (I have included the full traceback further below):

  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/base.py", line 863, in _is_pairwise
    pairwise_tag = estimator._get_tags().get('pairwise', False)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/base.py", line 348, in _get_tags
    more_tags = base_class._more_tags(self)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/pipeline.py", line 626, in _more_tags
    estimator_tags = self.steps[0][1]._get_tags()
AttributeError: 'NoneType' object has no attribute '_get_tags'

It looks like the _is_pairwise check does not work as expected when it is applied to a pipeline whose transformer step is None.
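To make the failure mode concrete, here is a minimal sketch of the call chain in the traceback, written with stand-in classes rather than scikit-learn itself. ToyEstimator, ToyPipeline and is_pairwise are hypothetical names introduced only for illustration; the one detail taken from the traceback is that the pipeline reads tags from `self.steps[0][1]`.

```python
# Hypothetical stand-ins, not scikit-learn code: they only mimic the
# call chain shown in the traceback.


class ToyEstimator:
    """Stand-in for an estimator that exposes tags via _get_tags()."""

    def _get_tags(self):
        return {"pairwise": False}


class ToyPipeline:
    """Stand-in for a pipeline that reads the pairwise tag from its first step."""

    def __init__(self, steps):
        self.steps = steps

    def _get_tags(self):
        # Mirrors `self.steps[0][1]._get_tags()` from the traceback:
        # no check is made that the first step is a real estimator.
        first_step = self.steps[0][1]
        return {"pairwise": first_step._get_tags().get("pairwise", False)}


def is_pairwise(estimator):
    """Mimics the tag lookup that runs before the data is split."""
    return estimator._get_tags().get("pairwise", False)


ok = ToyPipeline([("setup", ToyEstimator()), ("svc", ToyEstimator())])
broken = ToyPipeline([("setup", None), ("svc", ToyEstimator())])

ok_result = is_pairwise(ok)  # works: the first step has _get_tags

error_message = None
try:
    is_pairwise(broken)  # fails: the first step is None
except AttributeError as exc:
    error_message = str(exc)

print(ok_result)
print(error_message)
```

In this sketch the error comes from the unguarded attribute access on the first step, which would explain why the problem only shows up when the None step is in the first position of the pipeline.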


Full traceback:

Traceback (most recent call last):
  File "test-pipeline.py", line 18, in <module>
    gs.fit(X, y)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/utils/validation.py", line 60, in inner_f
    return f(*args, **kwargs)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/model_selection/_search.py", line 841, in fit
    self._run_search(evaluate_candidates)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/model_selection/_search.py", line 1288, in _run_search
    evaluate_candidates(ParameterGrid(self.param_grid))
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/model_selection/_search.py", line 795, in evaluate_candidates
    out = parallel(delayed(_fit_and_score)(clone(base_estimator),
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/parallel.py", line 1048, in __call__
    if self.dispatch_one_batch(iterator):
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/parallel.py", line 866, in dispatch_one_batch
    self._dispatch(tasks)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/parallel.py", line 784, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/utils/fixes.py", line 222, in __call__
    return self.function(*args, **kwargs)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/model_selection/_validation.py", line 585, in _fit_and_score
    X_train, y_train = _safe_split(estimator, X, y, train)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/utils/metaestimators.py", line 198, in _safe_split
    if _is_pairwise(estimator):
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/base.py", line 863, in _is_pairwise
    pairwise_tag = estimator._get_tags().get('pairwise', False)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/base.py", line 348, in _get_tags
    more_tags = base_class._more_tags(self)
  File "/Users/james/miniforge3/envs/dask-ml/lib/python3.8/site-packages/sklearn/pipeline.py", line 626, in _more_tags
    estimator_tags = self.steps[0][1]._get_tags()
AttributeError: 'NoneType' object has no attribute '_get_tags'

Versions

System:
    python: 3.8.6 | packaged by conda-forge | (default, Oct  7 2020, 18:42:56)  [Clang 10.0.1 ]
executable: /Users/james/miniforge3/envs/dask-ml/bin/python3.8
   machine: macOS-10.15.5-x86_64-i386-64bit

Python dependencies:
          pip: 20.2.4
   setuptools: 49.6.0.post20201009
      sklearn: 0.24.dev0
        numpy: 1.19.4
        scipy: 1.5.3
       Cython: None
       pandas: 1.1.4
   matplotlib: None
       joblib: 0.17.0
threadpoolctl: 2.1.0

Built with OpenMP: True
Blocker Bug

All 3 comments

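Thanks for the report @jrbourbeau, we can reproduce it. In case you are interested, we are working out the best solution in the different issues linked above.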

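I am going to tag this as a blocker, since the error occurs not only when using None but also when using any step that does not have a _get_tags attribute (presumably because it does not inherit from BaseEstimator).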
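The broader case described in this comment can be sketched as a guarded tag lookup that tolerates None and duck-typed steps alike. This is an illustration under stated assumptions, not the change that was actually merged: safe_get_tag, DEFAULT_TAGS, DuckTypedScaler and PairwiseStep are hypothetical names, and the default value for the pairwise tag is assumed to be False.

```python
# Hypothetical helper and classes; a sketch of one possible guard,
# not scikit-learn's implementation.

DEFAULT_TAGS = {"pairwise": False}  # assumed default for illustration


def safe_get_tag(step, key):
    """Return a tag for `step`, falling back to the default when the step
    is None or any other object that lacks a `_get_tags` method."""
    get_tags = getattr(step, "_get_tags", None)
    if get_tags is None:
        return DEFAULT_TAGS[key]
    return get_tags().get(key, DEFAULT_TAGS[key])


class DuckTypedScaler:
    """Implements fit/transform but inherits from no base class,
    so it has no `_get_tags` method."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X


class PairwiseStep:
    """Step that does expose tags and declares itself pairwise."""

    def _get_tags(self):
        return {"pairwise": True}


print(safe_get_tag(None, "pairwise"))
print(safe_get_tag(DuckTypedScaler(), "pairwise"))
print(safe_get_tag(PairwiseStep(), "pairwise"))
```

Using `getattr` with a default keeps the lookup to a single code path for None, placeholder values and third-party objects, at the cost of silently assuming default tags for anything that does not declare its own.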

Fixed by #18797. Thanks for the prompt bug report @jrbourbeau
