Currently, we only check that an estimator, if present, is the last component.
However, without an estimator, the pipeline breaks during fit on the following line:
self.input_feature_names.update({self.estimator.name: list(pd.DataFrame(X_t))})
We should either enforce that all pipelines must have an estimator or fix _fit
to allow this case.
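
For the second option, here is a minimal sketch of what a guard around that line could look like. `record_estimator_inputs` is a hypothetical helper, not the actual `_fit` code, and it assumes that a pipeline without an estimator has `estimator` set to `None`:

```python
from types import SimpleNamespace

import pandas as pd


def record_estimator_inputs(input_feature_names, estimator, X_t):
    """Record the column names passed to the estimator.

    Hypothetical helper: skips the update when there is no estimator,
    assuming a missing estimator is represented as None.
    """
    if estimator is None:
        # Nothing to record; without this guard, estimator.name would fail.
        return input_feature_names
    # Same update as the line quoted above, minus the `self.` references.
    input_feature_names.update({estimator.name: list(pd.DataFrame(X_t))})
    return input_feature_names


# Illustrative data and a stand-in estimator object (both made up).
X_t = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
est = SimpleNamespace(name="Hypothetical Estimator")

with_estimator = record_estimator_inputs({}, est, X_t)
without_estimator = record_estimator_inputs({}, None, X_t)
```

With the guard in place, the estimator-less case leaves the dictionary untouched rather than raising.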
I filed #273 to track the discussion of the long-term plan around this (e.g., do we want to support pipelines with more than one estimator?).
My suggestion: we continue the discussion there, but in the meantime, we resolve this ticket by having PipelineBase::__init__
raise an error if an estimator is not specified as the final component. Does that seem reasonable?
Tagging @angela97lin @jeremyliweishih because we were just discussing this in Slack :)
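
To make the proposal concrete, here is a rough sketch of the check in `__init__`. The classes below are hypothetical stand-ins (`Component`, `Estimator`, `PipelineSketch`, and the `is_estimator` flag are all made up for illustration), not the real `PipelineBase` code:

```python
class Component:
    """Hypothetical stand-in for a pipeline component."""

    is_estimator = False

    def __init__(self, name):
        self.name = name


class Estimator(Component):
    """Hypothetical stand-in for an estimator component."""

    is_estimator = True


class PipelineSketch:
    """Illustrative only: validates the component list at construction time."""

    def __init__(self, components):
        components = list(components)
        # Proposed new check: the final component must be an estimator.
        if not components or not components[-1].is_estimator:
            raise ValueError(
                "A pipeline must have an estimator as its final component."
            )
        # Existing behavior described in the issue: an estimator may only
        # appear in the last position.
        if any(c.is_estimator for c in components[:-1]):
            raise ValueError("Only the final component may be an estimator.")
        self.component_list = components
        self.estimator = components[-1]


# Valid: transformer followed by an estimator.
pipeline = PipelineSketch([Component("imputer"), Estimator("classifier")])

# Invalid: no estimator at the end, so construction fails up front.
try:
    PipelineSketch([Component("imputer")])
    missing_estimator_rejected = False
except ValueError:
    missing_estimator_rejected = True
```

Failing in the constructor means the problem surfaces when the pipeline is defined, rather than later inside fit.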
That seems like a reasonable solution for now, until we have the long-term plan.