np.array([0.25, np.array([0.3])])
does not create a float array; the dtype is object.
This seems wrong to me. Is it intentional?
Cc @nschloe, who is against this type of thing and can link to their issue.
Thanks for CCing. Yep, this really looks like something I've been going on about. :smiley_cat: https://github.com/numpy/numpy/issues/10404
I would argue that creating an object array is the correct thing here. After all, you're putting a float and an array together. If you created a Python list,
[0.25, np.array([0.3])]
you'd expect the same thing: The first entry is a float, the second an array of length 1. It would be confusing if lists and np.arrays behaved differently here.
Also, implicitly creating a dtype float array here would make it impossible to ever create a [float, vector[1]]
array even if I wanted to.
Most of the time, writing np.array([0.25, np.array([0.3])])
is a mistake that can easily be fixed; see, e.g., https://github.com/scipy/scipy/pull/11147/files#diff-21a6a0b0d89357857304bfba2da5a971L321. After all,
Explicit is better than implicit.
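For anyone landing here with the same nested expression, a minimal sketch of two explicit ways to get a flat float array. The variable names are only for illustration and are not taken from the SciPy PR.

```python
import numpy as np

scalar = 0.25
vector = np.array([0.3])

# Option 1: join the pieces explicitly instead of nesting an array in a list.
flat = np.concatenate([np.atleast_1d(scalar), vector])

# Option 2: pull the single value out of the length-1 array first.
flat_item = np.array([scalar, vector.item()])

print(flat.dtype, flat.shape)
print(flat_item.dtype, flat_item.shape)
```

Either way the intent (a 1-D float array with two entries) is spelled out, so nothing depends on how NumPy treats the nested input.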
OK, closing. That PR would have made the recent NEP 34 changes (since reverted) less disruptive to scipy.
@nschloe
implicitly creating a dtype float array here would make it impossible to ever create a [float, vector[1]] array even if I wanted to.
For the record, you could do np.array([0.3, np.array([0.3])], dtype=object).
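A small sketch of that explicit route, assuming the goal is a two-element container holding a float and a length-1 array. The second construction (the name `manual` is mine) is a common pattern that sidesteps shape discovery entirely.

```python
import numpy as np

# Explicit object dtype: the float and the length-1 array stay separate entries.
mixed = np.array([0.25, np.array([0.3])], dtype=object)

# Alternative: allocate the object array first, then fill it element by element.
manual = np.empty(2, dtype=object)
manual[0] = 0.25
manual[1] = np.array([0.3])

print(mixed.dtype, mixed.shape)
print(type(manual[0]), type(manual[1]))
```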
If you created a Python list ...
NumPy ndarrays are very different from Python lists. I would not expect that
np.array([0.2, 0.3, 0.4])
would create an object array, even though I did not specify np.float64
for the dtype. So we agree we are comfortable with some level of automatic, value-based dtype discovery. The question is which should take precedence: numeric types or object types.
So we agree we are comfortable with some level of automatic, value-based dtype discovery.
The "automation" is quite clear here, I think: always get the "lowest" data type that can capture all input values:
numpy.array([1, 2]).dtype # int64
numpy.array([1, numpy.array(2)]).dtype # int64; arrays of rank 0 are basically scalars
numpy.array([1.0, 2]).dtype # float64
numpy.array([1, [2]]).dtype # O
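A runnable sketch of the cases above, with two caveats that are assumptions about the reader's environment rather than part of the original comment: the default integer width depends on platform and NumPy version, so the checks only ask for "some integer type", and recent NumPy releases raise a ValueError for the ragged case unless dtype=object is passed, so it is passed explicitly here.

```python
import numpy as np

ints = np.array([1, 2])               # all ints -> an integer dtype
rank0 = np.array([1, np.array(2)])    # rank-0 array is treated like a scalar
mixed_num = np.array([1.0, 2])        # int and float -> float64

# Ragged input: request the object dtype explicitly.
ragged = np.array([1, [2]], dtype=object)

# The promotion rule for the numeric cases can be queried directly.
promoted = np.result_type(ints.dtype, mixed_num.dtype)

for name, arr in [("ints", ints), ("rank0", rank0),
                  ("mixed_num", mixed_num), ("ragged", ragged)]:
    print(name, arr.dtype, arr.shape)
print("promoted:", promoted)
```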