Numpy: BUG: dimension discovery fails when mixing scalars and shape==(1,) arrays

Created on 9 Dec 2019  ·  5 Comments  ·  Source: numpy/numpy

np.array([0.25, np.array([0.3])]) fails to create a float array; the resulting dtype is object.

This seems wrong to me, is it intentional?

00 - Bug numpy.dtype

All 5 comments

Cc @nschloe, who is against this type of thing and can link to their issue

Thanks for CCing. Yep, this really looks like something I've been going on about. :smiley_cat: https://github.com/numpy/numpy/issues/10404

I would argue that creating an object array is the correct thing here. After all, you're putting a float and an array together. If you created a Python list,

[0.25, np.array([0.3])]

you'd expect the same thing: The first entry is a float, the second an array of length 1. It would be confusing if lists and np.arrays behaved differently here.

Also, implicitly creating a dtype float array here would make it impossible to ever create a [float, vector[1]] array even if I wanted to.

Most of the time, specifying np.array([0.25, np.array([0.3])]) is done by mistake, and can easily be fixed; see, e.g., https://github.com/scipy/scipy/pull/11147/files#diff-21a6a0b0d89357857304bfba2da5a971L321. After all,

Explicit is better than implicit.
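A minimal sketch of what "fixing it explicitly" could look like, assuming the intent was a flat float array. The variable names are made up for illustration and are not taken from the linked scipy PR:

```python
import numpy as np

scalar = 0.25
vec = np.array([0.3])  # shape (1,), the problematic operand

# Option 1: pull the single element out so both entries are scalars.
a = np.array([scalar, vec[0]])

# Option 2: treat both operands as 1-D pieces and join them.
b = np.concatenate(([scalar], vec))

print(a.dtype, a.shape)  # float64 (2,)
print(b.dtype, b.shape)  # float64 (2,)
```

Either form states the intended shape directly, so the result does not depend on how dimension discovery treats the nested array.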

OK, closing. That PR would have made the recent NEP 34 changes (since reverted) less disruptive to scipy.

@nschloe

implicitly creating a dtype float array here would make it impossible to ever create a [float, vector[1]] array even if I wanted to.

For the record, you could do np.array([0.3, np.array([0.3])], dtype=object).
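As a rough illustration of that point, the sketch below builds a two-element object array that holds a float and a length-1 array. The names `mixed` and `mixed_filled` are hypothetical; the second variant (allocate, then assign) is an alternative that avoids dimension discovery altogether:

```python
import numpy as np

vec = np.array([0.3])

# Passing dtype=object asks NumPy to keep each top-level entry as-is.
mixed = np.array([0.25, vec], dtype=object)

# Alternative: allocate an empty object array and fill it element by element.
mixed_filled = np.empty(2, dtype=object)
mixed_filled[0] = 0.25
mixed_filled[1] = vec

print(mixed.dtype, mixed.shape)
print(type(mixed[0]).__name__, type(mixed[1]).__name__)
```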

If you created a Python list ...

NumPy ndarrays are very different from Python lists. I would have no expectation that np.array([0.2, 0.3, 0.4]) would create an object array, even though I did not specify np.float64 for the dtype. So we agree we are comfortable with some level of automatic, value-based dtype discovery. The question is what should take precedence: numeric types or object types.

So we agree we are comfortable with some level of automatic, value-based dtype discovery.

The "automation" is quite clear here I think: Always get the "lowest" data type that can capture all input values:

numpy.array([1, 2]).dtype   # int64
numpy.array([1, numpy.array(2)]).dtype  # int64; rank-0 arrays are basically scalars
numpy.array([1.0, 2]).dtype   # float64
numpy.array([1, [2]]).dtype   # O
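
The results above depend on platform and NumPy version, so here is a hedged sketch for checking them on a given installation. The helper `discovered_dtype` is hypothetical. It assumes that ragged input either yields an object array (older releases, possibly with a deprecation warning) or raises ValueError (newer releases, unless dtype=object is passed):

```python
import warnings

import numpy as np


def discovered_dtype(values):
    """Return the dtype NumPy infers for `values`, or None if it refuses."""
    try:
        with warnings.catch_warnings():
            # Some older releases only warn about ragged input; silence that.
            warnings.simplefilter("ignore")
            return np.array(values).dtype
    except ValueError:
        # Assumption: newer releases reject ragged input without dtype=object.
        return None


print(discovered_dtype([1, 2]))                    # an integer dtype
print(discovered_dtype([1, np.array(2)]))          # an integer dtype
print(discovered_dtype([1.0, 2]))                  # float64
print(discovered_dtype([1, [2]]))                  # object, or None
print(discovered_dtype([0.25, np.array([0.3])]))   # the case from this issue
```

The integer width (int32 or int64) is left unasserted on purpose, since the default integer type differs between platforms.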