Numpy: Enh: Object array creation function

Created on 2 Jun 2015 · 4 Comments · Source: numpy/numpy

As discussed in issue #5303, it is currently not possible to create arrays of object dtype containing equal-length sequences, because the inner sequences are automatically unpacked into array elements. There is a suggestion to do this unpacking only for lists, but that would be a major backwards-compatibility break and would require a long deprecation period.
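To make the problem concrete, here is a small sketch of the behaviour described above. The variable names are only illustrative.

```python
import numpy as np

rows = [[1, 2, 3], [4, 5, 6]]

# Even with dtype=object, the equal-length inner lists are unpacked into a
# second dimension instead of being stored as two list objects.
arr = np.array(rows, dtype=object)

print(arr.shape)      # (2, 3)
print(type(arr[0]))   # a NumPy row, not the original Python list
```

The result is a 2-D array of six integers, not a 1-D array of two lists.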

Another approach would be to have a function explicitly for creating arrays with an object dtype. Perhaps this could be called "objectarray". The default for this function would be to take in a sequence, and consider each element of the sequence as an element in a 1D object array.

The function, however, could have an optional "ndim" or "depth" argument that specifies how many additional levels of the sequence should be considered part of the array. This would default to 0 (only the outermost level is considered). The function would raise an exception if the sequence lengths at those levels don't match.

Note that this approach is not mutually exclusive with the alternative, but has the advantage that it wouldn't break backwards-compatibility.

So for example:

>>> arr = objectarray([((1, 2, 3), (4, 5, 6)), ((7, 8, 9), (10, 11, 12))])
>>> arr
array([((1, 2, 3), (4, 5, 6)), ((7, 8, 9), (10, 11, 12))], dtype=object)
>>> arr.shape
(2,)

>>> arr = objectarray([((1, 2, 3), (4, 5, 6)), ((7, 8, 9), (10, 11, 12))], depth=1)
>>> arr
array([[(1, 2, 3), (4, 5, 6)],
       [(7, 8, 9), (10, 11, 12)]], dtype=object)
>>> arr.shape
(2, 2)

>>> arr = objectarray([((1, 2, 3), (4, 5, 6)), ((7, 8, 9), (10, 11, 12))], depth=2)
>>> arr
array([[[1, 2, 3],
        [4, 5, 6]],

       [[7, 8, 9],
        [10, 11, 12]]], dtype=object)
>>> arr.shape
(2, 2, 3)
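The proposal above could be sketched roughly as follows. This is not an existing NumPy function: the name `objectarray` and the `depth` argument are taken from the proposal, and the recursive implementation is only one assumed way to get the behaviour shown in the examples.

```python
import numpy as np


def objectarray(seq, depth=0):
    """Hypothetical helper: treat the outer depth + 1 levels of seq as array
    dimensions and store everything below them as plain Python objects."""
    seq = list(seq)
    if depth == 0:
        out = np.empty(len(seq), dtype=object)
        for i, item in enumerate(seq):
            # Integer indexing on a 1-D object array stores the object itself.
            out[i] = item
        return out

    # Convert one level down, then require every sub-array to share a shape.
    subs = [objectarray(item, depth - 1) for item in seq]
    shapes = {sub.shape for sub in subs}
    if len(shapes) > 1:
        raise ValueError(f"dimensions don't match: {sorted(shapes)}")

    inner = subs[0].shape if subs else ()
    out = np.empty((len(subs),) + inner, dtype=object)
    for i, sub in enumerate(subs):
        out[i] = sub  # copies the object references element by element
    return out


data = [((1, 2, 3), (4, 5, 6)), ((7, 8, 9), (10, 11, 12))]
print(objectarray(data).shape)           # (2,)
print(objectarray(data, depth=1).shape)  # (2, 2)
print(objectarray(data, depth=2).shape)  # (2, 2, 3)
```

A real implementation would presumably live in C next to the existing conversion code. The Python loop here only illustrates the intended semantics.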

Source

toddrjen

Most helpful comment

Hope I'm not missing anything, but it seems to me that an ndmax argument would not only solve the problem reported ("_create arrays of object dtype containing equal-length sequences_"), but also bring performance gains in those cases in which, e.g., the last object in the input is not a list (or is a list with a different length). Also see this question.

toobaz on 4 Dec 2017

👍3

All 4 comments

I think the easiest way to get equal-sized lists into an object array is in two steps:

>>> import numpy as np
>>> a = np.empty((2,), dtype=object)
>>> a[:] = [[1, 2, 3], [4, 5, 6]]  # a holds two list objects

>>> b = np.empty((2, 3), dtype=object)
>>> b[:] = [[1, 2, 3], [4, 5, 6]]  # b holds six int objects

Probably an implementation of objectarray would work like this.

ahaldane on 18 Jul 2015

Yes, that is currently the best way, but it is needlessly verbose. Hence this idea.

I would hope that an implementation of this idea could simply bypass the automatic conversion used in the array function and supply its own to the ndarray constructor.

toddrjen on 21 Jul 2015

Any progress on, or plans to implement, ndmax? What I'm doing right now:

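# Pad with None so the input is ragged, then drop the padding again.
# dtype=object is needed on recent NumPy, which rejects ragged input otherwise.
np.array([*data, None], dtype=object)[:-1]

# This would look a lot cleaner (proposed ndmax argument):
np.array(data, ndmax=1)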
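For comparison, here is a runnable sketch that puts the padding workaround next to the two-step allocate-and-assign approach from earlier in the thread. The sample `data` and the explicit `dtype=object` are assumptions added so the snippet runs on its own.

```python
import numpy as np

data = [[1, 2, 3], [4, 5, 6]]  # sample input, assumed for illustration

# Padding workaround: the trailing None makes the input ragged, so NumPy
# stops at one dimension; the padding element is then sliced off.
padded = np.array([*data, None], dtype=object)[:-1]

# Two-step alternative: allocate a 1-D object array, then assign into it.
filled = np.empty(len(data), dtype=object)
filled[:] = data

print(padded.shape, filled.shape)
print(list(padded) == list(filled))
```

Both routes should give a 1-D object array whose elements are the original lists, which is what the proposed `ndmax=1` call is meant to express in a single step.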

bergkvist on 14 Jun 2020
