Numpy: Issue with concatenating structured arrays (segmentation fault) (Trac #2084)

Created on 20 Oct 2012  ·  7 Comments  ·  Source: numpy/numpy

_Original ticket http://projects.scipy.org/numpy/ticket/2084 on 2012-03-19 by @astrofrog, assigned to unknown._

The following demonstrates the issue: even though NumPy knows how to concatenate the fields of a structured array, it fails with a TypeError if one tries to concatenate the structured arrays themselves:

In [1]: import numpy as np

In [2]: d1 = np.array(zip(['a','b','c']), dtype=[('b', '|S1')])

In [3]: d2 = np.array(zip(['aa','bb','cc']), dtype=[('b', '|S2')])

In [4]: np.hstack([d1['b'],d2['b']])
Out[4]: 
array(['a', 'b', 'c', 'aa', 'bb', 'cc'], 
      dtype='|S2')

In [5]: np.hstack([d1, d2])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/Users/tom/<ipython-input-5-bd5cc420043d> in <module>()
----> 1 np.hstack([d1, d2])

/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/core/shape_base.pyc in hstack(tup)
    268 
    269     """
--> 270     return _nx.concatenate(map(atleast_1d,tup),1)
    271 

TypeError: invalid type promotion

A similar issue occurs with floating-point values of different endianness:

In [1]: import numpy as np

In [2]: d1 = np.array(zip([1,2,3]), dtype=[('a', '<f4')])

In [3]: d2 = np.array(zip([1,2,3]), dtype=[('a', '>f4')])

In [4]: np.hstack([d1['a'],d2['a']])
Out[4]: array([ 1.,  2.,  3.,  1.,  2.,  3.], dtype=float32)

In [5]: np.hstack([d1, d2])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/Users/tom/<ipython-input-5-bd5cc420043d> in <module>()
----> 1 np.hstack([d1, d2])

/Users/tom/Library/Python/2.7/lib/python/site-packages/numpy/core/shape_base.pyc in hstack(tup)
    271     # As a special case, dimension 0 of 1-dimensional arrays is "horizontal"
    272     if arrs[0].ndim == 1:
--> 273         return _nx.concatenate(arrs, 0)
    274     else:
    275         return _nx.concatenate(arrs, 1)

TypeError: invalid type promotion

In some cases, this can even cause segmentation faults, though I have yet to find a way to reproduce this consistently.
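A possible workaround for the TypeError in the examples above, sketched here as an assumption rather than a confirmed fix, is to cast every array to one common dtype by hand before stacking. The arrays are rebuilt from explicit 1-tuples so that the snippet also runs on Python 3, where `zip` returns an iterator; the name `stacked` is only illustrative.

```python
import numpy as np

# Same data as the endianness example above, written with explicit 1-tuples.
d1 = np.array([(1,), (2,), (3,)], dtype=[('a', '<f4')])
d2 = np.array([(1,), (2,), (3,)], dtype=[('a', '>f4')])

# Cast the second array to the first array's dtype so that both inputs
# share a single structured dtype, then stack as usual.
stacked = np.hstack([d1, d2.astype(d1.dtype)])

print(stacked.dtype)
print(stacked['a'])
```

For the string example, the same approach would mean casting to the wider dtype (`[('b', '|S2')]`) so that no values are truncated. This does not address the segmentation fault itself; it only avoids asking NumPy to promote two different structured dtypes.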

00 - Bug numpy.core

All 7 comments

_trac user lcampagn wrote on 2012-09-24_

I can confirm this bug and reproduce the segmentation fault:

>>> import numpy as np
>>> a = np.empty(1, dtype=[('x', object)])
>>> b = np.empty(1, dtype=[('x', float)])
>>> np.concatenate([a,b])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: invalid type promotion
>>> np.concatenate([a,b])
Segmentation fault (core dumped)

I've run into this crash in a few different instances.
I can cause this crash on both Linux (1.6.1-6ubuntu1) and Windows.
I have not been able to generate useful stack traces--it would appear this bug causes some stack corruption.

Title changed from Issue with concatenating structured arrays to Issue with concatenating structured arrays (segmentation fault) by trac user lcampagn on 2012-09-24

I think this has to do with the TODO at numpy/core/src/multiarray/convert_datatype.c:1122:

/* TODO: Also combine fields, subarrays, strings, etc */

At least, that is where the failure in this issue ultimately occurs.

I've not been able to reproduce the segfault.

I've also dug into this enough to realize that the ultimate issue here is that asking NumPy to combine arbitrary structured arrays is more complicated than asking it to combine strings. This issue presents a case where the combination is obvious to the eye, but it's not clear how NumPy should merge arbitrary structured arrays (should it assume that fields with the same name are to be combined? take the union of what remains?).

Since I'm new to this project, I'll leave this open for now, and try to find somewhere else to direct my efforts :-P.

Well, it should not segfault. We should raise an error until something is figured out.

Yeah--it is raising an error now. I couldn't get it to segfault.

@charris: Any chance this issue could be reopened? At least in the case where the field names all match and are in the same order, I think it is clear that the result should have these same field names in this order and each field should use the result of promoting the corresponding types.
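The rule proposed above (identical field names in identical order, with each field promoted independently) can be sketched in user code roughly as follows. This is only an illustration of the suggested semantics, not NumPy's implementation: `concat_structured` is a hypothetical helper name, it handles 1-D arrays only, and it assumes every field has a simple dtype that `np.result_type` can promote (nested or subarray fields are not handled).

```python
import numpy as np

def concat_structured(arrays):
    """Concatenate 1-D structured arrays whose field names match in order.

    Hypothetical helper: each field's dtype is promoted independently.
    """
    arrays = [np.atleast_1d(a) for a in arrays]
    names = arrays[0].dtype.names
    if names is None:
        raise TypeError("expected structured arrays")
    for a in arrays:
        if a.dtype.names != names:
            raise TypeError("field names must match and be in the same order")

    # Promote each field on its own, e.g. S1 + S2 -> S2.
    promoted = np.dtype([
        (name, np.result_type(*[a.dtype[name] for a in arrays]))
        for name in names
    ])

    # Copy field by field into a preallocated output array.
    out = np.empty(sum(len(a) for a in arrays), dtype=promoted)
    offset = 0
    for a in arrays:
        for name in names:
            out[name][offset:offset + len(a)] = a[name]
        offset += len(a)
    return out

d1 = np.array([(b'a',), (b'b',), (b'c',)], dtype=[('b', 'S1')])
d2 = np.array([(b'aa',), (b'bb',), (b'cc',)], dtype=[('b', 'S2')])
result = concat_structured([d1, d2])
print(result.dtype)
print(result['b'])
```

Copying field by field sidesteps the promotion of whole structured dtypes, which is the step that raises `TypeError: invalid type promotion` in the report. Arrays whose field names differ are rejected here, since the thread leaves open how those should be merged.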
