Numpy: Funky vstack

Created on 22 Jul 2020  ·  7 Comments  ·  Source: numpy/numpy

From the hstack docs:

Signature: hstack(tup)
Docstring:
Stack arrays in sequence horizontally (column wise).

This is equivalent to concatenation along the second axis, except for 1-D
arrays where it concatenates along the first axis.

My question is why the exception???
We already have concatenate().

Reproducing code example:

from numpy import concatenate, hstack, stack, vstack
from numpy.random import rand

v = rand(5)
concatenate((v, v)).shape
(10,)

hstack((v, v)).shape
(10,)

stack((v, v), axis=1).shape
(5, 2)

vstack((v, v)).shape
(2, 5)

stack((v, v), axis=0).shape
(2, 5)

It would make much more sense to have hstack = stack(axis=1).

33 - Question

All 7 comments

We don't encourage hstack/vstack/dstack, and we do specifically encourage np.stack instead. However, the difference shows up when you pass in 2-D arrays. Some of the stack functions insert a new dimension (for certain input dimensionalities); I guess hstack does not.
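A minimal sketch of the 2-D case mentioned above, assuming two arbitrary arrays of shape (2, 3) chosen only for illustration: hstack joins along the existing second axis, like concatenate with axis=1, while stack inserts a new axis.

```python
import numpy as np

# Two 2-D arrays of shape (2, 3); the values are arbitrary illustration data.
a = np.zeros((2, 3))
b = np.ones((2, 3))

hstacked = np.hstack((a, b))                   # joins along the existing second axis
concatenated = np.concatenate((a, b), axis=1)  # same result as hstack for 2-D input
stacked = np.stack((a, b), axis=1)             # inserts a new axis at position 1

print(hstacked.shape)      # (2, 6)
print(concatenated.shape)  # (2, 6)
print(stacked.shape)       # (2, 2, 3)
```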

Adding a final sentence to the docs stating that np.concatenate or np.stack is (usually) the preferred API would be fine, but I do not think we have any real aspiration to delete the functions.

My question is _why the exception_???

Just because @seberg didn't directly answer this:

  • someone long ago thought it was a good idea
  • now it's too late to change it without breaking everyone

1D arrays are generally treated as "horizontal" in numpy rather than "vertical". For example, when broadcasting an (N, N) 2D array with an (N,) 1D array, the 1D array gets broadcast to (1, N), not (N, 1). hstack()/vstack()/dstack() aren't built around concepts of constant axes (you can use stack() if you want that) but around concepts of "horizontal/vertical/depth", which don't map neatly onto fixed axes for all array dimensionalities.
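The broadcasting claim can be checked with a small sketch. The size n = 3 and the variable names are assumptions picked for illustration, not anything from the thread.

```python
import numpy as np

n = 3
m = np.zeros((n, n))
v = np.arange(n)                     # shape (3,)

as_row = m + v                       # v lines up with the last axis, i.e. acts as (1, 3)
explicit_row = m + v.reshape(1, n)   # the same thing spelled out as a row
explicit_col = m + v.reshape(n, 1)   # what a "vertical" reading would give instead

print(np.array_equal(as_row, explicit_row))  # True
print(np.array_equal(as_row, explicit_col))  # False
print(as_row[0])                             # [0. 1. 2.]
```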

I'm -1 on language discouraging hstack()/vstack()/dstack(), per se. I _still_ think they are good and useful _because_ of the exceptions in their semantics. They capture concepts that aren't captured concisely by stack().

For example, there's a common need to prepend or append a scalar value to a 1D array. np.hstack([0.0, some_vector]) works great for this. np.stack([0.0, some_vector]) and np.concatenate([0.0, some_vector]) balk because they don't have the same dimensionality.
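A short sketch of that prepend case, with some_vector filled with made-up values: hstack accepts the scalar, while concatenate and stack both raise ValueError on the mixed dimensionalities.

```python
import numpy as np

some_vector = np.array([1.0, 2.0, 3.0])  # illustrative values only

# hstack promotes the scalar to 1-D before joining, so this works.
prepended = np.hstack([0.0, some_vector])
print(prepended)  # [0. 1. 2. 3.]

# concatenate and stack reject the 0-D/1-D mix.
failed = []
for name, func in (("concatenate", np.concatenate), ("stack", np.stack)):
    try:
        func([0.0, some_vector])
    except ValueError as exc:
        failed.append(name)
        print(f"{name} raised ValueError: {exc}")
```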

True, I think these tools are simply tied to working in a particular context. So let's just close this. We have the generic tools without "funky" behaviour, and the others are here to stay for the cases where they are good utilities.

@rkern, then concatenate needs fixing, IMHO. Stacking means concatenate after adding an extra dim, which is not respected here. But, okay…

@seberg, it seems like a deprecation warning would be welcome here. A 1D array is neither horizontal nor vertical. And hstack falls back to concatenate, forgetting to add an extra dimension. This indeed looks "funky". But if you're not up for discussing it, well…

All the other ?stack functions also add extra dims only when the inputs have fewer dims than the operation needs. For hstack, that covers only 0-D inputs.
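To make the promotion rule concrete, here is a sketch based on my understanding that hstack, vstack and dstack promote their inputs with np.atleast_1d, np.atleast_2d and np.atleast_3d respectively before concatenating. The array v is illustrative.

```python
import numpy as np

v = np.arange(3)  # shape (3,)

shape_h0 = np.hstack((1.0, 2.0)).shape  # 0-D inputs are promoted to 1-D
shape_h1 = np.hstack((v, v)).shape      # already 1-D, so no dimension is added
shape_v1 = np.vstack((v, v)).shape      # 1-D inputs are promoted to (1, 3)
shape_d1 = np.dstack((v, v)).shape      # 1-D inputs are promoted to (1, 3, 1)

print(shape_h0)  # (2,)
print(shape_h1)  # (6,)
print(shape_v1)  # (2, 3)
print(shape_d1)  # (1, 3, 2)

# The same vstack result written out with an explicit promotion step.
manual_v = np.concatenate([np.atleast_2d(v), np.atleast_2d(v)], axis=0)
print(np.array_equal(manual_v, np.vstack((v, v))))  # True
```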

@Atcold I am open to suggestions for nudging users towards stack, but probably within the bounds of not actively scaring them away from using these functions.
