Numpy: np.ceil and np.floor are inconsistent with math.ceil and math.floor

Created on 7 May 2017  ·  10 Comments  ·  Source: numpy/numpy

In Python 3, math.floor and math.ceil changed to return integers rather than floats:

>>> import math
>>> import numpy as np
>>> math.floor(1.5), math.ceil(1.5)
(1, 2)
>>> np.floor(1.5), np.ceil(1.5)
(1.0, 2.0)   # also what math.floor and math.ceil return on Python 2.7

Should we update these functions in NumPy to also return integer types?

I think the deprecation path would be:

  • add f->i loops to the ufunc, so that np.floor(1.5, dtype=int) is possible
  • on Python 3, start emitting a FutureWarning for np.floor(1.5) with no dtype
  • on Python 3, switch the default dtype to int
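
As a rough illustration of the middle step of that path, here is a pure-Python stand-in. The name floor_transitional is hypothetical and not part of NumPy; the real proposal would add loops to the ufunc itself, whereas this sketch only emulates the dtype argument with an explicit cast after the fact.

```python
import warnings

import numpy as np


def floor_transitional(x, dtype=None):
    """Hypothetical wrapper (not a NumPy API) mimicking the warning stage."""
    result = np.floor(x)  # today's behaviour: float in, float out
    if dtype is None:
        warnings.warn(
            "floor may return integers by default in the future; "
            "pass dtype=float to keep the current behaviour",
            FutureWarning,
            stacklevel=2,
        )
        return result
    # Emulate the requested output type with an explicit cast.
    return np.asarray(result).astype(dtype)


print(floor_transitional(1.5, dtype=int))
print(floor_transitional(1.5, dtype=float))
```

Callers that pass a dtype never see the warning, which is the point of introducing the explicit option before changing the default.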

All 10 comments

That's a pretty major change which we might want to put off until we drop support for 2.7. Note that the C functions return floats, and the C functions are what numpy uses. OTOH, integers are usually what folks want, and I've considered adding ifloor and iceil functions as integer versions in the past. Note that Python has an advantage here, as its integer type has unlimited precision.

What would happen for np.floor([1.5, 1e300])? Python's int type can be arbitrarily long.

> What would happen for np.floor([1.5, 1e300])

It would make sense to do the same thing that x = np.array([1.5, 1e300]); np.int64(x) does. Both should probably give a warning.
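
To make the overflow concern concrete, here is a small sketch that checks which floored values fit in int64 before casting. The result of casting an out-of-range float to int64 is platform-dependent (and newer NumPy versions may emit a RuntimeWarning), so the sketch avoids performing that cast. The names int64_min and int64_end are made up for the example.

```python
import math

import numpy as np

x = np.array([1.5, 1e300])
floored = np.floor(x)

# int64 covers [-2**63, 2**63); both bounds are exactly representable as floats.
int64_min = -(2.0 ** 63)
int64_end = 2.0 ** 63
fits = (floored >= int64_min) & (floored < int64_end)
print(fits)  # element-wise: which values can be cast without overflow

# Only cast the values that are known to fit.
safe = floored[fits].astype(np.int64)
print(safe)

# Python's int has no fixed width, so math.floor stays exact for 1e300.
print(math.floor(1e300) > 2 ** 63)
```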

The workaround would be to explicitly request a float, with np.floor(..., dtype=float)

> The workaround would be to explicitly request a float, with np.floor(..., dtype=float)

I don't think this works given how ufunc dispatch is handled. If you had a float->int loop, then saying dtype=float would cast to int and then cast back to float, which is no use. But I think that's currently the only way to get the int-by-default behavior. You'd need something like two versions of the loop that differ only in output type, and then some way to make sure that the integer version has higher "priority" than the float version. Or something. (Actually, for all I know that's possible by carefully arranging the lookup table in the right order; the loop dispatch code is tricky. But it doesn't seem like something to rely on.)
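
For anyone following along, the loop table being discussed can be inspected from Python. This is only an inspection sketch: the exact contents of np.floor.types depend on the installed NumPy version, so no particular listing is assumed beyond the float64 loop.

```python
import numpy as np

# Each entry reads "input typecodes->output typecodes".
print(np.floor.types)

# The float64 loop takes a double and returns a double.
print("d->d" in np.floor.types)

# Whether a float-in, int-out loop is registered (the proposal would add one).
print("d->l" in np.floor.types)

# dtype= selects a loop by its output type; the float32 input is cast up
# to match the selected 'd->d' loop.
y = np.floor(np.float32(1.5), dtype=np.float64)
print(y.dtype)
```

Because dtype= picks the loop rather than converting the result afterwards, a default integer loop and a float override would have to coexist in that table, which is the ordering problem described above.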

I vote for documenting that np.ceil and np.floor return floats and being done with it. I can make a pull request if you agree.

As in #5700, I vote for returning ints. Especially now that numpy is a bit picky about requiring ints for indices (which is good IMO), having to cast the float return value is quite annoying (and gratuitously requires another intermediate buffer).
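
The indexing friction looks roughly like this in practice. The array and variable names are just for the example.

```python
import numpy as np

a = np.arange(10)
pos = np.floor(2.5)  # a float scalar with value 2.0

try:
    a[pos]
except IndexError as exc:
    print("float index rejected:", exc)

# Explicit cast to the platform index type; this allocates a second array.
idx = np.floor(np.array([0.2, 2.5, 7.9])).astype(np.intp)
print(a[idx])
```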

There is also some surprising behaviour with the Python 3 math.ceil and math.floor functions, found via HypothesisWorks/hypothesis#1667:

>>> np.dtype('int64').type(2 ** 53 + 1)
9007199254740993
>>> math.ceil(_)
9007199254740992

Just as not all floats are representable as integers, 64-bit integers can be unrepresentable as floats - and there's an intermediate cast in the __ceil__ method. I don't have a proposed solution for this and it wasn't hard to work out what's happening, but even knowing that np.ceil returns a float this surprised me.
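
The precision loss can be reproduced without math.ceil at all, since it comes from the integer-to-float conversion. A minimal check:

```python
import numpy as np

n = np.int64(2 ** 53 + 1)

# The value is exact as an integer.
print(int(n))

# The nearest float64 is 2**53, so the +1 is lost in the conversion.
print(float(n) == float(2 ** 53))

# Round-tripping through float therefore changes the value.
print(int(float(n)) - int(n))
```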

Found this while looking at old issues: can we avoid breaking backward compatibility by adding new f->i loops so that at least one can select integer output? (i.e., the opposite of the dtype=float suggestion above).

Perhaps it would be better to add new functions: iceil, ifloor, etc. That floats eventually develop integer-sized gaps could be a problem, as already pointed out. I think that dealing with that could be tricky if we want to do a bit better in the integer domain.

@charris - yes, perhaps that is best. It could then extend to iround, ifix, etc., too.
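
A minimal sketch of what such a family could look like if written in user code today. None of these functions exist in NumPy; the names ifloor, iceil, iround and ifix are taken from the suggestions in this thread, and raising OverflowError for out-of-range input is an assumption, not an agreed design. Large floats are still subject to the integer-sized gaps mentioned above.

```python
import numpy as np


def _to_int64(values):
    """Cast integral-valued floats to int64, refusing NaN and out-of-range input."""
    values = np.asarray(values, dtype=np.float64)
    in_range = (values >= -(2.0 ** 63)) & (values < 2.0 ** 63)
    if not np.all(in_range):
        raise OverflowError("value is NaN or outside the int64 range")
    return values.astype(np.int64)


def ifloor(x):
    return _to_int64(np.floor(x))


def iceil(x):
    return _to_int64(np.ceil(x))


def iround(x):
    # np.rint rounds halves to the nearest even value.
    return _to_int64(np.rint(x))


def ifix(x):
    # np.fix rounds toward zero.
    return _to_int64(np.fix(x))


print(ifloor([1.5, -1.5]), iceil([1.5, -1.5]))
print(iround([0.5, 1.5, 2.5]), ifix([1.5, -1.5]))
```

Unlike a real ufunc loop, this still builds the intermediate float array before casting, so it does not remove the extra buffer complained about earlier in the thread.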
