Numpy: BUG: PyArray_BufferConverter is unsafe

Created on 27 Jan 2020  ·  5Comments  ·  Source: numpy/numpy

This function:

  1. Calls PyObject_GetBuffer
  2. Extracts a data pointer
  3. Calls PyBuffer_Release
  4. Returns the pointer to the caller

PyBuffer_Release calls PyBufferProcs.bf_releasebuffer(PyObject *exporter, Py_buffer *view), which according to the docs may _"free all memory associated with view"_, and gives no requirement that view remains alive as long as exporter.

So in principle a type that allocates a buffer for itself on the fly in bf_getbuffer and deletes it when the last uses releases it will cause a use-after-free in numpy.

I don't know if any implementers of the buffer protocol actually do this, but my reading of it is that they are permitted to.

00 - Bug

All 5 comments

This is strange. I think I would like if the buffer interface specified that while the buffer struct fields (such as strides, etc.) get free'd by PyBuffer_Release the actual memory pointed to must be owned by the original object (this would solve our problem with the very annoying _dealloc_cached_buffer_info). I think this is the intention and true, but...

Now this function returns a PyArray_Chunk and not a buffer, so if you make the clarification that I say above, it is actually completely fine.

Your interpretation sounds more reasonable - perhaps a pull request against python/cpython would be wise, to update the docs to tell people not to implement what I described above.

Was going to open, but then went to lunch first: I may look into proposing actually changes to the text. If we can agree on my interpretation, we can clean up very ugly and very slow code in the buffer protocol implementation (it currently slows down scalar math by 20+%). (EDIT: But I suppose only after Python fixes their ArgParse code :()

That adds some clarity, thanks - I hadn't realized that python was using the same reading as bf_releasebuffer as I was.

Damn, I had overread that before (from the PEP):

Exporters will need to define a bf_releasebuffer function if they can re-allocate their memory, strides, shape, suboffsets, or format variables which they might share through the struct bufferinfo.

Which reads like it is valid to give the buffer a new chunk of memory :(.

Although: looking at the implementation of array (python one), all it does is use it to ensure that it disables resizing if a buffer to it exists.

Was this page helpful?
0 / 5 - 0 ratings