Pyradiomics: Memory Error during Feature Extraction (for many images or large mask)

Created on 27 Sep 2017 · 16 Comments · Source: AIM-Harvard/pyradiomics

When I tried to extract features on many images in a loop using PyRadiomics I ran into a memory error. I was able to reproduce two types of memory errors with a simple script and just one image and mask, repeatedly calling the feature extraction on the same image.

I put the code to reproduce the error here on gist
You can find the test image data to run the script (as well as the script file itself again) here

There are two different error cases:

1. Trying to extract features from a rather large mask (-> "MemoryError" in numpy)
2. Trying to extract features from small masks for many images in a loop (-> "Failed to allocate memory for image" in SimpleITK)

You can reproduce them by (un)commenting one of the lines defining which mask to use (see also the explanatory comment at the top of the script).

Case 1 error details:

Traceback (most recent call last):
  File "d:/Test/PyRadiomicsMemoryExceptionTest.py", line 26, in <module>
    featureVector = extractor.execute(testImage, testMask, label = 1)
  File "C:\Python36\lib\site-packages\pyradiomics-1.2.0.post25.dev0+g13274ff-py3.6-win32.egg\radiomics\featureextractor.py", line 354, in execute
    shapeClass = self.featureClasses['shape'](croppedImage, croppedMask, **self.settings)
  File "C:\Python36\lib\site-packages\pyradiomics-1.2.0.post25.dev0+g13274ff-py3.6-win32.egg\radiomics\shape.py", line 62, in __init__
    physicalCoordinates -= numpy.mean(physicalCoordinates, axis=0)  # Centered at 0
  File "C:\Python36\lib\site-packages\numpy\core\fromnumeric.py", line 2909, in mean
    out=out, **kwargs)
  File "C:\Python36\lib\site-packages\numpy\core\_methods.py", line 54, in _mean
    arr = asanyarray(a)
  File "C:\Python36\lib\site-packages\numpy\core\numeric.py", line 583, in asanyarray
    return array(a, dtype, copy=False, order=order, subok=True)
MemoryError

Case 2 error details:

Traceback (most recent call last):
  File "d:/Test/PyRadiomicsMemoryExceptionTest.py", line 27, in <module>
    featureVector = extractor.execute(testImage, testMask, label = 1)
  File "C:\Python36\lib\site-packages\pyradiomics-1.2.0.post25.dev0+g13274ff-py3.6-win32.egg\radiomics\featureextractor.py", line 346, in execute
    featureVector.update(self.getProvenance(imageFilepath, maskFilepath, mask))
  File "C:\Python36\lib\site-packages\pyradiomics-1.2.0.post25.dev0+g13274ff-py3.6-win32.egg\radiomics\featureextractor.py", line 440, in getProvenance
    for k, v in six.iteritems(generalinfoClass.execute()):
  File "C:\Python36\lib\site-packages\pyradiomics-1.2.0.post25.dev0+g13274ff-py3.6-win32.egg\radiomics\generalinfo.py", line 56, in execute
    generalInfo[el] = getattr(self, 'get%sValue' % el)()
  File "C:\Python36\lib\site-packages\pyradiomics-1.2.0.post25.dev0+g13274ff-py3.6-win32.egg\radiomics\generalinfo.py", line 139, in getVolumeNumValue
    ccif.Execute(labelMap)
  File "C:\Python36\lib\site-packages\SimpleITK\SimpleITK.py", line 20584, in Execute
    return _SimpleITK.ConnectedComponentImageFilter_Execute(self, *args)
RuntimeError: Exception thrown in SimpleITK ConnectedComponentImageFilter_Execute: c:\d\vs14-win32-pkg\simpleitk-build\itk-prefix\include\itk-4.11\itkImportImageContainer.hxx:199:
Failed to allocate memory for image.

Labels: bug, installation

michaelschwier

All 16 comments

@JoostJM let us know if you have any idea what is going on, or if we should investigate further.

fedorov on 27 Sep 2017

@michaelschwier, what kind of hardware are you using? Specifically, how much RAM did you have available when running the script?
I tested your script on my computer (Intel Xeon E3-1241, 16 GB RAM) and had no issues with your large mask (ran 11 iterations; memory fluctuates, peaking at about 2 GB). I'm also running the small mask script (~600 iterations so far, still requiring only about 200 MB of RAM), and this also appears to be running fine.

It is possible there isn't enough RAM available to run pyradiomics. We already incorporate enhancements to reduce the memory footprint when extracting features, such as cropping on the bounding box of the segmentation prior to feature extraction. However, generating texture matrices, especially when using large masks, simply requires a lot of memory. To further check, I will run a memory profiling over time of your script.

Your small mask poses an interesting case though. It is possible that this crashes later because the RAM is enough for the first few iterations, but runs out when the results vector grows too large (even though this vector is relatively small compared to the overall memory usage of pyradiomics).

As to solutions: if it already fails on the first connected component image filter (as is the case in your large mask case), I'm not sure what to do. You can remove that part of the code by disabling the additional info (parameter additionalInfo set to False), but I think it will then fail in some other part of the code (the most memory-intensive functions are those that generate the texture matrices).
For your small mask case, check out the batch script contained in the examples. This script writes out the results of each case (by appending to a file), thereby preventing a build-up of memory usage when extracting a large batch. In theory, the batch script should only fail due to a memory shortage if any one case is too large to extract (regardless of how many cases were extracted before).
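
The append-per-case idea described above can be sketched roughly as follows. This is not the actual batch script from the pyradiomics examples: `extract_features` is a hypothetical stand-in for the real per-case extraction call, and the feature names are made up for illustration.

```python
import csv
import os


def extract_features(case_id):
    """Hypothetical stand-in for a per-case feature extraction call."""
    return {"case": case_id, "feature_a": len(case_id), "feature_b": 2.0 * len(case_id)}


def run_batch(case_ids, out_path):
    """Extract each case and append its row to out_path right away."""
    for case_id in case_ids:
        row = extract_features(case_id)
        # Write the header only if the output file is new or empty.
        write_header = not os.path.exists(out_path) or os.path.getsize(out_path) == 0
        with open(out_path, "a", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(row))
            if write_header:
                writer.writeheader()
            writer.writerow(row)
        # `row` is rebound on the next iteration, so results do not pile up in memory.
```

Because each row is flushed to disk before the next case starts, a crash on one case also leaves the results of all earlier cases intact.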

JoostJM on 2 Oct 2017

Here are the graphs of memory usage over time for the large mask (~20 iterations) and the small mask (~200 iterations). I got no memory errors and halted the process.

Large mask
[Figure: large mask memory usage]

Small mask
[Figure: small mask memory usage]

If you still get your memory errors, could you make a similar graph?
I used a simple python package called mprof (pip install memory_profiler) and then ran your script using python C:\Python27\scripts\mprof run PyRadiomicsMemoryExceptionTest.py. After this has finished you can see the graph by running python C:\Python27\scripts\mprof plot
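
If mprof is not an option, a rough alternative is to log memory per iteration from inside the script using the standard library's tracemalloc. The sketch below is only an approximation of what mprof shows: tracemalloc only sees allocations made through Python's allocators, so native memory used by C++ libraries such as SimpleITK is generally not included. The function name and the `work` callable are illustrative assumptions, and `tracemalloc.reset_peak` requires Python 3.9 or newer.

```python
import tracemalloc


def memory_per_iteration(work, iterations):
    """Return a list of (current_bytes, peak_bytes) tuples, one per call to work()."""
    samples = []
    tracemalloc.start()
    try:
        for _ in range(iterations):
            tracemalloc.reset_peak()  # Python 3.9+: peak is measured per iteration
            work()
            current, peak = tracemalloc.get_traced_memory()
            samples.append((current, peak))
    finally:
        tracemalloc.stop()
    return samples
```

A steadily growing `current` value across iterations would hint at a build-up of retained objects, whereas a large `peak` with a flat `current` points to a single memory-hungry step.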

JoostJM on 2 Oct 2017

@JoostJM Thank you for your answer and checking on your machine.
My system is Win 10 with 16 GB RAM. During all my tests there were always at least 7 GB of RAM still available. Unfortunately the mprof tool doesn't work for me (Windows): it throws an exception saying that it cannot access the source code!?

So I did "manual" memory observation by looking at the memory consumption of the process in the Task Manager. For the case with the large mask the process crashes when using around 1.4 GB of memory. For the case with the small mask the process never consumes more than 350 MB of memory.

However: I was using a 32 bit Python. So in the large-mask case I can understand that it runs out of memory (though it should still have some headroom at 1.4 GB). For the small-mask case it shouldn't be an issue, though :/

I have now also installed a 64 bit Python in parallel and I cannot reproduce the errors so far (> 1200 iterations on the small mask). So for me that could be the solution. Nevertheless, the crash of the small mask on 32 bit still puzzles me ...

michaelschwier on 2 Oct 2017

We should think about explicitly not supporting 32bit python. We explicitly don't support 32 bit platforms in Slicer because we had so many memory related errors.

pieper on 2 Oct 2017

I would suggest not spending any time debugging this issue in 32 bit, and adding a note to the user guide that 64 bit python should be used.

fedorov on 2 Oct 2017

Could maybe even add a warning during installation of pyradiomics when detecting 32bit python!?
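
A minimal sketch of such a check, using only the standard library. This is not what pyradiomics actually does at install time; the function names and the warning text are illustrative assumptions.

```python
import struct
import sys
import warnings


def python_bitness():
    """Return the pointer size of the running interpreter in bits (32 or 64)."""
    return struct.calcsize("P") * 8


def warn_if_32bit():
    """Emit a warning and return True when running under a 32-bit interpreter."""
    if python_bitness() < 64:
        warnings.warn(
            "Running under 32-bit Python; feature extraction may hit memory "
            "errors. A 64-bit Python is recommended."
        )
        return True
    return False


# Equivalent quick check: sys.maxsize exceeds 2**32 only on 64-bit builds.
IS_64BIT = sys.maxsize > 2**32
```

Calling `warn_if_32bit()` from a setup script or at import time would surface the problem early; raising an exception instead of warning would turn it into a hard failure.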

michaelschwier on 2 Oct 2017

Could maybe even add a warning during installation of pyradiomics when detecting 32bit python!?

Definitely, or even failure.

fedorov on 2 Oct 2017

👍1

Hi,

I've been trying to run a test.py to check that radiomics is working, but it keeps saying that it cannot import featureextractor from radiomics. Is anyone else having the same issue?

Thanks in advance

CristianIzquierdoLitii on 6 Mar 2018

I have a similar issue.
I am facing a memory error when I try to run my model. I am sending 6500 images to train with 7 captions each.
I am using Ubuntu.

Yukti-09 on 11 Jul 2019

@Yukti-09, which specific version of PyRadiomics are you using? what parameters? How much RAM does your system have?

JoostJM on 11 Jul 2019

Not using pyradiomics

Yukti-09 on 11 Jul 2019

👎1

Hi! I am using pyradiomics for feature extraction from 2D ultrasound images. I am using an open access database of thyroid ultrasound images (available at http://cimalab.intec.co/applications/thyroid/ ). When I use pyradiomics to extract features from the mask, it requires more than 16 GB of RAM. Are there any settings in pyradiomics to limit the memory usage? The mask is small compared to the whole image. If feature extraction from the mask takes this much memory, what will happen if I do the same for the whole image? Kindly guide.

ReemaParekh on 16 Jun 2020

@ReemaParekh what kind of settings are you using when performing the extraction?

JoostJM on 16 Jun 2020

I have a 360x560 USG image and am using voxel-based feature extraction, applied to the whole image with all feature classes enabled. In this case the memory usage reached 20 GB in some cases. I am using the settings below.
featureVector = extractor.execute(image3D, mask,label=1,voxelBased=True)
settings = {}
settings['binWidth'] = 25
settings['force2D'] = True
settings['force2Ddimension'] = 0
settings['maskedKernel']=True
settings['initvalue']=5
settings['kernelRadius']=10
settings['resampledPixelSpacing'] = None # [3,3,3] is an example for defining resampling (voxels with size 3x3x3mm)
settings['interpolator'] = sitk.sitkBSpline
settings['verbose'] = True

Further to this, if I set force2D = True then my code does not work. Otherwise it works but requires a huge amount of memory. I have a 2D image, but I was unable to use the 2D image directly in pyradiomics, so I converted it to 1x360x560 for processing: image3D=sitk.JoinSeries(image)

ReemaParekh on 16 Jun 2020

The memory requirement sounds valid. Be aware that voxel-based radiomics can be quite memory intensive, especially when extracting the entire image and enabling all features. Output is float64 maps for each feature, which in your case means 360x560x8 bytes per feature map. Furthermore, there is some additional memory requirement for intermediate feature maps, mask etc.
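
The per-map figure mentioned above can be worked out explicitly. The sketch below only covers the output maps; the number of feature maps (100) is a hypothetical value chosen for illustration, and the intermediate data is not modeled at all.

```python
def feature_map_bytes(rows, cols, bytes_per_value=8):
    """Size in bytes of one float64 feature map of shape rows x cols."""
    return rows * cols * bytes_per_value


per_map = feature_map_bytes(360, 560)   # 360 * 560 * 8 bytes
per_map_mib = per_map / 2**20           # same size expressed in MiB

n_maps = 100                            # hypothetical number of enabled features
total_mib = n_maps * per_map / 2**20    # output maps only, no intermediates
```

Here `per_map` is 1,612,800 bytes (about 1.5 MiB), so under this assumption the output maps alone come to roughly 154 MiB; anything well beyond that would have to come from the intermediate data mentioned above.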

As to force2D, what do you mean by "not working"? PyRadiomics should be able to deal with both truly 2D input and 3D input with force2D enabled. The only thing I can imagine going wrong in your code is that if your image has sitk size 1x360x560, it means that x is your force2D dimension, and you should set force2Ddimension to 2 (reason from the matrix, which is ordered as z, y, x).
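
The index reasoning can be sketched as a small helper. It assumes, as described above, that the sitk size is given in (x, y, z) order while force2Ddimension counts axes in (z, y, x) order; the function name is made up for illustration and is not part of pyradiomics.

```python
def force2d_dimension_from_sitk_size(sitk_size):
    """Map the singleton axis of an (x, y, z) sitk size to a (z, y, x) axis index."""
    singleton_axes = [axis for axis, length in enumerate(sitk_size) if length == 1]
    if len(singleton_axes) != 1:
        raise ValueError("expected exactly one axis of length 1, got %r" % (sitk_size,))
    # Reversing the axis order turns an (x, y, z) index into a (z, y, x) index.
    return len(sitk_size) - 1 - singleton_axes[0]
```

For a size of (1, 360, 560) this returns 2, matching the suggestion above; for (360, 560, 1) it returns 0.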

JoostJM on 17 Jun 2020
