Pyradiomics: Profiling PyRadiomics for very large segmentations

Created on 17 May 2017  ·  14 Comments  ·  Source: AIM-Harvard/pyradiomics

Profiling PyRadiomics using the brain1 test case (256x256x25 voxels) with the full mask (a configuration sketch follows the list below):

  • all features
  • Original, LoG (sigma 1, 3 and 5), Wavelet (1 level, 8 decompositions)
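
For reference, a minimal sketch of how such an extraction might be configured with the PyRadiomics feature extractor API (assuming a recent version where the class is named `RadiomicsFeatureExtractor`; the file paths are placeholders):

```python
# Sketch of the extraction settings used for this profiling run (paths are placeholders).
from radiomics import featureextractor

extractor = featureextractor.RadiomicsFeatureExtractor()
extractor.enableAllFeatures()  # enable every feature class
extractor.enableImageTypes(Original={},
                           LoG={'sigma': [1.0, 3.0, 5.0]},
                           Wavelet={})  # 1 level -> 8 decompositions

# result = extractor.execute('brain1_image.nrrd', 'brain1_label.nrrd')
```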

Main bottlenecks:

  • GLSZM matrix calculation, run 12 times (once per derived image): 76.9% of the total time (approx. 22 min total, 1.86 min per calculation)
  • ShapeStatistics image filter (implemented in SimpleITK): 16.7% of the total time (approx. 5 min), applied only once

[attachment: profiling_full_mask — full profiling output]

cc @Radiomics/developers

Label: enhancement

All 14 comments

@blezek I mentioned to @Radiomics/developers that you'd run into performance issues trying to run radiomics on whole brains so @JoostJM did some profiling.

From a quick look at the glszm code it looks like there's a lot of room to optimize the implementation.

In the C code? Do you have a general idea how to tackle it?

> In the C code? Do you have a general idea how to tackle it?

Maybe you could look into using threads.

That said, using threading libraries with vanilla C code may be challenging to get right.

I think that implementing ITK filters could be helpful (e.g. as an ITK remote module); you would then be able to leverage the associated cross-platform image processing and threading API. These filters could then be exposed through SimpleITK.

@thewtex @jbvimort I wonder if you have any insights ...

It would be best to do line-by-line profiling, but my guess is that this set of nested loops is pretty slow. Using numpy.where can be slow, and it will return long lists of coordinates.
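
To illustrate the concern (this is a toy example, not the actual GLSZM code): calling numpy.where inside a loop materializes full coordinate arrays on every iteration, whereas a single vectorized pass avoids that.

```python
import numpy as np

arr = np.random.randint(0, 6, size=(256, 256, 25))  # toy image with 6 gray levels

# Loop-style pattern: one numpy.where per gray level, each returning full coordinate arrays
counts_loop = {}
for level in range(arr.max() + 1):
    xs, ys, zs = np.where(arr == level)   # allocates three coordinate arrays every iteration
    counts_loop[level] = len(xs)

# Vectorized alternative: one pass over the whole array
levels, counts_vec = np.unique(arr, return_counts=True)
```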

I agree with Jc that this operation could be threaded since each pixel should be independent. Also a SimpleITK implementation of the same operation could be way faster. But I'd start by doing the profiling and see if there are ways to just optimize the existing python code.
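
As a starting point, a minimal function-level profiling sketch using the standard library's cProfile (the extractor setup and file paths are placeholders, not the exact run from this issue):

```python
import cProfile
import pstats

from radiomics import featureextractor

# Placeholder extraction setup; adjust enabled features and image types as needed
extractor = featureextractor.RadiomicsFeatureExtractor()
extractor.enableAllFeatures()

profiler = cProfile.Profile()
profiler.enable()
extractor.execute('brain1_image.nrrd', 'brain1_label.nrrd')  # placeholder paths
profiler.disable()

# Print the 20 most expensive calls by cumulative time
pstats.Stats(profiler).sort_stats('cumulative').print_stats(20)
```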

@pieper it appears from the profiling Joost already did that _calculateCMatrix accounts for 77% of the time (if I interpret the profiling summary correctly):

[screenshot: profiling summary]

I see, you are saying it is probably this call to this code. Yes, that looks pretty compute intensive.

If this feature is really informative, and large image masks are an important use case, then it could be that the "simple C library" approach is not enough, at least as currently written. It's possible the C code could be made more efficient, or we could explore other options like threading or a SimpleITK implementation.

@pieper @fedorov, it's indeed this call. I'll take a look to see if it can be made more efficient.

If I can, I'd like to avoid implementing multithreading here, for two reasons:

  1. This will make the C code a lot more complex
  2. When I extract features for multiple patients, I increase performance by parallelizing at the patient level (one thread per patient). In that case I also set SimpleITK to single-threaded mode so that each per-patient extraction stays on a single thread (see the sketch after this list).
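
For context, a rough sketch of the per-patient parallelization pattern described in point 2, written here with multiprocessing rather than threads; the worker function, case paths, and pool size are assumptions, and SimpleITK is pinned to a single thread inside each worker:

```python
# Hypothetical sketch of per-patient parallel extraction; case paths and pool size are assumptions.
from multiprocessing import Pool

import SimpleITK as sitk
from radiomics import featureextractor

def extract_one(case):
    image_path, mask_path = case
    # Keep SimpleITK single-threaded so each patient's extraction stays on one core
    sitk.ProcessObject.SetGlobalDefaultNumberOfThreads(1)
    extractor = featureextractor.RadiomicsFeatureExtractor()
    extractor.enableAllFeatures()
    return extractor.execute(image_path, mask_path)

if __name__ == '__main__':
    cases = [('pt1_image.nrrd', 'pt1_label.nrrd'),
             ('pt2_image.nrrd', 'pt2_label.nrrd')]
    with Pool(processes=2) as pool:
        results = pool.map(extract_one, cases)
```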

On a side note, the main reason the C GLSZM calculation takes so much time is that it is computationally intensive and is called 12 times (once for each derived image type). The call to SimpleITK's LabelShapeStatistics image filter is the most expensive single call, but it is only made once per extraction (it belongs to the shape feature class and is therefore only calculated on the original image type).

As to the SimpleITK call taking so much time: this is due to computing the Feret diameter (maximum 3D diameter).
If you run the LabelShapeStatistics image filter with and without this diameter, there is a time difference of 271,053 ms vs. 21 ms for the full brain1 mask.
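
For illustration, a minimal sketch of how that with/without comparison might be timed with SimpleITK (the mask path is a placeholder, not part of the original report):

```python
import time

import SimpleITK as sitk

mask = sitk.ReadImage('brain1_label.nrrd')  # placeholder path for the full brain1 mask

for compute_feret in (False, True):
    shape_filter = sitk.LabelShapeStatisticsImageFilter()
    shape_filter.SetComputeFeretDiameter(compute_feret)  # Feret diameter = max 3D diameter
    start = time.time()
    shape_filter.Execute(mask)
    print('Feret diameter:', compute_feret, '-', time.time() - start, 's')
```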

There is now a Python package available, itk-texturefeatures, that could be helpful here.

@JoostJM did you have a look at itk-texturefeatures? It would be interesting to know how the speed compares and if the feature sets are comparable to what we have now in the pyradiomics c code.

https://github.com/InsightSoftwareConsortium/ITKTextureFeatures/tree/2b4544fe39e0ece9007a0d87c396c8586c6f4df5/example

@pieper, I just returned from my holiday, but this is certainly interesting.
I did some profiling using PyRadiomics and a mask similar to the one used by the ITK texture features (implementing a voxel-wise calculation). PyRadiomics' performance was similar to ITK's.
A downside of PyRadiomics is that it only accepts 3D input, whereas ITK is N-dimensional. On the other hand, PyRadiomics achieves this performance in single-threaded mode and calculates more features in GLCM.

In terms of the feature set, as far as I can see the formula definitions are similar, with PyRadiomics having more features in both GLCM and GLRLM. ITK's texture features module has only one feature (in GLCM) that PyRadiomics doesn't have, but it is a similar implementation of PyRadiomics' correlation, present in ITK as Haralick's correlation. That being said, I still believe there will be some differences, because ITK uses a fixed bin count, as opposed to the fixed bin width used in PyRadiomics.
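
To make that discretization difference concrete, a small illustrative sketch (toy intensities; the bin width of 25 matches PyRadiomics' default, while the bin count of 32 is an arbitrary assumption):

```python
import numpy as np

intensities = np.random.normal(100, 30, size=1000)  # toy ROI intensities

# Fixed bin width (PyRadiomics-style): the number of bins depends on the intensity range
bin_width = 25
edges_width = np.arange(intensities.min(), intensities.max() + bin_width, bin_width)

# Fixed bin count (ITK-style): the bin width depends on the intensity range
bin_count = 32
edges_count = np.linspace(intensities.min(), intensities.max(), bin_count + 1)

print(len(edges_width) - 1, 'bins (fixed width) vs', bin_count, 'bins (fixed count)')
```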

Thanks for checking @JoostJM -- sounds like we should decide if we want to keep an independent implementation in PyRadiomics or work together with the ITK community to improve itk-texturefeatures. My knee-jerk reaction is to prefer the community effort whenever possible in the hopes that code maintenance will be more efficient.
