Pyradiomics: Inconsistent wavelet feature values between v1.2.0 and v1.3.0

Created on 21 Jun 2018 · 11 Comments · Source: AIM-Harvard/pyradiomics

Hi! I have taken over a project previously started by someone who was using pyradiomics v1.2. Since I was just starting, I opted for the latest version (v1.3). To make sure I was on the right track, I compared the features from my runs with those obtained by the person working with v1.2, and I found some significant discrepancies in the features calculated from the wavelet-filtered images. I have been able to pinpoint two issues:

Issue 1:
In v1.3, I found that some of the decomposition names were not attributed to the correct decomposition image. I discovered this when I saw an almost perfect symmetry in the names when comparing against v1.2.

This is specific to the wavelet decompositions whose names are not palindromes (e.g. “LLL” features were the same across versions, whereas HHL (v1.3) = LHH (v1.2), LLH (v1.3) = HHL (v1.2), etc.). Digging a bit into the getWaveletImage function in imageoperations.py, I saw that the following lines were added in v1.3:

axes = {2, 1, 0}  # set
if kwargs.get('force2D', False):
    axes -= {kwargs.get('force2Ddimension', 0)}  # set

approx, ret = _swt3(inputImage, kwargs.get('wavelet', 'coif1'), kwargs.get('level', 1), kwargs.get('start_level', 0), axes=tuple(axes))

From my understanding of PyWavelets, the order of axes determines the order of the letters in the decomposition name. Why is this a set? It seems to me that its order would then be unpredictable when it is converted to a tuple for the _swt3 call, which could change the decomposition names and feature labels. When I change the axes variable to a list, the issue goes away.
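To illustrate the concern, here is a minimal sketch in plain Python. It is not pyradiomics code: `name_for` and `filters_by_axis` are hypothetical names, and the sketch assumes, as described above, that the letters of a decomposition name follow the order of the axes tuple.

```python
def name_for(axes, filters_by_axis):
    """Hypothetical helper: one letter per axis, in the order given by `axes`."""
    return "".join(filters_by_axis[ax] for ax in axes)


# Assumed example: low-pass along axis 0, high-pass along axes 1 and 2.
filters_by_axis = {0: "L", 1: "H", 2: "H"}

axes_list = [2, 1, 0]  # a list keeps the order it was written in
axes_set = {2, 1, 0}   # a set gives no guarantee about iteration order

label_from_list = name_for(tuple(axes_list), filters_by_axis)  # always "HHL"
label_from_set = name_for(tuple(axes_set), filters_by_axis)    # depends on the interpreter

print(label_from_list, label_from_set)
```

In CPython, a set of small integers usually iterates in ascending order, so the second label tends to come out mirrored, which would be consistent with the HHL/LHH swap reported above.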

Issue 2:
When I fixed the above issue, I started getting the same wavelet feature values as v1.2 for some of the images but not others. And this time the discrepancies were random (and significant).

Again looking at the getWaveletImage function in imageoperations.py in v1.3, it seems that images are padded if their dimensions are not divisible by 2. This seems to be a constraint imposed by PyWavelets, where the signal length must be divisible by 2**level (2 in this case). I don't think the same padding occurs in v1.2. Indeed, when I look at the dimensions of the images that I tested, the images that match v1.2 perfectly had even dimensions (no padding in v1.3), whereas those that did not match had at least one odd dimension (padding in v1.3). Is it safe to say that v1.3 has the most accurate feature values for the wavelet-filtered images (with the exception of the first issue)?
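The divisibility constraint can be sketched with NumPy as follows. This is an illustration only, not the pyradiomics implementation: `pad_to_multiple` is a hypothetical helper, and padding at the end of each axis with mode "wrap" is an assumption.

```python
import numpy as np


def pad_to_multiple(image, level=1, mode="wrap"):
    """Hypothetical helper: pad the end of each axis so that its length
    becomes divisible by 2**level."""
    multiple = 2 ** level
    # (-size) % multiple is the number of samples missing to the next multiple.
    pad_widths = [(0, (-size) % multiple) for size in image.shape]
    return np.pad(image, pad_widths, mode=mode)


image = np.arange(3 * 4 * 5).reshape(3, 4, 5)  # two odd dimensions, one even
padded = pad_to_multiple(image, level=1)

print(image.shape, "->", padded.shape)  # (3, 4, 5) -> (4, 4, 6)
```

Only the odd dimensions grow, which matches the observation that images with even dimensions are unaffected.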

question

sandfis

👍1

All 11 comments

@sandfis thank you for investigating this!

Two more related commits that happened between 1.2 and 1.3:

https://github.com/Radiomics/pyradiomics/commit/7ff05482d3615e26ff7439e8f9044aefcba50a9a (BUG: Fixed resizing for wavelet filtering. Resizing does not scramble image anymore.)
https://github.com/Radiomics/pyradiomics/commit/67845cfe7434e42fc4dccd4fb15f99961001c874 (ENH: changed pad mode from constant to wrap)

cc: @michaelschwier

fedorov on 21 Jun 2018

@sandfis about your notes:

issue 1: your fix makes sense to me. Let's wait to hear from @JoostJM, but I think it would be great if you could submit a PR with the fix.
issue 2: I think this commit that I referenced earlier is probably to blame: https://github.com/Radiomics/pyradiomics/commit/67845cfe7434e42fc4dccd4fb15f99961001c874. I would not say one is more accurate than the other, it's just that the conventions changed, and we failed to communicate or even document this change. I think we should amend the release information to let users know that wavelet features will not be the same as in 1.2 for the images with the size not divisible by 2.

fedorov on 21 Jun 2018

Actually issue 2 is most definitely related to 7ff05482d3615e26ff7439e8f9044aefcba50a9a as well. Padding for images with odd dimensions was not correct before!

michaelschwier on 21 Jun 2018

Great, thanks @fedorov! I'll wait for @JoostJM to weigh in too

sandfis on 21 Jun 2018

I have to say, if indeed this is the source of the discrepancy, it is remarkable how much difference that change in padding strategy introduced.

@sandfis do you think you could continue your investigations, and see if reverting the commit referenced above would make the values match? This would be super helpful.

fedorov on 21 Jun 2018

@fedorov @sandfis Sorry, I should have explained this in a little more detail before, because it is not obvious from the code change alone unless you know what NumPy's resize does: the way padding was done before (with resize) literally scrambled the image! The resize would add zeros to the end of the array/matrix buffer (not to the end of each dimension). See example 3 "Enlarging an array" here: https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.ndarray.resize.html#numpy.ndarray.resize

Hence the big difference: wavelet computation for odd-dimension images was unfortunately completely wrong before 7ff05482d3615e26ff7439e8f9044aefcba50a9a.
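A small NumPy sketch of the difference described here, using a toy 3x3 array rather than the actual pyradiomics code path: an in-place resize extends the flat buffer with zeros, while np.pad appends to each row.

```python
import numpy as np

image = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

# In-place resize: zeros are appended to the end of the flat buffer, which is
# then reinterpreted with the new shape, so rows after the first are shifted.
resized = image.copy()
resized.resize((3, 4), refcheck=False)

# Padding: one zero is appended to the end of each row; pixels stay in place.
padded = np.pad(image, ((0, 0), (0, 1)), mode="constant")

print(resized)
# [[1 2 3 4]
#  [5 6 7 8]
#  [9 0 0 0]]
print(padded)
# [[1 2 3 0]
#  [4 5 6 0]
#  [7 8 9 0]]
```

After the resize, the value 4 has moved from the start of the second row to the end of the first, which is the scrambling effect; with padding, every original pixel keeps its index.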

michaelschwier on 21 Jun 2018

👍1

@michaelschwier thanks for clarifying this, indeed I completely missed it in https://github.com/Radiomics/pyradiomics/commit/7ff05482d3615e26ff7439e8f9044aefcba50a9a. I only looked at the second relevant commit, and thought that the change was in switching the pad mode from constant to wrap. Indeed, now that you explain it, it was completely wrong. So I think this should be listed in the "Bug fixes" section of https://github.com/Radiomics/pyradiomics/releases/tag/1.3.0.

Now, what do we do about all those papers that managed to develop novel radiomics signatures predicting disease and eradicating cancer based on the wavelet features as implemented in v1.2.0? 🤣

fedorov on 21 Jun 2018

Thanks @michaelschwier
Indeed we can see that the features line up with v1.2 when reverting 7ff0548

sandfis on 21 Jun 2018

🎉2 👍1

Sorry for the late reply!

I agree with both @fedorov and @michaelschwier, although I think this should be documented in the upcoming release, since this change is in the current master and was made after the release of v1.3.0.

Am I correct to assume v1.3 here means the current master?

Additionally, regarding issue 1: a set is indeed incorrect here. However, changing to a tuple does not work, as you cannot delete elements from a tuple, which is necessary when extracting features in 2D (in which case the wavelet is also calculated in 2D, and the between-plane axis therefore has to be removed).

I fixed this issue by using a list and list.remove(); I will push this bugfix to the master shortly.
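A minimal sketch of what a list-based version could look like. This is an illustration, not the actual commit: `wavelet_axes` is a hypothetical helper, and only the parameter names force2D and force2Ddimension are taken from the snippet quoted in the issue.

```python
def wavelet_axes(force2D=False, force2Ddimension=0):
    """Hypothetical helper: return the axes to filter, in a fixed order."""
    axes = [2, 1, 0]  # a list preserves this order, unlike a set
    if force2D:
        axes.remove(force2Ddimension)  # drop the between-plane axis for 2D
    return tuple(axes)


print(wavelet_axes())                                  # (2, 1, 0)
print(wavelet_axes(force2D=True, force2Ddimension=0))  # (2, 1)
print(wavelet_axes(force2D=True, force2Ddimension=2))  # (1, 0)
```

Because the list is converted to a tuple only after the optional removal, the remaining axes keep their relative order in both the 3D and the 2D case.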