Numpy: '<' not supported between instances of 'int' and 'str' Type Error attributed to a print statement

Created on 8 Nov 2018  ·  4 Comments  ·  Source: numpy/numpy

I am quite positive that the following is a bug, but please enlighten me if it isn't. A solution would be very much appreciated.

I am doing a standard one-hot encoding with scikit-learn, which uses NumPy in the process. Everything works when I leave the print options at their defaults, but when I call numpy.set_printoptions(threshold='nan') to print the whole one-hot encoded array (instead of NumPy's summarized form), I get the error in the issue title. Here is the code and the corresponding traceback:

import numpy
from numpy import array
from numpy import argmax
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder

data = "+++++++++QVQLVQSGGGVVQPGRSLRLSCAASGFTFSSHRMHWVRQAPGKGLEWVAAVSNDGSNEYYADSVKGRFTISRDKSTSTLYLQMDSLRPEDTAVYYCARERCVSSSCWARALDYWGQGSLVTVCS++++++++++"
seq_string = list(data)
print(seq_string)
values = array(seq_string)
print(values)
label_encoder = LabelEncoder()
integer_encoded = label_encoder.fit_transform(values)
print(integer_encoded)
onehot_encoder = OneHotEncoder(sparse=False)
integer_encoded = integer_encoded.reshape(len(integer_encoded), 1)
onehot_encoded = onehot_encoder.fit_transform(integer_encoded)
numpy.set_printoptions(threshold='nan')
print(onehot_encoded)
inverted = label_encoder.inverse_transform([argmax(onehot_encoded[1, :])])
print(inverted)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-43683b44d2e3> in <module>()
     24 onehot_encoded = onehot_encoder.fit_transform(integer_encoded)
     25 numpy.set_printoptions(threshold='nan')
---> 26 print(onehot_encoded)
     27 # invert first example
     28 inverted = label_encoder.inverse_transform([argmax(onehot_encoded[1, :])])

/d/harpy1/s/python/v3-5.1.0/lib/python3.6/site-packages/numpy/core/arrayprint.py in array_str(a, max_line_width, precision, suppress_small)
   1400         return str(a[()])
   1401 
-> 1402     return array2string(a, max_line_width, precision, suppress_small, ' ', "")
   1403 
   1404 def set_string_function(f, repr=True):

/d/harpy1/s/python/v3-5.1.0/lib/python3.6/site-packages/numpy/core/arrayprint.py in array2string(a, max_line_width, precision, suppress_small, separator, prefix, style, formatter, threshold, edgeitems, sign, floatmode, suffix, **kwarg)
    620         return "[]"
    621 
--> 622     return _array2string(a, options, separator, prefix)
    623 
    624 

/d/harpy1/s/python/v3-5.1.0/lib/python3.6/site-packages/numpy/core/arrayprint.py in wrapper(self, *args, **kwargs)
    420             repr_running.add(key)
    421             try:
--> 422                 return f(self, *args, **kwargs)
    423             finally:
    424                 repr_running.discard(key)

/d/harpy1/s/python/v3-5.1.0/lib/python3.6/site-packages/numpy/core/arrayprint.py in _array2string(a, options, separator, prefix)
    435     data = asarray(a)
    436 
--> 437     if a.size > options['threshold']:
    438         summary_insert = "..."
    439         data = _leading_trailing(data, options['edgeitems'])

TypeError: '>' not supported between instances of 'int' and 'str'
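
The last frame fails on an ordinary Python comparison, so the error can be reproduced without NumPy or scikit-learn. A minimal sketch, where `size` and `threshold` are stand-in names for `a.size` and `options['threshold']`, and 150 is an arbitrary illustrative value:

```python
size = 150          # stand-in for a.size (arbitrary illustrative value)
threshold = 'nan'   # the string that was passed to set_printoptions

try:
    size > threshold          # Python 3 refuses to order an int against a str
except TypeError as exc:
    message = str(exc)

print(message)
# '>' not supported between instances of 'int' and 'str'
```

Python 2 allowed this comparison (numbers always ordered before strings), which is why passing the string 'nan' appeared to work there.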

All 4 comments

Try threshold=sys.maxsize instead, threshold is documented as an int.
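
A minimal sketch of that suggestion, assuming NumPy 1.15 or later (for the `numpy.printoptions` context manager) and using a made-up 2000-element array in place of the one-hot matrix from the report:

```python
import sys

import numpy as np

a = np.arange(2000)  # illustrative array, larger than the default threshold of 1000

# With default options, large arrays are summarized with "...".
summarized = np.array2string(a)

# threshold is an element count, so pass an int; sys.maxsize
# effectively turns summarization off.
with np.printoptions(threshold=sys.maxsize):
    full = np.array2string(a)

print("..." in summarized)  # True
print("..." in full)        # False
```

Calling numpy.set_printoptions(threshold=sys.maxsize) has the same effect globally; the context manager only limits the change to one block.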

Perhaps we should start throwing an exception in 1.16 when people pass the string "nan", to prepare them for Python 3?
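
For illustration only, such a check could look roughly like the sketch below. `check_threshold` is a hypothetical helper, not NumPy's actual implementation, and the error message is invented:

```python
import numbers


def check_threshold(threshold):
    """Hypothetical validator: accept only integer thresholds."""
    if not isinstance(threshold, numbers.Integral):
        raise TypeError(
            "threshold must be an integer, got {!r}; "
            "use sys.maxsize to print arrays in full".format(threshold)
        )
    return threshold


check_threshold(1000)       # accepted
try:
    check_threshold('nan')  # rejected up front, before any printing happens
except TypeError as exc:
    rejected = str(exc)
```

Failing at the point where the option is set would put the error on the set_printoptions call rather than on a later print statement.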

Unfortunately, Stack Overflow recommends passing 'nan'.

Thank you! That worked perfectly.
Indeed, the problem arose from following the Stack Overflow recommendation mentioned above.
