Caffe: How to classify 1-channel inputs with the python wrapper (LeNet deploy)

Created on 28 May 2014  ·  37 Comments  ·  Source: BVLC/caffe

Greetings,

I'm evaluating Caffe for a commercial application. I have compiled Caffe and pycaffe and matcaffe and everything appears to be good: the installation passed all tests that are run using make runtest.

Now I want to use the python wrapper to do a simple image classification. I am trying to use a trained mnist network to do the classification (staying away from the imagenet example due to the commercial restriction associated with obtaining the pretrained network).

For my lenet_deploy.prototxt I changed the very beginning and end of the example lenet_test.prototxt. At the beginning of lenet_deploy.prototxt, before the first convolutional layer, I have the following for my non-RGB single test image of size [28,28]:

name: "LeNet-deploy"
input: "data"
input_dim: 1
input_dim: 1
input_dim: 28
input_dim: 28

At the end of lenet_deploy.prototxt I have:

layers {
name: "prob"
type: SOFTMAX
bottom: "ip2"
top: "prob"
}

For the trained model I simply used the lenet_iter_10000 result, which seems very good, and renamed it to lenet_pretrained. Now I want to classify my non-RGB image of size [28 28] by running the python wrapper with the options listed below:

python classify.py
--model_def='/usr/local/caffe/examples/mnist/lenet_deploy.prototxt' --pretrained_model='/usr/local/caffe/examples/mnist/lenet_pretrained'
--gpu
--center_only
--images_dim='28,28'
--mean_file=''
/usr/local/caffe/examples/images/cat2.jpg
/usr/local/caffe/examples/mnist/looker

Launching the python script results in the network being read in, but it fails in pycaffe.py line 66 in _Net_forward:

self.blobs[in_].data[...] = blob
ValueError: could not broadcast input array from shape (1,3,28,28) into shape (1,1,28,28)

I interpret this error to mean that the code (or the pretrained network?) is expecting a 3-channel image (RGB or the like), but it is only seeing a single-channel image. If that is the right interpretation, how do I change this? I've spent a lot of time looking at the various posts on this site, but I haven't been able to get past this point. My apologies if I'm missing something super obvious here.

Many thanks for your time.

question

Most helpful comment

I stumbled upon the same problem and solved it without changing the files cited above. In my case, the image was actually saved with 3 channels, but all of them had the same pixel values. So I just took one of the channels and reshaped it, like the code below:

In [51]: img = caffe.io.load_image('image7.png', color=False)

In [52]: img.shape
Out[52]: (20, 20, 3)

In [53]: grayimg = img[:,:,0]

In [54]: grayimg.shape
Out[54]: (20, 20)

In [55]: gi = np.reshape(grayimg, (20,20,1))

In [56]: gi.shape
Out[56]: (20, 20, 1)

Without reshaping the grayscale image, the predict function of classifier.Classifier (this line) was raising an error because the index 2 in inputs[0].shape[2] didn't exist.

All 37 comments

I interpret this error to mean that the code (or the pretrained network?) is expecting a 3 channel image (RGB or the like), but it is only seeing a single channel image.

The reverse, actually: the Net is expecting an input blob of 1 x 28 x 28 images (1 channel, 28 pixels high, 28 pixels wide). The included caffe.io.load_image() always loads images with three channels (https://github.com/BVLC/caffe/blob/master/python/caffe/io.py#L8-L23), but you could load the image yourself in classify.py or turn off the if img.ndim == 2 greyscale check in io.py accordingly.

The code automatically checks whatever the network architecture expects as input.
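For anyone who lands here later, here is a minimal sketch of doing exactly that: loading the image yourself and feeding a 1 x 1 x 28 x 28 blob straight to the deploy net, bypassing classify.py. The file names are the ones from the question, digit.png is a placeholder input, and the three-argument caffe.Net(model, weights, caffe.TEST) constructor is the newer pycaffe signature, so adjust for your Caffe version:

import numpy as np
import caffe
from skimage import io as skio
from skimage import img_as_float

# Load the trained LeNet with the 1-channel deploy definition from the question.
net = caffe.Net('lenet_deploy.prototxt', 'lenet_pretrained', caffe.TEST)

# Read the digit as a single-channel float image in [0, 1]
# (as_grey was renamed as_gray in later skimage releases).
img = img_as_float(skio.imread('digit.png', as_grey=True)).astype(np.float32)

# H x W  ->  1 x 1 x H x W, matching the (1, 1, 28, 28) input blob.
net.blobs['data'].data[...] = img[np.newaxis, np.newaxis, :, :]
out = net.forward()
print(out['prob'].argmax())   # index of the predicted digit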

Got it. Thanks so much! I'm up and running now.

Hi shelhamer,
would you kindly tell me all the places I should modify to make predictions on 1-channel input? I have turned off the if img.ndim == 2 greyscale check in io.py, but Python still complains about channel mismatch problems.

Hi,
I am using C++ and encountered the same type of problem - a mismatch in the number of channels.
I want to classify two images at a time and modified the MNIST LeNet model:
1) replaced the input type: DATA layer with a type: IMAGE_DATA layer, with "source" pointing to an image_list.txt containing a list of 28 x 28 grayscale (1-channel) images.
2) replaced the top layer with a SOFTMAX layer.

In the main() routine, after setting mode to CPU I have:

Net<float> caffe_test_net(argv[1]);
caffe_test_net.CopyTrainedLayersFrom(argv[2]);

where argv[1] is the net prototxt and argv[2] is the pre-trained data file of LeNet MNIST ( lenet_iter_10000).

When running it I get an error:
....
... 6068 net.cpp:316] Ignoring source layer mnist
... 6068 net.cpp:319] Copying source layer conv1
... 6068 net.cpp:326] Check failed: target_blobs[j]->channels() == source_layer.blobs(j).channels() (3 vs. 1)

Single-stepping in the debugger, I see that the error occurs when stepping over the line
caffe_test_net.CopyTrainedLayersFrom(argv[2]);

I interpret this error as an indication that the first conv layer expects 3-channel data, while the pre-trained model provides data for only 1 channel (which is how it was trained and what I need).

Any advice?

Thanks
Shaile.

Caffe takes your input as a color image by default; you should set an images_in_color field in your predict prototxt file. Please refer to caffe.proto or #538 for details.

For those who run across this in the future, I got the network changed to 1 channel with the following diff:

git diff
diff --git a/python/caffe/io.py b/python/caffe/io.py
index aabcfdd..d1fcf73 100644
--- a/python/caffe/io.py
+++ b/python/caffe/io.py
@@ -24,7 +24,8 @@ def load_image(filename, color=True):
     if img.ndim == 2:
         img = img[:, :, np.newaxis]
         if color:
-            img = np.tile(img, (1, 1, 3))
+            img = np.tile(img, (1, 1, 1))
+        print('changed')
     elif img.shape[2] == 4:
         img = img[:, :, :3]
     return img
diff --git a/python/caffe/pycaffe.py b/python/caffe/pycaffe.py
index 31dc1f9..dbdcab7 100644
--- a/python/caffe/pycaffe.py
+++ b/python/caffe/pycaffe.py
@@ -295,7 +295,7 @@ def _Net_preprocess(self, input_name, input_):
     mean = self.mean.get(input_name)
     input_scale = self.input_scale.get(input_name)
     raw_scale = self.raw_scale.get(input_name)
-    channel_order = self.channel_swap.get(input_name)
+    channel_order = (0,)#self.channel_swap.get(input_name)
     in_size = self.blobs[input_name].data.shape[2:]
     if caffe_in.shape[:2] != in_size:
         caffe_in = caffe.io.resize_image(caffe_in, in_size)
@@ -305,7 +305,7 @@ def _Net_preprocess(self, input_name, input_):
     if raw_scale is not None:
         caffe_in *= raw_scale
     if mean is not None:
-        caffe_in -= mean
+        caffe_in -= mean[0,:,:]
     if input_scale is not None:
         caffe_in *= input_scale
     return caffe_in

Hi Russell91, your post saved me a lot, thanks!

I stumbled upon the same problem and solved it without changing the files cited above. In my case, the image was actually saved with 3 channels, but all of them had the same pixel values. So I just took one of the channels and reshaped it, like the code below:

In [51]: img = caffe.io.load_image('image7.png', color=False)

In [52]: img.shape
Out[52]: (20, 20, 3)

In [53]: grayimg = img[:,:,0]

In [54]: grayimg.shape
Out[54]: (20, 20)

In [55]: gi = np.reshape(grayimg, (20,20,1))

In [56]: gi.shape
Out[56]: (20, 20, 1)

Without reshaping the grayscale image, the predict function of classifier.Classifier (this line) was raising an error because the index 2 in inputs[0].shape[2] didn't exist.

@boechat107 have you fixed the problem? I used the same strategy but encountered the same problem.

Really, @RiweiChen? Are you sure you also made the changes to lenet_deploy.prototxt (the input dims), as suggested by @pjstimac?

In my case, the function was expecting to find a third number in gi.shape, which is why my variable grayimg didn't work.

Hi @boechat107, yes, I'm sure. I trained a network using grayscale images, and the input image is also a grayscale image. I followed the same strategy as you and also got the IndexError: tuple index out of range at inputs[0].shape[2], so I changed it to inputs[0].shape[1], but then I got another error at input_[ix] = caffe.io.resize_image(in_, self.image_dims):
a ValueError: could not broadcast input array from shape (32,32) into shape (32,32,1). 32 is the size of my input image.

Can you tell me what I can do next to fix this problem?

Thanks.

chen

Implement my solution above and you will be fine. Boechat's solution only works if your test images are RGB.

Hi @Russell91, actually I also tried your method before and recompiled the corresponding modified *.py code to .pyc; however, I still got the same error at the same place:
ValueError: could not broadcast input array from shape (32,32) into shape (32,32,1)

What can I do to fix this problem?

Did you remember to add the line: img = np.tile(img, (1, 1, 1))... That will change the array shape.
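For reference, a quick shape check of what those lines actually do (plain NumPy, nothing Caffe-specific): the np.newaxis line is the one that adds the channel dimension, a (1, 1, 1) tile then leaves the array unchanged, and the original (1, 1, 3) tile is what produced the unwanted three channels:

import numpy as np

img = np.zeros((32, 32))          # grayscale image loaded as H x W
img = img[:, :, np.newaxis]       # -> (32, 32, 1): the channel axis is added here
same = np.tile(img, (1, 1, 1))    # -> (32, 32, 1): a 1-1-1 tile changes nothing
color = np.tile(img, (1, 1, 3))   # -> (32, 32, 3): the original line that forced 3 channels
print(same.shape, color.shape)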

@Russell91 Yes, I am sure of this.
Here is the load_image def in io.py:
def load_image(filename, color=True):
    """
    Load an image converting from grayscale or alpha as needed.

    filename: string
    color: flag for color format. True (default) loads as RGB while False
        loads as intensity (if image is already grayscale).

    Give
    image: an image with type np.float32 in range [0, 1]
        of size (H x W x 3) in RGB or
        of size (H x W x 1) in grayscale.
    """
    img = skimage.img_as_float(skimage.io.imread(filename)).astype(np.float32)

    if img.ndim == 2:
        img = img[:, :, np.newaxis]
        if color:
            img = np.tile(img, (1, 1, 1))   # change 1 here
        print('changed.')
    elif img.shape[2] == 4:
        img = img[:, :, :3]
    """
    if img.ndim == 2:
        img = img[:, :, np.newaxis]
    else:
        img = img[:, :, :3]
    """
    return img

If I do not change inputs[0].shape[2] to inputs[0].shape[1], I get the
IndexError: tuple index out of range at inputs[0].shape[2]

I got the same problem because I forgot these changes in pycaffe.py:

-    channel_order = self.channel_swap.get(input_name)
+    channel_order = (0,)#self.channel_swap.get(input_name)

-    channel_order = self.channel_swap.get(input_name)
+    channel_order = (0,)#self.channel_swap.get(input_name)

These fixed my problem.

If you have trained a model on 1-channel grayscale images and want to classify another grayscale image, the following hack worked for me:

  1. Make a copy of the official classify.py in $CAFFE_ROOT/python/classify.py.
  2. Specify input_dim as 1, 1, x, x in deploy.prototxt.
  3. Change all calls to caffe.io.load_image(fname) in classify.py to caffe.io.load_image(fname, False). If you do not pass the second parameter as False, True is used by default; it tells load_image whether to load the image as color or grayscale. In color, the returned image has shape (width, height, 3) or (width, height, 4), depending on whether an alpha channel exists. With False, the shape is (width, height, 1), as you want.
  4. Specify --channel_swap '0' when running classify.py. This value reorders RGB to BGR: say we have an image im in numpy array format with im.shape = (10, 10, 3); Caffe then does im = im[:, :, channel_swap] to swap the channels. If you do not specify --channel_swap, it defaults to "2,1,0", so Caffe runs im = im[:, :, [2, 1, 0]]. But the grayscale image's shape is really (10, 10, 1) (if you followed the steps above), so an index-out-of-bounds exception is raised. Just specify '0' for --channel_swap and Caffe will run im = im[:, :, [0]], which is fine (see the short snippet below).

Then just use the official classify.py.
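To illustrate step 4, a small NumPy-only snippet showing why the default channel_swap fails on an (H, W, 1) grayscale array while '0' works (the 10 x 10 shapes are the ones from the explanation above):

import numpy as np

color = np.random.rand(10, 10, 3)    # H x W x 3 color image
print(color[:, :, [2, 1, 0]].shape)  # (10, 10, 3): the default "2,1,0" RGB -> BGR swap

gray = np.random.rand(10, 10, 1)     # H x W x 1, as returned by load_image(fname, False)
print(gray[:, :, [0]].shape)         # (10, 10, 1): works with --channel_swap '0'
# gray[:, :, [2, 1, 0]]              # IndexError: index 2 is out of bounds for axis 2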

Here are the gists of classify.py and test.sh that worked for me.
classify.py: https://gist.github.com/uronce-cc/869afe1bd85e79dda111
test.sh: https://gist.github.com/uronce-cc/e834e9cd2a0a62ceb5d5

Hope it will work for you too.

@shelhamer What about adding an option to classify.py like --color, with a value of True or False, passing the value to caffe.io.load_image and, if False, forcing channel_swap to be [0]?
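A rough sketch of what such an option could look like, kept separate from the stock classify.py; the --color flag and its parsing below are illustrative only, not an existing Caffe option:

import argparse
import caffe

parser = argparse.ArgumentParser()
parser.add_argument("input_file")
parser.add_argument("--channel_swap", default='2,1,0')
# Proposed flag: load inputs as single-channel grayscale when set to False.
parser.add_argument("--color", default='True',
                    help="Set to False to load inputs as grayscale.")
args = parser.parse_args()

color = args.color.lower() not in ('false', '0', 'no')
channel_swap = [int(s) for s in args.channel_swap.split(',')] if color else [0]
inputs = [caffe.io.load_image(args.input_file, color=color)]
print(len(inputs), inputs[0].shape, channel_swap)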

Hello, I have tried @uronce-cc's method, but it doesn't work. The funny thing is that my mnist network works for images in png format, but I still get an error when the image is in jpg format.

When I try @Russell91's method, I find that the new version of Caffe has changed the pycaffe.py file a lot, so I don't know how to modify it.

The error with a jpg image is "ValueError: could not broadcast input array from shape (1,3,28,28) into shape (1,1,28,28)".

Can someone help me solve it? Thanks in advance !

Hi,

I had the same problem. I followed what Russell91 (https://github.com/Russell91) commented on Sep 30, 2014 (https://github.com/BVLC/caffe/issues/462#issuecomment-57393056); this fixed the problem. Modify the pycaffe.py file.


@ToruHironaka Thanks for your response! But the current version of 'pycaffe.py' is quite different from the one mentioned by @Russell91, so I don't know how to modify it.

@wang4249, the problem seems clear from the error message. Your jpg image has been loaded with 3 channels. When I'm coding in Python, I like to be sure of the structure of my data by using the debugger:

# CODE to load an image file...
import pdb; pdb.set_trace()

If you are using caffe.io.load_image, you can check the structure of your data by

img = caffe.io.load_image(filename)
img.shape

For your jpg image, the returned shape is probably (28, 28, 3), while your network is expecting (28, 28, 1). In this case, use the solution that I proposed in a previous comment.

I didn't look at the code of caffe.io.load_image, but it's a good idea to always check the shape of your image data. Perhaps, for a grayscale image, you get an array with shape (28, 28). In that case, you just need to reshape your data to (28, 28, 1).
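As a concrete check in the interpreter (plain NumPy; the (28, 28) shape is just the MNIST case discussed here):

import numpy as np

img = np.zeros((28, 28), dtype=np.float32)   # grayscale image that came back as H x W
print(img.shape)                             # (28, 28): no channel axis, predict() will choke
img = img[:, :, np.newaxis]                  # same as np.reshape(img, (28, 28, 1))
print(img.shape)                             # (28, 28, 1): what classifier.Classifier expects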

@boechat107 Thank you very much for your response. I am a beginner with Python; actually, I don't know how to write Python code, I can only modify some. What I am doing now is trying to figure out whether the mnist network works by modifying the classify.py file. I know it's bad to put my code here, but I have spent two days trying to find the error and haven't made any progress.

#!/usr/bin/env python
"""
classify.py is an out-of-the-box image classifer callable from the command line.
By default it configures and runs the Caffe reference ImageNet model.
"""
import numpy as np
import pandas as pd
import os
import sys
import argparse
import glob
import time
from skimage.color import rgb2gray
import caffe


def main(argv):
    pycaffe_dir = os.path.dirname(__file__)

    parser = argparse.ArgumentParser()
    # Required arguments: input and output files.
    parser.add_argument(
        "input_file",
        help="Input image, directory, or npy."
    )
    parser.add_argument(
        "output_file",
        help="Output npy filename."
    )
    # Optional arguments.
    parser.add_argument(
        "--model_def",
        default=os.path.join(pycaffe_dir,
                "../models/bvlc_reference_caffenet/deploy.prototxt"),
        help="Model definition file."
    )
    parser.add_argument(
        "--pretrained_model",
        default=os.path.join(pycaffe_dir,
                "../models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel"),
        help="Trained model weights file."
    )
    parser.add_argument(
        "--gpu",
        action='store_true',
        help="Switch for gpu computation."
    )
    parser.add_argument(
        "--center_only",
        action='store_true',
        help="Switch for prediction from center crop alone instead of " +
             "averaging predictions across crops (default)."
    )
    parser.add_argument(
        "--images_dim",
        default='256,256',
        help="Canonical 'height,width' dimensions of input images."
    )
    parser.add_argument(
        "--mean_file",
        default=os.path.join(pycaffe_dir,
                             'caffe/imagenet/ilsvrc_2012_mean.npy'),
        help="Data set image mean of [Channels x Height x Width] dimensions " +
             "(numpy array). Set to '' for no mean subtraction."
    )
    parser.add_argument(
        "--input_scale",
        type=float,
        help="Multiply input features by this scale to finish preprocessing."
    )
    parser.add_argument(
        "--raw_scale",
        type=float,
        default=255.0,
        help="Multiply raw input by this scale before preprocessing."
    )
    parser.add_argument(
        "--channel_swap",
        default='2,1,0',
        help="Order to permute input channels. The default converts " +
             "RGB -> BGR since BGR is the Caffe default by way of OpenCV."
    )
    parser.add_argument(
        "--ext",
        default='jpg',
        help="Image file extension to take as input when a directory " +
             "is given as the input file."
    )
    parser.add_argument(
        "--labels_file",
        default=os.path.join(pycaffe_dir,
                "../data/ilsvrc12/synset_words.txt"),
        help="Readable label definition file."
    )

    parser.add_argument(
        "--print_results",
        action='store_true',
        help="Write output text to stdout rather than serializing to a file."
    )

    parser.add_argument(
        "--force_grayscale",
        action='store_true',
        help="Converts RGB images down to single-channel grayscale versions," +
             "useful for single-channel networks like MNIST."
    )

    args = parser.parse_args()

    image_dims = [int(s) for s in args.images_dim.split(',')]

    mean, channel_swap = None, None
    if args.force_grayscale:
        channel_swap = None
        mean_file = None
    else:
        mean = np.load(args.mean_file).mean(1).mean(1)
        channel_swap = [int(s) for s in args.channel_swap.split(',')]

    # Make classifier.
    classifier = caffe.Classifier(args.model_def, args.pretrained_model,
            image_dims=image_dims, mean=mean,
            input_scale=args.input_scale, raw_scale=args.raw_scale,
            channel_swap=channel_swap)

    if args.gpu:
        caffe.set_mode_gpu()
        print('GPU mode')

    # Load numpy array (.npy), directory glob (*.jpg), or image file.
    args.input_file = os.path.expanduser(args.input_file)
    if args.input_file.endswith('npy'):
        inputs = np.load(args.input_file)
    elif os.path.isdir(args.input_file):
        inputs =[caffe.io.load_image(im_f)
                 for im_f in glob.glob(args.input_file + '/*.' + args.ext)]
    else:
        inputs = [caffe.io.load_image(args.input_file)]

    # if args.force_grayscale:
    #   inputs = [rgb2gray(input) for input in inputs];


    print("Classifying %d inputs." % len(inputs))

    # Classify.
    start = time.time()
    scores = classifier.predict(inputs, not args.center_only).flatten()
    print("Done in %.2f s." % (time.time() - start))

    if args.print_results:
        with open(args.labels_file) as f:
          labels_df = pd.DataFrame([
               {
                   'synset_id': l.strip().split(' ')[0],
                   'name': ' '.join(l.strip().split(' ')[1:]).split(',')[0]
               }
               for l in f.readlines()
            ])
        labels = labels_df.sort('synset_id')['name'].values

        indices = (-scores).argsort()[:5]
        predictions = labels[indices]

        meta = [
                   (p, '%.5f' % scores[i])
                   for i, p in zip(indices, predictions)
               ]

        print meta



    # Save
    np.save(args.output_file, predictions)


if __name__ == '__main__':
    main(sys.argv)

Thanks very much for your help.

@wang4249, as you said you are learning Python, here are some suggestions:

  • Play more with the Python REPL (I recommend using IPython). Try loading some images with caffe.io.load_image and creating some arrays with numpy. In summary, play with each "piece of code" separately.
  • Use pdb (as described in my previous comment) to check the state of your program. Just put that line somewhere in your code (a place where you would like the program to stop) and see what happens.
  • Test a simpler Caffe example, like the ImageNet classifier. It's not hard to modify it for the mnist problem.

@wang4249 What error did you encounter when trying my modification?

@shelhamer
What about adding as_grey=not color to the image loading function?

img = skimage.img_as_float(skimage.io.imread(filename, as_grey=not color)).astype(np.float32)

One way to get the error "IndexError: tuple index out of range" at

python/caffe/classifier.py, line 63, in predict
self.image_dims[0], self.image_dims[1], inputs[0].shape[2]),

is to pass a single image to net.predict(image) instead of a list of images.
Note that you should do it this way:

image_list = [caffe.io.load_image(image_path, False)]
features = net.predict(image_list)

Maybe it will be helpful for someone :)

I have some grayscale images and I am using the gray flag to make the db and mean files, so the current number of channels is 1. I want to import my dataset in groups of 20 images as 20 channels. How can I change the number-of-channels variable from 1 to 20 in the train_val.prototxt file? Should I change the Caffe files?
I would be grateful if someone can help.

@fahimeh62 I am trying to do something similar to what you are trying to do. I found https://github.com/BVLC/caffe/issues/1494; this information might help. I am trying to test K=4 channels (RGB with alpha) in order to see how to increase the number of channels, but I am not successful yet.

@ToruHironaka Thanks for your prompt answer. I had a look at the page you suggested. Could you tell me where I should add these lines?

import numpy as np
import caffe
import lmdb

inputs = np.zeros((10, 5, 227, 227))

in_db = lmdb.open('input-lmdb', map_size=int(1e12))
with in_db.begin(write=True) as in_txn:
    for in_idx, in_ in enumerate(inputs):
        in_dat = caffe.io.array_to_datum(in_)
        in_txn.put('{:0>10d}'.format(in_idx), in_dat.SerializeToString())
in_db.close()

in convert_imageset.cpp or train_val.prototxt or somewhere else?
Thanks.

@ToruHironaka I am also curious how you change your channels to 4 and then generate your LMDB with the 4-channel data.
Can you explain it to me?
Thanks a lot.

@fahimeh62 The script is written in Python, so you have to write your own Python script. I added an alpha channel to my 3-channel (RGB) png files using ImageMagick. Then I converted the RGBA png files into lmdb. However, it seems that my alpha channel is automatically ignored, or my Python script does not properly convert my 4-channel image files into lmdb. I am still working on this.

I am trying to train Caffe using images which have 8 channels, and I am also facing problems creating the LMDB.
@ToruHironaka Were you able to convert your 4-channel images into lmdb?

@fahimeh62 & @mtrth The code above converts image files into lmdb, but I am still working on the multi-channel part.

@mtrth I think you can just run build/tools/convert_imageset.bin to convert 4-channel images into lmdb (my images are RGBA, 4 channels; I think you are trying to combine 8 image files into 1 sample by increasing the number of channels, and I am trying to do that too), or use the Python code above. It should work. I am still learning and trying to convert multi-channel images into lmdb. I think Caffe seems to accept up to 10 channels, but I am not sure.
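Along those lines, here is one possible sketch of packing several aligned grayscale files into a single K-channel datum via the Python lmdb route, in the spirit of the snippet quoted earlier. It is untested against any particular Caffe version, and the file names and lmdb path are made up for illustration:

import numpy as np
import lmdb
import caffe
from skimage import io as skio
from skimage import img_as_float

# Eight aligned grayscale images that should become one 8-channel sample.
files = ['band_%d.png' % i for i in range(8)]
imgs = [img_as_float(skio.imread(f, as_grey=True)) for f in files]
sample = np.asarray(imgs, dtype=float)       # shape (8, H, W): channels x height x width

db = lmdb.open('multichannel-lmdb', map_size=int(1e12))
with db.begin(write=True) as txn:
    datum = caffe.io.array_to_datum(sample)  # Datum stores data as C x H x W
    txn.put('{:0>10d}'.format(0), datum.SerializeToString())
db.close()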

I modified the Python script mentioned in https://github.com/BVLC/caffe/issues/1494. When I test it on an image by following the command mentioned in https://github.com/BVLC/caffe/blob/master/examples/detection.ipynb:

./python/detect.py --crop_mode=selective_search --pretrained_model=./examples/trial/trial_iter_10000.caffemodel --model_def=./examples/trial/trial.prototxt --gpu --raw_scale=255 _temp/det_input.txt _temp/det_output.h5

I get the error:
Traceback (most recent call last):
  File "./python/detect.py", line 173, in <module>
    main(sys.argv)
  File "./python/detect.py", line 121, in main
    context_pad=args.context_pad)
  File "/home/revathy/caffe-master/python/caffe/detector.py", line 46, in __init__
    self.transformer.set_mean(in_, mean)
  File "/home/revathy/caffe-master/python/caffe/io.py", line 246, in set_mean
    raise ValueError('Mean channels incompatible with input.')
ValueError: Mean channels incompatible with input.

Did anyone face similar issues?

I fixed that error; python/detect.py was taking 'caffe/imagenet/ilsvrc_2012_mean.npy' as the default mean file, and it's for 3-channel images, so I created a mean file for my 8-channel images using a Python script and used that.
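For reference, a sketch of how such a mean file might be computed directly from an existing lmdb; the lmdb name and output file names below are placeholders, and caffe.io.array_to_blobproto is used for the binaryproto variant:

import numpy as np
import lmdb
import caffe
from caffe.proto import caffe_pb2

db = lmdb.open('multichannel-lmdb', readonly=True)
mean, count = None, 0
with db.begin() as txn:
    for _, value in txn.cursor():
        datum = caffe_pb2.Datum()
        datum.ParseFromString(value)
        arr = caffe.io.datum_to_array(datum).astype(np.float64)   # C x H x W
        mean = arr if mean is None else mean + arr
        count += 1
db.close()
mean /= count

np.save('mean_8ch.npy', mean)   # e.g. for a --mean_file style option in the Python tools
# For the C++ tools, write a binaryproto instead:
blob = caffe.io.array_to_blobproto(mean[np.newaxis, :, :, :])     # 1 x C x H x W
with open('mean_8ch.binaryproto', 'wb') as f:
    f.write(blob.SerializeToString())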
