Caffe: ImageDataLayer enhancement: images as labels

Created on 21 Mar 2016  ·  3 Comments  ·  Source: BVLC/caffe

Can we edit the implementation of the ImageData Layer?
Currently it takes one image and its integer label as input. But in use cases like segmentation (SegNet), motion flow (FlowNet), and super-resolution (SRCNN), the target is also an image. To handle this today, we have to use two ImageData layers and give each one fake labels.
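As a rough sketch of that workaround: the ImageData layer reads a text file with one `path label` pair per line, so each of the two layers needs its own list with a placeholder label. The helper below is illustrative only. The function name, the file names, and the placeholder value 0 are assumptions, not part of Caffe. It writes both lists in the same order so that inputs and targets stay aligned, which also assumes shuffling is turned off in both layers.

```python
from pathlib import Path


def write_paired_lists(pairs, input_list, target_list, fake_label=0):
    """Write two ImageData-style list files from (input, target) path pairs.

    Each output line is "<image path> <integer label>". The label is a
    placeholder the network never uses; it is only there because the
    layer expects one on every line.
    """
    input_lines = [f"{src} {fake_label}" for src, _ in pairs]
    target_lines = [f"{dst} {fake_label}" for _, dst in pairs]
    # Same ordering in both files keeps line i of one matched to line i of the other.
    Path(input_list).write_text("\n".join(input_lines) + "\n")
    Path(target_list).write_text("\n".join(target_lines) + "\n")
    return input_lines, target_lines
```

For example, `write_paired_lists([("img/0001.png", "mask/0001.png")], "inputs.txt", "targets.txt")` would produce one list per layer, and the two dummy label tops would then be left unused in the network.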

Instead, can we add a parameter to the layer that controls this? For example:

LabelFormat := 0 -> label is an integer (default)
LabelFormat := 1 -> label is also an image

If the idea is acceptable, then I can do the implementation.

enhancement

All 3 comments

@seanbell Can I go ahead with the implementation?

Making the label optional seems fine to me. I think it would be more intuitive to use an enum, rather than a numeric constant that has to be documented. For example:

label_type: INTEGER (for the current setup)
label_type: IMAGE (to allow for multiple images)

If you are interested in further expanding the layer, you could make the field repeated to allow for multiple images or a combination of images and integer labels.

For example, if you have one image label and one integer label (e.g. a siamese network), then you could specify:

label_type: IMAGE
label_type: IMAGE
label_type: INTEGER

which would map to 3 top blobs. To maintain backwards compatibility, you could default to [IMAGE, INTEGER] if label_type is not specified.
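To make the repeated-field idea concrete, here is a small sketch of how one line of the image list might be split into one value per top blob. This is not Caffe code. The `LabelType` enum, the `parse_list_line` helper, and the assumption that each line holds whitespace-separated fields in the same order as the `label_type` entries are all hypothetical.

```python
from enum import Enum


class LabelType(Enum):
    INTEGER = "INTEGER"  # field is an integer class label
    IMAGE = "IMAGE"      # field is a path to an image file


# Backwards-compatible default: one image followed by one integer label.
DEFAULT_LABEL_TYPES = (LabelType.IMAGE, LabelType.INTEGER)


def parse_list_line(line, label_types=DEFAULT_LABEL_TYPES):
    """Split one list-file line into a value per top blob.

    IMAGE fields are returned as path strings (image decoding is left out);
    INTEGER fields are converted to int.
    """
    fields = line.split()
    if len(fields) != len(label_types):
        raise ValueError(
            f"expected {len(label_types)} fields, got {len(fields)}: {line!r}")
    tops = []
    for field, kind in zip(fields, label_types):
        tops.append(int(field) if kind is LabelType.INTEGER else field)
    return tops
```

With the default, a line such as `a.png 7` keeps its current meaning, while passing `[LabelType.IMAGE, LabelType.IMAGE, LabelType.INTEGER]` would read three fields from each line. Splitting on whitespace means paths containing spaces are not handled in this sketch.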

Regarding whether or not this idea will get merged, that depends on how much interest there is. Having unit tests is also important. I personally am very interested in this kind of modification (and have already modified my fork to do something similar), but it would be good to see more feedback from others.

See #2108.

Could I get your feedback, @naibaf7 @shelhamer @jeffdonahue? Thanks in advance.
