Keras: How to convert a caffe model to Keras type?

Created on 14 May 2015  ·  10 Comments  ·  Source: keras-team/keras

Hi all,

I want to use Keras to train a CNN model for classification. As far as I know, there are many public pre-trained CNN models, such as VGG and other ImageNet-trained networks. Unfortunately, these pre-trained models were produced with other CNN frameworks, such as Caffe or cuda-convnet. How can we use these pre-trained models or weights to initialize a Keras Sequential model and then fine-tune it?

Thanks
Dr. Wu

stale

All 10 comments

I also hope to see a tutorial to illustrate this.

We are working on it. This is definitely among the features we want to add soon. https://github.com/fchollet/keras/issues/100

If anybody is still interested, you can use this fork of Keras, which has a conversion module:
https://github.com/MarcBS/keras

Best,
Marc

Hi,
I was trying the Caffe-to-Keras conversion module that you mentioned, but I'm getting this error when I run caffe2keras.py:

global name 'network_input' is not defined

Any help?

@dhruvjain if you are referring to this fork then open a new issue here, please.

Furthermore, it would be very helpful if you could include the model files that you are trying to convert, or at least the .prototxt.

I used it to convert the CaffeNet model. The prototxt is this one:
name: "CaffeNet"

force_backward: true
input: "data"
input_dim: 1
input_dim: 3
input_dim: 227
input_dim: 227

input: "label"
input_dim: 1
input_dim: 1
input_dim: 1
input_dim: 1

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  inner_product_param {
    num_output: 4096
  }
}

layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}

layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}

layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  inner_product_param {
    num_output: 1000
  }
}
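
For anyone rebuilding this network by hand, it can help to trace the spatial size through the conv and pooling layers of the prototxt above, since that determines how many inputs fc6 expects. The following is only a rough sketch: it assumes Caffe's usual output-size rules (floor for convolution, ceil for pooling), and the helper names and the LAYERS table are made up for illustration, with kernel, stride, and pad values copied from the prototxt.

```python
import math

def conv_out(size, kernel, stride=1, pad=0):
    # Convolution output size, rounding down (assumed Caffe behaviour).
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride=1, pad=0):
    # Pooling output size, rounding up (assumed Caffe behaviour).
    return math.ceil((size + 2 * pad - kernel) / stride) + 1

# (layer name, size rule, parameters) taken from the prototxt above.
LAYERS = [
    ("conv1", conv_out, dict(kernel=11, stride=4)),
    ("pool1", pool_out, dict(kernel=3, stride=2)),
    ("conv2", conv_out, dict(kernel=5, pad=2)),
    ("pool2", pool_out, dict(kernel=3, stride=2)),
    ("conv3", conv_out, dict(kernel=3, pad=1)),
    ("conv4", conv_out, dict(kernel=3, pad=1)),
    ("conv5", conv_out, dict(kernel=3, pad=1)),
    ("pool5", pool_out, dict(kernel=3, stride=2)),
]

def trace(size, layers):
    # Return [(name, spatial size after that layer), ...].
    sizes = []
    for name, rule, params in layers:
        size = rule(size, **params)
        sizes.append((name, size))
    return sizes

sizes = trace(227, LAYERS)
for name, size in sizes:
    print(name, size)

# conv5 has 256 output channels, so fc6 sees 256 * side * side inputs.
side = sizes[-1][1]
fc6_inputs = 256 * side * side
print("fc6 inputs:", fc6_inputs)  # 9216
```

With a 227x227 input the sizes come out as 55, 27, 27, 13, 13, 13, 13, 6, so a hand-written model should feed 9216 values into fc6 if it matches the original.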

Really an important feature...

This Caffe-to-Keras weight converter is what you are looking for:

https://github.com/pierluigiferrari/caffe_weight_converter

It converts .caffemodel files to .h5 weight files. It converts weights only, not the model definition, but the weights are really all you need anyway.

For any given model, the model definition either requires only Keras core library layers, in which case it's super easy to write in Keras manually, or the model definition is complex and has custom layer types, in which case a model definition converter would probably fail anyway.
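
If you do write the model by hand, the weight arrays usually need their axes reordered before they fit the Keras layers. Below is a minimal NumPy sketch, not the converter's actual code. It assumes Caffe stores convolution weights as (out_channels, in_channels, kernel_h, kernel_w) and InnerProduct weights as (out_features, in_features), and that the Keras model is channels-last with kernels shaped (kernel_h, kernel_w, in_channels, out_channels) and Dense kernels shaped (in_features, out_features). The function names are hypothetical. Grouped convolutions (group: 2) and any backend-specific kernel flipping are not handled here.

```python
import numpy as np

def caffe_conv_to_keras(w):
    # (out, in, kh, kw) -> (kh, kw, in, out)
    return np.transpose(w, (2, 3, 1, 0))

def caffe_fc_to_keras(w):
    # (out_features, in_features) -> (in_features, out_features)
    return np.transpose(w, (1, 0))

def caffe_first_fc_to_keras(w, channels, height, width):
    # The first fully connected layer after the conv stack needs extra care:
    # Caffe flattens feature maps in (C, H, W) order, while a channels-last
    # Keras Flatten produces (H, W, C) order, so the input axis is permuted.
    out_features = w.shape[0]
    w = w.reshape(out_features, channels, height, width)
    w = np.transpose(w, (2, 3, 1, 0))  # (H, W, C, out)
    return w.reshape(height * width * channels, out_features)

# Shapes only; the arrays here are placeholders, not real weights.
conv1 = np.zeros((96, 3, 11, 11), dtype=np.float32)
print(caffe_conv_to_keras(conv1).shape)  # (11, 11, 3, 96)

small_fc = np.zeros((4, 2 * 3 * 3), dtype=np.float32)
print(caffe_first_fc_to_keras(small_fc, 2, 3, 3).shape)  # (18, 4)
```

It is worth comparing the converted model's output with the Caffe output on one image before fine-tuning, because a wrong axis order often fails silently.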

What input shape does the converted Keras model expect: (3, 224, 224) or (224, 224, 3)? Caffe works with channels-first input, (3, 224, 224), while Keras, with its default channels-last setting, works with (224, 224, 3)...
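
Since the converter mentioned above handles weights only, the expected input shape presumably depends on how you define the Keras model yourself. Either way, moving data between the two layouts is a single transpose. Here is a small NumPy sketch with hypothetical helper names. The BGR helper is there because Caffe reference models are often trained on BGR images; check whether that applies to your model.

```python
import numpy as np

def chw_to_hwc(img):
    # Single image: (channels, height, width) -> (height, width, channels)
    return np.transpose(img, (1, 2, 0))

def nchw_to_nhwc(batch):
    # Batch: (n, channels, height, width) -> (n, height, width, channels)
    return np.transpose(batch, (0, 2, 3, 1))

def rgb_to_bgr(img_hwc):
    # Reverse the channel axis of a channels-last image.
    return img_hwc[..., ::-1]

caffe_style = np.zeros((3, 224, 224), dtype=np.float32)
keras_style = chw_to_hwc(caffe_style)
print(keras_style.shape)  # (224, 224, 3)
```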
