Keras: How to convert a caffe model to Keras type?

Created on 14 May 2015 · 10 Comments · Source: keras-team/keras

Hi all,

I want to use Keras to train a CNN model for classification. As far as I know, there are many public pre-trained CNN models, such as VGG and other ImageNet-trained networks. Unfortunately, these pre-trained models were generated with other CNN frameworks, like Caffe or cuda-convnet. How can we use these pre-trained models or weights to initialize a Keras Sequential model and then fine-tune it?

Thanks
Dr. Wu

fuzhangwu

👍12

All 10 comments

I also hope to see a tutorial to illustrate this.

willard-yuan on 14 May 2015

We are working on it. This is definitely among the features we want to add soon. https://github.com/fchollet/keras/issues/100

fchollet on 14 May 2015

👍9

https://github.com/fchollet/keras/pull/368

pranv on 14 Jul 2015

If anybody is still interested, you can use this fork of Keras, which has a conversion module:
https://github.com/MarcBS/keras

Best,
Marc

MarcBS on 10 Feb 2016

👍5 😕3

Hi,
I was trying the Caffe-to-Keras conversion module that you mentioned, but I'm getting this error when I run caffe2keras.py:

global name 'network_input' is not defined

Any help?

dhruvjain on 6 Jul 2016

@dhruvjain if you are referring to this fork then open a new issue here, please.

Furthermore, it would be very helpful if you could include the model files that you are trying to convert, or at least the .prototxt.

MarcBS on 6 Jul 2016

I used it to convert the CaffeNet model. The prototxt is this one:
name: "CaffeNet"

force_backward: true
input: "data"
input_dim: 1
input_dim: 3
input_dim: 227
input_dim: 227

input: "label"
input_dim: 1
input_dim: 1
input_dim: 1
input_dim: 1

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 96
    kernel_size: 11
    stride: 4
  }
}

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm1"
  type: "LRN"
  bottom: "pool1"
  top: "norm1"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv2"
  type: "Convolution"
  bottom: "norm1"
  top: "conv2"
  convolution_param {
    num_output: 256
    pad: 2
    kernel_size: 5
    group: 2
  }
}
layer {
  name: "relu2"
  type: "ReLU"
  bottom: "conv2"
  top: "conv2"
}
layer {
  name: "pool2"
  type: "Pooling"
  bottom: "conv2"
  top: "pool2"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "norm2"
  type: "LRN"
  bottom: "pool2"
  top: "norm2"
  lrn_param {
    local_size: 5
    alpha: 0.0001
    beta: 0.75
  }
}
layer {
  name: "conv3"
  type: "Convolution"
  bottom: "norm2"
  top: "conv3"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
  }
}
layer {
  name: "relu3"
  type: "ReLU"
  bottom: "conv3"
  top: "conv3"
}
layer {
  name: "conv4"
  type: "Convolution"
  bottom: "conv3"
  top: "conv4"
  convolution_param {
    num_output: 384
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu4"
  type: "ReLU"
  bottom: "conv4"
  top: "conv4"
}
layer {
  name: "conv5"
  type: "Convolution"
  bottom: "conv4"
  top: "conv5"
  convolution_param {
    num_output: 256
    pad: 1
    kernel_size: 3
    group: 2
  }
}
layer {
  name: "relu5"
  type: "ReLU"
  bottom: "conv5"
  top: "conv5"
}
layer {
  name: "pool5"
  type: "Pooling"
  bottom: "conv5"
  top: "pool5"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}
layer {
  name: "fc6"
  type: "InnerProduct"
  bottom: "pool5"
  top: "fc6"
  inner_product_param {
    num_output: 4096
  }
}
layer {
  name: "relu6"
  type: "ReLU"
  bottom: "fc6"
  top: "fc6"
}
layer {
  name: "drop6"
  type: "Dropout"
  bottom: "fc6"
  top: "fc6"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc7"
  type: "InnerProduct"
  bottom: "fc6"
  top: "fc7"
  inner_product_param {
    num_output: 4096
  }
}

layer {
  name: "relu7"
  type: "ReLU"
  bottom: "fc7"
  top: "fc7"
}

layer {
  name: "drop7"
  type: "Dropout"
  bottom: "fc7"
  top: "fc7"
  dropout_param {
    dropout_ratio: 0.5
  }
}

layer {
  name: "fc8"
  type: "InnerProduct"
  bottom: "fc7"
  top: "fc8"
  inner_product_param {
    num_output: 1000
  }
}

dhruvjain on 6 Jul 2016
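
When rewriting a network like the one in this prototxt by hand, it can help to check the spatial size each layer produces, since the first fully connected layer must match the flattened pool5 output. The following is only a sketch: the helper names (out_size, trace_shapes, LAYERS) are made up for illustration, and it assumes Caffe's usual rounding (floor for convolution, ceil for pooling). LRN, ReLU and Dropout layers are left out because they do not change the spatial size.

```python
import math


def out_size(size, kernel, stride=1, pad=0, ceil_mode=False):
    """Spatial output size of a convolution (floor) or Caffe-style pooling (ceil)."""
    span = (size + 2 * pad - kernel) / stride
    return (math.ceil(span) if ceil_mode else math.floor(span)) + 1


# (name, kernel_size, stride, pad, is_pooling), copied from the prototxt.
# stride defaults to 1 and pad to 0 where the prototxt omits them.
LAYERS = [
    ("conv1", 11, 4, 0, False),
    ("pool1", 3, 2, 0, True),
    ("conv2", 5, 1, 2, False),
    ("pool2", 3, 2, 0, True),
    ("conv3", 3, 1, 1, False),
    ("conv4", 3, 1, 1, False),
    ("conv5", 3, 1, 1, False),
    ("pool5", 3, 2, 0, True),
]


def trace_shapes(input_size=227):
    """Return the height/width after each layer for a square input."""
    sizes = {}
    size = input_size
    for name, kernel, stride, pad, is_pool in LAYERS:
        size = out_size(size, kernel, stride, pad, ceil_mode=is_pool)
        sizes[name] = size
    return sizes


sizes = trace_shapes()
# conv5 has num_output: 256, so fc6 sees 256 * H * W inputs.
fc6_inputs = 256 * sizes["pool5"] ** 2
print(sizes)
print("fc6 inputs:", fc6_inputs)
```

If a hand-written Keras model reports different sizes for the same layers, the padding or pooling settings are the first thing to compare against the prototxt.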

Really an important feature...

iamtechaddict on 14 Sep 2017

This Caffe-to-Keras weight converter is what you are looking for:

https://github.com/pierluigiferrari/caffe_weight_converter

It converts .caffemodel files to .h5 weight files. It converts weights only, not the model definition, but the weights are really all you need anyway.

For any given model, the model definition either requires only Keras core library layers, in which case it's super easy to write in Keras manually, or the model definition is complex and has custom layer types, in which case a model definition converter would probably fail anyway.

pierluigiferrari on 26 Jan 2018

👍7
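
For readers wondering what converting "weights only" involves, here is a rough NumPy sketch of the axis reordering. It is not taken from the linked converter, and the function names are hypothetical. It assumes Caffe stores convolution kernels as (out_channels, in_channels, height, width) and InnerProduct weights as (out, in), while a Keras model using channels_last expects (height, width, in_channels, out_channels) and (in, out).

```python
import numpy as np


def caffe_conv_to_keras(weights):
    """Reorder (out, in, kh, kw) Caffe kernels to (kh, kw, in, out)."""
    return np.transpose(weights, (2, 3, 1, 0))


def caffe_dense_to_keras(weights):
    """Reorder (out, in) InnerProduct weights to (in, out)."""
    return weights.T


# Placeholder arrays with the shapes of conv1 and fc8 in CaffeNet.
# Real values would come from the .caffemodel file.
conv1_caffe = np.zeros((96, 3, 11, 11), dtype=np.float32)
fc8_caffe = np.zeros((1000, 4096), dtype=np.float32)

conv1_keras = caffe_conv_to_keras(conv1_caffe)
fc8_keras = caffe_dense_to_keras(fc8_caffe)
print(conv1_keras.shape)  # (11, 11, 3, 96)
print(fc8_keras.shape)    # (4096, 1000)
```

Two cases likely need more than a plain transpose: convolutions with group: 2, whose Caffe kernels hold only half of the input channels per filter, and the first fully connected layer after flattening, because the flattened element order depends on whether the feature map is channels-first or channels-last.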

What input shape does the converted Keras model expect: (3,224,224) or (224,224,3)? I ask because Caffe works with (3,224,224) and Keras with (224,224,3)...
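
Which layout the model expects depends on how the Keras model was defined and on the Keras image data format setting, so the thread cannot settle it in general. In Keras 2 the active convention can be checked with keras.backend.image_data_format(), which returns either "channels_last" or "channels_first". Moving an image between the two layouts is a single transpose, as in this NumPy sketch (the helper names are hypothetical):

```python
import numpy as np


def hwc_to_chw(image):
    """(height, width, channels) -> (channels, height, width)."""
    return np.transpose(image, (2, 0, 1))


def chw_to_hwc(image):
    """(channels, height, width) -> (height, width, channels)."""
    return np.transpose(image, (1, 2, 0))


# Dummy 224x224 RGB image in channels-last layout.
image_hwc = np.arange(224 * 224 * 3, dtype=np.float32).reshape(224, 224, 3)
image_chw = hwc_to_chw(image_hwc)
print(image_hwc.shape, "->", image_chw.shape)  # (224, 224, 3) -> (3, 224, 224)

# The round trip returns the original array.
assert np.array_equal(chw_to_hwc(image_chw), image_hwc)
```

Many Caffe models were also trained on BGR rather than RGB images, so the channel order may need reversing as well (for a channels-last array, image_hwc[..., ::-1]); whether that applies should be checked against the preprocessing of the specific model.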