Hi all,
I want to use Keras to train a CNN model for classification. As far as I know, there are many public pre-trained CNN models, such as VGG and other networks trained on ImageNet. Unfortunately, these pre-trained models were produced with other CNN frameworks, such as Caffe or cuda-convnet. How can we use these pre-trained models or weights to initialize a Keras Sequential model and then fine-tune it?
Thanks
Dr. Wu
I also hope to see a tutorial to illustrate this.
We are working on it. This is definitely among the features we want to add soon. https://github.com/fchollet/keras/issues/100
If anybody is still interested, you can use this fork of Keras, which has a conversion module:
https://github.com/MarcBS/keras
Best,
Marc
Hi,
I was trying the Caffe-to-Keras conversion module that you mentioned, but I'm getting this error when I run caffe2keras.py:
global name 'network_input' is not defined
Any help?
I used it to convert the CaffeNet model; the prototxt is this one:
name: "CaffeNet"
force_backward: true
input: "data"
input_dim: 1
input_dim: 3
input_dim: 227
input_dim: 227
input: "label"
input_dim: 1
input_dim: 1
input_dim: 1
input_dim: 1
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm1"
type: "LRN"
bottom: "pool1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "norm1"
top: "conv2"
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm2"
type: "LRN"
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
inner_product_param {
num_output: 4096
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
inner_product_param {
num_output: 4096
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc8"
type: "InnerProduct"
bottom: "fc7"
top: "fc8"
inner_product_param {
num_output: 1000
}
}
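As a sanity check when rewriting this prototxt by hand in Keras, the spatial size after each layer can be traced with a few lines of arithmetic. This is only a sketch: the helper names (`conv_out`, `pool_out`, `trace_sizes`) are made up for illustration, and it assumes Caffe's usual rounding (convolution rounds down, pooling rounds up) and square inputs. ReLU, LRN and Dropout layers are left out because they do not change the spatial size.

```python
import math

def conv_out(size, kernel, stride=1, pad=0):
    # Caffe convolution rounds the output size down.
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride=1, pad=0):
    # Caffe pooling rounds the output size up (no padding is used here,
    # so the extra clipping Caffe applies to padded pooling is not needed).
    return math.ceil((size + 2 * pad - kernel) / stride) + 1

# (name, kind, kernel_size, stride, pad) as listed in the prototxt above;
# stride falls back to 1 and pad to 0 where the prototxt omits them.
LAYERS = [
    ("conv1", "conv", 11, 4, 0),
    ("pool1", "pool", 3, 2, 0),
    ("conv2", "conv", 5, 1, 2),
    ("pool2", "pool", 3, 2, 0),
    ("conv3", "conv", 3, 1, 1),
    ("conv4", "conv", 3, 1, 1),
    ("conv5", "conv", 3, 1, 1),
    ("pool5", "pool", 3, 2, 0),
]

def trace_sizes(input_size=227):
    """Return the spatial size (height == width) after each layer."""
    sizes = {}
    size = input_size
    for name, kind, kernel, stride, pad in LAYERS:
        fn = conv_out if kind == "conv" else pool_out
        size = fn(size, kernel, stride, pad)
        sizes[name] = size
    return sizes

sizes = trace_sizes()
# conv5 has 256 output channels, so fc6 sees 256 * 6 * 6 inputs.
fc6_inputs = 256 * sizes["pool5"] ** 2

print(sizes["conv1"], sizes["pool1"], sizes["pool2"], sizes["pool5"])  # 55 27 13 6
print(fc6_inputs)  # 9216
```

If a hand-written Keras model reports different sizes at any of these layers, the padding or pooling mode is a likely place to look first.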
Really an important feature...
This Caffe-to-Keras weight converter is what you are looking for:
https://github.com/pierluigiferrari/caffe_weight_converter
It converts .caffemodel files to .h5 weight files. It converts weights only, not the model definition, but the weights are really all you need anyway.
For any given model, the model definition either requires only Keras core library layers, in which case it's super easy to write in Keras manually, or the model definition is complex and has custom layer types, in which case a model definition converter would probably fail anyway.
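To make "weights only" a bit more concrete, here is a rough sketch of the axis reordering involved in moving weights from Caffe to Keras. It is not taken from the converter linked above, and the helper names are invented for illustration. It assumes Caffe's convolution blob layout (out, in / group, height, width) and the TensorFlow-style Keras Conv2D kernel layout (height, width, in, out); older Keras versions on the Theano backend used a different layout. Only shapes are handled here, to keep the example dependency-free.

```python
def caffe_conv_shape(num_output, in_channels, kernel_size, group=1):
    # Caffe stores convolution weights as (out, in / group, height, width).
    return (num_output, in_channels // group, kernel_size, kernel_size)

def permute(shape, axes):
    # Reorder a shape tuple the same way a transpose would reorder the array.
    return tuple(shape[a] for a in axes)

def to_keras_conv(shape):
    # (out, in, height, width) -> (height, width, in, out)
    return permute(shape, (2, 3, 1, 0))

def to_keras_dense(shape):
    # Caffe InnerProduct weights are (out, in); Keras Dense kernels are (in, out).
    return permute(shape, (1, 0))

# Shapes for a CaffeNet-like network: 3 input channels, 96 filters of 11x11,
# then 256 filters of 5x5 split into 2 groups over 96 input channels.
conv1 = caffe_conv_shape(96, 3, 11)
conv2 = caffe_conv_shape(256, 96, 5, group=2)

print(to_keras_conv(conv1))          # (11, 11, 3, 96)
print(conv2)                         # (256, 48, 5, 5)
print(to_keras_dense((4096, 9216)))  # (9216, 4096)
```

Two spots tend to need extra care. Grouped convolutions (`group: 2`) give each filter only half of the input channels, so they do not map onto a single ordinary convolution kernel. The first fully connected layer after flattening may also need its input rows reordered, because the flattened feature order depends on whether the data is laid out channels-first or channels-last.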
What input shape does the converted Keras model expect: (3, 224, 224) or (224, 224, 3)? I ask because Caffe works with (3, 224, 224) and Keras with (224, 224, 3).
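The expected shape most likely depends on how the Keras model itself is defined and on the image data format Keras is configured with ("channels_first" or "channels_last"), not on the weight file. Whichever one applies, moving a single image between the two layouts is just an axis reordering. The sketch below uses plain nested lists so it runs without any dependencies; the function names are made up for illustration. With NumPy arrays the same thing is usually written as `numpy.transpose(image, (1, 2, 0))`.

```python
def chw_to_hwc(image):
    """Convert a (channels, height, width) nested list to (height, width, channels)."""
    channels = len(image)
    height = len(image[0])
    width = len(image[0][0])
    return [
        [[image[c][y][x] for c in range(channels)] for x in range(width)]
        for y in range(height)
    ]

def shape_of(nested):
    # Walk down the first element of each level to read off the shape.
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)

# A tiny 3-channel, 2x4 "image" whose values encode their own position.
chw = [[[c * 100 + y * 10 + x for x in range(4)] for y in range(2)]
       for c in range(3)]
hwc = chw_to_hwc(chw)

print(shape_of(chw))  # (3, 2, 4)
print(shape_of(hwc))  # (2, 4, 3)
print(hwc[1][3][2] == chw[2][1][3])  # True
```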