I am trying to train YOLO on my own dataset. I was able to overfit the model on 10 images with 2 objects, and inference works well.
Now I am trying to train on the full dataset of 6000 images, and I get an error.
The command I use to train:
./flow --train --dataset /home/ubuntu/datasets/img/ --annotation /home/ubuntu/datasets/anno/ --model cfg/yolo-17c.cfg --load bin/yolo.weights --keep 5 --epoch 30000 --save 1000 --lr 0.00001 --batch 16 --gpu .8
The actual error:
Caused by op 'Reshape', defined at:
File "./flow", line 44, in <module>
tfnet = TFNet(FLAGS)
File "/home/ubuntu/darkflow/net/build.py", line 63, in __init__
self.setup_meta_ops()
File "/home/ubuntu/darkflow/net/build.py", line 106, in setup_meta_ops
if self.FLAGS.train: self.build_train_op()
File "/home/ubuntu/darkflow/net/help.py", line 15, in build_train_op
self.framework.loss(self.out)
File "/home/ubuntu/darkflow/net/yolov2/train.py", line 56, in loss
net_out_reshape = tf.reshape(net_out, [-1, H, W, B, (4 + 1 + C)])
File "/home/ubuntu/.conda/envs/py3/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2630, in reshape
name=name)
File "/home/ubuntu/.conda/envs/py3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
op_def=op_def)
File "/home/ubuntu/.conda/envs/py3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2327, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/ubuntu/.conda/envs/py3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1226, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 1149200 values, but the requested shape requires a multiple of 18590
[[Node: Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/gpu:0"](output, Reshape/shape)]]
[[Node: mul_30/_195 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_188_mul_30", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
My labels.txt file has 17 labels.
The changes I made to yolo-17c.cfg are: filters=125 for the last layer (I also tried 425 and 35), and classes=17, based on the number of classes.
I realize that the number 18590 is associated with width=416, height=416. When I change them to width=224, height=224, I get:
...but the requested shape requires a multiple of 21560
@thtrieu, @Dhruv-Mohan, @abagshaw do you have any idea why I am getting this error?
Thank you for the help.
You need to set filters=110 for the last convolutional layer.
It needs to follow this formula:
filters = num * (classes + 5)
With num=5 and classes=17, that gives 5 * (17 + 5) = 110.
[convolutional]
size=1
stride=1
pad=1
filters=110
activation=linear
[region]
anchors = 0.738768,0.874946, 2.42204,2.65704, 4.30971,7.04493, 10.246,4.59428, 12.6868,11.8741
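The formula can be checked against the numbers in the error message. Below is a minimal sketch in Python. It assumes a 13x13 output grid for width=416, height=416 (the thread does not state the grid size, but it matches the reported 18590) and takes num=5 from the five anchor pairs above. The function names are illustrative and are not part of darkflow.

```python
def last_layer_filters(num_anchors, num_classes):
    # Each anchor predicts 4 box coordinates + 1 objectness score + one score per class.
    return num_anchors * (num_classes + 5)


def expected_values_per_image(grid_h, grid_w, num_anchors, num_classes):
    # Size of one image's slice of [-1, H, W, B, (4 + 1 + C)] in the reshape.
    return grid_h * grid_w * last_layer_filters(num_anchors, num_classes)


# Values from the thread: 17 classes, 5 anchors, batch size 16.
# The 13x13 grid is an assumption for a 416x416 input.
filters = last_layer_filters(num_anchors=5, num_classes=17)
required_multiple = expected_values_per_image(13, 13, 5, 17)

# Work backwards from the tensor size in the error to the filter count
# the network actually produced.
actual_values = 1149200
actual_filters = actual_values // 16 // (13 * 13)

print(filters)            # 110
print(required_multiple)  # 18590
print(actual_filters)     # 425
```

Under these assumptions, the network in the failing run was emitting 425 channels per grid cell while the loss expected 110, which is why the reshape was rejected.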
Hope that helps
@1NNcoder Could you tell us whether the issue is resolved? If it is, you can close it and clear the space for other people :)
Yes, this works.