Darkflow: TensorFlow tensor reshape error

Created on 8 Apr 2017  ·  3 Comments  ·  Source: thtrieu/darkflow

I am trying to train YOLO on my own dataset. As a sanity check, I successfully overfit the model on 10 images with 2 objects, and inference works well.

Now I am trying to train on the full dataset of 6000 images, and I get an error.
The command I use to train:
./flow --train --dataset /home/ubuntu/datasets/img/ --annotation /home/ubuntu/datasets/anno/ --model cfg/yolo-17c.cfg --load bin/yolo.weights --keep 5 --epoch 30000 --save 1000 --lr 0.00001 --batch 16 --gpu .8

The actual error:

Caused by op 'Reshape', defined at:
File "./flow", line 44, in <module>
tfnet = TFNet(FLAGS)
File "/home/ubuntu/darkflow/net/build.py", line 63, in __init__
self.setup_meta_ops()
File "/home/ubuntu/darkflow/net/build.py", line 106, in setup_meta_ops
if self.FLAGS.train: self.build_train_op()
File "/home/ubuntu/darkflow/net/help.py", line 15, in build_train_op
self.framework.loss(self.out)
File "/home/ubuntu/darkflow/net/yolov2/train.py", line 56, in loss
net_out_reshape = tf.reshape(net_out, [-1, H, W, B, (4 + 1 + C)])
File "/home/ubuntu/.conda/envs/py3/lib/python3.5/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2630, in reshape
name=name)
File "/home/ubuntu/.conda/envs/py3/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 763, in apply_op
op_def=op_def)
File "/home/ubuntu/.conda/envs/py3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2327, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/ubuntu/.conda/envs/py3/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1226, in __init__
self._traceback = _extract_stack()
InvalidArgumentError (see above for traceback): Input to reshape is a tensor with 1149200 values, but the requested shape requires a multiple of 18590
[[Node: Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/gpu:0"](output, Reshape/shape)]]
[[Node: mul_30/_195 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_188_mul_30", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

My labels.txt file has 17 labels.
The changes I made to the yolo-17c.cfg are:
filters=125 for the last layer (I also tried 425 and 35)
classes=17, based on the number of classes

I realize that the number 18590 is tied to width=416, height=416. When I change them to width=224, height=224, I get:
...but the requested shape requires a multiple of 21560
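
As a rough check of where the numbers in the first error message come from: the reshape in the traceback targets [-1, H, W, B, (4 + 1 + C)], so the tensor size must be a multiple of H * W * B * (5 + C). The sketch below assumes a 13x13 output grid for the 416x416 input and B = 5 anchors; neither value is printed in the log, so treat both as assumptions.

```python
# Sketch only: grid size and anchor count are assumptions, the rest is
# taken from the training command and the error message in this issue.
H = W = 13   # assumed output grid for a 416x416 input (416 / 32)
B = 5        # assumed number of anchors (num) in the [region] section
C = 17       # number of classes in labels.txt
BATCH = 16   # --batch 16 from the training command


def required_multiple(h, w, b, c):
    """Values per image implied by reshaping to [-1, h, w, b, 4 + 1 + c]."""
    return h * w * b * (4 + 1 + c)


total_values = 1149200                      # from the error message
per_image = required_multiple(H, W, B, C)   # what the reshape expects per image
remainder = total_values % per_image        # non-zero means the reshape cannot work
channels = total_values // (BATCH * H * W)  # channels the last layer actually emitted

print(per_image, remainder, channels)
```

Under these assumptions the expected per-image size matches the 18590 in the message, while the network output works out to 425 channels per grid cell, which points at the filters value of the last convolutional layer.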

@thtrieu, @Dhruv-Mohan, @abagshaw, do you have any idea why I am getting this error?

Thank you for the help

Most helpful comment

You need to set filters=110 for the last convolutional layer.
It must follow this formula:
filters = num * (classes + 5)
With num=5 anchors and classes=17, that gives 5 * (17 + 5) = 110.

[convolutional]
size=1
stride=1
pad=1
filters=110
activation=linear
[region]
anchors = 0.738768,0.874946, 2.42204,2.65704, 4.30971,7.04493, 10.246,4.59428, 12.6868,11.8741
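
A minimal sketch of the formula, in case it helps when editing other cfg files. The helper names region_filters and classes_for are hypothetical (they are not part of darkflow), and num=5 is assumed from the five anchor pairs in the [region] section above.

```python
def region_filters(num, classes):
    """Filters for the last conv layer before [region]: num * (classes + 5)."""
    return num * (classes + 5)


def classes_for(filters, num):
    """Invert the formula; return None if filters does not fit the given num."""
    per_anchor, rest = divmod(filters, num)
    if rest != 0 or per_anchor < 5:
        return None
    return per_anchor - 5


NUM = 5  # assumed: five anchor pairs are listed in the [region] section

# Value needed for a 17-class model.
print(region_filters(NUM, 17))

# Class counts implied by the other values tried in the question.
print([classes_for(f, NUM) for f in (125, 425, 35)])
```

With num=5, the values 125, 425 and 35 correspond to 20, 80 and 2 classes respectively, so none of them fits a 17-class labels.txt.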

Hope that helps

All 3 comments

@1NNcoder Is the issue resolved yet? If it is, please close it to clear space for other people :)

Yes, this works.
