
K.ctc_batch_cost() get slice index 0 of dimension 0 out of bounds error when using online trainning (batch_size=1) #7049

Closed
channingxiao opened this issue Jun 20, 2017 · 12 comments
Labels
To investigate Looks like a bug. It needs someone to investigate.

Comments

@channingxiao

Hello, I am using the CTC loss function in my model. Everything worked fine until I tried online training (batch_size = 1). The error is raised by the K.ctc_batch_cost function.
The error can be reproduced with the Keras example "image_ocr.py" by simply setting minibatch_size = 1 in line 446 (the parameter of TextImageGenerator).

I am using keras 2.0.2 with tensorflow 1.1.0 backend.
Thank you!
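For context, these are the shapes K.ctc_batch_cost(y_true, y_pred, input_length, label_length) expects for its four arguments, illustrated here with plain numpy arrays (the dimension values are placeholders, not from the actual example):

```python
import numpy as np

# Placeholder dimensions for illustration only
batch_size, max_time, num_classes, max_label_len = 1, 26, 37, 4

y_true       = np.zeros((batch_size, max_label_len), dtype=np.int32)    # padded labels
y_pred       = np.zeros((batch_size, max_time, num_classes))            # softmax output
input_length = np.full((batch_size, 1), max_time, dtype=np.int32)       # per-sample input steps
label_length = np.full((batch_size, 1), max_label_len, dtype=np.int32)  # per-sample label lengths

print(input_length.shape)  # (1, 1)
```

With batch_size = 1, the length tensors have shape (1, 1), which is the case that triggers the error below.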

@fchollet fchollet added the To investigate Looks like a bug. It needs someone to investigate. label Jun 20, 2017
@cyprienruffino

Hi! I am having the same problem with Keras 2.0.6 and TensorFlow 1.2.0.
Since I have to do online training, I worked around the problem by building minibatches containing the same sample twice, but I admit it is a rather dirty solution...
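For anyone needing the same stopgap, here is a minimal sketch of that workaround: duplicate the single sample along the batch axis before feeding it to the model (numpy illustration; duplicate_batch is a hypothetical helper name):

```python
import numpy as np

def duplicate_batch(x, y, input_length, label_length):
    """Turn a batch of one sample into a batch of two identical samples,
    so K.ctc_batch_cost never sees batch_size == 1."""
    rep = lambda a: np.repeat(a, 2, axis=0)
    return rep(x), rep(y), rep(input_length), rep(label_length)

x  = np.zeros((1, 26, 512))             # one input sample (placeholder shape)
y  = np.zeros((1, 4), dtype=np.int32)   # its padded label
il = np.array([[26]])
ll = np.array([[4]])

x2, y2, il2, ll2 = duplicate_batch(x, y, il, ll)
print(x2.shape)  # (2, 26, 512)
```

Since both samples are identical, the averaged loss and gradients are unchanged; the only cost is doing every forward/backward pass twice.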

@xisnu

xisnu commented Aug 8, 2017

Yes, I am also having the same problem. The problem is definitely related to the conversion of dense labels to sparse.

@Cerno-b

Cerno-b commented Sep 7, 2017

Glad someone already posted this.
The specific error message is

InvalidArgumentError (see above for traceback): slice index 0 of dimension 0 out of bounds.
[[Node: ctc/scan/strided_slice = StridedSlice[Index=DT_INT32, T=DT_INT32, begin_mask=0, ellipsis_mask=0, end_mask=0, new_axis_mask=0, shrink_axis_mask=1, _device="/job:localhost/replica:0/task:0/cpu:0"](ctc/scan/Shape, ctc/scan/strided_slice/stack, ctc/scan/strided_slice/stack_1, ctc/scan/strided_slice/stack_2)]]

Here is a somewhat minimal example that shows what's happening. It only occurs for batch_size exactly equal to one.

test_lstm.py.txt

@fchollet It's fairly urgent for me, so if you have any pointers on where I could look, I could aid in the investigation. I tried reading the corresponding TensorFlow source at tensorflow-master\tensorflow\core\util\strided_slice_op.cc, line 299, but as a TF beginner my progress has been slow so far.

@Cerno-b

Cerno-b commented Sep 7, 2017

Tried to get into TF debugging and was able to get this stack, if it helps:

Traceback of node construction:
[...]
7: test_lstm\test_lstm.py
Line: 46
Function:
Text: "loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')([y_pred, labels, input_length, label_length])"
8: Python35\lib\site-packages\keras\engine\topology.py
Line: 554
Function: call
Text: "output = self.call(inputs, **kwargs)"
9: Python35\lib\site-packages\keras\layers\core.py
Line: 659
Function: call
Text: "return self.function(inputs, **arguments)"
10: test_lstm\test_lstm.py
Line: 17
Function: ctc_lambda_func
Text: "return K.ctc_batch_cost(labels, y_pred, input_length, label_length)"
11: Python35\lib\site-packages\keras\backend\tensorflow_backend.py
Line: 3263
Function: ctc_batch_cost
Text: "sparse_labels = tf.to_int32(ctc_label_dense_to_sparse(y_true, label_length))"
12: Python35\lib\site-packages\keras\backend\tensorflow_backend.py
Line: 3222
Function: ctc_label_dense_to_sparse
Text: "initializer=init, parallel_iterations=1)"
13: Python35\lib\site-packages\tensorflow\python\ops\functional_ops.py
Line: 526
Function: scan
Text: "n = array_ops.shape(elems_flat[0])[0]"
14: Python35\lib\site-packages\tensorflow\python\ops\array_ops.py
Line: 509
Function: _SliceHelper
Text: "name=name)"
15: Python35\lib\site-packages\tensorflow\python\ops\array_ops.py
Line: 677
Function: strided_slice
Text: "shrink_axis_mask=shrink_axis_mask)"
16: Python35\lib\site-packages\tensorflow\python\ops\gen_array_ops.py
Line: 3744
Function: strided_slice
Text: "shrink_axis_mask=shrink_axis_mask, name=name)"
17: Python35\lib\site-packages\tensorflow\python\framework\op_def_library.py
Line: 767
Function: apply_op
Text: "op_def=op_def)"
18: Python35\lib\site-packages\tensorflow\python\framework\ops.py
Line: 2630
Function: create_op
Text: "original_op=self._default_original_op, op_def=op_def)"
19: Python35\lib\site-packages\tensorflow\python\framework\ops.py
Line: 1204
Function: init
Text: "self._traceback = self._graph._extract_stack() # pylint: disable=protected-access"

@WindQAQ
Contributor

WindQAQ commented Nov 21, 2017

@fchollet I think this problem can basically be solved by modifying the function ctc_batch_cost() in keras/backend/tensorflow_backend.py. Take a look at the following lines:

label_length = tf.to_int32(tf.squeeze(label_length))
input_length = tf.to_int32(tf.squeeze(input_length))

If batch_size is 1, the tensors label_length and input_length will be rank 0 after the squeeze; however, they should be rank 1 with shape (1,). At least for the TensorFlow API ctc_loss(), the sequence_length parameter must be a 1-D tensor (see the TensorFlow CTC loss documentation), so these lines break when input_length has rank 0. As for label_length, the input to TensorFlow's scan() must also be at least rank 1, which is why the error occurs in ctc_label_dense_to_sparse().

Hence, my basic solution is to squeeze only along axis 1, that is,

label_length = tf.to_int32(tf.squeeze(label_length, axis=1))
input_length = tf.to_int32(tf.squeeze(input_length, axis=1))

This yields rank 1 tensors for every batch size. I tried this solution on my machine, and it works well!
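The difference between the two squeezes can be seen with plain numpy, whose squeeze behaves like tf.squeeze here:

```python
import numpy as np

label_length = np.array([[5]])  # shape (1, 1): the batch_size == 1 case

all_axes = np.squeeze(label_length)          # squeezes every size-1 axis
only_ax1 = np.squeeze(label_length, axis=1)  # squeezes only axis 1

print(all_axes.shape)  # () -- rank 0, breaks ctc_loss() and scan()
print(only_ax1.shape)  # (1,) -- rank 1, as required
```

For any batch_size > 1 the two calls agree; only the single-sample batch exposes the difference.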

@saisumanth007

@WindQAQ @fchollet Could you please explain what input_length and label_length specify? According to the documentation, label_length contains the lengths of the ground-truth strings (in the case of OCR), but I'm not sure what input_length means.

@xisnu

xisnu commented Jan 29, 2018

You are right about label_length. input_length is the length of the input sequence. In the case of OCR it is the length of the sequence of feature vectors extracted from the input image.

@saisumanth007

@xisnu The input to the LSTM is (batch_size, 26, 512) in my case and the output is (batch_size, 26, 37). So what should input_length be?

@xisnu

xisnu commented Jan 29, 2018

Suppose you have three samples like this:
Input
a1 a2 a3
b1 b2 b3 b4
c1

Target
goat
mat
is

To feed this to an LSTM-CTC model you must pad the inputs to equal length, so they become:
Input
a1 a2 a3 PD
b1 b2 b3 b4
c1 PD PD PD

The input to the LSTM is then (3, 4, 1), but you also pass the actual input sequence lengths as an array [3 4 1], and of course the target lengths as another array [4 3 2].
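The example above can be written out in numpy as a quick sketch (PD and the sequence contents are placeholders):

```python
import numpy as np

seqs    = [[1, 2, 3], [4, 5, 6, 7], [8]]  # three inputs, lengths 3, 4, 1
targets = ["goat", "mat", "is"]           # target lengths 4, 3, 2

PD = 0                                    # padding value
max_len = max(len(s) for s in seqs)
padded = np.array([s + [PD] * (max_len - len(s)) for s in seqs])

input_length = np.array([len(s) for s in seqs])     # [3, 4, 1]
label_length = np.array([len(t) for t in targets])  # [4, 3, 2]

print(padded.shape)  # (3, 4)
```

The length arrays tell the CTC loss how much of each padded row is real data and how much is padding.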

@WindQAQ
Contributor

WindQAQ commented Jan 29, 2018

@saisumanth007 It should be the length of the inputs before padding, so it cannot be determined from the information you have given.

@saisumanth007

@xisnu @WindQAQ Suppose in OCR I have 3 images: image1, image2 and image3 with ground-truth strings "goat", "mat" and "is" respectively.
While training, I pad the labels to the max length, i.e. 4 in this case.
So label_length = [4,3,2] --> these are the lengths before padding.
Can we determine input_length in this case?

@WindQAQ
Contributor

WindQAQ commented Jan 29, 2018

@saisumanth007

Input to the LSTM is (batch_size, 26, 512 ) in my case

Basically, if you do not pad the inputs (the feature vectors of the images), input_length should be an array filled with 26. It depends on whether you pad the inputs. Maybe you can explain how you extract the feature vectors from the images so that I can help you directly.
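In other words, with a fixed 26-step feature sequence per image and no padding, input_length is simply a constant array, one entry per sample (batch size 8 below is just a placeholder):

```python
import numpy as np

batch_size, time_steps = 8, 26  # placeholder batch size; 26 steps per image

# One entry per sample; every image yields exactly 26 feature vectors.
input_length = np.full((batch_size, 1), time_steps, dtype=np.int32)

print(input_length.ravel())  # [26 26 26 26 26 26 26 26]
```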
