
Commit

Added support for CTC in both Theano and Tensorflow along with image OCR example. (#3436)

* Added CTC to Theano and Tensorflow backend along with image OCR example

* Fixed python style issues, made data files remote, and made code more idiomatic to Keras

* Fixed a couple more style issues brought up in the original PR

* Reverted wrappers.py

* Fixed potential training-on-validation issue and removed unused imports

* Fixed PEP8 issue

* Remaining PEP8 issues fixed
Mike Henry authored and fchollet committed Aug 16, 2016
1 parent 4e15513 commit e8190a8
Showing 4 changed files with 695 additions and 1 deletion.
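
This commit adds the CTC cost function used in the comment below, K.ctc_batch_cost, to both backends. A toy sketch of calling it directly; the shapes and values here are illustrative only, not taken from the changed files:

import numpy as np
from keras import backend as K

# Toy shapes: 2 samples, 10 timesteps, alphabet of 5 symbols (blank handled by the backend).
y_pred = K.variable(np.random.rand(2, 10, 5))                 # softmax outputs over time
labels = K.variable(np.array([[1, 2, 3, 0], [2, 3, 0, 0]]))   # zero-padded label sequences
input_length = K.variable(np.array([[10], [10]]))             # usable timesteps per sample
label_length = K.variable(np.array([[3], [2]]))               # true label length per sample

loss = K.ctc_batch_cost(labels, y_pred, input_length, label_length)
print(K.eval(loss).shape)  # (2, 1): one CTC loss value per sample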

1 comment on commit e8190a8

@Binteislam

Hi!
I am working on image OCR with my own dataset. I have 1000 images of variable length, and I want to feed the images in as patches of 46x1. I have generated patches of my images, and my label values are Urdu text, so I have encoded them as UTF-8. I want to implement CTC in the output layer and have tried to follow your approach, but I get the following error in my CTC implementation:

'numpy.ndarray' object has no attribute 'get_shape'

Could anyone point out my mistakes and suggest a solution? I need to solve this urgently. Please help me out!

X_train, X_test, Y_train, Y_test = train_test_split(imageList, labelList, test_size=0.3)
X_train_patches = np.array([image.extract_patches_2d(X_train[i], (46, 1)) for i in range(700)]).reshape(700, 1, 1)  # (samples, timesteps, dimensions)
X_test_patches = np.array([image.extract_patches_2d(X_test[i], (46, 1)) for i in range(300)]).reshape(300, 1, 1)

Y_train = np.array([i.encode("utf-8") for i in str(Y_train)])
Label_length = 1
input_length = 1

# #################### Loss Function ########
def ctc_lambda_func(args):
    y_pred, labels, input_length, label_length = args
    # the 2 is critical here since the first couple outputs of the RNN
    # tend to be garbage:
    y_pred = y_pred[:, 2:, :]
    return K.ctc_batch_cost(labels, y_pred, input_length, label_length)

# Building Model

model = Sequential()
model.add(LSTM(20, input_shape=(None, X_train_patches.shape[2]), return_sequences=True))
model.add(Activation('relu'))
model.add(TimeDistributed(Dense(12)))
model.add(Activation('tanh'))
model.add(LSTM(60, return_sequences=True))
model.add(Activation('relu'))
model.add(TimeDistributed(Dense(40)))
model.add(Activation('tanh'))
model.add(LSTM(100, return_sequences=True))
model.add(Activation('relu'))
loss_out = Lambda(ctc_lambda_func, name='ctc')([X_train_patches, Y_train, input_length, Label_length])
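
The 'get_shape' error most likely comes from that last line: the Lambda layer is called on NumPy arrays (X_train_patches, Y_train) and plain Python ints, but Keras layers can only be applied to Keras tensors (the TensorFlow backend calls .get_shape() on each input, which a NumPy array does not have). The usual pattern, and the one the image OCR example added by this commit follows, is to declare the labels, input lengths, and label lengths as extra Input tensors and feed the actual arrays in at fit() time. Below is a minimal sketch using the Keras-1-style functional API; the input width 46, num_classes, max_label_len, and the layer sizes are placeholders, not values taken from the dataset above:

from keras import backend as K
from keras.layers import Input, Dense, LSTM, TimeDistributed, Lambda
from keras.models import Model

max_label_len = 16   # placeholder: longest label sequence in the data
num_classes = 40     # placeholder: alphabet size + 1 for the CTC blank

def ctc_lambda_func(args):
    y_pred, labels, input_length, label_length = args
    y_pred = y_pred[:, 2:, :]  # the first couple of RNN outputs tend to be garbage
    return K.ctc_batch_cost(labels, y_pred, input_length, label_length)

# Symbolic inputs -- Keras tensors, not NumPy arrays
input_data = Input(name='the_input', shape=(None, 46))  # (timesteps, features)
labels = Input(name='the_labels', shape=(max_label_len,), dtype='float32')
input_length = Input(name='input_length', shape=(1,), dtype='int64')
label_length = Input(name='label_length', shape=(1,), dtype='int64')

inner = LSTM(100, return_sequences=True)(input_data)
y_pred = TimeDistributed(Dense(num_classes, activation='softmax'))(inner)

loss_out = Lambda(ctc_lambda_func, output_shape=(1,), name='ctc')(
    [y_pred, labels, input_length, label_length])

model = Model(input=[input_data, labels, input_length, label_length], output=loss_out)
# The CTC loss is computed inside the Lambda layer, so the compiled loss
# simply passes that value through:
model.compile(loss={'ctc': lambda y_true, y_pred: y_pred}, optimizer='adam')

With this wiring, model.fit takes the four arrays (patches, integer-encoded labels, per-sample input lengths, per-sample label lengths) as inputs and a dummy array of zeros as the target, since the loss is already computed inside the graph.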
