cyprienruffino / CTCModel

Easy-to-use Connectionist Temporal Classification in Keras

InvalidArgumentError: sequence_length(0) <= 32 [[{{node CTCloss_10/CTCLoss}}]]

alexandrucosminmihai opened this issue

Hello!

I am trying to use CTCModel at the end of a network containing 5 convolutional blocks and 2 BLSTM layers. The model compiles, but when I call fit it fails with the error InvalidArgumentError: sequence_length(0) <= 32 [[{{node CTCloss_10/CTCLoss}}]].

This is how I run fit:
model.fit(x=[xs_train_pad, ys_train_pad, xs_train_len, ys_train_len], y=np.zeros(nb_train), batch_size=PARAM_BATCH_SIZE, epochs=PARAM_EPOCHS)

I am trying to train on a subset of the IAM dataset (9000 images).
The training images are padded and stored in a NumPy array of the following shape:
xs_train_pad.shape=(9000, 128, 32, 1) # 9000 128x32 grayscale images.

The labels are also padded: each word is converted to a row of float64 values holding the ASCII codes of its characters.
These are the shapes of the remaining arguments:
ys_train_pad.shape=(9000, 18)
xs_train_len.shape=(9000,)
ys_train_len.shape=(9000,)
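
For context, TensorFlow's CTC loss rejects a batch when a sample's declared input length exceeds the number of time steps the network emits, and the message `sequence_length(0) <= 32` names the sample index (0) and that limit (32). The sketch below is one way to check the length arrays before calling fit. It assumes the network outputs 32 time steps, as the architecture listed further down suggests, and also applies the usual CTC expectation that a label is no longer than its input sequence. The function name and the example values are hypothetical.

```python
def find_invalid_lengths(input_lengths, label_lengths, time_steps):
    """Return the indices of samples whose lengths CTC is likely to reject.

    A sample is flagged when its input length exceeds the number of
    time steps produced by the network, or when its label is longer
    than its input sequence.
    """
    bad = []
    for i, (x_len, y_len) in enumerate(zip(input_lengths, label_lengths)):
        if x_len > time_steps or y_len > x_len:
            bad.append(i)
    return bad


# Hypothetical values: sample 1 uses the image width (128) instead of
# the number of output time steps, and sample 2 has a label that is too long.
example_x_len = [32, 128, 32]
example_y_len = [5, 7, 40]
print(find_invalid_lengths(example_x_len, example_y_len, time_steps=32))  # prints [1, 2]
```

The same call also accepts the NumPy arrays from this issue, e.g. find_invalid_lengths(xs_train_len, ys_train_len, 32). An empty result would suggest the length arrays are not the source of the error.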

This is the network architecture (layer type and output shape):
```
(InputLayer)                (None, 128, 32, 1)
(Conv2D)                    (None, 128, 32, 32)
(BatchNormalization)        (None, 128, 32, 32)
(ReLU)                      (None, 128, 32, 32)
(MaxPooling2D)              (None, 64, 16, 32)
(Conv2D)                    (None, 64, 16, 64)
(BatchNormalization)        (None, 64, 16, 64)
(ReLU)                      (None, 64, 16, 64)
(MaxPooling2D)              (None, 32, 8, 64)
(Conv2D)                    (None, 32, 8, 128)
(BatchNormalization)        (None, 32, 8, 128)
(ReLU)                      (None, 32, 8, 128)
(MaxPooling2D)              (None, 32, 4, 128)
(Conv2D)                    (None, 32, 4, 128)
(BatchNormalization)        (None, 32, 4, 128)
(ReLU)                      (None, 32, 4, 128)
(MaxPooling2D)              (None, 32, 2, 128)
(Conv2D)                    (None, 32, 2, 256)
(BatchNormalization)        (None, 32, 2, 256)
(ReLU)                      (None, 32, 2, 256)
(MaxPooling2D)              (None, 32, 1, 256)
(Reshape)                   (None, 32, 256)
(BidirectionalLSTM)         (None, 32, 512)
(BidirectionalLSTM)         (None, 32, 512)
(TimeDistributeDense)       (None, 32, 80)
(ActivationSoftMax)         (None, 32, 80)
labels (InputLayer)         (None, None)
input_length (InputLayer)   (None, 1)
label_length (InputLayer)   (None, 1)

CTCloss (Lambda)            (None, 1)            SoftMax[0][0]
                                                 labels[0][0]
                                                 input_length[0][0]
                                                 label_length[0][0]
```
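
A related point worth checking: the softmax has 80 outputs, while raw ASCII codes for lowercase letters go up to 122. CTC implementations typically reserve one class for the blank symbol, so labels are generally expected to be contiguous indices smaller than the number of classes. The sketch below shows one possible way to map characters to such indices. It is an illustration only: the helper names and the padding value of -1 are assumptions, and the padding value CTCModel actually expects is not stated in this thread.

```python
def build_charset(words):
    """Map every distinct character to a contiguous index starting at 0."""
    chars = sorted(set("".join(words)))
    return {ch: i for i, ch in enumerate(chars)}


def encode_words(words, char_to_index, max_len, pad_value=-1):
    """Encode words as index rows padded to max_len, plus their true lengths.

    pad_value is a hypothetical placeholder for the padded positions.
    """
    rows, lengths = [], []
    for word in words:
        codes = [char_to_index[ch] for ch in word]
        rows.append(codes + [pad_value] * (max_len - len(codes)))
        lengths.append(len(codes))
    return rows, lengths


# Hypothetical two-word vocabulary, padded to 3 characters.
words = ["cab", "be"]
char_to_index = build_charset(words)
rows, lengths = encode_words(words, char_to_index, max_len=3)
print(rows)     # prints [[2, 0, 1], [1, 3, -1]]
print(lengths)  # prints [3, 2]
```

With the data in this issue, max_len would be 18 and the rows would be converted to a NumPy array of shape (9000, 18) before being passed to fit.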

Any help would be greatly appreciated!
Thank you!