Invalid outputs

nishant260190 opened this issue 6 years ago

I followed all the steps you described and everything runs without errors, but the outputs are not correct, maybe because I used a very small dataset. Still, for the exact conversations I gave as input during training, it should give the correct result, such as "hello" in response to "hi" or "fine" in response to "how are you".

Adit Deshpande · Answer 1 · Thu Jul 05 2018 19:26:26 GMT+0800 (China Standard Time)

Just because something is in the training set doesn't mean that the network will learn to output it. The problem could be any of several things: a small dataset, an overly complex network architecture, improperly tuned hyperparameters, not enough variety in the dataset, etc. It's hard to pinpoint which one it is. One exercise that may be useful is to take a very small dataset (a couple of input-output pairs) and a very small network, and see whether the network can at least learn those mappings. Once it can, slowly increase the size of the dataset as well as the complexity of the network.
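The sanity check described above can be sketched in a few lines. This is not the repository's Seq2Seq model: it is a deliberately tiny stand-in (a one-hot softmax classifier in pure Python) that treats each whole sentence as a single class, and the pairs, the `train` and `predict` helpers, and the learning rate are all made up for illustration. The point is only the workflow: confirm that the smallest possible model can memorize a handful of pairs before scaling anything up.

```python
import math
import random

# Hypothetical toy pairs; each whole sentence is treated as a single class.
pairs = [("hi", "hello"), ("how are you", "fine"), ("bye", "see you")]
inputs = sorted({q for q, _ in pairs})
outputs = sorted({a for _, a in pairs})

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def train(pairs, steps=500, lr=0.5, seed=0):
    rng = random.Random(seed)
    # weights[i][o] is the score of output o given input i.
    # With one-hot inputs, the weight matrix is just a lookup table.
    weights = [[rng.uniform(-0.1, 0.1) for _ in outputs] for _ in inputs]
    for _ in range(steps):
        for q, a in pairs:
            i, target = inputs.index(q), outputs.index(a)
            probs = softmax(weights[i])
            for o in range(len(outputs)):
                # Gradient of cross-entropy w.r.t. the logits: probs - one_hot
                grad = probs[o] - (1.0 if o == target else 0.0)
                weights[i][o] -= lr * grad
    return weights

def predict(weights, q):
    row = weights[inputs.index(q)]
    return outputs[row.index(max(row))]

weights = train(pairs)
memorized = all(predict(weights, q) == a for q, a in pairs)
print("memorized all pairs:", memorized)
```

If even a check like this fails on the real model with a couple of pairs, the problem is more likely in the data pipeline or training loop than in the dataset size.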

Nishant Goel · Answer 2 · Sat Jul 07 2018 13:10:14 GMT+0800 (China Standard Time)

@adeshpande3 : Thanks for the early response. I am new to this, so I don't understand how to increase or decrease the complexity of the network. I have not changed the code; it is the same as in this repository.
One more thing: on what basis should I set the hyperparameters?

Word2Vec :
wordVecDimensions = 100
batchSize = 128
numNegativeSample = 64
windowSize = 5
numIterations = 100000

numTrainingExamples : 919210 vocabSize : 5850

Seq2Seq :

batchSize = 24
maxEncoderLength = 15
maxDecoderLength = maxEncoderLength
lstmUnits = 112
embeddingDim = lstmUnits
numLayersLSTM = 3
numIterations = 70000

Adit Deshpande · Answer 3 · Sun Jul 08 2018 20:48:31 GMT+0800 (China Standard Time)

By decreasing the complexity of the network, I mean decreasing the number of LSTM units (lstmUnits) or the number of LSTM layers (numLayersLSTM).
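To get a rough feel for how much those two settings matter, here is a small sketch that counts the trainable weights in one stack of LSTM layers. It assumes a standard LSTM cell (four gates, no peepholes or projection) and ignores the embedding matrix and the output layer, so it is an estimate and not the exact size of the model in this repository. The helper names and the "smaller" configuration are made up for illustration.

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias.
    return 4 * (units * (input_dim + units) + units)

def stacked_lstm_params(embedding_dim, units, num_layers):
    total = 0
    input_dim = embedding_dim
    for _ in range(num_layers):
        total += lstm_params(input_dim, units)
        input_dim = units  # deeper layers take the previous layer's output
    return total

# Settings from the question: lstmUnits = 112, embeddingDim = 112, numLayersLSTM = 3
current = stacked_lstm_params(embedding_dim=112, units=112, num_layers=3)
# A hypothetical, much smaller configuration for the sanity check
smaller = stacked_lstm_params(embedding_dim=32, units=32, num_layers=1)
print(current, smaller)  # 302400 8320
```

Under these assumptions the posted settings give roughly 300,000 weights per LSTM stack, which is a lot to fit from a handful of conversation pairs.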

Nishant Goel · Answer 4 · Mon Jul 09 2018 20:08:00 GMT+0800 (China Standard Time)

@adeshpande3 : Can you please help me understand on what basis we should choose these parameters?

Adit Deshpande · Answer 5 · Mon Jul 09 2018 20:31:38 GMT+0800 (China Standard Time)

There isn't really an easy answer to that question. It's highly dependent on what task you're trying to solve (question/answering in our case), the type of model you're trying to create, and the amount of data/compute power you have. All these things will affect the parameter values you choose. I'd recommend watching CS 224 to get a better understanding.