llSourcell / tensorflow_chatbot

Tensorflow chatbot demo by @Sirajology on Youtube

neuralconvo.ini + missing data/train.enc

johndpope opened this issue · comments

The neuralconvo.ini specifies the following files:

[strings]

Mode : train, test, serve

mode = train
train_enc = data/train.enc
train_dec = data/train.dec
test_enc = data/test.enc
test_dec = data/test.enc

but there is no data folder in the repo; there is only the working_dir.

python3 execute.py

Mode : train

Preparing data in working_dir/
Tokenizing data in data/train.enc
Traceback (most recent call last):
File "execute.py", line 313, in
train()
File "execute.py", line 127, in train
enc_train, dec_train, enc_dev, dec_dev, _, _ = data_utils.prepare_custom_data(gConfig['working_directory'],gConfig['train_enc'],gConfig['train_dec'],gConfig['test_enc'],gConfig['test_dec'],gConfig['enc_vocab_size'],gConfig['dec_vocab_size'])
File "/Users/johndpope/Documents/gitWorkspace/tensorflow_chatbot/data_utils.py", line 137, in prepare_custom_data
data_to_token_ids(train_enc, enc_train_ids_path, enc_vocab_path, tokenizer)
File "/Users/johndpope/Documents/gitWorkspace/tensorflow_chatbot/data_utils.py", line 121, in data_to_token_ids
normalize_digits)
File "/Users/johndpope/Documents/gitWorkspace/tensorflow_chatbot/data_utils.py", line 100, in sentence_to_token_ids
words = basic_tokenizer(sentence)
File "/Users/johndpope/Documents/gitWorkspace/tensorflow_chatbot/data_utils.py", line 50, in basic_tokenizer
words.extend(re.split(_WORD_SPLIT, space_separated_fragment))
File "/usr/local/Cellar/python3/3.5.2_1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/re.py", line 203, in split
return _compile(pattern, flags).split(string, maxsplit)
TypeError: cannot use a bytes pattern on a string-like object

The .gitignore excluded his data folder from being checked in.

How do you create the .enc files from the movie-dialogs corpus he put in the readme?

EDIT: Okay, got them: https://github.com/suriyadeepan/datasets/tree/master/seq2seq/cornell_movie_corpus/

Doh, I can't do a git-lfs checkout of that repo properly because the account is over quota:
"This repository is over its data quota. Purchase more data packs to restore access"

To rebuild them:

1. mkdir tensorflow_chatbot/data
2. cd tensorflow_chatbot/data
3. Get https://people.mpi-sws.org/~cristian/data/cornell_movie_dialogs_corpus.zip and put the *.txt files in this new data/ dir.
4. git clone https://github.com/suriyadeepan/datasets.git
5. Edit datasets/seq2seq/cornell_movie_corpus/scripts/prepare_data.py and uncomment the last lines so prepare_seq2seq_files executes.
6. python datasets/seq2seq/cornell_movie_corpus/scripts/prepare_data.py

This produces {train,test}.{enc,dec}.

These files are just lines of text. I guess matching line numbers between enc and dec are conversation pairs.
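
If you want to sanity-check that alignment, here is a minimal sketch (file paths taken from the .ini above; the pairing-by-line-number is the guess stated above):

def preview_pairs(enc_path='data/train.enc', dec_path='data/train.dec', n=5):
    # Line i of the .enc file is the prompt; line i of the .dec file is the reply.
    with open(enc_path) as enc, open(dec_path) as dec:
        for i, (question, answer) in enumerate(zip(enc, dec)):
            if i >= n:
                break
            print(question.strip(), '->', answer.strip())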

I had to grab that script manually since I couldn't check out the repo, as mentioned. It appears to be training now.

Same here; I had to correct a few issues with Python 3.5 and the use of re.split that caused errors.

What was the re.split issue?

It is a problem with this line (prepare_data.py, line 8, in get_id2line):

lines=open("movie_lines.txt").read().split('\n')

It gives this error and will not let the file execute:

line 321, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xad in position 585399: invalid start byte

How did you guys fix it?

Got it: you have to run prepare_data.py in Python 2.7.

You also have to uncomment the last lines so prepare_seq2seq_files executes.

And make sure you have a Python 2.7 environment for prepare_data.py; remember the assignment code is in Python 2.7 as well.

@shlomis The re.split issue was that re cannot apply a bytes pattern to a str while training. I had to replace line 50 of data_utils.py:

words.extend(re.split(_WORD_SPLIT, space_separated_fragment))

with

try:
    words.extend(re.split(_WORD_SPLIT, str.encode(space_separated_fragment)))
except TypeError:
    words.extend(re.split(_WORD_SPLIT, space_separated_fragment))

because if you just .encode the string you'll get the reverse error while testing.
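
An alternative that avoids the try/except is to make _WORD_SPLIT a str pattern, since on Python 3 the fragments are str. A minimal sketch of the relevant part of data_utils.py (assuming the rest of the pipeline also handles str, which may not hold everywhere in this repo):

import re

# str pattern instead of the original bytes pattern b"([.,!?\"':;)(])",
# so re.split accepts the str fragments produced in Python 3 text mode.
_WORD_SPLIT = re.compile(r"([.,!?\"':;)(])")

def basic_tokenizer(sentence):
    # Split on whitespace, then separate punctuation into its own tokens.
    words = []
    for fragment in sentence.strip().split():
        words.extend(re.split(_WORD_SPLIT, fragment))
    return [w for w in words if w]

print(basic_tokenizer("Hello, world!"))  # ['Hello', ',', 'world', '!']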

@Niko2756 I am able to run it on Python 3 with a minimal amount of tweaks (a few print statements to change and maybe a small error message).

I think this could be related.
suriyadeepan/datasets#1

Here is my fix for Python 3:

def get_id2line():
    lines=open('movie_lines.txt', encoding='utf-8', errors='ignore')
    lines = lines.read()
    lines = lines.split('\n')
    id2line = {}
    for line in lines:
        _line = line.split(' +++$+++ ')
        if len(_line) == 5:
            id2line[_line[0]] = _line[4]
    return id2line

And of course change print to print().

@drewp - sorry, Python newbie here. I get this error when I run prepare_data.py. Any ideas?

File "datasets/seq2seq/cornell_movie_corpus/scripts/prepare_data.py", line 89
    print '\n>> written %d lines' %(i)
                                ^
SyntaxError: invalid syntax

UPDATE: I switched to Python 2 and it ran successfully.

I have been running the training for almost a week. Currently:

global step 253800 learning rate 0.1249 step-time 0.46 perplexity 1.00
eval: bucket 0 perplexity 1936.44
eval: bucket 1 perplexity 2054.61
eval: empty bucket 2
eval: empty bucket 3

Does it ever end?

No, it doesn't end; it's an infinite loop. You can stop it at any time and it should pick up the latest checkpoint.
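
Under the hood, that resume behavior is just the usual checkpoint lookup when the model is created; roughly this pattern (a sketch of the standard TF idiom, not the repo's exact code):

import tensorflow as tf

def maybe_restore(session, model, working_dir='working_dir/'):
    # Restore the newest checkpoint if one exists; otherwise start fresh.
    ckpt = tf.train.get_checkpoint_state(working_dir)
    if ckpt and ckpt.model_checkpoint_path:
        model.saver.restore(session, ckpt.model_checkpoint_path)
    else:
        session.run(tf.global_variables_initializer())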

It doesn't end. However, your results aren't correct. Check that you're running the right Python and TensorFlow versions. You should also check in the .ini file that the train_dec line is correctly set to a .dec file.
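
For reference, the configs quoted in this thread have test_dec pointing at test.enc; the corrected [strings] block should presumably read:

[strings]
mode = train
train_enc = data/train.enc
train_dec = data/train.dec
test_enc = data/test.enc
test_dec = data/test.dec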

I am running with Python 2, I think. I did not know it was an infinite loop, thank you. And I checked the .ini file; it looks fine. Also, in test mode, how are you supposed to use it? Can I still use it as a chatbot (like talk to it)? What is the output about?

Can I test my chatbot? I'm getting this; why is it doing that?
python execute.py

Mode : test

Reading model parameters from working_dir/seq2seq.ckpt-266100

hi
size _UNK
hello
size _UNK

I trained using the default values in the seq2seq.ini file as below. After the checkpoint at 16200 (Reading model parameters from working_dir/seq2seq.ckpt-16200), I always get responses like

_UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK _UNK

[strings]
mode = train
train_enc = data/train.enc
train_dec = data/train.dec
test_enc = data/test.enc
test_dec = data/test.enc

working_directory = working_dir/

[ints]

enc_vocab_size = 20000
dec_vocab_size = 20000

num_layers = 3

layer_size = 256

max_train_data_size = 0
batch_size = 64
steps_per_checkpoint = 300

[floats]
learning_rate = 0.5
learning_rate_decay_factor = 0.99
max_gradient_norm = 5.0

Any help would be appreciated. Thanks.
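
For anyone unsure what _UNK actually is: it's the id that data_utils assigns to any word missing from the generated vocabulary, so an all-_UNK reply means the decoder only ever emits the out-of-vocabulary token. A toy sketch of the lookup (UNK_ID = 3 matches data_utils.py's _PAD, _GO, _EOS, _UNK ordering; the helper name here is mine):

UNK_ID = 3  # _PAD=0, _GO=1, _EOS=2, _UNK=3 in data_utils.py

def sentence_to_ids(sentence, vocab):
    # Words absent from the vocabulary fall back to UNK_ID.
    return [vocab.get(word, UNK_ID) for word in sentence.split()]

vocab = {'hello': 4, 'there': 5}
print(sentence_to_ids('hello stranger', vocab))  # [4, 3]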

Has anyone tried training the neuralConvo.ini model? How do you test it? Should we enter the questions listed in test.enc and expect the predicted output from test.dec, or is there another way to test this?

@hariom-yadaw, did you read the introduction tutorial at https://www.tensorflow.org/tutorials/seq2seq/? It roughly explains the .ini settings.

I had the UNK issue as well, and could not get to a stage where you would have something like a conversational experience, no matter whether I used a few or a lot (> 2 hrs) of training iterations.

@2075 Yes, I had gone through the tutorial. I also trained it overnight (> 12 hrs), but the UNK issue is always there. Can you please explain how to overcome this issue? Thanks.

I only get "facing Klein Chub Chub Chub Chub Strip Strip Strip Strip" :(

@hariom-yadaw In my case, with 3 layers and 256 units, I get kind of usable results after more than a day of training. Before that I tried smaller layer sizes, and fewer and more layers, but my dual GTX 690 quad-SLI setup cannot crunch all of it. As long as your training sources are good, you should get some kind of result, and the longer I train, the fewer _UNK replies come back. It is still far from a real conversation, but a worthwhile test.

@2075 I was following the post below about LSTM networks:
http://colah.github.io/posts/2015-08-Understanding-LSTMs/

I'm not very sure what num_layers = 3 and layer_size = 256 refer to here. I want to play with these parameters, which relate to network size, but I don't have a clear understanding of them. Can you (or anyone else) please explain them and how they affect performance? Thanks!
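
For intuition: in seq2seq_model.py, layer_size is the number of hidden units inside each recurrent (GRU) cell and num_layers is how many of those cells are stacked on top of each other. Roughly, against the TF 0.12-era API this repo targets (a sketch; the namespace moved in later versions):

import tensorflow as tf

layer_size = 256  # hidden units per GRU cell (the network's "width")
num_layers = 3    # stacked cells (the network's "depth")

cells = [tf.nn.rnn_cell.GRUCell(layer_size) for _ in range(num_layers)]
stacked_cell = tf.nn.rnn_cell.MultiRNNCell(cells)

Larger values give the model more capacity but make training slower and demand more data.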

@2075 I have tried all the solutions and I'm still getting only _UNKs. Which Python version (2 or 3) and which TensorFlow are you using? If possible, can you please share a working setup? Thanks.

@2075 @hariom-yadaw
I trained it for more than a day with 3 layers and a layer size of 256, and I still get results like _UNK _UNK ...
I cloned the repo from here, but as it did not include test.dec, test.enc, train.dec, and train.enc, I downloaded them separately from Dropbox as mentioned above. I have Python 3.5.2 and TensorFlow 0.12. Can anyone please let me know what I am missing, as I have no clue now? Thanks.

Hello,

I am trying to make a chatbot in TensorFlow. I cloned the code from GitHub, and when I try to run the execute.py file I get this error:

E:\python\tensorflow_chatbot-master>python execute.py

Mode : train

Preparing data in working_dir/
Tokenizing data in data/train.enc
Traceback (most recent call last):
File "execute.py", line 319, in
train()
File "execute.py", line 127, in train
enc_train, dec_train, enc_dev, dec_dev, _, _ = data_utils.prepare_custom_dat
a(gConfig['working_directory'],gConfig['train_enc'],gConfig['train_dec'],gConfig
['test_enc'],gConfig['test_dec'],gConfig['enc_vocab_size'],gConfig['dec_vocab_si
ze'])
File "E:\python\tensorflow_chatbot-master\data_utils.py", line 137, in prepare
_custom_data
data_to_token_ids(train_enc, enc_train_ids_path, enc_vocab_path, tokenizer)
File "E:\python\tensorflow_chatbot-master\data_utils.py", line 112, in data_to
_token_ids
vocab, _ = initialize_vocabulary(vocabulary_path)
File "E:\python\tensorflow_chatbot-master\data_utils.py", line 87, in initiali
ze_vocabulary
rev_vocab.extend(f.readlines())
File "C:\Python\Python35\lib\site-packages\tensorflow\python\lib\io\file_io.py
", line 131, in readlines
s = self.readline()
File "C:\Python\Python35\lib\site-packages\tensorflow\python\lib\io\file_io.py
", line 124, in readline
return compat.as_str_any(self._read_buf.ReadLineAsString())
File "C:\Python\Python35\lib\site-packages\tensorflow\python\util\compat.py",
line 106, in as_str_any
return as_str(value)
File "C:\Python\Python35\lib\site-packages\tensorflow\python\util\compat.py",
line 84, in as_text
return bytes_or_text.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 1: invalid
start byte

I have seen one comment above with the same error, and I ran prepare_data.py as well, but even after that I am getting the same error.

Can anyone please help with where I am going wrong?
Thanks in advance.

I am running into the same difficulties as @krati23; glad to see it is a common issue.

I figured it out. Go to https://github.com/suriyadeepan/datasets/blob/master/seq2seq/cornell_movie_corpus/pull_data.sh and download all the files, then in seq2seq.ini change the file paths to where you put them.

I am having issues with the tensorflow chatbot and was wondering if I could get pointed in the right direction. When running execute.py I get this error:

Mode : train

Traceback (most recent call last):
File "C:/Users/jonsa/Desktop/tensorflow_chatbot-master/execute.py", line 320, in
train()
File "C:/Users/jonsa/Desktop/tensorflow_chatbot-master/execute.py", line 138, in train
model = create_model(sess, False)
File "C:/Users/jonsa/Desktop/tensorflow_chatbot-master/execute.py", line 105, in create_model
model = seq2seq_model.Seq2SeqModel( gConfig['enc_vocab_size'], gConfig['dec_vocab_size'], buckets, gConfig['layer_size'], gConfig['num_layers'], gConfig['max_gradient_norm'], gConfig['batch_size'], gConfig['learning_rate'], gConfig['learning_rate_decay_factor'], forward_only=forward_only)
File "C:\Users\jonsa\Desktop\tensorflow_chatbot-master\seq2seq_model.py", line 165, in init
softmax_loss_function=softmax_loss_function)
File "C:\Users\jonsa\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\contrib\legacy_seq2seq\python\ops\seq2seq.py", line 1201, in model_with_buckets
decoder_inputs[:bucket[1]])
File "C:\Users\jonsa\Desktop\tensorflow_chatbot-master\seq2seq_model.py", line 164, in
lambda x, y: seq2seq_f(x, y, False),
File "C:\Users\jonsa\Desktop\tensorflow_chatbot-master\seq2seq_model.py", line 128, in seq2seq_f
feed_previous=do_decode)
File "C:\Users\jonsa\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\contrib\legacy_seq2seq\python\ops\seq2seq.py", line 855, in embedding_attention_seq2seq
encoder_cell, encoder_inputs, dtype=dtype)
File "C:\Users\jonsa\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\contrib\rnn\python\ops\core_rnn.py", line 197, in static_rnn
(output, state) = call_cell()
File "C:\Users\jonsa\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\contrib\rnn\python\ops\core_rnn.py", line 184, in
call_cell = lambda: cell(input
, state)
File "C:\Users\jonsa\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\contrib\rnn\python\ops\core_rnn_cell_impl.py", line 881, in call
return self._cell(embedded, state)
File "C:\Users\jonsa\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\contrib\rnn\python\ops\core_rnn_cell_impl.py", line 953, in call
cur_inp, new_state = cell(cur_inp, cur_state)
File "C:\Users\jonsa\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\contrib\rnn\python\ops\core_rnn_cell_impl.py", line 146, in call
with _checked_scope(self, scope or "gru_cell", reuse=self._reuse):
File "C:\Users\jonsa\AppData\Local\Programs\Python\Python35\lib\contextlib.py", line 59, in enter
return next(self.gen)
File "C:\Users\jonsa\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\contrib\rnn\python\ops\core_rnn_cell_impl.py", line 77, in _checked_scope
type(cell).name))
ValueError: Attempt to reuse RNNCell <tensorflow.contrib.rnn.python.ops.core_rnn_cell_impl.GRUCell object at 0x0000014886DF21D0> with a different variable scope than its first use. First use of cell was with scope 'embedding_attention_seq2seq/rnn/multi_rnn_cell/cell_0/gru_cell', this attempt is with scope 'embedding_attention_seq2seq/rnn/multi_rnn_cell/cell_1/gru_cell'. Please create a new instance of the cell if you would like it to use a different set of weights. If before you were using: MultiRNNCell([GRUCell(...)] * num_layers), change to: MultiRNNCell([GRUCell(...) for _ in range(num_layers)]). If before you were using the same cell instance as both the forward and reverse cell of a bidirectional RNN, simply create two instances (one for forward, one for reverse). In May 2017, we will start transitioning this cell's behavior to use existing stored weights, if any, when it is called with scope=None (which can lead to silent model degradation, so this error will remain until then.)

Process finished with exit code 1

I also tried the suggested corrections and still nothing

@jonsanti Check out the issue I started: #34
I gave the solution there. All you need to do is use a specific version of TensorFlow in your virtual environment. As far as I understood, they have made a few changes and because of that it does not work properly. My solution should fix the problem; let me know if it doesn't.
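
For anyone who would rather patch the code than pin TensorFlow: the ValueError above spells out the fix itself; construct a fresh cell per layer instead of reusing one instance. A sketch for seq2seq_model.py (assuming GRU cells and the tf.contrib.rnn namespace shown in the traceback):

import tensorflow as tf

size, num_layers = 256, 3

# Before: cell = tf.contrib.rnn.MultiRNNCell([single_cell] * num_layers)
# After, per the error message: one new GRUCell instance per layer.
cell = tf.contrib.rnn.MultiRNNCell(
    [tf.contrib.rnn.GRUCell(size) for _ in range(num_layers)])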

I am facing the below error; could you please help me fix it?

[screenshot of a tensorflow.models import error]

I fixed the tensorflow.models import error by downloading the models module of TensorFlow and changing the reference to "tensorflow.models.tutorials.rnn", which is the correct path.

@drewp But there is no movie_lines.txt file.

'module' object has no attribute 'seq2seq'

@gaoshuming (#61): maybe you should try replacing every 'tf.nn.seq2seq' with 'tf.contrib.legacy_seq2seq' in seq2seq_model.py.
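
A low-diff way to apply that rename in seq2seq_model.py is an alias near the imports (a sketch, assuming a TF 1.x version where tf.contrib.legacy_seq2seq exists):

import tensorflow as tf

# tf.nn.seq2seq was moved in TF 1.0; alias the new location, then change
# calls like tf.nn.seq2seq.model_with_buckets(...) to
# seq2seq.model_with_buckets(...).
seq2seq = tf.contrib.legacy_seq2seq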

pywrap_tensorflow.TF_GetCode(status))

NotFoundError: NewRandomAccessFile failed to Create/Open: data/train.enc : The system cannot find the path specified.
Can anyone please tell me what causes this error?

Hello dears,
Did anybody get it working? I am curious to see the Q and A :) 👍
Best!

How do you solve the UNK problem?

Hi guys, do you face this problem?
RecursionError: maximum recursion depth exceeded
Thank you if you can help. Appreciated.

crossent = softmax_loss_function(labels=target, logits=logit)

TypeError: sampled_loss() got an unexpected keyword argument 'logits'
Does anyone know a fix?

Hi, I am getting this error; can someone help me with it?
Preparing data in working_dir/
Tokenizing data in data/train.enc
tokenizing line 100000
Tokenizing data in data/train.dec
tokenizing line 100000
Tokenizing data in data/test.enc
2018-02-07 13:56:48.209757: W d:\nwani\l\tensorflow_1498062690615\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2018-02-07 13:56:48.210757: W d:\nwani\l\tensorflow_1498062690615\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2018-02-07 13:56:48.210757: W d:\nwani\l\tensorflow_1498062690615\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2018-02-07 13:56:48.210757: W d:\nwani\l\tensorflow_1498062690615\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2018-02-07 13:56:48.210757: W d:\nwani\l\tensorflow_1498062690615\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2018-02-07 13:56:48.210757: W d:\nwani\l\tensorflow_1498062690615\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2018-02-07 13:56:48.211757: W d:\nwani\l\tensorflow_1498062690615\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2018-02-07 13:56:48.211757: W d:\nwani\l\tensorflow_1498062690615\work\tensorflow-1.1.0\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Creating 3 layers of 256 units.
Traceback (most recent call last):
File "execute.py", line 319, in
train()
File "execute.py", line 137, in train
model = create_model(sess, False)
File "execute.py", line 104, in create_model
model = seq2seq_model.Seq2SeqModel( gConfig['enc_vocab_size'], gConfig['dec_vocab_size'], _buckets, gConfig['layer_size'], gConfig['num_layers'], gConfig['max_gradient_norm'], gConfig['batch_size'
], gConfig['learning_rate'], gConfig['learning_rate_decay_factor'], forward_only=forward_only)
File "C:\Users\hthakare\python\tensorflow_chatbot-master\seq2seq_model.py", line 106, in init
single_cell = tf.nn.rnn_cell.GRUCell(size)
AttributeError: module 'tensorflow.python.ops.nn' has no attribute 'rnn_cell'
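
That AttributeError is the same TF 1.x API move again: the cell classes left tf.nn.rnn_cell for tf.contrib.rnn. A likely patch for that line (a sketch; verify the namespace against your exact TF version):

import tensorflow as tf

size = 256
# TF 0.12: single_cell = tf.nn.rnn_cell.GRUCell(size)
# TF 1.x:
single_cell = tf.contrib.rnn.GRUCell(size)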

On Windows you can fix this by getting a piece of software called Miniconda, installing it on your system, and then creating a 2.7 environment to run the importer for the movie pack.

I'm able to run python execute.py after making changes to data_utils.py and seq2seq_model.py. The corrected files are available here:
https://github.com/llSourcell/tensorflow_chatbot/pull/77/files

It starts training the model, and testing works once the mode is changed to test.

Thanks to Chrisfauerbach for the corrections.

Hi guys, can someone help me fix this problem?

>> Mode : test

Traceback (most recent call last):
File "execute.py", line 324, in
decode()
File "execute.py", line 220, in decode
enc_vocab, _ = data_utils.initialize_vocabulary(enc_vocab_path)
File "D:\My_document\AI\Chatbot_Conversation\tensorflow_chatbot-master\data_utils.py", line 86, in initialize_vocabulary
rev_vocab.extend(f.readlines())
File "C:\Users\Hoang\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 131, in readlines
s = self.readline()
File "C:\Users\Hoang\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 124, in readline
return compat.as_str_any(self._read_buf.ReadLineAsString())
File "C:\Users\Hoang\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\util\compat.py", line 106, in as_str_any
return as_str(value)
File "C:\Users\Hoang\AppData\Local\Programs\Python\Python35\lib\site-packages\tensorflow\python\util\compat.py", line 84, in as_text
return bytes_or_text.decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 1: invalid start byte

@uccmen you have to change print '\n>> written %d lines' %(i) to print('\n>> written %d lines' % (i)) in Python 3.x.

I get the error below; how should I fix it?

File "D:\Anaconda\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))

NotFoundError: NewRandomAccessFile failed to Create/Open: data/train.enc : The system cannot find the path specified.

Mode : train

Preparing data in working_dir/
2018-08-12 13:52:28.476005: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Creating 3 layers of 256 units.
Traceback (most recent call last):
File "execute.py", line 319, in
train()
File "execute.py", line 137, in train
model = create_model(sess, False)
File "execute.py", line 104, in create_model
model = seq2seq_model.Seq2SeqModel( gConfig['enc_vocab_size'], gConfig['dec_vocab_size'], _buckets, gConfig['layer_size'], gConfig['num_layers'], gConfig['max_gradient_norm'], gConfig['batch_size'], gConfig['learning_rate'], gConfig['learning_rate_decay_factor'], forward_only=forward_only)
File "/home/mark/tensorflow_chatbot/seq2seq_model.py", line 154, in init
self.outputs, self.losses = tf.nn.seq2seq.model_with_buckets(
AttributeError: module 'tensorflow.nn' has no attribute 'seq2seq'

Can anyone help?

I managed to fix most of the issues with the code by looking through most of this forum and doing some of my own research. My bot seems to have named himself Alexander. I have trained him up to a perplexity of ~8; it has been around 12 hours of training.

I am using all the latest versions: Anaconda3, Python 3.7, etc.

I have even done some fixes on the UI, which was throwing configuration parser errors. It seems to be working fine. Alexander is still young; he would need to be trained for approximately another 15 hours.

[image attachment]

Reply below if you would like me to push the changes!

crossent = softmax_loss_function(labels=target, logits=logit)

TypeError: sampled_loss() got an unexpected keyword argument 'logits'
anyone knows a fix?

The following change fixed it for me:

# Fix for: TypeError: sampled_loss() got an unexpected keyword argument 'logits'
def sampled_loss(labels, logits):
    labels = tf.reshape(labels, [-1, 1])
    return tf.nn.sampled_softmax_loss(w_t, b, labels, logits, num_samples,
                                      self.target_vocab_size)

# The old version, for reference:
# def sampled_loss(inputs, labels):
#     labels = tf.reshape(labels, [-1, 1])
#     return tf.nn.sampled_softmax_loss(w_t, b, inputs, labels, num_samples,
#                                       self.target_vocab_size)
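
(Why the parameter names flip: as the quoted line crossent = softmax_loss_function(labels=target, logits=logit) shows, TF 1.x's model_with_buckets calls the loss callback with the keyword arguments labels and logits, so the callback must accept exactly those names; and the newer tf.nn.sampled_softmax_loss signature takes labels before inputs, hence the reordered arguments. This is my reading of the API change.)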

Dear friends, gurus, and experts,

I have one general question about creating a bot using this code. In this exercise we have a knowledge set, with train.enc holding one part of the conversation and train.dec holding the other part (the replies), used to train the bot.
My doubt is: how does the model relate a reply in train.dec to its corresponding question in train.enc?
Actually, I am trying to adapt the same code to develop a customer-support bot for my college project. Here I have a set of FAQs as my knowledge base. I have taken all the questions in the FAQs as the train.enc set and all the answers as the train.dec set. But in this case the questions and answers are tightly coupled. How can I maintain this relevance of answers to questions in my model?

Any help or a pointer in this regard will be very much appreciated.