atulkum / pointer_summarizer

PyTorch implementation of "Get To The Point: Summarization with Pointer-Generator Networks"


Python 3 support?

johntiger1 opened this issue

Any chance we'll get a Python 3 version? Thanks

I think this code will run on Python 3. Let me know if you find any problems.

I migrated it to Python 3, but it's not fully tested yet: https://github.com/atulkum/pointer_summarizer/tree/transformer_encoder
You can try that branch.

Thanks! Will take a look

Hey, I noticed that this branch uses a transformer as the encoder. So it is not exactly comparable to the main branch, right?

You can disable the transformer encoder by setting use_lstm = True in config.py.
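For reference, here is a minimal sketch of how such a flag could gate the encoder choice. Only the name use_lstm comes from the branch's config.py; the dispatch, dimensions, and layer settings below are illustrative, not the repo's actual model code.

```python
import torch.nn as nn

# Only the flag name use_lstm comes from the branch's config.py;
# this dispatch is an illustrative sketch, not the repo's model code.
use_lstm = True

def build_encoder(emb_dim=128, hidden=256):
    if use_lstm:
        # bidirectional LSTM: two directions of hidden // 2 concatenate to hidden
        return nn.LSTM(emb_dim, hidden // 2,
                       batch_first=True, bidirectional=True)
    layer = nn.TransformerEncoderLayer(d_model=emb_dim, nhead=4,
                                       batch_first=True)
    return nn.TransformerEncoder(layer, num_layers=2)
```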

Thanks! And I just realized we can merge the branches; it's only readme.md that differs (and some config stuff).

I will merge it after testing the transformer encoder.

Thanks, it seems to be OK with my current testing. I'll let you know once training is done. Btw, have you heard of tqdm? It could help reduce some of the logging/timing code, as in the sketch below.
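A minimal sketch of the idea (the loop body and iteration count are placeholders, not the repo's actual training code):

```python
from tqdm import tqdm

def train_one_batch():
    return 0.0  # placeholder for the real training step

# tqdm wraps the iterator and renders a live progress bar with rate and ETA
for it in tqdm(range(1000), desc="training"):
    loss = train_one_batch()
    if it % 100 == 0:
        tqdm.write(f"iter {it}: loss {loss:.4f}")  # log without breaking the bar
```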

So I left it running overnight on a Titan X and it just printed this line a bunch: INFO:tensorflow:Bucket queue size: 100, Input queue size: 800. Is that normal? No loss or accuracy printed anywhere.

You can start TensorBoard (e.g. tensorboard --logdir=<your log directory>) and watch the learning curve.

Thanks! This is currently what I get, I guess it's working OK?

[TensorBoard screenshot of the training curve]

Btw, do you know how long your 500k iterations took? I'm worried the training is taking too long right now (it seems like 1 day for just 5k).

Hey @atulkum, is there any update on this?

1 day for 5k is too long; on my GTX 1070 (8 GB) it took 3 days to train 500k iterations.
By the way, if you want, you can try the transformer encoder. It might improve the speed, but you might need to fine-tune the parameters. This is the relevant set of parameters:

https://github.com/atulkum/pointer_summarizer/blob/transformer_encoder/training_ptr_gen/model.py#L53

I am thinking about the idea that you can replace an LSTM with a transformer encoder. I already got similar results for NER on the CoNLL-2003 dataset:

https://github.com/atulkum/sequence_prediction/blob/master/neural_ner/model_tx.py
https://github.com/atulkum/sequence_prediction/blob/master/neural_ner/model_lstm.py
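For a rough sense of what the swap looks like, here is a minimal sketch of the shape equivalence in PyTorch. The dimensions, head counts, and layer counts are placeholders, not the settings from either repo, and positional encodings are omitted for brevity.

```python
import torch
import torch.nn as nn

emb_dim, hidden, batch, seq_len = 128, 256, 8, 40

# BiLSTM encoder: two directions of hidden // 2 concatenate to hidden
lstm = nn.LSTM(emb_dim, hidden // 2, batch_first=True, bidirectional=True)

# Transformer encoder producing the same output width; a linear
# projection lifts the embeddings to d_model = hidden first
proj = nn.Linear(emb_dim, hidden)
layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=4,
                                   dim_feedforward=512, batch_first=True)
tx = nn.TransformerEncoder(layer, num_layers=2)

x = torch.randn(batch, seq_len, emb_dim)
out_lstm, _ = lstm(x)     # (8, 40, 256)
out_tx = tx(proj(x))      # (8, 40, 256)
print(out_lstm.shape, out_tx.shape)
```

Because both encoders map (batch, seq_len, emb_dim) to (batch, seq_len, hidden), the downstream attention and decoder code can stay largely unchanged.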

Let me know if you are interested in running these experiments.