google-deepmind / dnc

A TensorFlow implementation of the Differentiable Neural Computer.

Simple comparisons with other rnn core models

AjayTalati opened this issue · comments

Hi Jack @dm-jrae,

on the front page it says,

More generally, the DNC class found within dnc.py can be used as a standard TensorFlow rnn core and unrolled with TensorFlow rnn ops, such as tf.nn.dynamic_rnn on any sequential task.

It'd be fun if you could demonstrate this for something very simple, like the basic TF RNN tutorial. I guess the quickest way to start using the DNC is to drop it into simple, familiar applications.

Thanks, Aj

Hi Aj, there are no current plans to add a suite of demos. However, as you say, the DNC core adheres to the TF RNN interface, so you should be able to pick up any task's training script and get going, or pick up other RNNs and compare them to the DNC. Contributions are welcome if you want to add a comparison script for a given task of interest.

Hi Jack @dm-jrae,

I think I'm starting to get a feel for how to use the DNC; I've written a simpler, more basic implementation :).

Now I'd like to try to reproduce the parts of the paper that learnt graph representations, in particular:

In the paper, we showed that a DNC can learn on its own to write down a description of an arbitrary graph

Can you recommend a simple dataset to start with, please?

It doesn't have to be the London Underground or a family tree; anything small, open-source, and in Python that you guys know the DNC can learn would be a really massive help 👍

If there's nothing off the shelf in Python, I could try generating sample networks in R and then porting them over to numpy. Sorry for the odd question; I'd be happy to contribute this task if I can get it to work.

Thanks a lot,

Ajay

Hi @jingweiz,

I wonder if you're interested in reproducing the graph representation tasks (and the subsequent querying) from the paper?

Oh dear, I made a bit of a boo-boo 👎

On page 9 of the paper, the section Graph Task Descriptions -> Random Graph Generation explains how to generate the planar graphs.

I think the same method is used in a Google Brain paper I read on combinatorial optimization, so I guess you DM/Google guys use this as one of your standard task generators?
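In case it helps anyone starting from scratch, here's a pure-Python sketch of generating small random geometric graphs. This assumes a simple k-nearest-neighbour connection rule, which is one common recipe; check page 9 of the paper for the exact procedure they used.

```python
import math
import random

def random_geometric_graph(num_nodes=10, k=3, seed=0):
    """Sample node positions uniformly in the unit square and connect
    each node to its k nearest neighbours (undirected edges).

    NOTE: a hedged sketch of one common graph-generation recipe, not
    necessarily the exact procedure from the DNC paper.
    """
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(num_nodes)]
    edges = set()
    for i, p in enumerate(points):
        # Distances from node i to every other node, nearest first.
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))  # store undirected edge once
    return points, sorted(edges)

points, edges = random_geometric_graph()
```

The `(points, edges)` pair can then be serialised into whatever triple format your task encoder expects.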

@AjayTalati
Oh hey, sorry I just saw your message! I am indeed currently implementing the NTM and DNC in pytorch. I'm done with the NTM and am finishing up the DNC; at the moment I only have the copy and repeat_copy tasks, and I will make the code public very soon. I'm very excited about the external memory idea, would definitely like to have more tasks, and would be very happy to cooperate :) So you said you also have an implementation already, right? Which framework do you use?

Hi @jingweiz

thanks for the offer of co-operation, that's really cool, thank you :)

My implementation in pytorch is a simplified version of

https://github.com/ypxie/pytorch-NeuCom

So far I'm really just getting used to how it works. It doesn't seem to scale too well for large inputs, but I guess I need to implement sparse reads and writes? I think external memory is very interesting too; the capacity seems very promising, and I hope it will learn faster than an LSTM?
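For reference on why dense reads get expensive, here's a small numpy sketch of the content-based read weighting from the paper: a softmax over cosine similarities against every memory slot, so each read costs O(N * W) for N slots of width W. This is an illustrative sketch, not this repo's actual code.

```python
import numpy as np

def content_weighting(memory, key, beta):
    """Content-based weighting as in the DNC paper: softmax over cosine
    similarities between the key and each memory row, sharpened by the
    strength beta. Touching every row is the O(N * W) cost that sparse
    read/write schemes aim to avoid for large memories."""
    eps = 1e-8  # guard against zero-norm rows
    sims = memory @ key / (
        np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + eps
    )
    scores = beta * sims
    scores = scores - scores.max()  # numerical stability for softmax
    w = np.exp(scores)
    return w / w.sum()

memory = np.random.RandomState(0).randn(16, 8)  # N=16 slots, width W=8
w = content_weighting(memory, key=memory[3], beta=10.0)  # peaks at slot 3
```

With the key taken from slot 3 itself, the weighting concentrates on that slot, which is the lookup behaviour the read heads rely on.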

To be honest, I'm only working on very simple applications at the moment, like basic time-series sequence prediction, but if I get promising results I'll be happy to move on to the more complicated tasks and can offer you assistance :)

Thanks a lot for the reply 👍

Cheers,

Ajay

Hey @AjayTalati
Their Figure 4 shows that sparsity does not seem to make much of a difference performance-wise. As for dealing with large inputs, I'm not sure, but maybe it's worth a try!
As for the LSTM, I think the NTM paper pretty much already shows that external memory performs better and learns faster.
And thanks for the reply! Good luck and have fun with all the implementations:D

Thanks @jingweiz,

I'm running some reasonably large time-series and language-model experiments; I'll update you when I get some conclusive results.

Looking forward to your implementation - I think for RL the DNC shows a lot of promise 👍 - best of luck 🥇