jcjohnson / torch-rnn

Efficient, reusable RNNs and LSTMs for torch

Two questions about sampling

Wrongful opened this issue

After finally getting this program working, I've been having a lot of fun with it. Thank you so much for this, @jcjohnson! However, I have two questions that I can't seem to figure out.

My first question is about the -start_text flag. I have managed to get it working, but so far it only lets me use one word. Adding quotation marks could probably fix this. However, I also read in this blog post about pre-seeding, which means starting with a whole paragraph of text before the actual seed. How would I accomplish this?

My second question is more of a theory than an issue. If I wanted Torch-RNN to write a poem, I would just have to train it on some poetry. But what if I wanted to have it write a poem on a certain subject? Chances are, there probably won't be too many examples out there to train on. Is it possible for Torch-RNN to use both a "Subject" and "Style" NN at the same time? That way, someone could keep a consistent style while switching out subjects, all without having to re-train everything. I use Ubuntu 16.04 in VirtualBox to run this code, and VBox doesn't let me access my GPU to speed up the training process, so this would really help!

I'm by no means any kind of expert on neural nets - I'm just playing around with this myself, but based on what I've seen, the NN will be based on your training material. So if you train it on Shakespearean text, or Hemingway prose, then it will generate a sample of something that looks like either of those. But is it really in the style of Hemingway or Shakespeare?

The NN will learn the relationships between the words/characters in the input material, which I think equates to style. The context or subject should (must?) be derived from meaning, which requires a deeper, more nuanced understanding of language. I'm not sure that an RNN can "know" what it's generating in a sample. I read about a NN that was trying to generate rap lyrics based on rhyming, and I think they did some work on context that might be interesting to you.

Ah, interesting. I have to go and do things, but I'll check that out later. Thank you!

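Use quotes around the start text for multiple words, e.g. -start_text "In the beginning there was". The same flag covers pre-seeding: pass the whole paragraph, followed by the seed, as one quoted string. torch-rnn has no way to combine a separate "subject" network with a "style" network, so the only way to achieve what you want is to train multiple LSTMs, one for each subject / style you wish to generate, each on a corpus of the appropriate category.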
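
For anyone scripting this, here is a minimal Python sketch of both suggestions: it picks a checkpoint by category and builds the sample.lua argument list with a pre-seed paragraph placed in front of the seed. The CHECKPOINTS mapping, its paths, and the function names are hypothetical placeholders for illustration. The flags -checkpoint, -length, -start_text and -gpu -1 (CPU-only) are the ones I understand sample.lua to accept, so please check them against your copy.

```python
import shlex

# Hypothetical mapping from category to a checkpoint trained on that corpus.
# These paths are placeholders, not files that ship with torch-rnn.
CHECKPOINTS = {
    "sonnets": "cv_sonnets/checkpoint_10000.t7",
    "sea_poems": "cv_sea_poems/checkpoint_10000.t7",
}


def build_sample_args(category, seed, preseed="", length=2000, use_gpu=False):
    """Return the argument list for one torch-rnn sampling run."""
    # One model per subject/style; a KeyError means none was trained for it.
    checkpoint = CHECKPOINTS[category]
    # Pre-seeding: the priming paragraph goes in front of the actual seed,
    # and the combined string is passed as a single -start_text value.
    start_text = f"{preseed.rstrip()} {seed}" if preseed else seed
    args = [
        "th", "sample.lua",
        "-checkpoint", checkpoint,
        "-length", str(length),
        "-start_text", start_text,
    ]
    if not use_gpu:
        # Assumed CPU-only switch, useful inside a VM without GPU access.
        args += ["-gpu", "-1"]
    return args


def to_shell(args):
    """Render the argument list as a copy-pasteable, safely quoted command."""
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    args = build_sample_args(
        "sonnets",
        seed="there was",
        preseed="In the beginning",
    )
    print(to_shell(args))
    # To actually run it from the torch-rnn directory, something like
    # subprocess.run(args, check=True) should work, since passing a list
    # avoids shell quoting problems with multi-word start text.
```

Switching subject or style is then just a matter of passing a different category, at the cost of having trained one checkpoint per category beforehand.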