philipperemy / cond_rnn

Conditional RNNs for Tensorflow / Keras.

Why not merge conditions in only one vector?

haocheng6 opened this issue · comments

This is an inspiring project. Thanks.

I am relatively new to neural networks. I do not see why you use separate dense layers for different conditions. It seems to me that merging the condition vectors into one vector and feeding it into a single dense layer would achieve the same result.

Take predicting air quality as an example. If we have two features that are not time series, say city and number of vehicles, the first will be converted to a one-hot encoding and the second will be represented directly as a number. Can I append the number of vehicles to the city vector and feed the combined vector to a dense layer? That seems natural to me.
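A minimal NumPy sketch of what I mean (the city count, vehicle count, and weights are made-up values for illustration):

```python
import numpy as np

# Hypothetical setup: 3 cities, one-hot encoded, plus a raw vehicle count.
num_cities = 3
city_id = 1            # e.g. the second city
num_vehicles = 42.0

city_onehot = np.zeros(num_cities)
city_onehot[city_id] = 1.0

# Append the scalar feature to the one-hot vector.
x = np.concatenate([city_onehot, [num_vehicles]])  # shape (4,)

# A single dense layer over the combined vector: y = W @ x + b.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))  # 8 hidden units, arbitrary size
b = rng.standard_normal(8)
y = W @ x + b                    # shape (8,)
```

The resulting `y` would then play the role of the initial state of the RNN.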

What are your concerns when you use different dense layers for different conditions?

@BiggerHao yeah that's a good point!

You are totally correct here.

The only thing that would differ is the bias.
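For concreteness, here is a small NumPy check (shapes and weights are arbitrary) that a dense layer applied to the concatenated vector computes the same linear map as two per-condition dense layers whose outputs are summed; only the bias terms differ (one bias vector versus the sum of two):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(3)   # condition 1 (e.g. one-hot city)
b = rng.standard_normal(1)   # condition 2 (e.g. vehicle count)

Wa = rng.standard_normal((5, 3))   # dense layer for condition 1
Wb = rng.standard_normal((5, 1))   # dense layer for condition 2

# Two separate dense layers, outputs summed (biases omitted):
separate = Wa @ a + Wb @ b

# One dense layer on the concatenation, using the block-wise weights:
W = np.concatenate([Wa, Wb], axis=1)     # shape (5, 4)
merged = W @ np.concatenate([a, b])

assert np.allclose(separate, merged)
```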

@philipperemy I get it. Thank you.

Hold on @BiggerHao, city is one-hot encoded, which is a completely different way of representing data, and appending the number of vehicles (which is yet another representation) will make it extra hard for the model to understand the data pattern.

Further, if we expect the model to learn these differences, then why bother merging non-sequential data with sequential data at all? An LSTM can, at the end of the day, learn to differentiate between sequential and non-sequential inputs.

Lastly, I'm not trying to prove anyone wrong; I'm very curious and this is just a point of view. Have a good day, thanks!

@shivam13juna I think using both continuous and one-hot categorical features simultaneously in one neural layer is common practice, and it is not harder than using the two types of features in two separate layers and then merging those layers.

However, I am still not an expert in deep learning, so my view may be biased.

@BiggerHao you said it's common practice. Is there any popular TensorFlow or PyTorch blog where you've seen that? I'm sorry, I've never seen this kind of merging done commonly, but I've mostly worked in the NLP domain, so it may well be possible.

Please let me know if you have seen such cases from any popular (trustworthy) source. Thank you!

@shivam13juna I think I was wrong. 😢 It seems that it would be better to embed categorical features into a continuous space. (See How to combine categorical and continuous input features for neural network training.)
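To sketch that embedding idea (the table sizes here are made up; in Keras this lookup table would be a trainable `Embedding` layer), the one-hot vector is replaced by a learned dense vector indexed by the category id, which is then concatenated with the continuous features:

```python
import numpy as np

num_cities, embed_dim = 100, 8
rng = np.random.default_rng(0)

# A learned embedding table; each row is the vector for one city.
embedding_table = rng.standard_normal((num_cities, embed_dim))

city_id = 17
num_vehicles = 42.0

city_vec = embedding_table[city_id]             # shape (8,)
x = np.concatenate([city_vec, [num_vehicles]])  # shape (9,)
```

The embedding keeps the categorical input in the same continuous space as the other features, instead of mixing a sparse one-hot vector with raw numbers.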

This tutorial from TensorFlow uses many types of features in one input layer, but it seems that its purpose is only demonstration.

Please let me know if you have found any other good resources.