philipperemy / cond_rnn

Conditional RNNs for Tensorflow / Keras.

Why not merge conditions in only one vector?

haocheng6 opened this issue · comments

This is an inspiring project. Thanks.

I am relatively new to neural networks. I do not see why you use separate dense layers for different conditions. It seems to me that merging the condition vectors into one vector and feeding it into a single dense layer would achieve the same result.

Take predicting air quality as an example. If we have two features that are not time series, say city and number of vehicles, the first will be converted to a one-hot encoding and the second will be represented directly as a number. Can I append the number of vehicles to the city vector and feed the combined vector to a dense layer? That seems natural to me.
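A minimal NumPy sketch of what I mean (the city count, vehicle count, and weights are made-up values for illustration):

```python
import numpy as np

# Hypothetical setup: 3 cities, one-hot encoded, plus a raw vehicle count.
num_cities = 3
city_id = 1            # e.g. the second city
num_vehicles = 42.0

city_onehot = np.zeros(num_cities)
city_onehot[city_id] = 1.0

# Append the scalar feature to the one-hot vector.
x = np.concatenate([city_onehot, [num_vehicles]])  # shape (4,)

# A single dense layer over the combined vector: y = W @ x + b.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))  # 8 hidden units, arbitrary size
b = rng.standard_normal(8)
y = W @ x + b                    # shape (8,)
```

The resulting `y` would then play the role of the initial state of the RNN.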

What are your concerns when you use different dense layers for different conditions?

@BiggerHao yeah that's a good point!

You are totally correct here.

The only thing that would differ is the bias.
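For concreteness, here is a small NumPy check (shapes and weights are arbitrary) that a dense layer applied to the concatenated vector computes the same linear map as two per-condition dense layers whose outputs are summed; only the bias terms differ (one bias vector versus the sum of two):

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(3)   # condition 1 (e.g. one-hot city)
b = rng.standard_normal(1)   # condition 2 (e.g. vehicle count)

Wa = rng.standard_normal((5, 3))   # dense layer for condition 1
Wb = rng.standard_normal((5, 1))   # dense layer for condition 2

# Two separate dense layers, outputs summed (biases omitted):
separate = Wa @ a + Wb @ b

# One dense layer on the concatenation, using the block-wise weights:
W = np.concatenate([Wa, Wb], axis=1)     # shape (5, 4)
merged = W @ np.concatenate([a, b])

assert np.allclose(separate, merged)
```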

@philipperemy I get it. Thank you.

Hold on @BiggerHao, city is one-hot encoded, which is a completely different way of representing data, and appending the number of vehicles (which is yet another representation) will make it extra hard for the model to understand the data pattern.

Further, if we expect the model to learn these differences, then why bother merging non-sequential data with sequential data at all? An LSTM can, at the end of the day, learn to differentiate between sequential and non-sequential inputs.

Lastly, I'm not trying to prove anyone wrong; I'm very curious and this is just a point of view. Have a good day, thanks!

@shivam13juna I think using both continuous and one-hot categorical features simultaneously in one neural layer is common practice, and it is not harder than using the two types of features in two separate layers and then merging those layers.

However, I am still not an expert in deep learning, so my view may be biased.

@BiggerHao you said it's common practice. Is there any popular TensorFlow or PyTorch blog where you've seen that? I'm sorry, I've never seen this kind of merging done commonly, but I've mostly worked in the NLP domain, so it may well be possible.

Please let me know if you have seen such cases from any popular (trustworthy) source. Thank you!

@shivam13juna I think I was wrong. 😢 It seems that it would be better to embed categorical features into a continuous space. (See How to combine categorical and continuous input features for neural network training.)
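To sketch that embedding idea (the table sizes here are made up; in Keras this lookup table would be a trainable `Embedding` layer), the one-hot vector is replaced by a learned dense vector indexed by the category id, which is then concatenated with the continuous features:

```python
import numpy as np

num_cities, embed_dim = 100, 8
rng = np.random.default_rng(0)

# A learned embedding table; each row is the vector for one city.
embedding_table = rng.standard_normal((num_cities, embed_dim))

city_id = 17
num_vehicles = 42.0

city_vec = embedding_table[city_id]             # shape (8,)
x = np.concatenate([city_vec, [num_vehicles]])  # shape (9,)
```

The embedding keeps the categorical input in the same continuous space as the other features, instead of mixing a sparse one-hot vector with raw numbers.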

This tutorial from TensorFlow uses many types of features in one input layer, but it seems that its purpose is only demonstration.

Please let me know if you have found any other good resources.