fchollet / deep-learning-with-python-notebooks

Jupyter notebooks for the code samples of the book "Deep Learning with Python"


IMDB example, why we have 1 neuron in the last layer

Kuaranir opened this issue

In the case of the IMDB example, why did we initialize the last layer with only 1 neuron, even though we have two classes, positive and negative reviews:

model.add(layers.Dense(1, activation='sigmoid'))

How are the true labels coded in the example you are referring to? I assume they are coded as 0 or 1, not one-hot as [1, 0] (class 1) or [0, 1] (class 2), right?

Then essentially a single output is enough, because a sigmoid output below 0.5 can be deemed a negative prediction, and one above 0.5 a positive prediction.
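
To make this concrete, here is a minimal, self-contained sketch of that setup. The data below is random dummy input standing in for the book's multi-hot encoded IMDB vectors, not the real dataset:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Dummy stand-ins for the 10,000-dimensional multi-hot review vectors
# and 0/1 labels used in the book's IMDB example.
x = np.random.randint(0, 2, size=(200, 10000)).astype('float32')
y = np.random.randint(0, 2, size=(200,)).astype('float32')

model = keras.Sequential([
    layers.Dense(16, activation='relu'),
    layers.Dense(16, activation='relu'),
    layers.Dense(1, activation='sigmoid'),  # single output: P(class 1)
])
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',  # expects labels coded as 0 or 1
              metrics=['accuracy'])
model.fit(x, y, epochs=1, batch_size=32, verbose=0)

# Each prediction is one probability; threshold at 0.5 to pick the class.
probs = model.predict(x, verbose=0)      # shape (200, 1), values in [0, 1]
labels = (probs > 0.5).astype('int32')   # 1 = positive, 0 = negative
```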

I thought we should set as many neurons in the output layer as there are classes. At least that's what I heard in DL courses...

That's correct in general, but for a binary classification problem, a single output neuron with sigmoid activation is sufficient.

Hi @Kuaranir, for binary classification problems, if you use a sigmoid activation then you use one neuron in the output layer. The reason is that with sigmoid activation your network predicts a single probability: the probability of success (i.e. of class 1). The probability of failure is then simply (1 - probability of success), so a single unit is enough.

If you want to predict a probability for each class explicitly, you can change the activation function to softmax and use 2 units (neurons), as sketched below. Hope that helps :)
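
For comparison, a minimal sketch of that two-unit softmax variant, again with dummy data standing in for the IMDB vectors; note the loss changes to sparse_categorical_crossentropy for integer labels:

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Same dummy multi-hot vectors and integer 0/1 labels as in the sketch above.
x = np.random.randint(0, 2, size=(200, 10000)).astype('float32')
y = np.random.randint(0, 2, size=(200,))

# Two-unit softmax head: one probability per class, summing to 1.
model = keras.Sequential([
    layers.Dense(16, activation='relu'),
    layers.Dense(16, activation='relu'),
    layers.Dense(2, activation='softmax'),  # [P(negative), P(positive)]
])
model.compile(optimizer='rmsprop',
              loss='sparse_categorical_crossentropy',  # integer 0/1 labels
              metrics=['accuracy'])
model.fit(x, y, epochs=1, batch_size=32, verbose=0)

probs = model.predict(x, verbose=0)   # shape (200, 2), rows sum to 1
labels = probs.argmax(axis=-1)        # index of the more probable class
```

Both heads solve the same problem; the sigmoid version is just the softmax version with the redundant second output folded into (1 - p).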

@pkienle thanks)