Why does the cost function not hit zero for the adder when every pair is correct?

Question

hippietrail opened this issue a year ago

I'm trying to understand how the cost function works. I noticed in the adder, where all the numbers are discrete and have exact solutions, that the cost function is still wiggling and nonzero even when every pair of numbers adds to the exact correct solution.

If I understand correctly, the loss is the mean of the squared differences between each actual and expected output, so I would expect it to hit zero. Any ideas what I'm missing?

Alexey Kutepov · Answer 1 · Wed May 24 2023 22:43:12 GMT+0800 (China Standard Time)

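Because the output bits do not have to be exactly 0 or exactly 1. We treat a signal <= 0.5 as a zero and a signal > 0.5 as a one (a completely arbitrary choice). So every decoded sum can be correct while the raw outputs still differ from their targets, which keeps the cost above zero.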
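To make the thresholding point concrete, here is a minimal sketch in plain C. It does not use the library's `MAT_AT` or `NN_OUTPUT` macros; `decode_bits`, `mean_squared_error`, and the `BITS` value of 4 are names and numbers assumed for illustration only, and the cost is taken to be a plain mean of squared differences as described in the question.

```c
#include <stddef.h>

#define BITS 4  /* assumed width for this sketch, not taken from the adder */

/* Read an integer off BITS activations using the > 0.5 threshold. */
static size_t decode_bits(const float out[BITS])
{
    size_t z = 0;
    for (size_t i = 0; i < BITS; ++i) {
        size_t bit = out[i] > 0.5f;
        z |= bit << i;
    }
    return z;
}

/* Mean of squared differences between raw activations and target bits. */
static float mean_squared_error(const float out[BITS], const float target[BITS])
{
    float sum = 0.0f;
    for (size_t i = 0; i < BITS; ++i) {
        float d = out[i] - target[i];
        sum += d * d;
    }
    return sum / BITS;
}
```

With activations `{0.9, 0.1, 0.9, 0.1}` and target bits `{1, 0, 1, 0}`, `decode_bits` gives 5, which is the right answer, while `mean_squared_error` comes out at roughly 0.01 rather than 0.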

Andrew Dunbar · Answer 2 · Thu May 25 2023 12:59:46 GMT+0800 (China Standard Time)

Are we rounding them in the display?
I thought maybe we were taking a float and then doing integer maths on it, and then I noticed that z is declared as a size_t but initialized with a float literal (-:

            size_t z = 0.0f;
            for (size_t i = 0; i < BITS; ++i) {
                size_t bit = MAT_AT(NN_OUTPUT(nn), 0, i) > 0.5;
                z = z|(bit<<i);
            }
            bool overflow = MAT_AT(NN_OUTPUT(nn), 0, BITS) > 0.5;

I wonder if it's ever possible for the cost to be lower for a wrong answer than for a right one? Say, if most outputs are extremely close to their targets but at least one lands on the wrong side of 0.5?
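As far as the arithmetic goes, this does look possible. Below is a small sketch, assuming the cost is a plain mean of squared differences over the output bits and ignoring the overflow output. The `Eval` struct, the `evaluate` helper, and the 4-bit width are hypothetical and exist only for this example; they are not part of the library.

```c
#include <stddef.h>

#define BITS 4  /* assumed width for this sketch, not taken from the adder */

typedef struct {
    size_t decoded;  /* integer read off with the > 0.5 threshold */
    float cost;      /* mean squared error against the expected bits */
} Eval;

/* Decode the activations and score them against the bits of `expected`. */
static Eval evaluate(const float out[BITS], size_t expected)
{
    Eval e = {0, 0.0f};
    for (size_t i = 0; i < BITS; ++i) {
        float target = (float)((expected >> i) & 1);
        float d = out[i] - target;
        e.cost += d * d;
        e.decoded |= (size_t)(out[i] > 0.5f) << i;
    }
    e.cost /= BITS;
    return e;
}
```

For an expected sum of 5, the hesitant outputs `{0.51, 0.49, 0.51, 0.49}` decode to 5 with a cost of about 0.24, while the confident outputs `{1.0, 0.0, 1.0, 0.6}` decode to 13 with a cost of about 0.09. So a wrong answer can score lower than a right one. Whether training ever settles in such a state is a separate question that this sketch does not answer.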