raskr / rust-autograd

Tensors and differentiable operations (like TensorFlow) in Rust

softmax_cross_entropy outputs shape [-1], when it should output shape [-1, 1].

AngelOfSol opened this issue

I was attempting to optimize something with the softmax_cross_entropy loss function, but I kept getting broadcast shape errors, which confused me. After spending some time digging into the code, I realized that the output here doesn't follow the API documentation, which says the op should output a rank-2 tensor; instead it outputs a rank-1 tensor.

fn compute(&self, ctx: &mut crate::op::ComputeContext<T>) -> Result<(), crate::op::OpError> {
    let x = &ctx.input(0);
    let log_x: NdArray<T> = x - &tensor_ops::math_ops::logsumexp_forward(x, 1, true);
    // `t` must be one-hot
    let t = &ctx.input(1);
    assert_eq!(log_x.ndim(), 2, "x must be 2-ranked tensor");
    assert_eq!(t.ndim(), 2, "t must be 2-ranked tensor");
    // - t log x ( =(batch, num_classes))
    let minus_one = T::one().neg();
    // `sum_axis` drops the class axis here, so this first output has
    // shape [batch] (rank 1) rather than the documented [batch, 1]
    ctx.append_output(
        (t * &log_x)
            .sum_axis(ndarray::Axis(1))
            .mapv(move |elem| elem * minus_one),
    );
    ctx.append_output(log_x);
    Ok(())
}
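
For reference, this is just how ndarray's sum_axis behaves: it removes the axis it sums over, so the first appended output above ends up rank-1 even though both inputs are rank-2. A minimal standalone sketch (plain ndarray, independent of the op above) showing the rank change and how an explicit axis restores the documented shape:

use ndarray::{array, Axis};

fn main() {
    // (batch, num_classes) = (2, 3), standing in for t * log_x above
    let t_log_x = array![[0.1_f64, 0.2, 0.3], [0.4, 0.5, 0.6]];

    // sum_axis removes the axis it sums over: shape goes from [2, 3] to [2]
    let per_example = t_log_x.sum_axis(Axis(1));
    assert_eq!(per_example.shape(), &[2]);

    // The documented output shape is [batch, 1]; restoring it needs an
    // explicit reshape or an inserted length-1 axis
    let reshaped = per_example.insert_axis(Axis(1));
    assert_eq!(reshaped.shape(), &[2, 1]);
}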

I fixed it in my local copy by just reshaping the array, which seems to work, but I'm not sure whether this is how the output shape is usually handled in cross entropy implementations.

    let minus_one = T::one().neg();
    let result = (t * &log_x)
        .sum_axis(ndarray::Axis(1))
        .mapv(move |elem| elem * minus_one)
        // reshape the rank-1 [batch] sum back to the documented [batch, 1]
        .into_shape(ndarray::IxDyn(&[log_x.shape()[0], 1]))
        .unwrap();

    assert_eq!(result.ndim(), 2, "result must be 2-ranked tensor");

    ctx.append_output(result);
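
Not required for the fix, but ndarray also has insert_axis, which adds a length-1 axis without the fallible into_shape + unwrap. A rough, untested sketch of the same change using it (same behavior assumed):

    let minus_one = T::one().neg();
    let result = (t * &log_x)
        .sum_axis(ndarray::Axis(1))
        .mapv(move |elem| elem * minus_one)
        // append a trailing length-1 axis: [batch] -> [batch, 1]
        .insert_axis(ndarray::Axis(1));

    assert_eq!(result.ndim(), 2, "result must be 2-ranked tensor");

    ctx.append_output(result);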

@AngelOfSol Could you send a PR? Your fix looks good to me. Thanks!