Do we need a `high-precision` vs `speed` option?
greggman opened this issue · comments
A recent Stack Overflow question asked why Knuth's twoSum was not working in WebGPU.
Here is twoSum in C++:
#include <tuple>

std::tuple<float, float> twoSum(float a, float b) {
  float x = a + b;
  float bv = x - a;
  float av = x - bv;
  float br = b - bv;
  float ar = a - av;
  return { x, ar + br };
}
Calling twoSum(1e10, 1e-10) produces 1e10, 1e-10.
But in WebGPU the result is up to the GPU and the implementation.
Here's the function in WGSL:
struct float2 {
  x: f32,
  y: f32
}
fn twoSum(a: f32, b: f32) -> float2 {
  let x = a + b;
  let bv = x - a;
  let av = x - bv;
  let br = b - bv;
  let ar = a - av;
  return float2(x, ar + br);
}
On my NVIDIA 2070 on Windows 11 I get 1e+10, 1.000000013351432e-10, which I think is correct for f32.
On my M1 Mac I get 1e+10, 0.
This is because Dawn is passing in fastMathEnabled = true. Setting it to false makes the example compute the correct result.
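One plausible way to get a zero error term is sketched below. It assumes that fast math lets the compiler simplify floating-point expressions as if they were real-number algebra. twoSumReassociated is a hypothetical, hand-simplified version written for this illustration. It is not what Dawn or the Metal compiler actually emits.

```cpp
#include <tuple>

// Knuth's twoSum as written in the issue. It relies on every intermediate
// result being rounded to f32 exactly as IEEE 754 specifies.
std::tuple<float, float> twoSum(float a, float b) {
  float x = a + b;
  float bv = x - a;
  float av = x - bv;
  float br = b - bv;
  float ar = a - av;
  return { x, ar + br };
}

// Hypothetical illustration (assumption, not actual compiler output): the
// same function after each step is simplified with real-number algebra.
//   bv = (a + b) - a  ->  b
//   av = x - bv       ->  a
//   br = b - bv       ->  0
//   ar = a - av       ->  0
std::tuple<float, float> twoSumReassociated(float a, float b) {
  return { a + b, 0.0f };
}
```

Compiled without fast-math flags, twoSum(1e10f, 1e-10f) keeps the f32 value of 1e-10 as the error term, while twoSumReassociated returns 0 for it, the same pair seen on the M1. The algorithm only works when the rounding of each intermediate step is preserved.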
Should the user be able to opt into "high-precision" vs "speed" when requesting an adapter, the same way they can opt into "low-power" vs "high-performance"?
Should it instead be a feature, "high-precision", that guarantees precision? Devices and drivers that can't provide the precision would then not advertise the feature.
Should Dawn maybe advertise "low-power" and "high-performance" on Metal and only pass in fast math for "high-performance"? (This seems like a solution that doesn't cover other cases.)
Should it just be documented in the spec that math is not accurate?
Thoughts?
I'll close this. I searched for issues with "precision" and somehow missed those.