Explanation about expd
imartinezl opened this issue · comments
Hi @herumi, 🙂
Could you explain in more detail how the expd function (in fmath.hpp) works? I have tried to follow the flow explained in https://github.com/herumi/fmath/blob/master/algo-ja.md, but it does not clear up my doubts. 😅
Sorry for these questions, but I am not very familiar with unions and bit operations, so this expd function is difficult for me to unfold. From what I have been able to understand, first you store the powers of two with exponents between 0 and 1 (values from 2^0 to 2^1) in a lookup table (ExpdVar c.tbl).
Lines 177 to 182 in 0a10069
Then, this lookup table is used in the expd function. Let me know if this is correct.
However, I was not able to follow the rest of the operations. More specifically:
- What is the purpose of the variable b = 3ULL << 51?
- Why do you calculate di.d = x * c.a + b ?
- What does the variable iax represent?
- Should the value of t not always be zero? I suppose this has something to do with floating-point numbers, since the equation, with real numbers, should simplify to zero.
- What does the variable u represent? This computation is quite hard to understand (to a newbie like me)😭
- Finally, I suppose the value of y is the evaluation of a polynomial, but I do not know what it is exactly representing.
- I also have no idea about the final two operations (the binary OR and the product of y with di.d).
Lines 474 to 484 in 0a10069
I would appreciate it very much if you could describe the overall picture for computing expd and, if possible 🙏, give a more detailed breakdown of each line in expd.
Thank you very much for your time.
For any x = s + t, exp(x) = exp(s)exp(t).
Suppose that exp(s) can be computed by a table lookup and exp(t) with a small t is computed by a Maclaurin series.
I want to compute exp(t) by 1 + t + t^2/2 + t^3/6.
The resolution of double is about 1e-16, so I expect that |t| must be smaller than 1/2^12 = 1/4096: the first omitted term t^4/24 is then below about 1.5e-16.
Let x' = x * a. Split x' = n + t where n = round(x') is an integer and t is a fraction (|t| <= 1/2).
Then x = (n/a) + (t/a).
If a = 2048/log(2) ~ 2954.6... then |t/a| <= 1/(2a) ~ 1/5909, which is smaller than 1/4096.
exp(n/a) = exp(n/2048 * log(2)) = 2^(n/2048) = 2^q * 2^(r/2048) where q = floor(n/2048), r = n mod 2048 (0 <= r < 2048).
2^q can be computed by a bit shift into the exponent field, so we apply a table lookup to 2^(r/2048).
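The decomposition above can be sketched in plain C++ before any bit tricks are involved. This is only an illustration, not the code in fmath.hpp: the function name `exp_split_sketch` is hypothetical, `std::nearbyint` stands in for the rounding step, `std::exp2` stands in for the table lookup, and `std::ldexp` stands in for the bit shift. It assumes the default round-to-nearest mode and does no overflow or underflow handling.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of fmath): exp(x) = 2^q * 2^(r/2048) * exp(s),
// where x * a = n + t, s = t / a, q = floor(n / 2048), r = n mod 2048.
double exp_split_sketch(double x) {
    assert(std::fabs(x) < 700.0);               // sketch only: no range handling
    const double a = 2048.0 / std::log(2.0);    // ~2954.6
    const double n = std::nearbyint(x * a);     // n = round(x')
    const double s = x - n / a;                 // s = t / a, |s| <= 1 / (2a)
    const double q = std::floor(n / 2048.0);    // integer part of n / 2048
    const double r = n - 2048.0 * q;            // 0 <= r < 2048
    const double pow2r = std::exp2(r / 2048.0); // stand-in for the table lookup
    const double poly = 1.0 + s + s * s / 2.0 + s * s * s / 6.0; // Maclaurin cubic
    return std::ldexp(pow2r * poly, static_cast<int>(q));        // multiply by 2^q
}
```

Under these assumptions the result should agree with `std::exp` to roughly double precision for moderate x, because |s| stays below 1/4096.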
How to compute round(x').
There are some ways.
1. FPU
2. cvtsd2si (SSE)
3. roundpd (SSE4.1/AVX)
4. vrndscaleps (AVX-512)
I selected 1. because I programmed it a very long time ago.
The format of double is sign (1 bit) + exponent (11 bits) + fraction (52 bits).
See https://en.wikipedia.org/wiki/Double-precision_floating-point_format .
If a large value is added to x, the fractional part of x is rounded off.
For example,
12.25 + 2^52 = 2^52 * (1 + 2.72...*10^(-15))
~ 2^52 * (1 + 12/2^52) = 2^52 + 12 after rounding to double.
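Add b = 2^52 + 2^51 = 3 * 2^51 (written 3ULL << 51 in the code) instead of 2^52 so that the trick also works when x is negative: the sum x + b then stays inside [2^52, 2^53), where the spacing between doubles is exactly 1.
This is the first magic number.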
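A small experiment makes the effect of this magic number visible. The sketch below is an illustration under stated assumptions, not the fmath code: the names `round_magic` and `low11_after_shift` are hypothetical, `std::memcpy` is used instead of a union to read the bits, and it assumes IEEE 754 doubles, round-to-nearest mode, |x| < 2^51, and a build without `-ffast-math` (which could fold `x + b - b` back into `x`).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: round x to the nearest integer by adding and
// subtracting b = 2^52 + 2^51.
double round_magic(double x) {
    assert(std::fabs(x) < std::ldexp(1.0, 51)); // keeps x + b inside [2^52, 2^53)
    const double b = static_cast<double>(3ULL << 51);
    const double shifted = x + b;   // the fraction of x is rounded away here
    return shifted - b;             // exact: both operands are integers
}

// Hypothetical helper: the low 11 bits of the bit pattern of x + b.
// They equal round(x) mod 2048, also for negative x.
std::uint64_t low11_after_shift(double x) {
    const double b = static_cast<double>(3ULL << 51);
    const double shifted = x + b;
    std::uint64_t bits;
    std::memcpy(&bits, &shifted, sizeof bits);  // read the raw 64 bits
    return bits & 2047;
}
```

For instance, `round_magic(12.25)` returns 12.0, `low11_after_shift(12.25)` returns 12, and `low11_after_shift(-3.0)` returns 2045, which is -3 mod 2048. The rounded integer therefore sits directly in the low bits of the fraction field, ready to be used as a table index.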
To be continued...
To extract the fraction bits of a double value, use a union.
    union di {
        double d;    // the floating-point value
        uint64_t i;  // the same 64 bits viewed as an integer
    };
tbl[di.i & 2047] gives 2^(r/2048), with r = di.i & 2047, by a table lookup.
Since 0 <= r < 2048, this value lies in [1, 2), so its exponent field is constant and only the fraction part is stored.
    for (int i = 0; i < s; i++) {
        di di;
        di.d = ::pow(2.0, i * (1.0 / s));
        tbl[i] = di.i & mask64(52); // here
    }
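Putting the pieces together, the following is a minimal sketch of how the table, the magic number, and the final bitwise OR might combine. It is written from the explanation in this thread and is not a copy of expd in fmath.hpp: every name in the `sketch` namespace is hypothetical, `std::memcpy` replaces the union, the polynomial is the plain Maclaurin cubic, and inputs are assumed to satisfy |x| < 700 so that the exponent field never overflows.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <vector>

namespace sketch {  // all names here are hypothetical, for illustration only

constexpr int kSbit = 11;          // log2 of the table size
constexpr int kSize = 1 << kSbit;  // 2048 entries

std::uint64_t to_bits(double d) {
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return u;
}

double from_bits(std::uint64_t u) {
    double d;
    std::memcpy(&d, &u, sizeof d);
    return d;
}

// tbl[i] holds only the 52 fraction bits of 2^(i/2048), which lies in [1, 2).
std::vector<std::uint64_t> make_table() {
    std::vector<std::uint64_t> tbl(kSize);
    const std::uint64_t mask52 = (1ULL << 52) - 1;
    for (int i = 0; i < kSize; i++) {
        tbl[i] = to_bits(std::pow(2.0, i * (1.0 / kSize))) & mask52;
    }
    return tbl;
}

double exp_table_sketch(double x) {
    assert(std::fabs(x) < 700.0);  // sketch only: no overflow/underflow handling
    static const std::vector<std::uint64_t> tbl = make_table();
    const double a = kSize / std::log(2.0);
    const double b = static_cast<double>(3ULL << 51);

    const double shifted = x * a + b;  // low bits now hold n = round(x * a)
    const std::uint64_t bits = to_bits(shifted);
    const std::uint64_t frac = tbl[bits & (kSize - 1)];  // fraction of 2^(r/2048)

    const double n = shifted - b;  // n as a double
    const double s = x - n / a;    // small remainder, |s| <= 1 / (2a)

    // Adding 1023 << 11 and shifting right by 11 leaves q + 1023 in the low
    // bits; shifting left by 52 moves it into the exponent field and pushes
    // the remaining higher bits out of the 64-bit word.
    const std::uint64_t expo = ((bits + (1023ULL << kSbit)) >> kSbit) << 52;

    const double poly = 1.0 + s + s * s / 2.0 + s * s * s / 6.0;
    return poly * from_bits(expo | frac);  // 2^q * 2^(r/2048) * exp(s)
}

}  // namespace sketch
```

The bitwise OR assembles a double whose exponent field encodes 2^q and whose fraction comes from the table, so `from_bits(expo | frac)` equals 2^q * 2^(r/2048) without any multiplication. The sign of the remainder and the exact polynomial coefficients in the real implementation may differ from this sketch.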
By the way, a table lookup is inconvenient for SIMD, so I think that the algorithm in https://github.com/herumi/simdgen/blob/main/algorithm.md is a better one.
Hi @herumi !🤗
Thank you so much for your rapid and detailed answer. 👏👏👏
After a thorough review, I think I comprehended all of the information you provided in the comments.
For the time being, all of my concerns have been addressed; if I have any further questions, it would be great to contact you again.
Thanks!😊