Which expressions are the set of equivalent expressions generated by Herbie for the input expression?

Question

TimoTokki opened this issue 2 years ago · comments

@pavpanchekha
Hello, sorry to bother you again with this question.
I am currently working on generating equivalent expressions, so I would like to compare the set of equivalent expressions produced by my method with the set produced by Herbie.

After your last reply I still have some doubts: I am not sure which expressions make up the set of equivalent expressions that Herbie generates for the input expression.

Take sqroot from FPBench as an example.
The expression is:
(-
(+ (- (+ 1 (* 1/2 x)) (* (* 1/8 x) x)) (* (* (* 1/16 x) x) x))
(* (* (* (* 5/128 x) x) x) x))
The interval is [0,1].
The details are available at sqroot-herbie

If you want to see all of the equivalent expressions Herbie considered, you'll need to click the "Metrics" link in the top right of a report. The blocks labeled "Simplify" and "Rewrite" have data labeled "Calls", and if you expand that you'll see all of the outputs (equivalent expressions) and the inputs (seed expressions) they were generated from.

Following your hint, I examined the "Calls" data in the Simplify and Rewrite blocks of Metrics. However, the inputs listed there are not always the input expression sqroot; some appear to be subexpressions of it. As a result, I cannot tell which of the outputs are the equivalent expressions that Herbie generates for the input expression itself, i.e., sqroot.

Here is my understanding of what is in Metrics.
I'm not sure I understand it correctly, so please don't hesitate to correct me if there are mistakes.

Simplify 1st

For the first simplify step, I guess that the input is the input expression sqroot and the outputs are its 6 equivalent expressions.

Inputs
(-.f64 (+.f64 (-.f64 (+.f64 1 (*.f64 1/2 x)) (*.f64 (*.f64 1/8 x) x)) (*.f64 (*.f64 (*.f64 1/16 x) x) x)) (*.f64 (*.f64 (*.f64 (*.f64 5/128 x) x) x) x))

Outputs
(-.f64 (+.f64 (-.f64 (+.f64 1 (*.f64 1/2 x)) (*.f64 (*.f64 1/8 x) x)) (*.f64 (*.f64 (*.f64 1/16 x) x) x)) (*.f64 (*.f64 (*.f64 (*.f64 5/128 x) x) x) x))
(+.f64 (+.f64 1 (-.f64 (*.f64 1/2 x) (*.f64 x (*.f64 x 1/8)))) (-.f64 (*.f64 x (*.f64 x (*.f64 x 1/16))) (*.f64 x (*.f64 x (*.f64 x (*.f64 x 5/128))))))
(+.f64 1 (+.f64 (*.f64 x (-.f64 1/2 (*.f64 x 1/8))) (*.f64 (*.f64 x x) (-.f64 (*.f64 x 1/16) (*.f64 x (*.f64 x 5/128))))))
(+.f64 (fma.f64 x (+.f64 1/2 (*.f64 x -1/8)) 1) (*.f64 (pow.f64 x 3) (-.f64 1/16 (*.f64 x 5/128))))
(fma.f64 x (+.f64 1/2 (*.f64 x (+.f64 -1/8 (*.f64 x (+.f64 1/16 (*.f64 x -5/128)))))) 1)
(fma.f64 x (+.f64 1/2 (*.f64 x (+.f64 1/8 (*.f64 x (+.f64 1/16 (*.f64 x 5/128)))))) 1)
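
One way to sanity-check whether two of the listed expressions are equal over the reals (as opposed to in floating point) is to evaluate both with exact rational arithmetic. The following is a minimal sketch in Python, not anything Herbie itself runs: the function names `original` and `horner` are made up for illustration, `fma(a, b, c)` is read as `a*b + c` over the reals, and only the input and the nested-fma output with negative coefficients are compared.

```python
from fractions import Fraction as F


def original(x):
    # The input of the first Simplify call, evaluated exactly over the rationals.
    return (((1 + F(1, 2) * x) - (F(1, 8) * x) * x)
            + ((F(1, 16) * x) * x) * x) - (((F(5, 128) * x) * x) * x) * x


def horner(x):
    # (fma x (+ 1/2 (* x (+ -1/8 (* x (+ 1/16 (* x -5/128)))))) 1),
    # reading fma(a, b, c) as a*b + c with no rounding.
    return x * (F(1, 2) + x * (F(-1, 8) + x * (F(1, 16) + x * F(-5, 128)))) + 1


# Eleven exact sample points in the interval [0, 1].
samples = [F(k, 10) for k in range(11)]
all_equal = all(original(x) == horner(x) for x in samples)
print(all_equal)
```

Both expressions are polynomials of degree 4, so exact agreement at more than four distinct points means they are the same polynomial. This check says nothing about how the two forms round in f64.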

Rewrite 1st

Then, the inputs in the rewrite step contain 4 expressions.

Inputs
(*.f64 (*.f64 5/128 x) x)
(*.f64 (*.f64 (*.f64 1/16 x) x) x)
(*.f64 (*.f64 (*.f64 (*.f64 5/128 x) x) x) x)
(*.f64 (*.f64 (*.f64 5/128 x) x) x)

These appear to be parts of the input expression, i.e. its subexpressions, though I'm not sure I understand this correctly.
The outputs are 4 long entries, each made up of a series of items shaped like (#(struct:change # (2) ((x +.f64 0 (*.f64 5/128 (*.f64 x x)))))), which I guess correspond to the 4 expressions in the inputs, one entry per input.

Since these entries are very long, I take the first one as an example; after some processing it expands as follows.

No.	Expression
1	(x +.f64 0 (*.f64 5/128 (*.f64 x x))))
2	(x +.f64 (log.f64 (pow.f64 (cbrt.f64 (pow.f64 (exp.f64 x) (*.f64 5/128 x))) 2)) (log.f64 (cbrt.f64 (pow.f64 (exp.f64 x) (*.f64 5/128 x))))))
3	(x +.f64 (log.f64 (sqrt.f64 (pow.f64 (exp.f64 x) (*.f64 5/128 x)))) (log.f64 (sqrt.f64 (pow.f64 (exp.f64 x) (*.f64 5/128 x))))))
4	(x -.f64 (exp.f64 (log1p.f64 (*.f64 5/128 (*.f64 x x)))) 1))
5	(x pow.f64 (*.f64 5/128 (*.f64 x x)) 1))
6	(x pow.f64 (*.f64 25/16384 (pow.f64 x 4)) 1/2))
7	(x pow.f64 (pow.f64 (*.f64 (sqrt.f64 5/128) x) 6) 1/3))
8	(x pow.f64 (cbrt.f64 (*.f64 5/128 (*.f64 x x))) 3))
9	(x pow.f64 (*.f64 (sqrt.f64 5/128) x) 2))
10	(x pow.f64 (exp.f64 1) (log.f64 (*.f64 5/128 (*.f64 x x)))))
11	(x pow.f64 (exp.f64 (pow.f64 (cbrt.f64 (log.f64 (*.f64 5/128 (*.f64 x x)))) 2)) (cbrt.f64 (log.f64 (*.f64 5/128 (*.f64 x x))))))
12	(x pow.f64 (exp.f64 (sqrt.f64 (log.f64 (*.f64 5/128 (*.f64 x x))))) (sqrt.f64 (log.f64 (*.f64 5/128 (*.f64 x x))))))
13	(x sqrt.f64 (*.f64 25/16384 (pow.f64 x 4))))
14	(x log.f64 (pow.f64 (exp.f64 x) (*.f64 5/128 x))))
15	(x log.f64 (+.f64 1 (expm1.f64 (*.f64 5/128 (*.f64 x x))))))
16	(x cbrt.f64 (pow.f64 (*.f64 (sqrt.f64 5/128) x) 6)))
17	(x log1p.f64 (expm1.f64 (*.f64 5/128 (*.f64 x x)))))
18	(x exp.f64 (log.f64 (*.f64 5/128 (*.f64 x x)))))
19	(x exp.f64 (*.f64 (log.f64 (*.f64 5/128 (*.f64 x x))) 1)))
20	(x exp.f64 (*.f64 (log.f64 (pow.f64 (*.f64 (sqrt.f64 5/128) x) 6)) 1/3)))
21	(x exp.f64 (*.f64 (log.f64 (cbrt.f64 (*.f64 5/128 (*.f64 x x)))) 3)))
22	(x exp.f64 (*.f64 (log.f64 (*.f64 (sqrt.f64 5/128) x)) 2)))
23	(x exp.f64 (*.f64 (*.f64 (log.f64 (*.f64 5/128 (*.f64 x x))) 1) 1)))
24	(x exp.f64 (+.f64 (*.f64 (log.f64 x) 1) (log.f64 (*.f64 5/128 x)))))
25	(x exp.f64 (+.f64 (log.f64 x) (*.f64 (log.f64 (*.f64 5/128 x)) 1))))
26	(x exp.f64 (+.f64 (*.f64 (log.f64 x) 1) (*.f64 (log.f64 (*.f64 5/128 x)) 1))))
27	(x exp.f64 (+.f64 (*.f64 (log.f64 (*.f64 5/128 x)) 1) (log.f64 x))))
28	(x exp.f64 (+.f64 (log.f64 (*.f64 5/128 x)) (*.f64 (log.f64 x) 1))))
29	(x exp.f64 (+.f64 (*.f64 (log.f64 (*.f64 5/128 x)) 1) (*.f64 (log.f64 x) 1))))
30	(x expm1.f64 (log1p.f64 (*.f64 5/128 (*.f64 x x))))

From the above, I think each entry in the outputs holds the equivalent expressions that Herbie generated for the corresponding expression in the inputs.
But since those inputs are not the input expression itself, I still don't know how to obtain the equivalent expressions of sqroot.

The last simplify

The inputs and outputs are shown below:

Inputs
(-.f64 (+.f64 (-.f64 (+.f64 1 (*.f64 1/2 x)) (*.f64 (*.f64 1/8 x) x)) (log.f64 (+.f64 1 (expm1.f64 (*.f64 1/16 (pow.f64 x 3)))))) (*.f64 (*.f64 (*.f64 (*.f64 5/128 x) x) x) x))
(-.f64 (+.f64 (-.f64 (+.f64 1 (*.f64 1/2 x)) (*.f64 (*.f64 1/8 x) x)) (*.f64 (*.f64 (*.f64 1/16 x) x) x)) (*.f64 (*.f64 (*.f64 (*.f64 5/128 x) x) x) x))

Outputs
(-.f64 (+.f64 (-.f64 (+.f64 1 (*.f64 1/2 x)) (*.f64 (*.f64 1/8 x) x)) (log.f64 (+.f64 1 (expm1.f64 (*.f64 1/16 (pow.f64 x 3)))))) (*.f64 (*.f64 (*.f64 (*.f64 5/128 x) x) x) x))
(-.f64 (+.f64 (-.f64 (+.f64 1 (*.f64 1/2 x)) (*.f64 x (*.f64 x 1/8))) (log.f64 (+.f64 1 (expm1.f64 (*.f64 1/16 (pow.f64 x 3)))))) (*.f64 x (*.f64 x (*.f64 x (*.f64 x 5/128)))))
(-.f64 (+.f64 (+.f64 (+.f64 1 (*.f64 1/2 x)) (*.f64 x (*.f64 x -1/8))) (log.f64 (+.f64 1 (expm1.f64 (*.f64 1/16 (pow.f64 x 3)))))) (*.f64 x (*.f64 x (*.f64 x (*.f64 x 5/128)))))
(+.f64 (+.f64 (+.f64 (+.f64 1 (*.f64 1/2 x)) (*.f64 x (*.f64 x -1/8))) (log.f64 (+.f64 1 (expm1.f64 (*.f64 1/16 (pow.f64 x 3)))))) (*.f64 x (*.f64 x (*.f64 x (*.f64 x -5/128)))))
(-.f64 (+.f64 (-.f64 (+.f64 1 (*.f64 1/2 x)) (*.f64 (*.f64 1/8 x) x)) (*.f64 (*.f64 (*.f64 1/16 x) x) x)) (*.f64 (*.f64 (*.f64 (*.f64 5/128 x) x) x) x))
(-.f64 (+.f64 (-.f64 (+.f64 1 (*.f64 1/2 x)) (*.f64 x (*.f64 x 1/8))) (*.f64 x (*.f64 x (*.f64 x 1/16)))) (*.f64 x (*.f64 x (*.f64 x (*.f64 x 5/128)))))
(-.f64 (+.f64 (+.f64 (+.f64 1 (*.f64 1/2 x)) (*.f64 x (*.f64 x -1/8))) (*.f64 x (*.f64 x (*.f64 x 1/16)))) (*.f64 x (*.f64 x (*.f64 x (*.f64 x 5/128)))))
(+.f64 (+.f64 (+.f64 (+.f64 1 (*.f64 1/2 x)) (*.f64 x (*.f64 x -1/8))) (*.f64 x (*.f64 x (*.f64 x 1/16)))) (*.f64 x (*.f64 x (*.f64 x (*.f64 x -5/128)))))

Here I don't quite understand why Herbie performed the last simplify step, or how the result was picked from the 8 expressions in the outputs: the first expression in the inputs seems to be the final optimized result reported by Herbie, while the second is its Alternative 1.
My other question is whether the outputs of the last simplify step contain all equivalent expressions of the input expression.

Thank you very much!

Originally posted by @TimoTokki in #503 (comment)

Brett Saiki · Answer 1 · Thu Oct 20 2022 23:44:58 GMT+0800 (China Standard Time)

@TimoTokki The expressions you are listing in the screenshots and tables are not necessarily generated directly from the input expression. Herbie maintains a set of candidate expressions, iteratively selects subexpressions from that set that it believes need rewriting, rewrites them in various phases (the timeline lists these as rewrite, taylor, simplify), and then updates the candidate set. The set of candidates after iterative improvement has finished is listed in the first "regimes" phase.

The final set of expressions is listed in the report, both at the top and in the "Alternatives" section.
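
To make the shape of that loop concrete, here is a toy sketch in Python. It is not Herbie's code or API: the helpers `subexprs`, `replace_at`, `rewrite` and `improve` are hypothetical, expressions are plain nested tuples, there is a single illustrative rule, and every subexpression is tried rather than only those believed to need rewriting. The point is only that rewrites happen on subexpressions and the results are substituted back into whole candidates.

```python
# Toy model of "pick subexpression, rewrite it, update the candidate set".
# All names here are hypothetical; this is not how Herbie is implemented.

def subexprs(e, path=()):
    """Yield (path, subexpression) for every node of a nested-tuple expression."""
    yield path, e
    if isinstance(e, tuple):
        for i, arg in enumerate(e[1:], start=1):
            yield from subexprs(arg, path + (i,))


def replace_at(e, path, new):
    """Return a copy of e with the node at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return e[:i] + (replace_at(e[i], path[1:], new),) + e[i + 1:]


def rewrite(e):
    """One illustrative rule: (* (* a x) x) -> (* a (* x x))."""
    if (isinstance(e, tuple) and e[0] == "*" and isinstance(e[1], tuple)
            and e[1][0] == "*" and e[1][2] == e[2]):
        return [("*", e[1][1], ("*", e[2], e[2]))]
    return []


def improve(start, iterations=2):
    """Grow a candidate set by rewriting subexpressions of each candidate."""
    candidates = {start}
    for _ in range(iterations):
        new = set()
        for cand in candidates:
            for path, sub in subexprs(cand):
                for alt in rewrite(sub):
                    # Substitute the rewritten subexpression back into the whole.
                    new.add(replace_at(cand, path, alt))
        candidates |= new
    return candidates


term = ("*", ("*", "5/128", "x"), "x")
whole = ("-", "1", term)
result = improve(whole)
for cand in sorted(result, key=str):
    print(cand)
```

In this sketch the rewrite input is the subexpression `term`, yet the candidate set holds whole expressions, which is why the inputs shown under "Calls" need not be the original input expression.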

As a more pedantic note, these expressions are not necessarily "equivalent", as you say, which may make comparison difficult. Some candidates are created via algebraic rewrites that are indeed equivalent on the reals. However, others are approximations.
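
Even for rewrites that are equivalent on the reals, the floating-point results can differ, which matters when comparing sets of expressions. A small sketch in Python (using only the standard `math` module; the function names are made up for illustration) compares `(* 1/16 (pow x 3))` with the real-equivalent form `(log (+ 1 (expm1 ...)))` that appears in the last simplify step, and with a `log1p` variant:

```python
import math


def direct(x):
    # (* 1/16 (pow x 3)) in double precision.
    return (1 / 16) * x ** 3


def via_log(x):
    # (log (+ 1 (expm1 y))): equal to y over the reals, but 1 + tiny rounds.
    return math.log(1 + math.expm1(direct(x)))


def via_log1p(x):
    # (log1p (expm1 y)): the same real function, without forming 1 + tiny.
    return math.log1p(math.expm1(direct(x)))


x = 1e-3
y = direct(x)
rel_err_log = abs(via_log(x) - y) / y
rel_err_log1p = abs(via_log1p(x) - y) / y
print(rel_err_log, rel_err_log1p)
```

For small `x` the `log(1 + ...)` form loses several digits relative to the direct product, while the `log1p` form stays within rounding error, so "equivalent" has to be judged over the reals rather than by comparing f64 outputs.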

Hopefully this helps. I'm a little unsure if you're comparing the purely syntactic rewriting parts of Herbie like "rewrite" and "simplify", or if you're comparing the overall numerical analysis techniques of Herbie.