google / souper

A superoptimizer for LLVM IR

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Switch-case synthesis

fvrmatteo opened this issue · comments

Hi, I noticed that Souper has support for some examples based on phi nodes and I was wondering if it is able to synthesize the following small switch-case example:

define i8 @src(i8 %0) {
entry:
    switch i8 %0, label %default [
        i8 0, label %bb_0
        i8 1, label %bb_1
        i8 2, label %bb_2
        i8 3, label %bb_3
        i8 4, label %bb_4
        i8 5, label %bb_5
        i8 6, label %bb_6
        i8 7, label %bb_7
        i8 8, label %bb_8
        i8 9, label %bb_9
    ]
bb_0:
    br label %default
bb_1:
    br label %default
bb_2:
    br label %default
bb_3:
    br label %default
bb_4:
    br label %default
bb_5:
    br label %default
bb_6:
    br label %default
bb_7:
    br label %default
bb_8:
    br label %default
bb_9:
    br label %default
default:
    %1 = phi i8 [ 0, %entry ], [ 0, %bb_9 ], [ 0, %bb_8 ], [ 0, %bb_7 ], [ 0, %bb_6 ], [ 0, %bb_5 ], [ 0, %bb_4 ], [ 0, %bb_3 ], [ 0, %bb_2 ], [ 0, %bb_1 ], [ 1, %bb_0 ]
    ret i8 %1
}

Alive2 seems to be able to verify the equivalence between the function and a possible synthesized version: https://alive2.llvm.org/ce/z/qoQkhD.

I tried (without success) with the following command line: souper --souper-exploit-blockpcs --souper-infer-inst --souper-synthesis-comps=eq,select,const,const,const switch.bc.

Am I missing some option or is there something stopping Souper from synthesizing this example?

I used an ugly trick to convert the switch into a cascade of icmp + select instructions and now Souper is able to synthesize the provided example (I also successfully tested the same example with the full i8 condition range 0-255):

define i8 @src(i8 %0) {
entry:
  %icmp_0 = icmp eq i8 %0, 0
  %sele_0 = select i1 %icmp_0, i8 1, i8 0
  %icmp_1 = icmp eq i8 %0, 1
  %sele_1 = select i1 %icmp_1, i8 0, i8 %sele_0
  %icmp_2 = icmp eq i8 %0, 2
  %sele_2 = select i1 %icmp_2, i8 0, i8 %sele_1
  %icmp_3 = icmp eq i8 %0, 3
  %sele_3 = select i1 %icmp_3, i8 0, i8 %sele_2
  %icmp_4 = icmp eq i8 %0, 4
  %sele_4 = select i1 %icmp_4, i8 0, i8 %sele_3
  %icmp_5 = icmp eq i8 %0, 5
  %sele_5 = select i1 %icmp_5, i8 0, i8 %sele_4
  %icmp_6 = icmp eq i8 %0, 6
  %sele_6 = select i1 %icmp_6, i8 0, i8 %sele_5
  %icmp_7 = icmp eq i8 %0, 7
  %sele_7 = select i1 %icmp_7, i8 0, i8 %sele_6
  %icmp_8 = icmp eq i8 %0, 8
  %sele_8 = select i1 %icmp_8, i8 0, i8 %sele_7
  %icmp_9 = icmp eq i8 %0, 9
  %sele_9 = select i1 %icmp_9, i8 0, i8 %sele_8
  ret i8 %sele_9
}

This is theoretically doable for any switch, but it creates a dependency chain on the instructions that wasn't original there. The result is the following:

define i8 @src(i8 %0) {
entry:
  %icmp_0 = icmp eq i8 %0, 0
  %1 = zext i1 %icmp_0 to i8
  ret i8 %1
}

Obviously I'm still wondering if Souper can handle the switch without the need to convert it, but decided to post a solution if anyone is going to look into a similar example in the future.

hi! sorry none of us responded earlier. hopefully this issue is now fixed via #843

No hurry, glad to see Souper is still alive, I keep using it for many sub-tasks!

I tested it today and I can confirm the patch solves the issue if the enumerative synthesis is used, even though it's mandatory to specify --souper-enumerative-synthesis-ignore-cost, which is suboptimal, but it's my understanding the phi cost will need to be reworked somehow.

Is there any plan to support the same with the --souper-infer-inst option? I still find the component-based synthesis to beat the enumerative synthesis in many synthesis tasks, but I also noticed all the improvements are more enumerative-synthesis related.

hi Matteo, just a quick heads up that I just changed -souper-infer-inst to -souper-use-cegis since the old option was pretty confusing.

I'd be interested to hear more about your experience comparing these two algotihms. my experience is that CEGIS indeed performs better on hand-written tests, but that the enunerator greatly wins in real applications where the left-hand-sides being optimized are large and contain many instructions that are not really involved with the optimization. if you see examples where large LHSs are optimized better by CEGIS than by the enumerator, please let us know. I'll leave this PR open for a few days in case we want to chat more about this, then I'll close

Thanks John for the heads up!

I have some sliced IR files with quite big LHSs that are synthesized properly by CEGIS (given a possible set of feasible components as input) and I've never managed to get them to work with the enumerator. The root node is synthesized to approximately 5/6 instructions.

This evening I can collect them, retest them with the newest changes (even reducing the set of components used by the enumerator) and send them over to you if you'd like to give it a try. The only issue is that I'd like to keep them private for now. Can I send them somewhere safe? If not, I'll eventually come up with similar tests that show the same behaviour (if possible).

Hi Matteo, I'd love to look at these if you don't mind sharing. We won't share them outside of my group. If you want, email me at regehr@cs.utah.edu. Thanks!