HigherOrderCO / Bend

A massively parallel, high-level programming language

Home Page: https://higherorderco.com

Running with all optimizations on is slower than running with the default optimizations

developedby opened this issue

Reported by @Janiczek on Discord.

The test program is the README.md example, with main calling sum(25, 0).

CPU: AMD Ryzen 7 5800X3D (8 cores, 16 logical processors, 3.40 GHz)
GPU: NVIDIA GeForce RTX 3070 Ti

Measurements (wall-clock elapsed time reported by time):
run -O all: 17.51s
run-c -O all: 22.74s
run-cu -O all: 3.44s
run: 16.40s
run-c: 3.47s
run-cu: 0.73s
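
To make the size of the regression easier to see, the ratios can be computed directly from the numbers above. This is only a small Python sketch for the arithmetic; the dictionary and function names are made up for illustration and are not part of Bend or HVM.

```python
# Wall-clock times in seconds, copied from the measurements above.
default_opts = {"run": 16.40, "run-c": 3.47, "run-cu": 0.73}
all_opts = {"run": 17.51, "run-c": 22.74, "run-cu": 3.44}

def slowdown(mode):
    """Ratio of the `-O all` time to the default time for one run mode."""
    return all_opts[mode] / default_opts[mode]

for mode in default_opts:
    print(f"{mode}: {slowdown(mode):.2f}x slower with -O all")
```

With these figures the interpreter (run) is only about 1.07x slower, while run-c is about 6.55x slower and run-cu about 4.71x slower.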

Details:

run -O all
$ time bend run -s -O all sample.bend
Result: 0

  • ITRS: 805306351
  • TIME: 17.51s
  • MIPS: 46.00

16.81user 0.70system 0:17.51elapsed 100%CPU (0avgtext+0avgdata 6294448maxresident)k
0inputs+8outputs (0major+4418minor)pagefaults 0swaps

run-c -O all
$ time bend run-c -s -O all sample.bend
Result: 0

  • ITRS: 805306351
  • TIME: 22.72s
  • MIPS: 35.44

92.79user 270.53system 0:22.74elapsed 1597%CPU (0avgtext+0avgdata 437532maxresident)k
0inputs+8outputs (13major+98934minor)pagefaults 0swaps

run-cu -O all
$ time bend run-cu -s -O all sample.bend
Result: 0

  • ITRS: 803897327
  • LEAK: 33718271
  • TIME: 2.44s
  • MIPS: 329.46

2.35user 0.09system 0:03.44elapsed 71%CPU (0avgtext+0avgdata 337556maxresident)k
41608inputs+3400outputs (251major+59918minor)pagefaults 0swaps

run (without -O all)
$ time bend run -s sample.bend
Result: 0

  • ITRS: 738197489
  • TIME: 16.39s
  • MIPS: 45.03

15.55user 0.84system 0:16.40elapsed 99%CPU (0avgtext+0avgdata 6294392maxresident)k
0inputs+8outputs (0major+4419minor)pagefaults 0swaps

run-c (without -O all)
$ time bend run-c -s sample.bend
Result: 0

  • ITRS: 738197489
  • TIME: 3.36s
  • MIPS: 219.73

34.47user 19.35system 0:03.47elapsed 1550%CPU (0avgtext+0avgdata 5297008maxresident)k
0inputs+8outputs (13major+1318553minor)pagefaults 0swaps

run-cu (without -O all)
$ time bend run-cu -s sample.bend
Result: 0

  • ITRS: 803897327
  • LEAK: 33718271
  • TIME: 0.40s
  • MIPS: 1997.91

0.37user 0.03system 0:00.73elapsed 55%CPU (0avgtext+0avgdata 105984maxresident)k
13272inputs+8outputs (82major+4739minor)pagefaults 0swaps

Note that this was measured on very early versions of Bend and HVM, right after release. It might be worth first checking whether this still happens.

I don't think I've changed anything significant about the default transformations since then, so it likely still happens.

The HVM program generated with or without -O all is exactly the same for this Bend program, so the issue is somewhere in HVM.

This is the code referred to in this issue (the README has changed since then):

def sum(depth, x):
  switch depth:
    case 0:
      return x
    case _:
      fst = sum(depth-1, x*2+0) # adds the fst half
      snd = sum(depth-1, x*2+1) # adds the snd half
      return fst + snd
    
def main:
  return sum(25, 0)
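
For readers wondering why every run prints Result: 0, here is a rough Python model of the same recursion. It assumes Bend's default number type is an unsigned 24-bit integer that wraps on overflow; that assumption, and the names tree_sum, closed_form and U24, are illustrative and not taken from this issue. Depth 25 is evaluated with a closed form because the plain recursion would be far too slow in Python.

```python
U24 = 1 << 24  # assumption: Bend numbers wrap modulo 2**24

def tree_sum(depth, x):
    """Same shape as the Bend sum: adds every leaf value of a binary tree."""
    if depth == 0:
        return x
    fst = tree_sum(depth - 1, x * 2 + 0)  # adds the first half
    snd = tree_sum(depth - 1, x * 2 + 1)  # adds the second half
    return fst + snd

def closed_form(depth):
    """Sum of 0 .. 2**depth - 1, i.e. the leaves reached from x = 0."""
    n = 1 << depth
    return n * (n - 1) // 2

# The recursion and the closed form agree at a depth small enough to run.
assert tree_sum(10, 0) == closed_form(10)

# At depth 25 the exact sum is a multiple of 2**24, so it wraps to 0.
print(closed_form(25) % U24)
```

Under that assumption the exact sum is 2^24 * (2^25 - 1), which is why the wrapped result is 0 rather than a sign of a wrong computation.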

Moving this to HVM issue HigherOrderCO/HVM#378, since it's not a Bend problem.