iree-org / iree-llvm-sandbox


Connect experimental/alp autotuning with harness' tuning hooks

shabalind opened this issue

Earlier, we experimented with simple randomised sampling of harness' tuning variables (see variables.py and their use in transforms.py), but eventually we decided against rolling our own tuning framework in the sandbox and it was removed in #76.

The longer-term idea was to use an externally provided search framework such as OpenTuner, which is now used in experimental/alp. Both rely on the same concept of tuning variables with fixed domains, so there seems to be an opportunity for a common codebase.

This issue is meant as a discussion ground on whether that is a good idea, and what the incremental steps towards that goal are.

@giuseros Do you think you'd be interested in merging the efforts here to have a common OpenTuner-tunable infrastructure? Are there any other blockers apart from #93, #94, #95 that would prevent things from connecting?

Hi @shabalind ,
Thanks for the input!

I don't think #93, #94, #95 are blockers, because those are only needed to make our life easier when analyzing performance (assembly, intermediate IRs, etc.). OpenTuner is agnostic to how we collect performance numbers. Basically, here:

    # Compile the candidate configuration (cmd is built from the sampled
    # tuning parameters earlier in run()).
    compile_result = self.call_program(' '.join(cmd))
    # Run the compiled benchmark under a time limit and parse the
    # seconds/flops it reports on stderr.
    run_cmd = './exec_matmul'
    run_result = self.call_program(run_cmd, limit=0.7)
    secs, flops = parse(run_result['stderr'])
    # Feed the measurement back to OpenTuner (lower time is better).
    return Result(time=1/flops)

We compile, run, and feed the result back to the framework. Compile and run can be obtained in any way, as long as we are able to "extract" flops or time from the execution to implement the feedback loop.
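For reference, here is a minimal, self-contained sketch of the OpenTuner MeasurementInterface that a snippet like the one above sits in. The tuning parameter, the compile command and the parse helper are illustrative placeholders, not the actual experimental/alp tuner:

    import opentuner
    from opentuner import ConfigurationManipulator
    from opentuner import IntegerParameter
    from opentuner import MeasurementInterface
    from opentuner import Result


    def parse(stderr):
      # Placeholder: extract (seconds, flops) from the benchmark's stderr,
      # assuming it prints two whitespace-separated numbers.
      if isinstance(stderr, bytes):
        stderr = stderr.decode()
      secs, flops = stderr.split()[:2]
      return float(secs), float(flops)


    class MatmulTuner(MeasurementInterface):

      def manipulator(self):
        # Declare the tuning variables and their (fixed) domains.
        manipulator = ConfigurationManipulator()
        manipulator.add_parameter(IntegerParameter('tile_size', 1, 256))
        return manipulator

      def run(self, desired_result, input, limit):
        cfg = desired_result.configuration.data
        # Compile the candidate configuration (illustrative command line).
        cmd = ['clang', '-O3', '-DTILE_SIZE=%d' % cfg['tile_size'],
               'matmul.c', '-o', 'exec_matmul']
        self.call_program(' '.join(cmd))
        # Run the compiled benchmark and feed the measurement back.
        run_result = self.call_program('./exec_matmul', limit=0.7)
        secs, flops = parse(run_result['stderr'])
        return Result(time=1 / flops)


    if __name__ == '__main__':
      MatmulTuner.main(opentuner.default_argparser().parse_args())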

If you could show me how to do that with the current Python harness, I could come up with a very simple autotuner example. Then it's mostly a matter of how to structure the folder within the repo.

Thanks,
Giuseppe

@giuseros If you look at examples/matmul/bench.py you'll see a few key things that you need to build an equivalent of your snippet.

test_harness is roughly the equivalent of call_program in your example. It evaluates a concrete configuration (in the same Python process) and prints the performance results to stdout. Currently the function doesn't return anything yet, but it's easy to modify it to also return the performance numbers (PR to add this: #120).

The configuration is defined by an expert, which is a composition of a number of transformations (see experts.py). Each transformation can be parametrized by variables (see transforms.py). Variables are roughly equivalent to OpenTuner parameters: they are typed and have a precisely defined domain of all valid values; a sketch of that correspondence follows below.
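As an illustration of that correspondence, here is a small sketch (assuming nothing about the removed variables.py beyond "a name plus a fixed domain") of how such variables could be translated into an OpenTuner search space:

    from opentuner import ConfigurationManipulator
    from opentuner import EnumParameter
    from opentuner import IntegerParameter


    def manipulator_from_variables(variables):
      # Hypothetical adapter: `variables` maps a variable name to its domain,
      # either a contiguous integer range or an explicit list of valid values.
      manipulator = ConfigurationManipulator()
      for name, domain in variables.items():
        if isinstance(domain, range):
          manipulator.add_parameter(
              IntegerParameter(name, domain.start, domain.stop - 1))
        else:
          manipulator.add_parameter(EnumParameter(name, list(domain)))
      return manipulator


    # Example domains, loosely in the spirit of tiling/vectorization knobs.
    search_space = manipulator_from_variables({
        'tile_size': [8, 16, 32, 64, 128],
        'unroll_factor': range(1, 9),
        'vectorize': [True, False],
    })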

The harness accepts an instantiation of an expert with concrete values and runs it on a given problem definition (i.e. matmul vs conv vs reduction vs ...), plus sizes, plus information on whether the sizes should be passed statically or dynamically.

Once the harness can be invoked programmatically to obtain performance numbers, it should be easy to build an equivalent of the code snippet you provided above.
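To make that concrete, here is a rough sketch of the wiring. The SingleTilingExpert name, the test_harness signature, and its return value are assumptions for illustration only; the real entry points live in experts.py and examples/matmul/bench.py, and the returned number depends on #120 landing:

    from opentuner import MeasurementInterface
    from opentuner import Result

    # Hypothetical sandbox imports -- actual module paths/names may differ:
    # from experts import SingleTilingExpert
    # from bench import test_harness


    class SandboxMatmulTuner(MeasurementInterface):
      # manipulator() would be built from the expert's variables, e.g. via
      # manipulator_from_variables() from the sketch above.

      def run(self, desired_result, input, limit):
        cfg = desired_result.configuration.data
        # Instantiate an expert with the sampled variable assignments.
        expert = SingleTilingExpert(tile_sizes=cfg['tile_size'],
                                    vectorize=cfg['vectorize'])
        # Assumed to return the measured GFlop/s for this configuration
        # on a static-sized 256x256x256 matmul once #120 lands.
        gflops = test_harness(problem='matmul',
                              sizes=[256, 256, 256],
                              dynamic_sizes=False,
                              experts=[expert])
        # Feed the measurement back to OpenTuner (lower is better).
        return Result(time=1.0 / gflops)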

Hi @shabalind ,
It really does seem easy now, thanks! I will have a stab at this in the next few days and cc you on the review.

Thanks,
Giuseppe

@giuseros Great! Let me know if there is anything else that blocks your progress -- I'd be happy to help push this forward.