google / xls

XLS: Accelerated HW Synthesis

Home Page: http://google.github.io/xls/

[enhancement] XLS' understanding of delay needs to address (upside potential for) skew in bit-level delays

cdleary opened this issue · comments

What's hard to do? (limit 100 words)

Right now the delay model assigns a single delay to all of the output bits of a particular IR operation. That is, for a 32-bit adder, all 32 output pins are modeled as arriving at one time d that we determine, e.g. a single value like "after 100ps". In practice there is often skew between the first bits produced and the last bits produced, because the logic inside an operation forms trees that can be left-leaning, right-leaning, have carry chains percolating across them, be tree-shaped with subsequent broadcast, be stacks of wires with mux selectors at each level, etc. With the current delay model, the high-level XLS ops have no way to express this pin-level richness.
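To make the gap concrete, here is a small standalone sketch (not XLS code; the 20ps-per-stage constant is an illustrative assumption) contrasting the current single scalar delay for a 32-bit add with a per-bit arrival profile for a ripple-carry implementation:

```python
# Illustrative model only: contrast one scalar delay per op with a
# per-bit arrival profile. 20 ps per full-adder stage is an assumed number.

BITS = 32
PS_PER_STAGE = 20  # assumed carry-chain delay per bit, for illustration

# Current model: every output pin of the add "arrives" at the same time,
# which must be the worst case (the MSB).
scalar_delay = BITS * PS_PER_STAGE  # 640 ps for all 32 bits

# Bit-level model: bit i of a ripple-carry adder settles after the carry
# has rippled through i+1 full-adder stages, so low bits arrive early.
per_bit_delay = [(i + 1) * PS_PER_STAGE for i in range(BITS)]

print(scalar_delay)       # 640 -- what the current model charges every bit
print(per_bit_delay[0])   # 20  -- the LSB is actually ready much earlier
print(per_bit_delay[-1])  # 640 -- only the MSB matches the scalar figure
```

The skew here is large (620ps between LSB and MSB), which is exactly the headroom a downstream consumer of the low bits could exploit.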

Current best alternative workaround (limit 100 words)

Assuming all "pins" for an op arrive/depart at the same time.

This also prevents us from "fission-ing" large ops into pieces at a pipe-stage boundary, where we could profitably remove registers by automatically splitting an operation in half along some more minimal internal cut through the op (e.g. imagine cleaving a large adder into its low half vs. its high half across a pipe stage).
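A functional sketch of that adder-fission example (hypothetical code, not an XLS transform): stage 1 computes only the low half of a 32-bit add plus a carry-out, and stage 2 finishes the high half, so the internal cut crossing the stage boundary is the 16-bit low sum plus one carry bit rather than the full adder output.

```python
# Hypothetical illustration of cleaving a 32-bit add across a pipe stage.
# Stage 1 produces the low 16 sum bits and a carry; stage 2 adds the high
# halves (which pass through to stage 2) plus that carry.

MASK16 = (1 << 16) - 1

def stage1(a: int, b: int):
    """First pipe stage: low-half add, yielding 16 sum bits + 1 carry bit."""
    lo = (a & MASK16) + (b & MASK16)
    return lo & MASK16, lo >> 16

def stage2(a: int, b: int, lo: int, carry: int) -> int:
    """Second pipe stage: high-half add folds in the registered carry."""
    hi = ((a >> 16) + (b >> 16) + carry) & MASK16
    return (hi << 16) | lo

a, b = 0x89ABCDEF, 0x12345678
lo, carry = stage1(a, b)
assert stage2(a, b, lo, carry) == (a + b) & 0xFFFFFFFF  # matches a full add
```

This only demonstrates functional correctness of the split; the register-count win depends on what else is live across that boundary, which is precisely what a bit-level delay model would let the scheduler reason about.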

Your view of the "best case XLS enhancement" (limit 100 words)

We could potentially characterize and calculate pin-level delays for every op. We could also potentially "draw a lasso" around a set of dependent operations to note when two operations can clearly pipeline their bit-level delays together effectively (kind of like a fusion cluster in XLA, if that happens to be a familiar concept :-), in order to capture the benefit of knowing how operations will dovetail at a coarser granularity.
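A toy model of that dovetailing (hypothetical, with an assumed per-stage delay): if two ripple-carry adds are chained and we propagate per-bit arrival times instead of scalar delays, the second add's carry chain overlaps with the first's, so the fused critical path is far shorter than the sum of the two scalar delays.

```python
# Hypothetical "delay fusion" sketch: propagate per-bit arrival profiles
# through two chained ripple-carry adds. STAGE_PS is an assumed constant.

BITS = 32
STAGE_PS = 20  # assumed full-adder delay, illustration only

def ripple_profile(input_arrival):
    """Per-bit output arrival times of a ripple-carry add, given per-bit
    input arrival times. Each bit waits on its inputs and the incoming carry."""
    out, carry_ready = [], 0
    for i in range(BITS):
        ready = max(input_arrival[i], carry_ready) + STAGE_PS
        out.append(ready)
        carry_ready = ready  # carry into bit i+1
    return out

flat = [0] * BITS               # primary inputs all valid at t=0
first = ripple_profile(flat)    # MSB settles at 32 * 20 = 640 ps
second = ripple_profile(first)  # chained add consumes the skewed profile

scalar_model = 2 * first[-1]    # 1280 ps: two back-to-back scalar delays
fused_model = second[-1]        # 660 ps: the two carry chains overlap
```

Per-op scalar delays charge 1280ps for the pair, while the bit-level profile shows the cluster finishing in 660ps, the kind of slack a lasso/fusion-cluster analysis could recover.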

@hzeller has several bugs filed that relate to potential benefits in refined understanding of bit-level delay, and @hongted was going to file a bug for particular "delay fusions" that we know could be interesting to us.