Dynamic Range Unbiased Multiplier
The operands A and B are fed to the approxMultiplier module, which truncates each operand to the required number of bits and passes the truncated values to the 4-bit Wallace Tree Multiplier.
The output of the Wallace Multiplier (resoluteProduct) is then sent through the barrel shifter to obtain a compensated result (approx product).
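The resoluteProduct must be shifted to compensate for the offset introduced by truncation. The offset can be derived from the operand layout below, where n is the operand width, t is the number of bits from the leading one down to the LSB, and k is the number of bits kept: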
    <-----n bits---->
    |0|0|1|X|...|X|X|
        <---t bits-->
        <-k bits->
In the paper's terms, each operand needs a shift of t-k bits. With t = loIndex+1 for each operand and k = MAIN_RESOLUTION, the total shift used in the program is
    (loIndexA + 1) + (loIndexB + 1) - 2*(MAIN_RESOLUTION)
The barrelShifter hardware is built on the assumption that the shift amount never exceeds log(OUT_WIDTH).
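The shift compensation above can be sketched as a small behavioural model. This is a Python reference model, not the RTL. The names `truncate_operand` and `approx_multiply` are hypothetical, the per-operand shift is assumed to be clamped at zero when an operand already fits in MAIN_RESOLUTION bits, and the optional `unbias` flag (forcing the LSB of a truncated segment to 1, as in the DRUM scheme) is an assumption about this design rather than something stated here.

```python
MAIN_RESOLUTION = 4  # k: bits kept per operand (width of the core multiplier)


def truncate_operand(x, k=MAIN_RESOLUTION, unbias=False):
    """Return (segment, shift) for one operand.

    lo_index is the 0-based index of the leading one, so t = lo_index + 1
    and the operand is shifted right by t - k bits (assumed clamped at 0).
    """
    lo_index = x.bit_length() - 1          # -1 when x == 0
    shift = max(0, lo_index + 1 - k)
    segment = x >> shift
    if unbias and shift > 0:
        segment |= 1                       # assumption: DRUM-style unbiasing
    return segment, shift


def approx_multiply(a, b, k=MAIN_RESOLUTION, unbias=False):
    """Multiply the k-bit segments, then shift the product back up."""
    seg_a, shift_a = truncate_operand(a, k, unbias)
    seg_b, shift_b = truncate_operand(b, k, unbias)
    resolute_product = seg_a * seg_b       # what the 4-bit multiplier computes
    return resolute_product << (shift_a + shift_b)


print(approx_multiply(17, 31))               # 480 (exact: 527)
print(approx_multiply(17, 31, unbias=True))  # 540
print(approx_multiply(10, 12))               # 120, both operands fit in 4 bits
```

For 17 x 31 both leading ones sit at index 4, so each operand contributes a shift of 4 + 1 - 4 = 1 and the total shift is 2, matching the formula above.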
- Is the only solution to a scaled 16x16 multiplier hardcoded logic?
- Instantiate 4-bit wallace-multipliers 8x8, 16x16 wallace multipliers?
- Testbench needed
The leading-one detector (LOD) is implemented along with a simple MSB to LSB priority encoder.
Naturally, a break statement would be needed to prioritise the MSB, but it isn't synthesizable, so a basic FSM is used instead.
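One way to picture an MSB-first scan without `break` is to carry a "found" flag through the loop, so that later (lower) bits cannot override the first hit. The sketch below is a Python model under that assumption. The function name `leading_one_detect` and the default width of 16 are hypothetical, and the actual FSM in the design may be structured differently.

```python
def leading_one_detect(value, width=16):
    """Return the 0-based index of the most significant set bit.

    Scans every bit from MSB to LSB with no early exit; the `found`
    flag plays the role of the state that a break would otherwise give.
    Returns 0 for value == 0 (assumed default).
    """
    found = False
    lo_index = 0
    for i in range(width - 1, -1, -1):
        bit = (value >> i) & 1
        if bit and not found:
            lo_index = i      # first set bit seen from the MSB side
            found = True      # lock the result against lower bits
    return lo_index


print(leading_one_detect(17))  # 4
```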
The figure below shows operandA and operandB looping from 17 to 31 (decimal).
The figure below shows one iteration of the testbench's outer loop, with operandA fixed at 17 and operandB ramping from 17 to 31.
The figure below shows 50 iterations of operandA x operandB, where A ramps from 10 to 18 in increments of 2 while B ramps in single increments.
It can be seen that the steering logic approximates an operand whenever its value needs more bits than the RESOLUTION (4 bits). The signals truncateOperandA and truncateOperandB change accordingly.
All variables are hopefully self-explanatory.
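The steering decision described above can be modelled in a few lines. This is a sketch that assumes "higher than the RESOLUTION" means the operand no longer fits in 4 bits. The helper `needs_truncation` is a hypothetical name, not a signal or module from the design.

```python
RESOLUTION = 4  # bits handled exactly by the core multiplier


def needs_truncation(operand, resolution=RESOLUTION):
    """True when the operand does not fit in `resolution` bits."""
    return operand >= (1 << resolution)


# operandA values from the last waveform: 10 to 18 in steps of 2
flags = {a: needs_truncation(a) for a in range(10, 19, 2)}
print(flags)  # {10: False, 12: False, 14: False, 16: True, 18: True}
```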