libcg / bfp

Beyond Floating Point - Posit C/C++ implementation

Implement addition

libcg opened this issue · comments

This gives us subtraction for free since we can easily negate.

  • Implement for 2-bit posits
  • Implement for 3-bit posits (regime bits)
  • Implement for 4-bit posits (exponent bits)
  • Implement for 5-bit posits (fraction bits)
  • Implement for n-bit posits
  1. Take lfraction() of the addend with the larger absolute value (the "first" addend) with the appropriate sign.
  2. Take fraction() of the other addend (the "second" addend) with the appropriate sign.
    (The integral type for the fractions must have at least 2 more bits than the maximum fraction length of the posit type, including the hidden bit.)
  3. Normalize the second addend according to the fraction length and the exponent difference of the addends to match the bit values of the first addend.
    3a. If the normalized value of the second addend is 0, return the first addend as the sum. (Because the underlying fraction type has guard bits, this cannot happen when the second addend has a chance to influence the result.)
  4. Perform the addition, rounding at will.
  5. Normalize the integral sum (may have to normalize to the right by 1 bit, or to the left by several, pulling in the low bits of the second addend that didn't participate in the addition), correcting the exponent.
  6. Construct the result.
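
The steps above can be sketched roughly as follows. This is not the bfp code: the `Unpacked` struct, the `add` function, and the `FBITS`/`GUARD` constants are hypothetical names chosen for illustration, and the sketch works on an already-decoded sign/exponent/fraction triple rather than on posit bit patterns. It also truncates instead of rounding, and it discards the low bits of the second addend that are shifted out during alignment, so it only approximates step 5.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

constexpr int FBITS = 8;  // fraction bits, not counting the hidden bit (assumed width)
constexpr int GUARD = 2;  // extra low-order guard bits (the "2 more bits" of step 2)

// Hypothetical decoded value: (-1)^neg * (frac / 2^FBITS) * 2^exp.
// The hidden bit sits at bit FBITS of frac; frac == 0 encodes zero.
struct Unpacked {
    bool neg;
    int exp;
    std::uint32_t frac;
};

Unpacked add(Unpacked a, Unpacked b) {
    if (a.frac == 0) return b;
    if (b.frac == 0) return a;

    // Steps 1-2: make `a` the addend with the larger absolute value.
    if (a.exp < b.exp || (a.exp == b.exp && a.frac < b.frac)) std::swap(a, b);
    std::int64_t fa = static_cast<std::int64_t>(a.frac) << GUARD;
    std::int64_t fb = static_cast<std::int64_t>(b.frac) << GUARD;

    // Step 3: align the second addend to the first addend's exponent.
    const int shift = a.exp - b.exp;
    assert(shift >= 0);
    if (shift > FBITS + GUARD) return a;  // every bit would be shifted out
    fb >>= shift;                         // truncating; shifted-out bits are lost
    if (fb == 0) return a;                // step 3a

    // Step 4: signed addition of the aligned fractions.
    const std::int64_t sum = (a.neg ? -fa : fa) + (b.neg ? -fb : fb);
    if (sum == 0) return Unpacked{false, 0, 0};
    const bool neg = sum < 0;
    std::int64_t mag = neg ? -sum : sum;

    // Step 5: normalize so the hidden bit is back at bit FBITS + GUARD.
    const std::int64_t one = std::int64_t{1} << (FBITS + GUARD);
    int exp = a.exp;
    while (mag >= 2 * one) { mag >>= 1; ++exp; }  // at most one step to the right
    while (mag < one)      { mag <<= 1; --exp; }  // possibly several to the left

    // Step 6: drop the guard bits (truncation) and construct the result.
    return Unpacked{neg, exp, static_cast<std::uint32_t>(mag >> GUARD)};
}
```

With `FBITS = 8`, the value 1.5 is `{false, 0, 384}` and 0.25 is `{false, -2, 256}`; adding them gives `{false, 0, 448}`, i.e. 1.75. Subtraction falls out by flipping `neg` on the second argument before calling `add`.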

Seems straightforward enough; what did I miss?

2bd86da implements add/sub operations. Thanks for the help!