sdiehl / galois-field

Finite field and algebraic extension field arithmetic

Home Page:https://www.adjoint.io

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Discussion: representation of polynomials

aspiwack opened this issue · comments

Stop me if I didn't get it right, but I understand that:

  • Elements of Galois fields on non-prime order are represented as polynomials over their prime field (modulo an irreducible polynomial, as it should)
  • Polynomials are represented as list of coefficients (specifically, [] is the 0 polynomial, and a:l is a+Xl)

I pretty much don't know anything about computer polynomial arithmetic. But I find it hard to imagine that this representation is the most efficient for the computations in this package. Maybe it is, but then this thread should gather evidence.

Your understanding is right, and yes, the current list-based implementation is far from being the most efficient, and it slows inversion/division the most by the frequent use of polynomial quotient/remainder and degree/last functions. I am currently familiarising myself with the Data.Vector library and will be rewriting the current implementation within the next few days into something Vector-based, and unboxed. Open to suggestions, particularly whether unboxed Arrays, or any other library I am unaware of, would make it more efficient.

I was rather thinking of more abstractly about using the sequence of coefficient as one's encoding for polynomials. I would be surprised if there wasn't a better encoding.

If we go into low level representations, vectors are fine, they have their own cost as immutable arrays are wont to, but should not be a problem for polynomials. I don't think that you can get much leverage (if at all) from unboxed vectors for Integer, as it is an unsized structure. Unless you want to vectorise operations to benefit from, say, GPU, vector is your best bet as a library.

Noted, and thank you. I will look into encoding the polynomial coefficients some other way after everything is vectorised, and will update this thread when I get the chance to.

We did some experimentation with vectors (and sequences), and found out that they are significantly slower if the lists involved are very short (say 2 or 3 elements, which happens often in extension-field-tower-based cryptography unless the characteristic is 2), which makes sense because there is no nice way to add two vector-based polynomials while ensuring there are no trailing zeroes. As such we will delay this change temporarily in the vector branch, in favour of trying to encode polynomials using integers directly. This would go somewhere along the lines of:

  • An element a in the prime field Fp is represented as a mod p (as before, where p is its characteristic)
  • An element [a_0, ..., a_(n-1)] in the first extension field is represented as the sum a_0*p^0 + ... + a_(n-1)*p^(n-1) (where n is its degree as a vector space over the prime field), which will make it strictly smaller than p^n as an integer - I believe this was what you're hinting at
  • An element [b_0, ..., b_(m-1)] in the second extension field is represented as the sum b_0*(p^n)^0 + ... + b_(m-1)*(p^n)^(m-1) (where m is its degree as a vector space over the first extension field), so the first extension field can be naturally embedded in here.

I am currently working on this, and experiments show that it is indeed significantly faster than the current list-based polynomials for fields of characteristic p = 2, as bit operations can be abused. Have not tried this with extensions of larger prime fields - will do once the characteristic 2 case is ready - but it is entirely possible that it is slower than the list/vector/sequence-based approach, as the list/vector/sequence may not be entirely filled (say the extension field is of degree n >> 2 and the polynomials involved are always only [0, ..., 0, 1]) and the arithmetic in this approach will completely involve gargantuan integers (there's an example with 600 decimal digits).

This "p-adic" encoding of polynomials is now sketched out under the branch encoding, under the AdicField file that would be renamed and merged to ExtensionField once all is good. It would be wonderful if you could share your insights on this implementation, or if I misinterpreted your thoughts.

Think this can be closed out now. We landed the new poly dependency which seems to be the right representation.