libtom / libtomcrypt

LibTomCrypt is a fairly comprehensive, modular and portable cryptographic toolkit that provides developers with a vast array of well known published block ciphers, one-way hash functions, chaining modes, pseudo-random number generators, public key cryptography and a plethora of other routines.

Home Page:https://www.libtom.net

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

AES implementation is not side-channel resistant

ereshetova opened this issue · comments

I guess everyone is aware that the current AES implementation is not resistant to traditional non-speculative side channels due to the direct usage of T-tables, which can reveal the information about the key being used (usage of T-tables basically creates a data access pattern that depends on the key). The key question is what would be the right approach for libtomcrypt to address this.

I see a number of options (each has its pros and cons):

  1. Usage of dedicated CPU-specific instructions. Instructions like AES-NI [1] and their alternatives can be used to solve the above problem in a pretty straightforward way. Many open source libs have went this way, i.e. openssl, bearssl, etc.
    Pros: Easy to implement, the end implementation is fast and efficient, applicable to all modes
    Cons: Not a generic solution, CPU-specific.

  2. Bitslicing. This is a technique [2] that allows to achieve good side-channel resistance for the modes that can be parallelized.
    It is implemented in some crypto libraries, for example in bearSSL [3].
    Pros: no CPU dependency, relatively efficient implementation
    Cons: Not useful for non-parallelizable modes.

  3. Algebraic implementation. This approach would involve computing the result of S-box on the fly using algebraic structure of S-box, i.e. the fact that an Sbox is a combination of finding a byte-wise inverse in GF(2^8) + bitwise affine transformation. The computation itself should be very carefully implemented not to contain any side-channels as well. I am not aware of any major crypto library that follows this approach.
    Pros: no CPU dependency, applicable for all modes
    Cons: Likely to be slow and not efficient (albeit can try to use some speedup tricks given that we always work in the same GF)

[1] https://en.wikipedia.org/wiki/AES_instruction_set
[2] https://link.springer.com/chapter/10.1007/11935070_14
[3] https://bearssl.org/constanttime.html#bitslicing

What are people's opinions on this for libtomcrypt? Which way is seen as a preferred direction?

IMO it'd make sense to have at least a bitsliced implementation in addition to the current one.

To go for CPU-specific instructions would be nice as well but something that should be provided additional to bitslicing... AES-NI for x86 or AES instructions for aarch64 could be a starter.

I don't think an algebraic implementation makes sense... I've played with htable [1] some years ago and it was really slow ... :) Maybe it could be made faster but additional to its slowness it was also big IIRC.

[1] https://github.com/coron/htable

Just adding my support here for the addition of AES-NI. Many developers are currently looking for GCM with AES-NI due to the speed and security that's possible on Intel CPU's.

Detecting AES-NI capable CPU's can be done with just a few line of code, for example:

https://github.com/exscape/AES/blob/master/misc.c

It appears this developer has a working example using AES-NI with LibTomCrypt for Linux (I was unable to get it to compile on Windows using MinGW or Visual Studio however, though that might be easily solvable):

https://github.com/YutaroHayakawa/tomcrypt_aesni

Intel has other examples of detecting and using AES-NI here:

https://www.intel.com/content/dam/doc/white-paper/advanced-encryption-standard-new-instructions-set-paper.pdf

I'd offer to help, but I'm afraid assembly is not in my skill set. I can assist with testing however - I have an array of OS's and Intel CPU's to test on, so just let me know.

Thanks for considering adding AES-NI to this already excellent library!