intel / hexl

Intel® Homomorphic Encryption Acceleration Library accelerates modular arithmetic operations used in homomorphic encryption

Home Page: https://intel.github.io/hexl

EltwiseFMAMod fails in the EltwiseFMAModAVX512 operation with a 52-bit modulus (a value between 2^51 and 2^52).

jcalafato1 opened this issue

Hello,
I believe this is related to issue #121. That issue is fixed and working in v1.2.5, but using the EltwiseFMAMod function to scale numbers still sometimes fails when the modulus is 52 bits wide (a number between 2^51 and 2^52). Could you confirm this behaviour?

Example case:

modulus = 4503599627370486
input = {1191078607011827, 1769260550218270, 4345204646426905, 1153813479460416, 2994576917176123, 2254124429352543, 2866174865142532, 3780255914740878}
factor = 3724197286470134
input_mod_factor = 1  

function call:

intel::hexl::EltwiseFMAMod(result, input, factor, nullptr, input.size(), modulus, input_mod_factor);

Result:

{3669213812048074, 4438596699595472, 3062713067630738, 2441777770654938, 1332695674286306, 833480206939968, 2031613570657280, 229641522601742}

Expected (input * factor mod modulus):

{3669213812048074, 4438596699595472, 3062713067630738, 2441777770654938, 1332695674286306, 833480206939968, 2031613570657280, 229641522601752}

Note that the last element of the result is incorrect: 229641522601742 instead of the expected 229641522601752.
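
For readers who want to check the Expected list independently of HEXL, a minimal scalar reference is sketched below. It is not HEXL code: the name `EltwiseFMAModReference` is hypothetical, the sketch assumes a GCC- or Clang-style compiler that provides `unsigned __int128`, and it assumes the operation is (input[i] * factor + addend[i]) mod modulus, with a null addend meaning "no addition", as the `nullptr` in the call above suggests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar reference, NOT part of the HEXL API.
// Assumption: out[i] = (in[i] * factor + addend[i]) mod modulus,
// where a null addend pointer means that nothing is added.
std::vector<uint64_t> EltwiseFMAModReference(const std::vector<uint64_t>& in,
                                             uint64_t factor,
                                             const uint64_t* addend,
                                             uint64_t modulus) {
  assert(modulus > 1);
  std::vector<uint64_t> out(in.size());
  for (std::size_t i = 0; i < in.size(); ++i) {
    // unsigned __int128 is a compiler extension (GCC/Clang); the full
    // product of two 64-bit values plus a 64-bit addend fits without overflow.
    unsigned __int128 acc = static_cast<unsigned __int128>(in[i]) * factor;
    if (addend != nullptr) {
      acc += addend[i];
    }
    out[i] = static_cast<uint64_t>(acc % modulus);
  }
  return out;
}
```

Calling it with the input, factor, and modulus from the example case (and a null addend) gives values that can be compared element by element with both lists above.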

Hello @jcalafato1. Yes, I am able to reproduce these results. Let us check it further.

Hello @jcalafato1. Thanks for catching this one too. Similar to the previous issue, this reaches the limits of Barrett reduction on AVX512-IFMA (52 bits), because both the modulus and the factor are 52-bit values. In a situation like this, it should use AVX512-DQ instead. We will do more testing with these large numbers and provide a fix soon.
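
To illustrate why a 52-bit modulus leaves no headroom, here is a scalar sketch of textbook Barrett reduction. It is not HEXL's AVX512 kernel, and the thread does not say which step fails inside the library; the names `BitWidth`, `BarrettConstant`, `BarrettReduce`, and `MulModBarrett` are hypothetical, and `unsigned __int128` is assumed to be available (GCC/Clang). In this textbook form, the precomputed constant floor(2^(2k) / m) for the modulus from the example case is 2^52 + 10, which already needs 53 bits.

```cpp
#include <cassert>
#include <cstdint>

using u128 = unsigned __int128;  // compiler extension (GCC/Clang)

// Number of bits needed to represent v (0 for v == 0).
inline unsigned BitWidth(uint64_t v) {
  unsigned k = 0;
  while (v != 0) {
    ++k;
    v >>= 1;
  }
  return k;
}

// Barrett constant mu = floor(2^(2k) / m), where k = BitWidth(m).
// For k <= 62, mu is at most 2^(k+1) and fits in 64 bits.
inline uint64_t BarrettConstant(uint64_t m) {
  assert(m > 1);
  const unsigned k = BitWidth(m);
  assert(k <= 62);
  return static_cast<uint64_t>((static_cast<u128>(1) << (2 * k)) / m);
}

// Textbook Barrett reduction of x modulo m, valid for x < 2^(2k).
inline uint64_t BarrettReduce(u128 x, uint64_t m) {
  const unsigned k = BitWidth(m);
  const u128 mu = BarrettConstant(m);
  const u128 q1 = x >> (k - 1);          // may need k + 1 bits
  const u128 q = (q1 * mu) >> (k + 1);   // quotient estimate, never too large
  u128 r = x - q * m;                    // 0 <= r < 3m
  while (r >= m) {
    r -= m;                              // at most two corrections
  }
  return static_cast<uint64_t>(r);
}

// (a * b) mod m for operands already reduced modulo m.
inline uint64_t MulModBarrett(uint64_t a, uint64_t b, uint64_t m) {
  assert(a < m && b < m);
  return BarrettReduce(static_cast<u128>(a) * b, m);
}
```

With 128-bit scalars the extra bit is harmless, so this sketch stays exact for 52-bit moduli. A kernel whose multiplier inputs are limited to 52 bits has to treat the widest moduli as a special case or hand them to a wider code path, which matches the AVX512-DQ fallback described above.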

Sounds good to me, happy we can have these fixed quickly!

If I can recommend a better testing methodology: I run all of my unit tests with random inputs over the entire range of supported numbers. I also run each unit test for many iterations, so a wide range of inputs is tested on each run. For example (pseudocode):

test() {
  for (test_iter = 0; test_iter < 1000; ++test_iter) {
    modulus = generate_random_modulus(mod_lower, mod_upper);
    input = generate_random_input(in_lower, in_upper);

    chk = function(**args);

    expected = make_expected(**args);
    ASSERT(chk, expected);
  }
}

This helps find these edge cases without thinking too much about them.
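
The pseudocode above could be fleshed out along the following lines in C++. This is only a sketch under stated assumptions: `MulModFn`, `MulModReference`, and `CountRandomMismatches` are hypothetical names, the implementation under test is passed in as a callable rather than being a real HEXL call, and `unsigned __int128` (GCC/Clang) serves as the exact reference.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>

// Hypothetical signature for the implementation under test: (a * b) mod m.
using MulModFn = std::function<uint64_t(uint64_t, uint64_t, uint64_t)>;

// Exact reference using a 128-bit product (compiler extension).
inline uint64_t MulModReference(uint64_t a, uint64_t b, uint64_t m) {
  return static_cast<uint64_t>(static_cast<unsigned __int128>(a) * b % m);
}

// Runs `iterations` random trials and returns how many outputs of `impl`
// differ from the reference. Each trial first draws a modulus bit width
// uniformly from [min_bits, max_bits], so the widest moduli are exercised
// as often as the narrow ones.
std::size_t CountRandomMismatches(const MulModFn& impl, unsigned min_bits,
                                  unsigned max_bits, std::size_t iterations,
                                  uint64_t seed) {
  assert(min_bits >= 2 && min_bits <= max_bits && max_bits <= 63);
  std::mt19937_64 rng(seed);  // fixed seed keeps failures reproducible
  std::uniform_int_distribution<unsigned> bits_dist(min_bits, max_bits);
  std::size_t mismatches = 0;
  for (std::size_t iter = 0; iter < iterations; ++iter) {
    const unsigned bits = bits_dist(rng);
    const uint64_t lo = uint64_t{1} << (bits - 1);
    const uint64_t hi = (uint64_t{1} << bits) - 1;
    const uint64_t modulus =
        std::uniform_int_distribution<uint64_t>(lo, hi)(rng);
    std::uniform_int_distribution<uint64_t> value_dist(0, modulus - 1);
    const uint64_t factor = value_dist(rng);
    for (int i = 0; i < 8; ++i) {  // 8 elements per trial, as in the report
      const uint64_t input = value_dist(rng);
      if (impl(input, factor, modulus) !=
          MulModReference(input, factor, modulus)) {
        ++mismatches;
      }
    }
  }
  return mismatches;
}
```

A test would then assert that the mismatch count is zero, with `max_bits` set to the largest modulus width the library claims to support rather than one bit below it.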

Thanks @jcalafato1. Yes, we have tests like that, but they coincidentally stop at 51 bits :)

Ah! Thanks for your help again.