JuliaCrypto / SEAL.jl

SEAL.jl is an easy-to-use wrapper for the original SEAL library and supports homomorphic encryption with the BFV and CKKS schemes.

Home Page: https://juliacrypto.github.io/SEAL.jl/stable

SEAL memory leak

isentropic opened this issue · comments

@sloede Thanks again for this package; it has been a learning journey for me with Julia, ccall, and BinaryBuilder. I have succeeded in bumping the version to 4.1.1 locally, but unfortunately I did not see the performance gains I was hoping for. As backstory: I'm trying to implement a large machine learning model (a CNN) using SEAL. Before this I used pyseal and did everything in Python, but I have now realized that I need multithreading to bring the performance to the next level. The reason I was pushing for 4.1.1 is that a multiplication performance improvement was promised in the 3.7.x versions. However, for some reason pyseal is consistently about 30% faster than SEAL.jl, even after I compile SEAL_jll manually. My benchmarks suggest that a single CKKS cipher-cipher multiplication is, if anything, faster in SEAL.jl, but when I perform many of them repeatedly, pyseal takes over. Maybe this has to do with the following.
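One way to probe this kind of single-call vs. bulk discrepancy is to time both a one-shot call and a large batch, since per-call timings taken once can look fast while bulk throughput is dominated by allocation and GC pauses. Below is a generic, hedged sketch of that methodology using only Base Julia; `work` is a stand-in for a cipher-cipher multiplication, not a SEAL.jl call:

```julia
# Generic timing sketch (no SEAL calls; `work` is a placeholder workload).
function per_call_time(work, n)
    work()                       # warm up: exclude JIT compilation time
    t = @elapsed for _ in 1:n
        work()
    end
    return t / n                 # average seconds per call
end

work() = sum(abs2, rand(4096))   # stand-in for one homomorphic operation

single = @elapsed work()         # one-shot timing (noisy, includes warmup)
bulk   = per_call_time(work, 1_000)
```

If `bulk` is much worse than a warmed-up `single`, the per-operation cost is not the whole story; allocation and GC pressure across many calls may be.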

I attempted multithreading, i.e. performing many cipher-cipher multiplications in parallel using FLoops.jl. It worked fine, but then I noticed sudden crashes, which I attributed to unbounded memory growth. I then decided to test this with a single thread, and this is the MWE I came up with:

```julia
using SEAL

function create_seal_params(poly_modulus_degree=2^13, modulus_chain=[60, 60, 60])
    parms = EncryptionParameters(SchemeType.ckks)

    set_poly_modulus_degree!(parms, poly_modulus_degree)
    set_coeff_modulus!(parms, coeff_modulus_create(poly_modulus_degree, modulus_chain))

    context = SEALContext(parms)

    keygen = KeyGenerator(context)
    public_key_ = PublicKey()
    create_public_key!(public_key_, keygen)
    secret_key_ = secret_key(keygen)
    encryptor = Encryptor(context, public_key_)
    evaluator = Evaluator(context)
    decryptor = Decryptor(context, secret_key_)

    encoder = CKKSEncoder(context)
    nslots = slot_count(encoder)
    return (; encryptor, evaluator, decryptor, encoder, nslots)
end

function encrypt_into_cipher(number, initial_scale, encoder, encryptor)
    plain = Plaintext()
    encode!(plain, number, initial_scale, encoder)
    encrypted = Ciphertext()
    encrypt!(encrypted, plain, encryptor)
    return encrypted
end

function cipher_multiplication(cipher1, cipher2, evaluator)
    result = Ciphertext()
    multiply!(result, cipher1, cipher2, evaluator)
    return result
end

function cipher_addition(cipher1, cipher2, evaluator)
    result = Ciphertext()
    add!(result, cipher1, cipher2, evaluator)
    return result
end

function test()
    seal_params = create_seal_params()
    nciphers = 1_000_000
    ciphers = [encrypt_into_cipher(randn(), 2^50, seal_params.encoder, seal_params.encryptor) for i in 1:nciphers]

    while true
        i = rand(1:nciphers)
        k = rand(1:nciphers)
        # each iteration allocates a fresh result Ciphertext
        ciphers[k] = cipher_addition(ciphers[i], ciphers[k], seal_params.evaluator)
    end
end
```

My Julia is:

```julia
julia> versioninfo()
Julia Version 1.9.4
Commit 8e5136fa297 (2023-11-14 08:46 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 128 × Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, icelake-server)
  Threads: 1 on 128 virtual cores
```

Using a fresh Project.toml environment, this script linearly consumes more and more memory and would, I assume, eventually crash; GC.gc() does not help. I tried to diagnose how this happens and found a good text about memory management with ccall, but I really have no idea. This is all new to me.
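For intuition about why GC.gc() may not help here, this SEAL-free sketch shows the usual ccall memory-ownership pattern: Julia's GC only accounts for the small wrapper object, while the C-heap allocation it owns is invisible to Base.gc_live_bytes() and is released only when the wrapper's finalizer actually runs. The `ForeignBuf` type below is my own illustration, not a SEAL.jl type:

```julia
# Pure-Julia sketch (no SEAL): a wrapper that owns C-heap memory, similar in
# spirit to how a Julia wrapper object can own a native library allocation.
mutable struct ForeignBuf
    ptr::Ptr{Cvoid}
    function ForeignBuf(nbytes::Integer)
        obj = new(Libc.malloc(nbytes))
        # The finalizer frees the C memory, but only when the GC decides to
        # collect the wrapper -- not at any deterministic point.
        finalizer(o -> Libc.free(o.ptr), obj)
        return obj
    end
end

buf = ForeignBuf(10^6)  # ~1 MB on the C heap, invisible to gc_live_bytes()
finalize(buf)           # forces the finalizer now, freeing the C memory
```

If the wrappers stay reachable (as in a large `ciphers` array) or finalizers simply have not run yet, the process RSS keeps growing even though the Julia-visible heap looks small.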

I also heard that SEAL uses some block memory management under the hood, so that might also be important.

I have modified my script to:

```julia
function test()
    seal_params = create_seal_params()
    nciphers = 1_000_000
    ciphers = [encrypt_into_cipher(randn(), 2^50, seal_params.encoder, seal_params.encryptor) for i in 1:nciphers]

    while true
        i = rand(1:nciphers)
        k = rand(1:nciphers)
        old = ciphers[k]
        ciphers[k] = cipher_addition(ciphers[i], ciphers[k], seal_params.evaluator)
        # explicitly release the replaced ciphertext
        destroy!(old)
    end
end
```

to no avail. There have been other issues in the wild like this or this, so probably this is just the way SEAL is and nothing can be done.

You could try to verify your hypothesis by regularly checking the GC usage with something like

```julia
using Printf: @printf

function meminfo_julia()
  # @printf "GC total:  %9.3f MiB\n" Base.gc_total_bytes(Base.gc_num())/2^20
  # Total bytes (above) usually underreports, thus I suggest using live bytes (below)
  @printf "GC live:   %9.3f MiB\n" Base.gc_live_bytes()/2^20
  @printf "JIT:       %9.3f MiB\n" Base.jit_total_bytes()/2^20
  @printf "Max. RSS:  %9.3f MiB\n" Sys.maxrss()/2^20
end
```

(originally posted in https://discourse.julialang.org/t/how-to-track-total-memory-usage-of-julia-process-over-time/91167/6?u=sloede). If you see that the GC live bytes are spiraling out of control even when using GC.gc(), it might yet be a GC issue; but if that memory usage is under control, the leak is likely on the SEAL side.
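As a concrete way to wire this check into the reproduction loop (a sketch; the `sample_memory` helper and its `interval` parameter are my additions, not SEAL.jl API), one could sample both numbers periodically:

```julia
using Printf: @printf

# Sketch: sample GC-visible memory vs. OS-level RSS every `interval`
# iterations. A flat "GC live" next to a climbing "Max. RSS" points at
# growth outside the Julia heap, i.e. on the native SEAL side.
function sample_memory(iter; interval = 10_000)
    iter % interval == 0 || return
    GC.gc()  # rule out lazily collected Julia garbage first
    @printf "iter %-9d GC live: %9.3f MiB  Max. RSS: %9.3f MiB\n" iter Base.gc_live_bytes() / 2^20 Sys.maxrss() / 2^20
end

sample_memory(10_000)  # prints one sample line
```

Inside the `while true` loop this would be `sample_memory(iter)` with a running iteration counter.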

Yeah, probably this is not a Julia issue:

```julia
julia> meminfo_julia()
GC live:      43.181 MiB
JIT:           0.055 MiB
Max. RSS:  96301.328 MiB

julia> GC.gc()

julia> meminfo_julia()
GC live:      20.535 MiB
JIT:           0.059 MiB
Max. RSS:  96301.535 MiB
```

96 GB is just occupied for no apparent reason.