FRED-2 / OptiType

Precision HLA typing from next-generation sequencing data

Optitype memory error in v1.3.5 due to pandas' reindex

jnktsj opened this issue

Hi,

Thank you for developing this great tool! I had been using OptiType v1.3.2 and recently switched to the latest version, v1.3.5. All the samples that used to finish successfully with v1.3.2 now fail with an out-of-memory error in v1.3.5 (I tested both the Docker image and the source code). I increased memory from 16GB to 32GB, but no luck so far. The input FASTQs contain only the HLA-mapping reads from the pre-processing step, so they are only ~5-8MB depending on the sample.

This memory issue most likely comes from the pandas reindex call introduced in #102 (hlatype = result.iloc[0].reindex(["A1", "A2", "B1", "B2", "C1", "C2"]).drop_duplicates().dropna()). It would be great if this memory problem could be investigated and fixed.
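
One way to confirm or rule out a suspect line like this is to measure the peak allocation around it. Below is a minimal sketch using the standard-library tracemalloc module (Python 3). The helper name peak_memory_of is hypothetical and not part of OptiType, and the workload in the demo is only a placeholder. Note that tracemalloc only sees allocations made through Python's memory allocators, so memory taken directly by native code may not show up.

```python
import tracemalloc


def peak_memory_of(func, *args, **kwargs):
    """Run func and return (result, peak_bytes) as traced by tracemalloc.

    Hypothetical helper for illustration; not part of OptiType.
    """
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    # Placeholder workload: build a list of one million integers.
    _, peak = peak_memory_of(lambda: list(range(10**6)))
    print("peak: %.1f MiB" % (peak / 2.0**20))
```

To test the suspicion above, the same helper could wrap the reindex expression from #102 (passed as a lambda) inside a debugging copy of the script. If the peak there is small, the memory is probably going somewhere else.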

commented

Hi @jnktsj, I hit the same problem, even on a server with 128GB of RAM.

Hi @b-niu, my workaround for now is to avoid the latest OptiType. Here is the setup I use for OptiType v1.3.2:

pip install \
  numpy==1.15.4 pandas==0.22.0 matplotlib==2.1.2 \
  pyomo==5.3 pysam==0.13 future==0.16.0 \
  numexpr==2.6.4 tables==3.4.2 pyutilib==5.8

# razers3 3.5.8
wget 'http://packages.seqan.de/razers3/razers3-3.5.8-Linux-x86_64.tar.xz' && \
    tar -xf razers3-3.5.8-Linux-x86_64.tar.xz && rm razers3-3.5.8-Linux-x86_64.tar.xz && \
    mv razers3-3.5.8-Linux-x86_64/bin/razers3 /usr/local/bin/razers3 && rm -r razers3-3.5.8-Linux-x86_64

# Optitype v1.3.2 (don't use v1.3.5 due to memory leak)
wget 'https://github.com/FRED-2/OptiType/archive/refs/tags/v1.3.2.tar.gz' && \
    tar xzf v1.3.2.tar.gz && rm v1.3.2.tar.gz && mv OptiType-1.3.2 /usr/local/bin/
commented

Thanks a lot @jnktsj ! Nice solution 😄