raphael-group / chisel

CHISEL -- Copy-number Haplotype Inference in Single-cell by Evolutionary Links

CHISEL failing on Combiner.py ?

laurenhummel opened this issue · comments

Hi there,

Thanks so much for posting this package - I am very excited to use it on my single-cell data. I have been trying to configure CHISEL on our cluster since your 10x Webinar, but I am running into some issues with the example data. I understand this package is still preliminary, but hopefully you have some ideas.

My error logs and output data look good until the run fails, every time, at the "Combining" step (where Combiner.py is invoked) with exit status 137.

  • selectedSNPs.tsv, baf.tsv, rdr.tsv and total.tsv all look good.
  • combo.tsv and log_TMP are present but are empty files.
  • The directories calls, clones and plots are all empty.
  • I have been qsubbing it to an ultra_high_mem node with 220G of RAM, so I would be surprised if the issue really is "out of memory".
  • I am using the recommended versions of all dependencies (Gurobi however is not available on our cluster).

Please let me know if you have any ideas of what could be happening here.

Thanks so much!!

Lauren

Thank you for the interest in our method CHISEL!

This error is indeed suspicious because in all our tests CHISEL never required more than ~45GB of RAM in total (even when running with a very high number of parallel processes) and your machine thus has more than enough memory. However, I will be very happy to assist you in solving this issue.

Exit status 137 indicates that your process was killed with SIGKILL (137 = 128 + 9), which typically happens because of excessive memory use. Since you are running on a cluster, we can assume that the cluster manager killed it because your job exceeded its memory limit. The first important point is thus to make sure that you request a sufficient amount of memory. Since you mention qsub, I assume that yours is a PBS-based cluster, so please check the corresponding guide for the option that sets the maximum memory for your job; e.g., it could be something like -l mem=100000, where 100000MB = 100GB of total memory. Also, pay attention to two important points: (1) the units: memory is often requested in MB and not in GB, so you need to specify 100000 instead of 100; (2) whether the manager lets you specify the TOTAL amount of memory for the job or the amount of memory per processor. In any case, make sure to specify a sufficient amount of memory and, for the sake of debugging, I would try to request a large amount, e.g. 150-200GB. I was able to reproduce your issue on our cluster and, after specifying a sufficient amount of memory, everything ran smoothly.
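For anyone who wants to check the arithmetic, here is a minimal Python 3 sketch (not part of CHISEL) of the two points above: decoding a shell exit status into a signal name, and converting a memory request between GB and MB, either as a total or per processor. The function names are hypothetical, and the decimal conversion (1 GB = 1000 MB) is an assumption that simply mirrors the 100000MB = 100GB example; your cluster's guide is the authority on which units and which flag it actually expects.

```python
import signal


def signal_from_exit_status(status):
    """Return the name of the signal encoded in a shell exit status, or None."""
    # Shells report "killed by signal N" as exit status 128 + N.
    if status <= 128:
        return None
    try:
        return signal.Signals(status - 128).name
    except ValueError:
        # 128 + N where N is not a valid signal number on this platform.
        return None


def memory_request_mb(total_gb, n_procs=1, per_processor=False):
    """Memory figure in MB, either for the whole job or split evenly per processor."""
    total_mb = total_gb * 1000  # assumption: decimal units, as in 100000MB = 100GB
    if per_processor:
        return total_mb // n_procs
    return total_mb


print(signal_from_exit_status(137))                          # SIGKILL
print(memory_request_mb(100))                                # 100000
print(memory_request_mb(200, n_procs=8, per_processor=True))  # 25000
```

Note that SIGKILL only says the process was killed from outside; that a memory limit was the reason is an inference from the cluster context, so the scheduler's own job report is worth checking too.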

Moreover, a few additional points:

  1. Could you please provide CHISEL's log? CHISEL prints timestamped log messages during execution, which would allow me to better diagnose the issue.
  2. To avoid re-running all the steps during debugging, I suggest that you run only the Combiner step for testing. You can do that with the following command (you can additionally add -j to set the number of processes that you want to use on your node, e.g. -j 20):
     python2.7 ${CHISEL_HOME}/src/Combiner.py -r ${PATH_TO_RDR_DIR}/rdr.tsv -b ${PATH_TO_BAF_DIR}/baf.tsv
  3. Why did you mention Gurobi? Gurobi is not a dependency of CHISEL.
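If it helps with testing different values of -j, the Combiner-only run can be wrapped in a small Python 3 script. This is only a sketch: the helper names and the example paths are hypothetical, the -r, -b and -j options are the ones from the command above, and where Combiner.py writes its output is not assumed here (stdout and stderr are simply inherited from the calling process).

```python
import subprocess
import sys


def build_combiner_command(chisel_home, rdr_tsv, baf_tsv, jobs=None, python="python2.7"):
    """Build the argument list for running only the Combiner step."""
    cmd = [python, chisel_home + "/src/Combiner.py", "-r", rdr_tsv, "-b", baf_tsv]
    if jobs is not None:
        cmd += ["-j", str(jobs)]  # limit the number of parallel processes
    return cmd


def run_step(cmd):
    """Run cmd and return (returncode, killed_by_signal_or_None)."""
    returncode = subprocess.call(cmd)
    # subprocess reports death by signal N as -N; a shell would show 128 + N.
    killed_by = -returncode if returncode < 0 else None
    return returncode, killed_by


# Hypothetical paths, for illustration only.
print(build_combiner_command("/path/to/chisel", "rdr/rdr.tsv", "baf/baf.tsv", jobs=8))

# Smoke check of the runner with a trivial command instead of CHISEL itself.
print(run_step([sys.executable, "-c", "pass"]))  # (0, None)
```

If run_step reports that the command was killed by signal 9, that is the same situation the scheduler shows as exit status 137.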

Hi again,

Thanks so much for your help. After changing -j to 8 and running Combiner, Caller, Cloner and Plotter, I have been able to obtain 9 output plots: acn, bcn, cbaf, crdr, loh, minor, rbplot_mirrored, states and totalcn.

Can you please quickly confirm that the generation of phylogenetic trees is done separately from the CHISEL pipeline?

I have attached the log (minus progress bars) from the failed job for reference, in case anyone else has issues with a crash in "Combining":

CHISEL_test2_191105.e3225637.txt

Thanks again

Lauren

Good to hear that you have been able to fix it by limiting the maximum number of parallel processes! You can also check with your system manager whether there is any limitation in that regard.

Concerning the phylogenetic trees: yes, tree inference is currently not part of CHISEL. We are planning to add this step to the pipeline in the near future. For now, please check the CHISEL manuscript for details about the reconstruction of phylogenetic trees. Briefly, copy-number trees can be reconstructed from the copy-number profiles of the inferred clones using standard methods, including CNT from our research group (read corresponding manuscript) or the earlier MEDICC, which first introduced the interval evolutionary model for CNAs (read corresponding manuscript).

Last, please be aware that the current version of CHISEL has a small bug: it does not output the corrected haplotype-specific copy numbers (i.e. corrected according to the clone that a cell belongs to, in order to clean occasional small noise), but only outputs the raw haplotype-specific copy numbers directly inferred for every bin of every cell. A new version of CHISEL including this correction will be released next week, so please remember to update.

I will close the issue for now, but please feel free to re-open it in case of additional issues.