ml-jku / DeepRC

DeepRC: Immune repertoire classification with attention-based deep massive multiple instance learning

docs on how to apply DeepRC to BCR

antonkulaga opened this issue

Unfortunately, I could not tell from the paper whether I can build a training dataset from BCR AIRR-seq data. Could you clarify in the readme and/or the manuscript whether this is possible and, if so, whether any additional steps (compared with TCRs) are needed?

Hi! The implementation in the repo is fairly general and should work with any kind of repertoire, as long as you can provide the data in a suitable text-based format (please see https://github.com/ml-jku/DeepRC/blob/master/deeprc/datasets/README.md for the expected format).
In the paper we only ran experiments with TCR data and simulated repertoire data, but I would expect it to work similarly well for BCR data if your dataset is large enough.
I would recommend starting with https://github.com/ml-jku/DeepRC/blob/master/deeprc/examples/example_single_task_cnn.py and your own dataset. Your text-based dataset will be automatically compressed into an HDF5 file for performance reasons. No additional preprocessing should be required.
Did this answer your questions? If yes, I will incorporate this into the readme.
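
For readers starting from AIRR-formatted BCR rearrangement files, here is a minimal sketch of how one repertoire might be reduced to a plain two-column TSV (sequence plus count). The input columns `locus`, `junction_aa`, and `duplicate_count` are fields of the AIRR Rearrangement schema. The output column names `amino_acid` and `templates`, the helper names, and the example sequences are assumptions made for illustration only. The column names and directory layout that DeepRC actually expects are described in the datasets README linked above and should be checked there.

```python
import csv
import io
from collections import Counter


def airr_to_counts(airr_tsv_text, locus_prefix="IG"):
    """Aggregate an AIRR rearrangement TSV into {junction_aa: total count}.

    Hypothetical helper: keeps only rows whose locus starts with
    `locus_prefix` (IGH, IGK, IGL for BCR data) and skips rows
    without an amino acid junction.
    """
    reader = csv.DictReader(io.StringIO(airr_tsv_text), delimiter="\t")
    counts = Counter()
    for row in reader:
        seq = (row.get("junction_aa") or "").strip()
        locus = (row.get("locus") or "").strip()
        if not seq or not locus.startswith(locus_prefix):
            continue
        raw = (row.get("duplicate_count") or "").strip()
        # Treat a missing duplicate_count as a single observation.
        counts[seq] += int(raw) if raw else 1
    return counts


def counts_to_tsv(counts, sequence_column="amino_acid", counts_column="templates"):
    """Serialize the counts as a TSV; the column names are assumptions."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([sequence_column, counts_column])
    for seq, n in sorted(counts.items()):
        writer.writerow([seq, n])
    return out.getvalue()


# Made-up example rows, not real repertoire data.
example_airr = (
    "sequence_id\tlocus\tjunction_aa\tduplicate_count\n"
    "s1\tIGH\tCARDYW\t3\n"
    "s2\tIGH\tCARDYW\t2\n"
    "s3\tTRB\tCASSLF\t5\n"
    "s4\tIGK\tCQQYNSYPLTF\t\n"
)
example_counts = airr_to_counts(example_airr)
example_tsv = counts_to_tsv(example_counts)
print(example_tsv)
```

Each repertoire would get its own file of this kind, alongside a metadata file that maps repertoire IDs to labels. Whether to pool heavy and light chains or keep only one locus is a modelling decision that this sketch leaves open through `locus_prefix`.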

Marked as resolved since there was no response.