liulab-dfci / TRUST4

TCR and BCR assembly from RNA-seq data

full-length amino acid sequence

ljy-sys opened this issue · comments

CDR3序列.xlsx
Thanks to the developers of this tool for making it easier for us to analyze immune repertoire data. I used trust-smartseq.pl to analyze the VDJ information of smart-seq data. The attached file contains a CDR3 sequence that I am interested in, but I am not sure how to obtain the full-length amino acid sequences of TCRα and β.

Do you mean you also need the amino acid sequence including the V and J genes? In that case, you should use the AIRR output format and translate the amino acid sequence from the sequence_alignment column, provided the sequence is full-length.

Oh, yes, that's right. I'll have a try. Thank you very much.

For smart-seq3 data, how can I obtain the full amino acid sequences of all TCRa and TCRb chains instead of only the CDR3 sequences? I have already obtained full-length TCR amino acid sequences by aligning the CDR3 sequences against a database, but I want to check whether they are consistent with our own alignment. So I would like to obtain the TCR amino acid sequences for each cell directly.

The smartseq.pl wrapper should also output the _airr.tsv file. The sequence_alignment column shows how the underlying full sequence aligns against the IMGT database (the germline_alignment column). You may need to write your own script to translate the sequence_alignment column into amino acids.
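A translation script along those lines could look roughly like the sketch below. It is only an illustration, not part of TRUST4: the helper names `translate` and `translate_airr` are made up here, the column names `sequence_id` and `sequence_alignment` are taken from the AIRR format, and it assumes the aligned sequence starts in frame at the first codon of the V gene. Gap characters (`.` and `-`) are simply dropped if present, which can shift the frame when the alignment contains real indels, so the result is worth checking against the reported CDR3.

```python
import csv
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(nt):
    """Translate a nucleotide string in frame 0 (assumed, not verified).

    Gap characters are removed first; codons containing anything other
    than A/C/G/T become 'X'; an incomplete trailing codon is ignored.
    """
    clean = nt.upper().replace(".", "").replace("-", "")
    return "".join(
        CODON_TABLE.get(clean[i:i + 3], "X")
        for i in range(0, len(clean) - 2, 3)
    )


def translate_airr(path):
    """Yield (sequence_id, amino_acids) for each row of an AIRR TSV file."""
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            aligned = row.get("sequence_alignment") or ""
            if aligned:
                yield row.get("sequence_id", ""), translate(aligned)
```

For example, `for seq_id, aa in translate_airr("sample_airr.tsv"): print(seq_id, aa)` would print one translated sequence per row, where `sample_airr.tsv` is a placeholder file name.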

Good, I have now obtained the full-length amino acid sequence. However, I am not sure whether the "sequence" column or the "sequence_alignment" column in the AIRR file should be used for the translation. I hope you can help.

The "sequence" column holds the underlying assembled contigs, which include regions outside of the V genes. The "sequence_alignment" column is the portion that can be aligned to the IMGT database, so it is more appropriate to use that one.

OK, thank you very much for your reply. It is very helpful to me. Good luck!