full-length amino acid sequence

Question

ljy-sys opened this issue 7 months ago

CDR3序列.xlsx
Thanks to the developers for this tool, which makes it much easier for us to analyze immune repertoire data. I used trust-smartseq.pl to analyze the VDJ information in my Smart-seq data. The attached file contains a CDR3 sequence that I am interested in, but I am not sure how to obtain the full-length amino acid sequences of the TCRα and TCRβ chains.

Li Song · Answer 1 · Mon Jan 15 2024 11:57:50 GMT+0800 (China Standard Time)

Do you mean that you also need the amino acid sequence covering the V and J genes? In that case, you should use the AIRR output format and, if the sequence is full-length, translate the sequence_alignment column into amino acids.

ljy-sys · Answer 2 · Mon Jan 15 2024 12:18:03 GMT+0800 (China Standard Time)

Oh, yes, that's right. I'll have a try. Thank you very much.

ljy-sys · Answer 3 · Mon Jan 15 2024 17:31:52 GMT+0800 (China Standard Time)

For Smart-seq3 data, how can I obtain the complete TCRα and TCRβ chain amino acid sequences rather than only the CDR3 sequences? I have already obtained a full-length TCR amino acid sequence by aligning the CDR3 sequence against a database, but I want to check whether it is consistent with our own alignment results. That is why I would like to obtain the TCR amino acid sequences directly for each cell.

Li Song · Answer 4 · Mon Jan 15 2024 23:43:55 GMT+0800 (China Standard Time)

The smartseq.pl wrapper should also output an _airr.tsv file. Its sequence_alignment column shows how the underlying full sequence aligns against the IMGT database (the matching germline sequence is in the germline_alignment column). You may need to write your own script to translate the sequence_alignment column into amino acids, though.
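
A minimal sketch of such a translation script follows. It is not part of TRUST4 and rests on assumptions: the column names sequence_id, sequence_alignment and junction_aa are taken from the AIRR Rearrangement schema, gap characters ("-" or ".") are simply stripped, and the reading frame is chosen as the one whose translation contains the reported CDR3 (junction_aa). The helper names are hypothetical.

```python
import csv
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(nt, frame=0):
    """Translate a nucleotide string; unknown codons (e.g. with N) become X."""
    # Assumption: alignment gaps carry no sequence and can be dropped.
    nt = nt.upper().replace("-", "").replace(".", "")
    codons = (nt[i:i + 3] for i in range(frame, len(nt) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def translate_in_junction_frame(nt, junction_aa):
    """Return the translation whose frame contains junction_aa, else None."""
    if not junction_aa:
        return None
    for frame in range(3):
        aa = translate(nt, frame)
        if junction_aa in aa:
            return aa
    return None


def translate_airr(handle):
    """Yield (sequence_id, amino_acid_sequence) for each row of an AIRR TSV."""
    for row in csv.DictReader(handle, delimiter="\t"):
        aa = translate_in_junction_frame(
            row["sequence_alignment"], row.get("junction_aa", "")
        )
        yield row["sequence_id"], aa
```

To use it, open the _airr.tsv file in text mode and iterate over `translate_airr(handle)`. Rows for which no frame contains the CDR3 come back as `None` and are worth inspecting by hand, as are translations containing `*` (a stop codon), since those may point to an incomplete or non-productive assembly.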

ljy-sys · Answer 5 · Tue Jan 16 2024 11:13:46 GMT+0800 (China Standard Time)

Good, I have now obtained the full-length amino acid sequence. However, I am not sure whether the "sequence" column or the "sequence_alignment" column of the AIRR file should be translated into amino acids. I hope you can help.

Li Song · Answer 6 · Tue Jan 16 2024 11:16:06 GMT+0800 (China Standard Time)

The "sequence" column is the underlying assembled contig, which includes regions outside the V gene. The "sequence_alignment" column is the portion that can be aligned to the IMGT database, so it is the more appropriate one to use.

ljy-sys · Answer 7 · Tue Jan 16 2024 11:20:32 GMT+0800 (China Standard Time)

OK, thank you very much for your reply. It is very helpful to me. Good luck!