liulab-dfci / TRUST4

TCR and BCR assembly from RNA-seq data

failed: 139 at ./run-trust4 line 55.

BobW4ng opened this issue

Hi,
When I ran TRUST4 on my 10x FASTQ files, it reported: failed: 139 at ./run-trust4 line 55.

[Mon Apr 8 14:41:02 2024] TRUST4 begins.
[Mon Apr 8 14:41:02 2024] SYSTEM CALL: /home/work1/Documents/TRUSt4git/TRUST4/fastq-extractor -t 32 -f /home/work1/Documents/TRUSt4git/TRUST4/hg38_bcrtcr.fa -o /home/work1/Documents/TRUST4/HThuman/HRR568828_toassemble -u /home/work1/Documents/HT_human/HRR568828_R2.fastq.gz
[Mon Apr 8 14:41:02 2024] Start to extract candidate reads from read files.
system /home/work1/Documents/TRUSt4git/TRUST4/fastq-extractor -t 32 -f /home/work1/Documents/TRUSt4git/TRUST4/hg38_bcrtcr.fa -o /home/work1/Documents/TRUST4/HThuman/HRR568828_toassemble -u /home/work1/Documents/HT_human/HRR568828_R2.fastq.gz failed: 139 at ./run-trust4 line 55.

Can you help me? Thanks a lot!

Which version of TRUST4 are you using? Do you have access permission to "/home/work1/Documents/TRUST4/HThuman/" folder?

On an unrelated note, for 10x-style single-cell data you may need to use the --readFormat option to specify the barcode and UMI regions within a read.
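As a rough illustration of what a field such as bc:0:15 selects, the sketch below slices a read by a start and an end position. It assumes the positions are 0-based and inclusive, with -1 meaning the end of the read; that convention is an assumption here and should be checked against the TRUST4 README. The helper names `parse_read_format` and `extract_field` are hypothetical and not part of TRUST4.

```python
def parse_read_format(spec):
    """Parse a string like 'bc:0:15' into {field_name: (start, end)}."""
    fields = {}
    for item in spec.split(","):
        name, start, end = item.split(":")
        fields[name] = (int(start), int(end))
    return fields


def extract_field(seq, start, end):
    """Return seq[start..end] inclusive; end == -1 means the end of the read."""
    if end == -1:
        return seq[start:]
    return seq[start:end + 1]


# First 32 bases of the first R1 read from the barcode file shown later in this thread.
r1 = "GGTAACTCATGTGGTTTAGGTCGTATACTTTT"
fields = parse_read_format("bc:0:15")
barcode = extract_field(r1, *fields["bc"])
print(barcode, len(barcode))
```

Under that assumption, bc:0:15 picks the first 16 bases of each barcode read.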

Thank you for the reply. I think it is the latest version: I installed TRUST4 with git clone and make. I checked the permissions and they are OK, but I still have the issue.

[Tue Apr 9 09:55:27 2024] TRUST4 finishes.
[Tue Apr 9 09:55:27 2024] TRUST4 begins.
[Tue Apr 9 09:55:27 2024] SYSTEM CALL: /home/work1/Documents/TRUSt4git/TRUST4/fastq-extractor -t 32 -f /home/work1/Documents/TRUST4/HTmodel/bcrtcr.fa -o /home/work1/Documents/TRUST4/HTmodel/A-3_S1_L004_toassemble --readFormat bc:0:15 -u /home/work1/Documents/HLY_ABC/A/A-3_S1_L004_R2_001.fastq.gz --barcode /home/work1/Documents/HLY_ABC/A/A-3_S1_L004_R1_001.fastq.gz
[Tue Apr 9 09:55:27 2024] Start to extract candidate reads from read files.
Read file and barcode file have different number of reads.
system /home/work1/Documents/TRUSt4git/TRUST4/fastq-extractor -t 32 -f /home/work1/Documents/TRUST4/HTmodel/bcrtcr.fa -o /home/work1/Documents/TRUST4/HTmodel/A-3_S1_L004_toassemble --readFormat bc:0:15 -u /home/work1/Documents/HLY_ABC/A/A-3_S1_L004_R2_001.fastq.gz --barcode /home/work1/Documents/HLY_ABC/A/A-3_S1_L004_R1_001.fastq.gz failed: 256 at ./run-trust4 line 55.
There is another error I don't know how to fix: "Read file and barcode file have different number of reads." So I just used the R2.fastq.gz file on its own.
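The two numbers after "failed:" (139 and 256) can be decoded with a few bit operations. This sketch assumes that run-trust4 is a Perl script printing the raw wait status that Perl's `system` leaves in `$?`. The "at ./run-trust4 line 55." suffix looks like a Perl `die` message, but that remains an assumption. `decode_wait_status` is a hypothetical helper name.

```python
import signal


def decode_wait_status(status):
    """Split a raw POSIX wait status into (exit_code, signal_number, core_dumped)."""
    signal_number = status & 0x7F      # low 7 bits: terminating signal, 0 if none
    core_dumped = bool(status & 0x80)  # bit 7: core-dump flag
    exit_code = status >> 8            # high byte: exit code on a normal exit
    return exit_code, signal_number, core_dumped


for status in (139, 256):
    exit_code, sig, core = decode_wait_status(status)
    if sig:
        print(f"{status}: killed by {signal.Signals(sig).name}, core dumped: {core}")
    else:
        print(f"{status}: exited normally with code {exit_code}")
```

Read this way, 139 is signal 11 (a segmentation fault inside fastq-extractor), while 256 is an ordinary exit with code 1, which fits the explicit message about mismatched read counts.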

Have you preprocessed the raw FASTQ files? Steps such as read trimming and filtering may change the number of reads.

I don't know. The data were downloaded from the sequencing company.

Could you please show me the first few lines of these two files?

The HRR568828 file:
@a00224:86:H5F3KDSXX:1:1101:18466:1000 2:N:0:GTTGAGAA
NCTAAGGGCTGTGGCACTGTCCTGCTCTCCGGTCCTCGCAAGGGCCGAGAGGTGTACCGGCATTTCGGTAAGGCCCCAGGAACCCCGCACAGCCACACCAAACCCTACGTCCGCTCCAAGGGCCGGAAGTTCGAGCGTGCCAGAGGCCGAC
+
#FFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFF:FFFFF,FFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFF
@a00224:86:H5F3KDSXX:1:1101:18846:1000 2:N:0:GTTGAGAA
NGCCTTAAATGCTCCAGGGACCTTTGAGTTTCTAGAGTTATCATGTGCTACAGGAATGAGCAGTTTAAAGTTTCCAAAAAGGCTGGGCGTGGTGGCTCACACCTGTAATCCTAGCAATTTGGGAGGATGAGGTGGGCAGATCACTTTAGGT
+
#,,FFFFFFFF,F:,FFFFF,,FFF:F:FFF,FFFF,FFFF::F:F:,F,,F::FFFFF,FFFFFFF:,:FFF:FFFF,FF,:F:,F,,F,FFF,FF:F,:F,FFFFFF:,:FFFF,FFFFF,FFF,FF,FFFF,FFF,FF:F:FF,FFFF
@a00224:86:H5F3KDSXX:1:1101:19479:1000 2:N:0:GTTGAGAA
NTAGTTATCACCTTAGGTACATTATTATTTTGAATGATGAGGAATTTTTATTTTCATCTGCCTCAGTGGAGTGATTATATAGTATGCTAAGTAATCTTTCATTTCTTACAGAAGACGATCACCTTCTCCTTATTATAGTCGATATAGATCA

How about the barcode file?

Sorry, those were the wrong files. The correct ones are here:
R2 file:
@A01426:197:HMYWKDSX2:4:1101:1325:1000 2:N:0:CGTCAAGGGC+GAGTGACCTA
GTGTGGGCTAGTGCGTCTCTTTCATAGTCGCCAGTCATCATCTCTACATCATCCCAGGACATTATCGCTTGCCATGGTGGTACATATGATGTTTACTTTTGTATATGTTTGAAATTTTACATCAATCACTGTGTTACTCTGTTGTTCTCT
+
:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFF:FFFFFFF:FFFFF:FFFFFFFFFFFFFFFFF
@A01426:197:HMYWKDSX2:4:1101:1344:1000 2:N:0:CGTCAAGGGC+GAGTGACCTA
TGGTTGCTGAGAAGCGGCTCATTCCTGATGGCTGTGGTGTCAAATATATCCCCAATCGTGGTCCTCTGGACAAGTGGAGAGCCCTGCATTCCTGAAGGCTTCAATAGTTCTCCTATACCCTACCAAATCGTTCAATAATAAAATCTCGCA
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,FFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A01426:197:HMYWKDSX2:4:1101:1434:1000 2:N:0:CGTCAAGGGC+GAGTGACCTA
CTCGCTTGCATCTACTCCGCCCTCATCCTGCACGACGACGAGGTGACGGTCACGGAGGATAAGATCAATGCCCTCATTAAAGCAGCTGGTGTCAGCGTCGAACCTTTCTGGCCTGGCTTGTTTGCCAAGGCTCTGGCCAATGTCAACATT

barcode file:
@A01426:197:HMYWKDSX2:4:1101:1325:1000 1:N:0:CGTCAAGGGC+GAGTGACCTA
GGTAACTCATGTGGTTTAGGTCGTATACTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTATTGTTTTTTAAAATTTAAATTTTAAAATTTAAATTTAATTTATGAAAATTGGTTAATATTAAAAAAAAAATTATAAATATTAAATAATTTT
+
FFFFF:F:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:,,FFFFF,,,FFF:F:FF,FF,,,:FF,,FFFF,FF,F,,F,,,F,:F,,:,,F,F,,FF,,,FF::,FF,:F,FFF,,,F:F:F,,:FF,
@A01426:197:HMYWKDSX2:4:1101:1344:1000 1:N:0:CGTCAAGGGC+GAGTGACCTA
GAAACCTGTTGAATCCGACGGCGTTTAGTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTATTGGAAATTTTTTTTTTTAAAAATTTTGTTTTGTTAAAGTTAAATATTTTAAACATAAAGAAATAATAGAAATAAAAATTTAAAGATTA
+
FFFFF:F:FFFFFFFFFFFFFFFF,,,FF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFF:,,:,,,,,:FFF,FF,:F,,::,,:,F,,,F,,:,F,,,F,,:F,,F,:,,::,,,,,,,F,:,,,:,,,,,,,:,F,F,:F,,F,,,,,
@A01426:197:HMYWKDSX2:4:1101:1434:1000 1:N:0:CGTCAAGGGC+GAGTGACCTA
GGGCGTGTCTAAGCGTTACAATATGTATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTAGTTAAAAAATTTATTGGGTAAATTAAAAAAATAAGTTTAATAAAAAATAAAAAAGCACAAATTATATTAAAAATCATAAGAATGTTAATT

Could you please also count the number of lines in the two files?

It seems the sequences in the R1 file are pretty long. Does it also include actual sequence data?

The R1 file has 367680198 lines and the R2 file has 206562599 lines. R1 is much longer. What is wrong with my data? Thanks for the reply, and sorry for the trouble.
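A quick arithmetic check on these counts is sketched below. It assumes the numbers are lines of the decompressed files and that every record uses the standard four-line FASTQ layout seen in the excerpts above; `count_lines` and `describe_line_count` are hypothetical helper names.

```python
import gzip


def count_lines(path):
    """Count the lines in a gzipped text file by streaming it."""
    with gzip.open(path, "rb") as handle:
        return sum(1 for _ in handle)


def describe_line_count(n_lines):
    """Return (complete_records, leftover_lines) for a four-line FASTQ layout."""
    return divmod(n_lines, 4)


# Line counts reported above for R1 and R2.
for name, n_lines in (("R1", 367680198), ("R2", 206562599)):
    records, leftover = describe_line_count(n_lines)
    print(f"{name}: {records} complete records, {leftover} leftover lines")
```

Neither count is a multiple of 4 (R1 leaves 2 lines over, R2 leaves 3), so under these assumptions both files end in the middle of a record. That would point to truncated or damaged files rather than a simple difference in read numbers.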

I'm not sure about that. You could check some of the read IDs that show up only in R1 and not in R2. Could you also run gzip -t to check the integrity of these gzipped files?
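Both checks suggested here can be sketched in Python. This is only an approximation of what gzip -t does, and the read-ID comparison keeps every ID in memory, which may be too heavy for files of this size. The function names are hypothetical.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Decompress the whole file and report whether it ends cleanly.

    A bad CRC, a bad length, or a truncated stream surfaces as an
    exception while reading, roughly as gzip -t would report it.
    """
    try:
        with gzip.open(path, "rb") as handle:
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        return False
    return True


def read_ids(path):
    """Yield the read ID (text before the first space) of each FASTQ record."""
    with gzip.open(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 0:  # header line of a four-line record
                yield line[1:].split()[0]


def ids_only_in_first(path_a, path_b):
    """Return read IDs found in path_a but not in path_b (held in memory)."""
    return set(read_ids(path_a)) - set(read_ids(path_b))
```

For example, `gzip_is_intact("R1.fastq.gz")` would return False for a damaged file, and `ids_only_in_first("R1.fastq.gz", "R2.fastq.gz")` would list reads missing from R2; both file names are placeholders. The ID comparison itself raises an exception if either file is corrupt, so the integrity check is worth running first.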

Both files report an invalid CRC error and a length error, so I think these files are broken. I will replace them and try again. Thank you so much!

I have the same issue, but I am using a mapped BAM file.

./run-trust4 -b $data/G4-2135_sorted.bam -f $ref/GRCm38_bcrtcr.fa --ref $ref/mouse_IMGT+C.fa -t 10

The error looks like this:

Two reads from the unaligned fragment are not showing up together. Please use -u(--abnormalUnmapFlag from wrapper) option.
system /data/zxu/software/TRUST4/bam-extractor -b /archive/zxu/data/my_datasets/inhouse/Project_s1885r10t006_18Samples_20240520/Alignment/BAM/G4-2135_sorted.bam -t 10 -f /data/zxu/software/TRUST4/mouse/GRCm38_bcrtcr.fa -o output/G4-2135_sorted_toassemble  failed: 139 at ./run-trust4 line 55.

Does anyone have any clue?

@xanthexu, your file is all right. You can just add the "-u" option when running TRUST4. This happens because some tools generate sorted BAM files in which the two reads of an unmapped pair are not next to each other. "-u" makes the program slower, but it makes fewer assumptions about the sort order in the BAM file.
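The condition described here, both reads of an unmapped pair sitting next to each other, can be illustrated with the sketch below. It is not bam-extractor's actual logic: it works on hypothetical (read_name, flag) tuples in file order and uses only the standard SAM flag bits 0x1 (paired), 0x4 (read unmapped) and 0x8 (mate unmapped).

```python
PAIRED = 0x1
READ_UNMAPPED = 0x4
MATE_UNMAPPED = 0x8


def unmapped_pairs_adjacent(records):
    """Check that both reads of every fully unmapped pair are adjacent.

    `records` is an iterable of (read_name, flag) tuples in file order.
    """
    pending = None  # name of an unmapped read still waiting for its mate
    for name, flag in records:
        fully_unmapped = bool(
            flag & PAIRED and flag & READ_UNMAPPED and flag & MATE_UNMAPPED
        )
        if not fully_unmapped:
            if pending is not None:
                return False  # another record came between the two mates
            continue
        if pending is None:
            pending = name
        elif pending == name:
            pending = None  # mate found right after the first read
        else:
            return False  # a different unmapped read came first
    return pending is None


# Flags 77 and 141 mark the first and second read of a fully unmapped pair.
adjacent = [("a", 77), ("a", 141), ("b", 77), ("b", 141)]
separated = [("a", 77), ("b", 77), ("a", 141), ("b", 141)]
print(unmapped_pairs_adjacent(adjacent), unmapped_pairs_adjacent(separated))
```

A file that fails a check like this is the kind of input for which the "-u" option is suggested.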

Thanks, @mourisl. Yesterday I tried it myself and later found that removing the unmapped reads from the BAM file also solves this problem. This follows the same idea as the "-u" option you mentioned.

Many of the reads from the VDJ region are unmapped because of the recombination, so I would recommend keeping them in the analysis.