fgvieira / ngsDist

Estimation of pairwise distances under a probabilistic framework


Error reading full chunk

cbossu opened this issue · comments

Hello,

I'm running into an error while formatting the glf file to run ngsDist.

-> Printing at chr: scaffold1|size15271748 pos:3044951 chunknumber 60900 contains 50 sites
-> Printing at chr: scaffold1|size15271748 pos:3049951 chunknumber 61000 contains 50 sites
-> Printing at chr: scaffold1|size15271748 pos:3054951 chunknumber 61100 contains 50 sites
-> Printing at chr: scaffold1|size15271748 pos:3059951 chunknumber 61200 contains 50 sites

[glfReader.cpp:fetch():34] Error reading full chunk: bytesRead:3157 expected:13600 will exit

I don't think it is a memory issue, as it stops at this point whether I run it on a regular-memory or a large-memory node. I originally thought it was a glf issue, so I generated the glf in two ways (--doGlf 2 and --doGlf 4), but I get the same error either way. There was a previous post about this that said the glf might be corrupt; however, I can use the glf in ngsLD without any error. Do you have any ideas why this error is being thrown? Thanks.
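For what it's worth, the expected chunk size of 13600 bytes would match one site's worth of binary likelihoods for 170 individuals × 10 genotypes × 8-byte doubles, so the reader appears to hit a truncated record near the end of the file. A rough sketch of two quick integrity checks, assuming the $outfile.glf.gz name from the command below and, for the text glf, one line per site:

# Check that the compressed stream itself is intact; a truncated file should fail here
gzip -t $outfile.glf.gz && echo "gzip stream OK"

# For a text glf (--doGlf 4), count the lines and compare against the number of sites ANGSD reported
zcat $outfile.glf.gz | wc -l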

CH

Is the number of sites correct?
Which command line are you using?

I used the following code:

# Text glf with normal scale
$NGSTOOLS/angsd/angsd -P 16 -glf $outfile.glf.gz -fai BUOW.fasta.fai -nInd 170 -doMajorMinor 1 -doPost 1 -doMaf 1 -doGeno 8 -out all_incl_SC.rmrelFinal

So I haven't even run ngsDist yet, which is where the number of sites actually comes into play; I'm just preparing the genotype file. I used the --doGlf 4 option. Thanks for your help.

Ch
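For completeness, a rough sketch of the eventual ngsDist call where the number of sites comes into play; the genotype file name, label file, and N_SITES value are placeholders, and --n_sites has to match the number of sites actually present in the genotype file:

# Hypothetical downstream ngsDist call; --n_ind and --n_sites must match the genotype file exactly
$NGSTOOLS/ngsDist/ngsDist --geno all_incl_SC.rmrelFinal.geno.gz --probs --n_ind 170 --n_sites N_SITES --labels pops.label --out all_incl_SC.dist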

But then why are you using GL files as input? I'd suggest you try generating the genotype file directly from the BAM/CRAM files.

However, it looks more like an ANGSD issue, so it might be better to raise it on the ANGSD GitHub.
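For reference, a rough sketch of what generating the genotype file straight from the BAMs could look like, keeping the poster's thread count, individual options, and output settings; the bam.filelist, the BUOW.fasta reference, the SAMtools GL model (-GL 1), and the -SNP_pval cutoff are assumptions, not settings taken from this thread:

# Hypothetical example: genotype posteriors computed directly from BAMs, skipping the intermediate glf
$NGSTOOLS/angsd/angsd -P 16 -bam bam.filelist -ref BUOW.fasta -GL 1 -doMajorMinor 1 -doMaf 1 -doPost 1 -doGeno 8 -SNP_pval 1e-6 -out all_incl_SC.rmrelFinal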