ANGSD / NgsRelate

Problem with guess in emStep

tristanpwdennis opened this issue

Hello
I am trying to bootstrap my relatedness estimations.
I am using the command: ~/software/NgsRelate/ngsRelate -f freq -g angsdput.glf.gz -O test -L 521328 -n 81 -a ind1 -b ind2 -B 100

For most of the files (I am looping over ~200 individual pair combinations which are a subset of the whole callset contained in the .glf and .maf files) I get this error:

Problem with guess in emStep: -nan -nan -nan -nan -nan -nan -nan -nan -nan

I get the error when I loop over a list, and also when I specify individuals as a single command.

For 5 of the comparisons it seems to work fine (estimation has been made 101 times, as I guess the bootstrapping is zero indexed...). Can't work out what is going on.

Please could I trouble you for some assistance?

I have attached a Drive link to a compressed folder with the data and the analysis script (bash wrapper):
https://drive.google.com/open?id=1eraeCDVdOa-idliPg00TgMDkeRLM8zTz

Thank you!

Tristan

Hi Tristan,

Sorry for the late reply.

Thanks for providing an example upfront; it made it a lot easier to fix the bug. The bug was related to an error in the sampling with replacement.

When you provide -f freq, you don't have to also set the number of sites with -L. Furthermore, the example had 318194 sites, not 521328.
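For anyone who still wants to double-check the site count, here is a minimal sketch. It assumes the freq file holds one allele frequency per line with no header, so the number of non-empty lines equals the number of sites. The function name count_sites is made up for this example and is not part of ngsRelate.

```shell
#!/bin/sh
# count_sites FREQ_FILE
# Print the number of non-empty lines in FREQ_FILE.
# Assumption: one allele frequency per line and no header line,
# so this count is the number of sites.
count_sites() {
    awk 'NF > 0 { n++ } END { print n + 0 }' "$1"
}

# Example use (commented out so the snippet only defines the helper):
# count_sites freq
```

If the number printed differs from the value passed to -L, that mismatch is worth resolving before running the analysis, or -L can simply be left out when -f is given, as described above.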

Let us know if the update also works for you.

Kristian

Hi Kristian
Thanks for getting back.
That seems to be working really well now, thank you.

I have one final question:

I notice that in the bootstrapping output from the example data, the rab and KING values (which I have been using as the relatedness estimate) are now between 0.99 and 1, even when the two individuals are, say, half siblings (so I expect ~0.25 or so...). However, a number of the other coefficients in the output file (J1, inbred_relatedness, fraternity) are more reflective of the true relatedness (for some of the samples I already know the relationship). I've added the results file below.

This only happens when I use the bootstrapping - ought this to be happening?

Thank you for your assistance again!

Tristan

C1_S25-C4_S28.res.txt

I have just tested ngsRelate.

I get an rab of 0.18.

When you only analyze a single pair using -a and -b, you need to provide the index (0-based) of the two individuals.

./ngsRelate -f freq -g angsdput.glf.gz -n 81 -a 24 -b 27 -O test2_boot  -p4 -z list -B10

If you provide sample names to -a and -b, the software will take the first individual twice :)
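A wrapper script that loops over sample names could translate each name to its 0-based index before calling ngsRelate. The following is only a sketch. It assumes the file passed to -z (here called list) has one sample ID per line, in the same order as the individuals in the .glf file. The helper name name_to_index is hypothetical.

```shell
#!/bin/sh
# name_to_index NAME IDS_FILE
# Print the 0-based line position of NAME in IDS_FILE, comparing NAME
# against the first whitespace-delimited field of each line.
# Returns a non-zero status if NAME is not found.
# Assumption: IDS_FILE lists samples in the same order as the GLF file.
name_to_index() {
    awk -v name="$1" '
        $1 == name { print NR - 1; found = 1; exit }
        END { if (!found) exit 1 }
    ' "$2"
}

# Example use (commented out so the snippet only defines the helper):
# a=$(name_to_index C1_S25 list) || exit 1
# b=$(name_to_index C4_S28 list) || exit 1
# ./ngsRelate -f freq -g angsdput.glf.gz -n 81 -a "$a" -b "$b" -O test2_boot -z list -B 10
```

Failing loudly when a name is missing avoids silently comparing the wrong pair.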

Let us know if it helped.

I'll add a check for this in ngsRelate.

Kristian

I have now added a sanity check. If a string is provided as the argument to -a or -b, the software will return an error.

Let me know if it works.
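The same kind of guard can be placed in a bash wrapper so that a bad argument is caught before ngsRelate is even started. This is a sketch of the idea only, not the check that was added to ngsRelate itself. The function name is_zero_based_index is invented for illustration, and the second argument is assumed to be the value given to -n.

```shell
#!/bin/sh
# is_zero_based_index VALUE N
# Succeed only if VALUE consists solely of digits and is smaller than N.
# Assumption: N is the number of individuals (the value given to -n).
is_zero_based_index() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    [ "$1" -lt "$2" ]
}

# Example use (commented out so the snippet only defines the helper):
# is_zero_based_index "$a" 81 || { echo "'$a' is not a 0-based index" >&2; exit 1; }
```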

That works fine, thank you!

However, I pulled the most recent version and passed two sample names instead of indices, and got no error message:
~/Software/NgsRelate/ngsRelate -f freq -g angsdput.glf.gz -n 81 -a C4_S28 -b C1_S25 -O test2_boot -p4 -z list -B10

Good.

I have just tried your command:

../NgsRelate/ngsRelate -f freq -g angsdput.glf.gz -n 81 -a C4_S28 -b C1_S25 -O test2 -p1 -z list -B 1
and get the following error:
pair 'C4_S28' is not a 0-based index

I'll close the issue. Feel free to reopen it if you encounter other issues with ngsRelate.

Glad that you're using the software.

Kristian