xyang619/AdmixInfer_old

AdmixInfer v1.0.1
=================================================
Short Description:
AdmixInfer optimizes the parameters of admixture models via maximum 
likelihood estimation and identifies the model that best fits the 
data. The optimization is carried out under the HI, GA, CGFR and 
CGFD models.

1. Compile
Compile from the source code with the following commands:

bash$ tar -zvxf AdmixInfer.tar.gz
bash$ cd AdmixInfer/src
bash$ make

After compiling, you will get the executable AdmixInfer. Type the
command below to print the help information:

bash$ ./AdmixInfer -h

2. Test with the toy data

bash$ ./AdmixInfer -f Uygur.seg -m 200 -c 0.01 -o Uygur.llk

Example explanation:
AdmixInfer reads the ancestral tracks from Uygur.seg, searches for
the optimal generation from 1 to 200, discards tracks shorter than 
the 0.01 Morgan cutoff, and saves the likelihoods to the file 
Uygur.llk for further reference.

A results summary is also printed to the screen, for example:

==================================================================
Results Summary
Parental-population-1: CEU
Admix-proportion: (0.428372 ± 0.00179507)
Optimal-model: GA(100%); Optimal-generation: (55.7721 ± 0.419461)
==================================================================

3. File formats

3.1 Input file format
AdmixInfer needs only one input file, in which each line represents 
an ancestral track with four fields: the start point, the end point, 
the ancestry from which the track originates, and the chromosome. 
For example:

0.00000000      0.34602058      Yoruba 1
0.34602058      0.34614778      French 1
......
0.40759031      0.41517938      Yoruba 22

The start and end points are in Morgan.
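
The four-field layout above can be read with a few lines of Python.
This is a minimal sketch and not part of AdmixInfer: the names Track
and parse_tracks are hypothetical, and it assumes that fields are
separated by whitespace and that lines without exactly four fields
(such as the "......" placeholder) can be skipped.

```python
from typing import Iterable, List, NamedTuple


class Track(NamedTuple):
    """One ancestral track (hypothetical helper type, not from AdmixInfer)."""
    start: float    # start point, in Morgan
    end: float      # end point, in Morgan
    ancestry: str   # ancestry label, e.g. "Yoruba"
    chrom: str      # chromosome label, kept as text

    @property
    def length(self) -> float:
        # Track length in Morgan.
        return self.end - self.start


def parse_tracks(lines: Iterable[str]) -> List[Track]:
    """Parse whitespace-separated track lines, skipping malformed ones."""
    tracks = []
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # blank line or placeholder such as "......"
        start, end, ancestry, chrom = fields
        tracks.append(Track(float(start), float(end), ancestry, chrom))
    return tracks


sample = """\
0.00000000      0.34602058      Yoruba 1
0.34602058      0.34614778      French 1
......
0.40759031      0.41517938      Yoruba 22
"""

tracks = parse_tracks(sample.splitlines())
for t in tracks:
    print(t.chrom, t.ancestry, t.length)
```

To read a real file, pass an open file object instead of the sample
lines, for example parse_tracks(open("Uygur.seg")).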

3.2 Output file format
AdmixInfer saves the likelihoods of each generation searched under 
each model assumption for further reference. The format is 
straightforward, and the values are reported chromosome by chromosome.

For example:

chr1
Generation      HI      GA      CGFR    CGFD
1       -522.625        -522.625        -522.625        -522.625
2       -52.7604        -177.942        -135.031        -166.291
......
chr22
Generation      HI      GA      CGFR    CGFD
1       -108.11         -108.11         -108.11         -108.11
2       -6.72091        -33.6433        -24.8894        -30.6868
......
99      328.038         348.171         350.288         351.177
100     326.962         347.958         350.213         351.21
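
The per-chromosome tables can be loaded and scanned for the generation
with the highest value under each model. This is a minimal sketch and
not part of AdmixInfer: parse_likelihoods and best_per_model are
hypothetical names, and the sketch assumes that each block starts with
a "chr..." line followed by a "Generation ..." header line. How
AdmixInfer itself combines chromosomes or selects the optimal model is
not described here, so the code only reports the maximum found in the
excerpt for one chromosome.

```python
def parse_likelihoods(lines):
    """Return {chromosome: {generation: {model: value}}}."""
    result = {}
    chrom = None
    models = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if len(fields) == 1 and fields[0].startswith("chr"):
            chrom = fields[0]          # start of a new chromosome block
            result[chrom] = {}
        elif fields[0] == "Generation":
            models = fields[1:]        # column names: HI, GA, CGFR, CGFD
        elif chrom is not None and len(fields) == len(models) + 1:
            try:
                generation = int(fields[0])
                values = [float(x) for x in fields[1:]]
            except ValueError:
                continue               # placeholder such as "......"
            result[chrom][generation] = dict(zip(models, values))
    return result


def best_per_model(table):
    """For one chromosome, return {model: (generation, value)} at the maximum."""
    best = {}
    for generation, row in table.items():
        for model, value in row.items():
            if model not in best or value > best[model][1]:
                best[model] = (generation, value)
    return best


sample = """\
chr22
Generation      HI      GA      CGFR    CGFD
1       -108.11         -108.11         -108.11         -108.11
2       -6.72091        -33.6433        -24.8894        -30.6868
......
99      328.038         348.171         350.288         351.177
100     326.962         347.958         350.213         351.21
"""

tables = parse_likelihoods(sample.splitlines())
for model, (generation, value) in best_per_model(tables["chr22"]).items():
    print(model, generation, value)
```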

4. Full argument list
-h/--help
Print the help information. Use this if you forget the usage of 
any argument.

-f/--file <filename>
Required. Specifies the input file of ancestral tracks; the format 
is described in section 3.1.

-m/--maxT [generation]
Optional. Specifies the maximum generation to search up to. The 
default is 500 generations, corresponding to ~15000 years before 
present (assuming 30 years per generation).

-c/--cutoff [value]
Optional. Specifies the threshold (in Morgan) used to discard short 
tracks, since the inference of short tracks is uncertain. The 
default is 0, which uses all the tracks.

-o/--output [output_filename]
Optional. Specifies the output filename, in which the likelihoods 
for each generation under the HI, GA, CGFR and CGFD model 
assumptions are saved.
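
The arithmetic behind -m and the effect of -c can be illustrated in a
few lines of Python. This is a sketch under stated assumptions, not
AdmixInfer's own code: generations_to_years and keep_long_tracks are
hypothetical names, and treating a track whose length equals the
cutoff as kept is an assumption, since the boundary case is not
specified above.

```python
YEARS_PER_GENERATION = 30  # same assumption as in the -m description


def generations_to_years(generations, years_per_generation=YEARS_PER_GENERATION):
    """Convert a number of generations to years before present."""
    return generations * years_per_generation


def keep_long_tracks(tracks, cutoff=0.0):
    """Keep (start, end) pairs, in Morgan, whose length is at least cutoff.

    Assumption: a track exactly as long as the cutoff is kept.
    """
    return [(start, end) for start, end in tracks if (end - start) >= cutoff]


# Start and end points taken from the input example in section 3.1.
tracks = [
    (0.00000000, 0.34602058),
    (0.34602058, 0.34614778),
    (0.40759031, 0.41517938),
]

print(generations_to_years(500))            # default -m value, in years
print(len(keep_long_tracks(tracks)))        # default cutoff of 0
print(len(keep_long_tracks(tracks, 0.01)))  # cutoff from the toy example
```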

5. License
GNU GENERAL PUBLIC LICENSE Version 3 
http://www.gnu.org/licenses/gpl-3.0.html
=================================================
6. Questions and Suggestions
Questions and suggestions are welcome; feel free to contact 
Shawn xyang619@gmail.com