Guanliang MENG's repositories
taxonomy_ranks
To get taxonomic rank information with the ETE3 Python3 module (http://etetoolkit.org/)
gbseqextractor
Extract the DNA sequences of any CDS, rRNA, or tRNA genes from a GenBank file.
msaconverter
msaconverter is a tool to convert a multiple sequence alignment into different formats with Biopython (http://www.biopython.org/)
extract_codon_alignment
To extract some codon positions (1st, 2nd, 3rd) from a CDS alignment.
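As an illustration of what extracting codon positions means, here is a minimal sketch, not the tool's actual code. It assumes each aligned sequence starts in frame at codon position 1; the function name `extract_codon_positions` and the toy alignment are hypothetical.

```python
def extract_codon_positions(seq, positions=(1, 2)):
    """Keep only the requested codon positions (1-based) of an in-frame sequence."""
    wanted = {p - 1 for p in positions}  # convert to 0-based offsets within a codon
    return "".join(base for i, base in enumerate(seq) if i % 3 in wanted)


# Toy CDS alignment (hypothetical data): sequence name -> aligned sequence
alignment = {"sp1": "ATGGCCTAA", "sp2": "ATGGCTTAA"}

first_second = {name: extract_codon_positions(s, (1, 2)) for name, s in alignment.items()}
third = {name: extract_codon_positions(s, (3,)) for name, s in alignment.items()}
```

Gap columns are treated like any other character here, so the alignment stays in register as long as every row has the same length.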
msa_cigars
A tool to get the CIGARs of a multiple sequence alignment.
bold_identification
To identify the taxa of given sequences via the BOLD system (http://www.boldsystems.org/index.php)
extract_fasta_seq
To extract specific fasta sequences from a fasta file. By Guanliang MENG, see https://github.com/linzhi2013
polish_genbank
Check for internal stop codons, then substitute each internal stop codon with NNN.
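The masking step can be sketched roughly as follows. This is an assumption-laden illustration rather than the repository's implementation: it uses the stop codons of the standard genetic code only, treats the final codon as a legitimate terminal stop, and the name `mask_internal_stops` is hypothetical.

```python
# Stop codons of the standard genetic code (assumption: other tables are not handled here)
STANDARD_STOPS = {"TAA", "TAG", "TGA"}


def mask_internal_stops(cds, stops=STANDARD_STOPS):
    """Replace every stop codon except the final codon with NNN."""
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    last = len(codons) - 1
    masked = [
        "NNN" if i < last and codon.upper() in stops else codon
        for i, codon in enumerate(codons)
    ]
    return "".join(masked)


polished = mask_internal_stops("ATGTAAGCCTAG")
```

In this example the internal `TAA` becomes `NNN`, while the terminal `TAG` is left untouched.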
specimen_bioseq_system
The Specimen Bioseq Information Management System
atgcN_count
To count the occurrences of each base in a FASTA file.
bioconda-recipes
Conda recipes for the bioconda channel.
breakSeqInNs_then_translate
Filter sequences by translating their protein-coding genes (PCGs) with the appropriate genetic code table; if any PCG has an internal stop codon, the sequence is filtered out.
cigar_coordinates
To get the coordinates of a given CIGAR string.
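One possible reading of "coordinates of a CIGAR string" is the query and reference interval covered by each operation. The sketch below follows that reading, using the SAM convention for which operations consume the query and the reference; the function name `cigar_blocks` and the 0-based, half-open intervals are assumptions, not the tool's documented behaviour.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
CONSUMES_QUERY = set("MIS=X")  # operations that advance along the query
CONSUMES_REF = set("MDN=X")    # operations that advance along the reference


def cigar_blocks(cigar, ref_start=0):
    """Return (op, (query_start, query_end), (ref_start, ref_end)) for each operation."""
    blocks = []
    q, r = 0, ref_start
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        q_end = q + n if op in CONSUMES_QUERY else q
        r_end = r + n if op in CONSUMES_REF else r
        blocks.append((op, (q, q_end), (r, r_end)))
        q, r = q_end, r_end
    return blocks


blocks = cigar_blocks("3M1I2M2D4M")
```

An insertion yields an empty reference interval and a deletion an empty query interval, which makes it easy to see where the two sequences fall out of step.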
extract_specific_lines
To extract the lines of the subject file that match the query IDs (from the query file).
extract_specific_sites_from_msa
To extract some sites (or codons) from a multiple sequence alignment
find_longest_transcripts
To find out the longest transcripts/proteins
group_genetic_distance
To derive within- and between-group genetic distances from a pairwise genetic distance matrix
Machine-learning-learning-notes
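Zhou Zhihua's "Machine Learning" (机器学习), also known as the Watermelon Book, is a fairly comprehensive book that describes in detail the different types of algorithms in the machine learning field (e.g., supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, ensemble learning, dimensionality reduction, feature selection). These notes record my understanding and some extended knowledge points from my own study, in the hope of helping newcomers read the Watermelon Book!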
mglcmdtools
`mglcmdtools` is a collection of common cmd tools intended to be used in Python3 scripts. By Guanliang MENG, see https://github.com/linzhi2013/mglcmdtools.
ntJoin
🔗Genome assembly scaffolder using minimizer graphs
ParallelTask
A simple and lightweight parallel task engine
physalia-lcwgs
Files for the Physalia course on Population genomic inference from low-coverage whole-genome sequencing data, Oct 11-14, 2021