hh1985 / sequence2vec

Vectorize the nucleotide sequences, so that the similarity of the sequences can be represented as similarity of vectors.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

kmer2vec

kmer2vec is an algorithm providing short seqeuence (k <= 10) embedding strategy, so that reads which share overlapping sequences tend to be "close" in N dimentional space.

read2vec

Read2Vec is an algorithm providing read-level (n = 150, 250, or 300) embedding strategy, so that reads which share overlapping sequences tend to be "close" in N dimentional space.

contig2vec

contig2vec is an algorithm which clusters and "assemble" reads based on graphical feature of reads and biological evidence (e.g. paired end reads). Segments from the same genome should be "close" to each other.

About

Vectorize the nucleotide sequences, so that the similarity of the sequences can be represented as similarity of vectors.