subwaystation / smoothxg

linearize and simplify variation graphs using blocked partial order alignment

smoothxg

local reconstruction of variation graphs using partial order alignment

Pangenome graphs built from raw sets of alignments may have complex local structures generated by common patterns of genome variation. These local nonlinearities can introduce difficulty in downstream analyses, visualization, and interpretation of variation graphs.

smoothxg finds blocks of paths that are collinear within a variation graph. It applies partial order alignment to each block, yielding an acyclic variation graph. Then, to yield a "smoothed" graph, it walks the original paths to lace these subgraphs together. The resulting graph contains cyclic or inverting structures only where they are larger than the chosen block size, and is otherwise manifold linear. In addition to providing a linear structure to the graph, smoothxg can be used to extract the consensus pangenome graph by applying the heaviest bundle algorithm to each chain.
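To give a rough intuition for consensus extraction from an acyclic block, here is a minimal sketch that picks a single maximum-weight traversal of a weighted DAG. It is a simplified stand-in, not the full heaviest bundle procedure and not smoothxg's implementation. The function name `heaviest_path`, the edge-weight dictionary, and the toy graph are all assumptions made for illustration; edge weights are taken to mean the number of sequences that traverse each edge.

```python
from collections import defaultdict


def heaviest_path(edges, topo_order):
    """Return one maximum-weight path through a weighted DAG.

    edges: dict mapping (src, dst) -> weight (assumed here to be the number
    of sequences traversing that edge). topo_order: nodes in topological order.
    """
    incoming = defaultdict(list)
    for (src, dst), weight in edges.items():
        incoming[dst].append((src, weight))

    # Dynamic programming over the topological order: best score ending at
    # each node, and the predecessor that achieves it.
    score = {node: 0 for node in topo_order}
    best_prev = {node: None for node in topo_order}
    for node in topo_order:
        for src, weight in incoming[node]:
            if score[src] + weight > score[node]:
                score[node] = score[src] + weight
                best_prev[node] = src

    # Trace back from the best-scoring node to recover the path.
    end = max(topo_order, key=lambda n: score[n])
    path = []
    while end is not None:
        path.append(end)
        end = best_prev[end]
    return path[::-1]


# Toy bubble: three sequences go A -> B -> D, one goes A -> C -> D.
edges = {("A", "B"): 3, ("B", "D"): 3, ("A", "C"): 1, ("C", "D"): 1}
print(heaviest_path(edges, ["A", "B", "C", "D"]))
```

In this toy bubble the majority branch through B carries more weight than the branch through C, so the sketch selects the traversal through B as the consensus.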

To find blocks, smoothxg applies a greedy algorithm that assumes the graph's nodes are sorted according to their occurrence in the graph's embedded paths. The path-guided, stochastic gradient descent based 1D sort implemented in odgi sort -Y is designed to provide this kind of sort.
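The idea of greedy blocking over a sorted graph can be sketched as follows. This is a simplified illustration under stated assumptions, not smoothxg's actual block-finding code: the `Node` record, the `greedy_blocks` function, and the `max_block_weight` parameter are hypothetical names, and block weight is assumed here to be node length times the number of paths visiting the node. The sketch walks the nodes in sort order and starts a new block whenever the next node would push the current block over the limit.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node record: id, sequence length, and visiting paths."""
    node_id: int
    length: int
    paths: frozenset


def greedy_blocks(sorted_nodes, max_block_weight):
    """Group consecutive nodes (in sort order) into blocks.

    A block's weight is assumed to be the total path sequence it covers:
    the sum over its nodes of length * number of visiting paths.
    """
    blocks = []
    current, weight = [], 0
    for node in sorted_nodes:
        node_weight = node.length * len(node.paths)
        # Close the current block if this node would exceed the limit.
        if current and weight + node_weight > max_block_weight:
            blocks.append(current)
            current, weight = [], 0
        current.append(node.node_id)
        weight += node_weight
    if current:
        blocks.append(current)
    return blocks


nodes = [
    Node(1, 4, frozenset({"a", "b"})),  # weight 8
    Node(2, 1, frozenset({"a"})),       # weight 1
    Node(3, 1, frozenset({"b"})),       # weight 1
    Node(4, 5, frozenset({"a", "b"})),  # weight 10
    Node(5, 2, frozenset({"a", "b"})),  # weight 4
]
print(greedy_blocks(nodes, max_block_weight=12))  # [[1, 2, 3], [4], [5]]
```

The sketch only makes sense if neighbouring positions in the sort order correspond to neighbouring positions along the paths, which is why the quality of the 1D sort matters for the blocks that result.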

building

smoothxg is built with cmake:

cmake -H. -Bbuild && cmake --build build -- -j4
