owensgroup / csgm

CUDA implementation of seeded graph matching

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Review request

bkj opened this issue · comments

This is nearly done, subject to some final testing on additional datasets to find edge cases.

@ctcyang has already looked at this and optimized some bits, but any feedback would be appreciated.

Depending on the dataset, the bottleneck should be SpMM or the auction algorithm. (I'm working on implementing/testing a CUB version of the auction algorithm that should make better use of parallelism, but that won't change a whole lot of code.)

I and other JHU collaborators have worked on this problem for some time. there are various datasets in https://github.com/adalisan/SeededGraphMatch/tree/master/data that would be useful for testing. We also found the auction algorithm to be the harder-to-scale part of the SGM algorithm.

@adalisan This implementation is based on work from Carey Priebe et al at Hopkins. Did you guys ever try using the auction algorithm for the assignment problem? The JHU implementations I'd seen used Hungarian/JV.

Carey was my PhD advisor. I vaguely remember people considering an alternative matching algorithm or implementation, I am not sure if it was YiCao or JV or the auction algorithm . but if the only advantage is parallelizability of the auction algorithm and not lower computationally complexity, then we didn't consider the auction algorithm. Maybe try paging @jovo for clarification? Is there an easy-to-use implementation for the auction algo?

Ah cool. I can't remember the complexity of auction vs JV off the top of my head -- but parallelizability + ability to be distributed is a nice property :)

Not sure what the best implementation of the auction algorithm is out there. I have a reasonably concise one here:
https://github.com/bkj/auction-lap
but it's not particularly high performance.

@adalisan Here's an easy-to-use implementation of the auction algorithm. Just wrote it now, so possibly some errors or inefficiencies, but seems to work reasonable well.

https://github.com/bkj/numba_auction

Comparing auction to JV is interesting -- relative runtimes seem to change a lot depending on the properties of the cost matrix.

hungarian.
https://github.com/neurodata/graspy/blob/master/graspy/match/gmp.py
scipy/scipy#12161

@asaadeldin11
@bdpedigo

Older versions of scipy (which is what we call for LAP) were using Hungarian, since 1.4 they are using a modified JV from https://ieeexplore.ieee.org/document/7738348

From that paper,

In [33], the Munkres,
JVC, deepest hole,7 and auction algorithms are compared
based on a measurement assignment problem for multiple
target tracking, where the JVC algorithm is found to be the
best overall, with the auction algorithm being faster on
sparse problems. In the assignment problem, “sparse”
means that many elements of C are not finite (certain rows
cannot be assigned to certain columns).

I wonder if this is related to what you mean by runtimes depending on cost matrix a lot

wikipedia suggests O(n^3) for hungarian and JV. complexity of the auction algorithm is more vague, but as you suggest it may be exploiting sparsity in the cost matrix.