A highly efficient cross-matching framework for large astronomical catalogues on heterogeneous computing environments
This is the main source code of HLC2, a high-performance command-line tool for cross-matching two astronomical catalogues. HLC2 runs on Linux and is implemented in C and C++.
The specific steps of cross-matching for two astronomical catalogues are shown in the following figure.
First, the Extraction module extracts celestial position information (mainly RA and DEC) from the input data and filters out useless information. The First-level partition module divides the extracted position information into HEALPix data blocks, while our quad-direction strategy is implemented in the Boundary processing module to reduce the loss of accuracy without adding too much redundant data. Then the Second-level partition module produces calculation blocks, which are used for storing catalogue records and accessing them in parallel on the GPU. The Source reading module retrieves the current CPU and GPU computing status so that the splitting strategy can be adjusted dynamically. On the GPU, inter-catalogue parallelization is adopted to calculate the radius distances in the Kernel module, with I/O optimizations in the Compression module. Finally, the matching results are transferred to the CPU, exported as the final products, and visualized. This function is under development.
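The distance test at the heart of the matching step can be illustrated with a minimal CPU-only sketch. This is not HLC2's GPU kernel: the `SkyPos` type, the function names, and the `radius_arcsec` parameter are all hypothetical, and the brute-force loop stands in for the partitioned, parallel search described above. It only shows how two (RA, DEC) positions might be compared against a matching radius, here using the haversine formula.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical record type: a sky position in degrees.
struct SkyPos {
    double ra_deg;
    double dec_deg;
};

constexpr double kPi = 3.14159265358979323846;
constexpr double kDegToRad = kPi / 180.0;

// Angular separation in arcseconds, computed with the haversine formula,
// which stays numerically stable for the small angles typical of cross-matching.
double angular_sep_arcsec(const SkyPos& a, const SkyPos& b) {
    const double dec1 = a.dec_deg * kDegToRad;
    const double dec2 = b.dec_deg * kDegToRad;
    const double s_dec = std::sin((dec2 - dec1) / 2.0);
    const double s_ra = std::sin((b.ra_deg - a.ra_deg) * kDegToRad / 2.0);
    double h = s_dec * s_dec + std::cos(dec1) * std::cos(dec2) * s_ra * s_ra;
    if (h > 1.0) h = 1.0;  // guard against rounding just above 1
    return 2.0 * std::asin(std::sqrt(h)) / kDegToRad * 3600.0;
}

// Brute-force O(N*M) match: returns index pairs (i, j) such that
// cat_a[i] and cat_b[j] lie within radius_arcsec of each other.
// A real tool would restrict each search to nearby partitions.
std::vector<std::pair<std::size_t, std::size_t>> cross_match(
    const std::vector<SkyPos>& cat_a,
    const std::vector<SkyPos>& cat_b,
    double radius_arcsec) {
    assert(radius_arcsec > 0.0);
    std::vector<std::pair<std::size_t, std::size_t>> matches;
    for (std::size_t i = 0; i < cat_a.size(); ++i) {
        for (std::size_t j = 0; j < cat_b.size(); ++j) {
            if (angular_sep_arcsec(cat_a[i], cat_b[j]) <= radius_arcsec) {
                matches.emplace_back(i, j);
            }
        }
    }
    return matches;
}
```

Calling `cross_match(a, b, 1.0)` on two small vectors of `SkyPos` returns every pair separated by at most one arcsecond. Because the formula uses the sine of half the RA difference, pairs that straddle the RA = 0°/360° boundary are handled without special cases.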
- Supports end-to-end efficient cross-matching of catalogues.
- Scales well in CPU-GPU heterogeneous platforms.
- CUDA Toolkit: https://developer.nvidia.com/cuda-toolkit-archive
- Change to the 'HLC2' directory.
- Update the library file paths in the Makefile according to the paths of installed dependencies.
- Generate an executable file named Crossmatch: make
- To clear the files generated by the last make command, run: make clean
- After successful compilation, run the following command to test cross-matching performance:
$ ./Crossmatch ../data_sample/sample1_sdss.csv ../data_sample/sample2_sdss.csv
The cross-matching time of the two catalogues will be printed in the console.
- If you want to get the cross-matching result, use the following command:
$ ./Crossmatch ../data_sample/sample1_sdss.csv ../data_sample/sample2_sdss.csv result.txt
Successfully matched catalogue record pairs are written to the specified output file.
HLC2 is still being improved. If you have any questions or ideas, please share your suggestions; pull requests are welcome. You can also contact us at the following address.