arXiv Literature Clustering

Clustering of a random subset of 600,000 papers drawn from the 1,747,307 papers hosted on arXiv. Papers are clustered using the techniques described in COVID-19 Literature Clustering.

Dataset: arXiv Dataset (Kaggle)
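
This README does not say how the random subset was drawn. As a minimal sketch of one way to do it, the snippet below reservoir-samples a fixed number of records from a stream of JSON lines in a single pass, so the full file never has to sit in memory. The function name `sample_records`, the `seed` parameter, and the toy records with `id` and `abstract` fields are assumptions made for illustration; they are not taken from the notebook.

```python
import io
import json
import random


def sample_records(lines, k, seed=0):
    """Return k records chosen uniformly at random from an iterable of JSON lines.

    Hypothetical helper (not from the notebook). Uses reservoir sampling
    (Algorithm R), so the input is read once and only k records are kept.
    """
    rng = random.Random(seed)
    reservoir = []
    seen = 0  # number of non-empty records read so far
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        seen += 1
        if len(reservoir) < k:
            # Fill the reservoir with the first k records.
            reservoir.append(record)
        else:
            # Keep the new record with probability k / seen.
            j = rng.randrange(seen)
            if j < k:
                reservoir[j] = record
    return reservoir


# Toy stand-in for a metadata file: 1,000 one-line JSON records.
toy_stream = io.StringIO(
    "\n".join(json.dumps({"id": str(n), "abstract": f"paper {n}"}) for n in range(1000))
)
subset = sample_records(toy_stream, k=10)
print(len(subset))  # 10
```

For the real data, the same function could be called on an open file handle with `k=600_000`, assuming the metadata is stored as one JSON object per line.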

How to Cite This Work?

If you use arXiv Literature Clustering, please cite the original paper and the code:

@inproceedings{EREN2020,
	author = {Eren, E. Maksim and Solovyev, Nick and Nicholas, Charles and Raff, Edward and Johnson, Ben},
	title = {COVID-19 Kaggle Literature Organization},
	year = {2020},
	month = {April},
	location = {San Jose, CA, USA},
	note = {Malware Research Group, University of Maryland Baltimore County. \url{https://github.com/MaksimEkin/COVID19-Literature-Clustering}},
	url = {TBA},
	doi = {TBA},
	howpublished = {DocEng'20: ACM Symposium on Document Engineering}
}
