dan2097 / CovidTermVar

COVID-19 and SARS-CoV-2 term variation

The number of unique terms used in the scientific literature to refer to either SARS-CoV-2 or COVID-19 is remarkably large and has continued to increase rapidly, despite the existence of well-established standardized terms. This high degree of term variability makes it difficult for automated tools to accurately identify mentions of these important entities. We created an extensive dictionary of terms used in the literature to refer to SARS-CoV-2 and COVID-19. To build it, we used a rule-based approach to iteratively generate new term variants, then located these variants in the LitCovid corpus.
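The generate-then-locate loop described above can be illustrated with a minimal sketch. It is not the repository's implementation: the rewrite rules in `RULES`, the function names `generate_variants` and `find_in_corpus`, and the in-memory list of documents standing in for the LitCovid corpus are all assumptions made for illustration.

```python
import re

# Hypothetical rewrite rules, for illustration only (not the authors' rule set).
RULES = [
    lambda t: t.replace("-", " "),   # hyphen -> space
    lambda t: t.replace("-", ""),    # drop hyphens
    lambda t: t.replace(" ", "-"),   # space -> hyphen
    lambda t: t.replace("coronavirus", "corona virus"),
    lambda t: t.replace("2019", "19"),
]


def generate_variants(seeds, rules=RULES, max_rounds=3):
    """Apply every rule to the newest terms until no new variants appear."""
    known = {s.lower() for s in seeds}
    frontier = set(known)
    for _ in range(max_rounds):
        # Only terms produced in the previous round are expanded again.
        new = {rule(t) for t in frontier for rule in rules} - known
        if not new:
            break
        known |= new
        frontier = new
    return known


def find_in_corpus(variants, documents):
    """Return the variants that occur as whole-word matches in any document."""
    found = set()
    for v in variants:
        # Lookarounds prevent matching inside a longer alphanumeric token.
        pattern = re.compile(r"(?<!\w)" + re.escape(v) + r"(?!\w)", re.IGNORECASE)
        if any(pattern.search(doc) for doc in documents):
            found.add(v)
    return found
```

Calling `generate_variants(["SARS-CoV-2", "COVID-19"])` and passing the result to `find_in_corpus` together with a list of abstracts returns only the generated variants that are attested in the text. This reflects the design described above: generation is deliberately permissive, and the corpus lookup filters out variants nobody actually uses.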

References

[1] Leaman, R., & Lu, Z. (2020). A Comprehensive Dictionary and Variability Analysis of Terms for COVID-19 and SARS-CoV-2. NLP COVID workshop at EMNLP 2020.
[2] Chen, Q., Allot, A., & Lu, Z. (2020). Keep up with the latest coronavirus research. Nature, 579(7798), 193. doi: 10.1038/d41586-020-00694-1

Languages

Python 83.4%, Shell 16.6%