philippeitis / StackOverflowNER

Source Code and Data for Software Domain NER

Dataset and Model for Fine-grained Software Entity Extraction

This repository contains the code and data released with the paper "Code and Named Entity Recognition in StackOverflow" (ACL 2020). [Paper PDF]

For the source code of our NER tagger, check the code/NER/ folder.

For our annotated data with software-domain named entities, check the resources/annotated_ner_data/ folder.

To cite the data or code in this repository, please use the following BibTeX entry:

  @inproceedings{Tabassum20acl,
      title = {Code and Named Entity Recognition in StackOverflow},
      author = {Tabassum, Jeniya and Maddela, Mounica and Xu, Wei and Ritter, Alan},
      booktitle = {The Annual Meeting of the Association for Computational Linguistics (ACL)},
      year = {2020}
  }

About

License: MIT License


Languages

Python 97.1%, Perl 2.9%, Shell 0.1%