
Rumor has it!

In 2011, I worked on a project to identify fake news and misinformation on social media (see this paper for more details). This repository contains the dataset built and used in that work, as well as some fun additional experiments.

To run

python main.py --task sentiment --method all

--task should be one of [sentiment|detection], and --method should be one of [nb|sgd|dense|bilstm|all].
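main.py presumably parses these flags and dispatches to the chosen models; a minimal sketch of such a CLI using argparse (the run_experiment helper is hypothetical, standing in for whatever the script actually calls):

```python
import argparse

METHODS = ["nb", "sgd", "dense", "bilstm"]

def run_experiment(task: str, method: str) -> None:
    # Hypothetical stand-in for whatever main.py actually calls
    # to train and evaluate each model on the chosen task.
    print(f"running task={task} method={method}")

def main() -> None:
    parser = argparse.ArgumentParser(description="Rumor sentiment/detection experiments")
    parser.add_argument("--task", choices=["sentiment", "detection"], required=True)
    parser.add_argument("--method", choices=METHODS + ["all"], default="all")
    args = parser.parse_args()
    # --method all fans out to every individual method.
    for method in (METHODS if args.method == "all" else [args.method]):
        run_experiment(args.task, method)

if __name__ == "__main__":
    main()
```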

Rumor Sentiment Experiments

In the confusion matrices below, rows are the gold labels and columns are the predictions; each row sums to that class's support in the report above it.

=========== NBModel ===========
Classification Report:
              precision    recall  f1-score   support

     endorse       0.75      0.88      0.81       416
        deny       0.79      0.93      0.86       375
    question       1.00      0.16      0.27       139
     neutral       0.00      0.00      0.00        21

    accuracy                           0.77       951
   macro avg       0.63      0.49      0.48       951
weighted avg       0.79      0.77      0.73       951


Confusion Matrix:
          endorse  deny  question  neutral
endorse       366    50         0        0
deny           27   348         0        0
question       87    30        22        0
neutral        11    10         0        0
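The report layout above is scikit-learn's classification_report, so NBModel is presumably a bag-of-words multinomial Naive Bayes. A minimal sketch of such a baseline; the vectorizer settings and the train/test variables are assumptions, not the repo's actual code:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

LABELS = ["endorse", "deny", "question", "neutral"]

def evaluate_nb(train_texts, train_labels, test_texts, test_labels):
    # Bag-of-words counts feeding a multinomial Naive Bayes classifier.
    model = make_pipeline(CountVectorizer(), MultinomialNB())
    model.fit(train_texts, train_labels)
    preds = model.predict(test_texts)
    print(classification_report(test_labels, preds, labels=LABELS, zero_division=0))
    print(confusion_matrix(test_labels, preds, labels=LABELS))
```

The zero_division=0 argument is consistent with the 0.00 precision/recall reported for the neutral class, which NB never predicts here.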


=========== SGDModel ===========
Classification Report:
              precision    recall  f1-score   support

     endorse       0.81      0.84      0.82       416
        deny       0.87      0.90      0.88       375
    question       0.67      0.59      0.63       139
     neutral       0.86      0.29      0.43        21

    accuracy                           0.81       951
   macro avg       0.80      0.65      0.69       951
weighted avg       0.81      0.81      0.81       951


Confusion Matrix:
          endorse  deny  question  neutral
endorse       349    39        28        0
deny           29   338         7        1
question       47    10        82        0
neutral         7     3         5        6
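SGDModel most likely corresponds to scikit-learn's SGDClassifier, a linear model trained with stochastic gradient descent (hinge loss, i.e. a linear SVM, by default). A hedged sketch over TF-IDF features; the n-gram range and other settings are guesses, not the repo's configuration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

# TF-IDF unigrams/bigrams into a linear model trained by SGD;
# the feature choices here are illustrative.
sgd_model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    SGDClassifier(loss="hinge", random_state=0),
)
# Usage mirrors the NB sketch above:
#   sgd_model.fit(train_texts, train_labels)
#   preds = sgd_model.predict(test_texts)
```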


=========== SimpleDense ===========
Classification Report:
              precision    recall  f1-score   support

     endorse       0.80      0.83      0.81       416
        deny       0.86      0.89      0.87       375
    question       0.68      0.60      0.64       139
     neutral       0.78      0.33      0.47        21

    accuracy                           0.81       951
   macro avg       0.78      0.66      0.70       951
weighted avg       0.80      0.81      0.80       951


Confusion Matrix:
          endorse  deny  question  neutral
endorse       345    37        34        0
deny           37   332         4        2
question       43    12        84        0
neutral         7     5         2        7
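SimpleDense suggests a small feed-forward network over fixed-size text vectors (e.g. bag-of-words or TF-IDF). A sketch assuming Keras; the layer sizes, dropout, and optimizer are illustrative guesses, not the repo's actual architecture:

```python
import tensorflow as tf

def build_simple_dense(input_dim: int, num_classes: int = 4) -> tf.keras.Model:
    # One hidden ReLU layer with dropout over fixed-size text vectors;
    # this architecture is a guess, not the repo's actual model.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```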


=========== BiLSTM/Glove ===========
Classification Report:
              precision    recall  f1-score   support

     endorse       0.81      0.83      0.82       416
        deny       0.85      0.91      0.88       375
    question       0.69      0.55      0.61       139
     neutral       0.64      0.33      0.44        21

    accuracy                           0.81       951
   macro avg       0.75      0.66      0.69       951
weighted avg       0.81      0.81      0.81       951


Confusion Matrix:
          endorse  deny  question  neutral
endorse       347    38        28        3
deny           26   342         6        1
question       48    14        77        0
neutral         6     7         1        7
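BiLSTM/Glove points to a bidirectional LSTM over pretrained GloVe word embeddings. A sketch assuming Keras; building embedding_matrix from a GloVe file, the sequence length, and the LSTM width are all assumptions:

```python
import tensorflow as tf

def build_bilstm(vocab_size: int, embedding_matrix, num_classes: int = 4,
                 max_len: int = 50) -> tf.keras.Model:
    # Embedding layer frozen to pretrained GloVe vectors, followed by
    # a bidirectional LSTM; sizes are illustrative, not the repo's.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(max_len,)),
        tf.keras.layers.Embedding(
            vocab_size,
            embedding_matrix.shape[1],
            embeddings_initializer=tf.keras.initializers.Constant(embedding_matrix),
            trainable=False,
        ),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```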

Data

You can find the dataset and its README under data/. If you use the data, please cite the following paper:

@InProceedings{qazvinian-EtAl:2011:EMNLP,
  author    = {Qazvinian, Vahed  and  Rosengren, Emily  and  Radev, Dragomir R.  and  Mei, Qiaozhu},
  title     = {Rumor has it: Identifying Misinformation in Microblogs},
  booktitle = {Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing},
  month     = {July},
  year      = {2011},
  address   = {Edinburgh, Scotland, UK.},
  publisher = {Association for Computational Linguistics},
  pages     = {1589--1599},
}

About

License: MIT

