jyang32 / resonance_rep

Replication code and data

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Replication code and data

There are five folders containing code used in different steps of the project.

  • The 1_scraping folder contains code used to scrape data from the forum. They are in the format of Jupyter Notebook.
  • The 2_segment folder contains code used to process the scraped data, including: saving each post as a individual .txt file with consistent naming rules, performing Jieba segmentation to these posts (normal segmentation and POS-tag based segmentation). Please read the comments closely before running these files. They are in the format of either R or Python scripts.
  • The 3_diachronic_w2v folder contains code (Python script format) used to run:
    • The "start model" (based on text from 2011 to 2017)
    • The w2v model using min_word_freq = 0 and window_size = 20 (based on text from 2016 to 2017)
    • The w2v model using min_word_freq = 10 and window_size = 20 (based on text from 2016 to 2017, used for sensitivity test)
  • The 4_sentiment folder contains code used to generate sentiment measure for each post. The file is in Python script format.
  • The 5_analysis folder contains code and data used for modeling and plotting, as well as code for sensitivity tests. The code is in R Markdown format.

About

Replication code and data

License:MIT License


Languages

Language:Jupyter Notebook 62.3%Language:Python 33.2%Language:R 4.5%