text-as-data

There are 7 repositories under the text-as-data topic.

JasonKessler / scattertext
Beautiful visualizations of how language differs among document types.
nlp d3 word-embeddings machine-learning natural-language-processing visualization word2vec text-visualization text-mining japanese-language computational-social-science sentiment eda exploratory-data-analysis text-as-data scatter-plot topic-modeling stylometry stylometric semiotic-squares
Language:Python 2323
MilaNLProc / contextualized-topic-models
A python package to run contextualized topic modeling. CTMs combine contextualized embeddings (e.g., BERT) with topic models to get coherent topics. Published at EACL and ACL 2021 (Bianchi et al.).
topic-modeling bert transformer embeddings text-as-data topic-coherence multilingual-topic-models multilingual-models neural-topic-models nlp nlp-library nlp-machine-learning
Language:Python 1250
jboynyc / textnets
Text analysis with networks.
nlp network-analysis sociology visualization text-analysis text-as-data computational-social-science
Language:Python 290
ryanjgallagher / shifterator
Interpretable data visualizations for understanding how texts differ at the word level
natural-language-processing sentiment-analysis information-theory computational-social-science digital-humanities text-analysis text-as-data data-visualization
Language:Python 283
JasonKessler / Scattertext-PyData
Notebooks for the Seattle PyData 2017 talk on Scattertext
pydata nlp visualization text-visualization word2vec computational-social-science natural-language-processing gender text-as-data political-science political-parties
Language:HTML 142
chkla / CSS-Events
Summer/winter schools, workshops and conferences in computational social science 🫂
computational-social-science conferences events summer-schools text-as-data winter-schools workshops
43
umanlp / SemScale
A tool for Semantic Scaling of Political Text (branch of Topfish, a suite of tools for Political Text Analysis)
text-scaling text-as-data computational-social-science
Language:Python 28
chkla / Populism-Text-Analysis
Literature 📄 and datasets 📚 on automatic populism detection
nlp political-science populism text-as-data literature-review
19
tweedmann / 3x8emotions
Code and models for 3 different tools to measure appeals to 8 discrete emotions in German political text
emotions political-science text-as-data political-communication
Language:Jupyter Notebook 16
fedenanni / Computational-Text-Analysis-2018-19
2018 Computational Text Analysis Notebooks, University of Mannheim
text-as-data computational-social-science natural-language-processing teaching-materials
Language:Jupyter Notebook 13
cjerzak / LinkOrgs-software
LinkOrgs: An R package for linking records on organizations using half a billion open-collaborated records from LinkedIn
machine-learning record-linkage text-as-data community-detection organizational-units equinox jax transformer-architecture
Language:HTML 12
wesslen / summer2017-socialmedia
Summer 2017 Social Media Analytics Workshop Series
twitter-api facebook-api text-as-data geospatial r
Language:HTML 11
davidycliao / bisCrawler
An Automated Web Crawler for Extracting Central Bankers' Speeches
bank-for-international-settlements central-banker central-bankers-speeches python scraper scraping speeches text-as-data
Language:Python 10
thieled / dictvectoR
'dictvectoR' measures the similarity between a concept dictionary and documents, using fastText word vectors. Implements the "Distributed-Dictionary-Representation" (Garten et al. 2018) method in R.
dictionary ideology natural-language-processing r scaling text-analysis text-as-data word-embeddings word-representations word-vectors
Language:R 9
aflueckiger / KED2022
The ABC of Computational Text Analysis. BA Seminar, Spring 2022, University of Lucerne
computational-social-science social-science sociology teaching text-as-data
Language:HTML 6
WZBSocialScienceCenter / tm_corona
A small showcase for topic modeling with the tmtoolkit Python package. I use a corpus of articles from the German online news website Spiegel Online (SPON) to create a topic model for before and during the COVID-19 pandemic.
corona covid-19 news python scraping text-analysis text-as-data text-mining topic-modeling topicmodeling webscraping
Language:Jupyter Notebook 4
adamlauretig / gensim_in_R
Code for estimating word embeddings with gensim in R.
gensim r text-as-data
3
jfjelstul / regular-expressions-tutorial
A tutorial on using regular expressions in R
r regular-expressions stringr text-analysis text-as-data text-data tidyverse tutorial
2
thelautiff / UN_meeting_records
From using xpdf, rvest, and quanteda on United Nations Digital Library search results to applying dictionaries to speeches in United Nations meeting records
pdf pdfs quanteda r regular-expression rvest text-as-data united-nations xpdf
Language:R 2
varvarailyina / mds_thesis
All code and results for my MDS thesis at the Hertie School
emotions llms political-communication text-as-data
Language:Jupyter Notebook 2
aflueckiger / KED2021
The ABC of Computational Text Analysis. BA Seminar, Spring 2021, University of Lucerne
computational-social-science social-science sociology teaching text-analysis text-as-data
Language:HTML 1
CT-P / portuguese_open_data
Empirical framework applied to parliament discourses and Twitter data, with a Discourse Polarization Index.
computational-social-science discourse political-polarization text-as-data gentzkow
Language:Jupyter Notebook 1
marek-chadim / Empirical-Economics
PhD Applied empirical economics at Stockholm University
gis machine-learning programming replication text-as-data version-control webscraping workflow
Language:HTML 1
BenjaminFReese / american_constitutional_praxis
This repository uses text-as-data methods alongside traditional primary source reading to analyze early American state constitutions. The R scripts create a function to scrape and clean the constitutional text, run sentiment analysis, calculate tf-idf, and perform LDA. This is a work-in-progress.
constitution latent-dirichlet-allocation political-theory sentiment-analysis text-as-data tf-idf
Language:HTML 0
bgonzalezbustamante / TextClass-Benchmark
TextClass Benchmark Leaderboards
deepseek elo-rating gpt-4 gpt-4o leaderboards llama llm llms-benchmarking misinformation mistral nous-hermes ollama openai perspective-api qwen2-5 text-as-data text-classification toxicity toxicity-classification zero-shot-classification
Language:Jupyter Notebook 0
graceadcox / Refugee-Text-as-Data
Original corpus of articles relating to refugees scraped from the Tennessee newspaper The Chattanoogan, along with simple code for a text-as-data word cloud.
text-as-data word-cloud r
Language:R 0
ichalkiad / datadescriptor_uselections2020
Code for collecting and cleaning speeches (text) of the US 2020 election campaign. Corresponding publication: "A text dataset of campaign speeches of the main tickets in the 2020 US presidential election", by Ioannis Chalkiadakis, Louise Anglès d’Auriac, Gareth W. Peters, and Divina Frau-Meigs
natural-language-processing political-science rhetoric text-analysis text-as-data us-election-2020 us-elections
Language:Python 0
Jszabo16 / EU-sentiments_NRSR
Replication script for mining sentiments towards the EU from Parliamentary Speeches in the National Council of the Slovak Republic (1994-2023)
european-union slovakia text-as-data aspect-based-sentiment-analysis bert structural-topic-modeling national-council-of-the-slovak-republic parliamentary-speeches
0
Sam-Gartenstein / Machine-Learning-for-the-Social-Sciences
Material from my Machine Learning for the Social Sciences course
neural-networks supervised-machine-learning unsupervised-machine-learning text-as-data
Language:Jupyter Notebook 0
smkerr / news-israel-gaza
🇮🇱🇵🇸 News coverage of Israel-Hamas War 🇵🇸🇮🇱
text-as-data r
Language:R
