sagepublishing / text_cleaning

Corpora and scripts for cleaning political science texts. Scripts are translated into transformations that support SAGE Texti.

TEXT CLEANING FOR POLITICAL SCIENCE

This repository contains the Python scripts that power the Texti transformations (in app_scripts) and their documentation (in source).

Take a look at the detailed documentation for Texti here

If you would like to contribute transformations or corpora examples, check this page.

Contents of the file

  • Introduction
  • Setup
  • Tests

Introduction

This repository contains the Python scripts that power the Texti transformations (in app_scripts) and their documentation (in source).

Setup

Install the dependencies with `pip install -r requirements.txt`.

Tests

1. `cd app_scripts/clean`
2. `python3 -m unittest`
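
As a rough illustration of what a test picked up by `python3 -m unittest` could look like, here is a minimal sketch. The function `collapse_whitespace` and the test class are hypothetical and are not taken from this repository; they only show the general shape of a cleaning step paired with a `unittest` case.

```python
import re
import unittest


def collapse_whitespace(text: str) -> str:
    """Hypothetical cleaning step (not from this repo):
    squeeze every run of whitespace into a single space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


class CollapseWhitespaceTest(unittest.TestCase):
    """Hypothetical test case showing the unittest layout."""

    def test_collapses_runs(self):
        # Double spaces, newlines and tabs all become one space.
        self.assertEqual(
            collapse_whitespace("The  House\n\tvoted."),
            "The House voted.",
        )

    def test_whitespace_only(self):
        # Input made only of whitespace cleans down to the empty string.
        self.assertEqual(collapse_whitespace("   "), "")
```

By default, `python3 -m unittest` run without arguments discovers test files whose names match `test*.py`, so a file like this would need a matching name (for example `test_whitespace.py`, also a hypothetical name) to be found.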

License: Mozilla Public License 2.0


Languages: Python 95.6%, Batchfile 2.4%, Makefile 1.9%