dleng2242 / NHS-R_2020_TextAnalysis

NHS-R 2020 Text Analysis Workshop Materials


NHS-R 2020 - TextAnalysis


The GitHub repository is available at: https://github.com/dleng2242/NHS-R_2020_TextAnalysis

Overview

Many organisations hold large amounts of text data that are not being used effectively. This introductory workshop shows how to get started with analysing text data, from simple manipulation through to sentiment analysis. By the end of the course, attendees will have a good understanding of the techniques and of how to implement them in R.

Details

Simple Text Manipulation

  • Regular Expressions
  • Tidy Text Format
  • Removing Stop Words and Stemming
  • Word Clouds
  • Tokenisation and n-grams
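
As a rough illustration of several of the topics above (regular expressions, tidy text format, stop-word removal, tokenisation and n-grams), here is a minimal sketch in base R. It is not taken from the workshop materials: the example sentences, the tiny stop-word list, and the helper `make_bigrams` are all assumptions made for illustration. Dedicated packages such as tidytext offer more complete versions of these steps; stemming and word clouds are not covered here.

```r
# Toy documents (made up for illustration; not workshop data)
docs <- data.frame(
  doc_id = c(1, 2),
  text = c("The staff were very kind and helpful.",
           "Waiting times were too long, and the ward was noisy."),
  stringsAsFactors = FALSE
)

# Regular expressions: lower-case the text, then drop every character
# that is not a lower-case letter or a space
clean <- gsub("[^a-z ]", "", tolower(docs$text))

# Tokenisation: split each document on runs of spaces
tokens <- strsplit(clean, " +")

# Tidy text format: one row per token, keeping the document id
tidy_tokens <- data.frame(
  doc_id = rep(docs$doc_id, lengths(tokens)),
  word = unlist(tokens),
  stringsAsFactors = FALSE
)

# Stop-word removal using a tiny hand-written list
# (an assumption; real stop-word lists are much longer)
stop_words <- c("the", "were", "and", "too", "was", "very")
tidy_no_stop <- tidy_tokens[!tidy_tokens$word %in% stop_words, ]

# Bigrams: pair each token with the token that follows it
# (hypothetical helper, defined only for this sketch)
make_bigrams <- function(words) {
  if (length(words) < 2) return(character(0))
  paste(head(words, -1), tail(words, -1))
}
bigrams <- lapply(tokens, make_bigrams)
```

Printing `tidy_no_stop` and `bigrams[[1]]` shows the remaining content words per document and the adjacent word pairs of the first document.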

Sentiment Analysis

  • Sentiment Lexicons
  • Joining Sentiments to Documents
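
Joining a sentiment lexicon to tokenised documents can be sketched in base R as an inner join followed by a per-document sum. The four-word lexicon below is hypothetical and exists only for this example, as do the toy tokens; a real analysis would use a published lexicon. The scoring rule (+1 for positive, -1 for negative) is one simple choice among several.

```r
# Tokens in tidy format: one row per word per document (toy data)
tidy_tokens <- data.frame(
  doc_id = c(1, 1, 1, 2, 2, 2),
  word = c("staff", "kind", "helpful", "waiting", "long", "noisy"),
  stringsAsFactors = FALSE
)

# Hypothetical mini lexicon (an assumption, not a real published lexicon)
lexicon <- data.frame(
  word = c("kind", "helpful", "noisy", "long"),
  sentiment = c("positive", "positive", "negative", "negative"),
  stringsAsFactors = FALSE
)

# Inner join: keep only the tokens that appear in the lexicon
scored <- merge(tidy_tokens, lexicon, by = "word")

# Net sentiment per document: +1 per positive word, -1 per negative word
scored$value <- ifelse(scored$sentiment == "positive", 1, -1)
net <- aggregate(value ~ doc_id, data = scored, FUN = sum)
```

Words missing from the lexicon (here "staff" and "waiting") drop out of the join, so they contribute nothing to the net score.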

Word and Document Frequency

  • Term Frequency - Inverse Document Frequency (TF-IDF)
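
To make TF-IDF concrete, the sketch below computes it by hand in base R using one common definition: term frequency is a word's count divided by the total word count of its document, and inverse document frequency is the natural log of the number of documents divided by the number of documents containing the word. The counts are made up for illustration, and the workshop materials may use a package function rather than this manual calculation.

```r
# Word counts per document (toy data, made up for illustration)
counts <- data.frame(
  doc_id = c(1, 1, 2, 2, 2),
  word = c("ward", "kind", "ward", "noisy", "long"),
  n = c(1, 2, 2, 1, 1),
  stringsAsFactors = FALSE
)

# Term frequency: count of the word divided by total words in its document
doc_totals <- tapply(counts$n, counts$doc_id, sum)
counts$tf <- counts$n / as.numeric(doc_totals[as.character(counts$doc_id)])

# Inverse document frequency: ln(number of documents / documents containing the word)
n_docs <- length(unique(counts$doc_id))
docs_with_word <- tapply(counts$doc_id, counts$word,
                         function(x) length(unique(x)))
counts$idf <- log(n_docs / as.numeric(docs_with_word[counts$word]))

# TF-IDF is the product of the two
counts$tf_idf <- counts$tf * counts$idf
```

A word that occurs in every document, such as "ward" here, gets an IDF of zero and therefore a TF-IDF of zero, which is why the measure highlights words that are distinctive to a document.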

Languages

Language: R 100.0%