yoni / csv_wordcount_analyzer

Word frequency analyzer for CSV data. Basically wraps a bunch of NLTK goodness with some simple CSV goodness.

CSV Wordcount Analyzer
======================

A simple tool for analyzing word counts across the columns of a given CSV file.
Based on the Natural Language Toolkit (NLTK) frequency distribution classes.

Author
======
Yoni Ben-Meshulam

Results
=======
A word is counted at most once per cell of a column. Counts therefore reflect the
number of cells in that column that contain the word.

The result is a standard NLTK FreqDist object.
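
The once-per-cell rule can be illustrated with a minimal sketch. It is not the
project's implementation: the function name cell_word_counts, the sample data, and
the lowercase whitespace tokenization are assumptions made for illustration, and
collections.Counter stands in for FreqDist (which subclasses Counter in NLTK 3) so
that the example runs without NLTK installed.

```python
import csv
import io
from collections import Counter


def cell_word_counts(csv_text, column):
    """Count, for each word, the number of cells in `column` that contain it."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumed tokenization: lowercase, split on whitespace.
        # The set drops repeats within a cell, so each word adds at most 1 per cell.
        words = set(row[column].lower().split())
        counts.update(words)
    return counts


# Hypothetical sample data with the column name used in the Usage section.
sample = (
    "Innovation Ideas,Owner\n"
    "solar solar panels,ann\n"
    "cheaper solar storage,bob\n"
)
counts = cell_word_counts(sample, "Innovation Ideas")
print(counts["solar"])   # 2: present in two cells; the repeat in row one is ignored
print(counts["panels"])  # 1
```

Counting over a set rather than a list is what turns raw term frequency into
"how many cells mention this word".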


Usage
=====
For example, to get the word frequencies for the "Innovation Ideas" column in the
file data/ideas.csv:

  from csv_wordcount_analyzer import CSVAnalyzer
  # Load the CSV file, then count words in a single column.
  a = CSVAnalyzer('data/ideas.csv')
  freq = a.word_freq('Innovation Ideas')
  # freq is an NLTK FreqDist; items() yields (word, count) pairs.
  freq.items()

Tests
=====
To run tests from the project root:
  bin/test

About

License:MIT License


Languages

Language:Python 100.0%