Kavita Ganesan's repositories
word_cloud
Python word cloud library for use within Jupyter notebook and Python apps.
resources
Curated List of Blog Posts From Opinosis Analytics
nlp-in-practice
Starter code to solve real-world text data problems. Includes: Gensim Word2Vec, phrase embeddings, text classification with logistic regression, word count with PySpark, simple text preprocessing, pre-trained embeddings and more.
data-science-blogs
A curated list of data science blogs
stop-words
Stop word lists
opinosis-summarization
This repo contains the code and dataset for the Opinosis Summarization Framework
hashtags_test
Test hashtags
phrase-at-scale
Detect common phrases in large amounts of text using a data-driven approach. Discovered phrases can be of arbitrary length. Can be used with languages other than English.
test-repo
Test repo
SIF_mini_demo
Minimal example of sentence embedding using the Smooth Inverse Frequency (SIF) weighting scheme
text-mining-and-nlp-apis
APIs for clustering sentences, extracting topics, counting words & n-grams, extracting text from html or URL, computing similarity between texts and more.
clinical-concepts
Discovering Related Clinical Concepts using Large Amounts of Clinical Notes. An unsupervised graphical approach to mine related concepts by leveraging the volume within large amounts of clinical notes.
images
Website images
python-examples
Working examples in Python
ROUGE-Utility
Utility tools to prepare and evaluate ROUGE scores. Includes a Perl script that converts the Perl output of ROUGE to CSV.
spark-examples
Code examples in Spark
Micropinion-Generation-Dataset
Dataset for Micropinion Generation, based on user reviews from CNET. The reviews cover products from various categories such as TVs, cell phones, and GPS devices.
electron
Build cross platform desktop apps with JavaScript, HTML, and CSS
GeoSpark
A Cluster Computing System for Processing Large-Scale Spatial Data
spark-lucenerdd
Spark RDD with Lucene's query capabilities
spectron
Test Electron apps using ChromeDriver
rails
Ruby on Rails
spark
Mirror of Apache Spark
stanza
Stanford NLP group's shared Python tools.
CoreNLP
Stanford CoreNLP: A Java suite of core NLP tools.