Pek Yun Ning's repositories

corex_topic

Hierarchical unsupervised and semi-supervised topic models for sparse count data with CorEx

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

enumerate_using_python

A simple implementation of enumeration in Python. Here, it numbers the individual webchat entries contained in one large block of text.

Stargazers: 0 | Issues: 0
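
The numbering described for enumerate_using_python could be sketched as follows. This is a minimal illustration rather than the repository's actual code: the function name `number_webchats` is hypothetical, and it assumes that webchat entries are separated by blank lines, which the description does not state.

```python
def number_webchats(raw_text, separator="\n\n"):
    """Split one block of text into entries and number them from 1.

    Assumption: entries are separated by blank lines (the default separator).
    """
    # Drop empty chunks so stray extra blank lines do not create empty entries.
    entries = [chunk.strip() for chunk in raw_text.split(separator) if chunk.strip()]
    # enumerate(..., start=1) pairs each entry with its 1-based position.
    return [f"{index}. {entry}" for index, entry in enumerate(entries, start=1)]


sample = "Hi, I need help.\n\nMy order is late.\n\n\n\nThanks!"
numbered = number_webchats(sample)
for line in numbered:
    print(line)
```

If the real webchat entries are delimited differently (for example by a timestamp line), only the `separator` argument or the splitting step would need to change.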

geographical_hexbins-projections

Data science applications for geographical data, involving hexbins and projections.

Stargazers: 0 | Issues: 0

classify_nouns-verbs-adjectives

Shows how nouns, verbs, and adjectives can be classified in a body of text.

Stargazers: 0 | Issues: 0

text-preprocessing

Converts a text file into a Python-readable form: first splits the file into lines stored in a Python list, then splits each line into individual words. Useful for analysis that works by line and/or by word. Also removes all stopwords, such as 'the', 'a', and 'but'. Finally, consolidates the result in a CSV file.

Stargazers: 0 | Issues: 0
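
The text-preprocessing steps above (split into lines, split into words, drop stopwords, write CSV) could be sketched with the standard library alone. The function names `preprocess` and `to_csv` and the three-word stopword set are hypothetical stand-ins; the repository's own stopword list and punctuation handling are not described and are assumed here.

```python
import csv
import io

# Hypothetical, deliberately tiny stopword list taken from the examples in the
# description; a real project would likely use a fuller one.
STOPWORDS = {"the", "a", "but"}


def preprocess(text):
    """Return one list of lowercase, stopword-free words per non-empty line."""
    rows = []
    for line in text.splitlines():
        # Assumption: strip common punctuation from word edges and lowercase.
        words = [w.strip(".,!?;:'\"").lower() for w in line.split()]
        words = [w for w in words if w and w not in STOPWORDS]
        if words:
            rows.append(words)
    return rows


def to_csv(rows):
    """Serialise the rows as CSV text, one source line per CSV row."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


sample = "The cat sat on a mat.\nBut the dog barked!"
rows = preprocess(sample)
print(to_csv(rows))
```

Writing to an in-memory buffer keeps the sketch self-contained; passing an opened file (with `newline=""`) to `csv.writer` would produce a CSV file on disk instead.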

text-summarisation

Reduces essays or paragraphs to a few sentences, to obtain the gist of a large corpus of text.

Stargazers: 0 | Issues: 0

wordcloud-using-python

Create a Word Cloud using Python.

Stargazers: 0 | Issues: 0

csv-blank-removal

Removes blank cells in CSV files using Python. In a Python list, a blank cell appears as 'nan'.

Stargazers: 0 | Issues: 0
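
One way to remove blank cells, as csv-blank-removal describes, is sketched below. The description does not say which library produces the 'nan' values, so this sketch makes an assumption: it uses only the standard `csv` module, where blanks arrive as empty strings, and it also treats the literal text 'nan' as blank. The function name `drop_blank_cells` is hypothetical.

```python
import csv
import io


def drop_blank_cells(csv_text):
    """Return the CSV rows with blank cells removed; fully blank rows are dropped."""
    rows = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Assumption: a cell is blank if it is empty, whitespace, or the text 'nan'.
        kept = [cell for cell in row if cell.strip() and cell.strip().lower() != "nan"]
        if kept:
            rows.append(kept)
    return rows


sample = "a,,b\n,,\nc,nan,d\n"
cleaned = drop_blank_cells(sample)
print(cleaned)
```

Note that dropping a cell shifts the remaining cells in that row to the left, so this approach suits list-like data rather than tables whose columns must stay aligned.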

topic-ranking

Ranks topics or sentences by their relative importance among the entries in a text corpus.

Stargazers: 0 | Issues: 0

snownlp

Python library for processing Chinese text

License: MIT | Stargazers: 0 | Issues: 0

Mutual-Information

In probability theory and information theory, the mutual information of two random variables is a quantity that measures the mutual dependence of the two random variables. This script computes mutual information (MI) over discrete random variables.

Language: Python | Stargazers: 0 | Issues: 0
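
For reference, the mutual information of two discrete variables can be estimated from paired samples with the standard definition I(X;Y) = sum over (x, y) of p(x,y) * log2( p(x,y) / (p(x) * p(y)) ). The sketch below is an illustration of that formula, not the repository's script; the function name `mutual_information` and the choice of base-2 logarithms (bits) are assumptions.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Estimate I(X;Y) in bits from two equal-length sequences of discrete values."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    count_x = Counter(xs)         # marginal counts of x
    count_y = Counter(ys)         # marginal counts of y
    mi = 0.0
    for (x, y), count_xy in joint.items():
        p_xy = count_xy / n
        # p(x,y) / (p(x) p(y)) simplifies to count_xy * n / (count_x * count_y).
        mi += p_xy * math.log2(count_xy * n / (count_x[x] * count_y[y]))
    return mi


dependent = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
print(dependent, independent)
```

Pairs that never occur contribute nothing to the sum, so iterating over the observed joint counts avoids taking the logarithm of zero.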