SHIJINGLI0206 / Brown-Clustering

Brown clustering implementation

Brown-Clustering

Description

  • Implementation of Brown Clustering Algorithm
  • The algorithm trains on text data and performs hierarchical clustering of the words
  • Based on this clustering, it generates a unique vector for each word
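
To make the description above concrete, here is a minimal, deliberately naive sketch of the general Brown clustering idea. It is not the code in `brown_clustering.py`: the function names `average_mutual_information` and `brown_cluster` are hypothetical, and the sketch uses only the standard library. Every word starts in its own cluster, the pair of clusters whose merge keeps the average mutual information between adjacent clusters highest is merged, and each merge prepends one bit to the words on either side, so every word ends up with a distinct bit-string path.

```python
import math
from collections import Counter
from itertools import combinations


def average_mutual_information(bigrams, assignment):
    """Average mutual information between the cluster labels of adjacent words."""
    pair_counts = Counter()
    for (w1, w2), n in bigrams.items():
        pair_counts[(assignment[w1], assignment[w2])] += n
    total = sum(pair_counts.values())
    left, right = Counter(), Counter()
    for (c1, c2), n in pair_counts.items():
        left[c1] += n
        right[c2] += n
    return sum(
        (n / total) * math.log2(n * total / (left[c1] * right[c2]))
        for (c1, c2), n in pair_counts.items()
    )


def brown_cluster(tokens):
    """Greedy agglomerative clustering; returns a word -> bit-string mapping."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    vocab = sorted(set(tokens))
    assignment = {w: i for i, w in enumerate(vocab)}  # word -> cluster id
    members = {i: [w] for i, w in enumerate(vocab)}   # cluster id -> words
    codes = {w: "" for w in vocab}
    while len(members) > 1:
        best = None
        # Try every pair of clusters and keep the merge with the highest AMI.
        for a, b in combinations(sorted(members), 2):
            trial = {w: (a if c == b else c) for w, c in assignment.items()}
            score = average_mutual_information(bigrams, trial)
            if best is None or score > best[0]:
                best = (score, a, b)
        _, a, b = best
        # Later merges sit closer to the root, so their bits are prepended.
        for w in members[a]:
            codes[w] = "0" + codes[w]
        for w in members[b]:
            codes[w] = "1" + codes[w]
            assignment[w] = a
        members[a].extend(members.pop(b))
    return codes


# Toy corpus, made up for illustration only.
toy_tokens = "the cat sat the dog sat the cat ran the dog ran".split()
for word, code in sorted(brown_cluster(toy_tokens).items()):
    print(word, code)
```

Re-scoring every candidate merge from scratch is far too slow for a real vocabulary; practical implementations update the mutual information incrementally and usually restrict merging to a fixed window of the most frequent words.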

Usage

  • Dependencies: numpy and scipy

  • Running the code trains the model on a small subset of the data, "subset_data.txt"

  • This dataset contains dummy POS tags

  • It clusters similar words, then prints and saves the clusters and the word vectors.

    python3 brown_clustering.py 
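
The format of the saved vectors is not documented here. As one possible way to work with them, the sketch below assumes each word's vector is a bit-string path through the cluster hierarchy (a common convention for Brown clusters) and packs such strings into a fixed-width numpy array. The helper `codes_to_matrix`, the padding value, and the example codes are all hypothetical and are not taken from the script's output.

```python
import numpy as np


def codes_to_matrix(codes, pad_value=-1):
    """Pack word -> bit-string codes into a (num_words, max_len) integer array.

    Shorter codes are right-padded with pad_value so that all rows have the
    same width.
    """
    words = sorted(codes)
    width = max((len(codes[w]) for w in words), default=0)
    matrix = np.full((len(words), width), pad_value, dtype=np.int8)
    for row, word in enumerate(words):
        bits = [int(b) for b in codes[word]]
        matrix[row, :len(bits)] = bits
    return words, matrix


# Made-up codes for illustration; real codes would come from the clustering.
example_codes = {"cat": "00", "dog": "01", "the": "1"}
words, matrix = codes_to_matrix(example_codes)
print(words)
print(matrix)
```

Words whose rows share a long common prefix sit close together in the hierarchy, so truncating the rows to the first few columns gives coarser clusters.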
    
