Naif-Ganadily / WordEmbeddings-SimilaritySearch-NLP


EEP_596_Assignment3

Word Embeddings and Similarity Search

The objective of this assignment was to develop an understanding of word embeddings and how to use them to compute similarities between words. This repository contains Programming Assignment 3 for the course EE P 596: Advanced Introduction to Machine Learning, taught by Prof. Karthik Mohan. Student: Naif A. Ganadily.

Assignment Overview

The project is divided into several parts, including computing cosine similarity, finding nearest neighbors, intersection of neighborhoods, correlation analysis, and solving word analogies. I completed the following tasks:

Computed cosine similarity between pairs of words in a text corpus using pre-trained GloVe word embeddings.
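As a sketch of this step, cosine similarity can be computed directly from the embedding vectors. The embedding table below uses made-up toy values in place of the real GloVe vectors, which the assignment loads from a pre-trained model:

```python
import numpy as np

# Toy stand-in for the GloVe lookup table (values are hypothetical,
# not the actual pre-trained embeddings used in the assignment).
embeddings = {
    "duck":   np.array([0.9, 0.1, 0.3]),
    "animal": np.array([0.7, 0.2, 0.4]),
    "happy":  np.array([0.1, 0.9, 0.2]),
}

def cosine_similarity(u, v):
    """Cosine of the angle between vectors u and v, in [-1, 1]."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine_similarity(embeddings["duck"], embeddings["animal"]))
```

With real GloVe vectors the same function applies unchanged; only the lookup table differs.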

Implemented a function to find the top n most similar words to a given word, and computed the 20 nearest neighbors of 'duck' and 'animal'.
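A minimal version of such a nearest-neighbor function might rank all other words in the vocabulary by cosine similarity to the query. Again, the embedding values here are toy placeholders, not the actual GloVe data:

```python
import numpy as np

# Hypothetical toy vocabulary; real runs would use the full GloVe table.
embeddings = {
    "duck":   np.array([0.9, 0.1]),
    "goose":  np.array([0.8, 0.2]),
    "animal": np.array([0.6, 0.4]),
    "happy":  np.array([0.1, 0.9]),
}

def nearest_neighbors(word, embeddings, n=20):
    """Return the n words with highest cosine similarity to `word`."""
    query = embeddings[word]
    scores = {
        other: float(np.dot(query, vec) / (np.linalg.norm(query) * np.linalg.norm(vec)))
        for other, vec in embeddings.items()
        if other != word  # exclude the query word itself
    }
    return sorted(scores, key=scores.get, reverse=True)[:n]

print(nearest_neighbors("duck", embeddings, n=2))
```

Sorting the whole vocabulary is O(V log V); for large vocabularies a partial sort (e.g. `np.argpartition`) is the usual optimization.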

Found the intersection of the neighborhoods of 'duck' and 'animal' to identify words similar to both, measuring the overlap with the Jaccard similarity.
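The overlap measurement can be sketched as follows: treat each word's neighbor list as a set, and take the size of the intersection over the size of the union. The neighbor lists below are illustrative, not the actual results from the assignment:

```python
def jaccard_similarity(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical neighbor lists standing in for the computed neighborhoods.
duck_neighbors = ["goose", "swan", "bird", "pond"]
animal_neighbors = ["bird", "mammal", "pond", "creature"]

# Words similar to both = the intersection of the two neighborhoods.
shared = set(duck_neighbors) & set(animal_neighbors)
print(shared, jaccard_similarity(duck_neighbors, animal_neighbors))
```

A Jaccard score near 1 means the two words share most of their neighborhood; near 0 means they live in different regions of the embedding space.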

Analyzed the most correlated and most uncorrelated words from the similarity search for the words 'happy' and 'sad'.

Solved word analogies using vector arithmetic and cosine similarity to derive the answers.
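The vector-arithmetic approach can be sketched as: for an analogy "a is to b as c is to ?", form the vector b − a + c and return the vocabulary word whose embedding has the highest cosine similarity to it, excluding the three query words. The embeddings below are toy values chosen so the analogy resolves exactly; real GloVe vectors only approximate this geometry:

```python
import numpy as np

# Hypothetical 2-D embeddings engineered so that
# king - man + woman lands exactly on queen.
embeddings = {
    "man":   np.array([1.0, 0.0]),
    "king":  np.array([2.0, 1.0]),
    "woman": np.array([1.0, 2.0]),
    "queen": np.array([2.0, 3.0]),
    "apple": np.array([-1.0, 0.5]),
}

def solve_analogy(a, b, c, embeddings):
    """a : b :: c : ?  — nearest word to vec(b) - vec(a) + vec(c)."""
    target = embeddings[b] - embeddings[a] + embeddings[c]
    best, best_score = None, -2.0
    for word, vec in embeddings.items():
        if word in (a, b, c):  # the query words are excluded by convention
            continue
        score = float(np.dot(target, vec) / (np.linalg.norm(target) * np.linalg.norm(vec)))
        if score > best_score:
            best, best_score = word, score
    return best

print(solve_analogy("man", "king", "woman", embeddings))  # → queen
```

Excluding the query words matters: without it, one of b or c is frequently the nearest vector to the target and the analogy degenerates.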

Key Competencies

This assignment highlights my expertise in:

  • Word embeddings and their application in natural language processing.
  • Computing similarity measures between words using GloVe embeddings.
  • Implementing and evaluating similarity-based functions.
  • Analyzing correlations between pairs of words.
  • Solving word analogies using vector arithmetic and cosine similarity.

Conclusion

This assignment provided an opportunity to apply my knowledge of machine learning and natural language processing to real-world data and develop skills that can be applied to various domains, such as text classification and recommendation systems.

Languages

Jupyter Notebook 100.0%