There are 3 repositories under text-vectorization topic.
A Python toolbox for gaining geometric insights into high-dimensional data
📖 Use Bi-normal Separation to find document vectors, which are used to compute similarity for shorter sentences.
This project covers text vectorization and big-data handling (merging and cleaning the text and selecting the required columns), boosts performance through feature extraction and parameter tuning for a neural network, and compares the performance of different models, treating the problem as both classification and regression.
This project is an unsupervised NLP-based recipe recommender system designed to provide personalized recipe suggestions. The system employs content-based filtering techniques, utilizing cosine similarity to measure the resemblance between user inputs and a database of recipes.
Given a document, identifies the closest documents within a list of documents using a TF-IDF matrix and cosine similarity.
This project analyzes the content of a collection of BBC news articles to extract the key concepts and identify the major themes and topics discussed across them.
Syracuse University, Masters of Applied Data Science - IST 736 Text Mining
Comment Sentiment Analysis using Deep Learning
A DL project that helps classify a toxic comment as positive or not.
In this notebook we analyze and classify news articles using machine learning techniques, including Logistic Regression, Naive Bayes, Support Vector Machines, and Random Forests. Explore text vectorization and NLP for accurate news categorization.
♦️ Twitter US Airline Sentiment Analysis ♦️ Applied text preprocessing, NLP, and sentiment classification to analyze positive, neutral, and negative tweets. Created visualizations like sentiment distribution, airline comparisons, and word clouds for key insights. Delivered a cleaned dataset, insightful analysis, and automated PowerPoint report.
Using text-vectorization and similarity-based-matrix computation
🧠 Machine Learning & Natural Language Processing: Predict the author of literary text snippets. Built with TensorFlow and Keras, this project trains an LSTM model on classic literature to identify writing style and authorship.
Demystifying NLP with a series of NLP implementation notebooks.
A simple Python script for transforming a corpus of documents into text vectors suitable for visualization
LLM-inspired BiLSTM pipeline for real-time, multi-label toxicity inference across adversarial discourse modalities.
Explore advanced neural networks for crafting captivating headlines! Compare LSTM 🔄 and Transformer 🔀 models through interactive notebooks 📓 and easy-to-use wrapper classes 🛠️. Ideal for content creators and data enthusiasts aiming to automate and enhance headline generation ✨.
Clustering text using text vectorization
Predictive Text Analysis project! This repository contains code for predicting answers to science exam questions using advanced natural language processing techniques. Check out the code and results!
A diploma project focused on vectorizing scientific texts using the Top2Vec algorithm, with the aim of analyzing thematic groups, identifying trends, and visualizing the dynamics of interest in various topics in the field of computer science.
🚀 Course Recommendation System is a machine learning-powered web application designed to recommend similar courses from Coursera's vast dataset of over 3,000 courses. Built using Python, Scikit-learn, and Streamlit, the app preprocesses course data, applies text vectorization, and leverages cosine similarity to offer personalized recommendations.
This program is a Taylor Swift song recommender, built as a project for a Natural Language Processing course. It draws on sentiment analysis of texts, text vectorization, and stopword removal.
AI Resume & Job Matching Tool is a Streamlit-based web app that compares a candidate’s resume with a job description using NLP (TF-IDF & cosine similarity). It generates a match score and highlights missing skills, helping job seekers optimize their resumes for specific roles.
Movie recommender based on content-based filtering.
Movie Recommender System leverages a content-based approach, suggesting films to users based on the attributes of movies they have previously enjoyed. By analyzing movie metadata such as genre, cast, director, keywords, etc., this project offers personalized recommendations aligned with users' cinematic tastes.
Identifies the emotions expressed in comments posted on social media
The repository contains notebooks created for collecting and preprocessing the corpus of diary entries and for experiments on creating models for predicting gender, age groups of authors and the time period of text creation.
A movie recommendation system using content-based filtering with cosine similarity to suggest movies based on plot and genre similarity. Built with Python, Pandas, and Scikit-learn.
Beginner-friendly notes and resources on Natural Language Processing (NLP), covering text preprocessing, vectorization, embeddings, models, and project pipeline.
Comprehensive guide to text preprocessing and vectorization techniques for NLP, covering tokenization, n-grams, Bag-of-Words, TF-IDF, and related feature-engineering methods.
Sentiment Analysis is a Natural Language Processing (NLP) technique used to identify the emotional tone behind a piece of text — typically classified as positive, negative, or neutral.
Natural language learning codes
A Machine Learning Project using Texts from TripleTen
Successfully established a machine learning model that can accurately classify an e-commerce product into one of four categories, namely "Books", "Clothing & Accessories", "Household" and "Electronics", based on the product's description.
Content Based Movie Recommendation System | Python
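Several of the entries above pair TF-IDF vectorization with cosine similarity to find the closest document or recommend similar items. The following is a minimal sketch of that general idea using only the Python standard library. It is not taken from any of the listed repositories: the function names, the whitespace tokenizer, the smoothed IDF variant, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, very simple tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1
        # Term frequency times a smoothed IDF (one of several common variants).
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query_index, docs):
    """Return (index, score) of the document closest to docs[query_index]."""
    vectors = tfidf_vectors(docs)
    query = vectors[query_index]
    scores = [
        (i, cosine_similarity(query, v))
        for i, v in enumerate(vectors)
        if i != query_index
    ]
    return max(scores, key=lambda pair: pair[1])


# Hypothetical toy corpus of short plot summaries.
corpus = [
    "a space adventure with robots and aliens",
    "a romantic comedy set in paris",
    "robots and aliens fight in a space war",
]
best_index, best_score = most_similar(0, corpus)
print(best_index, round(best_score, 3))
```

Here the first summary is matched against the other two, and the third is selected because it shares the most distinctive terms. Projects that use Scikit-learn would typically rely on its own vectorizer and similarity utilities instead of hand-written helpers like these.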