text-vectorization

There are 36 repositories under the text-vectorization topic.

ContextLab / hypertools
A Python toolbox for gaining geometric insights into high-dimensional data
data-visualization high-dimensional-data python topic-modeling text-vectorization data-wrangling visualization time-series
Language:Python 1873
mkearney / wactor
Word Factor Vectors
r r-package rstats text text-classification text-processing text-vectorization word-embeddings word-vectors word2vec
Language:R 32
amansrivastava17 / bns-short-text-similarity
📖 Use Bi-normal Separation to find document vectors, which are used to compute similarity for shorter sentences.
bns bns-vectorizer cosine-similarity nlp short-text-semantic-similarity term-frequency text-classification text-similarity text-vectorization tf-idf
Language:Python 27
mansipatel2508 / Yelp-Review-Stars-Prediction-with-Machine-Learning
The project covers text vectorization and big-data handling: merging and cleaning the text, selecting the required columns, and boosting performance through feature extraction and parameter tuning for a neural network. It compares the performance of different models, treating the problem as both classification and regression.
classification data-cleaning data-preprocessing feature-extraction knn-classification linear-regression logistic-regression machine-learning machine-learning-algorithms machinelearning-python multinomial-naive-bayes neural-network parameter-tuning regression sklearn svm tensorflow text-processing text-vectorization tf-idf
Language:Jupyter Notebook 8
kulwinderkk / recipe_recommender_nlp
This project is an unsupervised NLP-based recipe recommender system designed to provide personalized recipe suggestions. The system employs content-based filtering techniques, utilizing cosine similarity to measure the resemblance between user inputs and a database of recipes.
cosine-similarity gensim-topic-modeling gensim-word2vec lda nlp nltk text-vectorization tfidf-vectorizer word-embeddings
Language:Jupyter Notebook 4
Rishabbh-Sahu / information_retrieval
Given a document, identifying the closest documents within the list of documents using tf-idf matrix and cosine similarity
tfidf-vectorizer text-vectorization information-retrieval matrix-multiplication similarity-search similar-patterns root-cause-analysis lookalike-queries
Language:Python 4
SarangGami / Topic-modeling-on-News-Articles-Unsupervised-Learning
This project analyzes the content of a collection of BBC news articles to extract the key concepts discussed and identify the major themes and topics across them.
bag-of-words gensim latent-dirichlet-allocation latent-semantic-analysis nlp-machine-learning nltk spacy text-preprocessing text-vectorization tf-idf topic-modeling
Language:Jupyter Notebook 4
rtimbro185 / syr_mads_ist736_text_mining
Syracuse University, Masters of Applied Data Science - IST 736 Text Mining
sentiment-classification twitter-sentiment-analysis text-mining text-vectorization decision-trees naive-bayes support-vector-machines k-nearest-neighbor neural-networks linear-svm multinomial-naive-bayes benoulli-naive-bayes topic-modeling
Language:Jupyter Notebook 3
Minku-Koo / Comment-Sentiment-Analysis
Comment Sentiment Analysis using Deep Learning
keras python sentiment-analysis deep-learning covid-19 religion selenium text-vectorization
Language:Python 2
ni3choudhary / Toxic-Comment-Classifier
A DL project that helps classify whether a comment is toxic or not.
cnn deep-learning deep-neural-networks flask model python tensorflow text text-vectorization toxic-comment-classification toxic-comments mcauc
Language:Jupyter Notebook 2
NikosMav / FakeNews-Classification
In this notebook we analyze and classify news articles using machine learning techniques, including Logistic Regression, Naive Bayes, Support Vector Machines, and Random Forests. Explore text vectorization and NLP for accurate news categorization.
fake-news-dataset fake-news-detection model-training neural-networks python-notebook logistic-regression naive-bayes natural-language-processing random-forest svm text-vectorization tf-idf-vectorization word2vec-vectorization
Language:Jupyter Notebook 2
Abdullah321Umar / Brainwave_Matrix_Intern-TASK2
♦️ Twitter US Airline Sentiment Analysis ♦️ Applied text preprocessing, NLP, and sentiment classification to analyze positive, neutral, and negative tweets. Created visualizations like sentiment distribution, airline comparisons, and word clouds for key insights. Delivered a cleaned dataset, insightful analysis, and automated PowerPoint report.
feature-engineering natural-language-processing python tokenization programming-environment-skills stemming-lemmatization classification-algorithms communication-skills confusion-matrix machine-learning model-evaluation naive-bayes python-pptx text-vectorization tf-idf-vectorizer word-frequency-analysis wordcloud-generation eda-skills visualization-export
Language:Jupyter Notebook 1
Chronos-Asteri / movie-recommender-system-v1
Using text-vectorization and similarity-based-matrix computation
text-vectorization similarity-based-matrix-computation
Language:Jupyter Notebook 1
markiskorova / Machine-Learning-NLP-Predict-Author
🧠 Machine Learning & Natural Language Processing: Predict the author of literary text snippets. Built with TensorFlow and Keras, this project trains an LSTM model on classic literature to identify writing style and authorship.
keras machine-learning natural-language-processing python tensorflow text-tokenization text-vectorization
Language:Python 1
nainiayoub / demystifying-nlp
Demystifying NLP with a series of NLP implementation notebooks.
nlp text-vectorization text-preprocessing
Language:Jupyter Notebook 1
rosette-api-community / visualize-embeddings
A simple Python script for transforming a corpus of documents into text vectors suitable for visualization
text-embedding visualization python text-vectorization tsv nlp natural-language-processing machine-learning
Language:Python 1
SD7Campeon / Comment-Toxicity-Detection-and-Classification
LLM-inspired BiLSTM pipeline for real-time, multi-label toxicity inference across adversarial discourse modalities.
affective-computing bilstm deep-sequential-model discourse-analysis keras-tensorflow llm multi-label-classification nlp nlp-pipeline real-time-inference sklearn subword-tokenization text-vectorization toxicity-analysis toxicity-classification toxicity-detection toxicity-prediction transformer contextual-nlp
Language:Jupyter Notebook 1
sergio11 / headline_generation_lstm_transformers
Explore advanced neural networks for crafting captivating headlines! Compare LSTM 🔄 and Transformer 🔀 models through interactive notebooks 📓 and easy-to-use wrapper classes 🛠️. Ideal for content creators and data enthusiasts aiming to automate and enhance headline generation ✨.
deep-learning lstm lstm-model lstm-neural-networks model-comparison model-training model-training-and-evaluation natural-language-processing text-generation text-vectorization transformers
Language:Jupyter Notebook 1
singh-l / Clustering_Repo
Clustering text using text vectorization
clustering-text text-vectorization categorize-data knowledge
Language:Jupyter Notebook 1
Vidhi1290 / ScienceQA-Insights-Exploring-with-LLMs
Predictive Text Analysis project! This repository contains code for predicting answers to science exam questions using advanced natural language processing techniques. Check out the code and results!
interactive-visualizations kaggle kaggle-competition machine-learning multi-class-classification nlp nlp-machine-learning random-forest-classifier text-analysis text-vectorization predictive-text-analysis
Language:Jupyter Notebook 1
vladimiralbrekhtccr / topic_modeling_top2vec_scientific-texts
A diploma project focused on vectorizing scientific texts using the Top2Vec algorithm, with the aim of analyzing thematic groups, identifying trends, and visualizing the dynamics of interest in various topics in the field of computer science.
computer-science science-article text-vectorization topic-modeling
Language:Jupyter Notebook 1
Ganesh2409 / Course-Recommendation-System
🚀 Course Recommendation System is a machine learning-powered web application designed to recommend similar courses from Coursera's vast dataset of over 3,000 courses. Built using Python, Scikit-learn, and Streamlit, the app preprocesses course data, applies text vectorization, and leverages cosine similarity to offer personalized recommendations.
cosine-similarity data-science docker machine-learning nlp python recommendation-system streamlit-webapp text-vectorization
Language:Jupyter Notebook 0
IanCarmona / Recommendation-Songs-Taylor-Swift
This program is a project carried out in the Natural Language Processing course, which is a Taylor Swift song recommender. It utilizes topics such as sentiment analysis in texts, text vectorization, and the removal of stopwords.
natural-language-processing sentiment-analysis sentiment-classification stopwords text-vectorization
Language:Python 0
narasi143 / AI-Resume-Analyzer-and-Job-Match-Tool
AI Resume & Job Matching Tool is a Streamlit-based web app that compares a candidate’s resume with a job description using NLP (TF-IDF & cosine similarity). It generates a match score and highlights missing skills, helping job seekers optimize their resumes for specific roles.
machine-learning-concepts natural-language-processing python similarity-measures text-vectorization web-application carrer-tech-applications
Language:Python 0
nikhil1209ui / movie_recommender
Movie Recommender based on Content based filtering.
model-building data-collection deployments exploratory-data-analysis feature-selection-and-engineering text-vectorization web-hosting python
Language:Jupyter Notebook 0
ns-nexus / Movie-Recommender-System
Movie Recommender System leverages a content-based approach, suggesting films to users based on the attributes of movies they have previously enjoyed. By analyzing movie metadata such as genre, cast, director, keywords, etc., this project offers personalized recommendations aligned with users' cinematic tastes.
bag-of-words content-based-recommendation cosine-similarity data-science machine-learning porter-stemmer similarity-matrix stop-word-removal streamlit-application text-vectorization
Language:Jupyter Notebook 0
vineyumarji / Sentiment-Analysis
Identifies the emotions expressed in comments posted on social media
emotion-detection emotion-recognition fine-tuning model sentiment-analysis text-vectorization
Language:Jupyter Notebook 0
vlada-pv / Prediction-Sociolinguistic-Data-Based-on-the-Diaries-Texts-of-the-Prozhito-Project
The repository contains notebooks created for collecting and preprocessing the corpus of diary entries and for experiments on creating models for predicting gender, age groups of authors and the time period of text creation.
author-profiling bilstm deep-learning diary-entries naive-bayes-classifier neural-networks sociolinguistics bag-of-words convol convolutional-neural-networks logistic-regression recurrent-neural-networks text-preprocessing text-vectorization tf-idf-vectorizer word-embeddings
Language:Jupyter Notebook 0
arnavsrao09 / MovieRecommender
A movie recommendation system using content-based filtering with cosine similarity to suggest movies based on plot and genre similarity. Built with Python, Pandas, and Scikit-learn.
jupyter-notebook python text-vectorization
Language:Jupyter Notebook
deypadma2020 / NaturalLanguageProcessing
Beginner-friendly notes and resources on Natural Language Processing (NLP), covering text preprocessing, vectorization, embeddings, models, and project pipeline.
bag-of-ngrams bag-of-words deeo-learning machine-learning nlp sentiment-analysis text-processing text-vectorization tf-idf word2vec-model
Language:Jupyter Notebook
Hasnat-Aarif-Aslam / NLP-Foundation-Tokens-Ngrams-BoW-TF-IDF-TFIDF
Comprehensive guide to text preprocessing and vectorization techniques for NLP, covering tokenization, n-grams, Bag-of-Words, TF-IDF, and related feature-engineering methods.
bag-of-words bow feature-engineering machine-learning natural-language-processing ngrams nlp text-processing text-vectorization tfidf tokenization
indu-explores-data / Sentiment-Analysis-on-IMDB-Reviews
Sentiment Analysis is a Natural Language Processing (NLP) technique used to identify the emotional tone behind a piece of text — typically classified as positive, negative, or neutral.
bag-of-words classification-model data-science end-to-end-ml feature-engineering gensim machine-learning model-tuning natural-language-processing nlp nltk python scikit-learn sentiment-analysis spacy text-classification text-vectorization tfidf word2vec
Language:Jupyter Notebook
kpsajesh / NaturalLanguageProcessing
Natural language learning codes
audio-processing audio-spectrogram embeddings jupyter-notebook language-translation mel-spectrogram nltk python3 sentiment-analysis text-vectorization transfer-learning transformer-architecture
Language:Jupyter Notebook
LKEthridge / Machine_Learning_for_Texts
A Machine Learning Project using Texts from TripleTen
bag-of-words bert embeddings language-representations lemmatization machine-learning-for-text-classification n-grams regular-expressions sentiment-analysis text-vectorization tf-idf word-embeddings word2vec
Language:Jupyter Notebook
SayamAlt / E-Commerce-Text-Classification
Successfully established a machine learning model that can accurately classify an e-commerce product into one of four categories, namely "Books", "Clothing & Accessories", "Household" and "Electronics", based on the product's description.
categorical-encoding cross-validation exploratory-data-analysis hyperparameter-optimization machine-learning model-deployment model-training-and-evaluation text-classification text-preprocessing text-vectorization
Language:Jupyter Notebook
thatdamncoder / whats-next-on-netflix
Content Based Movie Recommendation System | Python
machine-learning python streamlit-webapp cosine-similarity text-vectorization api tmdb-api
Language:Jupyter Notebook
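
Several of the repositories listed above describe the same basic recipe: turn each text into a TF-IDF vector, then rank texts by cosine similarity. The sketch below illustrates that general idea using only the Python standard library. It is not taken from any of the listed projects; the function names (`tokenize`, `tfidf_vectors`, `cosine`), the whitespace tokenizer, the smoothed IDF formula, and the sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Hypothetical tokenizer: lowercase, then split on whitespace."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document.

    Assumption: a smoothed IDF, log((1 + N) / (1 + df)) + 1, so that terms
    appearing in every document still get a non-zero weight.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Term frequency is normalized by document length.
        vectors.append(
            {t: (c / len(tokens)) * idf[t] for t, c in counts.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Made-up sample documents, for illustration only.
    docs = [
        "tomato basil pasta",
        "basil tomato soup",
        "chocolate cake recipe",
    ]
    vecs = tfidf_vectors(docs)
    for i in range(1, len(docs)):
        print(docs[i], "->", round(cosine(vecs[0], vecs[i]), 3))
```

In practice, projects like these usually rely on a library vectorizer rather than hand-rolled code; the sketch only makes the two steps (weighting terms, comparing vectors) explicit.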
