There are 0 repositories under the tfidf-vectorizer topic.
The objective of this repository is to learn and build machine learning models using PyTorch. 30DaysofML Using PyTorch
Document Search Engine project with TF-IDF and the Google Universal Sentence Encoder model
Using natural language processing to analyze the sentiments of people and detect suicidal ideation on online social content.
A natural language processing project on SMS data that predicts whether an SMS is spam or ham, comparing the accuracy of various ML algorithms (multinomial naive Bayes, logistic regression, SVM, decision trees) and using data cleaning and processing techniques such as PorterStemmer, CountVectorizer, TF-IDF Vectorizer, and WordNetLemmatizer. It is implemented using an LSTM and word embeddings to reach an accuracy of 97.84%.
Weighted Class TFIDF technique to deal with imbalanced datasets
Bus-Mama is a bus tracking mobile application for student transportation at BSMRSTU. It helps the students of our university by showing the available routes and buses and their exact locations. The app includes real-time bus tracking, which addresses a problem that university students have faced for many years: students often miss their buses because they cannot keep track of the bus times. Since our university has many buses, students can easily catch one if they know where and when it will pass by. My goal is to track the buses and to build hardware, a mobile application, and a machine learning solution to solve the issue, so that students no longer miss the bus and can use the buses efficiently. GPS trackers attached to every bus will report its current position and automatically sync it to the server. The Bus-Mama mobile application will show the real-time position of every bus. The application will be installed on students' mobile phones so that they can easily manage their transportation. In the application, the current location of each bus can be seen on Google Maps. Every bus has its own marker on the map, and clicking the marker shows all the details about that bus: how far away it is, which direction it is coming from, how much time is needed to reach it, how much longer it will take if there is traffic on the road, etc. There is also a search option for looking up the details of a specific bus, and a list of all buses with sufficient details. Every student will have an account through which they can access bus data. Another main objective is the Bus-Mama Chatbot in the Bengali language, so that students can easily ask about the buses. For now, it can converse only about bus-related information.
The Chatbot cannot yet hold a conversation beyond bus-related questions. If anyone asks anything else, it cannot answer the question; instead, it replies with a tag for the question. Because the Chatbot is built for the Bengali language, it uses a trie data structure for lemmatization. A library has been designed to lemmatize Bengali words. Almost 63,205 Bengali words have been lemmatized using the library to train the SVM machine learning model.
This notebook contains an entire text preprocessing pipeline for NLP problems. The ready-to-use functions require the NLTK and scikit-learn packages to be installed. It also contains some prominent text classification models.
Fake News Detection System for detecting whether news is fake or not. The model is trained using "Liar, Liar Pants on Fire": A New Benchmark Dataset for Fake News Detection. Link for dataset: https://arxiv.org/abs/1705.00648.
VIP Machine Learning Exercises and Practices
TFIDF is the most basic and simple topic in NLP, yet there's a lot that can be done using TFIDF alone! So, in this repo, I'll be adding the blog, TFIDF basics, wonders done using TFIDF, etc.
This repo contains a machine learning model built with advanced and enhanced algorithms like KNN and SVD, along with concepts like vectorization and cosine similarity, that predicts movies similar to a user's given favorite movie. So no more time wasted searching for a good movie of your choice.
Authorship Attribution with Machine Learning
TfidfVectorizer & PassiveAggressiveClassifier
Spam SMS Detection Project implemented using NLP and Transformers. DistilBERT, a Hugging Face Transformer model for text classification, is fine-tuned on the data to achieve the best results. Multinomial Naive Bayes achieved an F1 score of 0.94, and the model was deployed on a Flask server. The application is deployed on Google Cloud Platform.
A text can be assigned more than one label
Sentiment analysis of IMDB dataset.
An NLP model to detect fake news and accurately classify a piece of news as REAL or FAKE, trained on a dataset provided by Kaggle.
Scikit-learn tutorial for beginners. How to perform classification and regression. How to measure machine learning model performance: accuracy, precision, recall, ROC.
📷🎥 Entity resolution system for SIGMOD 2020 programming contest
E-Commerce Recommendation System
This repo contains code files of all the important topics of NLP.
Assignment-11-Text-Mining-01-Elon-Musk: perform sentiment analysis on the Elon Musk tweets (Exlon-musk.csv). Text preprocessing: remove leading and trailing characters; remove empty strings (Python treats them as False); join the list into one string; remove Twitter username handles (@usernames) from the tweet text; join the list into one string again; remove punctuation; remove https links or URLs within the text; convert into text tokens (tokenization); remove stopwords; normalize the data; stemming (optional); lemmatization. Feature extraction: BoW CountVectorizer, CountVectorizer with n-grams (bigrams and trigrams), TF-IDF Vectorizer. Generate word cloud, named entity recognition (NER), emotion mining and sentiment analysis.
🖼️ Text2Meme is a Meme Classification Experiment based on Caption Text (Implemented as a Discord Bot)
Recommendation system built using multiple ML models that aim to predict users' interests based on their past behavior and preferences.
Resume parser using spaCy and Streamlit
Given a document, identifies the closest documents within a list of documents using a TF-IDF matrix and cosine similarity
Implementation of various Machine Learning and Deep Learning models for Sentiment Analysis on the 'Sentiment Labelled Sentences Data Set' by University of California, Irvine.
A comprehensive approach to implicit and explicit rating based recommendation engine
Twitter Sentiment Analysis
Movie recommender engine built on Django
Personalized anime recommendations based on collaborative filtering. Discover your next favorite anime!
This is a recommendation engine that recommends 10 courses related to the course you search for.
This repository contains introductory notebooks for text mining and web scraping.
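
Several of the projects listed above pair TF-IDF weighting with cosine similarity to find related documents. As a rough illustration only (it is not taken from any of the listed repositories), a minimal pure-Python sketch of that idea might look like the following. The function names tfidf_vectors, cosine, and closest are hypothetical, tokenization is plain whitespace splitting, and the smoothed IDF formula is just one common choice among several.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Term frequency times a smoothed IDF: ln((1 + n) / (1 + df)) + 1.
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest(query_index, docs):
    """Indices of the other documents, most similar to docs[query_index] first."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[query_index], vec), i)
        for i, vec in enumerate(vectors)
        if i != query_index
    ]
    return [i for _, i in sorted(scores, reverse=True)]
```

For larger collections, a library vectorizer and sparse matrices would usually be a better fit than this dictionary-based sketch, which is meant only to make the weighting and similarity steps visible.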