There are 5 repositories under the tfidfvectorizer topic.
Phony News Classifier is a repository containing an analysis of a natural language processing application, namely a fake news classifier, built with text preprocessing strategies such as bag of words, TF-IDF vectorizer, lemmatization, and stemming, combined with Naive Bayes and deep learning RNN (LSTM) models, with the detailed accuracy maintained below.
With TF-IDF being the most basic and simple topic in NLP, there's a lot that can be done using TF-IDF only! So, in this repo, I'll be adding the blog, TF-IDF basics, wonders done using TF-IDF, etc.
Posts/feeds recommendation engine based on content-based and collaborative filtering methods
Movie Recommendation System based on machine learning concepts
Hire the Perfect candidate. HackerEarth Competitions solution.
An NLP model to detect fake news and accurately classify a piece of news as REAL or FAKE, trained on a dataset provided by Kaggle.
Python-based web application, built on the Flask platform, that uses a content-based filtering algorithm to provide personalized exercise recommendations
Machine learning approach for fake news detection using scikit-learn
In this project we compare two approaches to movie recommendation for a new or existing user based on their age, gender, and occupation.
For our final project, our group chose to use a dataset (from Kaggle) that contained medical transcriptions and the respective medical specialties (4998 datapoints). We chose to implement multiple supervised classification machine learning models - after heavily working on the corpora - to see if we were able to correctly classify the medical specialty based on the transcription text.
Learned to detect fake news with Python. We took a political dataset, implemented a TfidfVectorizer, initialized a PassiveAggressiveClassifier, and fit our model. We ended up obtaining an accuracy of 92.82%.
Data consists of tweets scraped using the Twitter API. The objective is sentiment labelling using a lexicon approach, performing text pre-processing (such as language detection, tokenisation, normalisation, vectorisation), building pipelines for text classification models for sentiment analysis, followed by explainability of the final classifier
Application of the TF-IDF Vectorizer and Passive Aggressive Classifier to fake news detection with Python.
Use key NLP techniques to classify news articles into categories: bag of words (TF-IDF), word embeddings, and the BERT language model
The aim is to develop a model that gives accurate predictions for the customer's test sample, but the training sample is not given. It should be collected by parsing
Fake News Prediction System using logistic regression, stopwords, nltk
Fake news detection using text classification to label news segments as real or fake. Required installations: Python 3.8, NLTK, Scikit-Learn, Jupyter. Text cleaning, tokenization, vectorization, classification model generation and evaluation.
Detecting 'FAKE' news using machine learning.
SMS Spam Classifier is a machine learning project that classifies SMS messages as either spam or not spam (ham).
Detect FAKE news using sklearn
Machine learning model to predict emotions through text
A Lite Implementation of sklearn TfidfVectorizer
This project aims to recommend movies to the user based on high similarity scores among them.
A simple conversational chatbot built using NLU concepts. The project uses Reddit comments taken from 2015, which contain about 1.7 billion interactions.
Intents-Based Chatbot with Streamlit
This web app helps find inaccurate information in news from around the world
Part of an internal project for my internship
This project is a Python-based, end-to-end data science chatbot designed to handle conversations autonomously, without the need for human intervention. Leveraging natural language processing (NLP) techniques and machine learning algorithms, the chatbot is capable of understanding user queries and providing relevant responses.
Email Classifier: A machine learning project using Python that categorizes emails into spam and ham (non-spam). Utilizes the Scikit-Learn library, employing logistic regression and TF-IDF (Term Frequency-Inverse Document Frequency) vectorization for text analysis and classification.
A content-based recommender system that recommends similar movies based on movie plot and description
A program that takes in a large number of .csv files from the same directory and trains a deep learning neural network model based on the best/worst students as evaluated by the user. A second program then uses the model from the first program to produce a projected report for each individual student provided as input.
Restrictions on sharing as advised by DataCamp
Compute the TF-IDF matrix from a collection of documents to measure the importance of words for text analysis and information retrieval tasks.
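
As a rough illustration of what computing a TF-IDF matrix from a collection of documents can involve, here is a minimal pure-Python sketch. It is not taken from any repository listed here. The function name `tfidf_matrix`, the sample documents, and the lowercase whitespace tokenizer are all assumptions made for the example. The IDF uses the smoothed form ln((1 + n) / (1 + df)) + 1 with L2-normalized rows, which is the form scikit-learn documents for `TfidfVectorizer` with its default `smooth_idf=True`; other variants exist and give different weights.

```python
import math
from collections import Counter


def tfidf_matrix(documents):
    """Return (vocabulary, matrix) for a list of raw text documents.

    Assumptions for this sketch: lowercase whitespace tokenization,
    smoothed IDF ln((1 + n) / (1 + df)) + 1, and L2-normalized rows.
    """
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(documents)

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocabulary}

    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Raw term count times IDF, one column per vocabulary term.
        row = [counts[t] * idf[t] for t in vocabulary]
        norm = math.sqrt(sum(w * w for w in row))
        matrix.append([w / norm if norm else 0.0 for w in row])
    return vocabulary, matrix


# Hypothetical sample corpus, used only to exercise the function.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs",
]
vocab, matrix = tfidf_matrix(docs)
for term, weight in zip(vocab, matrix[0]):
    if weight:
        print(f"{term}: {weight:.3f}")
```

In the first sample document, "cat" and "sat" each occur once, but "cat" appears in fewer documents, so it ends up with the higher weight. This is the "importance of words" effect the description refers to.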