There are 5 repositories under the dataprocessing topic.
Learning to create Machine Learning Algorithms
A day-to-day plan for the 50 Days of Machine Learning challenge. Covers both theoretical and practical aspects.
Classification of Breast Cancer diagnosis Using Support Vector Machines
Native Delta Lake Implementation in Go
Weather Forecasting report over the Jaipur Dataset for Rain Prediction
Stochastic Testing and Input Manipulation for Unbiased Learning Systems
Build a Flask web application to help users retrieve key restaurant information and feature-based reviews (generated by applying market-basket model – Apriori algorithm and NLP on user reviews).
Machine Learning project to predict the popularity of Instagram posts
Process tardis.dev cryptocurrency data, reconstructing the market depth and computing imbalance.
This Python module scrapes and processes data from several sources. It outputs data either as a dataframe in country-year format or as Excel files. It was created primarily for processing data for the International Futures (IFs) Project, but it can be used for data processing in general. Supported sources: 1) World Bank World Development Indicators (WDI); 2) UNESCO education indicators (UIS); 3) FAO Food Balance Sheets (FAO); 4) IMF Government Finance Statistics (IMF GFS); 5) health data from the Institute for Health Metrics and Evaluation (IHME); 6) water data from FAO AQUASTAT; 7) energy data from EIA. Currently the module runs as is on Windows. On macOS, the user may have to change the lines of code that specify paths.
Creating an Inverted Index of words occurring in a large set of documents extracted from web pages using Hadoop MapReduce and Google Dataproc
This notebook presents a pipeline that processes raw battery-cycling data files and predicts the batteries' useful life before degradation starts.
A versatile pipelining library created with media organization in mind.
List of all my AI Projects
A graphical batch data processing tool for protein crystallography
Can we tell if a house is abandoned based on aerial imagery?
The SQL Graph with Tinkerpop3 and Clojure
Scraping searched jobs on Jobsite with Python and Selenium on Google Colab
An end-to-end application that predicts stock price movements using sentiment analysis of financial news headlines. Powered by machine learning, NLP, and real-time data integration, this project offers investors a reliable tool for data-driven decision-making.
A Capstone Project that covers several aspects of Data Engineering (Data Exploration, Cleaning, Modeling, Pipelining, Processing)
Preprocessing Empatica E4 data for analysis
Collection of scripts to gather training (meta) data for the ML model
Data Science materials
News Scraper App using Python and Beautiful Soup
Encoding: converting categorical data into numerical data
Initial Release 7.0
WebScrapeSummarizer 🌐✍️: A web tool that fetches and summarizes content from any domain, offering insights in a compact CSV format.
The Credit Card Fraud Detection project is a machine learning-based system designed to identify fraudulent transactions in real time. Using historical transaction data, the model classifies transactions as either fraudulent or legitimate, helping financial institutions reduce losses and improve security.
Search the web with SearxNG and summarize using DeepSeek-R1.