There are 62 repositories under big-data-analytics topic.
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
Easy Machine Learning is a general-purpose dataflow-based system for easing the process of applying machine learning algorithms to real world tasks.
A Data Analysis Board in Vue.
PySpark-Tutorial provides basic algorithms using PySpark
可视化大屏解决方案, 提供一套可视化编辑引擎, 助力个人或企业轻松定制自己的可视化大屏应用.
Use CH-UI to work with your data from Click House self-hosted with a user-friendly interface. CH-UI is a modern and feature-rich user interface for ClickHouse databases. It offers an intuitive platform for querying ClickHouse databases, executing queries, and visualizing metrics about your instance.
Powerful & Easy way for big data discovery
A multi-cloud framework for big data analytics and embarrassingly parallel jobs, that provides an universal API for building parallel applications in the cloud ☁️🚀
A data-driven method combining symbolic regression and compressed sensing for accurate & interpretable models.
Graph Sampling is a python package containing various approaches which samples the original graph according to different sample sizes.
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Data cleaning, pre-processing, and Analytics on a million movies using Spark and Scala.
This is about learning courses in Coursera. All the answers given written by myself
I have built the computer vision models in 3 different ways addressing different personas, because not all companies will have a resolute data science team. quality-control manufacturing big-data-analytics jupyter-notebook cognitive services industry solutions
Bucketize an image based on exhaust data and AI generated data. industry-solutions azure azure machine learning services computer-vision big data big data analytics machine learning image recognition manufacturing quality control cognitive services
Course covers big data fundamentals, processes, technologies, platform ecosystem, and management for practical application development.
The binary build of LEO CDP Free Edition for training purposes
Big data projects implemented by Maniram yadav
TeraHeap: Reducing Memory Pressure in Managed Big Data Frameworks
The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can use quickly. Just focus on writing pyspark code.
ARAKAT - Big Data Analysis and Business Intelligence Application Development Platform
Plugin offering views, operators, sensors, and more developed at Pandora Media.
Eskimo is a state of the art Big Data Infrastructure and Management Web Console to build, manage and operate Big Data 2.0 Analytics clusters on Kubernetes. This is the git repository of Eskimo Community Edition.
This project analyses and correlates student performance with different attributes. Then at last, it determines most suitable algorithm from bunch of them.
YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.
American Community Survey data on people and households
US Real Estate Rental Price Analysis