There are 59 repositories under the big-data-analytics topic.
1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames.
Easy Machine Learning is a general-purpose dataflow-based system for easing the process of applying machine learning algorithms to real world tasks.
A Data Analysis Board in Vue.
PySpark-Tutorial provides basic algorithms using PySpark
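A large-screen data visualization solution that provides a visual editing engine, helping individuals and companies easily customize their own large-screen visualization applications.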
Powerful & Easy way for big data discovery
A multi-cloud framework for big data analytics and embarrassingly parallel jobs that provides a universal API for building parallel applications in the cloud ☁️🚀
A data-driven method combining symbolic regression and compressed sensing for accurate & interpretable models.
Graph Sampling is a Python package containing various approaches that sample the original graph according to different sample sizes.
The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
Data cleaning, pre-processing, and analytics on a million movies using Spark and Scala.
Coursework from courses taken on Coursera. All the answers given were written by myself.
Course covers big data fundamentals, processes, technologies, platform ecosystem, and management for practical application development.
Big data projects implemented by Maniram Yadav
The binary build of LEO CDP Free Edition for training purposes
TeraHeap: Reducing Memory Pressure in Managed Big Data Frameworks
US Real Estate Rental Price Analysis
Plugin offering views, operators, sensors, and more developed at Pandora Media.
ARAKAT - Big Data Analysis and Business Intelligence Application Development Platform
Eskimo is a state-of-the-art Big Data Infrastructure and Management Web Console to build, manage and operate Big Data 2.0 Analytics clusters on Kubernetes. This is the git repository of Eskimo Community Edition.
The goal of this project is to offer an AWS EMR template using Spot Fleet and On-Demand Instances that you can put to use quickly, so you can focus on writing PySpark code.
American Community Survey data on people and households
R interface to Azure Data Explorer, aka Kusto
IoT and big data analytics using Apache Kafka, Spark, and other AWS services
This project analyses and correlates student performance with different attributes, then determines the most suitable algorithm from a set of candidates.
YiraBot: Simplifying Web Scraping for All. A user-friendly tool for developers and enthusiasts, offering command-line ease and Python integration. Ideal for research, SEO, and data collection.
Workflow management system for the automated and distributed analysis of large-scale experimental data.