There are 285 repositories under the bigdata topic.
Distributed SQL transaction & query engine for data sharding, scaling, encryption, and more - on any database.
A beginner's guide to big data :star:
A curated list of awesome big data frameworks, resources and other awesomeness.
Focused on big data learning and interviews; start your path to big data mastery. Flink/Spark/Hadoop/Hbase/Hive...
This is a repo with links to everything you'd ever want to learn about data engineering
Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
A Cloud Native Batch System (Project under CNCF)
🔨 Generate structured SQL statements from JSON. Built with Vue3 + TypeScript + Vite + Ant Design + MonacoEditor; a simple project (heavy on logic, light on UI) that is good for practice.
Big data learning: learn big data from scratch, including tutorial videos for each stage of learning and interview materials.
Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax.
Byzer (formerly MLSQL): A low-code open-source programming language for data pipelines, analytics and AI.
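100+ eye-catching HTML5 dashboard templates for big data visualization, covering industries such as community, property management, government, transportation, finance and banking. The newest, largest, most complete, coolest, and flashiest collection of big data visualization templates on the web. Updated continuously.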
Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
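:dart: :star2: [Big data interview questions] A collection of big data interview questions gathered from the web, along with my own summarized answers. Currently covers interview knowledge for the Hadoop/Hive/Spark/Flink/Hbase/Kafka/Zookeeper frameworks.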
Google, Naver multiprocess image web crawler (Selenium)
TensorBase is a new big data warehouse, built with a modern approach.
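A big data knowledge base covering data warehouse modeling, real-time computing, big data, data middle platforms, system design, Java, algorithms, and more.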
The first open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
A Kubernetes batch scheduler for high-performance workloads, e.g. AI/ML, BigData, HPC
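Study notes, plus eBooks and video resources I have gone through and a collection of blogs, websites, and tools I consider good. Covers the major big data components, Python machine learning and data analysis, Linux, operating systems, algorithms, networking, and more.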
Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.