孙波's repositories
flinkStreamSQL
Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax
airflow
Apache Airflow
atlas
Apache Atlas
beam
Apache Beam
ChatGLM-6B
ChatGLM-6B: An Open Bilingual Dialogue Language Model
DataLink
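DataLink is a distributed, scalable data exchange platform that supports real-time incremental synchronization and offline full synchronization between heterogeneous data sources.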
DataSphereStudio
DataSphereStudio is a one-stop data application development & management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.
delta-architecture
Streaming data changes to a Data Lake with Debezium and Delta Lake pipeline
dr-elephant
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
FATE
An Industrial Grade Federated Learning Framework
fes.js
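Fes.js is an excellent front-end solution for admin and back-office applications. It provides command-line tools for project scaffolding, development and debugging, API mocking, and build packaging. Built-in modules cover layout, permissions, data dictionaries, state management, storage, API, and more. Its convention-based, configuration-driven, component-based design lets users focus only on building page content from components. It is based on Vue.js and easy to pick up. Refined across multiple projects, it is becoming stable.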
flink-cdc-connectors
Change Data Capture (CDC) Connectors for Apache Flink
flinkx
A distributed data synchronization tool based on Flink
free-programming-books-zh_CN
:books: Free computer programming books in Chinese; contributions welcome
GitDataV
A GitHub data visualization platform built with the Vue framework
God-Of-BigData
Focused on big data learning and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/HBase/Hive...
hudi
Upserts, Deletes And Incremental Processing on Big Data.
iceberg
Apache Iceberg
incubator-inlong
Apache InLong
incubator-superset
Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application
Linkis
Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...) and exposes various interfaces (REST, JDBC, Java ...), with multi-tenancy, high performance, and resource control.
NNAnalytics
NameNodeAnalytics is a self-help utility for scouting and maintaining the namespace of an HDFS instance.
Qualitis
Qualitis is a one-stop data quality management platform that supports quality verification, notification, and management for various data sources. It is used to solve data quality problems caused by data processing. https://github.com/WeBankFinTech/Qualitis
Quicksql
A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources
scio
A Scala API for Apache Beam and Google Cloud Dataflow.
Scriptis
Scriptis is for interactive data analysis, with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, and resource management, and intelligent diagnosis.
shuzeCloud
A leading data middle-platform development platform in China
snowplow
Cloud-native web, mobile and event analytics, running on AWS and GCP
wormhole
Wormhole is a SPaaS (Stream Processing as a Service) Platform