孙波's repositories

autolabel

Label, clean and enrich text datasets with LLMs.

Language: Python | License: MIT | Stargazers: 1 | Issues: 0 | Issues: 0

flinkStreamSQL

Extends the real-time SQL of open-source Flink; mainly implements joins between streams and dimension tables, and supports all native Flink SQL syntax.

Language: Java | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

airflow

Apache Airflow

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

atlas

Apache Atlas

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

beam

Apache Beam

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

DataLink

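DataLink is a distributed, scalable data exchange platform that supports real-time incremental synchronization and offline full synchronization between heterogeneous data sources.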

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

DataSphereStudio

DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

delta-architecture

Streaming data changes to a Data Lake with Debezium and Delta Lake pipeline

Stargazers: 0 | Issues: 0 | Issues: 0

dr-elephant

Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

FATE

An Industrial Grade Federated Learning Framework

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

fes.js

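Fes.js is an excellent front-end solution for admin and back-office applications. It provides a command-line tool for project scaffolding, development and debugging, API mocking, and build and packaging. It has built-in modules for layout, permissions, data dictionaries, state management, storage, and API access. With a convention-based, configuration-driven, component-based design, users only need to focus on building page content from components. It is based on Vue.js and easy to get started with, and has become stable after being refined across multiple projects.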

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

flink-cdc-connectors

Change Data Capture (CDC) Connectors for Apache Flink

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

flinkx

A distributed data synchronization tool based on Flink.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

free-programming-books-zh_CN

:books: Free Chinese-language computer programming books; contributions welcome.

License: GPL-3.0 | Stargazers: 0 | Issues: 0 | Issues: 0

GitDataV

A GitHub data visualization platform built with the Vue framework.

Stargazers: 0 | Issues: 0 | Issues: 0

God-Of-BigData

Focused on big data learning and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...

Stargazers: 0 | Issues: 0 | Issues: 0

hudi

Upserts, Deletes And Incremental Processing on Big Data.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

iceberg

Apache Iceberg

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

incubator-inlong

Apache InLong

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

incubator-superset

Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application

Language: JavaScript | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Linkis

Linkis helps easily connect to various back-end computation/storage engines (Spark, Python, TiDB...), exposes various interfaces (REST, JDBC, Java ...), and provides multi-tenancy, high performance, and resource control.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

NNAnalytics

NameNodeAnalytics is a self-help utility for scouting and maintaining the namespace of an HDFS instance.

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

Qualitis

Qualitis is a one-stop data quality management platform that supports quality verification, notification, and management for various data sources. It is used to solve data quality problems caused by data processing. https://github.com/WeBankFinTech/Qualitis

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Quicksql

A Flexible, Fast, Federated (3F) SQL Analysis Middleware for Multiple Data Sources

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

scio

A Scala API for Apache Beam and Google Cloud Dataflow.

Language: Scala | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

Scriptis

Scriptis is for interactive data analysis with script development (SQL, Pyspark, HiveQL), task submission (Spark, Hive), UDF, function, and resource management, and intelligent diagnosis.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

shuzeCloud

A leading data middle-platform development platform in China.

Stargazers: 0 | Issues: 0 | Issues: 0

snowplow

Cloud-native web, mobile and event analytics, running on AWS and GCP

Language: Scala | Stargazers: 0 | Issues: 0 | Issues: 0

wormhole

Wormhole is a SPaaS (Stream Processing as a Service) Platform

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0