LLS's repositories
cratos-remoting
cratos-remoting RPC application
incubator-seatunnel
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
AIAS
AIAS (AI Acceleration Suite), an accelerator toolkit for putting AI algorithms into production
big-data-projects
This project contains customizations such as custom data sources and plugins written for distributed systems like Apache Spark and Apache Ignite.
ColumnLevelLineageListener
Column Level Lineage based on Apache Spark Listener.
databend
A modern cloud data warehouse focused on elasticity and performance. Activate your object storage for real-time analytics. Databend Serverless at https://app.databend.com/
DataX
DataX is the open-source version of Alibaba Cloud DataWorks Data Integration.
determined
Determined: Deep Learning Training Platform
dlink
Dinky is an out-of-the-box, one-stop real-time computing platform dedicated to the construction and practice of Unified Batch & Streaming and Unified Data Lake & Data Warehouse. Based on Apache Flink, Dinky provides the ability to connect many big data frameworks including OLAP and Data Lake.
dolphinscheduler
Apache DolphinScheduler is a modern data workflow orchestration platform with a powerful user interface, dedicated to solving complex task dependencies in the data pipeline and providing various types of jobs available `out of the box`.
flink-spark-submiter
Submit Flink/Spark jobs from a local IDEA to a Yarn/k8s cluster.
flink-sql-lineage
FlinkSQL field lineage solution and source code. The core idea is to parse SQL through Calcite to generate a RelNode tree of relational expressions, then obtain the optimized logical plan through the optimization stage, and finally call Calcite RelMetadataQuery to get the lineage relationship at the field level.
flink-sql-security
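Row-level permission solution and source code for FlinkSQL. It supports user-level, row-level data access control: a given user can access only the rows they are authorized for, and unauthorized rows are hidden. This is a solution for Flink in the real-time domain, similar to the Ranger Row-level Filter approach for Hive in offline data warehouses.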
k8s-device-plugin
OpenAIOS vGPU device plugin for Kubernetes originated from the OpenAIOS project to virtualize GPU device memory, allowing applications to access a larger memory space than the physical capacity. It is designed to make extended device memory easy to use for AI workloads.
KnowStreaming
One-stop Apache Kafka management and control platform
kplcloud
A PaaS platform based on Kubernetes
kyuubi
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
label-studio
Label Studio is a multi-type data labeling and annotation tool with standardized output format
labelme
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
modeldb
Open Source ML Model Versioning, Metadata, and Experiment Management
rust_cms
A CMS (content management system) written in Rust that can serve as a personal blog or a company website
sealos
A cloud OS based on the Kubernetes kernel! Let sealos run Kubernetes and applications.
simple-taskflow
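taskflow is a lightweight, easy-to-use, and flexibly extensible general-purpose task orchestration framework built on directed acyclic graphs (DAGs). It provides component reuse, synchronous/asynchronous orchestration, conditional logic, and branch selection, and can orchestrate arbitrary business processes for different business scenarios.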
spark-sql-flow-plugin
Visualize column-level data lineage in Spark SQL
spug
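Open-source operations platform: a lightweight, agentless automated operations platform designed for small and medium-sized enterprises. It integrates host management, batch command execution on hosts, an online host terminal, online file upload and download, application release and deployment, online scheduled tasks, a configuration center, monitoring, alerting, and more.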
synch
Sync data from other databases to ClickHouse (cluster)
trino-hive-superset-docker
Cloud-native Trino (prestosql) + Hive + Minio + Superset