MAX.H's repositories
awesome-etl
A curated list of awesome ETL frameworks, libraries, and software.
awesome-python
A curated list of awesome Python frameworks, libraries, software and resources
big-data-landscape
Big Data Landscape (by www.qaware.de)
big-data-plugin
Kettle plugin that provides support for interacting with many "big data" projects, including Hadoop, Hive, HBase, Cassandra, MongoDB, and others.
datacollector
StreamSets Data Collector - Continuous big data and cloud platform ingest infrastructure
DataQuality
DataQuality for BigData
debezium
Change data capture for a variety of databases. https://debezium.io Please log issues in our JIRA at https://issues.jboss.org/projects/DBZ/issues
flink-learning
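Flink learning blog. http://www.54tianzhisheng.cn Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, Table API & SQL, and more, along with case studies of large production Flink projects (PVUV, log storage, real-time deduplication at the ten-billion-record scale, monitoring and alerting). Support for my column "Big Data Real-Time Computing Engine: Flink in Practice and Performance Optimization" is welcome.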
gradict-charts-doc
图之典 - Chart documentation
hadoopecosystemtable.github.io
This page is a summary that keeps track of Hadoop-related projects and other relevant projects in the Big Data scene, focused on the open source, free software environment.
ignite-learning-paths-training
Ignite Learning Path Training
incubator-hop
Hop Orchestration Platform
incubator-hudi
Upserts And Incremental Processing on Big Data
incubator-superset
Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application
Java
All Algorithms implemented in Java
kettle-scheduler
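A simple, easy-to-use Kettle scheduling and monitoring platform, built specifically to schedule and monitor jobs and transformations created with the Kettle client. The overall framework integrates Spring, Spring MVC, and BeetlSQL; it runs transformations and jobs by calling the Kettle API and uses the Quartz framework for scheduling.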
kudu
Mirror of Apache Kudu
Machine-Learning-Study-Path-March-2019
A complete ML study path, focused on TensorFlow and Scikit-Learn
openGauss-server
openGauss kernel
pentaho-kettle
Pentaho Data Integration (ETL), a.k.a. Kettle
presto-1
Official home of Presto, the distributed SQL query engine for big data
spark-clickhouse
Spark to Yandex ClickHouse connector
spring-cloud-dataflow
Spring Cloud Data Flow is a toolkit for building data integration and real-time data processing pipelines.
waterdrop
A product for computing over massive data in production environments. Documentation: