brucemen711's repositories
BigData-Notes
A Beginner's Guide to Big Data :star:
awesome-datascience
:memo: An awesome Data Science repository to learn and apply for real world problems.
awesome-sysadmin
A curated list of amazingly awesome open source sysadmin resources inspired by Awesome PHP.
book-notes
Notes from books and other interesting things that I've read. Table of contents at the end 👇
dag-factory
Dynamically generate Apache Airflow DAGs from YAML configuration files
dbt-core
dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
dbt-spark
dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks
debezium-examples
Examples for running Debezium (Configuration, Docker Compose files etc.)
hive-metastore-docker
Example for article Running Spark 3 with standalone Hive Metastore 3.0
iceberg
Apache Iceberg
incubator-dolphinscheduler
Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available `out of the box`.
integrations-extras
Community developed integrations and plugins for the Datadog Agent.
k8s-example
example
kamu-cli
New generation decentralized data warehouse and streaming data pipeline
ML-From-Scratch
Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
mml-book.github.io
Companion webpage to the book "Mathematics For Machine Learning"
nlp_course
YSDA course in Natural Language Processing
puppet-clickhouse-1
Install and manage ClickHouse DBMS. Requires the xml-simple Ruby gem to be installed.
react-flow
Highly customizable library for building interactive node-based UIs, editors, flow charts and diagrams
the_silver_searcher
A code-searching tool similar to ack, but faster.
verdict
Interactive-Speed Analytics: 200x Faster, 200x Fewer Cluster Resources, Approximate Query Processing
wormhole
Wormhole is a SPaaS (Stream Processing as a Service) Platform