Greg Chu's starred repositories

coding-interview-university

A complete computer science study plan to become a software engineer.

tech-interview-handbook

💯 Curated coding interview preparation materials for busy software engineers

Language: TypeScript | License: MIT | Stargazers: 113011 | Issues: 2099 | Issues: 92

spark

Apache Spark - A unified analytics engine for large-scale data processing

Language: Scala | License: Apache-2.0 | Stargazers: 38667 | Issues: 2030 | Issues: 0

streamlit

Streamlit — A faster way to build and share data apps.

Language: Python | License: Apache-2.0 | Stargazers: 32691 | Issues: 317 | Issues: 4208

system-design

Learn how to design systems at scale and prepare for system design interviews

applied-ml

📚 Papers & tech blogs by companies sharing their work on data science & machine learning in production.

k9s

🐶 Kubernetes CLI To Manage Your Clusters In Style!

Language: Go | License: Apache-2.0 | Stargazers: 25386 | Issues: 146 | Issues: 1751

kaniko

Build Container Images In Kubernetes

Language: Go | License: Apache-2.0 | Stargazers: 14129 | Issues: 140 | Issues: 1508

karpenter

Karpenter is a Kubernetes Node Autoscaler built for flexibility, performance, and simplicity.

Language: Go | License: Apache-2.0 | Stargazers: 5256 | Issues: 73 | Issues: 1752

Tech-Interview-Cheat-Sheet

Studying for a tech interview sucks. Here's an open source cheat sheet to help

Language: TypeScript | License: MIT | Stargazers: 4052 | Issues: 85 | Issues: 7

lightdash

Self-serve BI to 10x your data team ⚡️

Language: TypeScript | License: MIT | Stargazers: 3530 | Issues: 26 | Issues: 5075

quarto-cli

Open-source scientific and technical publishing system built on Pandoc.

Language: JavaScript | License: NOASSERTION | Stargazers: 3467 | Issues: 27 | Issues: 4474

deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

Language: Scala | License: Apache-2.0 | Stargazers: 3157 | Issues: 80 | Issues: 333

dtreeviz

A python library for decision tree visualization and model interpretation.

Language: Jupyter Notebook | License: MIT | Stargazers: 2869 | Issues: 46 | Issues: 202

forecasting

Time Series Forecasting Best Practices & Examples

Language: Python | License: MIT | Stargazers: 2646 | Issues: 104 | Issues: 79

nixtla

TimeGPT-1: production ready pre-trained Time Series Foundation Model for forecasting and anomaly detection. Generative pretrained transformer for time series trained on over 100B data points. It's capable of accurately predicting various domains such as retail, electricity, finance, and IoT with just a few lines of code 🚀.

Language: Jupyter Notebook | License: NOASSERTION | Stargazers: 1826 | Issues: 31 | Issues: 119

re-data

re_data - fix data issues before your users & CEO would discover them 😊

Language: HTML | License: NOASSERTION | Stargazers: 1530 | Issues: 24 | Issues: 195

Reddit-wiki-programming

Resources to learn data structures and algorithms, ace competitive programming, and get a job in tech/CS

plotly-resampler

Visualize large time series data with plotly.py

Language: Python | License: MIT | Stargazers: 970 | Issues: 13 | Issues: 159

sematic

An open-source ML pipeline development platform

Language: Python | License: NOASSERTION | Stargazers: 953 | Issues: 10 | Issues: 331

spotty

Training deep learning models on AWS and GCP instances

Language: Python | License: MIT | Stargazers: 493 | Issues: 9 | Issues: 83

dbx

🧱 Databricks CLI eXtensions - aka dbx is a CLI tool for development and advanced Databricks workflows management.

Language: Python | License: NOASSERTION | Stargazers: 435 | Issues: 22 | Issues: 413

corp

Assets related to the operation of Fishtown Analytics.

dbt-project-evaluator

This package contains macros and models to find DAG issues automatically

Language: Shell | License: Apache-2.0 | Stargazers: 399 | Issues: 6 | Issues: 208

dbt-spark

dbt-spark contains all of the code enabling dbt to work with Apache Spark and Databricks

Language: Python | License: Apache-2.0 | Stargazers: 370 | Issues: 21 | Issues: 355

lleaves

Compiler for LightGBM gradient-boosted trees, based on LLVM. Speeds up prediction by ≥10x.

Language: Python | License: MIT | Stargazers: 318 | Issues: 10 | Issues: 40

fal

⚡ Fastest way to serve open source ML models to millions

Language: Python | License: Apache-2.0 | Stargazers: 195 | Issues: 12 | Issues: 21

spark-utils

Utility functions for dbt projects running on Spark

Language: Python | License: Apache-2.0 | Stargazers: 30 | Issues: 5 | Issues: 11

coalesce-2022-python-databricks

Coalesce 2022 Python models demo with Databricks. Not actively maintained.

Language: Jupyter Notebook | Stargazers: 12 | Issues: 1 | Issues: 0

dbt-package-workshop

The companion repo to the 2022 Coalesce New Orleans Workshop - dbt Packages You Didn't Know You Needed