Eric Xiao (ericxiao251)

Location: Toronto, Canada

Eric Xiao's repositories

spark-syntax

This is a repo documenting the best practices in PySpark.

Language: Jupyter Notebook | Stargazers: 457 | Issues: 15 | Issues: 10

deep-dive-into-spark

Workshop on optimizing PySpark pipelines.

Stargazers: 4 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | Stargazers: 1 | Issues: 1 | Issues: 0

Miscellaneous

Scripts and code examples. Includes Spark notes, Jupyter notebook examples for Spark, Impala and Oracle.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 1 | Issues: 0 | Issues: 0

N-Body

N-Body Simulation using OpenMP and OpenMPI in C/C++.

Language: C | License: MIT | Stargazers: 1 | Issues: 3 | Issues: 1

TwitterAPI

My own Twitter Class

Language: Python | Stargazers: 1 | Issues: 2 | Issues: 0

1brc

1️⃣🐝🏎️ The One Billion Row Challenge -- A fun exploration of how quickly 1B rows from a text file can be aggregated with Java

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

dbt-presto

The Presto adapter plugin for dbt (https://getdbt.com)

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Shell | Stargazers: 0 | Issues: 1 | Issues: 0

druid

Apache Druid: a high performance real-time analytics database.

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

flink

Apache Flink

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0
Language: Java | Stargazers: 0 | Issues: 0 | Issues: 0

grok_sdi_educative

Grokking the System Design Interview Course

Stargazers: 0 | Issues: 1 | Issues: 0

incubator-paimon

Apache Paimon (incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking, and efficient real-time analytics.

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0

lists

The definitive list of lists (of lists) curated on GitHub and elsewhere

License: CC0-1.0 | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Java | Stargazers: 0 | Issues: 1 | Issues: 0

matterport-dl

A downloader for matterport virtual tours

License: Unlicense | Stargazers: 0 | Issues: 0 | Issues: 0

Notes

A collection of notes used for personal and technical development.

Stargazers: 0 | Issues: 2 | Issues: 0
Language: Scala | Stargazers: 0 | Issues: 0 | Issues: 0

sqlfluff

A SQL Linter for Humans

Language: Python | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

TakeHomeDataChallenges

My solutions to the book "A Collection of Data Science Take-Home Challenges"

Stargazers: 0 | Issues: 0 | Issues: 0

trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0