jitkasem pintaya's repositories
pyspark_read_write_to_hive
The correct way to read a JSON file on AWS S3 with PySpark
poc_streaming_twitter_to_kafka_to_spark_to_hdfs
An attempt to build a data pipeline that reads the Twitter stream and stores tweet data in HDFS
airflow-maintenance-dags
A series of DAGs/Workflows to help maintain the operation of Airflow
automl-gs
Provide an input CSV and a target field to predict, and generate a model plus the code to run it.
avro-fastserde
Fast Apache Avro serialization/deserialization library
avro-util
Collection of utilities to allow writing java code that operates across a wide range of avro versions.
bigquery-ml-templates
BigQuery ML SQL templates for common marketing use cases
BigQueryML-Examples
Practical BigQuery ML Examples
cloud-opensource-python
Dependency Management Toolkit for Google Cloud Python Projects
code-snippets
Small Google Cloud Platform examples and code snippets.
Data-Wrangling-with-Python
Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices
DataflowTemplates
Google-provided Cloud Dataflow template pipelines for solving simple in-Cloud data tasks
datalake
Data Lake template
dbeam
DBeam extracts SQL tables using JDBC and Apache Beam
getting_started_with_pyspark
Materials for the class "Getting Started with PySpark"
hadoopoffice
HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
kaniko
Build Container Images In Kubernetes
lambda-arch
Applying the Lambda Architecture with Spark, Kafka, and Cassandra.
mlflow
Open source platform for the machine learning lifecycle
modin
Modin: Speed up your Pandas workflows by changing a single line of code
nuclio
High-Performance Serverless event and data processing platform
oreilly_advanced_sql_for_data
Resources for the O'Reilly Online Training "Advanced SQL For Data Analysis"
pro-devops-with-google-cloud-platform
Source Code for 'Pro DevOps with Google Cloud Platform' by Pierluigi Riti
PyMySQL
Pure Python MySQL Client
python-mysql-replication
Pure Python implementation of the MySQL replication protocol, built on top of PyMySQL
sope
Sope - Apache Spark ETL Utilities
tableschema-py
A Python library for working with Table Schema.