jitkasem pintaya (jitkasempin)

Company: Predictive

Location: Bangkok

jitkasem pintaya's repositories

pyspark_read_write_to_hive

The correct way to read a JSON file on AWS S3 with PySpark

Language: Python | Stargazers: 1 | Issues: 0

poc_streaming_twitter_to_kafka_to_spark_to_hdfs

A proof-of-concept data pipeline that reads the Twitter stream and stores tweet data in HDFS

Language: Python | Stargazers: 0 | Issues: 0

airflow-maintenance-dags

A series of DAGs/Workflows to help maintain the operation of Airflow

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

automl-gs

Provide an input CSV and a target field to predict, generate a model + code to run it.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

avro-fastserde

Fast Apache Avro serialization/deserialization library

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0

avro-util

Collection of utilities to allow writing java code that operates across a wide range of avro versions.

Language: Java | License: BSD-2-Clause | Stargazers: 0 | Issues: 0

bigquery-ml-templates

BigQuery ML SQL templates for common marketing use cases

Language: TSQL | License: Apache-2.0 | Stargazers: 0 | Issues: 0

BigQueryML-Examples

Practical BigQuery ML Examples

Stargazers: 0 | Issues: 0

cloud-opensource-python

Dependency Management Toolkit for Google Cloud Python Projects

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

code-snippets

Small Google Cloud Platform examples and code snippets.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

Data-Wrangling-with-Python

Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices

Language: Jupyter Notebook | License: MIT | Stargazers: 0 | Issues: 0

DataflowTemplates

Google-provided Cloud Dataflow template pipelines for solving simple in-Cloud data tasks

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0

datalake

Data Lake template

Language: Shell | Stargazers: 0 | Issues: 0

dbeam

DBeam extracts SQL tables using JDBC and Apache Beam

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0

getting_started_with_pyspark

Materials for the class "Getting Started with PySpark"

Language: Jupyter Notebook | Stargazers: 0 | Issues: 0

hadoopoffice

HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0

kaniko

Build Container Images In Kubernetes

Language: Go | License: Apache-2.0 | Stargazers: 0 | Issues: 0

lambda-arch

Applying the Lambda Architecture with Spark, Kafka, and Cassandra.

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 0

mlflow

Open source platform for the machine learning lifecycle

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

modin

Modin: Speed up your Pandas workflows by changing a single line of code

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

nuclio

High-Performance Serverless event and data processing platform

Language: Go | License: Apache-2.0 | Stargazers: 0 | Issues: 0

oreilly_advanced_sql_for_data

Resources for the O'Reilly Online Training "Advanced SQL For Data Analysis"

Stargazers: 0 | Issues: 0

pro-devops-with-google-cloud-platform

Source Code for 'Pro DevOps with Google Cloud Platform' by Pierluigi Riti

Language: Python | License: NOASSERTION | Stargazers: 0 | Issues: 0

PyMySQL

Pure Python MySQL Client

Language: Python | License: MIT | Stargazers: 0 | Issues: 0

python-mysql-replication

Pure Python implementation of the MySQL replication protocol, built on top of PyMySQL

Language: Python | Stargazers: 0 | Issues: 0

sope

Sope - Apache Spark ETL Utilities

Language: Scala | License: Apache-2.0 | Stargazers: 0 | Issues: 0

tableschema-py

A Python library for working with Table Schema.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0