Dogukan Ulu's repositories

kafka_spark_structured_streaming

Get data from an API, run a scheduled script with Airflow, send the data to Kafka, consume it with Spark, then write it to Cassandra
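
The produce step of a pipeline like this can be sketched as below — a minimal example assuming kafka-python and a local broker; the API endpoint and topic name are hypothetical placeholders, not taken from the repo:

```python
import json


def to_kafka_message(record: dict) -> bytes:
    """Serialize an API record into a UTF-8 JSON payload for Kafka."""
    return json.dumps(record, sort_keys=True).encode("utf-8")


def main():
    # Hypothetical endpoint and topic; requires `requests` and `kafka-python`
    # plus a broker listening on localhost:9092.
    import requests
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    record = requests.get("https://example.com/api/users").json()
    producer.send("users_created", to_kafka_message(record))
    producer.flush()


if __name__ == "__main__":
    main()
```

In practice the `main` body would live inside an Airflow task so the fetch-and-produce step runs on a schedule.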

streaming_data_processing

Create streaming data, transfer it to Kafka, transform it with PySpark, and deliver it to Elasticsearch and MinIO

Language: Python · Stargazers: 49 · Issues: 0
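
The Elasticsearch leg of such a pipeline can be sketched as follows — a minimal example assuming kafka-python and the official `elasticsearch` client; the topic and index names are hypothetical:

```python
import json


def to_bulk_action(index: str, doc_id: str, doc: dict) -> dict:
    """Shape one record as an action for elasticsearch.helpers.bulk."""
    return {"_index": index, "_id": doc_id, "_source": doc}


def main():
    # Hypothetical topic/index names; requires `kafka-python` and
    # `elasticsearch`, with both services running locally.
    from elasticsearch import Elasticsearch, helpers
    from kafka import KafkaConsumer

    es = Elasticsearch("http://localhost:9200")
    consumer = KafkaConsumer("input_topic", bootstrap_servers="localhost:9092")
    for msg in consumer:
        doc = json.loads(msg.value)
        helpers.bulk(es, [to_bulk_action("stream_index", str(msg.offset), doc)])


if __name__ == "__main__":
    main()
```

The PySpark transformation and the MinIO sink would sit between the consumer and the bulk write; this sketch only shows the Kafka-to-Elasticsearch hop.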

airflow_kafka_cassandra_mongodb

Produce Kafka messages, consume them, and load them into Cassandra and MongoDB.

Language: Python · Stargazers: 33 · Issues: 0
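
The Cassandra load step can be sketched with a small helper that turns a row dict into a parameterized CQL statement — a sketch assuming `cassandra-driver`; the keyspace and table names are hypothetical:

```python
def build_insert_cql(keyspace: str, table: str, row: dict) -> tuple:
    """Build a parameterized INSERT statement for the Cassandra driver."""
    cols = ", ".join(row)
    placeholders = ", ".join("%s" for _ in row)
    cql = f"INSERT INTO {keyspace}.{table} ({cols}) VALUES ({placeholders})"
    return cql, tuple(row.values())


def main():
    # Hypothetical keyspace/table; requires `cassandra-driver` and a local node.
    from cassandra.cluster import Cluster

    session = Cluster(["localhost"]).connect()
    cql, params = build_insert_cql("pipeline_ks", "users", {"id": 1, "name": "ada"})
    session.execute(cql, params)


if __name__ == "__main__":
    main()
```

The MongoDB side is simpler since `pymongo` accepts the row dict directly via `insert_one`.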

csv_extract_airflow_docker

Write a CSV file to Postgres, read the table and modify it, then write more tables to Postgres with Airflow.

Language: Python · Stargazers: 31 · Issues: 2
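
The CSV-to-Postgres step can be sketched as below — the parsing helper is stdlib-only, while the load step assumes pandas and SQLAlchemy; the connection string, file path, and table name are hypothetical:

```python
import csv
import io


def read_rows(csv_text: str) -> list:
    """Parse CSV text into one dict per row, ready to load into a table."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def load_to_postgres():
    # Hypothetical connection string and table name; requires `pandas`
    # and `sqlalchemy`. In the Airflow setup this would run as a task.
    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine("postgresql://airflow:airflow@postgres:5432/airflow")
    with open("data.csv") as f:
        df = pd.DataFrame(read_rows(f.read()))
    df.to_sql("raw_table", engine, if_exists="replace", index=False)


if __name__ == "__main__":
    load_to_postgres()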

docker-airflow

Docker Apache Airflow

Language: Shell · License: Apache-2.0 · Stargazers: 12 · Issues: 0

crypto_api_kafka_airflow_streaming

Get crypto data from an API and stream it to Kafka with Airflow. Write the data to MySQL and visualize it with Metabase

Language: Python · Stargazers: 11 · Issues: 0

aws_end_to_end_streaming_pipeline

An AWS Data Engineering End-to-End Project (Glue, Lambda, Kinesis, Redshift, QuickSight, Athena, EC2, S3)

Language: Python · Stargazers: 9 · Issues: 1
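
The Kinesis ingestion leg of an end-to-end pipeline like this can be sketched as below — a minimal example assuming boto3; the stream name, region, and partition-key field are hypothetical:

```python
import hashlib
import json


def partition_key(record: dict, field: str = "user_id") -> str:
    """Derive a stable partition key so related records land on the same shard."""
    return hashlib.md5(str(record.get(field, "")).encode("utf-8")).hexdigest()


def main():
    # Hypothetical stream name and region; requires `boto3` and AWS credentials.
    import boto3

    kinesis = boto3.client("kinesis", region_name="us-east-1")
    record = {"user_id": 42, "event": "click"}
    kinesis.put_record(
        StreamName="example-stream",
        Data=json.dumps(record).encode("utf-8"),
        PartitionKey=partition_key(record),
    )


if __name__ == "__main__":
    main()
```

Hashing the key field keeps the key length bounded while still distributing records across shards.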

glue_etl_job_data_catalog_s3

A Glue ETL job (or EMR Spark job) that reads from the Data Catalog, modifies the data, and uploads it to S3 and the Data Catalog

Language: Jupyter Notebook · Stargazers: 9 · Issues: 0

parquet_gcs_bucket_to_bigquery_table

Regularly fetch Parquet files from a public GCS bucket and write them to a BigQuery table

Language: Python · Stargazers: 9 · Issues: 2
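
The GCS-to-BigQuery load can be sketched with a BigQuery load job — a minimal example assuming `google-cloud-bigquery`; the bucket, prefix, and table ID are hypothetical:

```python
def gcs_uri(bucket: str, prefix: str) -> str:
    """Build the wildcard GCS URI a BigQuery load job expects."""
    return f"gs://{bucket}/{prefix}*.parquet"


def main():
    # Hypothetical project/dataset/bucket names; requires
    # `google-cloud-bigquery` and GCP credentials.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.LoadJobConfig(source_format=bigquery.SourceFormat.PARQUET)
    load_job = client.load_table_from_uri(
        gcs_uri("public-bucket", "daily/"),
        "my-project.my_dataset.my_table",
        job_config=job_config,
    )
    load_job.result()  # block until the load completes


if __name__ == "__main__":
    main()
```

Parquet files carry their own schema, so no explicit schema definition is needed in the job config.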

kaggle_projects

In this repository, I created ML algorithms for various Kaggle competitions

Language: Jupyter Notebook · Stargazers: 5 · Issues: 0

s3_trigger_lambda_to_rds

Send a dataframe to S3 automatically, trigger a Lambda that modifies it, and upload the result to RDS

Language: Python · Stargazers: 5 · Issues: 1
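
The S3-triggered Lambda can be sketched as below — the event-parsing helper follows the standard S3 put-event shape, while the download step assumes boto3; the RDS upload is left as a comment since its details depend on the repo:

```python
def extract_object(event: dict) -> tuple:
    """Pull the bucket name and object key out of an S3 put-event payload."""
    rec = event["Records"][0]["s3"]
    return rec["bucket"]["name"], rec["object"]["key"]


def lambda_handler(event, context):
    # Requires `boto3` (bundled in the Lambda runtime).
    import boto3

    bucket, key = extract_object(event)
    obj = boto3.client("s3").get_object(Bucket=bucket, Key=key)
    body = obj["Body"].read()
    # ...modify the dataframe here and upload it to RDS (e.g. via SQLAlchemy)...
    return {"bucket": bucket, "key": key, "bytes": len(body)}
```

Returning the bucket/key pair makes the invocation easy to verify in the Lambda console logs.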

send_data_to_aws_services

This repo automates sending remote data to AWS services such as Kinesis and S3.

Language: Python · Stargazers: 5 · Issues: 1

dogukannulu

My personal repo

Stargazers: 4 · Issues: 0

csv_to_kinesis_streams

Writes a CSV file to Amazon Kinesis Data Streams

Language: Python · Stargazers: 3 · Issues: 0

twitter_etl_s3

Get data via the Twitter API, orchestrate with Airflow, and store it in an S3 bucket

Language: Python · Stargazers: 3 · Issues: 2

amazon_msk_kafka_streaming

Create a Kafka topic, stream data to the producer, and consume it on the console using Amazon MSK

Language: Python · Stargazers: 2 · Issues: 1

data-generator

This repo generates data from an existing dataset to a file, or produces dataset rows as messages to Kafka in a streaming manner.

Stargazers: 2 · Issues: 0
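
The core of a data generator like this can be sketched with stdlib tools only — cycle through dataset rows and yield them at a fixed pace; the interval and row limit are illustrative parameters, not taken from the repo:

```python
import itertools
import time


def repeat_rows(rows: list, limit: int):
    """Cycle through dataset rows until `limit` rows have been produced."""
    return itertools.islice(itertools.cycle(rows), limit)


def stream(rows: list, limit: int, interval: float = 1.0):
    """Yield rows one by one, sleeping between them to simulate streaming."""
    for row in repeat_rows(rows, limit):
        time.sleep(interval)
        # Each yielded row could be appended to a file or sent to Kafka.
        yield row
```

For example, `for row in stream(dataset_rows, limit=100, interval=0.5): ...` emits two rows per second, restarting from the top of the dataset when it runs out.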

datasets

This repo contains datasets used in trainings.

License: GPL-3.0 · Stargazers: 2 · Issues: 0

IBM-Data-Science-Capstone-Project

This repository was created for the IBM Data Science Professional Certificate capstone project

Language: Jupyter Notebook · Stargazers: 2 · Issues: 0

read_from_s3_upload_to_rds

Upload remote data into Amazon S3, then read it and load it into Amazon RDS MySQL

Language: Jupyter Notebook · Stargazers: 2 · Issues: 1

prefect-example-flows

Create sample Prefect flows, deploy them as Docker containers, and store them in GitHub

Language: Python · Stargazers: 0 · Issues: 0

snowpipe-aws-stream-processing

Get streaming data from an S3 bucket via an SQS queue, load it into Snowflake with Snowpipe, and modify the data with a Snowflake task

Language: Python · Stargazers: 0 · Issues: 1