Hussein Awala's repositories
spark-on-k8s
A Python package to submit and manage Apache Spark applications on Kubernetes.
async-batcher
A service to batch HTTP requests.
airflow-duckdb
A package to run DuckDB queries from Apache Airflow.
airflow-server
Docker configuration for an Airflow server with LocalExecutor
airflow
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
airflow-ci-infra
Automation around CI infrastructure for Apache Airflow
airflow-client-java
Apache Airflow - OpenAPI Client for Java
airflow-site
Apache Airflow Website
spark-on-k8s-demo
This project explains how to create a multi-environment project using Terraform and Helm, then how to create PySpark jobs and run them in a Kubernetes cluster using the spark-on-k8s operator.
arrow
Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
FastAPI-app
A simple application developed with Python and FastAPI
flask
The Python micro framework for building web applications.
hive
Apache Hive
hudi
Upserts, Deletes And Incremental Processing on Big Data.
iceberg
Apache Iceberg
iceberg-python
Apache PyIceberg
MeteoSpark
meteo-spark is an open-source project that aims to simplify climate data analysis using PySpark, which allows very large files stored in the cloud (S3, GCS, ...) to be processed on a large PySpark cluster managed by YARN or Kubernetes.
micro-cluster-lab
A Docker-based micro cluster lab for experimenting with Dask and Spark (Python and Scala)
Micro-express
Proof of concept of a spec-driven microservices architecture using code generation.
spark
Apache Spark - A unified analytics engine for large-scale data processing
spark-xarray
An experimental project that seeks to integrate PySpark and xarray for climate data analysis.
stream-applications
A repository containing examples of stream processing applications using Spark Structured Streaming, Kafka Streams, and other tools such as Apache Hudi.
terraform-aws-ec2-autoscale-group
Terraform module to provision Auto Scaling Group and Launch Template on AWS
terraform-aws-eks-workers
Terraform module to provision an AWS AutoScaling Group, IAM Role, and Security Group for EKS Workers
the-algorithm
Source code for Twitter's Recommendation Algorithm
the-algorithm-ml
Source code for Twitter's Recommendation Algorithm