Thiru-Perumal

Thiru-Perumal's starred repositories

streaming_data_processing

Create streaming data, send it to Kafka, transform it with PySpark, and load it into Elasticsearch and MinIO

Language: Python | Stargazers: 55 | Issues: 0

ease-with-apache-spark

A series of lessons learned from Apache Spark (PySpark), with quick tips and workarounds for everyday problems

Language: Jupyter Notebook | Stargazers: 38 | Issues: 0

Data-Pipelines-with-Airflow

This project helped me understand the core concepts of Apache Airflow. It automates an ETL pipeline and the creation of a data warehouse, using custom operators to stage the data, fill the data warehouse, and run data quality checks as the final step. Skills include: automating ETL pipelines with Airflow, Python, and Amazon Redshift; writing custom operators for staging data, filling the data warehouse, and validating it through data quality checks; and transforming data from various sources into a star schema optimized for the analytics team’s use cases. Technologies used: Apache Airflow, S3, Amazon Redshift, Python.

Language: Python | Stargazers: 68 | Issues: 0
Language: Jupyter Notebook | Stargazers: 1 | Issues: 0

griffin

Mirror of Apache Griffin

Language: Scala | License: Apache-2.0 | Stargazers: 1122 | Issues: 0
Language: Python | License: Apache-2.0 | Stargazers: 91 | Issues: 0

dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Language: Python | License: Apache-2.0 | Stargazers: 9495 | Issues: 0
Language: Python | Stargazers: 4 | Issues: 0

pandas_dq

Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.

Language: Python | License: Apache-2.0 | Stargazers: 125 | Issues: 0
Language: Python | License: MIT | Stargazers: 6 | Issues: 0

Cloudera_Material

Cloudera_Material: Study material to help people prepare for the Cloudera CCA Spark and Hadoop Developer Exam (CCA175). Feel free to collaborate.

License: MIT | Stargazers: 32 | Issues: 0

Stock-Prediction-Models

Gathers machine learning and deep learning models for stock forecasting, including trading bots and simulations

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 7804 | Issues: 0

data-engineering-interview-questions

More than 2,000 data engineer interview questions.

Stargazers: 1010 | Issues: 0

codewars-handbook

A code warrior's handbook 🐱‍💻

Language: Java | License: MIT | Stargazers: 110 | Issues: 0

python-poetry-docker-example

Example of integrating Poetry with Docker leveraging multi-stage builds.

Language: Dockerfile | License: MIT | Stargazers: 355 | Issues: 0

Hackerrank_Python_Domain_Solutions

Solutions to challenges in the HackerRank Python domain

Language: Python | Stargazers: 416 | Issues: 0

Hackerrank-Problem-Solving-Python-Solutions

HackerRank Problem Solving solutions in Python

Language: Python | Stargazers: 460 | Issues: 0

HackerRank-Solutions

HackerRank concepts & solutions

Language: C++ | Stargazers: 535 | Issues: 0

Hackerrank

Solutions to the practice exercises, coding challenges, and other problems on Hackerrank! www.Hackerrank.com

Language: Python | Stargazers: 319 | Issues: 0

HackerRank

HackerRank solutions in Java/JS/Python/C++/C#

Language: Java | License: MIT | Stargazers: 1272 | Issues: 0

HackerrankPractice

170+ solutions to Hackerrank.com practice problems using Python 3, C++ and Oracle SQL

Language: Python | License: MIT | Stargazers: 1019 | Issues: 0

Hackerrank_Python_Solutions

HackerRank Python solutions and challenges.

Stargazers: 83 | Issues: 0
Language: Python | Stargazers: 42 | Issues: 0

spark

Apache Spark - A unified analytics engine for large-scale data processing

Language: Scala | License: Apache-2.0 | Stargazers: 39170 | Issues: 0

cheat-sheet

Collection of cheat sheets

Stargazers: 323 | Issues: 0

scenic

Scenic: A Jax Library for Computer Vision Research and Beyond

Language: Python | License: Apache-2.0 | Stargazers: 3209 | Issues: 0

PythonDataIES

The course site for the Data Processing in Python course from IES

Language: Jupyter Notebook | Stargazers: 44 | Issues: 0

dask

Parallel computing with task scheduling

Language: Python | License: BSD-3-Clause | Stargazers: 12339 | Issues: 0

python-machine-learning-book

The "Python Machine Learning (1st edition)" book code repository and info resource

Language: Jupyter Notebook | License: MIT | Stargazers: 12206 | Issues: 0

best-of-ml-python

🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.

License: CC-BY-SA-4.0 | Stargazers: 16186 | Issues: 0