Jairo Souza (r4phael)

Company: @intelligentagents

Location: Maceió, Brazil

Organizations
easy-software-ufal

Jairo Souza's starred repositories

SWE-bench

[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?

Language: Python | License: MIT | Stargazers: 1437 | Issues: 0

findpapers

Findpapers: A tool for helping researchers who are looking for related works

Language: Python | License: MIT | Stargazers: 190 | Issues: 0

safaribooks

Download and generate EPUB of your favorite books from O'Reilly Learning (aka Safari Books Online) library.

Language: Python | License: WTFPL | Stargazers: 4517 | Issues: 0

learning-notes

Notes on books I read, talks I watch, articles I study, and papers I love

Language: SCSS | Stargazers: 5023 | Issues: 0

system-design-primer

Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

Language: Python | License: NOASSERTION | Stargazers: 262478 | Issues: 0

astronomer-cosmos

Run your dbt Core projects as Apache Airflow DAGs and Task Groups with a few lines of code

Language: Python | License: Apache-2.0 | Stargazers: 511 | Issues: 0

deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.

Language: Scala | License: Apache-2.0 | Stargazers: 3171 | Issues: 0

awesome-readme

A curated list of awesome READMEs

Stargazers: 17391 | Issues: 0

dbt-core

dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.

Language: Python | License: Apache-2.0 | Stargazers: 9253 | Issues: 0

dotfiles

Settings for various tools I use.

Language: Shell | License: MIT | Stargazers: 936 | Issues: 0

Language: Python | License: MIT | Stargazers: 2 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 43 | Issues: 0

data-engineering-zoomcamp

Free Data Engineering course!

Language: Jupyter Notebook | Stargazers: 23633 | Issues: 0

ligar-cobranca

Automatically call debt collection companies and leave a voice saying "Hello?" over and over.

Language: JavaScript | License: MIT | Stargazers: 1789 | Issues: 0

repodriller

A tool to support researchers in mining software repositories studies

Language: Java | Stargazers: 173 | Issues: 0

awesome-apache-airflow

Curated list of resources about Apache Airflow

Language: Shell | Stargazers: 3608 | Issues: 0

Tokenizer

Fast and customizable text tokenization library with BPE and SentencePiece support

Language: C++ | License: MIT | Stargazers: 269 | Issues: 0

al-folio

A beautiful, simple, clean, and responsive Jekyll theme for academics

Language: HTML | License: MIT | Stargazers: 9824 | Issues: 0

airflow-testing-ci-workflow

(project & tutorial) DAG pipeline tests + CI/CD setup

Language: Python | Stargazers: 83 | Issues: 0

Data-Pipelines-with-Airflow

This project helped me understand the core concepts of Apache Airflow. I created custom operators to stage the data, fill the data warehouse, and run data quality checks as the final step, automating the ETL pipeline and the creation of the data warehouse. Skills include: automating ETL pipelines with Airflow, Python, and Amazon Redshift; writing custom operators for staging data, filling the data warehouse, and validating it through data quality checks; and transforming data from various sources into a star schema optimized for the analytics team's use cases. Technologies used: Apache Airflow, S3, Amazon Redshift, Python.

Language: Python | Stargazers: 62 | Issues: 0

airflow-pentaho-plugin

Pentaho plugin for Apache Airflow - Orchestrate Pentaho transformations and jobs from Airflow

Language: Python | License: Apache-2.0 | Stargazers: 37 | Issues: 0

kedro

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.

Language: Python | License: Apache-2.0 | Stargazers: 9464 | Issues: 0

Cookbook

The Data Engineering Cookbook

License: Apache-2.0 | Stargazers: 13282 | Issues: 0

get5

CS:GO Sourcemod plugin for competitive matches/scrims

Language: SourcePawn | License: GPL-3.0 | Stargazers: 559 | Issues: 0

awesome-production-machine-learning

A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning

License: MIT | Stargazers: 16662 | Issues: 0

awesome-mlops

A curated list of references for MLOps

Stargazers: 12131 | Issues: 0

form-to-google-sheets

Store HTML form submissions in Google Sheets.

Language: JavaScript | License: Apache-2.0 | Stargazers: 4371 | Issues: 0

awesome-seml

A curated list of articles that cover the software engineering best practices for building machine learning applications.

License: CC0-1.0 | Stargazers: 1204 | Issues: 0

seaibib

Software Engineering for AI/ML -- An Annotated Bibliography

License: CC0-1.0 | Stargazers: 301 | Issues: 0

pentaho-pdi-dataset

Set of PDI plugins to more easily work with data sets. We also want to provide unit testing capabilities through input data sets and golden data sets.

Language: Java | License: Apache-2.0 | Stargazers: 30 | Issues: 0