Francis Joseph's starred repositories

cpython

The Python programming language

Language: Python | License: NOASSERTION | Stargazers: 61370 | Issues: 0

the-algorithm-ml

Source code for Twitter's Recommendation Algorithm

Language: Python | License: AGPL-3.0 | Stargazers: 9983 | Issues: 0

comprehensive-rust

This is the Rust course used by the Android team at Google. It provides you the material to quickly teach Rust.

Language: Rust | License: Apache-2.0 | Stargazers: 26874 | Issues: 0

spark

Apache Spark - A unified analytics engine for large-scale data processing

Language: Scala | License: Apache-2.0 | Stargazers: 39008 | Issues: 0

solana

Web-Scale Blockchain for fast, secure, scalable, decentralized apps and marketplaces.

Language: Rust | License: Apache-2.0 | Stargazers: 12764 | Issues: 0

polars

Dataframes powered by a multithreaded, vectorized query engine, written in Rust

Language: Rust | License: NOASSERTION | Stargazers: 28280 | Issues: 0

gevent

Coroutine-based concurrency library for Python

Language: Python | License: NOASSERTION | Stargazers: 6208 | Issues: 0

hudi

Upserts, Deletes And Incremental Processing on Big Data.

Language: Java | License: Apache-2.0 | Stargazers: 5248 | Issues: 0

learn-regex

Learn regex the easy way

License: MIT | Stargazers: 45387 | Issues: 0

scala-algorithms

Algorithms and Data Structures in Scala

Language: Scala | License: GPL-3.0 | Stargazers: 28 | Issues: 0

Scala

All Algorithms implemented in Scala

Language: Scala | License: MIT | Stargazers: 1059 | Issues: 0

GoogleSummerOfCode

Ideas list for GSoC 2024 mentored by Scala Center

Stargazers: 37 | Issues: 0

zio

ZIO — A type-safe, composable library for async and concurrent programming in Scala

Language: Scala | License: Apache-2.0 | Stargazers: 4036 | Issues: 0

fe

Emerging smart contract language for the Ethereum blockchain.

Language: Rust | License: NOASSERTION | Stargazers: 1594 | Issues: 0

blockchain-documentation-project

Blockchain written in Python - A Documentation Project

Language: Python | License: BSD-2-Clause | Stargazers: 98 | Issues: 0

full-blockchain-solidity-course-py

Ultimate Solidity, Blockchain, and Smart Contract - Beginner to Expert Full Course | Python Edition

License: MIT | Stargazers: 10700 | Issues: 0

pyspark-cheatsheet

PySpark Cheat Sheet - example code to help you learn PySpark and develop apps faster

Language: Python | License: CC0-1.0 | Stargazers: 382 | Issues: 0

ethereum-etl-airflow

Airflow DAGs for exporting, loading, and parsing the Ethereum blockchain data. How to get any Ethereum smart contract into BigQuery https://towardsdatascience.com/how-to-get-any-ethereum-smart-contract-into-bigquery-in-8-mins-bab5db1fdeee

Language: Python | License: MIT | Stargazers: 396 | Issues: 0

kafka-crypto-questdb

Using Kafka to track cryptocurrency price trends

Language: Python | License: Apache-2.0 | Stargazers: 62 | Issues: 0

Dopefolio

Dopefolio 🔥 - Portfolio Template for Developers 🚀

Language: HTML | License: GPL-3.0 | Stargazers: 3314 | Issues: 0

Production-of-Cryptocurrency-Data-Lake-Using-Spark-

This project is an ETL pipeline that processes structured financial data and unstructured social media data related to cryptocurrencies (a dataset with millions of records), in preparation for exploring the relationship between the price trends of cryptocurrency assets and the sentiment on social media platforms. Uses Python, Spark, the Binance API, etc. to extract trade data from the cryptocurrency exchange platform, transform it into market data on AWS EMR, and store it in an AWS S3 bucket. Uses Python, Spark, the Twitter API, etc. to extract tweets from the Twitter platform, then transform and store them in an AWS S3 bucket. Performs data quality checks on the tweets and market data and persists them in an AWS S3 bucket. Utilized: Python, PySpark, Spark, SQL, AWS, Amazon S3, AWS EMR, Binance API, Twitter API, Data Quality, Structured Data, Unstructured Data, Data Lake, ETL, Big Data, Hadoop.

Language: Jupyter Notebook | Stargazers: 2 | Issues: 0

PySpark-Confluent-Kafka-Apache-Drill-

A code-based tutorial for production-level data streaming with PySpark plus Optimus for data cleaning, Confluent Kafka, and Apache Drill, using Docker and Cassandra (NoSQL DB) for storage; this allows for fast feature engineering and data cleaning.

Language: Jupyter Notebook | Stargazers: 26 | Issues: 0

pyspark-boilerplate-mehdio

PySpark boilerplate for running a prod-ready data pipeline

Language: Python | License: MIT | Stargazers: 29 | Issues: 0

Spark-Programming-In-Python

Apache Spark 3 - Spark Programming in Python for Beginners

Language: Python | License: MIT | Stargazers: 343 | Issues: 0

data-science-ipython-notebooks

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

Language: Python | License: NOASSERTION | Stargazers: 26941 | Issues: 0

pyspark-cheatsheet

🐍 Quick reference guide to common patterns & functions in PySpark.

License: MIT | Stargazers: 388 | Issues: 0

pyspark-pictures

Learn the pyspark API through pictures and simple examples

Language: Jupyter Notebook | License: MIT | Stargazers: 168 | Issues: 0

pyspark-examples

Pyspark RDD, DataFrame and Dataset Examples in Python language

Language: Python | Stargazers: 1128 | Issues: 0

pyspark-example-project

Implementing best practices for PySpark ETL jobs and applications.

Language: Python | Stargazers: 1583 | Issues: 0

flintrock

A command-line tool for launching Apache Spark clusters.

Language: Python | License: Apache-2.0 | Stargazers: 636 | Issues: 0