Ramses Alexander Coraspe Valdez's repositories

apache-spark-docker

Dockerizing an Apache Spark Standalone Cluster

Language: VBA | License: Apache-2.0 | Stargazers: 41 | Issues: 6 | Issues: 3

data-engineer-challenge

Challenge Data Engineer

Language: Python | License: Apache-2.0 | Stargazers: 25 | Issues: 2 | Issues: 0

pyspark-on-aws-emr

This project offers a ready-to-use AWS EMR template with Spot Fleet and On-Demand Instances, so you can focus on writing PySpark code.

Language: Python | License: Apache-2.0 | Stargazers: 24 | Issues: 4 | Issues: 4

Dropout-Students-Prediction

The goal of this project is to identify students at risk of dropping out of school

data-engineering-challenge-th

Dockerizing a Python script for web scraping and consuming the scraped data using FastAPI (www.metroscubicos.com)

Language: Python | License: Apache-2.0 | Stargazers: 13 | Issues: 3 | Issues: 0

recommendation-system

Build a Content-Based Movie Recommender System (TF-IDF, BM25, BERT)

Language: Python | License: Apache-2.0 | Stargazers: 12 | Issues: 3 | Issues: 0

text-analysis-speeches-amlo

Text analysis of the speeches, conferences and interviews of the current president of Mexico

Language: Jupyter Notebook | Stargazers: 8 | Issues: 3 | Issues: 0

tf-idf

Term Frequency-Inverse Document Frequency from Scratch

Language: Python | Stargazers: 7 | Issues: 3 | Issues: 0

dataengineering-assignment

Prescreening Tasks for Data Engineer

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 6 | Issues: 4 | Issues: 0

Huffman-decoding

A New Approach for Efficient Sequential Decoding of Static Huffman Codes

Language: HTML | Stargazers: 5 | Issues: 3 | Issues: 0

Moving-Average-Spark

How to Compute Moving Average with Spark

distance-metrics

Distance metrics are a key part of many machine learning algorithms, both supervised and unsupervised. They let us measure the similarity between numerical values expressed as data points.

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 4 | Issues: 3 | Issues: 0

Contextual-Data-Transforms

This repository contains the most important contextual data transformation algorithms, which help improve the compression rate achieved by statistical encoders. Ramses Alexander Coraspe Valdez

Language: HTML | License: MIT | Stargazers: 3 | Issues: 3 | Issues: 0

MachineLearning

The repository contains basic experiments using machine learning algorithms with Python

Language: HTML | Stargazers: 3 | Issues: 3 | Issues: 0

Computer-Vision-and-Deep-Learning

This repository contains information on the basic techniques and algorithms used in computer image processing, in addition to some projects related to pattern recognition using deep learning.

Language: Python | Stargazers: 2 | Issues: 3 | Issues: 0

Data-Analytics-with-R

Repository for data analytics course using R

Language: HTML | Stargazers: 2 | Issues: 3 | Issues: 0

GPU-Programming-with-Python

With GPU programming in Python, you can take advantage of the computing power of your graphics processing unit (GPU). We will work with NVIDIA’s CUDA library.

optimizing-public-transportation

Streaming event pipeline around Apache Kafka and its ecosystem. Using public data from the Chicago Transit Authority, we will construct an event pipeline around Kafka that lets us simulate and display the status of train lines in real time.

Language: Python | Stargazers: 2 | Issues: 4 | Issues: 0

SparkSQL-with-Python

This repository has some examples of using Spark and SparkSQL with Python through PySpark

Language: HTML | Stargazers: 2 | Issues: 3 | Issues: 0

burrows-wheeler-transform

Implementation of the Burrows-Wheeler Transform algorithm in Python for data compression

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 3 | Issues: 0

Multiprocessing

Improving the Performance in the Statistical Redistribution of Message Symbols using Architectural patterns for Parallel Programming

Language: HTML | Stargazers: 1 | Issues: 3 | Issues: 0

Python

Software Analysis, Design and Construction with Python

Language: HTML | Stargazers: 1 | Issues: 3 | Issues: 0

Python-recursion

This repository shows the implementation of the most common recursive algorithms

Language: HTML | Stargazers: 1 | Issues: 3 | Issues: 0

wittline.github.io

My GitHub profile

Language: SCSS | License: MIT | Stargazers: 1 | Issues: 3 | Issues: 0

dag-example

Directed acyclic graph

Language: HTML | Stargazers: 0 | Issues: 3 | Issues: 0

document-clustering

Agglomerative Hierarchical Document Clustering

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 3 | Issues: 0

move-to-front

Implementation of the Move-to-Front algorithm in Python for data compression

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 3 | Issues: 0

python-driver

Teradata SQL Driver for Python

License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0

spark

Apache Spark - A unified analytics engine for large-scale data processing

Language: Scala | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

SparkInternals

Notes on the design and implementation of Apache Spark

Stargazers: 0 | Issues: 1 | Issues: 0