Joao Pedro Afonso Cerqueira (jpacerqueira-zz)



Company: FuelBigData.com

Location: London

Home Page: FuelBigData.com

Twitter: @jpacerqueira83


Joao Pedro Afonso Cerqueira's repositories

Akamai-log-Analysis-SparkML-H2o

Transformation of Akamai logs with Spark ETL, and discovery of values and similarities in the logs using SparkML and H2O ML

Language: HTML | Stargazers: 5 | Issues: 1 | Issues: 0

Jupyter_Spark_H2O_Kafka_Client_Setup

This is the core of project lost_saturn. The lost_saturn project is a modern approach to data science, focused on enabling data science in containerised environments everywhere. It was built first as a local setup and then transformed into a container solution. Its tools are centralized in Jupyter, with Spark and H2O.ai AutoML. Ideal for running Jupyter notebooks in WSL (Windows Subsystem for Linux) or in Docker containers with Ubuntu 18.04 LTS.

Language: Jupyter Notebook | Stargazers: 2 | Issues: 0 | Issues: 0

project_lost_saturn

This is the core of project lost_saturn. The lost_saturn project is a modern approach to data science, focused on enabling data science in containerised environments everywhere. It was built first as a local setup and then transformed into a container solution. Its tools are centralized in Jupyter, with Spark and H2O.ai AutoML. Ideal for running Jupyter notebooks in WSL (Windows Subsystem for Linux) or in Docker containers with Ubuntu 18.04 LTS.

Language: Jupyter Notebook | Stargazers: 2 | Issues: 2 | Issues: 0

technical-test-Jupyter-Spark-Delta-Pandas

Technical test GitHub repo for the test container

Language: Jupyter Notebook | Stargazers: 1 | Issues: 1 | Issues: 0

Terraform_start6Nodes_cdh5.xCluster

AWS CLI and Terraform for a 6-node Cloudera CDH cluster with Hadoop, Spark, and Hive

Language: HCL | Stargazers: 1 | Issues: 1 | Issues: 0

airflow-executions

Apache Airflow for K8s clusters with Docker Compose orchestration. Includes examples used in workflows for jobs such as webhooks and web scrapers.

Language: Python | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0

jpac-sparklyr

H2O and sparklyr setup in RStudio, with demos and trials for Hadoop Spark

Language: Java | License: GPL-3.0 | Stargazers: 0 | Issues: 1 | Issues: 0

spark-on-kubernetes

A deployment and setup of Apache Spark for multi-tenant usage in Kubernetes clusters. It deploys one executor per K8s pod and scales linearly.

Language: Python | Stargazers: 0 | Issues: 2 | Issues: 0

SparkElasticSearchPublisher

Elasticsearch publisher using Hadoop as source and Spark 1.6 as ETL engine :: Running package for Cloudera CDH 5.9.0 Cluster

Language: Scala | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

als-benchmark-scripts

Scripts to benchmark distributed Alternating Least Squares (ALS)

Language: Scala | Stargazers: 0 | Issues: 1 | Issues: 0

cluster-management-python-pyspark-ngrams-samples

cluster-management-python-pyspark-ngrams-samples

Language: Java | License: GPL-3.0 | Stargazers: 0 | Issues: 1 | Issues: 0

confluent-kafka-xperiments

Experiments with Confluent Kafka tools and client solutions

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Dockerfile | License: GPL-3.0 | Stargazers: 0 | Issues: 1 | Issues: 0
Language: HCL | Stargazers: 0 | Issues: 1 | Issues: 0

Docker-Container-Jupyter

Docker container for Jupyter notebooks, using another repo as a baseline hook

Stargazers: 0 | Issues: 1 | Issues: 0

FiveCoolTest

Technical assignment

Language: Scala | Stargazers: 0 | Issues: 1 | Issues: 0

Hadoop

Hadoop Cloudera investigations

Language: Shell | Stargazers: 0 | Issues: 2 | Issues: 0

jpac-flume-logs

My adaptation of the flume-logs ingestion process

Language: HTML | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Shell | Stargazers: 0 | Issues: 1 | Issues: 0

TensorFlowJava

TensorFlow in Java. If Google can do it, I can do it!

Language: Java | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0