Manoj Mallela's repositories

aas

Code to accompany Advanced Analytics with Spark from O'Reilly Media

Language: Scala | License: NOASSERTION | Stargazers: 0 | Issues: 0 | Issues: 0

amazon-kinesis-scaling-utils

The Kinesis Scaling Utility is designed to give you the ability to scale Amazon Kinesis Streams in the same way that you scale EC2 Auto Scaling groups – up or down by a count or as a percentage of the total fleet. You can also simply scale to an exact number of Shards. There is no requirement for you to manage the allocation of the keyspace to Shards when using this API, as it is done automatically.

Language: Java | License: NOASSERTION | Stargazers: 0 | Issues: 2 | Issues: 0

awesome-data-engineering

A curated list of data engineering tools for software developers

Stargazers: 0 | Issues: 2 | Issues: 0

awesome-public-datasets

An awesome list of high-quality open datasets in public domains (on-going).

License: MIT | Stargazers: 0 | Issues: 2 | Issues: 0

awesome-spark

A curated list of awesome Apache Spark packages and resources.

License: CC0-1.0 | Stargazers: 0 | Issues: 1 | Issues: 0

basic-spark

Use Apache Spark like a Swiss Knife.

Stargazers: 0 | Issues: 2 | Issues: 0

nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Language: Python | License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

s3-utils

Utilities and tools based around Amazon S3 to provide convenience APIs in a CLI

Language: Rust | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

tpcds-kit

TPC-DS benchmark kit with some modifications/fixes

Language: C | Stargazers: 0 | Issues: 1 | Issues: 0

datafusion-comet

Apache DataFusion Comet Spark Accelerator

License: Apache-2.0 | Stargazers: 0 | Issues: 0 | Issues: 0
Language: Scala | Stargazers: 0 | Issues: 0 | Issues: 0

db-migration

Databricks Migration Tools

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

director-scripts

Cloudera Director sample code

Language: Shell | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

docker-elk

The ELK stack powered by Docker and Compose.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

flink

Mirror of Apache Flink

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

fluentd-benchmark

Benchmark collection of fluentd use cases

Language: Ruby | Stargazers: 0 | Issues: 2 | Issues: 0
Language: HTML | Stargazers: 0 | Issues: 2 | Issues: 0

gimme-aws-whitepapers

Download AWS White-papers with minimum effort.

License: GPL-3.0 | Stargazers: 0 | Issues: 2 | Issues: 0
Stargazers: 0 | Issues: 2 | Issues: 0

iperf

iperf3: A TCP, UDP, and SCTP network bandwidth measurement tool

Language: C | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

kubernetes-iperf3

Simple wrapper around iperf3 to measure network bandwidth from all nodes of a Kubernetes cluster

Language: Shell | License: MIT | Stargazers: 0 | Issues: 1 | Issues: 0

LastFM-LogAnalyzer

An Apache Spark application to analyze LastFM's userActivity logs

Language: Scala | Stargazers: 0 | Issues: 2 | Issues: 0
Stargazers: 0 | Issues: 0 | Issues: 0
Language: Jupyter Notebook | Stargazers: 0 | Issues: 0 | Issues: 0

matplotlib-cheatsheet

Matplotlib 3.1 cheat sheet

Language: Python | License: BSD-2-Clause | Stargazers: 0 | Issues: 1 | Issues: 0

scala-style-guide

Databricks Scala Coding Style Guide

Stargazers: 0 | Issues: 2 | Issues: 0
Language: Scala | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

tensorframes

Tensorflow wrapper for DataFrames on Apache Spark

Language: Scala | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

terraform-aws-eks

A Terraform module to create an Elastic Kubernetes (EKS) cluster and associated worker instances on AWS.

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0

zookeeper

Mirror of Apache Hadoop ZooKeeper

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0