Roger M (rogersmarin)



Location: Melbourne, Australia

Home Page: rogersmarin.com


Roger M's repositories

spark

Mirror of Apache Spark

Language: Scala | License: Apache-2.0 | Stargazers: 1 | Issues: 1 | Issues: 0

wikipedia-extractor

Extracts and cleans text from a Wikipedia database dump and stores the output in a number of similarly sized files in a given directory. This is a mirror of the script by Giuseppe Attardi.

Language: Python | Stargazers: 1 | Issues: 2 | Issues: 0

Avro-Schema-Generator

Tool which generates Avro schemas and Java bindings from XML schemas.

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

azkaban

Azkaban workflow manager.

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Java | Stargazers: 0 | Issues: 1 | Issues: 0

druid

Real-time Exploratory Analytics on Large Datasets

Language: Java | License: GPL-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

elasticsearch-hadoop

Elasticsearch real-time search and analytics natively integrated with Hadoop

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

go

The Open Source Data Science Masters

License: Unlicense | Stargazers: 0 | Issues: 1 | Issues: 0

hive

Mirror of Apache Hive

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 2 | Issues: 0

kafka

Mirror of Apache Kafka

Language: Scala | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

kafka-1

A high-throughput, distributed, publish-subscribe messaging system

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0

machine-learning

Content for Udacity's Machine Learning curriculum

Language: Jupyter Notebook | Stargazers: 0 | Issues: 1 | Issues: 0

pipeline

End-to-End, Real-time, Advanced Analytics Big Data Reference Pipeline using Spark, Spark SQL, Spark ML, GraphX, Spark Streaming, Kafka, Cassandra, ElasticSearch, Redis, Tachyon, HDFS, Zeppelin, Spark-Notebook, iPython/Jupyter Notebook, Tableau. See https://github.com/fluxcapacitor/pipeline/wiki for Setup Instructions.

Language: Shell | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

presto

Distributed SQL query engine for running interactive analytic queries against big data sources.

Language: Java | License: Apache-2.0 | Stargazers: 0 | Issues: 1 | Issues: 0
Language: Scala | Stargazers: 0 | Issues: 1 | Issues: 0

toy-robot-simulator

A Scala + Akka implementation of the Toy Robot application

Language: Scala | Stargazers: 0 | Issues: 2 | Issues: 0

vowpal_wabbit

John Langford's original release of Vowpal Wabbit -- a fast online learning algorithm

Language: C++ | License: NOASSERTION | Stargazers: 0 | Issues: 1 | Issues: 0

word2vec-wikipedia-spark

Utility to perform feature extraction via spark-word2vec on the Wikipedia (en) dataset

Language: Java | Stargazers: 0 | Issues: 1 | Issues: 1