Scale Unlimited (ScaleUnlimited)

Location: Nevada City, CA

Home Page: http://www.scaleunlimited.com

Scale Unlimited's repositories

flink-crawler

Continuous scalable web crawler built on top of Flink and crawler-commons

Language: Java, License: Apache-2.0, Stargazers: 52, Issues: 11, Issues: 111

cascading.solr

Cascading scheme for Solr

cascading.utils

Utilities for Cascading

Language: Java, Stargazers: 22, Issues: 0, Issues: 0

cascading.avro

Cascading Scheme for the Apache Avro data serialization format

Language: Java, License: NOASSERTION, Stargazers: 19, Issues: 0, Issues: 0

cascading.simpledb

Cascading Tap & Scheme for Amazon's SimpleDB

Language: Java, Stargazers: 12, Issues: 0, Issues: 0

wikipedia-ngrams

Code to split/parse Wikipedia XML dump

Language: Java, Stargazers: 12, Issues: 0, Issues: 0

text-similarity

Source code for blog post series on text features for similarity calculation

Language: Java, Stargazers: 11, Issues: 0, Issues: 0

flink-streaming-kmeans

Simple implementation of KMeans clustering on Flink, using iterations

Language: Java, License: Apache-2.0, Stargazers: 10, Issues: 0, Issues: 0

liblinear-java

Java version of LIBLINEAR

Language: Java, License: BSD-3-Clause, Stargazers: 5, Issues: 0, Issues: 0

cascading.snippets

Snippets of useful Cascading code.

Language: Java, Stargazers: 1, Issues: 0, Issues: 0

ec2instances.info

Amazon EC2 instance comparison site

Language: JavaScript, Stargazers: 1, Issues: 0, Issues: 0

scaleunlimited.github.com

Maven repo for Java components that aren't in a public Maven repo.

Stargazers: 0, Issues: 0, Issues: 0

atomizer

Cascading-based workflow to process noisy record-based data

Language: Java, Stargazers: 0, Issues: 0, Issues: 0

cascading

Cascading is a feature-rich API for defining and executing complex, fault-tolerant data processing workflows on various cluster computing platforms. Please see https://github.com/cwensel/cascading for access to all WIP branches.

License: NOASSERTION, Stargazers: 0, Issues: 0, Issues: 0

cascading.classify

Linear SVM for Cascading-based workflows

Language: Java, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

cascading.cuke

Integration of Cucumber with Cascading

Language: Java, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

cascading.lucene

Cascading 2.0 Scheme for writing out Lucene indexes using Tuple field values.

Language: Java, Stargazers: 0, Issues: 0, Issues: 0

cucumber-jvm

Cucumber for the JVM

License: MIT, Stargazers: 0, Issues: 0, Issues: 0

fastText

Library for fast text representation and classification.

Language: HTML, License: NOASSERTION, Stargazers: 0, Issues: 0, Issues: 0

flink-crawler-ccdemo

Demo of using flink-crawler to extract pages from Common Crawl for a target language

Language: Java, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

flink-multisource

Classes that wrap multiple source functions in useful ways

License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

flink-utils

Utilities for use with Flink

Language: Java, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

fse4j

Java port of the FiniteStateEntropy project on GitHub (https://github.com/Cyan4973/FiniteStateEntropy)

License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

http-fetcher

Wrapper code for Apache HttpClient that provides common page fetching functionality

Language: Java, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

JFastText

Java interface for fastText

Language: Java, License: NOASSERTION, Stargazers: 0, Issues: 0, Issues: 0

lucene-solr

Mirror of Apache Lucene + Solr

Language: Java, Stargazers: 0, Issues: 0, Issues: 0

pinot

Apache Pinot (Incubating) - A realtime distributed OLAP datastore

Language: Java, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

tenaya

Tenaya is code that processes FASTQ files from the Sequence Read Archive (SRA), and identifies reads with bad metadata (e.g. wrong species) and/or bad read data.

Language: Java, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

wikiwords

Code to create mapping from words to Wikipedia article titles (topics) and categories

Language: Java, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0

yahoo-streaming-benchmark

An extension of Yahoo's Benchmarks

Language: Java, License: Apache-2.0, Stargazers: 0, Issues: 0, Issues: 0