mattyb149 / pig

Mirror of Apache Pig

Apache Pig
===========
Pig is a dataflow programming environment for processing very large files. Pig's
language is called Pig Latin. A Pig Latin program consists of a directed
acyclic graph where each node represents an operation that transforms data.
Operations come in two flavors: (1) relational-algebra style operations such as
join, filter, and project; (2) functional-programming style operators such as
map and reduce.
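
To make the dataflow idea concrete, here is a minimal sketch in plain Python
(not Pig Latin, and not using any Pig API). The sample records and the helper
names `filter_rows`, `project`, and `group_sum` are hypothetical and exist only
for illustration; they mimic a filter node, a project node, and a group/reduce
node chained together.

```python
from collections import defaultdict

# Hypothetical in-memory records: (user, url, bytes_sent).
records = [
    ("alice", "/index", 1200),
    ("bob", "/index", 300),
    ("alice", "/about", 4500),
    ("carol", "/index", 2500),
]


def filter_rows(rows, predicate):
    """Relational-style filter: keep only rows matching the predicate."""
    return [row for row in rows if predicate(row)]


def project(rows, *indexes):
    """Relational-style projection: keep only the chosen columns."""
    return [tuple(row[i] for i in indexes) for row in rows]


def group_sum(rows):
    """Group (key, value) pairs by key and reduce each group with a sum."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return dict(totals)


# Each step consumes the output of the previous one, so the three nodes
# form a simple chain, the smallest kind of directed acyclic graph.
large = filter_rows(records, lambda row: row[2] > 1000)
pairs = project(large, 0, 2)
totals = group_sum(pairs)
print(totals)
```

In Pig the equivalent steps would be written as Pig Latin statements and
compiled into jobs rather than run eagerly on in-memory lists as they are here.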

Pig compiles these dataflow programs into (sequences of) map-reduce or Apache Tez
jobs and executes them using Hadoop. It is also possible to execute Pig Latin
programs in a "local" mode (without a Hadoop cluster), in which case all
processing takes place in a single local JVM.

General Info
===============

For the latest information about Pig, please visit our website at:

   http://pig.apache.org/

and our wiki, at:

   http://wiki.apache.org/pig/

Getting Started
===============
1. To learn about Pig, try http://wiki.apache.org/pig/PigTutorial
2. To build and run Pig, try http://wiki.apache.org/pig/BuildPig and
   http://wiki.apache.org/pig/RunPig
3. To check out the function library, try http://wiki.apache.org/pig/PiggyBank


Contributing to the Project
===========================

We welcome all contributions. For details, please visit
https://cwiki.apache.org/confluence/display/PIG/HowToContribute

License: Apache License 2.0

