hanborq / pigml

An Apache Pig-based machine learning pack for big data

  1. What's PigML?

    PigML is a machine learning pack leveraging Apache Pig.

    Machine learning algorithms in PigML typically consist of: a) Pig UDFs, the atomic core of each algorithm; b) PigLatin scripts, which connect the UDFs and move data between them, including loading, storing, filtering, joining, and grouping; c) shell scripts, which connect the PigLatin scripts and move data between them.
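    As a rough illustration of the kind of atomic computation a UDF might carry, the sketch below shows a hypothetical distance step such as a clustering algorithm could use. The class and method names are assumptions made for this example and are not taken from PigML's code. In Pig, logic like this would normally sit inside the `exec(Tuple)` method of a class extending `org.apache.pig.EvalFunc`; that wrapper is left out here so the sketch compiles without Pig on the classpath.

```java
// Hypothetical helper, not part of PigML: the per-record computation
// that a Pig UDF could wrap and expose to PigLatin scripts.
final class SquaredDistance {
    private SquaredDistance() {}

    /** Returns the squared Euclidean distance between two equal-length vectors. */
    static double squaredDistance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }
}
```

    Keeping the numeric core in a plain static method like this is one possible design choice: it can be unit-tested without a Pig runtime, while the UDF wrapper only handles unpacking the input tuple.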

  2. Why PigML?

    When it comes to big data and machine learning, there are alternatives such as Apache Mahout. Yet we believe Pig is well suited to this job in terms of both development efficiency and runtime flexibility.

  3. How to use it?

    Descriptions of the algorithms and ready-to-run sample Pig scripts (PigLatin) will accompany the code.

  4. How to contribute?

    [TODO]

Languages

Language:Java 100.0%