AnkushKhanna / hadoop-apriori

Finding frequent pair using Apriori on Hadoop cascading.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

hadoop-apriori

Finding frequent pair using Apriori on Hadoop cascading.

Original data of 1.5 GB was processed and broken down into important information such as:

Sessions Products-visited("|" separated)

Apriori algorithm is runned on information above and create L1 and L2 files.

Runned sequentially producing L1 and using L1 via DistributedCache to produce L2. All process run on hadoop.

About

Finding frequent pair using Apriori on Hadoop cascading.


Languages

Language:Java 100.0%