Datamining by krithivasanchandran

Apriori Algorithm - Mining Frequent k+ Itemsets

Our transaction database is built from reviewers on Amazon.com. The items are reviewer ids, and each transaction is the set of all reviewer ids that were used to post a review on a given product. The transaction database follows the standard format seen in class: each line represents one transaction, and within a transaction the items (reviewer ids) are separated by a space character.
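The format described above can be read with a few lines of Java. This is only a minimal sketch, not the code in this repository: the class name TransactionReader, its methods, and the sample reviewer ids (R1, R2, ...) are made up for illustration, and the sketch assumes that repeated ids on a line count once and that blank lines are skipped. It avoids lambdas and streams so that it should also compile on older Java versions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: reads one transaction per line, items separated by whitespace.
final class TransactionReader {

    // Splits one line into its distinct items (reviewer ids), keeping first-seen order.
    static Set<String> parseLine(String line) {
        Set<String> items = new LinkedHashSet<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {          // an empty line yields one empty token
                items.add(token);
            }
        }
        return items;
    }

    // Reads every non-empty line of the input as one transaction.
    static List<Set<String>> readAll(BufferedReader reader) throws IOException {
        List<Set<String>> transactions = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            Set<String> items = parseLine(line);
            if (!items.isEmpty()) {
                transactions.add(items);
            }
        }
        return transactions;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample data; a real run would wrap a FileReader instead.
        String sample = "R1 R2 R3\nR2 R3\nR1 R3 R4\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            List<Set<String>> db = readAll(reader);
            System.out.println(db.size() + " transactions, first: " + db.get(0));
        }
    }
}
```

Storing each transaction as a set makes the later "does this transaction contain the itemset" check a simple containsAll call.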

$ cd krithivasanchandran/DataMining
$ git fetch origin
$ git checkout DataMining

If you're using GitHub for Mac, simply sync your repository and you'll see the new branch.

Implementation

A frequent k-itemset is an itemset of size k (i.e., it has k elements) whose support (the number of times the itemset appears in the transaction database) exceeds minimum_support. Frequent k+ itemsets are all frequent itemsets (i.e., itemsets appearing more than minimum_support times in the data) whose sizes are greater than k.
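To make the definitions concrete, here is a small level-wise Apriori sketch in Java. It is an illustration under stated assumptions rather than the implementation in this repository: the class AprioriSketch and its method names are hypothetical, the sample transactions are made up, support is counted as the number of transactions containing the itemset, "exceeds" is read as strictly greater than minimum_support, and only itemsets with more than k items are reported.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of level-wise frequent itemset mining.
final class AprioriSketch {

    // Convenience helper for building a transaction from item names.
    static Set<String> transaction(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    // Support = number of transactions that contain every item of the candidate.
    static int support(Set<String> candidate, List<Set<String>> db) {
        int count = 0;
        for (Set<String> t : db) {
            if (t.containsAll(candidate)) count++;
        }
        return count;
    }

    // Returns itemsets with support > minSupport and size > k, mapped to their support.
    static Map<Set<String>, Integer> frequentItemsets(List<Set<String>> db, int minSupport, int k) {
        Map<Set<String>, Integer> result = new HashMap<>();
        // Level 1 candidates: every distinct item on its own.
        Set<Set<String>> candidates = new HashSet<>();
        for (Set<String> t : db) {
            for (String item : t) {
                candidates.add(new TreeSet<>(Collections.singleton(item)));
            }
        }
        int size = 1;
        while (!candidates.isEmpty()) {
            // Keep only the candidates whose support exceeds the threshold.
            List<Set<String>> frequent = new ArrayList<>();
            for (Set<String> c : candidates) {
                int s = support(c, db);
                if (s > minSupport) {
                    frequent.add(c);
                    if (size > k) result.put(c, s);
                }
            }
            // Join step: merge pairs of frequent itemsets into candidates one item larger.
            Set<Set<String>> frequentSet = new HashSet<>(frequent);
            Set<Set<String>> next = new HashSet<>();
            for (int i = 0; i < frequent.size(); i++) {
                for (int j = i + 1; j < frequent.size(); j++) {
                    Set<String> union = new TreeSet<>(frequent.get(i));
                    union.addAll(frequent.get(j));
                    if (union.size() == size + 1 && allSubsetsFrequent(union, frequentSet)) {
                        next.add(union);
                    }
                }
            }
            candidates = next;
            size++;
        }
        return result;
    }

    // Prune step: every subset obtained by dropping one item must itself be frequent.
    static boolean allSubsetsFrequent(Set<String> candidate, Set<Set<String>> frequentSet) {
        for (String item : candidate) {
            Set<String> subset = new TreeSet<>(candidate);
            subset.remove(item);
            if (!frequentSet.contains(subset)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up data: five transactions over the items A, B and C.
        List<Set<String>> db = new ArrayList<>();
        db.add(transaction("A", "B", "C"));
        db.add(transaction("A", "B"));
        db.add(transaction("A", "C"));
        db.add(transaction("B", "C"));
        db.add(transaction("A", "B", "C"));
        System.out.println(frequentItemsets(db, 2, 1));
    }
}
```

With this sample data, minimum_support = 2 and k = 1, each of the three pairs appears in three transactions and is reported, while the triple {A, B, C} appears in only two transactions and is dropped. The sketch rescans the whole database for every candidate, which keeps it short but would be slow on a large review dataset.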

Environment

JRE 7 or above
Heap memory allocation: 1024 MB minimum required
Concurrent file reads and writes are implemented

Author

2015, Krithivasan Chandran (@KrithivasanChandran).

Support or Contact

If you run into any issues with the code, contact c.keerthivasan@gmail.com and I’ll help you sort them out.