28-02-2011, 10:51 AM
PREDICTING MISSING ITEMS IN SHOPPING CARTS
Abstract:
Existing research in association mining has focused mainly on how to expedite the search for frequently co-occurring groups of items in “shopping cart” type of transactions; less attention has been paid to methods that exploit these “frequent itemsets” for prediction purposes. This paper contributes to the latter task by proposing a technique that uses partial information about the contents of a shopping cart for the prediction of what else the customer is likely to buy. Using the recently proposed data structure of itemset trees (IT-trees), we obtain, in a computationally efficient manner, all rules whose antecedents contain at least one item from the incomplete shopping cart. Then, we combine these rules by uncertainty processing techniques, including the classical Bayesian decision theory and a new algorithm based on the Dempster-Shafer (DS) theory of evidence combination.
Existing System:
If j is the item whose absence or presence is to be predicted, then for a given itemset s the existing technique identifies, among the rules whose antecedents are subsumed by s, those with the highest precedence according to the rules’ reliability, which is assessed from their confidence and support values. The highest-precedence rule is then used to predict j. The method suffers from three shortcomings. First, it is clearly not suitable in domains with many distinct items j. Second, the consequent is predicted from the “testimony” of a single rule, ignoring the simple fact that rules with the same antecedent can imply different consequents; a method to combine these rules is needed. Third, the system may be sensitive to the subjective, user-specified support and confidence thresholds.
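As a rough illustration of the single-rule scheme described above, the following Python sketch picks one rule for a target item from the rules whose antecedents are contained in the cart. The `Rule` record, the function name `best_rule`, the sample rules and the precedence order (confidence first, support as tie-breaker) are all assumptions made for the example; the text only says that precedence depends on confidence and support.

```python
from typing import FrozenSet, List, NamedTuple, Optional


class Rule(NamedTuple):
    """Hypothetical rule record: antecedent -> (item present / absent)."""
    antecedent: FrozenSet[str]
    item: str
    present: bool
    confidence: float
    support: float


def best_rule(cart: FrozenSet[str], rules: List[Rule], item: str) -> Optional[Rule]:
    """Return the highest-precedence rule about `item` whose antecedent is a subset of `cart`."""
    applicable = [r for r in rules if r.item == item and r.antecedent <= cart]
    if not applicable:
        return None
    # Assumed precedence: higher confidence wins, support breaks ties.
    return max(applicable, key=lambda r: (r.confidence, r.support))


# Made-up rules and cart, for illustration only.
rules = [
    Rule(frozenset({"butter"}), "bread", True, 0.80, 0.10),
    Rule(frozenset({"butter", "milk"}), "bread", True, 0.90, 0.05),
    Rule(frozenset({"eggs"}), "bread", False, 0.85, 0.07),
]
cart = frozenset({"butter", "milk", "eggs"})

choice = best_rule(cart, rules, "bread")
if choice is not None:
    print(choice.present, choice.confidence)
```

In this toy run the rule with antecedent {eggs}, which argues against bread, applies to the cart but is simply discarded because another rule has higher confidence. That is the second shortcoming listed above.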
An early paper by Bayardo and Agrawal reports a method to convert frequent itemsets into rules. Some papers then suggest that a selected item can be treated as a binary class (absence = 0; presence = 1) whose value can be predicted by such rules. A user asks: does the current status of the shopping cart suggest that the customer will buy bread? If yes, how reliable is this prediction? Early attempts achieved promising results, and some authors even observed that the classification performance of association mining systems may compare favorably with that of machine-learning techniques.
Some of these weaknesses are alleviated by an approach in which a missing item is predicted in four steps. First, a so-called partitioned-ARM is used to generate a set of association rules (a ruleset). The next step prunes the ruleset (e.g., by removing redundant rules). From the remaining rules, those with the smallest distance from the observed incomplete shopping cart are selected. Finally, the items predicted by these rules are weighted by the similarity of the rules’ antecedents to the shopping cart.
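The last two of these four steps can be sketched as follows. The text does not say which distance or similarity measure is used, so this example assumes Jaccard similarity between a rule's antecedent and the cart, with distance taken as one minus that similarity. The function names, the parameter `k` and the sample rules are likewise hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

RuleT = Tuple[FrozenSet[str], str]  # (antecedent, predicted item)


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard similarity of two itemsets (assumed measure, not from the source)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score_items(cart: FrozenSet[str], rules: List[RuleT], k: int = 2) -> Dict[str, float]:
    """Select the k rules nearest to the cart, then weight their predictions."""
    # Step 3: keep the rules with the smallest distance (1 - similarity).
    nearest = sorted(rules, key=lambda r: 1.0 - jaccard(r[0], cart))[:k]
    # Step 4: each predicted item is weighted by antecedent-to-cart similarity.
    scores: Dict[str, float] = defaultdict(float)
    for antecedent, item in nearest:
        if item not in cart:
            scores[item] += jaccard(antecedent, cart)
    return dict(scores)


# Made-up pruned ruleset and cart, for illustration only.
pruned_rules = [
    (frozenset({"milk", "butter"}), "bread"),
    (frozenset({"milk"}), "cereal"),
    (frozenset({"beer"}), "chips"),
]
scores = score_items(frozenset({"milk", "butter"}), pruned_rules, k=2)
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

Rule generation and pruning (the first two steps) are left out here; the list `pruned_rules` stands in for their output.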
Proposed System:
The mechanism reported in this paper focuses on one of the oldest tasks in association mining: based on incomplete information about the contents of a shopping cart, can we predict which other items the shopping cart contains? Our literature survey indicates that, while some of the recently published systems can be used to this end, their practical utility is constrained, for instance, by being limited to domains with very few distinct items. A Bayesian classifier can be used too, but we are not aware of any systematic study of how it might operate under the diverse circumstances encountered in association mining.
We refer to our technique by the acronym DS-ARM. The underlying idea is simple: when presented with an incomplete list s of items in a shopping cart, our program first identifies all high-support, high-confidence rules that have as antecedent a subset of s. Then, it combines the consequents of all these (sometimes conflicting) rules and creates a set of items most likely to complete the shopping cart. Two major problems complicate the task: first, how to identify the relevant rules in a computationally efficient manner; second, how to combine (and quantify) the evidence of conflicting rules. We addressed the former issue by the recently proposed technique of IT-trees and the latter by a few simple ideas from the DS theory.
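To make the evidence-combination step concrete, here is a minimal Python sketch of Dempster's rule of combination applied to conflicting rules about one item. It is not the paper's DS-ARM algorithm. Two simplifications are assumed: the IT-tree lookup is replaced by a plain linear scan over the rules, and each rule is turned into a simple mass function that puts its confidence on "present" or "absent" and the remainder on the whole frame. All names (`Rule`, `rule_to_mass`, `combine`, `predict`) and the sample numbers are hypothetical.

```python
from itertools import product
from typing import Dict, FrozenSet, List, NamedTuple

Mass = Dict[FrozenSet[str], float]
PRESENT = frozenset({"present"})
ABSENT = frozenset({"absent"})
THETA = PRESENT | ABSENT  # frame of discernment for one item


class Rule(NamedTuple):
    antecedent: FrozenSet[str]
    item: str
    present: bool
    confidence: float


def rule_to_mass(rule: Rule) -> Mass:
    """Assumed mapping: confidence on the rule's claim, the rest on ignorance."""
    claim = PRESENT if rule.present else ABSENT
    return {claim: rule.confidence, THETA: 1.0 - rule.confidence}


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule: multiply masses, intersect focal sets, renormalise."""
    combined: Mass = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass falling on the empty set
    if 1.0 - conflict <= 1e-12:
        raise ValueError("sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


def predict(cart: FrozenSet[str], rules: List[Rule], item: str) -> Mass:
    """Combine all rules about `item` whose antecedent is a subset of the cart."""
    belief: Mass = {THETA: 1.0}  # start from complete ignorance
    for rule in rules:
        if rule.item == item and rule.antecedent <= cart:
            belief = combine(belief, rule_to_mass(rule))
    return belief


# Made-up rules and cart, for illustration only.
rules = [
    Rule(frozenset({"butter"}), "bread", True, 0.8),
    Rule(frozenset({"eggs"}), "bread", False, 0.6),
]
belief = predict(frozenset({"butter", "eggs"}), rules, "bread")
print(belief[PRESENT], belief[ABSENT], belief[THETA])
```

With these two conflicting rules the combined belief still favours "present", but unlike a single-rule scheme it also records how much mass supports "absent" and how much remains uncommitted.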
Modules:
• Login Authentication
• Item selection
• Predicting Missing Items
• Decision making
• Adding to Shopping Cart
Hardware Requirements:
• Processor: Pentium IV or Above
• Hard Disk: 40GB or Above
• RAM: 512 MB or Above
Software Requirements:
• Operating System: Windows XP
• Front End: ASP.NET
• Back End: SQL Server