16-03-2011, 03:36 PM
Presented by:
Jean Ouafo
Distributed Data Mining in Credit Card Fraud Detection
Definition of Data Mining
➢ A method of searching data with mathematical algorithms
● Typical applications
● New applications where data mining can be observed:
- Business and e-commerce data
- Scientific and engineering data
- Web data
Definition of Credit Card
Visa
MasterCard
American Express
Discover
Definition of Credit Card
- Any card, plate, or coupon book that may be used repeatedly to borrow money or buy products and services on credit.
- Many forms of credit card: letter of credit, earnings credit rate, etc.
- Information about credit cards
Introduction
Credit card transactions continue to grow in number and take a large share of the US payment system. This growth leads to a higher rate of stolen account numbers, and the banks lose a great deal of money.
Goals of fraud detection
The 3 Steps:
✔ Use highly efficient techniques
✔ Account for highly skewed data
✔ Choose cost-based techniques
Black box fraud detection
JAM(Java agents for Meta Learning)
Meta-Learning
● Applied to the area of data mining
● Combines the predictions from multiple models
✔ Reduce the cost of fraud through timely detection
✔ Minimize the losses by catching fraud more rapidly
✔ Minimize costs associated with false alarms
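The idea of combining the predictions from multiple models can be sketched as follows. This is a minimal illustration, not the JAM implementation: the two base classifiers (`flags_large_amount`, `flags_night_time`), the field names `amount` and `hour`, and the tiny validation set are all hypothetical. The combiner simply learns, for each combination of base predictions, which true label was seen most often.

```python
from collections import Counter, defaultdict

# Hypothetical base classifiers; in a distributed setting each one would
# have been trained on a different bank's local data.
def flags_large_amount(txn):
    return txn["amount"] > 500

def flags_night_time(txn):
    return txn["hour"] < 6

def train_combiner(base_classifiers, validation_set):
    """Learn which true label most often goes with each tuple of base predictions."""
    votes = defaultdict(Counter)
    for txn, is_fraud in validation_set:
        key = tuple(clf(txn) for clf in base_classifiers)
        votes[key][is_fraud] += 1
    table = {key: counts.most_common(1)[0][0] for key, counts in votes.items()}

    def combined(txn):
        key = tuple(clf(txn) for clf in base_classifiers)
        if key in table:
            return table[key]
        # Combination never seen during training: fall back to a majority vote
        # over the base predictions.
        return Counter(key).most_common(1)[0][0]

    return combined

# Made-up labelled transactions used only to train the combiner.
validation_set = [
    ({"amount": 900, "hour": 3}, True),
    ({"amount": 700, "hour": 2}, True),
    ({"amount": 900, "hour": 14}, False),
    ({"amount": 50, "hour": 3}, False),
    ({"amount": 50, "hour": 14}, False),
]

meta_classifier = train_combiner([flags_large_amount, flags_night_time], validation_set)
night_large = meta_classifier({"amount": 1000, "hour": 1})
day_large = meta_classifier({"amount": 1000, "hour": 12})
```

In this toy data neither base classifier alone separates fraud from legitimate transactions, but the combiner learns that only the case where both fire was labelled as fraud.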
JAM
2 techniques with JAM:
● Local fraud Detection agents to learn how to detect fraud
● Secure, integrated meta-detection system to view the network transactions
Credit Card Data and Cost Models
The transaction data are characterized by some very special proportions:
• The probability of a fraudulent transaction is very low (0.2%)
• Most of the 38 data fields per transaction (about 26 fields) contain symbolic data such as merchant code, account number, client name, etc.
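To see why a 0.2% fraud rate makes plain accuracy a poor yardstick, consider a trivial detector that labels every transaction as legitimate. The sketch below uses the 0.2% figure from the slide; the transaction count of 100,000 and the function name `always_legit_baseline` are assumptions chosen only for illustration.

```python
def always_legit_baseline(n_transactions: int, fraud_rate: float) -> dict:
    """Metrics for a detector that never raises a fraud alarm."""
    n_fraud = round(n_transactions * fraud_rate)
    n_legit = n_transactions - n_fraud
    # Every legitimate transaction is classified correctly,
    # every fraudulent one is missed.
    accuracy = n_legit / n_transactions
    fraud_recall = 0.0  # no fraud is ever caught
    return {"n_fraud": n_fraud, "accuracy": accuracy, "fraud_recall": fraud_recall}

# 0.2% comes from the slide; 100_000 is an arbitrary illustrative volume.
metrics = always_legit_baseline(100_000, 0.002)
```

With these numbers the detector is 99.8% accurate while catching none of the 200 fraudulent transactions, which is why cost-based measures are preferred over accuracy on skewed data.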
Cost Models
• A symbolic field can contain as few as two values (e.g. the kind of credit card) or as many as several hundred thousand values (e.g. the merchant code).
• Transactions with a fraud confidence higher than 10% are flagged to be reviewed or aborted.
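The 10% rule above amounts to a simple threshold on the detector's output. The following is a minimal sketch assuming each transaction already carries a model-produced `fraud_confidence` between 0 and 1; the record layout, the `id` field, and the function name `flag_for_review` are hypothetical.

```python
FRAUD_CONFIDENCE_THRESHOLD = 0.10  # the 10% threshold stated on the slide

def flag_for_review(transactions, threshold=FRAUD_CONFIDENCE_THRESHOLD):
    """Return the transactions whose fraud confidence is strictly above the threshold."""
    return [t for t in transactions if t["fraud_confidence"] > threshold]

# Made-up scored transactions for illustration.
scored = [
    {"id": "t1", "fraud_confidence": 0.02},
    {"id": "t2", "fraud_confidence": 0.10},
    {"id": "t3", "fraud_confidence": 0.35},
]
flagged_ids = [t["id"] for t in flag_for_review(scored)]
```

Here only `t3` is flagged, because the comparison is strict and `t2` sits exactly at the threshold. Whether a flagged transaction is reviewed or aborted is a separate policy decision that this sketch does not model.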
Experiment and Results
4 learning algorithms:
C4.5
CART
RIPPER
BAYES
Results
Mining the analog Data
Results
The neural network experts for analog data.
Diagnosis sequence data
2 Ideas:
First, there can be typical fraud sequences, for instance the behavior of a thief after copying or stealing the credit card.
Second, there can be a typical behavior of the user; an actual transaction sequence that does not correspond to it may indicate credit card misuse.
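The second idea, comparing an actual transaction sequence with the user's typical behavior, can be sketched very roughly as follows. This is only one possible reading, not the method used in the presentation: the sequence is reduced to hypothetical merchant categories, and the score is the share of consecutive category pairs that never occurred in the user's history.

```python
def transitions(sequence):
    """Consecutive (previous, next) pairs of a sequence."""
    return list(zip(sequence, sequence[1:]))

def unseen_transition_ratio(history, recent):
    """Share of transitions in `recent` that never occur in `history`."""
    known = set(transitions(history))
    recent_pairs = transitions(recent)
    if not recent_pairs:
        return 0.0  # fewer than two transactions: nothing to compare
    unseen = sum(1 for pair in recent_pairs if pair not in known)
    return unseen / len(recent_pairs)

# Made-up merchant categories for one card holder.
history = ["grocery", "fuel", "grocery", "restaurant", "grocery", "fuel"]
usual = unseen_transition_ratio(history, ["grocery", "fuel", "grocery"])
unusual = unseen_transition_ratio(
    history, ["fuel", "electronics", "jewelry", "electronics"]
)
```

A ratio near 0 means the recent sequence looks like the user's past behavior, while a ratio near 1 means almost every step is new. Any alarm threshold on this ratio would have to be tuned on real data.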
Combining analog and symbolic Information
[Figure labels: time; fraud (yes/no)]
Combining with Sequential
AdaCost Algorithm
Uses internal heuristics based upon training accuracy
Learning algorithm to predict fraud.
Employs internal metrics of misclassification cost.
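How misclassification cost can enter a boosting-style learner is sketched below as a single weight-update round. The slides give no formulas, so everything here is an assumption: labels and predictions are coded as +1 (fraud) and -1 (legitimate), costs are scaled to the range 0 to 1, and the linear cost-adjustment function `0.5 * cost + 0.5` for misclassified examples and `-0.5 * cost + 0.5` for correct ones is the form commonly cited for AdaCost, reproduced from memory rather than from this presentation.

```python
import math

def cost_adjustment(correct: bool, cost: float) -> float:
    """Assumed cost-adjustment: costly mistakes gain more weight,
    costly correct predictions lose less weight."""
    return -0.5 * cost + 0.5 if correct else 0.5 * cost + 0.5

def cost_sensitive_round(weights, labels, predictions, costs):
    """One boosting round: return (alpha, new normalized weights)."""
    betas = [cost_adjustment(y == h, c)
             for y, h, c in zip(labels, predictions, costs)]
    # Cost-adjusted agreement between the weak learner and the labels.
    r = sum(w * y * h * b
            for w, y, h, b in zip(weights, labels, predictions, betas))
    if abs(r) >= 1:
        raise ValueError("agreement r must lie strictly between -1 and 1")
    alpha = 0.5 * math.log((1 + r) / (1 - r))
    raw = [w * math.exp(-alpha * y * h * b)
           for w, y, h, b in zip(weights, labels, predictions, betas)]
    total = sum(raw)
    return alpha, [x / total for x in raw]

# Made-up data: example 0 is a missed fraud with high cost,
# example 2 is a false alarm with low cost, the rest are correct.
labels      = [1, 1, -1, -1, -1, -1, -1, -1]
predictions = [-1, 1, 1, -1, -1, -1, -1, -1]
costs       = [0.9, 0.2, 0.1, 0.2, 0.2, 0.2, 0.2, 0.2]
weights     = [1 / 8] * 8

alpha, new_weights = cost_sensitive_round(weights, labels, predictions, costs)
```

After the round, the expensive missed fraud carries more weight than the cheap false alarm, and both carry more weight than the correctly classified examples, so the next weak learner concentrates on the costly mistake.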
Conclusion
In summary, we observe that the combined power of the rule-based and analog experts does not increase the amount of detected fraud, but it detects fraud more reliably, with 100% confidence, just as we expected. Nevertheless, the probability of detecting fraud is too low compared with the rule-based system alone. Therefore, we tested the strategy of adding additional rules, even ones with lower confidence.
Links : twocrowsglossary.html
Distributed data mining in credit card fraud detection
P. Chan, W. Fan, A. Prodromidis, and S. Stolfo
IEEE Intelligent Systems, 14(6):67-74, 1999.
JAM: Java Agents for Meta-learning over Distributed Databases
S.J. Stolfo, D. Fan, W. Lee, A. Prodromidis, P. Chan