Comparison and analysis of logistic regression, Naïve Bayes and KNN machine learning algorithms for credit card fraud detection

Date

2020-02-15T00:00:00

Publisher

Springer Science and Business Media B.V.

Abstract

Financial fraud is a rapidly growing threat with a serious impact on the economy, collaborative institutions and administration. Credit card transactions are increasing quickly because advances in internet technology have led to heavy dependence on the internet. As technology advances and credit card usage grows, fraud rates have become a challenge for the economy. As new security features are added to credit card transactions, fraudsters develop new patterns and find new loopholes to target them. As a result, the behavior of both fraudulent and normal transactions changes constantly. Credit card data is also highly skewed, which leads to inefficient prediction of fraudulent transactions. To achieve better results, imbalanced or skewed data is pre-processed with a re-sampling (over-sampling or under-sampling) technique. Datasets in three different proportions were used in this study, and random under-sampling was applied to the skewed dataset. This work uses three machine learning algorithms: logistic regression, Naïve Bayes and K-nearest neighbour. The performance of these algorithms is recorded and compared. The work is implemented in Python, and performance is measured by accuracy, sensitivity, specificity, precision, F-measure and area under the curve. On the basis of these measurements, the logistic regression model was found to predict fraudulent transactions better than the models built with Naïve Bayes and K-nearest neighbour. Better results were also obtained by applying under-sampling to the data before developing the prediction model. © 2020, Bharati Vidyapeeth's Institute of Computer Applications and Management.
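
The two mechanical steps named in the abstract, random under-sampling of the majority class and scoring with confusion-matrix metrics, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the `ratio` parameter and the toy data are assumptions made for illustration, the labels are assumed to be 1 for fraud and 0 for normal, and area under the curve is left out because it needs classifier scores rather than hard predictions.

```python
import random


def random_under_sample(rows, labels, ratio=1.0, seed=0):
    """Keep every minority row (label 1) and a random subset of majority rows (label 0).

    `ratio` is a hypothetical parameter: majority rows kept per minority row.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    # Never ask for more majority rows than actually exist.
    n_keep = min(len(majority), int(round(ratio * len(minority))))
    kept = minority + rng.sample(majority, n_keep)
    rng.shuffle(kept)  # avoid leaving all fraud rows at the front
    return [rows[i] for i in kept], [labels[i] for i in kept]


def binary_metrics(y_true, y_pred):
    """Compute the confusion-matrix metrics listed in the abstract (AUC excluded)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def safe_div(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    sensitivity = safe_div(tp, tp + fn)  # recall on the fraud class
    specificity = safe_div(tn, tn + fp)  # recall on the normal class
    precision = safe_div(tp, tp + fp)
    return {
        "accuracy": safe_div(tp + tn, len(pairs)),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": safe_div(2 * precision * sensitivity, precision + sensitivity),
    }


# Toy data (made up): 5 fraud rows among 100 transactions.
toy_labels = [1] * 5 + [0] * 95
toy_rows = [[float(i)] for i in range(100)]
balanced_rows, balanced_labels = random_under_sample(toy_rows, toy_labels, ratio=1.0, seed=42)
```

With `ratio=1.0` the sketch returns a balanced set, here 5 fraud and 5 normal rows. Other values of `ratio` would give other class proportions, although the specific proportions used in the study are not stated in this record. The balanced rows would then be passed to a classifier such as logistic regression, Naïve Bayes or K-nearest neighbour, and its predictions scored with `binary_metrics`.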

Keywords

Credit card fraud, Fraud detection, KNN, Logistic regression, Naïve Bayes, Random under-sampling
