Computer Science And Technology - Research Publications
Permanent URI for this collection: https://kr.cup.edu.in/handle/32116/82
Detection of malicious URLs in big data using RIPPER algorithm
(Institute of Electrical and Electronics Engineers Inc., 2018) Thakur, S.; Meenakshi, E.; Priya, A.
'Big Data' is the term for very large collections of datasets. Datasets such as web logs, call records, medical records, military surveillance, and photography archives are often so large and complex, and hold both structured and unstructured data, that they cannot be processed using conventional database queries such as SQL queries. In big data, malicious URLs have become a hub for internet criminal activities such as drive-by download, information warfare, spamming, and phishing. Malicious URL detection techniques can be classified into non-machine-learning approaches (e.g. blacklisting) and machine learning approaches (e.g. data mining techniques). Data mining helps in the analysis of large and complex datasets in order to detect common patterns or learn new things. Big data can be processed either with a tool such as Hadoop or with data mining algorithms. Data mining techniques can generate classification models, which are used to manage and model data and to predict whether a URL is malicious or legitimate. In this paper, the RIPPER data mining algorithm is analyzed using JRip, its implementation in the WEKA tool. A training dataset of 6000 URLs was built to train JRip. The resulting model is then used to predict a testing dataset of 1050 URLs. Accuracy is calculated after the testing process. The results show that JRip has an accuracy of 82%. © 2017 IEEE.

Detection of phishing websites using C4.5 data mining algorithm
(Institute of Electrical and Electronics Engineers Inc., 2018) Priya, A.; Meenakshi, E.
Phishing sites are fake sites, created by deceptive persons, that copy genuine sites.
These websites look like the official website of an organization such as a bank or an institute. The main aim of phishing is to steal a user's sensitive information, such as passwords, usernames, and PINs. Victims of phishing attacks may expose their sensitive financial data to attackers, who may use it for financial and criminal activities. Different technical and non-technical approaches have been proposed to identify phishing sites. Non-technical approaches have no answer to how quickly phishing websites disappear. Data mining, one class of technical approach, has shown promising results in detecting phishing websites. Compared with non-technical approaches, data mining techniques can generate classification models that make predictions on phishing websites in real time. In this paper, the C4.5 data mining algorithm is analyzed using J48, its implementation in the WEKA tool. C4.5 is a benchmark data mining technique that can accurately identify phishing websites. A training dataset of 750 URLs was built to train J48. A testing dataset of 300 URLs is used to make predictions with the classifier generated by training J48. True positive rate, true negative rate, false positive rate, false negative rate, success rate, error rate, and accuracy are calculated after the testing process. The results show that C4.5 has an accuracy of 82.6%. © 2017 IEEE.

Identification of Counterfeit Indian Currency Note using Image Processing and Machine Learning Classifiers
(Institute of Electrical and Electronics Engineers Inc., 2023-03-27) Sharan, Vivek; Kaur, Amandeep; Singh, Parvinder
Technology is continuously changing our lives. Day by day it makes life easier, but some challenges and issues remain. Counterfeit currency is one of them. It arises from the production and circulation of currency without the permission of an authorized system.
Some people use scanning and printing technology to produce such notes and circulate them, which is a kind of forgery. It leads to personal loss and degrades the country's economy. Such notes are very similar to the originals, which makes it hard for ordinary people, especially visually impaired people, to judge the authenticity of a note. Most researchers have proposed methods that differentiate real notes from fake ones based on the currency's shape, colors, and size. Fake notes are easily detected when the note is in good condition but become harder to detect as it deteriorates over time. Extracting features from such notes is another challenging task. Some systems are available only to specific sectors and not to common people. Therefore, a system that can distinguish between real and fake notes is required. This article classifies Indian currency notes as real or fake using four supervised machine learning algorithms (Support Vector Classifier, K-Nearest Neighbor, Decision Tree, and Logistic Regression) together with image processing techniques. The algorithms were run on a dataset of 1372 currency image samples, of which 762 are real and the remaining are fake, available from the UCI Machine Learning Repository. Performance is measured in terms of accuracy, recall, precision, and F1-score, and the analysis shows that K-Nearest Neighbor delivers outstanding results compared with the other algorithms. © 2023 IEEE.
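
The evaluation measures named in the abstracts above (accuracy, precision, recall, F1-score, and the true/false positive and negative rates) can all be derived from the four counts of a binary confusion matrix. The following is a minimal Python sketch under that assumption. The function name `classification_metrics` and the counts in the usage example are hypothetical illustrations; they are not taken from any of the listed papers and do not reproduce their results.

```python
def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Derive common binary-classification measures from confusion-matrix counts.

    tp: positives predicted as positive (e.g. malicious flagged as malicious)
    tn: negatives predicted as negative
    fp: negatives predicted as positive
    fn: positives predicted as negative
    """
    total = tp + tn + fp + fn
    accuracy = _ratio(tp + tn, total)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)  # same quantity as the true positive rate
    return {
        "accuracy": accuracy,
        "error_rate": _ratio(fp + fn, total),
        "precision": precision,
        "recall": recall,
        "f1": _ratio(2 * precision * recall, precision + recall),
        "true_positive_rate": recall,
        "true_negative_rate": _ratio(tn, tn + fp),
        "false_positive_rate": _ratio(fp, fp + tn),
        "false_negative_rate": _ratio(fn, fn + tp),
    }


# Hypothetical counts, chosen only to illustrate the calculation.
example = classification_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Accuracy alone can be misleading when the two classes are unevenly sized, which is one reason evaluations like those described above often report precision, recall, and F1-score alongside it.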