Identifying Phishing Websites Using the Classification And Regression Trees (CART) Algorithm
Abstract
As the number of internet users grows and technology develops, threats to internet security are becoming increasingly diverse. One of them is phishing, a major issue in cyberspace. Phishing is the act of trapping a target by luring them into indirectly handing information over to the attacker. The large number of phishing crimes can cause a range of losses, including the loss of privacy of an individual or a company. This study aims to identify phishing websites using the Classification And Regression Trees (CART) algorithm, which is a classification algorithm. The dataset was taken from the UCI Machine Learning Repository and originates from the University of Huddersfield. The research method consists of problem identification, data collection, pre-processing, application of the CART algorithm, validation and evaluation, and drawing conclusions. Testing yielded an accuracy of 95.28%, which places the result obtained with the CART algorithm in the very good classification category.
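The core of the method (fit a CART tree, then measure accuracy on held-out data) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes Gini impurity as the split criterion, and the feature names, the -1/0/1 feature coding, the label coding (1 = legitimate, -1 = phishing), and the tiny data set are all hypothetical stand-ins for the real dataset.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) pair with the lowest weighted Gini
    impurity, or None if no split improves on the unsplit node."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        # Skip the largest value: splitting there would leave the right side empty.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a binary tree; a leaf is a class label, a node is a 4-tuple."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))

def predict(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def accuracy(tree, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(labels)

# Hypothetical columns: [no_ip_in_url, uses_https, url_length_ok], coded -1/0/1.
train_X = [[1, 1, 1], [1, 1, 0], [1, -1, 1], [-1, 1, -1],
           [-1, -1, -1], [-1, -1, 0], [1, 1, -1], [-1, 1, 1]]
train_y = [1, 1, 1, -1, -1, -1, 1, -1]   # assumed coding: 1 = legitimate, -1 = phishing
test_X = [[1, -1, -1], [-1, 1, 0]]
test_y = [1, -1]

tree = build_tree(train_X, train_y)
print("accuracy:", accuracy(tree, test_X, test_y))
```

On real data the same steps would normally be run through an established library and a proper validation scheme; the toy example only shows how a Gini-based split and the accuracy figure are computed.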
Copyright (c) 2021 Jurnal Ilmiah Informatika
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.