Predicting Novel Popularity from Text Features Using the Random Forest Method (Prediksi Popularitas Novel Berbasis Fitur-Fitur Teks Menggunakan Metode Random Forest)

Authors

  • Nadya Elfareta Azarin Universitas Halu Oleo
  • Rizal Adi Saputra Universitas Halu Oleo
  • Subardin Subardin Universitas Halu Oleo

DOI:

https://doi.org/10.35316/jimi.v9i1.57-62

Keywords:

Popularity prediction, Novel, Text features, Random Forest, Machine Learning

Abstract

In today's digital era, a novel's popularity is often measured by reader response and sales. This research develops a model that predicts novel popularity from text features, with the aim of identifying the key factors that contribute to popularity and giving authors and publishers insight into what influences reader acceptance. The method used is Random Forest, a machine learning algorithm suited to both classification and regression. The proposed approach integrates text features, such as keyword extraction and sentiment analysis, into a Random Forest framework to predict popularity. The dataset consists of information about novels, including title, genre, number of pages, and text fields such as a summary or description. The data are preprocessed to handle missing values and duplicates. Features are extracted by tokenizing and stemming the text and converting it into TF-IDF vectors. A Random Forest model is built on these features, and its parameters are tuned through cross-validation. The experimental results show that the Random Forest model predicts novel popularity with a satisfactory level of accuracy. Text features, such as keyword frequency and sentiment, contribute significantly to the model's predictive ability. These findings give authors and publishers valuable insight into reader preferences and a novel's potential for success.
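
The pipeline outlined in the abstract (tokenization, TF-IDF vectors, then a Random Forest) can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the toy summaries, and the labels are all made up for the example; stemming, sentiment features, and cross-validated tuning are left out; and the "forest" is deliberately reduced to bagged one-split trees (decision stumps), each trained on a bootstrap sample with a random subset of features. In practice a library implementation such as scikit-learn's `RandomForestClassifier` would likely be used instead.

```python
import math
import random
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-letters (stemming is omitted in this sketch)."""
    return "".join(c.lower() if c.isalpha() else " " for c in text).split()


def tfidf_vectors(docs):
    """Return (vocabulary, one TF-IDF vector per document), using a smoothed IDF."""
    tokens = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokens for t in doc})
    df = Counter(t for doc in tokens for t in set(doc))  # document frequency
    n = len(docs)
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in tokens:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid division by zero for empty text
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit_stump(X, y, features):
    """Find the single split (feature <= threshold) with the lowest weighted Gini."""
    best = None
    for f in features:
        for thr in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, thr, majority(left), majority(right))
    if best is None:  # no usable split: always predict the majority label
        label = majority(y)
        return (features[0], 0.0, label, label)
    return best[1:]


def fit_forest(X, y, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    k = max(1, int(math.sqrt(d)))  # features considered per tree
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        feats = rng.sample(range(d), k)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    return majority([left if row[f] <= thr else right for f, thr, left, right in forest])


# Made-up toy data, for illustration only (not taken from the paper's dataset).
summaries = [
    "a gripping love story full of heartbreak and hope",
    "a dry catalogue of tax regulations",
    "an epic love story across two kingdoms",
    "a manual of regulations for warehouse storage",
]
labels = ["popular", "not popular", "popular", "not popular"]

vocab, X = tfidf_vectors(summaries)
forest = fit_forest(X, labels)
print(predict(forest, X[0]))
```

With so few samples the prediction itself carries no meaning; the point is only to show how text becomes a numeric vector and how votes from many randomized trees are combined.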

Published

15-06-2024 — Updated on 17-05-2025

How to Cite

Azarin, N. E., Saputra, R. A., & Subardin, S. (2025). Prediksi Popularitas Novel Berbasis Fitur-Fitur Teks Menggunakan Metode Random Forest. Jurnal Ilmiah Informatika, 9(1), 57–62. https://doi.org/10.35316/jimi.v9i1.57-62 (Original work published June 15, 2024)
