Comparison of Models for Classification of Learning Achievement of Middle School Students in Indonesia in 2019 using the Support Vector Machine Algorithm, Conditional Inference Trees, and Random Forest


  • Alfina Nurpiana Politeknik Statistika STIS
  • Arie Wahyu Wijayanto Politeknik Statistika STIS



learning achievements, classification, SVM, conditional inference tree, random forest


The learning achievement of Indonesian junior high school (JHS) students is still low. From 2015 to 2019, the average national exam score for Indonesian JHS students declined every year. In the most recent national examination, the average score was 52.82, which falls in the "bad" category. This needs the attention of local governments and education offices. It is therefore necessary to build a classification model that can identify which cities/districts in Indonesia fall in the "bad" category and which fall in the "enough" category. This study compares models for classifying learning achievement categories, based on the average 2019 JHS national exam results in 514 districts/cities in Indonesia, using the Support Vector Machine (SVM), Conditional Inference Trees (Ctree), and Random Forest (RF) algorithms. The three algorithms were chosen for their respective advantages: SVM is known as a powerful classifier, Ctree is an improvement on the conventional decision tree, and RF represents ensemble learning. The independent variables are education budget, classroom conditions, school accreditation, and teacher qualifications. The results show that the SVM algorithm produces the highest accuracy (0.80), recall (0.97), kappa statistic (0.38), and F1-score (0.87) of the three algorithms, while its precision (0.80) equals that of the Ctree algorithm. The SVM algorithm therefore produces the best model for classifying district/city learning achievement categories in Indonesia based on education budget, classroom conditions, school accreditation, and teacher qualifications.
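The evaluation measures named in the abstract (accuracy, precision, recall, F1-score, and the kappa statistic) can all be derived from a binary confusion matrix. The following is a minimal sketch in Python, not the authors' code: the function names are hypothetical, the counts in the usage example are invented purely for illustration, and which of the two categories is treated as the positive class is an assumption, since the abstract does not say.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Compute common binary classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives for whichever class is chosen as "positive".
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)

    # Cohen's kappa: observed agreement corrected for the agreement
    # expected by chance from the row and column marginals.
    p_pos = ((tp + fp) / total) * ((tp + fn) / total)
    p_neg = ((fn + tn) / total) * ((fp + tn) / total)
    expected = p_pos + p_neg
    kappa = (accuracy - expected) / (1 - expected)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
        "kappa": kappa,
    }


# Illustrative counts only (NOT taken from the study).
example = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

As a consistency check on the reported figures, `f1_from(0.80, 0.97)` evaluates to roughly 0.877. This agrees with the reported F1-score of 0.87 to within the rounding of the two-decimal precision and recall values.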










Research Articles