Clustering and Forecasting of Covid-19 Data in Indonesia


  • Diyah Astuti Universitas Bengkulu
  • Dyah Yunita Hartanti
  • Susi Tri Nurhayanti
  • Herlin Fransiska



Covid-19, Clustering, Forecasting, Province, Indonesia


Indonesia reported its first case of Covid-19 in March 2020; the patient was suspected to have been infected by a foreigner visiting Indonesia. Because Indonesia is an archipelagic country, cases are unevenly distributed across its many provinces, although some provinces share similar case patterns. The case counts also form time series, so forecasting analysis can be applied. Both clustering and forecasting can therefore be applied to Covid-19 data in Indonesia. The analysis was carried out in two stages: clustering with a hierarchical clustering method, and forecasting with the ARIMA method. Using 288 daily observations from January 1, 2021 to October 15, 2021, the results show that daily Covid-19 cases by province in Indonesia can be grouped into 2 clusters. For the forecasting stage, one province was taken from each cluster to determine the model: cluster 1 used data from the province of Banten and cluster 2 used data from the province of West Java. Using R software, a model was obtained for each cluster, namely ARIMA(0,1,1) for cluster 1 and ARIMA(2,1,2) for cluster 2. The forecasts through October 30, 2021 show that the number of cases tends to remain constant.
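
For reference, the two model orders named in the abstract can be written out in backshift notation. This is a sketch of the standard textbook forms, not the fitted models: the abstract does not report coefficient values or say whether a drift term was included, so the equations below assume no constant, and the symbols $y_t$ (daily cases), $\varepsilon_t$ (white-noise error), $\phi_i$, $\theta_j$ and $B$ (backshift operator, $B y_t = y_{t-1}$) are introduced here for illustration. The sign convention for the moving-average terms follows the one commonly used in R.

```latex
% ARIMA(0,1,1), cluster 1: first difference with one MA term
(1 - B)\, y_t = (1 + \theta_1 B)\, \varepsilon_t
\quad\Longleftrightarrow\quad
y_t = y_{t-1} + \varepsilon_t + \theta_1 \varepsilon_{t-1}

% ARIMA(2,1,2), cluster 2: first difference with two AR and two MA terms
(1 - \phi_1 B - \phi_2 B^2)(1 - B)\, y_t = (1 + \theta_1 B + \theta_2 B^2)\, \varepsilon_t

% h-step-ahead point forecast of ARIMA(0,1,1) without drift, made at time T
\hat{y}_{T+h \mid T} = y_T + \theta_1 \hat{\varepsilon}_T \qquad \text{for all } h \ge 1
```

Under the no-drift assumption, the ARIMA(0,1,1) point forecast is the same value at every horizon, which is consistent with the reported forecasts staying roughly constant for cluster 1. Whether this is why the authors' forecasts are flat cannot be confirmed from the abstract alone.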










Research Articles