Comparison of Elliptic Envelope Method and Isolation Forest Method on Imbalance Dataset

Authors

  • Supri Bin Hj Amir, Universitas Hasanuddin
  • Bagas Prasetyo

DOI:

https://doi.org/10.20956/jmsk.v17i1.10899

Keywords:

Data mining, Imbalance Datasets, Classification, Elliptic Envelope, Isolation Forest

Abstract

Class imbalance is an important problem in data mining. A dataset is imbalanced when some classes occur far more frequently than others, and this imbalance biases classifier performance. Many researchers have addressed the problem, both by developing new algorithms and by modifying the preprocessing stage. This study compares two One-Class Classification algorithms, Elliptic Envelope and Isolation Forest, on imbalanced data. In testing, Elliptic Envelope outperformed Isolation Forest, achieving 80.28% recall and 80.28% precision, whereas Isolation Forest achieved 46.95% recall and 46.95% precision.
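
The kind of comparison described above can be sketched with scikit-learn's EllipticEnvelope and IsolationForest estimators. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the synthetic two-cluster data, the 950/50 class split, the contamination setting, the random seeds, and the choice to score on the training data are all hypothetical, so the numbers it prints are not comparable to the figures reported in the abstract.

```python
import numpy as np
from sklearn.covariance import EllipticEnvelope
from sklearn.ensemble import IsolationForest
from sklearn.metrics import precision_score, recall_score

# Synthetic imbalanced data (an assumption for illustration only):
# 950 majority points around (0, 0) and 50 minority points around (5, 5).
rng = np.random.default_rng(0)
n_majority, n_minority = 950, 50
X_majority = rng.normal(loc=0.0, scale=1.0, size=(n_majority, 2))
X_minority = rng.normal(loc=5.0, scale=1.0, size=(n_minority, 2))
X = np.vstack([X_majority, X_minority])

# scikit-learn's outlier detectors predict +1 for inliers and -1 for outliers,
# so the minority class is labelled -1 to match.
y = np.concatenate([np.ones(n_majority, dtype=int),
                    -np.ones(n_minority, dtype=int)])

# Assumption: the expected outlier fraction is set to the true minority fraction.
contamination = n_minority / (n_majority + n_minority)

models = {
    "Elliptic Envelope": EllipticEnvelope(contamination=contamination,
                                          random_state=0),
    "Isolation Forest": IsolationForest(contamination=contamination,
                                        random_state=0),
}

results = {}
for name, model in models.items():
    # Fit without labels, then flag outliers on the same data.
    y_pred = model.fit(X).predict(X)
    # Precision and recall are computed for the minority (outlier) class.
    precision = precision_score(y, y_pred, pos_label=-1)
    recall = recall_score(y, y_pred, pos_label=-1)
    results[name] = (precision, recall)
    print(f"{name}: precision={precision:.4f}, recall={recall:.4f}")
```

Both estimators are fitted without labels; the labels are used only to score the minority class. In this sketch, setting contamination to the true minority fraction makes each model flag roughly as many points as there are minority samples, which tends to bring precision and recall close together.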

Published

2020-08-24

Section

Research Articles