Effect of Imputation Methods in the Classifier Performance

Cihan, Pınar; Kalıpsız, Oya; GÖKÇE, ERHAN

doi:10.16984/saufenbilder.515716

Authors (3)

Assoc. Prof. Dr. Pınar Cihan, Tekirdağ Namık Kemal University, Türkiye

Oya Kalıpsız, Yıldız Technical University, Türkiye

Prof. Dr. Erhan Gökçe, Kafkas University, Türkiye

Article Type	Original Article (full article published in other peer-reviewed international journals)
Journal Name	Sakarya University Journal of Science
Journal ISSN	2147-835X
Indexes Covering the Journal	TR DİZİN
Article Language	English	Publication Date	12-2019
Volume / Issue / Pages	23 / 6 / 1225–1236	DOI	10.16984/saufenbilder.515716
Article Link	http://www.saujs.sakarya.edu.tr/issue/44246/515716
UAK Research Areas	Data Mining

Abstract

Missing values in a dataset present an important problem for almost any traditional or modern statistical method, since most of these methods were developed under the assumption that the dataset is complete. In the real world, however, complete datasets are rarely available, and missing data is encountered in veterinary field studies as frequently as in other fields. In veterinary field studies, where data mining is only beginning to be applied, imputing missing data is important, and so is the question of how it should be imputed. This is because many studies either remove every observation that has a missing value in any variable or complete the missing values with traditional methods. Although alternative approaches that avoid removing observations with missing values have become widely available in recent years, they are rarely used. The aim of this study is to examine the mean, median, nearest neighbors, MICE and missForest methods for imputing simulated missing data, created by randomly removing values from the original veterinary dataset at frequencies from 5% to 25% in steps of 5%. The most accurate methods were then used to impute the original dataset in order to observe their influence on classifier performance and to determine the optimal imputation method for the original dataset.
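The simulation design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the synthetic Gaussian data, the RMSE error measure, and all function names are assumptions made for illustration, and only the mean and median imputers are shown (nearest neighbors, MICE and missForest need dedicated libraries).

```python
import math
import random
import statistics

RATES = (0.05, 0.10, 0.15, 0.20, 0.25)  # 5% to 25% in steps of 5%

def remove_at_random(rows, rate, rng):
    """Copy rows, set round(rate * n_cells) random cells to None, return copy and positions."""
    cells = [(i, j) for i in range(len(rows)) for j in range(len(rows[0]))]
    removed = rng.sample(cells, round(rate * len(cells)))
    damaged = [list(r) for r in rows]
    for i, j in removed:
        damaged[i][j] = None
    return damaged, removed

def impute_by_column(damaged, stat):
    """Fill each None with `stat` computed over the observed values of its column."""
    fills = []
    for j in range(len(damaged[0])):
        observed = [r[j] for r in damaged if r[j] is not None]
        fills.append(stat(observed))
    return [[fills[j] if v is None else v for j, v in enumerate(r)] for r in damaged]

def rmse(original, imputed, removed):
    """Root mean squared error over the removed cells only (assumed accuracy measure)."""
    squared = [(original[i][j] - imputed[i][j]) ** 2 for i, j in removed]
    return math.sqrt(sum(squared) / len(squared))

def run_simulation(rows, seed=0):
    """Score mean and median imputation at every missingness rate."""
    rng = random.Random(seed)
    methods = {"mean": statistics.fmean, "median": statistics.median}
    results = {}
    for rate in RATES:
        damaged, removed = remove_at_random(rows, rate, rng)
        for name, stat in methods.items():
            results[(rate, name)] = rmse(rows, impute_by_column(damaged, stat), removed)
    return results

# Synthetic stand-in for the veterinary dataset (assumption: 200 rows, 5 numeric columns).
data_rng = random.Random(42)
data = [[data_rng.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]

results = run_simulation(data)
for (rate, name), err in sorted(results.items()):
    print(f"{rate:.0%} missing, {name}: RMSE = {err:.3f}")
```

Because the removed values are known, each imputer can be scored against the truth. The second stage described in the abstract, imputing the real dataset and comparing classifier performance, would follow the same pattern with a classifier's accuracy in place of RMSE.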
