University of Twente Student Theses

Impact of ensemble machine learning methods on handling missing data

Perkowski, Ernest (2020) Impact of ensemble machine learning methods on handling missing data.

This is the latest version of this item.

Abstract: Missing values are a common problem in data from many sources. When building machine learning classifiers, incomplete data creates a risk of drawing invalid conclusions and producing biased models, which can have a tremendous impact on many business sectors and even on human lives. Ensemble methods are meta-algorithms that combine weak base estimators into stronger classifiers, and ensemble learning can make use of both ML and non-ML techniques. This approach has yielded better predictions in many use cases. This research examines various uses of ensemble methods for handling missing data. It also explores the impact of ensemble learning on test data with various levels of missingness, artificially generated under the missing at random (MAR) mechanism.
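The two ideas named in the abstract, MAR-generated missingness and combining weak base estimators, can be illustrated with a minimal sketch. It is not the thesis's method: the function names `make_mar` and `majority_vote`, the two-column row layout, the median-based masking rule, and the abstain-on-missing voting rule are all assumptions chosen for illustration.

```python
import random

def make_mar(rows, rate, seed=0):
    """Blank out column 1 with a probability that depends only on column 0.

    Hypothetical helper: column 0 is assumed to be always observed.
    Roughly a share `rate` of the rows ends up with a missing value.
    """
    rng = random.Random(seed)
    median = sorted(r[0] for r in rows)[len(rows) // 2]
    out = []
    for x0, x1 in rows:
        # MAR: missingness of x1 is driven by the observed x0, never by x1 itself.
        p = min(1.0, 2 * rate) if x0 > median else 0.0
        out.append((x0, None if rng.random() < p else x1))
    return out

def majority_vote(stumps, row):
    """Combine weak threshold classifiers given as (feature_index, threshold).

    Assumed rule: a stump abstains when its feature is missing,
    ties go to class 1, and class 0 is returned if every stump abstains.
    """
    votes = [int(row[i] > t) for i, t in stumps if row[i] is not None]
    if not votes:
        return 0
    return int(2 * sum(votes) >= len(votes))

rows = [(i, i * 10) for i in range(10)]
masked = make_mar(rows, rate=0.5)
predictions = [majority_vote([(0, 4.5), (1, 45)], r) for r in masked]
```

Letting a base estimator abstain is only one way an ensemble can cope with incomplete rows; imputing the value before voting is another, and the sketch makes no claim about which the thesis evaluates.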
Item Type: Essay (Bachelor)
Faculty: EEMCS: Electrical Engineering, Mathematics and Computer Science
Subject: 54 computer science
Programme: Business & IT BSc (56066)
Link to this item: https://purl.utwente.nl/essays/82210