Missing data analysis using machine learning methods to predict the performance of technical students

Authors

DOI:

https://doi.org/10.5335/rbca.v12i2.10565

Keywords:

Missing Data Treatment Methods, Machine Learning, Evaluation of algorithms

Abstract

Machine learning (ML) has become an emerging technology able to solve problems in many areas, including education, medicine, robotics and aerospace. ML is a specific field of artificial intelligence that designs computational models able to learn from data. However, to develop an ML model, it is necessary to ensure data quality, since real-world data is often incomplete, noisy and inconsistent. This paper evaluates state-of-the-art missing data treatment methods using ML algorithms to classify the performance of technical high school students at the Federal Institute of Goiás in Brazil. The aim is to provide an efficient computational tool that supports educational performance by allowing educators to identify a student's tendency to fail. The results indicate that the ignoring and discarding method outperforms the other missing data treatment methods. Moreover, the tests reveal that Sequential Minimal Optimization, Neural Networks and Bagging outperform the other ML algorithms, such as Naive Bayes and Decision Tree, in terms of classification accuracy.
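As a rough illustration of what a missing data treatment does, the sketch below contrasts ignoring and discarding (dropping every record that has a missing value) with simple mean imputation, one common alternative. It is not the authors' implementation: the toy records, the column meanings and the function names `discard_incomplete` and `impute_mean` are all hypothetical, and the abstract does not state which imputation methods were compared.

```python
from statistics import mean

# Hypothetical toy records: (grade_1, grade_2, attendance).
# None marks a missing value; none of these numbers come from the paper.
records = [
    (7.0, 8.0, 0.9),
    (5.0, None, 0.7),
    (None, 6.0, 0.8),
    (9.0, 4.0, 1.0),
]


def discard_incomplete(rows):
    """Ignoring and discarding: keep only rows with no missing values."""
    return [row for row in rows if None not in row]


def impute_mean(rows):
    """Replace each missing value with the mean of the observed values in its column."""
    columns = list(zip(*rows))
    col_means = [mean(v for v in col if v is not None) for col in columns]
    return [
        tuple(col_means[i] if v is None else v for i, v in enumerate(row))
        for row in rows
    ]


complete_rows = discard_incomplete(records)  # two fully observed rows remain
imputed_rows = impute_mean(records)          # all four rows remain, gaps filled
```

Discarding shrinks the training set but leaves every remaining value as observed, while imputation keeps all records at the cost of introducing estimated values. Either output could then be passed to a classifier to compare accuracy.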

Published

2020-07-06

Issue

Section

Original Paper

How to Cite

[1] 2020. Missing data analysis using machine learning methods to predict the performance of technical students. Brazilian Journal of Applied Computing. 12, 2 (Jul. 2020), 134–143. DOI: https://doi.org/10.5335/rbca.v12i2.10565.