Data analysis and visualization on Titanic and student’s performance datasets: an exploratory study
Abstract
Exploratory data analysis (EDA) is the practice of exploring data to identify underlying patterns before the data are used to build a predictive model. It also plays a major role in the data discovery process, where it is used to analyze data and summarize their main characteristics, which are then displayed effectively with data visualization methods. This paper aims to identify errors in the datasets, to understand their existing hidden structure and identify new structure, to detect points that deviate markedly from the rest of the data (outliers), and to find any relationship or intersection between the variables and constants. Two datasets, ‘Titanic’ and ‘student’s performance’, are used to perform data analysis and data visualization, and thereby to illustrate EDA as an important set of tools for gaining a qualitative understanding of data. Exploring the datasets helped to identify patterns, outliers, and corrupt data, and to discover relationships between the fields in each dataset.
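The checks named in the abstract (data errors, outliers, relationships between fields) can be illustrated with a minimal sketch. It is not the authors' code: the records, the field names `age` and `fare`, the helper names, and the choice of the 1.5 × IQR rule and Pearson correlation are all assumptions made for illustration.

```python
from statistics import mean, quantiles

# Hypothetical toy records; the values are illustrative only and are
# not taken from either dataset used in the paper.
rows = [
    {"age": 22.0, "fare": 7.25},
    {"age": 38.0, "fare": 71.28},
    {"age": None, "fare": 8.05},   # a missing value, one kind of data error
    {"age": 35.0, "fare": 53.10},
    {"age": 28.0, "fare": 512.33},
    {"age": 54.0, "fare": 51.86},
    {"age": 2.0, "fare": 21.08},
    {"age": 27.0, "fare": 11.13},
]


def count_missing(records, field):
    """Count records whose value for `field` is None."""
    return sum(1 for r in records if r[field] is None)


def iqr_outliers(values):
    """Return values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] (assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Use only complete records when relating two fields to each other.
complete = [r for r in rows if r["age"] is not None]
ages = [r["age"] for r in complete]
fares = [r["fare"] for r in complete]

print("missing ages:", count_missing(rows, "age"))
print("fare outliers:", iqr_outliers([r["fare"] for r in rows]))
print("age/fare correlation:", round(pearson(ages, fares), 3))
```

In practice the same three checks are usually followed by plots such as histograms, box plots, and scatter plots, which is where the visualization side of EDA comes in.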
DOI: http://doi.org/10.11591/ijict.v14i1.pp68-76
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
The International Journal of Informatics and Communication Technology (IJ-ICT)
p-ISSN 2252-8776, e-ISSN 2722-2616
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).