A Case Study on Diagnostics of a Medical Data Set to Detect Outliers in Systolic Blood Pressure Using a Machine Learning Approach
Abstract
In regression analysis, an outlier is an observation whose residual is large relative to those of the other observations in the data set. Identifying outliers and influential points is an essential part of regression analysis and other machine-learning-based methods. We apply outlier detection methods to find and remove aberrant values from the data. In this study, we use simple linear regression models to find outliers in a medical data set. First, we analyze the presence of outliers using ordinary residuals and standardized residuals. Then, instead of using predicted values, we employ a novel approach based on standardized scores to detect outliers. The performance of the new approach is validated on real-world data. According to Chatterjee and Hadi, ordinary residuals are not suitable for diagnostic purposes; a modified version is preferred.
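The residual-based diagnostics mentioned above can be illustrated with a minimal sketch. It covers only the textbook ordinary and (internally) standardized residuals of a simple linear regression, not the standardized-score approach proposed in this paper. The function names, the predictor (age), the synthetic values, and the cutoff of 2 are all assumptions made for illustration; none of them come from the study's data.

```python
import numpy as np


def standardized_residuals(x, y):
    """Fit y = b0 + b1*x by least squares and return (ordinary, standardized) residuals.

    Standardized residual: r_i = e_i / (sigma_hat * sqrt(1 - h_ii)),
    where h_ii is the leverage of observation i.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size

    # Design matrix with an intercept column.
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Ordinary residuals: observed minus fitted values.
    resid = y - X @ beta

    # Leverage for simple linear regression: 1/n + (x_i - mean)^2 / Sxx.
    centered = x - x.mean()
    leverage = 1.0 / n + centered**2 / np.sum(centered**2)

    # Residual standard error with n - 2 degrees of freedom.
    sigma_hat = np.sqrt(np.sum(resid**2) / (n - 2))

    std_resid = resid / (sigma_hat * np.sqrt(1.0 - leverage))
    return resid, std_resid


def flag_outliers(std_resid, threshold=2.0):
    """Return indices whose |standardized residual| exceeds the (assumed) cutoff."""
    return np.flatnonzero(np.abs(std_resid) > threshold)


# Hypothetical data: age in years vs. systolic blood pressure in mmHg.
age = np.arange(30, 80, 5)
sbp = 100.0 + 0.5 * age
sbp[4] += 40.0  # inject one aberrant reading

resid, std_resid = standardized_residuals(age, sbp)
print(flag_outliers(std_resid))
```

The cutoff of 2 is a common rule of thumb rather than a value taken from this paper; a cutoff of 3 is also widely used and flags fewer points.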