Skip to content

Latest commit

 

History

History

cap10

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Chapter 10 Some Robust Data Analyses


Atkinson,A.C., Riani,M., Corbellini,A., Perrotta D., and Todorov,V. (2024), "Applied Robust Statistics through the Monitoring Approach: Applications in Regression", Heidelberg: Springer Nature.

Abstract

In this last chapter, we exhibit the power of the methods described earlier, by analysing five datasets. We start in section 10.2 with the two sets of income data from section 1.4. Without explanatory variables we found the log transformation for the former, while the analysis of the latter remained inconclusive. When explanatory variables are included, and outliers deleted, the square-root transformation is indicated for both. In section 10.4 we analyse 1711 responses to a survey on customer loyalty, in which there are six explanatory variables. Parametric methods lead to $\sqrt{y}$ as the response, the identification of 41 outliers and a skewed distribution of residuals. RAVAS followed by the FS provides a good approximation to normally distributed errors, when only nine observations are deleted. Despite transformation and outlier detection, the $t$-statistics for the significance of the variables hardly change. Accordingly, in section 10.5 we modify 25 observations: monitoring plots reveal the outliers and the results of the RAVAS analysis are close to those for the uncontaminated data. Finally, we analyse the NCI-60 cancer cell data (Chapter 9). With only seven explanatory variables, we monitor LS diagnostics, detect outliers and apply RAVAS, which gives the best fitting model. The generalized candlestick plot provides further model selection.

Code to reproduce Figures and Tables in this Chapter

FileName Description Open in MATLAB on line Jupiter notebook
CustomerLoyalty.m Customer Loyalty data
This file creates Figures 10.22-10.36 and Tables 10.5-10.6. Figures 10.22 and 10.24 are created in a non interactive way. In order to create them interactively, please use file CustomerLoyaltyInteractive.m
Open in MATLAB Online [ipynb]
CustomerLoyaltyInteractive.m This file is referred to dataset Customer Loyalty
This is just the interactive part It creates Figures 10.22 and 10.24 The file which creates all the Figures and for this dataset is called CustomerLoyalty.m
Open in MATLAB Online [ipynb]
CustomerLoyaltyModified.m Modified Customer Loyalty data.
This file creates Figures 10.37-10.46 and Table 10.7.
Open in MATLAB Online [ipynb]
CustomerLoyaltyModifiedInteractive.m Modified Customer Loyalty dataset (interactive part).
It creates Figures 10.37 ---- 10.46 and Table 10.7
Open in MATLAB Online [ipynb]
Income1Regression.m Regression analysis of dataset Income1.
This file creates Figures 10.1-10.11 and Table 10.1
Open in MATLAB Online [ipynb]
Income2regression.m Regression analysis of dataset Income2.
This file creates Figures 10.12-10.21 and Tables 10.2-10.4.
Open in MATLAB Online [ipynb]
NCI60.m NCI 60 Cancer Cell Panel Data.
This file creates Figures 10.47-10.62 and Tables 10.8-10.10.
Open in MATLAB Online [ipynb]