Predicting angiographic heart disease in patients using Machine Learning techniques: Feature Extraction, Supervised and Unsupervised Learning
Determining whether a patient has a disease is a skill that society has always needed and that, until recently, could only be carried out, meticulously, by doctors with extensive training and experience.
Our aim is to exploit the computational power now available by applying Machine Learning techniques to patient data, in order to detect accurately and rapidly whether patients are suffering from a disease.
For this project, we focus on detecting the presence of angiographic heart disease using a dataset provided by UCI and Kaggle. The dataset was created by Andras Janosi, M.D., William Steinbrunn, M.D., Matthias Pfisterer, M.D. and Robert Detrano, M.D.
First, we analysed the dataset using several data visualization and feature extraction methods, of which principal component analysis (PCA) proved the most useful for our project.
Next, we trained several types of Supervised Learning models on the Heart Disease data (Neural Networks, Decision Trees and Logistic Regression) and evaluated their performance and characteristics, using baseline models for comparison.
Lastly, we investigated the grouping of patient readings and anomaly detection using the Unsupervised Learning methods of density estimation and clustering, and used association mining to find frequently occurring patterns in patients' data that indicate disease with high confidence.
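
To make the association mining idea concrete, the following is a minimal sketch in plain Python of how rules of the form "attributes -> disease" can be scored by support and confidence. It is an illustration only and not the implementation used in this project: the patient records, the attribute names (`chest_pain`, `high_chol`, `exercise_angina`), the helper names and the thresholds are all hypothetical and are not taken from the dataset.

```python
from itertools import combinations

# Hypothetical toy records: each patient is a set of binarized attributes.
# These values are invented for illustration and are NOT from the dataset.
patients = [
    {"chest_pain", "high_chol", "disease"},
    {"chest_pain", "high_chol", "disease"},
    {"chest_pain", "disease"},
    {"high_chol"},
    {"chest_pain", "high_chol"},
    {"exercise_angina", "disease"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= record for record in records) / len(records)


def disease_rules(records, target="disease", min_support=0.3,
                  min_confidence=0.6, max_len=2):
    """Enumerate rules `antecedent -> target` that meet both thresholds.

    Brute-force enumeration; fine for a toy example, whereas a real
    analysis would use a pruning algorithm such as Apriori.
    """
    items = sorted(set().union(*records) - {target})
    rules = []
    for size in range(1, max_len + 1):
        for antecedent in combinations(items, size):
            antecedent_support = support(antecedent, records)
            rule_support = support(set(antecedent) | {target}, records)
            if antecedent_support == 0 or rule_support < min_support:
                continue
            # Confidence: how often the target appears when the antecedent does.
            confidence = rule_support / antecedent_support
            if confidence >= min_confidence:
                rules.append((antecedent, rule_support, confidence))
    # Most confident rules first, ties broken by higher support.
    return sorted(rules, key=lambda rule: (-rule[2], -rule[1]))


if __name__ == "__main__":
    for antecedent, rule_support, confidence in disease_rules(patients):
        print(antecedent, "-> disease",
              f"support={rule_support:.2f}", f"confidence={confidence:.2f}")
```

In this sketch, support measures how frequently a pattern occurs together with the disease label, while confidence measures how reliably the pattern implies it; a "frequently occurring, disease-confident" pattern is one that passes both thresholds.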