CHIC601 Project: Survival Modelling & Analysis
NHS England COVID-19 Hospital Activity & ISARIC Study Excerpt:
It is probably impossible to compare the study's patient population with the NHS England patient population of the same period, for the following reasons:
- The COVID-19 Hospital Activity Data:
  - Does not include a daily record of new COVID-19 patient admissions. Instead, it reports a continuously updated count of admitted COVID-19 patients who were still in hospital during the preceding 24 hours.
  - Does not include in-hospital COVID-19 deaths.
  - Has a different age-group demarcation: [0, 5], [6, 17], [18, 64], [65, 84], 85+.
Therefore, for example, the discharge numbers in the Hospital Activity Data cannot be compared with those of the study, because the baseline numbers that the discharges are relative to are unknown.
Important Modelling & Analysis Points:
- Internal validation
- External validation
- C Indices
- Calibration plots
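As a reminder of what a C-index measures, the following is a minimal base-R sketch of Harrell's C for right-censored data. The function name `harrell_c` and the toy vectors are illustrative assumptions, not part of the project code; ties in observed time are ignored for simplicity, and in practice an established implementation such as `survival::concordance()` would normally be preferred.

```r
# Harrell's C: among usable pairs, the fraction in which the subject with the
# shorter observed event time also has the higher predicted risk.
# Ties in predicted risk count as half-concordant.
harrell_c <- function(time, event, risk) {
  concordant <- 0
  usable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # A pair is usable when subject i has an observed event before time j
      if (event[i] == 1 && time[i] < time[j]) {
        usable <- usable + 1
        if (risk[i] > risk[j]) {
          concordant <- concordant + 1
        } else if (risk[i] == risk[j]) {
          concordant <- concordant + 0.5
        }
      }
    }
  }
  concordant / usable
}

# Toy example (invented values): risk decreases as survival time increases
time  <- c(2, 4, 6, 8)
event <- c(1, 1, 0, 1)   # 1 = event observed, 0 = censored
risk  <- c(0.9, 0.7, 0.4, 0.1)
c_perfect  <- harrell_c(time, event, risk)
c_reversed <- harrell_c(time, event, rev(risk))
```

A value of 1 indicates perfectly concordant rankings, 0.5 is no better than chance, and 0 indicates perfectly reversed rankings.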
Excluding fields:
# Either of the following drops the outcome column
library(dplyr)  # provides %>% and select()
data %>% dplyr::select(!outcome)
data[, !(names(data) %in% 'outcome')]
Hmisc::naclus():
# The diagonal holds the fraction of instances in which variable i is NA
# The off-diagonal holds the fraction of instances in which variables i and j are both NA
# colSums(is.na(data)) / nrow(data)
# sum(is.na(data$asthma) & is.na(data$pulmonary)) / nrow(data)
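The two commented expressions above can be generalised to the full matrix of missingness fractions in base R. This is a sketch on an invented toy data frame (the column names `asthma`, `pulmonary` and `renal` follow these notes, the values do not), intended only to show what the diagonal and off-diagonal entries mean; it is not a reproduction of the `Hmisc::naclus()` internals.

```r
# Toy data frame with invented values
toy <- data.frame(
  asthma    = c("yes", NA, "no", NA),
  pulmonary = c("no",  NA, NA,  "yes"),
  renal     = c("no", "no", "yes", "no")
)

# Numeric indicator matrix: 1 where a value is NA, 0 otherwise
missing <- is.na(toy) * 1

# t(missing) %*% missing counts, for each pair of variables, the rows in which
# both are NA; dividing by the number of rows turns counts into fractions.
joint <- crossprod(missing) / nrow(toy)
```

Here `joint["asthma", "asthma"]` is the fraction of rows where `asthma` is missing, and `joint["asthma", "pulmonary"]` is the fraction where both are missing.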
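library(dplyr)        # %>% and mutate()
library(matrixStats)  # rowMedians() is assumed to come from matrixStats
# Lower and upper bounds of ten-year bands, plus the midpoint m of each band
estimates <- data.frame(lower = seq(30, 90, 10), upper = seq(39, 99, 10))
estimates <- estimates %>%
  mutate(m = rowMedians(cbind(lower, upper)))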
ggplot2 & subplots
Use library(patchwork); cf. facet_wrap().
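A short sketch of the difference, using the built-in `mtcars` data as a stand-in (an assumption; the project data is not used here): patchwork composes separately built plots, whereas `facet_wrap()` splits a single plot by the levels of a variable.

```r
library(ggplot2)
library(patchwork)

# Two independent plots built from the stand-in data
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(hp, mpg)) + geom_point()

combined <- p1 + p2   # patchwork: place the plots side by side
stacked  <- p1 / p2   # patchwork: place one plot above the other

# facet_wrap(): one plot, one panel per level of cyl
faceted <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_wrap(~ cyl)
```

patchwork suits panels that show different variables or geoms; `facet_wrap()` suits repeating the same plot across groups.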
| variable | elements | frequencies |
|---|---|---|
| sex | male | 28116 |
| | female | 21789 |
| | not specified | 91 |
| | NA | 4 |
| asthma | no | 40458 |
| | yes | 6393 |
| | unknown | 2739 |
| | NA | 410 |
| liver_mild | no | 45550 |
| | unknown | 3327 |
| | yes | 712 |
| | NA | 411 |
| renal | no | 38210 |
| | yes | 8623 |
| | unknown | 2767 |
| | NA | 400 |

| variable | elements | frequencies |
|---|---|---|
| pulmonary | no | 38255 |
| | yes | 8710 |
| | unknown | 2628 |
| | NA | 407 |
| neurological | no | 40381 |
| | yes | 6227 |
| | unknown | 2994 |
| | NA | 398 |
| liver_mod_severe | no | 45430 |
| | unknown | 3195 |
| | yes | 961 |
| | NA | 414 |
| malignant_neoplasm | no | 45430 |
| | unknown | 3195 |
| | yes | 961 |
| | NA | 414 |

| variable | elements | frequencies |
|---|---|---|
| outcome | Discharged alive | 29097 |
| | Death | 15233 |
| | Transferred | 2941 |
| | Remains in hospital | 1434 |
| | Palliative discharge | 941 |
| | NA | 297 |
| | Unknown | 57 |
| outcome_date | ... | 49039 |
| | NA | 961 |
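
The notes do not say how these frequency tables were produced. One plausible base-R approach, shown here on an invented toy column, is `table()` with `useNA = "ifany"` so that missing values are counted as their own element:

```r
# Toy column with invented values; the real data is not reproduced here
toy <- data.frame(sex = c("male", "female", "male", NA, "not specified", "male"))

# Count each distinct element, keeping NA as a category, most frequent first
counts <- sort(table(toy$sex, useNA = "ifany"), decreasing = TRUE)

# Arrange the counts in the variable | elements | frequencies layout
freq <- data.frame(
  variable    = "sex",
  elements    = names(counts),
  frequencies = as.integer(counts)
)
```

Because NA is kept as a category, the frequencies for a variable sum to the number of rows, which is a quick consistency check on tables like those above.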
- Edit the help file skeletons in 'man', possibly combining help files for multiple functions.
- Edit the exports in 'NAMESPACE', and add necessary imports.
- Put any C/C++/Fortran code in 'src'.
- If you have compiled code, add a useDynLib() directive to 'NAMESPACE'.
- Run R CMD build to build the package tarball.
- Run R CMD check to check the package tarball.
Read "Writing R Extensions" for more information.
# https://cran.r-project.org/web/packages/devtools/index.html
# https://cran.r-project.org/bin/windows/Rtools/rtools40.html
# https://github.com/binderh/CoxBoost
library(devtools)
install_github(repo = 'binderh/CoxBoost')
- warehouse/training/models/boosted