Load Libraries:
- caret
- data.table
- dplyr
- sets
- scales
- tidyr
- stringr
- Load patient data from a CSV file.
- Load the AHRQ SDOH data from CSV files, one per year (select your desired geographic level and years from the AHRQ SDOH Database).
- Load a CSV file containing feature names for AHRQ SDOH variables.
- Merge AHRQ data from multiple years into a single data frame.
- Pad ZIP codes and STATEFIPS codes with leading zeros for consistency.
- Optionally perform imputation for missing values.
- Merge the preprocessed AHRQ data with the patient data on the crosswalk variables STATEFIPS, ZIPCODE, and YEAR.
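
The steps above can be sketched in R roughly as follows. This is a minimal illustration, not the project's actual script: the inline CSV text stands in for real files, and the column names PATIENT_ID and SDOH_VAR_A, the helper functions pad_codes() and impute_median(), and the choice of median imputation are all assumptions made for the example. Only dplyr and stringr are needed here.

```r
library(dplyr)
library(stringr)

# Inline CSV text stands in for the real files; in practice these would be
# read.csv() or data.table::fread() calls on your own paths.
# SDOH_VAR_A is a hypothetical placeholder for an AHRQ SDOH variable.
ahrq_2019 <- read.csv(text = "STATEFIPS,ZIPCODE,YEAR,SDOH_VAR_A
1,35004,2019,8.0
25,2138,2019,NA
25,2139,2019,9.0")

ahrq_2020 <- read.csv(text = "STATEFIPS,ZIPCODE,YEAR,SDOH_VAR_A
1,35004,2020,10.5
25,2138,2020,12.0")

# PATIENT_ID is a hypothetical identifier column.
patients <- read.csv(text = "PATIENT_ID,STATEFIPS,ZIPCODE,YEAR
p1,25,2138,2019
p2,1,35004,2020")

# Hypothetical helper: reading CSVs drops leading zeros, so restore them
# (2-digit state FIPS, 5-digit ZIP) and store the codes as character.
pad_codes <- function(df) {
  df %>%
    mutate(
      STATEFIPS = str_pad(as.character(STATEFIPS), width = 2, pad = "0"),
      ZIPCODE   = str_pad(as.character(ZIPCODE), width = 5, pad = "0")
    )
}

# Hypothetical helper: optional median imputation for numeric columns,
# leaving the YEAR key untouched. Other imputation methods may suit better.
impute_median <- function(df) {
  df %>%
    mutate(across(
      where(is.numeric) & !any_of("YEAR"),
      ~ ifelse(is.na(.x), median(.x, na.rm = TRUE), .x)
    ))
}

# Stack the yearly AHRQ files, pad the codes, then impute.
ahrq <- bind_rows(ahrq_2019, ahrq_2020) %>%
  pad_codes() %>%
  impute_median()

# Attach the SDOH variables to each patient row via the crosswalk keys.
merged <- left_join(
  pad_codes(patients), ahrq,
  by = c("STATEFIPS", "ZIPCODE", "YEAR")
)

print(merged)
```

A left join keeps every patient row even when no AHRQ record matches, which makes unmatched crosswalk keys easy to spot as NA values after the merge.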