# Human Judgment Versus Actuarial Approaches to Prediction {#sec-judgmentVsActuarial}
## Getting Started {#sec-judgmentVsActuarialGettingStarted}
### Load Packages {#sec-judgmentVsActuarialLoadPackages}
```{r}
```
::: {.content-visible when-format="html"}
## Best Actuarial Approaches to Prediction
The best actuarial models tend to be relatively simple (parsimonious), to account for one or several of the most important predictors and their optimal weightings, and to account for the base rate of the phenomenon.
Even unit-weighted formulas (formulas whose [predictor variables](#sec-correlationalStudy) are equally weighted with a weight of one) can sometimes generalize better to other samples than complex weightings [@Garb2019].
Differential weightings sometimes capture random variance and [over-fit](#sec-overfitting) the model, thus leading to predictive accuracy shrinkage in cross-validation samples [@Garb2019], as described below.
The choice of [predictor variables](#sec-correlationalStudy) often matters more than their weighting.
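As a rough illustration of the comparison between unit weights and differential weights, the sketch below simulates data and compares the two approaches in a new sample. Everything in it is an assumption made for illustration: the data are simulated, the variable names (`x1`, `x2`, `x3`, `y`), the sample sizes, and the effect sizes are hypothetical, and accuracy is summarized as the correlation between predicted and observed values. It is not a reproduction of any analysis in @Garb2019.

```{r}
# Simulated illustration; all values below are assumptions, not real data
set.seed(52242)

simulateData <- function(n) {
  x1 <- rnorm(n)
  x2 <- rnorm(n)
  x3 <- rnorm(n)
  # Each predictor contributes modestly and about equally to the outcome
  y <- 0.3 * x1 + 0.3 * x2 + 0.3 * x3 + rnorm(n)
  data.frame(y, x1, x2, x3)
}

trainData <- simulateData(30)   # small derivation sample
testData <- simulateData(1000)  # large new sample

# Differentially weighted model: weights estimated by least squares
weightedModel <- lm(y ~ x1 + x2 + x3, data = trainData)
weightedPrediction <- predict(weightedModel, newdata = testData)

# Unit-weighted composite: standardize each predictor, then weight each by one
unitPrediction <- rowSums(scale(testData[, c("x1", "x2", "x3")]))

# Correlation between predicted and observed outcomes in the new sample
accuracy <- c(
  differentialWeights = cor(weightedPrediction, testData$y),
  unitWeights = cor(unitPrediction, testData$y)
)

accuracy
```

Changing the derivation sample size or the true weights in `simulateData()` shows the conditions under which estimated weights do or do not outperform unit weights in the new sample.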
Estimates often shrink from a training data set to a test data set.
*Shrinkage* is when variables with stronger predictive power in the original data set tend to show somewhat smaller predictive power (smaller regression coefficients) when applied to new groups.
Shrinkage reflects model [over-fitting](#sec-overfitting) (i.e., fitting to error by capitalizing on chance).
Shrinkage is especially likely when the original sample is small and/or unrepresentative and the number of variables considered for inclusion is large.
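A minimal sketch of shrinkage, assuming simulated data: a regression model with many candidate predictors is fit in a small training sample and then applied to a large test sample. The sample sizes, the number of predictors, the effect sizes, and the object names (`trainSample`, `testSample`, `fullModel`) are hypothetical choices for illustration. In a setup like this, the training $R^2$ would typically be expected to exceed the test $R^2$.

```{r}
# Simulated illustration; all values below are assumptions, not real data
set.seed(52242)

nTrain <- 40        # small training sample
nTest <- 1000       # large test sample
nPredictors <- 15   # many candidate predictors relative to nTrain

simulateSample <- function(n, p) {
  predictors <- as.data.frame(matrix(rnorm(n * p), nrow = n))
  names(predictors) <- paste0("x", seq_len(p))
  # Only the first two predictors are truly related to the outcome
  predictors$y <- 0.4 * predictors$x1 + 0.4 * predictors$x2 + rnorm(n)
  predictors
}

trainSample <- simulateSample(nTrain, nPredictors)
testSample <- simulateSample(nTest, nPredictors)

# Fit a model that includes every candidate predictor
fullModel <- lm(y ~ ., data = trainSample)

# R-squared in the sample used to fit the model
rSquaredTrain <- summary(fullModel)$r.squared

# Squared correlation between predicted and observed values in the new sample
rSquaredTest <- cor(predict(fullModel, newdata = testSample), testSample$y)^2

c(
  training = rSquaredTrain,
  test = rSquaredTest,
  shrinkage = rSquaredTrain - rSquaredTest
)
```

Increasing `nTrain` or reducing `nPredictors` should generally reduce the gap between the two values, consistent with the conditions described above.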
Cross-validation with large, representative samples can help evaluate the amount of shrinkage of estimates, particularly for more complex models such as machine learning models [@Ursenbach2019].
Ideally, cross-validation would be conducted with a separate sample (external cross-validation) to see the generalizability of estimates.
However, you can also do internal cross-validation.
For example, you can perform *k*-fold cross-validation, where you:
- split the data set into *k* groups (folds)
- for each unique group:
    - take the group as a hold-out data set (also called a test data set)
    - take the remaining groups as a training data set
    - fit a model on the training data set and evaluate it on the test data set
- after all *k* folds have been used as the test data set and all models have been fit, average the estimates across the models, which presumably yields more robust, generalizable estimates
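The steps above can be sketched in base R as follows. This is only one possible implementation under stated assumptions: the data are simulated, the variable names (`x1`, `x2`, `y`), the choice of *k* = 5, the linear regression model, and the accuracy index (squared correlation between predicted and observed values in each hold-out fold) are all hypothetical choices for illustration.

```{r}
# Simulated illustration; all values below are assumptions, not real data
set.seed(52242)

n <- 200
exampleData <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
exampleData$y <- 0.5 * exampleData$x1 + 0.3 * exampleData$x2 + rnorm(n)

k <- 5

# Randomly assign each row to one of k folds of (nearly) equal size
foldAssignment <- sample(rep(seq_len(k), length.out = n))

foldResults <- lapply(seq_len(k), function(fold) {
  # Hold out one fold as the test data set; train on the remaining folds
  testFold <- exampleData[foldAssignment == fold, ]
  trainingFolds <- exampleData[foldAssignment != fold, ]

  foldModel <- lm(y ~ x1 + x2, data = trainingFolds)
  foldPredictions <- predict(foldModel, newdata = testFold)

  list(
    coefficients = coef(foldModel),
    rSquared = cor(foldPredictions, testFold$y)^2
  )
})

# Average the coefficient estimates and the hold-out accuracy across folds
meanCoefficients <- rowMeans(
  sapply(foldResults, function(result) result$coefficients)
)
meanRSquared <- mean(
  sapply(foldResults, function(result) result$rSquared)
)

meanCoefficients
meanRSquared
```

Because each observation serves as test data exactly once, `meanRSquared` summarizes accuracy only on data that were not used to fit the corresponding model.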
An emerging technique that holds promise for increasing predictive accuracy of actuarial methods is machine learning [@Garb2019].
However, one challenge of some machine learning techniques is that they are like a "black box" and are not transparent, which raises ethical concerns [@Garb2019].
Machine learning may be most valuable when the data available are complex and there are many [predictor variables](#sec-correlationalStudy) [@Garb2019].
## Session Info {#sec-judgmentVsActuarialSessionInfo}
```{r}
sessionInfo()
```
:::