Forecasting Runoff Triangles
Piet de Jong*
ABSTRACT
This paper deals with the methodology of liability forecasting using runoff triangle data. Techniques are based on time series models and methods that facilitate the calculation of forecast distributions and the assessment of model fit. The models deal with correlation within triangles. Correlations are critical to proper reserving. The output of the methodology is the complete shape of the liability distribution. Methods are applied to a well-known runoff triangle and results compared to those from previous studies.
1. INTRODUCTION
Claims or loss reserving in casualty insurance is often based on runoff triangles. An example runoff triangle is displayed in Table 1. Outstanding liabilities correspond to the lower unfilled portion of the rectangle, which has to be forecast. This article deals with models and methods for performing the prediction. The models proposed in this article are extensions of Hertig (1985), and the methods are based on modern time series forecasting. There is an extensive literature on runoff triangle analysis: see, for example, the bibliography in England and Verrall (2002) or the book by Taylor (2000). Articles of particular interest include Zehnwirth (1985), Wright (1990), Mack (1993, 1994b), Verrall (1990), Goovaerts and Redant (1999), and Barnett and Zehnwirth (2000). References that take a time series approach include De Jong and Zehnwirth (1983), Kremer (1984), and Verrall (1989a, b). The approach in this article departs from the previous literature in that it implements difference equation time series models as discussed in, for example, Harvey (1989). These models are stated in terms of levels, trends, and correlations. This compares with the regression methods (such as generalized linear modeling; Taylor 2000) that forecast by extrapolating fitted surfaces in terms of explanatory variables. Extrapolations from regression fits (as opposed to difference equation fits) often deliver inferior time series forecasts and forecast intervals. There are many advantages to a difference equation approach. First, there is a rich and widely studied range of practical models. Second, there is no need to rethink or reinvent optimal forecasting formulas and algorithms. Third, diagnostics are readily available. Fourth, the actuary is free to tackle actuarially relevant tasks including the selection of appropriate models and uncovering and dealing with features such as correlations between accident and calendar years or between different triangles.
Thus the actuary need not spend time on estimation, forecasting, and diagnostic issues that have already been resolved. The layout of this article is as follows. The next section introduces notation and summarizes the proposed approach. Section 3 sets out Hertig's model and generalizations that are the modeling basis of the present approach. Section 4 discusses the fitting and assessment of the proposed models and forecasting implementation. Section 5 deals with a case study involving the development correlation model. An appendix discusses state space forms.
* Piet de Jong, BEc, PhD, is a Professor in the Department of Actuarial Studies, Macquarie University, New South Wales 2109, Australia, [email protected].
2. LOG-LINK RATIOS
A runoff triangle contains cumulative liabilities with respect to accident years $i$ and development years $j$. Entries are denoted $c_{ij}$. An example triangle displayed in Table 1 relates to Automatic Facultative General Liability (excluding asbestos and environmental) from the Historical Loss Development study. This triangle, called the AFG data, was considered by both Mack (1994a) and England and Verrall (2002) and is used here to illustrate methods and compare the present paper's methods with those in previous studies. Entries in each row generally increase with $j$, indicating that as time progresses, incurred liabilities with respect to an accident year increase. Each calendar year leads to an oblique diagonal of observations. In Table 1 there are $n = 10$ calendar years. Following Hertig (1985), the models of this article are stated in terms of the log-link ratios
$$\delta_{ij} \equiv \ln\left(\frac{c_{ij}}{c_{i,j-1}}\right), \quad i = 1, \ldots, n, \quad j = 1, \ldots, n-1, \qquad (2.1)$$
with $\delta_{i0} \equiv \ln(c_{i0})$. Thus $\delta_{ij}$, $j > 0$, is the continuously compounded growth in accident year $i$'s cumulative in development year $j$. In terms of the log-link ratios, the future rate of growth in cumulatives along any row of the runoff triangle through to development year $n-1$ is
$$g_i \equiv \ln\left(\frac{c_{i,n-1}}{c_{i,n-i}}\right) = \delta_{i,n-i+1} + \cdots + \delta_{i,n-1}, \quad i = 2, \ldots, n. \qquad (2.2)$$
Further, $c_{i,n-1} - c_{i,n-i} = c_{i,n-i}(e^{g_i} - 1)$ is the further increase in the liability with respect to accident year $i$. The total future liability with respect to all accident years and through to development year $n-1$ is
$$\sum_{i=2}^{n} (c_{i,n-1} - c_{i,n-i}) = \sum_{i=2}^{n} c_{i,n-i}\,(e^{g_i} - 1). \qquad (2.3)$$
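The definitions (2.1) to (2.3) can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions rather than the article's implementation: the function names are hypothetical, the small triangle is simply the last three accident years of Table 1 treated as a 3 by 3 triangle of its own, and the growth rates passed to the liability formula are made up.

```python
import numpy as np

nan = np.nan
# Last three accident years of Table 1, treated as a small 3 x 3 triangle
# purely for illustration; NaN marks cells that are not yet observed.
c = np.array([
    [1351.0, 6947.0, 13112.0],
    [3133.0, 5395.0, nan],
    [2063.0, nan, nan],
])


def log_link_ratios(c):
    """Column 0 holds ln(c_i0); column j >= 1 holds ln(c_ij / c_i,j-1)."""
    logc = np.log(c)
    return np.column_stack([logc[:, 0], np.diff(logc, axis=1)])


def latest_cumulatives(c):
    """Latest observed cumulative of each accident year (the last diagonal)."""
    n = c.shape[0]
    return np.array([c[r, n - 1 - r] for r in range(n)])


def total_liability(c, g):
    """Equation (2.3); g holds the growth rates for accident years 2, ..., n."""
    latest = latest_cumulatives(c)[1:]  # accident year 1 is fully developed
    return float(np.sum(latest * np.expm1(g)))


delta = log_link_ratios(c)
# A hypothetical growth forecast of ln 2 for both open accident years
# doubles each latest cumulative, so the liability is 5395 + 2063.
liability = total_liability(c, np.log([2.0, 2.0]))
```

Summing a fully observed row of `delta` recovers the log of its last cumulative, which is a convenient check that the ratios were formed correctly.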
Forecasting the liability (2.3) requires forecasts of $g \equiv (g_2, \ldots, g_n)'$ or, from expression (2.2), the future log-link ratios $\delta_{ij}$, $i + j > n$. The approach to forecasting the liability (2.3) advocated in this article consists of the following steps:
1. Modeling and fitting. The log-link ratios $\delta_{ij}$ are modeled, and the chosen model is fitted to the runoff triangle data. Models described in this article are time series models, while fitting is on the basis of maximum likelihood with the Kalman filter used to evaluate the likelihood.
2. Model assessment. The model is assessed using diagnostics. Assessment and diagnostics may suggest areas of model inadequacy and appropriate extensions. Assessment is important for minimizing the chance of model error. Diagnostics are generated using the Kalman filter (De Jong and Penzer 1998). Barnett and Zehnwirth (2000) have stressed the importance of model assessment generally.
Table 1 AFG Data: Cumulative Incurred Claim Amounts

Accident                                Development Year j
Year i        0       1       2       3       4       5       6       7       8       9
   1      5,012   8,269  10,907  11,805  13,539  16,181  18,009  18,608  18,662  18,834
   2        106   4,285   5,396  10,666  13,782  15,599  15,496  16,169  16,704
   3      3,410   8,992  13,873  16,141  18,735  22,214  22,863  23,466
   4      5,655  11,555  15,766  21,266  23,425  26,083  27,067
   5      1,092   9,565  15,836  22,169  25,955  26,180
   6      1,513   6,445  11,702  12,935  15,852
   7        557   4,020  10,946  12,314
   8      1,351   6,947  13,112
   9      3,133   5,395
  10      2,063
3. Forecast future log-link ratios. Future log-link ratios $\delta_{ij}$, $i + j > n$, are forecast using the Kalman filter based on the fitted model. Forecast error variances and covariances also are derived.
4. Derivation of future accident year growth rates. The $\hat{g}_i \equiv \hat{\delta}_{i,n-i+1} + \cdots + \hat{\delta}_{i,n-1}$, $i = 2, \ldots, n$, are derived, as well as the associated covariance matrix $\Sigma$.
5. Simulation of the liability distribution. Repeated draws are made from the multivariate distribution with means $\hat{g}_i$ and covariance matrix $\Sigma$. Each draw is combined as in equation (2.3) to derive the simulated forecast liability distribution. The simulated liability distribution incorporates both process and estimation error and provides estimated percentiles of the liability distribution.
Details of each step are described in subsequent sections. All the steps have been implemented in an Excel environment, superimposed on the detailed computer algorithms. Figure 1 displays the estimated liability distribution derived from the AFG data using the above steps. The estimated liability distribution forms the basis for inferences about the mean, standard deviation, and percentiles. For example, recent Australian legislation has mandated the 75th percentile as an appropriate level of reserves.
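Step 5 can be sketched as follows. The block assumes normality of the growth rates, as the article does later when discussing simulation, and uses made-up values for the latest cumulatives, the forecast growth rates, and their covariance matrix; the function name `simulate_liability` is hypothetical and the code is not the Excel implementation mentioned above.

```python
import numpy as np


def simulate_liability(latest, g_hat, cov, n_draws=10000, seed=0):
    """Draw g from N(g_hat, cov) and combine each draw as in equation (2.3)."""
    rng = np.random.default_rng(seed)
    g = rng.multivariate_normal(g_hat, cov, size=n_draws)  # (n_draws, n - 1)
    return (latest * np.expm1(g)).sum(axis=1)


# Hypothetical inputs for two open accident years (not taken from the article).
latest = np.array([5395.0, 2063.0])   # latest observed cumulatives
g_hat = np.array([0.9, 2.4])          # forecast growth rates
cov = np.array([[0.04, 0.01],
                [0.01, 0.25]])        # forecast error covariance matrix

draws = simulate_liability(latest, g_hat, cov)
mean, sd = draws.mean(), draws.std()
p75 = np.percentile(draws, 75)  # for example, a reserve set at the 75th percentile
```

Because each draw is pushed through the exponential in (2.3), the simulated distribution is skewed to the right even though the growth rates themselves are drawn from a symmetric distribution.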
3. HERTIG'S MODEL AND EXTENSIONS
Hertig (1985) introduced a simple yet useful model for the growth rates $\delta_{ij}$. The model states (see also Murphy 1993 and Taylor 2000) that the growth rates are uncorrelated with means and variances depending only on the development year:
$$\delta_{ij} = \mu_j + h_j\epsilon_{ij}, \quad i = 1, \ldots, n, \quad j = 0, \ldots, n-1, \qquad (3.1)$$
where $\epsilon_{ij} \sim (0, \sigma^2)$ and $h_0 \equiv 1$. Thus the mean and standard deviation of $\delta_{ij}$ are $\mu_j$ and $h_j\sigma$, respectively, where $\sigma$ is the standard deviation of $\delta_{i0}$. A seemingly minor addition to Hertig (1985) is the inclusion in the model of $\delta_{i0} \equiv \ln(c_{i0})$. This addition turns out to be practically important. Despite this addition we will call equation (3.1) Hertig's model. Trends in claims over accident years are allowed for with Hertig's model since each accident year's development starts off from the relevant $c_{i0}$. Hence a high or low value in $c_{i0}$ automatically shifts up or down the subsequent development profile for that accident year. Thus the assumption that the $\delta_{i0}$ all have the same mean $\mu_0$ is of no import from the forecasting point of view since the forecast liability for each accident year takes off from $c_{i,n-i}$, the latest observed cumulative for that accident year.
Figure 1 Estimated Histogram of Forecast Incurred Liabilities for AFG Data (horizontal axis: 0 to 200, in thousands of dollars)
A key feature of Hertig's model is that the growth rates $\delta_{ij}$ are uncorrelated across both accident years $i$ and development years $j$. The next three subsections relax this assumption.
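The development correlation model is
$$\delta_{ij} = \mu_j + h_j(\epsilon_{ij} + \theta_j\epsilon_{i,j-1}), \quad j = 1, \ldots, n-1, \quad i = 1, \ldots, n. \qquad (3.2)$$
Thus the correlation between $\delta_{i0}$ and $\delta_{i1}$ is $\theta_1/\sqrt{1+\theta_1^2}$, and the variance of $\delta_{i1}$ is $h_1^2\sigma^2(1+\theta_1^2)$.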
Similar expressions apply for higher-order correlations. Case studies, such as in Section 5, suggest that development correlation between $\delta_{i0}$ and $\delta_{i1}$ is often important, but higher-order development year correlations, such as between $\delta_{i1}$ and $\delta_{i2}$, often can be ignored.
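The link between the correlation of $\delta_{i0}$ and $\delta_{i1}$ and the moving average coefficient can be made concrete with a short sketch. It assumes the development correlation model takes the form $\delta_{i0} = \mu_0 + \epsilon_{i0}$ and $\delta_{i1} = \mu_1 + h_1(\epsilon_{i1} + \theta\epsilon_{i0})$, so that the correlation is $\theta/\sqrt{1+\theta^2}$; the function names and the example correlation are hypothetical.

```python
import math


def implied_correlation(theta):
    """Correlation between delta_i0 and delta_i1 implied by the coefficient theta."""
    return theta / math.sqrt(1.0 + theta * theta)


def theta_from_correlation(rho):
    """Invert the relation above: a moment estimate of theta from a correlation."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return rho / math.sqrt(1.0 - rho * rho)


# Hypothetical strongly negative correlation between development years 0 and 1.
rho = -0.9
theta = theta_from_correlation(rho)
```

Under this form any correlation strictly between -1 and 1 is attainable, which matters when the observed development year 0 and 1 correlation is close to -1.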
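The accident year correlation model is
$$\delta_{ij} = \mu_{ij} + h_j\epsilon_{ij}, \qquad \mu_{i+1,j} = \mu_{ij} + \lambda_j\eta_{ij}, \quad i = 1, \ldots, n, \quad j = 0, \ldots, n-1. \qquad (3.3)$$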
This extends Hertig's model (3.1) by allowing the mean $\mu_{ij}$ within any development year to evolve slowly over accident years. Hertig's model (3.1) is the special case where $\lambda_j = 0$ for all $j$, implying each development year's mean is constant over the accident years: $\mu_{i+1,j} = \mu_{ij}$. The accident year correlation model implies that, when forecasting future development ratios, more weight is given to more recent observed ratios as opposed to those in the more remote past. For example, the estimate of the log-link ratios in the upcoming calendar year is a geometrically declining average of the previous ratios falling in the same development year. The rate of decline in each development year is controlled by the signal-to-noise ratio $\lambda_j/h_j$. A ratio near zero implies a very low rate of decline. Weighting loss ratios has two other consequences, both impacting on the variability associated with the forecasts. First, basing the ratio estimates on the more recent evidence implies there is less certainty about them because, in effect, fewer observations are used to make the forecasts. Second, since the ratios evolve over time, evolution is likely to continue into the future, also implying increased uncertainty in the estimates.
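The geometrically declining weights can be illustrated with the steady-state Kalman gain of a random walk plus noise model. This is a sketch under assumptions not spelled out in the article: the start-up of the filter is ignored (only the limiting weights are shown), `q` denotes the squared signal-to-noise ratio, and the function names are hypothetical.

```python
import math


def steady_state_gain(q):
    """Limiting Kalman gain for a random walk level observed with noise.

    q is the ratio of the level disturbance variance to the observation
    noise variance, that is, the squared signal-to-noise ratio.
    """
    # Limiting prediction variance (in units of the noise variance) solves
    # p**2 - q * p - q = 0.
    p = 0.5 * (q + math.sqrt(q * q + 4.0 * q))
    return p / (p + 1.0)


def geometric_weights(q, m):
    """Weights on the m most recent ratios, most recent first."""
    k = steady_state_gain(q)
    return [k * (1.0 - k) ** lag for lag in range(m)]


slow = geometric_weights(0.01, 5)  # small ratio: weights decline slowly
fast = geometric_weights(2.0, 5)   # large ratio: weight concentrates on recent years
```

As the ratio tends to zero the gain tends to zero and the weights flatten out, which is the sense in which a ratio near zero implies a very low rate of decline.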
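The calendar year correlation model is
$$\delta_{ij} = \mu_j + h_j(\tau_{i+j} + \epsilon_{ij}), \qquad \tau_{i+j+1} = \tau_{i+j} + \lambda\eta_{i+j}, \quad j = 1, \ldots, n-1, \quad i = 1, \ldots, n, \qquad (3.4)$$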
where $\epsilon_{ij}$ and $\eta_{i+j}$ are uncorrelated mean zero, variance $\sigma^2$ noise terms. The calendar year effects $\tau_k$ thus are assumed to evolve as a random walk in calendar time, and each $\tau_k$ serves to increase or decrease all the log-link ratios falling in calendar year $k$. The effect of $\tau_k$ on a particular log-link ratio is scaled by $h_j$, and hence the effect is assumed proportional to the standard deviation associated with the log-link ratio. Hertig's model (3.1) results when $\lambda = 0$.
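Each of the above models is a special case of the state space form
$$y_t = X_t\beta + Z_t\alpha_t + G_t\epsilon_t, \qquad \alpha_{t+1} = W_t\beta + T_t\alpha_t + H_t\epsilon_t, \quad t = 1, \ldots, n, \qquad (3.5)$$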
where $y_t$ is the vector of observations at time $t$. In this article $y_t$ is the vector of log-link ratios observed at $t$, $y_t \equiv (\delta_{1,t-1}, \delta_{2,t-2}, \ldots, \delta_{t0})'$, and is diagonal $t$ of the runoff triangle. The Appendix shows how each of the above models can be cast into the state space form and the interpretation accorded to the parameters appearing in the right-hand side of equation (3.5).
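To indicate how a likelihood is evaluated for a model in state space form, the following is a minimal Kalman filter sketch. It is a simplification and not the article's algorithm: the system matrices are assumed constant over time, the regression terms in $\beta$ are omitted, the observation vector has a fixed length (whereas the diagonals of a runoff triangle grow with $t$), and the disturbances are assumed normal with variance `sigma2` times the identity. The function name and the example data are hypothetical.

```python
import numpy as np


def kalman_loglik(y, Z, G, T, H, a1, P1, sigma2=1.0):
    """Gaussian log-likelihood of y_1, ..., y_n for the time-invariant model
    y_t = Z a_t + G e_t,  a_{t+1} = T a_t + H e_t,  e_t ~ N(0, sigma2 * I).
    P1 is the initial state variance in units of sigma2."""
    a = np.asarray(a1, dtype=float)
    P = np.asarray(P1, dtype=float)
    loglik = 0.0
    for yt in np.asarray(y, dtype=float):
        v = yt - Z @ a                                  # one-step prediction error
        F = Z @ P @ Z.T + G @ G.T                       # its variance / sigma2
        K = (T @ P @ Z.T + H @ G.T) @ np.linalg.inv(F)  # Kalman gain
        _, logdet = np.linalg.slogdet(F)
        quad = v @ np.linalg.solve(F, v)
        loglik -= 0.5 * (v.size * np.log(2 * np.pi * sigma2) + logdet + quad / sigma2)
        a = T @ a + K @ v                               # predicted next state
        P = T @ P @ T.T + H @ H.T - K @ F @ K.T         # and its variance / sigma2
    return loglik


# Usage: local level model y_t = a_t + e_1t, a_{t+1} = a_t + 0.5 * e_2t.
Z = np.array([[1.0]])
T = np.array([[1.0]])
G = np.array([[1.0, 0.0]])
H = np.array([[0.0, 0.5]])
y = np.array([[0.3], [0.1], [0.6], [0.4]])  # hypothetical observations
ll = kalman_loglik(y, Z, G, T, H, a1=np.zeros(1), P1=np.eye(1))
```

Maximum likelihood fitting then amounts to passing the negative of this function, viewed as a function of the unknown parameters inside the system matrices, to a numerical optimizer.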
4. FITTING, ASSESSMENT, AND FORECASTING
This section discusses the detailed steps in the fitting, assessment, and forecasting associated with the models described in the previous section.
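For Hertig's model (3.1), the estimates of the $\mu_j$ and $\sigma_j^2 \equiv h_j^2\sigma^2$ are
$$\hat{\mu}_j = \frac{1}{n-j}\sum_{i=1}^{n-j}\delta_{ij}, \qquad \hat{\sigma}_j^2 = \frac{1}{n-j}\sum_{i=1}^{n-j}(\delta_{ij} - \hat{\mu}_j)^2, \quad j = 0, \ldots, n-1. \qquad (4.1)$$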
In turn $\hat{\sigma} = \hat{\sigma}_0$ and $\hat{h}_j = \hat{\sigma}_j/\hat{\sigma}_0$, $j = 0, \ldots, n-1$. These estimates are discussed in Hertig (1985) and Taylor (2000, Section 7.3) and illustrated with respect to the AFG data in Section 5. The correlation models of Sections 3.1, 3.2, and 3.3 can be fitted using likelihood maximization with the Kalman filter employed to evaluate the likelihood. There is no need to tailor estimation formulas or software to the specific extension if the model is cast in the general state space form amenable to the general form of the Kalman filter. The appendix shows how the correlation models can be written in the state space form. The Kalman filter equations are displayed in Anderson and Moore (1979) or Harvey (1989), while associated smoothing and diagnostic algorithms are discussed in De Jong (1989) and De Jong and Penzer (1998). These smoothing and diagnostic algorithms facilitate signal extraction and model assessment.
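The estimates (4.1) are straightforward to compute from the triangle in Table 1. The sketch below assumes the divisor $n - j$ (the number of observed accident years in development year $j$) for both the mean and the variance, which is the choice that reproduces the first entries of the estimate rows of Table 2; the variable names are illustrative only.

```python
import numpy as np

nan = np.nan
# AFG cumulative incurred claims from Table 1; NaN marks unobserved cells.
c = np.array([
    [5012, 8269, 10907, 11805, 13539, 16181, 18009, 18608, 18662, 18834],
    [106, 4285, 5396, 10666, 13782, 15599, 15496, 16169, 16704, nan],
    [3410, 8992, 13873, 16141, 18735, 22214, 22863, 23466, nan, nan],
    [5655, 11555, 15766, 21266, 23425, 26083, 27067, nan, nan, nan],
    [1092, 9565, 15836, 22169, 25955, 26180, nan, nan, nan, nan],
    [1513, 6445, 11702, 12935, 15852, nan, nan, nan, nan, nan],
    [557, 4020, 10946, 12314, nan, nan, nan, nan, nan, nan],
    [1351, 6947, 13112, nan, nan, nan, nan, nan, nan, nan],
    [3133, 5395, nan, nan, nan, nan, nan, nan, nan, nan],
    [2063, nan, nan, nan, nan, nan, nan, nan, nan, nan],
], dtype=float)

# Log-link ratios: column 0 is ln(c_i0), column j >= 1 is ln(c_ij / c_i,j-1).
logc = np.log(c)
delta = np.column_stack([logc[:, 0], np.diff(logc, axis=1)])

mu_hat = np.nanmean(delta, axis=0)     # development year means
sigma_hat = np.nanstd(delta, axis=0)   # default ddof=0, i.e. divisor n - j
h_hat = sigma_hat / sigma_hat[0]       # scale factors relative to year 0
```

With a single observation in the last development year, `sigma_hat[-1]` is zero, so that year contributes no variance estimate of its own.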
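The standardized residuals, or z-scores, are
$$z_{ij} \equiv \frac{\delta_{ij} - \hat{\mu}_j}{\hat{\sigma}_j}, \quad i = 1, \ldots, n, \quad j = 0, \ldots, n-i. \qquad (4.2)$$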
If Hertig's model (3.1) holds, then the z-scores are approximately normally distributed with mean zero and unit standard deviation. A large $|z_{ij}|$ suggests the model does not fit well at the given accident year $i$ and development year $j$. The z-scores can be plotted against development year $j$, accident year $i$, or calendar year $i + j$ to reveal any structure. If Hertig's model is appropriate, there should be no structure. The standardized residuals (4.2) form a basis for assessing correlation in runoff triangles. For example, development correlation between two development years is computed from the $z_{ij}$ corresponding to the two years, with the accident years $i$ forming the cases. Similarly, correlation between accident years is computed by taking the $z_{ij}$ for the two accident years and letting the development years $j$ be the cases. Calendar year correlation can be dealt with similarly. In the development correlation case, the correlation between the $z_{ij}$ is the same as the correlation between the corresponding $\delta_{ij}$. However, this property does not hold for the accident or calendar year correlation computation.
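The development correlation computation described above can be sketched for development years 0 and 1 of the AFG data. The block uses only the first two columns of Table 1 and assumes the standard deviation in (4.2) is computed with divisor $n - j$; the helper name `z_scores` is hypothetical.

```python
import numpy as np

# Development years 0 and 1 of Table 1 (accident year 10 has no year 1 entry).
c0 = np.array([5012, 106, 3410, 5655, 1092, 1513, 557, 1351, 3133, 2063], dtype=float)
c1 = np.array([8269, 4285, 8992, 11555, 9565, 6445, 4020, 6947, 5395], dtype=float)

delta0 = np.log(c0)           # delta_i0 = ln(c_i0)
delta1 = np.log(c1 / c0[:9])  # delta_i1 = ln(c_i1 / c_i0)


def z_scores(delta):
    """Equation (4.2) for one development year; std uses divisor n - j."""
    return (delta - delta.mean()) / delta.std()


z0, z1 = z_scores(delta0), z_scores(delta1)

# Accident years 1 to 9 are the cases for the development correlation.
r01 = np.corrcoef(z0[:9], z1)[0, 1]
```

Because correlation is unchanged by shifting and rescaling each series, `r01` equals the correlation between the underlying log-link ratios for the same nine accident years.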
Residuals also are available for the correlation models. These are calculated with the Kalman filter as discussed in De Jong and Penzer (1998).
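Under Hertig's model (3.1), it follows from (2.2) that the future growth rate $g_i$ has mean and variance
$$\mu_{n-i+1} + \cdots + \mu_{n-1}, \qquad \omega_i^2 \equiv \sigma_{n-i+1}^2 + \cdots + \sigma_{n-1}^2, \quad i = 2, \ldots, n. \qquad (4.3)$$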
In practice the $\mu_j$ and $\sigma_j$ in these expressions are replaced by the estimates $\hat{\mu}_j$ and $\hat{\sigma}_j$ given in equations (4.1). Using the $\hat{\mu}_j$ rather than the $\mu_j$ in equations (4.3) introduces estimation correlation between the
$$\hat{g}_i \equiv \hat{\delta}_{i,n-i+1} + \cdots + \hat{\delta}_{i,n-1} = \hat{\mu}_{n-i+1} + \cdots + \hat{\mu}_{n-1}, \quad i = 2, \ldots, n, \qquad (4.4)$$
since the same estimates $\hat{\mu}_j$ are used in different $\hat{g}_i$. Expressions for the elements of the covariance matrix of $\hat{g} \equiv (\hat{g}_2, \ldots, \hat{g}_n)'$ are given in Taylor (2000, Section 7.3). For the more general correlation models, the mean vector and covariance matrix of $g$, and indeed all the $\delta_{ij}$, can be computed readily with the Kalman filter, again provided the model is in the state space form. In this setting $\hat{g}$ is interpreted as the minimum mean square error linear predictor of $g$, and $\Sigma$ is the associated error covariance matrix. Given $\hat{g}$ and the covariance matrix $\Sigma$, and assuming normality, repeated draws can be made from the $N(\hat{g}, \Sigma)$ distribution, with each draw combined as in equation (2.3) to yield an estimate of the outstanding liability. The convenient distribution to work with is the multivariate normal, although its applicability would have to be assessed. If the normal assumption is appropriate, then it follows (Aitchison and Brown 1957) that the conditional expected value of $c_{i,n-1}$ and associated coefficient of variation are
$$c_{i,n-i}\,e^{\hat{g}_i + \omega_i^2/2}, \qquad \sqrt{e^{\omega_i^2} - 1}, \quad i = 2, \ldots, n,$$
respectively. These expressions in turn lead to the conditional expected value of each accident year's liability $c_{i,n-1} - c_{i,n-i}$ and associated coefficient of variation. In these expressions the $\omega_i^2$ defined in equations (4.3) are assumed known. In practice they are replaced with estimates.
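The lognormal expressions above translate directly into code. The following sketch assumes the growth rate is normal with mean `g_hat` and variance `omega2` (here written $\omega_i^2$); the function name and the numerical inputs are hypothetical.

```python
import math


def lognormal_forecast(latest, g_hat, omega2):
    """Conditional mean of the ultimate cumulative and its coefficient of
    variation when the growth rate is N(g_hat, omega2)."""
    mean_ultimate = latest * math.exp(g_hat + 0.5 * omega2)
    cv = math.sqrt(math.expm1(omega2))
    return mean_ultimate, cv


# Hypothetical accident year: latest cumulative 5,395, forecast growth 0.9,
# growth variance 0.04.
mean_ultimate, cv = lognormal_forecast(5395.0, 0.9, 0.04)
expected_liability = mean_ultimate - 5395.0  # expected further increase
```

The term `0.5 * omega2` in the exponent is the usual lognormal mean correction: forecasting with `exp(g_hat)` alone would understate the expected ultimate whenever the variance is positive.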
Table 2 z-Scores and Hertig's Model Estimates for the AFG Data

Accident                         Development Year j
Year i                  0      1      2      3      4      5      6      7      8      9
   1                 1.04  -1.06  -0.94  -0.87  -0.58   1.00   1.59  -0.15  -1.00   0.00
   2                -2.39   2.27  -1.14   2.17   1.76   0.10  -1.17   1.29   1.00
   3                 0.70  -0.57  -0.28  -0.51  -0.35   0.87  -0.31  -1.14
   4                 1.15  -0.83  -0.80   0.24  -1.38  -0.17  -0.11
   5                -0.31   0.68   0.02   0.42  -0.18  -1.79
   6                -0.02  -0.07   0.42  -0.77   0.72
   7                -0.91   0.48   2.14  -0.68
   8                -0.12   0.12   0.58
   9                 0.62  -1.01
  10                 0.25
$\hat{\mu}_j$        7.35   1.52   0.50   0.25   0.17   0.12   0.04   0.03   0.02   0.01
$\hat{\sigma}_j$     1.12   0.96   0.24   0.20   0.05   0.06   0.04   0.01   0.01
[...]priate. In particular, the top right panel of Figure 2 displays a scatter plot of the z-scores corresponding to development years 0 and 1. The z-scores are almost perfectly negatively correlated: a high value in development year 0 is almost invariably followed by a low value the following development year. Ignoring the correlation between $\delta_{i0}$ and $\delta_{i1}$ can lead to sizable forecast errors. Since the cumulative $c_{i0}$ for accident year $i = 10$ is above the average, it is highly likely that the increment in development year 1 will be below the average. This has major implications for the forecast since a large part of the total liability is with respect to the final accident year $n = 10$. It may be argued that negative correlation always will be present since $\delta_{i0} \equiv \ln(c_{i0})$ and $\delta_{i1} \equiv \ln(c_{i1}/c_{i0})$. However, in our experience not all runoff triangles display the correlation. Further, the existing literature and methods ignore the correlation. For example, the often used chain-ladder approach forecasts $c_{n1}$ by $c_{n0}\left(\sum_i c_{i1}/\sum_i c_{i0}\right)$. This formula does not adjust the forecast according to the relative size of $c_{n0}$. Hertig (1985) does not explicitly model $c_{i0}$, and hence the possibility of correlation is ignored. Taylor (2000), illustrating the use of Hertig's model, also ignores the correlation. A notable exception is Barnett and Zehnwirth (2000), who argue that the chain-ladder method is misleading because it does not allow for an intercept term in the simple regression corresponding to the bottom right panel in Figure 2, which plots $c_{i0}$ versus $c_{i1}$. However, in this plot the correlation is obscured and may be rationalized on the misleading basis that a large value of $c_{i0}$ will lead, other things equal, to a large value for $c_{i1}$.

Figure 2 (panels: mean, log(mu) against development year; std dev, log(sigma) against development year; z-scores, development year 1 value against development year 0 value; cumulatives, development year 1 value against development year 0 value)
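The chain-ladder forecast of $c_{n1}$ discussed above can be reproduced from the first two columns of Table 1. The sketch assumes the sums in the link ratio run over the accident years for which both development years are observed (accident years 1 to 9); variable names are illustrative.

```python
import numpy as np

# Development years 0 and 1 of Table 1.
c0 = np.array([5012, 106, 3410, 5655, 1092, 1513, 557, 1351, 3133, 2063], dtype=float)
c1 = np.array([8269, 4285, 8992, 11555, 9565, 6445, 4020, 6947, 5395], dtype=float)

# Chain-ladder link ratio: total of year 1 cumulatives over total of year 0
# cumulatives, using accident years 1 to 9 only.
link_ratio = c1.sum() / c0[:9].sum()

# Forecast for accident year 10: the same ratio is applied whatever the
# size of c_n0 relative to the other accident years.
c_n1_forecast = c0[9] * link_ratio
```

The forecast is proportional to $c_{n0}$, so an unusually high or low starting value is carried forward at full strength; a model with negative correlation between $\delta_{i0}$ and $\delta_{i1}$ would instead pull the forecast back toward a typical level.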
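The development correlation model (3.2) is specified with
$$\theta_1 = \theta, \qquad \theta_j = 0, \quad j = 2, \ldots, n-1, \qquad \ln\sigma_j = a + b(j-2), \quad j = 2, \ldots, n-1.$$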
Thus the parameters to be fitted are $\sigma$, $h_1$ (equivalently $\sigma_1$), $\theta$, $a$, $b$, and $\mu_0, \ldots, \mu_{n-1}$. When $n = 10$, as in the case of the AFG data, this gives 15 unknown parameters compared to 55 data points. This is possibly an excessive number of unknown parameters compared to data points. The $n = 10$ mean parameters $\mu_0, \ldots, \mu_{n-1}$ can possibly also be structured as linear in the logs, as suggested in the top left panel of Figure 2. Table 3 reports moment- and normal-based maximum likelihood estimates of development correlation model parameters using the AFG data in Table 1. The moment estimates are those where the estimate of $\theta$ is derived from the empirical correlation between $\delta_{i0}$ and $\delta_{i1}$, and $a$ and $b$ are derived from a least squares regression of $\ln\hat{\sigma}_j$ on $j - 2$ for $j = 2, \ldots, n-2$. Maximum likelihood values are derived using the Kalman filter to evaluate the likelihood. Table 3 also reports the value of the negative of the log-likelihood. Table 3 indicates there are no material differences in the two sets of estimates. Given the parameter estimates displayed in Table 3, the Kalman filter was used to derive generalized least squares estimates of the $\mu_j$ and to compute the forecasts $\hat{g}_i$ as well as the associated error covariance matrix. Results are displayed in Table 4 with estimated standard errors in parentheses. The standard errors factor in both process and estimation uncertainty of the $\mu_j$, but not the uncertainty of the estimates displayed in Table 3. Simulations from the distribution, including covariances, yield the forecast liability distribution displayed in Figure 1. The development correlation model was assessed by computing the linear predictors of the disturbances $\epsilon_{ij}$ in equation (3.2) given the runoff triangle data, and where unknown parameters are replaced by the Table 3 estimates. Computations are performed with the smoothing filter companion to the Kalman filter (De Jong 1989). These residuals indicate that the outcome in accident year 2, development year 1, previously flagged as extreme, is in fact not extreme in the context of the development correlation model. Also the correlation between the development year 0 and 1 residuals is now an insignificant 0.19. The model has thus both explained the correlation and exploited it to arrive at a defensible loss forecast distribution.
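The moment estimates of $a$ and $b$ described above come from an ordinary least squares line. The sketch below is only indicative: it assumes the regression is of the log of the estimated standard deviations on $j - 2$, and it uses the standard deviations as printed in Table 2, which are rounded to two decimals, so the resulting coefficients should not be read as the Table 3 values.

```python
import numpy as np

# Rounded standard deviation estimates for j = 2, ..., 8 as printed in Table 2.
sigma_hat = np.array([0.24, 0.20, 0.05, 0.06, 0.04, 0.01, 0.01])

x = np.arange(sigma_hat.size)  # regressor j - 2 for j = 2, ..., 8
# np.polyfit returns coefficients from highest degree down: slope, then intercept.
b, a = np.polyfit(x, np.log(sigma_hat), 1)

# Fitted standard deviations implied by the log-linear structure.
sigma_fit = np.exp(a + b * x)
```

A negative slope `b` corresponds to standard deviations that shrink geometrically with the development year, replacing seven separate scale parameters by two.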
REFERENCES
MACK, T. 1993. Distribution-Free Calculation of the Standard Error of Chain Ladder Reserve Estimates. ASTIN Bulletin 23(2): 213-25.
MACK, T. 1994a. Measuring the Variability of Chain Ladder Reserve Estimates. In Proceedings of the Casualty Actuarial Society Spring Forum, 101-82.
MACK, T. 1994b. Which Stochastic Model Is Underlying the Chain Ladder Method? Insurance: Mathematics and Economics 15(2/3): 133-38.
MURPHY, D. 1993. Unbiased Loss Development Factors. In Proceedings of the Casualty Actuarial Society Meeting, May 1993, pp. 183-246.
TAYLOR, G. 2000. Loss Reserving: An Actuarial Perspective. Boston: Kluwer.
VERRALL, R. 1989a. Modelling Claims Run-Off Triangles with Two-Dimensional Time Series. Scandinavian Actuarial Journal: 129-38.
VERRALL, R. 1989b. A State Space Representation of the Chain Ladder Linear Model. Journal of the Institute of Actuaries 116: 589-610.
VERRALL, R. 1990. Bayes and Empirical Bayes Estimation for the Chain Ladder Model. ASTIN Bulletin 20: 217-43.
WRIGHT, T. S. 1990. A Stochastic Method for Claims Reserving in General Insurance. Journal of the Institute of Actuaries 117: 677-731.
ZEHNWIRTH, B. 1985. Interactive Claims Reserving Forecasting System. St. Kilda, Australia: Insureware P/L.
Discussions on this paper can be submitted until October 1, 2006. The author reserves the right to reply to any discussion. Please see the Submission Guidelines for Authors on the inside back cover for instructions on the submission of discussions.