Geomorphology 81 (2006) 166 – 184
www.elsevier.com/locate/geomorph
Estimating the quality of landslide susceptibility models
Fausto Guzzetti ⁎, Paola Reichenbach, Francesca Ardizzone, Mauro Cardinali, Mirco Galli
IRPI CNR, via Madonna Alta 126, 06128 Perugia, Italy
Received 18 July 2005; received in revised form 13 November 2005; accepted 19 April 2006
Available online 22 May 2006
Abstract
We present a landslide susceptibility model for the Collazzone area, central Italy, and we propose a framework for evaluating
the model reliability and prediction skill. The landslide susceptibility model was obtained through discriminant analysis of 46
thematic environmental variables and using the presence of shallow landslides obtained from a multi-temporal inventory map as the
dependent variable for statistical analysis. By comparing the number of correctly and incorrectly classified mapping units, it is
established that the model classifies 77.0% of 894 mapping units correctly. Model fitting performance is investigated by comparing
the proportion of the study area in each probability class with the corresponding proportion of landslide area. We then prepare an
ensemble of 350 landslide susceptibility models using the same landslide and thematic information but different numbers of
mapping units. This ensemble is exploited to investigate the model reliability, including the role of the thematic variables used to
construct the model, and the model sensitivity to changes in the input data. By studying the variation of the model's susceptibility
estimate, the error associated with the susceptibility assessment for each mapping unit is determined. This result is shown on a map
that complements the landslide susceptibility map. Prediction skill of the susceptibility model is then estimated by comparing the
forecast with two recent event inventory maps. The susceptibility model is found capable of predicting the newly triggered
landslides. A general framework for testing a susceptibility model is proposed, including a scheme for ranking the quality of the
susceptibility assessment.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Landslide susceptibility; Statistically based model; Discriminant analysis; Quality; Uncertainty; Validation; Landslide prediction; Map
1. Introduction
Susceptibility is the propensity of an area to generate
landslides. In mathematical form, landslide susceptibility is the probability of spatial occurrence of known
slope failures, given a set of geoenvironmental conditions (Guzzetti et al., 2005). Assuming landslides will
occur in the future because of the same conditions that
produced them in the past (Guzzetti et al., 1999),
⁎ Corresponding author. Tel.: +39 075 5014 413; fax: +39 075 5014
420.
E-mail address:
[email protected] (F. Guzzetti).
0169-555X/$ - see front matter © 2006 Elsevier B.V. All rights reserved.
doi:10.1016/j.geomorph.2006.04.007
susceptibility assessments can be used to predict the
geographical location of future landslides (Chung and
Fabbri, 1999; Guzzetti et al., 2005). Many methods have
been proposed to evaluate landslide susceptibility at the
basin scale, including direct geomorphological mapping, heuristic approaches, statistical classification
methods and physically based models (Carrara et al.,
1995; Soeters and van Westen, 1996; Chung and Fabbri,
1999; Guzzetti et al., 1999, and references therein).
Statistical classification methods are particularly suited
to determining landslide susceptibility over large and
complex areas (e.g., Cardinali et al., 2002). Such
methods provide quantitative estimates of “where”
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
landslides are expected, based on detailed information
on the distribution of past landslides and a set of thematic
environmental information. The former becomes the
dependent variable and the latter the independent
variables for the statistical modelling. Landslide susceptibility does not forecast “when” or “how frequently” a
landslide will occur or how large and destructive the
slope failure will be (Guzzetti et al., 2005).
In recent years, many landslide susceptibility assessments (often referred to as hazard assessments) have
been published. We counted at least 40 papers in major
international journals in the 6-year period from 2000 to
2005 discussing landslide susceptibility. The majority of
the published papers present a statistically based
susceptibility model and discuss the data and the
method used to prepare the model, but provide little or
no information on the quality of the proposed model.
This is a limitation of much previous research (Chung
and Fabbri, 2003), including some of our own work
(e.g., Carrara et al., 1995; Guzzetti et al., 1999; Cardinali
et al., 2002).
Any attempt to ascertain landslide susceptibility in a
region needs proper validation. Validation should
establish the quality (i.e., reliability, robustness, degree
of fitting and prediction skill) of the proposed
susceptibility estimate. The quality of a landslide
susceptibility model can be ascertained using the same
landslide data used to obtain the susceptibility estimate,
or by using independent landslide information not
available to construct the model. The former allows
for (i) evaluating the degree of match between the
predicted susceptibility levels in a given region, and the
distribution and abundance of known landslides in the
same region; (ii) evaluating the role of the thematic
information in constructing the model; (iii) assessing the
ability of the model to cope with variations in the input
data; and (iv) determining the error associated with the
obtained susceptibility estimate. The latter allows for
determining the prediction skill of the model to forecast
the location of new or reactivated landslides (Chung and
Fabbri, 2003; Guzzetti et al., 2005).
In this paper, we provide a comprehensive validation
of a landslide susceptibility model prepared through
discriminant analysis of thematic information for the
Collazzone area in central Umbria (Fig. 1). The
landslide susceptibility model is first presented. A set
of tests is then performed, aimed at evaluating the
quality and robustness of the model. We further test the
ability of the model to predict new landslides by
comparing the susceptibility estimate against the
distribution of slope failures that occurred after the
model was prepared. Results obtained are discussed, and
167
Fig. 1. Location of the Collazzone study area in Umbria, central Italy.
Shaded relief image shows the hilly morphology of the area.
a general framework is proposed for evaluating and
ranking the quality of a statistically based landslide
susceptibility model.
2. The study area
The Collazzone area extends for 78.9 km2 in central
Umbria, Italy (Fig. 1). Elevation in the area ranges from
145 m to 634 m above sea level, with an average value
of 273 m (standard deviation = 96.1 m). Terrain gradient
computed from a 10 m × 10 m DTM ranges from 0° to
63.7° degree, with a mean value of 9.9° and a standard
deviation of 6.4°. In the area the terrain is hilly, valleys
are asymmetrical, and the lithology and attitude of
bedding control the morphology of the slopes. Gravel,
sand, clay, travertine, layered sandstone and marl, and
thinly layered limestone, Lias to Holocene in age, crop
out in the area. Soils range in thickness from a few
decimetres to more than 1 m; they have a fine or medium
texture and exhibit a xenic moisture regime, typical of
the Mediterranean climate. Precipitation is most abundant in October and November; with a mean annual
rainfall in the period from 1921 to 2001 of 884 mm.
Snow falls on the area on average every 2–3 years.
Landslides are abundant in the area, and range in age,
type, morphology and volume from very old, partly
eroded, large and deep-seated slides to young, shallow
slides and flows. Slope failures are triggered chiefly by
168
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
meteorological events, including intense and prolonged
rainfall and rapid snow melt.
3. Landslide susceptibility model
For the Collazzone area, we prepared a landslide
susceptibility model using discriminant analysis of 46
environmental thematic variables, including morphology, hydrology, lithology, structure, bedding attitude and
land use. To determine landslide susceptibility, we
subdivided the study area into mapping units, i.e.,
portions of the ground that contain a set of conditions
that differ from the adjacent units across distinct
boundaries (Guzzetti et al., 1999). To establish the
terrain subdivisions, we adopted the approach proposed
by Carrara et al. (1991), which has proven to be reliable
in predicting landslide susceptibility in Umbria (Carrara
et al., 1991, 1995, 1999; Guzzetti et al., 1999; Cardinali
et al., 2002). We obtained the terrain subdivision using
specialised software that, starting from a 10 m × 10 m
DTM, generated drainage and divide lines. The DTM
was prepared by automatic interpolation of 10 and 5-m
interval contour lines obtained from 1:10,000 scale
topographic maps. By combining the drainage and
divide lines, the software automatically identified 894
elementary slopes units (i.e., sub-basins) which represent the mapping units of reference for the determination
of landslide susceptibility in the Collazzone area. The
procedure also calculated a number of morphological
and hydrological parameters: some correspond to those
acquired by traditional methods (e.g., channel length,
stream order, link length, etc.); others were specifically
designed to model the spatial distribution of landslides
(e.g., slope unit area, slope unit terrain gradient, slope
unit aspect, slope unit terrain roughness, etc.) (Carrara et
al., 1991, 1995).
As the dependent variable for the statistical analysis,
we used the presence or absence of shallow landslides
that occurred before 1997 (Table 1, Fig. 2) in the 894
mapping units into which the study area was partitioned.
The distribution of landslides was obtained from a
detailed, multi-temporal landslide inventory map prepared at 1:10,000 scale. The landslide inventory was
prepared through the systematic interpretation of five
sets of aerial photographs (Table 2), supplemented by
field surveys. Two geomorphologists carried out the
interpretation of the aerial photographs in the 5-month
period from July to November 2002. The investigators
looked at each pair of aerial photographs using a mirror
stereoscope (4× magnification) and a continue-zoom
stereoscope (3× to 20× magnification). Field surveys
carried out in the period from 1998 to 2004 were
conducted to map new landslides triggered by intense or
prolonged rainfall, and to test the inventory obtained
through photo-interpretation. The field surveys allowed
mapping 230 new or reactivated landslides, most of
which occurred in the period from October to December
2004. The field surveys also revealed that shallow
landslides are uncommon in forested terrain that covers
23.9% of the study area.
In the multi-temporal inventory, landslides were
classified according to the type of movement, and the
Table 1
Descriptive statistics of landslide data sets for the Collazzone study area
Inventory
A
B
C
D
Multi-temporal landslide
inventory prepared through the
interpretation of five sets of
aerial photographs (Table 2).
Landslides are older than 1941
to December 2004.
Subset of the multi-temporal
inventory showing shallow
landslides and used to prepare
the susceptibility model shown
in Fig. 3. Landslides are older
than 1941 to 1996.
Snowmelt induced landslides
occurred in January 1997
(Fig. 10A).
Rainfall-induced landslides
occurred in autumn 2004
(Fig. 10B).
Type
Number
Area
Total (km2)
Percent (%)
Minimum (m2)
Maximum (m2)
All landslides
Deep-seated landslides
Shallow landslides
2760
363
2397
12.51
7.70
6.53
15.8
9.76
8.28
51
3815
51
173,518
173,518
64,691
Shallow landslides
1759
5.77
7.31
103
43,204
413
7
406
153
1
152
0.78
0.14
0.64
0.38
0.05
0.33
0.98
0.17
0.81
0.48
0.06
0.42
78
10,199
78
51
47,884
51
44,335
44,335
9882
47,884
47,884
12,098
All landslides
Deep-seated landslides
Shallow landslides
All landslides
Deep-seated landslides
Shallow landslides
169
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
Fig. 2. Multi-temporal landslide inventory map showing shallow
landslides (see Table 1). Original map scale 1:10,000.
estimated age, activity, depth, and velocity. Landslide
type was defined according to Cruden and Varnes
(1996). Landslide age, activity, depth, and velocity were
determined based on the type of movement, the
morphological characteristics and appearance of the
landslides on the aerial photographs, the local lithological and structural setting, and the date of the aerial
photographs. Landslide age was defined as recent, old or
very old, despite ambiguity in the definition of the age of
a mass movement based on its appearance (McCalpin,
1984). Overall, the multi-temporal inventory map shows
2760 landslides (Table 1A). The subset of shallow
landslides used to prepare the susceptibility model
includes 1759 landslides, covering 5.77 km2 of the
study area (Table 2B).
To account for possible cartographic and drafting
errors in the production of the multi-temporal inventory
map (e.g., landslides erroneously mapped as crossing a
divide or a stream line), we established an empirical
threshold to decide if a mapping unit contained or was
free of landslides. Slope units having less than 2% of the
area covered by shallow slope failures were considered
free of landslides, whilst slope units having 2% or more
of their area covered were considered as containing
landslides.
Independent variables used in the statistical analysis
of the shallow landslides included morphological,
hydrological, lithological, structural and land-use information. We obtained 26 variables describing hydrology
and morphology from the same DTM used to perform
the subdivision of the study area into slope units.
Hydrological variables included slope unit drainage
channel length, gradient, order and magnitude, and slope
unit area and upstream contributing area. Morphological
variables included slope unit mean elevation, standard
deviation of elevation, mean length, mean terrain
gradient and standard deviation of terrain gradient,
slope unit aspect (in six classes), slope unit terrain
roughness, and mean terrain gradient for the upper,
intermediate and lower portions of the slope unit. From
the latter three statistics, derivative variables describing
the shape of the slope unit profile (concave, convex,
irregular, etc.) were obtained. Since most of the
morphological variables describe average terrain conditions in a slope unit, local testing of the variables in the
field was problematic and was not performed. We
compiled lithological and structural data, including the
attitude of bedding, through detailed lithological and
structural mapping at 1:10,000 scale. The lithological
map did not show the distribution and thickness of the
soils or the colluvial deposits. We obtained information
on land use from a map compiled in 1977 by the Umbria
Regional Government, largely revised and updated by
interpreting the most recent aerial photographs (Table 2).
To determine landslide susceptibility we adopted
discriminant analysis, a multivariate technique introduced by Fisher (1936) to classify samples into
alternative groups on the basis of a set of measurements
(Michie et al., 1994; Brown, 1998; SPSS, 2004). More
precisely, the goal of discriminant analysis is to classify
cases into one of several mutually exclusive groups
based on their values for a set of predictor variables. The
grouping variable must be categorical and the predictor
variables can be interval or dichotomous. For landslide
susceptibility assessment most commonly two groups
are established, namely, (i) mapping units free of
landslides (G0, stable slopes); and (ii) mapping units
having landslides (G1, unstable slopes). The assumption
Table 2
Aerial photographs used to prepare the multi-temporal landslide
inventory map (1 to 4) (Fig. 2) and the inventory of snowmelt induced
landslides (5) (Fig. 10A) for the Collazzone area
ID
Year
Period
Type
Nominal scale
1
2
3
4
5
1941
1954
1977
1985
1997
Summer
Spring–Summer
June
July
April
Panchromatic
Panchromatic
Colour
Panchromatic
Panchromatic
1:18,000
1:33,000
1:13,000
1:15,000
1:20,000
170
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
is made that the two groups are distinct, and that a
mapping unit r pertains only to one group. In the context
of landslide susceptibility, the scope of discriminant
analysis is to determine the group membership of a
mapping unit by finding a linear combination of the
environmental variables which maximizes the differences between the populations of stable and unstable
slopes with minimal error. To obtain this, consider a set
of m environmental variables ν1, ν2, …, νm for each
mapping unit, r, by means of which it is desired to
discriminate the region between the groups of stable
(G0) and unstable (G1) slopes, and let Z be a linear
combination of the input variables, such as Z = β1ν1(r)
+ β2ν2(r) + … + βmνm(r). For discriminant analysis, the
task is to determine the β coefficients by means of some
criterion that will enable Z to serve as an index for
differentiating between members of the two groups. The
linear discriminant function Z transforms the original
sets of measurements into a discriminant score, which
represents the sample position along a line defined by
the same discriminant function. To measure how far
apart the two groups are along this line, different
“distances” can be used (e.g., Euclidean, diagonal or
Mahalonobis distances; Michie et al., 1994; Gorsevski
et al., 2003). A larger “distance” indicates that it is easy
to discriminate between the two groups. Posterior
probabilities are then used to express the likelihood of
a mapping unit belonging to one group or the other
(Brown, 1998). Thus, when probabilities are derived
from a discriminant analysis, they represent the
likelihood of a mapping unit pertaining to one of the
two groups established a priori. The relative contribution of each independent environmental variable to the
discriminating function can be evaluated by studying the
standardized discriminant function coefficients (SDFC).
The SDFC show the relative importance (i.e., the
“weight”) of each variable as a predictor of slope
instability. Variables with large coefficients (in absolute
value) are more strongly associated with the presence or
the absence of landslides. The sign of the coefficient
indicates whether the variable is positively or negatively
correlated to instability within a mapping unit.
Using the landslide and environmental data available
for the Collazzone area, a discriminant function automatically selected 16 (out of 46) variables as the best
predictors of the presence (or absence) of landslides in the
894 slope units in which the study area was partitioned. A
step-wise procedure was used. The procedure entered and
removed variables in a stepwise fashion, based on a
minimum tolerance value of 0.001; the tolerance of a
variable candidate for inclusion in the analysis was the
proportion of the within-groups variance not accounted
for by other variables in the analysis (SPSS, 2004, p. 522).
Table 3 lists the 16 selected environmental variables and
the associated standardized discriminant function coefficients. For convenience, a threshold of |0.250| for the
SDFC was selected to outline the variables more strongly
associated with the presence (negative SDFC) or the
absence (positive SDFC) of landslides. This threshold is
heuristic and its significance for landslide susceptibility
must be verified.
Fig. 3 portrays the results of the landslide susceptibility model for shallow landslides. The map expresses
the probability that each slope unit contains shallow
landslides in the multi-temporal inventory map shown in
Fig. 2. If a slope unit has a high probability of containing
a known shallow landslide, the same mapping unit is
classified as landslide prone. On the contrary, if a slope
unit has a low probability of having known shallow
landslides, the mapping unit is considered stable.
Intermediate values of probability indicate the inability
of the model to classify the mapping unit with the
available thematic information (80 mapping units,
8.95%), and not necessarily conditions of marginal or
intermediate stability.
The first question to ask when a landslide susceptibility model is prepared through a statistical classification technique is “how well has the model performed in
classifying the mapping units?” This involves determining the degree of model fit. A straightforward way of
testing model fit consists of counting the number of
cases (i.e., the number of mapping units) correctly
Table 3
Variables selected by a stepwise discriminant function as the best
predictors of landslide occurrence (Fig. 3)
Variable description
Variable
SDFC
Slope unit mean terrain gradient
Slope unit elevation standard deviation
Slope unit length
Slope unit terrain gradient (upper portion)
Cultivated area
Bedding dipping out of the slope
Slope unit with convex slope
(downslope profile)
Travertine
Slope unit facing S-SE
Slope unit drainage channel order
Recent alluvial deposit
Gravel and coarse continental sediments
Slope unit terrain gradient standard deviation
Marl
Downslope concave profile
Limestone
SLO_ANG
ELV_STD
SLO_LEN
ANGLE3
SS
FRA
CONV
−0.398
−0.370
−0.287
−0.282
−0.276
−0.241
−0.135
TRAVERTI
TR2
ORDER
ALLUVIO
GHIAIA
ANG_STD
MARNE
CC
CARBO
0.105
0.133
0.140
0.144
0.179
0.219
0.285
0.303
0.833
Values in boldface are variables more strongly associated with the
presence (negative SDFC) or the absence (positive SDFC) of landslides.
171
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
Fig. 3. Map showing spatial probability of shallow landslide
occurrence (landslide susceptibility). Study area subdivided into 894
slope units. Square bracket, class limit included; round bracket,
probability class limit not included. See Tables 3 and 4 for model
classification results.
classified by the model. Table 4 shows the results for the
model shown in Fig. 3. The susceptibility model
correctly classifies 688 of the 894 mapping units in
which the study area was partitioned. The figure
represents a measure of the “overall goodness of fit”
of the model. Of the 206 misclassified cases, 121 are
mapping units free of landslides that were classified as
“unstable” by the model, and 85 are mapping units that
showed landslides in the inventory map and were
attributed to the “stable” group by the model. The
former may be the result of errors in the inventory map
(e.g., unrecognized landslides, or landslides removed or
concealed by erosion, land use changes, ploughing or
other human actions). The latter are mapping units that
have environmental conditions typical of stable slopes,
and where landslides took place owing to local
conditions not accounted for by the model (e.g., local
structural conditions, particularly thick soil, local land
use or surface drainage modifications). Further inspection of Table 4 reveals that the susceptibility model is
more efficient in correctly classifying mapping units that
have landslides (84.1%), and less efficient in classifying
mapping units free of slope failures (66.4%). We
attribute the difference to the larger number of mapping
units with shallow landslides, when compared to the
mapping units free of slope failures. Indeed, in the study
area 534 mapping units (59.7%) have shallow landslides
and 360 mapping units (40.3%) are free of shallow
landslides (Fig. 2).
An alternative way of measuring the reliability of a
susceptibility model–in terms of its ability to classify
known landslides–involves the use of Cohen's Kappa
index (Cohen, 1960; Hoehler, 2000). To compute this
index we prepared Table 5 that shows the proportion
(i.e., the observed probability) of mapping units in each
of the four classification cases listed in Table 4. Table 5
also shows the marginal totals obtained by summing the
proportions along the table rows and columns. In Table
5 values in parentheses represent the expected proportion on the basis of chance associations, i.e., the joint
probability of the marginal proportions. The Kappa
index is obtained as:
j¼
PC PE
1 PC
lVxV1
ð1Þ
where, PC is the proportion of mapping units correctly
classified as stable or unstable, and PE is the proportion
of mapping units for which the agreement is expected by
chance. In this case, κ = 0.513. Landis and Kock (1977)
have suggested that for 0.41 ≤ κ ≤ 0.60, the strength of
Table 4
Comparison between slope units classified as stable or unstable by the statistical model (Fig. 3) and slope units free of and containing shallow
landslides in the multi-temporal inventory map shown in Fig. 2
Predicted groups (model)
Group 0
Actual Groups (inventory)
Group 0 slope units free of shallow
landslides in inventory map
Group 1 slope units containing
shallow landslides in inventory map
Total number of slope units = 894.
Overall percentage of slope units correctly classified = 77.0%, 688 slope units.
Group 1
Stable slope units
Unstable slope units
239 (66.4%)
121 (33.6%)
360 (100%)
85 (15.9%)
449 (84.1%)
534 (100%)
172
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
Table 5
Comparison between the proportion of slope units classified as stable or unstable by the susceptibility model (Fig. 3) and the proportion of slope units
free of and containing shallow landslides in the multi-temporal inventory map (Fig. 2)
Predicted groups (model)
Group 0
Actual Groups (inventory)
Group 0 slope units free of shallow
landslides in inventory map
Group 1 slope units containing
shallow landslides in inventory map
Marginal totals
Group 1
Marginal
totals
Stable slope units
Unstable slope units
0.267 (0.146)
0.135 (0.257)
0.403
0.095 (0.216)
0.502 (0.381)
0.597
0.362
0.638
1.000
κ = 0.513, moderate agreement.
the agreement between the observed and the predicted
values is moderate.
In addition to Cohen's Kappa index, other indexes
can be computed to measure the performance of a
statistical classification. Most commonly, the indexes
are obtained from figures listed in a contingency table
similar to Table 5. Table 6 shows 13 statistical indexes
obtained from Table 4. For a discussion of the
significance and properties of the individual performance indexes listed in Table 6, see Mason (2003) and
references therein.
Tables 4 and 5 provide a combined estimate of model
fit, but do not provide a detailed description of the model
performance of the different susceptibility classes
(Chung and Fabbri, 1999, 2003). To determine this,
we compare the total area of known landslides in each
susceptibility class with the percentage area of the
susceptibility class. Fig. 4 shows the percentage of the
study area ranked from most to least susceptible (x-axis)
against the cumulative percentage of landslide area in
each susceptibility class (y-axis). Most of the landslides
shown in the multi-temporal inventory (Fig. 2) are in
areas classified as susceptible by the model, and only
6.4% of the slope failures are in areas classified as not or
weakly susceptible (probability ≤ 0.45) by the model.
The latter is in agreement with the reduced number of
mapping units (85, 9.51%) having landslides and
erroneously attributed to the “stable” group by the
model (Table 4). Fig. 4 provides a quantitative
indication of the ability of the susceptibility model to
match (“fit”) the known distribution of shallow landslides in the Collazzone area (Fig. 2).
4. Uncertainty in the landslide susceptibility model
Any landslide susceptibility prediction has a level of
uncertainty. Sources of uncertainty include (i) errors
and incompleteness in the landslide and thematic
Table 6
Statistical indexes measuring the performance of the susceptibility
model shown in Fig. 3
Index
Value
Range
Hit rate, or Sensitivity
False alarm rate
Specificity
False alarm ratio
Positive predictive value
Negative predictive value
Proportion correct
Proportion correct by chance
Cohen's Kappa coefficient
Heidke skill score
Peirce's skill score
Critical success index
Yule's Q
0.664
0.159
0.841
0.262
0.738
0.788
0.770
0.527
0.513
0.107
0.505
0.537
0.825
[0,1]
[0,1]
[0,1]
[0,1]
[0,1]
[0,1]
[0,1]
[0,1]
[− ∞,1]
[− 1,1]
[− 1,1]
[0,1]
[− 1,1]
Values computed using figures listed in Table 4.
Fig. 4. Analysis of fitting performance of landslide susceptibility
model shown in Fig. 3.
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
information available to complete the analysis; (ii) an
imperfect understanding of landslide processes and their
geographical and temporal evolution; (iii) limitations in
the techniques used to determine the susceptibility; and
(iv) the inherent natural variability of the landslide
phenomena (Carrara et al., 1992, 1999; Ardizzone et al.,
2002).
Determining the errors associated with the geomorphological, geological and other thematic information is
no trivial task. Improving the understanding of the
landslide processes is feasible, but requires time and
resources often not available to landslide investigators.
The characteristics of the methods used to ascertain
landslide susceptibility are known (Carrara et al., 1995;
Soeters and van Westen, 1996; Chung and Fabbri, 1999;
Guzzetti et al., 1999), but their limitations and drawbacks when applied to specific areas, data sets, and
landslide types remain poorly investigated. Despite
these problems, we argue that determining the errors
associated with a landslide susceptibility assessment is
of primary importance. Different types of uncertainty
contribute to the model error, including: (i) uncertainty
in the model classification due to the type, abundance
and reliability of the available thematic information; (ii)
uncertainty in the classification of individual mapping
units; and (iii) uncertainty in the ability of the model to
predict future landslides (prediction skill).
In the following, we propose a framework to test our
susceptibility model. Tests aim at (i) investigating the
role of the thematic information in the production of the
susceptibility model; (ii) determining the model sensitivity and robustness to variations in the input data; (iii)
determining the error associated with the susceptibility
prediction obtained for each mapping unit; and (iv)
testing the model prediction against independent
landslide information.
5. Analysis of model reliability
5.1. Construction of an ensemble of landslide
susceptibility models
To determine the reliability of the landslide susceptibility assessment shown in Fig. 3, we prepared an
ensemble of landslide susceptibility models. The ensemble contains 350 different susceptibility models obtained
from the same set of 46 independent thematic variables
and the same multi-temporal landslide map (Fig. 2) but
using a different number of terrain units, from 268 (30%)
to 849 (95% of the 894) units. To obtain the ensemble we
adopted the following strategy. First, a subset containing
30% of the mapping units (268 units) was obtained by
173
random selection from the entire set of 894 mapping units.
The random selection was repeated 50 times, obtaining a
group of 50 different subsets, each containing 268
mapping units. This collection of 50 subsets of mapping
units became group G30 for the analysis (30% selected
mapping units). The selection process was repeated,
changing the number of the selected units heuristically.
We obtained collections with 45%, 55%, 65%, 75%, 85%,
and 95% mapping units, respectively. These collections,
each listing 50 subsets of mapping units, became groups
G45, G55, G65, G75, G85 and G95. Overall, the ensemble
contains 350 subsets of mapping units, i.e., 7 groups (from
G30 to G95) each containing 50 subsets.
Landslide susceptibility models were prepared for
each subset of the ensemble, obtaining 350 different
susceptibility models, i.e., 350 different forecasts of
landslide susceptibility for the Collazzone area. The
large number of susceptibility forecasts was exploited to
study the errors associated with the landslide susceptibility model shown in Fig. 3.
5.2. Role of the independent thematic variables
The role of the 46 independent thematic variables
used to construct the landslide susceptibility model is
first considered. For this purpose, group G85 is used. For
this group, Table 7 lists the number and the percentage
of the 50 models that selected (or did not select) the 46
variables, and whether the variables were selected as
predictors of slope stability (S) or of slope instability (I).
Inspection of Table 7 reveals that of the 46 considered
variables, 38 (82.6%) were selected in at least one of the
50 models encompassing G85, and 8 (17.4%) variables
were never selected as predictors of landslide occurrence. Of the 38 selected variables, 15 (39.5%) were
selected by 25 or more models, and 7 (18.4%) were
selected by 45 or more models.
The 50 stepwise discriminant functions constructed
from G85 selected from as few as 11 to as many as 18
variables (mode 14 variables). All the selected variables,
with the exception of drainage magnitude (MAGN),
were either always selected as negatively (I, in Table 7)
or always selected as positively (S, in Table 7) in
association with the presence of landslides. We take this
as an indication of the consistency of the role of the
thematic variables in explaining the known distribution
of landslides, which contributes to the reliability of the
susceptibility model.
Inspection of Table 7 further indicates that more than
75% of the prepared models used the same set of 10
variables. These variables included: four variables
describing morphology (ELV_STD, ANG_STD, SLO_
174
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
Table 7
Independent thematic variables selected, or not selected, by 50 discriminant functions as the best predictors of landslide occurrence
Variables
SDFC
Susceptibility
models
#
Slope unit elevation standard deviation
Limestone
Bedding dipping out of the slope
Gravel and coarse continental sediments
Marl
Slope unit terrain gradient standard deviation
Slope unit length
Slope unit mean terrain gradient
Cultivated area
Slope unit facing S-SE
Downslope concave slope
Slope unit drainage channel order
Recent alluvial deposit
Slope unit with convex slope (downslope profile)
Sandstone
Travertine
Slope unit terrain gradient (upper portion)
Forested area
Slope unit area
Slope unit drainage channel length
Slope unit surface roughness index
Slope unit slope (lower portion)
Slope unit mean elevation
Concave profile downslope
Drainage channel mean slope
Continental deposit
Sand
Slope unit drainage channel magnitude
Urban area
Bedding dipping into the slope
Bedding dipping across the slope
Slope unit facing N-NE
Standard deviation of terrain unit length
Slope unit with convex–concave slope (downslope profile)
Slope unit with irregular slope (downslope profile)
Clay
Cultivated area with trees
Vineyards
Drainage basins total area upstream the slope unit
Slope unit terrain gradient (intermediate portion)
Concave–convex profile downslope
Slope unit rectilinear profile
Fruits trees
Pasture
Slope unit facing S-SW
Deposit of ancient landslide
ELV_STD
CARBO
FRA
GHIAIA
MARNE
ANG_STD
SLO_LEN
SLO_ANG
SS
TR2
CC
ORDER
ALLUVIO
CONV
AREN
TRAVERTI
ANGLE3
BOSCO
SLO_ARE
LINK_LEN
R
ANGLE1
ELV_M
CONC
LNK_ANG
CONTI
SABBIA
MAGN
URB
REG
TRA
TR1
LEN_STD
COC_COV
IRR
ARGILLA
SA
VIG
AREAT_K
ANGLE2
COV_COC
RET
FRUTT
PASCOLO
TR3
FRA_OLD
Predictor
%
50
100
I
50
100
S
49
98
I
47
94
S
47
94
S
45
90
S
45
90
I
41
82
I
40
80
I
38
76
S
33
66
S
30
60
S
30
60
S
27
54
I
25
50
S
0.105
23
46
S
−0.282
21
42
I
21
42
S
13
26
I
10
20
I
10
20
I
5
10
I
4
8
I
4
8
I
3
6
S
3
6
I
3
6
I
2
4
I/S
2
4
S
2
4
S
2
4
I
2
4
I
1
2
S
1
2
S
1
2
S
1
2
I
1
2
I
1
2
S
Variables were never selected as predictors of landslide
occurrence
−0.370
0.833
−0.241
0.179
0.285
0.219
−0.287
−0.398
−0.276
0.133
0.303
0.134
0.144
−0.135
Group G85 used for the analysis.
LEN, SLO_ANG), three variables describing lithology
(CARBO, GHIAIA, MARNE), one variable for the
attitude of bedding (FRA), one variable describing
slope aspect (TR2), and one variable describing a land
use type (SS). The 10 variables are also present in
Table 3. Comparison of Tables 3 and 7 reveals that,
with the exception of AREN, all the 16 variables
selected to construct the susceptibility model shown in
Fig. 3 are listed in Table 7 as the most selected
variables. We take this as further indication of the
ability of the selected variables to explain the known
distribution of landslides.
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
5.3. Model sensitivity
The sensitivity of the susceptibility model to changes
in the input data is then investigated. In general, results
of a robust (least sensitive) statistical model should not
change significantly if the input data are changed within
a reasonable range. To investigate the sensitivity of the
susceptibility model to changes in the input data, we use
the entire ensemble of models, and we study the
variation in the overall percentage of mapping units
correctly classified by the 350 models. Three cases are
considered: (i) mapping units selected by the adopted
random selection procedure and classified by the
discriminant functions (selected units, i.e., “training”
or “modelling” set, Fig. 5A); (ii) mapping units not
selected by the random selection procedure and
175
classified by the discriminant functions constructed on
the corresponding subset of selected units (non-selected
units, i.e., “classification” or “validation” set, Fig. 5B);
and (iii) all mapping units, irrespective of the fact that
they pertained to the selected (training) or the nonselected (classification) sets (Fig. 5C).
In Fig. 5A, the box plots show that an increase in the
number of the selected mapping units results in a slight
decrease of the median value (50th percentile) of the
model fit, and in a significant decrease of the variability
(10th to 90th percentile range) of the model fit. This was
expected. Given the large number of the available
thematic variables (46), a reduced number of cases (268
mapping units for G30) allows for a (apparently) better
model classification (mean = 78.36% for G30) at the
expense of model variability, which is large (std. dev.
= 2.59% for G30). Further inspection of Fig. 5A indicates
that a reduction in the percentage of mapping units
correctly classified, and in the corresponding scatter in
the susceptibility estimates, become negligible for
percentages of the considered mapping units exceeding
75%. Thus, susceptibility models obtained using 75% or
more of mapping units do not differ significantly–in
terms of the number of correctly classified units–from
the model obtained using the entire set of 894 terrain
units. We take this as indication of the model ability to
cope with significant (up to 25%) random variation in
the input data.
Fig. 5B provides similar results for the non-selected
subsets. The overall model fit and its scatter increase
with a decreasing number of non-selected units.
Comparison of Fig. 5A and B reveals that models
prepared using the selected units result in a better
classification (i.e., in a larger model classification)
when compared to the models obtained using the nonselected units. This was also expected. Any statistical
classification provides better results on the training set,
and performs less efficiently when applied to the
validation set (Michie et al., 1994). Fig. 5C shows the
result for the collection of the selected (training) and
the non-selected (validation) subsets. The box plots
show the cumulative effect of the mapping units
correctly classified in the training and in the validation
sets. For this reason, the scatter around the median is
reduced, particularly for percentages of mapping units
exceeding 75%.
5.4. Uncertainty in the susceptibility estimate of
individual mapping units
Fig. 5. Sensitivity analysis for landslide susceptibility model shown in
Fig. 3. (A) Training set. (B) Validation set. (C) All mapping units.
Numbers of elements in each group are shown.
The adopted approach to ascertain landslide susceptibility provides a unique value for the probability of
176
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
spatial occurrence of known landslides (i.e., of landslide
susceptibility) for each mapping unit (Fig. 3). The
approach does not provide a measure of the error
associated with the probability estimate. This is a
limitation. To obtain an assessment of the error
associated with the susceptibility assessment we use
group G85. This group is selected as a compromise
between model sensitivity and a sufficiently reduced
number of selected mapping units to account for model
variability.
For each mapping unit, Fig. 6 shows the comparison
between the mean value of the 50 probability estimates
obtained using group G85 (x-axis) and the single
probability estimate obtained for the model shown in
Fig. 3 (y-axes), prepared using the entire set of 894 slope
units. The correlation between the two estimates of
landslide susceptibility is very high. We take this as
indication that the two classifications are virtually
identical.
Based on this result, Fig. 7 relates, for the 894
mapping units, the probability estimate of landslide
spatial occurrence (x-axis), ranked from low (left) to
high (right) values, to the variation of the model estimate
(y-axis), measured by 2 standard deviations (2σ) of the
obtained probability estimate. The measure of 2σ is low
(< 0.05) for mapping units classified as highly susceptible (probability > 0.80) and as largely stable (probability < 0.20). The scatter in the model estimate is larger for
intermediate values of the probability (i.e., for probability values between 0.45 and 0.55), indicating that for
Fig. 6. For 894 mapping units in the Collazzone area, comparison of
the mean value for 50 probability estimates obtained from group G85
(x-axis), and the single probability value obtained for the susceptibility
model shown in Fig. 3 (y-axis).
Fig. 7. Landslide susceptibility model error. For 894 mapping units in
the Collazzone area, the graph shows the mean value of 50 probability
estimates (x-axis) against two standard deviations (2σ) of the
probability estimate (y-axis). Statistics obtained from group G85.
Black line shows estimated model error obtained by linear regression
fit (least square method).
these mapping units not only is the model incapable of
satisfactorily classifying the terrain as stable or unstable,
but also that the obtained estimate is highly variable, and
hence, unreliable. Fig. 7 indicates that the variation in
the model estimate can be approximated by the
Fig. 8. Map showing estimated model error (2σ) for the landslide
susceptibility model shown in Fig. 3. Model error computed using Eq.
(2). Square bracket, class limit included; round bracket, class limit not
included. Larger values indicate increased uncertainty in the
probabilistic estimate of landslide susceptibility.
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
following quadratic equation, obtained by linear regression fit (least square method):
y¼
0:309x2 þ 0:308x
0VxV1 ðr2 ¼ 0:605Þ
ð2Þ
where, x is the estimated value of the probability of
pertaining to an unstable mapping unit (i.e., the
landslide susceptibility estimate), and y is 2σ of the
model estimate.
We consider the value of 2 standard deviations of the
model estimate a proxy for the model error. We use Eq.
(2) to estimate quantitatively the model error for each
mapping unit, based on the computed probability
estimate. For each mapping unit, Fig. 8 shows the
error associated with the probability estimate (i.e., to
landslide susceptibility), computed using the quadratic
Eq. (2). Fig. 8 provides a quantitative measure of the
error associated with the quantitative landslide susceptibility assessment shown in Fig. 3.
To investigate further the relationship between the
predicted probability of spatial landslide occurrence
and its variation (error), the 894 mapping units were
ranked according to the mean value of the computed
probability estimates obtained from group G85. In Fig.
9, the rank of the mapping unit (x-axis) was plotted
against statistics of the probability estimates (y-axis).
The thick line shows the average value of the landslide
susceptibility estimates, and the thin lines show ± 2σ of
the estimate. The measure of 2 standard deviations, a
proxy for model error, varies with the predicted
probability of spatial landslide occurrence. The variation is small for mapping units predicted as highly
unstable, increases to a maximum value towards the
centre of the graph where unclassified mapping units
are shown, and decreases again for mapping units
predicted as highly stable.
Fig. 9. Variation in the estimate of landslide susceptibility. For 894
slope units, ranked from low (left) to high (right) susceptibility values
(x-axis), the graph shows the probability of the spatial occurrence of
landslides (y-axis). Probability estimates obtained from group G85.
177
6. Analysis of model prediction skill
The tests described in the previous section were
aimed at determining the (statistical) reliability and
robustness of the susceptibility model, and at estimating
the error associated with the quantitative forecast. We
performed all tests using the same landslide information
used to construct the susceptibility model, i.e., the multitemporal shallow landslide inventory map (Fig. 2, Table
1B). The performed tests do not provide insight on the
ability of the susceptibility model to predict the
occurrence of new or reactivated (i.e., “future”) landslides, which is the goal of any susceptibility assessment
(Chung and Fabbri, 1999; Guzzetti et al., 1999; Chung
and Fabbri, 2003; Guzzetti et al., 2005).
To evaluate the ability of the susceptibility model to
predict future landslides, we exploit the spatial distribution of shallow slope failures obtained from two
recent landslide event inventories. The first inventory
shows 413 landslides triggered by rapid snowmelt in
January 1997 (Fig. 10A, Table 1C). Landslides shown in
this inventory were mapped at 1:10,000 scale through
field surveys and the interpretation of 1:20,000 scale
aerial photographs flown 4 months after the event
(Cardinali et al., 2000; Guzzetti et al., 2003). The second
event inventory shows 153 landslides triggered by
heavy rainfall in the period from October to December
2004 (Fig. 10B, Table 1D). The rainfall-induced landslides were mapped directly in the field at 1:10,000
scale.
Using the two recent event inventories, three tests are
performed to determine the ability of the susceptibility
model to predict future landslides. The first test consists
of computing the proportion of the event's landslide
area in each susceptibility class, and showing the results
using cumulative statistics. Fig. 11 shows the percentage
of the study area, ranked from most to least susceptible
(x-axis), against the cumulative percentage of the area of
the triggered landslides in each susceptibility class (yaxis), for the snowmelt-induced landslides in January
1997 (dashed line), and for the rainfall-induced landslides in autumn 2004 (dotted line). Statistics given in
Fig. 11 provide a quantitative estimate of the model
prediction skill.
The forecasting performance of the susceptibility
model is better for the 1997 snowmelt-induced landslides, and slightly poorer for the 2004 rainfall-induced
landslides. We attribute the difference to the larger
number of snowmelt-induced landslides, a function of
the different severity of the triggering events. In the
study area, rapid snowmelt in January 1997 was a
more severe trigger of landslides than the autumn 2004
178
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
Fig. 10. Recent landslide event inventory maps. (A) 406 shallow landslides triggered by rapid snowmelt in January 1997. (B) 152 shallow landslides
triggered by heavy rainfall in the period from October to December 2004. Original maps at 1:10,000 scale.
Fig. 11. Analysis of the prediction skill of landslide susceptibility
model shown in Fig. 3. Dashed line, shallow landslides triggered by
rapid snowmelt in January 1997 (Fig. 10A). Dotted line, shallow
landslides triggered by heavy rainfall in autumn 2004 (Fig. 10B).
Continuous line, model fitting performance (Fig. 4).
rainfall period. Fig. 11 shows that the prediction
performance is similar to (for rainfall-induced landslides) or even higher (for snow melt-induced landslides) than the model fitting performance (Fig. 4, and
thin line in Fig. 11). This is surprising, because a model
fitting performance is usually higher than its prediction
skill (Chung and Fabbri, 2003; Guzzetti et al., 2005).
The remaining two tests explore further the relationship between the predicted susceptibility classes and the
distribution and abundance of the triggered landslides.
Fig. 12A shows that 60.7% of the snowmelt-induced
landslide areas in January 1997, and 48.3% of the
rainfall-induced landslide areas in autumn 2004 occurred in mapping units classified as highly unstable
(probability > 0.80). Further, 88.7% of the snowmeltinduced landslide areas, and 75.1% of the rainfallinduced landslide areas occurred in unstable or highly
unstable slope units (probability > 0.55). Conversely,
only 2.5% of the snowmelt-induced landslide areas and
only 4.2% of the rainfall-induced landslide areas were
found in mapping units classified as highly stable
(probability ≤ 0.20). Fig. 12B shows similar results, but
considers the number of triggered landslides. To obtain
these statistics in the GIS, we established the central
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
179
a landslide susceptibility model (Chung and Fabbri,
2003); and (iii) a scheme designed to evaluate (Fig. 7)
and map (Fig. 8) the error associated with the
susceptibility estimate obtained for individual mapping
units. The latter is an improvement over existing
modelling approaches, which is especially relevant
when landslide susceptibility assessments are used for
planning purposes (Guzzetti et al., 2000) or to determine
landslide hazard (Guzzetti et al., 2005, 2006).
Based on the results obtained in the Collazzone area,
and aided by the scarce literature on the validation of
landslide susceptibility models (Carrara et al., 1992;
Chung and Fabbri, 1999, 2003, 2005; Ardizzone et al.,
2002; Fabbri et al., 2003; Remondo et al., 2003), a
general framework for establishing the quality of a
landslide susceptibility assessment is proposed, including an objective scheme for ranking the quality of the
assessment.
It is proposed that any landslide susceptibility model
be tested to
Fig. 12. Analysis of the relationships between the predicted
susceptibility classes and the distribution and abundance of the
triggered landslides. (A) Cumulative percentage of landslide area in
the susceptibility classes. (B) Cumulative percentage of the number of
triggered landslides in the susceptibility classes (x-axis).
point of each landslide polygon and counted the number
of landslide central points in each mapping unit. About
55.8% of the snowmelt-induced landslides and 53.3% of
the rainfall-induced landslides occurred in mapping
units classified as highly unstable (probability > 0.80).
Conversely, only 2.2% of the snowmelt-induced landslides and only 3.3% of the rainfall-induced landslides
occurred in mapping units classified as highly stable
(probability ≤ 0.20). Fig. 12 confirms the aptitude of the
susceptibility model to predict where (i.e., in which
mapping unit) the snowmelt-induced landslides occurred in January 1997, and where the rainfall-induced
landslides occurred in autumn 2004.
7. Discussion
In the previous sections, we have shown how the
quality (i.e., reliability, robustness, degree of fitting and
prediction skill) of a statistically based, landslide
susceptibility model can be assessed quantitatively.
The adopted evaluation procedure includes: (i) standard
methods used to evaluate the “goodness of fit” of a
statistical classification (e.g., Tables 4 and 5); (ii) tests
proposed in the literature to determine the degree of
model fitting (Fig. 4) and the prediction skill (Fig. 11) of
(i) determine the degree of model fit;
(ii) establish the aptitude of the thematic information
to construct the model, including an assessment of
the sensitivity of the model to changes in the
landslide and the thematic information used to
construct the model;
(iii) determine the error associated with the probabilistic estimate obtained for each mapping unit; and
(iv) test the skill of the model prediction to forecast
“future” landslides.
Determining the degree of model fit consists of
establishing how well the model describes the known
distribution of landslides (Tables 4 and 5). The task is
easily performed in a GIS by using the same landslide
information used to construct the susceptibility model.
For the purpose, contingency tables (Tables 4 and 5) and
cumulative statistics of the abundance of landslides in
the susceptibility classes (Fig. 4) can be used. For the
test to be significant, the landslide information must be
representative, accurate, and complete.
To evaluate the role of the thematic information in the
construction of the landslide susceptibility model (Table
7) and to evaluate the model sensitivity (Fig. 5), we
studied the thematic variables entered (and not entered)
in a large set of discriminant classification functions
constructed on a sub-set of randomly selected mapping
units (group G85). In this scheme, the random selection
procedure accounted for the variability in the input data.
The expected error (i.e., the level of uncertainty)
associated with the probabilistic susceptibility estimate
180
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
obtained for each mapping unit was determined by
investigating the variability of the estimate in the
mapping units. For this purpose, we established two
standard deviations (2σ) of the model estimate as a
quantitative measure of the model uncertainty, and we
modelled the expected error using equation 2 (Fig. 7).
To show the geographical distribution of the expected
error we prepared the map shown in Fig. 8.
Testing the skill of the susceptibility model to
forecast “future” landslides can best be accomplished
by using landslide information not available to construct
the susceptibility model. This study used independent
shallow landslide information obtained from two eventinventory maps showing new slope failures triggered by
rapid snow melting (Fig. 10A) and by intense rainfall
(Fig. 10B). Chung and Fabbri (2003, 2005) obtained a
similar result by splitting a multi-temporal inventory
into two temporal subsets, i.e., a training set containing
landslides that occurred before an established date, and a
classification set showing landslides that occurred after
that date. The landslide classification set was used to test
the forecasting performance of the model constructed
from the training set. We argue that our scheme for
testing the model prediction performance is superior,
given the availability of new and independent landslide
information (i.e., the event inventories shown in Fig.
10). In this scheme, to construct the susceptibility model
the entire information about past landslides is exploited
and not a limited subset. As a potential drawback, this
scheme is more limited because a reduced number of
landslides (406 snowmelt-induced and 152 rainfallinduced shallow landslides) is used to ascertain the
model prediction skill.
Chung and Fabbri (2003, 2005) also proposed
splitting the study area geographically into two subareas of equal size: a training (modelling) area and a
validation (classification) area. In this scheme, the
model is constructed in the training area and its
prediction performance is evaluated in the validation
area. Splitting the study area into two adjacent sub-areas
can be problematic. The approach assumes that the
independent thematic variables remain the same in the
training and the classification areas. A rock type or land
use class may be present in an area (e.g., the training
area) but may not be represented in another area (e.g.,
the verification area), making it difficult (or even
impossible) to apply the classification function obtained
in the training area. Further, the approach assumes that
the role of the ensemble of thematic variables in
explaining the known distribution of landslides does
not change geographically. In places, validity of this
assumption may be difficult to establish.
Table 8 lists a set of criteria for ranking and
comparing the quality of landslide susceptibility assessments. Based on the listed criteria, when no information
is available on the quality of a landslide susceptibility
model, the resulting zoning map has the lowest possible
level of quality (level 0). We consider this level of
quality unacceptable for modern, statistical or physically
based susceptibility models. When estimates of model
fit are available, the susceptibility assessment has the
least acceptable quality level (level 1). When the error
associated with the predicted susceptibility estimate for
each mapping unit is established, the susceptibility
assessment has a higher level of quality (level 2). Lastly,
when the prediction skill of the model is known, the
susceptibility assessment has a still higher quality rank
(level 4). The proposed scheme allows for summing the
individual quality levels. As an example, a susceptibility
assessment for which the fitting performance (level 1)
and prediction skill (level 4) were determined is quality
level 5. When, for the same susceptibility assessment,
the error associated with the predicted susceptibility for
each mapping unit is established (level 2), the quality
level becomes 7. Adopting the proposed scheme, the
landslide susceptibility model prepared for the Collazzone area has the highest quality level (level 7).
The established set of tests does not guarantee, as
such, the quality of the susceptibility estimate. To obtain
this, the results of the tests must be matched against
established acceptance thresholds. Defining acceptance
thresholds is not an easy task. In the following, based on
the experience gained in landslide susceptibility assessments completed in southern (Carrara, 1983), central
(Carrara et al., 1991, 1995; Guzzetti et al., 1999, 2006;
Cardinali et al., 2002; Carrara et al., 2003) and northern
(Ardizzone et al., 2002; Carrara et al., 2003; Guzzetti et
Table 8
Proposed criteria and levels of quality for landslide susceptibility
models and associated maps
Description
Level
No information available, or no test performed to determine
the quality and prediction skill of the landslide
susceptibility assessment.
Estimates of degree of model fit are available
(tests performed using the same landslide information
used to obtain the susceptibility estimate).
Estimates of the error associated with the predicted
susceptibility value in each terrain unit are available
(tests performed using the same landslide information
used to obtain the susceptibility estimate).
Estimates of the model prediction performance are available
(tests performed using independent landslide
information, not used to obtain the susceptibility model).
0
1
2
4
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
al., 2005) Italy, we propose acceptance thresholds, and
then compare the results of our tests to the proposed
thresholds.
We consider acceptable a susceptibility model with
an overall degree of model fit greater than about 75%. If
the overall model fit is greater that 80%, we regard the
classification as very satisfactory. An extremely high
value of the overall model fit (e.g., ≥ 90%) is an
indication that the model matches too closely the
original landslide inventory map. When such a case
arises, the model prediction is virtually indistinct from a
prediction made using solely the landslide inventory,
making the model suspicious. The case may arise, where
the spatial distribution of landslides is “trivial” to
forecast or where the number of mapping units is very
small compared to the number of the explanatory
variables (e.g., Campus et al., 1999). An additional
indication of the higher quality of the model consists of
a reduced number (e.g., ≤ 15%) of mapping units with
landslides erroneously classified as “stable” areas by the
model. The overall fit obtained for the susceptibility
model prepared for the Collazzone area was 77.0%
(Table 4), and the proportion of mapping units with
landslides erroneously classified as stable areas was
9.5% (85 units).
A statistical model obtained using a reduced number
of geomorphologically meaningful explanatory variables is less expensive and, thus, superior to a model
which uses a very large number of variables. Further,
use of a stable combination of variables provides for a
robust model that can cope with uncertainty in the input
data. The discriminant function used to construct the
susceptibility model shown in Fig. 3 selected 16 of the
46 available thematic variables (34.8%). Our analysis
(Table 7) revealed that the selected variables were
highly consistent in classifying the mapping units as
stable or unstable in a large number of models. We
consider this an indication of the robustness of the
selected model.
Apart from the example discussed in this work (Figs.
7 and 8), we are not aware of any other susceptibility
assessment for which the error associated with the
probabilistic estimate of landslide occurrence was
determined for individual mapping units. Establishing
an acceptance threshold is therefore difficult. Inspection
of Fig. 7 reveals that most mapping units have an
expected error (2σ) lower than 10% of the probability
estimate. This figure is taken as a quality acceptance
threshold for the model error. In the model presented
here (Fig. 3), there are only 21 mapping units (2.35%)
that do not match this criterion. Most of the latter units
are in the “unclassified” probability range (Fig. 7).
181
To appraise the fitting performance and the prediction skill of a landslide susceptibility model, Chung and
Fabbri (2003) proposed comparing the proportion of
landslide area in each susceptibility class (AL) with the
proportion of the susceptibility class (AS) in the study
area. For a successful classification, the “effectiveness
ratio” AL/AS should be greater than one in the areas
predicted as landslide prone by the model, and less than
one in the areas predicted as stable by the model. A very
effective prediction class has a ratio close to zero or very
large, depending whether the class predicts stability or
instability. Where the effectiveness ratio is near one, the
proportion of landslides in the susceptibility class is not
different from the average landslide density in the study
area, and the performance of the susceptibility class in
determining the known (“fitting” performance) or the
future (“prediction” skill) location of landslides is weak.
Chung and Fabbri (2003) considered “effective” a
susceptibility class with a ratio larger than at least 3
(unstable areas) or less than at most 0.2 (stable areas),
and “significantly effective” a susceptibility class with a
ratio larger than at least 6 or less than at most 0.1. We
regard these criteria as very hard to match, particularly
in complex areas where landslides are large and
numerous, and where the landscape exhibits considerable geomorphological variability. We consider “effective” a susceptibility class with an effectiveness ratio
larger than 1.5 or smaller than 0.5, corresponding to a
50% increase or a 50% decrease from the expected
proportion of landslides in the susceptibility class.
Fig. 13 shows the efficacy of the susceptibility model
shown in Fig. 3 in describing the known distribution of
landslides (“fitting” performance, Fig. 13A), and the
location of “future” landslides (“prediction” skill, Fig.
13B and C). Based on the established criteria, 12 of the
20 landslides susceptibility classes are “effective” in
explaining the distribution of the known (past) landslides used to construct the model. In Fig. 13A, the black
and the white bars exceeding the 1.5 and the 0.5
thresholds, respectively, show these effective classes. In
the figure, the three cross-hachured bars represent
terrain units classified as unstable (spatial probability
in the range from 0.55 to 0.70) where landslides were
not abundant in the multi-temporal inventory map.
Comparison of Fig. 13B and C indicates that the
individual susceptibility classes were better predictors of
the presence (black bars) or the absence (white bars) of
the snowmelt-induced landslides than of the rainfallinduced landslides. For the latter, the number of
“ineffective” classes is also larger.
It should be clear that the proposed acceptance
thresholds are not absolute or fixed. The proposed limits
182
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
in errors that affect a susceptibility assessment. These
factors include (i) imprecision and incompleteness in the
landslide information used to construct and test the
susceptibility model (Carrara et al., 1992; Ardizzone et al.,
2002); (ii) quality, abundance, precision and completeness
of the thematic data used to obtain the susceptibility
assessment (Carrara et al., 1992; Soeters and van Westen,
1996; Carrara et al., 1999); (iii) characteristics and
limitations of the statistical technique used to obtain the
classification, including the experience of the investigator
in applying the selected statistical tools (Carrara et al.,
1992; Michie et al., 1994); and (iv) selection of the
appropriate mapping unit (e.g., slope units, unique
condition units, grid cells, etc.) (Carrara et al., 1995;
Guzzetti et al., 1999). The listed factors require choices
from the investigator, which inevitably introduce uncertainty in the susceptibility assessment. For this work we
assumed the landslide information used to construct the
model (Fig. 2, Table 1B) and to test the model prediction
skill (Fig. 10A and B) was accurate and complete. We also
assumed the thematic data were correct and complete and
relevant to the distribution of landslides in the Collazzone
area. We further assumed that the terrain subdivision
adopted to ascertain landslide susceptibility was precise in
describing the morphology of the area, and apt to explain
the size and abundance of landslides in the study area.
Finally, landslide susceptibility is just one of the three
components of landslide hazard (Guzzetti et al., 1999,
2005, 2006). In addition to landslide susceptibility, to
determine landslide hazard one has to ascertain the
temporal frequency (i.e., recurrence) of the landslides, and
the statistics of landslide size. These probabilistic
assessments are affected by errors which should be
identified and determined.
8. Conclusions
Fig. 13. Analysis of the effectiveness of the landslide susceptibility
classification.
were selected heuristically, based on the results of
landslide susceptibility assessments carried out in Italy
in the last 25 years. The acceptance criteria need to be
tested in other areas and by different investigators.
Depending on the geomorphological setting and the
complexity of a study area, other investigators may
select different thresholds.
The framework discussed for the evaluation of the
quality of a landslide susceptibility model considers the
most relevant sources of errors in a statistically based
susceptibility assessment, but other factors exist resulting
Where accurate landslide inventory maps have been
prepared, the availability of user-friendly GIS software
and of digital cartographic databases containing morphological, geological, land use and other environmental information has made it easy for geomorphologists to
obtain digital landslide susceptibility maps. These maps
attempt to zone an area based on the propensity of a
territory to produce new or reactivated landslides, and
they have proven to be very valuable for land use
planning, policymaking, and civil defence (Brabb,
1984; Varnes and IAEG, 1984; Guzzetti et al., 2000;
Glade et al., 2005). As in any other prediction, a
landslide susceptibility assessment requires proper
validation to ascertain its quality and prediction skills.
Unfortunately, inspection of the literature reveals that
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
this is rarely performed. Forecast without validation is of
as little use in science as in common life (Jollifee and
Stephenson, 2003). Landslide susceptibility assessments are no exception, and lack of proper verification
(or falsification) jeopardizes the use of a susceptibility
map. Indeed, if a geomorphologist cannot define the
reliability, robustness, degree of fitting and prediction
skill (i.e., the quality) of a susceptibility assessment,
why should a planner, policy-maker or civil defence
manager use the prediction?
In this paper, we have proposed a framework to
address this problem. The framework is based on a set of
tests and related acceptance criteria aimed at establishing and ranking the quality of a landslide susceptibility
assessment, including the degree of model fit, the
robustness of the model to changes in the input data, the
error associated with the probabilistic estimate, and the
model prediction skill. The proposed framework has
been successfully tested in the Collazzone area. The
experiment has demonstrated that adopting a simple,
statistical classification method applied to good-quality
data allows geomorphologists to prepare and validate a
landslide susceptibility model, at least for areas of the
size of the Collazzone study area. It remains to be
demonstrated if the same set of tests and related
acceptance criteria are applicable to larger areas,
extending for hundreds or even thousands of square
kilometres (e.g., Cardinali et al., 2002).
Two noteworthy improvements over existing landslide susceptibility modelling efforts have been
obtained. The first improvement consists of providing
an estimate of the error associated with the probability of
landslide spatial occurrence (i.e., susceptibility) obtained for each mapping unit in which a study area is
partitioned. This is particularly significant where
landslide susceptibility assessments are prepared to be
used for planning purposes or for establishing land use
regulations (Guzzetti et al., 2000). The second improvement consists of having established clearly defined
criteria and associated acceptance thresholds for determining and ranking the quality of a landslide susceptibility assessment. If adopted, the proposed framework
will provide for quantitative comparisons of the results
obtained by different investigators working in different
areas, and using different methods, to predict landslide
susceptibility. Ultimately, this will add to the credibility
of our products and the quality of our science.
Acknowledgements
Work supported by CNR-GNDCI (Publication Number 2893) and CNR-IRPI grants. We thank Earl E.
183
Brabb, the editor, and two anonymous referees for their
comments.
References
Ardizzone, F., Cardinali, M., Carrara, A., Guzzetti, F., Reichenbach, P.,
2002. Uncertainty and errors in landslide mapping and landslide
hazard assessment. Natural Hazards and Earth System Sciences 2
(1–2), 3–14.
Brabb, E.E., 1984. Innovative approach to landslide hazard and risk
mapping. Proceedings of the 4th International Symposium on
Landslides, Toronto, vol. 1, pp. 307–324.
Brown, C.E., 1998. Applied Multivariate Statistics in Geohydrology
and Related Sciences. Springer-Verlag. 248 pp.
Campus, S., Forlati, F., Pegoraro, C., 1999. Studio propedeutico alla
valutazione della pericolosità geologica inerente l'instabilità dei
versanti mediante tecniche GIS ed approccio statistico multivariato. In: Piemonte, Regione (Ed.), Eventi Alluvionali in
Piemonte, Regione Piemonte, Torino, pp. 288–295.
Cardinali, M., Ardizzone, F., Galli, M., Guzzetti, F., Reichenbach, P.,
2000. Landslides triggered by rapid snow melting: the December
1996–January 1997 event in Central Italy. In: Claps, P., Siccardi, F.
(Eds.), Proceedings 1st Plinius Conference on Mediterranean
Storms. Bios Publisher, Cosenza, pp. 439–448.
Cardinali, M., Carrara, A., Guzzetti, F., Reichenbach, P., 2002.
Landslide hazard map for the Upper Tiber River basin. CNR,
Gruppo Nazionale per la Difesa dalle Catastrofi Idrogeologiche,
Publication n. 2116, scale 1:100,000.
Carrara, A., 1983. A multivariate model for landslide hazard
evaluation. Mathematical Geology 15, 403–426.
Carrara, A., Cardinali, M., Detti, R., Guzzetti, F., Pasqui, V.,
Reichenbach, P., 1991. GIS Techniques and statistical models in
evaluating landslide hazard. Earth Surface Processes and Landforms 16 (5), 427–445.
Carrara, A., Cardinali, M., Guzzetti, F., 1992. Uncertainty in assessing
landslide hazard and risk. ITC Journal 2, 172–183.
Carrara, A., Cardinali, M., Guzzetti, F., Reichenbach, P., 1995. GIS
technology in mapping landslide hazard. In: Carrara, A., Guzzetti,
F. (Eds.), Geographical Information Systems in Assessing Natural
Hazards. Kluwer Academic Publisher, Dordrecht, The Netherlands, pp. 135–175.
Carrara, A., Guzzetti, F., Cardinali, M., Reichenbach, P., 1999. Use of
GIS technology in the prediction and monitoring of landslide
hazard. Natural Hazards 20 (2–3), 117–135.
Carrara, A., Crosta, G.B., Frattini, P., 2003. Geomorphological and
historical data in assessing landslide hazard. Earth Surface
Processes and Landforms 28 (10), 1125–1142.
Chung, C.-J.F., Fabbri, A.G., 1999. Probabilistic prediction models for
landslide hazard mapping. Photogrammetric Engineering &
Remote Sensing 65 (12), 1389–1399.
Chung, C.-J.F., Fabbri, A.G., 2003. Validation of spatial prediction
models for landslide hazard mapping. Natural Hazards 30 (3),
451–472.
Chung, C.-J.F., Fabbri, A.G., 2005. Systematic procedures of landslide
hazard mapping for risk assessment using spatial prediction
models. In: Glade, T., Anderson, M.G., Crozier, M.J. (Eds.),
Landslide Risk Assessment. John Wiley, pp. 139–174.
Cohen, J., 1960. A coefficient of agreement for nominal scales.
Educational and Psychological Measurement 20, 37–46.
Cruden, D., Varnes, D., 1996. Landslide types and processes. In:
Turner, A., Schuster, R. (Eds.), Landslides: Investigation and
184
F. Guzzetti et al. / Geomorphology 81 (2006) 166–184
Mitigation. Transportation Research Board, National Research
Council, vol. 247, pp. 36–75.
Fabbri, A.G., Chung, C.-J.F., Cendrero, C., Remondo, J., 2003. Is
prediction of future landslides possible with a GIS? Natural
Hazards 30 (3), 487–503.
Fisher, R.A., 1936. The use of multiple measurements in taxonomic
problems. Annales Eugenics 7, 179–188.
Glade, T., Anderson, M.G., Crozier, M.J. (Eds.), 2005. Landslide Risk
Assessment. John Wiley and Sons. 832 pp.
Gorsevski, P.V., Gessler, P.E., Jankowski, P., 2003. Integrating a fuzzy
k-means classification and a Bayesian approach for spatial
prediction of landslide hazard. Journal of Geographical Systems
5 (3), 223–251.
Guzzetti, F., Carrara, A., Cardinali, M., Reichenbach, P., 1999.
Landslide hazard evaluation: an aid to a sustainable development.
Geomorphology 31, 181–216.
Guzzetti, F., Cardinali, M., Reichenbach, P., Carrara, A., 2000.
Comparing landslide maps: a case study in the upper Tiber River
Basin, central Italy. Environmental Management 25 (3), 247–363.
Guzzetti, F., Reichenbach, P., Cardinali, M., Ardizzone, F., Galli, M.,
2003. Impact of landslides in the Umbria Region, Central Italy.
Natural Hazards and Earth System Sciences 3 (5), 469–486.
Guzzetti, F., Reichenbach, P., Cardinali, M., Galli, M., Ardizzone, F.,
2005. Landslide hazard assessment in the Staffora basin, northern
Italian Apennines. Geomorphology 72, 272–299.
Guzzetti, F., Galli, M., Reichenbach, P., Ardizzone, F., Cardinali, M.,
2006. Landslide hazard assessment in the Collazzone area,
Umbria, central Italy. Natural Hazards and Earth System Sciences
6, 115–131.
Hoehler, F.K., 2000. Bias and prevalence effects on kappa viewed in
terms of sensitivity and specificity. Journal of Clinical Epidemiology 53 (5), 499–503.
Jollifee, I.T., Stephenson, D.B., 2003. Forecast Verification. A
Practitioner's Guide in Atmospheric Science. John Wiley. 240 pp.
Landis, J.R., Kock, G.G., 1977. The measurement of observer
agreement for categorial data. Biometrics 33, 159–174.
Mason, I.B., 2003. Binary events. Chapter 3. In: Jollifee, I.T.,
Stephenson, D.B. (Eds.), Forecast Verification. A Practitioner's
Guide in Atmospheric Science. John Wiley, pp. 37–76.
McCalpin, J., 1984. Preliminary age classification of landslides for
inventory mapping. 21st Annual Symposium on Engineering
Geology and Soils Engineering, April 5–6, 1984, Pocatello, Idaho.
13 pp.
Michie, D., Spiegelhalter, D.J., Taylor, C.C. (Eds.) 1994. Machine
Learning, Neural and Statistical Classification. Internet version
available at https://www.amsta.leeds.ac.uk/~charles/statlog/.
Remondo, J., Gonzalez, A., Diaz De Teran, J.R., Cendrero, A., Fabbri,
A., Chung, C.-J.F., 2003. Validation of Landslide Susceptibility
Maps; Examples and Applications from a Case Study in Northern
Spain. Natural Hazards 30, 437–449.
Soeters, R., van Westen, C.J., 1996. Slope instability recognition
analysis and zonation. In: Turner, A.K., Schuster, R.L. (Eds.),
Landslide Investigation and Mitigation, National Research
Council. Transportation Research Board Special Report, vol.
247. National Academy Press, Washington, D.C., pp. 129–177.
SPSS, 2004. SPSS 13.0 Command Syntax Reference. SPSS Inc.
Chicago. 1994 p.
Varnes, D.J., and IAEG Commission on Landslides and other MassMovements, 1984. Landslide Hazard Zonation: A Review of
Principles and Practice. UNESCO Press, Paris. 63 pp.