Fast Area-Weighted Peeling of Convex Hulls for Outlier Detection

This work is supported in part by Independent Research Fund Denmark grant 9131-00113B and a fellowship from the Department of Computer Science at UC Irvine.

Vinesh Sridhar University of California, Irvine, [email protected]    Rolf Svenning The Department of Computer Science, Aarhus University, [email protected]
Abstract

We present a novel 2D convex hull peeling algorithm for outlier detection, which repeatedly removes the point on the hull that decreases the hull's area the most. To find $k$ outliers among $n$ points, one simply peels $k$ points. The algorithm is an efficient heuristic for exact methods, which find the $k$ points whose removal together results in the smallest convex hull. Our algorithm runs in $\mathcal{O}(n\log n)$ time using $\mathcal{O}(n)$ space for any choice of $k$. This is a significant speedup compared to the fastest exact algorithms, which run in $\mathcal{O}(n^{2}\log n+(n-k)^{3})$ time using $\mathcal{O}(n\log n+(n-k)^{3})$ space by Eppstein et al. [12, 14], and $\mathcal{O}(n\log n+\binom{4k}{2k}(3k)^{k}n)$ time by Atanassov et al. [4]. Existing heuristic peeling approaches are not area-based. Instead, an approach by Harsh et al. [17] repeatedly removes the point furthest from the mean using various distance metrics and runs in $\mathcal{O}(n\log n+kn)$ time.
Other approaches greedily peel one convex layer at a time [20, 2, 19, 30], which is efficient when using an $\mathcal{O}(n\log n)$ time algorithm by Chazelle [7] to compute the convex layers. However, in many cases this fails to recover outliers. For most values of $n$ and $k$, our approach is the fastest and first practical choice for finding outliers based on minimizing the area of the convex hull. Our algorithm also generalizes to other objectives such as perimeter.

1 Introduction

When performing data analysis, a critical first step is to identify outliers in the data. This has applications in data exploration, clustering, and statistical analysis [31, 9, 23]. Typical methods of outlier detection such as Grubbs' test [15] are based in statistics and require strong assumptions about the distribution from which the sample is taken. These are known as parametric outlier detection tests. If the sample size is too small or the distribution assumptions are incorrect, parametric tests can produce misleading results. For these reasons, complementary non-parametric approaches based in computational geometry have emerged. Our work follows this line of research and is based on the fundamental notion of a convex hull. For a set of points $P$, the convex hull is the smallest convex set containing $P$ [10].

There are numerous definitions of outliers [22, 28, 3], but a general theme is that points without many close neighbors are likely to be outliers. As such, these outlying points tend to have a large effect on the shape of the convex hull. Prior work has applied this insight in different ways to identify possible outliers, such as removing points from the convex hull to minimize its diameter [1, 13], its perimeter [11], or its area [14, 12]. Motivated by the last category, we will consider likely outliers to be points whose removal causes the area of the convex hull to shrink the most. We propose a greedy algorithm that repeatedly removes the point $p \in P$ such that the area of $P$'s convex hull decreases the most. We call the amount the area would decrease if point $p$ is removed its sensitivity $\sigma(p)$. The removed point is guaranteed to be on the convex hull, and such an algorithm is known as a convex hull peeling algorithm [19, 30]. To find $k$ outliers, we peel $k$ points. Our algorithm is conceptually simple, though it relies on the black-box use of a dynamic (or deletion-only) convex hull data structure [18, 6]. We assume that points are in general position. This assumption may be lifted using perturbation methods [25].
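As a point of reference, the greedy rule can be sketched directly from the definition of sensitivity. The Python sketch below is a brute-force baseline and not the algorithm of this paper: it recomputes the hull from scratch for every candidate, so it is far slower than $\mathcal{O}(n\log n)$. The helper names (`convex_hull`, `polygon_area`, `greedy_peel`) are illustrative assumptions, and the input is assumed to consist of distinct coordinate tuples in general position.

```python
def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def chain(seq):
        out = []
        for p in seq:
            while len(out) >= 2 and cross(out[-2], out[-1], p) <= 0:
                out.pop()
            out.append(p)
        return out
    return chain(pts)[:-1] + chain(reversed(pts))[:-1]

def polygon_area(poly):
    """Shoelace formula; returns 0 for fewer than three vertices."""
    m = len(poly)
    return abs(sum(poly[i][0] * poly[(i + 1) % m][1]
                   - poly[(i + 1) % m][0] * poly[i][1] for i in range(m))) / 2

def greedy_peel(points, k):
    """Peel k points, each time the hull vertex with the largest sensitivity."""
    remaining, peeled = list(points), []
    for _ in range(k):
        hull = convex_hull(remaining)
        base = polygon_area(hull)
        # Sensitivity of u: hull area lost when u alone is removed.
        best = max(hull, key=lambda u: base - polygon_area(
            convex_hull([p for p in remaining if p != u])))
        remaining.remove(best)
        peeled.append(best)
    return peeled, remaining
```

On a square with two interior points and one far-away point, for instance, the first peel removes the far-away point, since deleting it shrinks the hull area the most.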

Figure 1: Here point $v$ was peeled from the convex hull and replaced by $v'$. The previous triangle $\triangle tuv$ for $u$ contained no points. However, when $u$'s triangle becomes $\triangle tuv'$, the set of points $\Delta A$ affects the sensitivity $\sigma(u)$ of $u$. The size of $\Delta A$ may be $\Omega(n)$.

The main challenge is maintaining the sensitivities as points are peeled. When peeling a single point $v$, there may be $\Omega(n)$ new points affecting the sensitivity $\sigma(u)$ for a different point $u \neq v$, as in Figure 1. In that case, naively computing the new sensitivity $\sigma(u)$ would take $\Omega(n)$ time. Nevertheless, we show that our algorithm runs in $\mathcal{O}(n\log n)$ time for any $1 \leq k \leq n$.

2 Related work

The two existing approaches for finding outliers based on the area of the convex hull take a more idealized approach: they find the $k$ points (outliers) whose removal together causes the area of the convex hull to decrease the most. We call this a $k$-peel and note that it always yields an area smaller than or equal to that of performing $k$ individual $1$-peels. It is not hard to come up with examples where the difference in area between the two approaches is arbitrarily large, such as in Figure 2. Still, these examples are quite artificial and require that outliers have at least one other point close by. More importantly, these methods are combinatorial in nature and much less efficient than our algorithm. The state-of-the-art algorithms for performing a $k$-peel run in $\mathcal{O}(n^{2}\log n+(n-k)^{3})$ time and $\mathcal{O}(n\log n+(n-k)^{3})$ space by Eppstein et al. [12, 14] and $\mathcal{O}(n\log n+\binom{4k}{2k}(3k)^{k}n)$ time by Atanassov et al. [4]. While these are excellent theoretical results, for most values of $n$ and $1 \leq k \leq n$ the running time of both algorithms is prohibitive for practical purposes. Our contribution is a fast and practical heuristic for these ideal approaches.
There are also several results for finding the $k$ points minimizing other objectives such as the minimum diameter, perimeter, or area-enclosing rectangle [13, 29].

Figure 2: This figure demonstrates the limitations of our heuristic weighted-peeling approach. Clearly, the red squares are outliers, but because there are two squares close by, the sensitivity of the red squares is minimal. Thus, our algorithm may peel all the valid points before peeling the outlier squares. Note that two $k$-peels for $k=2$ would be sufficient to remove all outliers.

Another convex hull peeling algorithm is presented in [17]. Unlike in area-based peeling, they repeatedly remove the point furthest from the mean under various distance metrics. Letting $d$ be the time to compute the distance between two points, their algorithm runs in $\mathcal{O}(n\log n+knd)$ time, which is also significantly slower than our algorithm for most values of $k$. Since they maintain the mean of the remaining points during the peeling process, each peel takes $\Theta(n)$ time.

Some depth-based outlier detection methods also use convex hulls. They compute a point set's convex layers, which can be defined by iteratively computing $P \setminus CH(P)$ and are computable in $\mathcal{O}(n\log n)$ time [7]. Here, points are deleted from the outermost layer inward [20, 2, 19, 30]. While efficient, the natural example in Figure 3 is a bad instance for this approach.
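For comparison, the convex layers used by depth-based methods can be sketched as follows. This is a naive illustration that recomputes a hull per layer, which takes $\mathcal{O}(n^{2}\log n)$ time in the worst case; it is not Chazelle's $\mathcal{O}(n\log n)$ algorithm [7]. The function names are assumptions made for this sketch.

```python
def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def chain(seq):
        out = []
        for p in seq:
            while len(out) >= 2 and cross(out[-2], out[-1], p) <= 0:
                out.pop()
            out.append(p)
        return out
    return chain(pts)[:-1] + chain(reversed(pts))[:-1]

def convex_layers(points):
    """Repeatedly strip the hull CH(P) from P; layers[0] is the outermost."""
    remaining = set(points)
    layers = []
    while remaining:
        hull = convex_hull(list(remaining))
        layers.append(hull)
        remaining -= set(hull)   # P <- P \ CH(P)
    return layers
```

Depth-based peeling deletes points in the order of `layers`, so an outlier that happens to sit on an inner layer is only reached after every point on the layers outside it has been removed.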

Figure 3: This example shows points drawn uniformly from a target disk $P$. Clearly, the outliers are the points marked as red squares. It shows the downside of peeling based on depth since many points have to be peeled before reaching the outliers on the second and third layers. In particular, if there are $n$ points drawn uniformly from $P$, then its convex hull has expected size $\mathcal{O}(n^{1/3})$ [16].

3 Results

The main result of our paper is Theorem 3.1, that there exists an algorithm for efficiently performing area-weighted-peeling.

Theorem 3.1.

Given $n$ points in 2D, Algorithm 1 performs area-weighted-peeling, repeatedly removing the point from the convex hull which causes its area to decrease the most, in $\mathcal{O}(n\log n)$ time.

To prove Theorem 3.1, we derive Theorem 6.7, which bounds the total number of times points become active in any 2D convex hull peeling process to $\mathcal{O}(n)$.

Definition 3.2 (Active Points).

Let $(t,u,v)$ be consecutive points on the first layer in clockwise order. A point $p$ is active for $u$ if, upon deleting $u$ and restoring the first and second layers, $p$ moves to the first layer.

Intuitively, the active points are the points not on the convex hull that affect the sensitivities. Note that the active points form a subset of the points on the second convex layer. We define $A(u)$ to be the set of active points for point $u$ in a given configuration. Furthermore, all points in $A(u)$ can be found by performing gift-wrapping starting from $u$'s counterclockwise neighbor $t$ while ignoring $u$. We use this ordering for the points in $A(u)$. In Theorem 7.1, we show that our algorithm generalizes to other objectives such as perimeter where the sensitivity only depends on the points on the first layer and the active points.
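Definition 3.2 can be checked on small inputs with a direct, unoptimized sketch: delete $u$, recompute the hull, and report the vertices that were not on the hull before. This is only an illustration of the definition under the general position assumption, not the tangent-query procedure used by the algorithm, and the name `active_points` is an assumption. The points are returned in the counterclockwise order produced by the hull routine, which need not match the gift-wrapping order used in the text.

```python
def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def chain(seq):
        out = []
        for p in seq:
            while len(out) >= 2 and cross(out[-2], out[-1], p) <= 0:
                out.pop()
            out.append(p)
        return out
    return chain(pts)[:-1] + chain(reversed(pts))[:-1]

def active_points(points, u):
    """Points that move to the first layer when hull vertex u is deleted."""
    before = set(convex_hull(points))
    after = convex_hull([p for p in points if p != u])
    return [p for p in after if p not in before]
```

For the square with corners $(0,0)$, $(4,0)$, $(4,4)$, $(0,4)$ and interior points $(2,3)$ and $(3,2)$, both interior points are active for the corner $(4,4)$, while the corner $(0,0)$ has no active points.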

4 Machinery

In this section, we describe some of the existing techniques we use. To efficiently calculate how much the hull shrinks when a point is peeled, we perform tangent queries from the neighbors of the peeled point to the second convex layer. The tangents from a point $q$ to a convex polygon $L$ can be found in $\mathcal{O}(\log n)$ time both with [27] and without [21] a line separating $q$ and $L$. In our application, such a separating line is always available, and either approach can be used. Tangent queries require that $L$ is represented as an array or a balanced binary search tree of its vertices ordered (cyclically) as they appear on the perimeter of $L$. To allow efficient updates to $L$ we use a binary tree representation that is leaf-linked such that given a pointer to a vertex its successor/predecessor can be found in $\mathcal{O}(1)$ time.
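To make the notion of a tangent query concrete, the following sketch finds the two tangent vertices by a linear scan. It is deliberately simple and runs in $\mathcal{O}(|L|)$ time, so it is a stand-in for, not an implementation of, the $\mathcal{O}(\log n)$ binary-search methods of [27] and [21]. It assumes that `poly` lists the vertices of a convex polygon in cyclic order, that `q` lies strictly outside it, and that points are in general position; the function names are hypothetical.

```python
def orient(a, b, p):
    """Positive if p is left of the directed line a -> b, negative if right."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def tangent_points(q, poly):
    """Vertices of convex polygon `poly` touched by the tangents from q.

    A vertex is a tangent point exactly when both of its neighbors lie on
    the same side of the line through q and that vertex.
    """
    m = len(poly)
    result = []
    for i in range(m):
        prev_side = orient(q, poly[i], poly[i - 1])
        next_side = orient(q, poly[i], poly[(i + 1) % m])
        if prev_side * next_side > 0:
            result.append(poly[i])
    return result
```

The same predicate is what a logarithmic-time search evaluates at each probed vertex; only the way candidate vertices are chosen differs.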

The convex layers of $n$ points can be computed in $\mathcal{O}(n\log n)$ time using an algorithm by Chazelle [7]. Given $l$ convex layers, after a single peel they can be restored in $\mathcal{O}(l\log n)$ time (Lemma 3.3 [24]). However, for our purposes we only need the two outermost layers for area calculations. As such, we explicitly maintain the two outermost layers $L^{1}$ and $L^{2}$, and we store all remaining points $P \setminus \{L^{1} \cup L^{2}\}$ in a center convex hull. To restore $L^{1}$ we use tangent queries on $L^{2}$ as in [24]. To restore $L^{2}$ we use extreme point queries on the center convex hull, which we maintain using a semi-dynamic [18] or fully-dynamic [6] convex hull data structure supporting extreme point queries in worst case $\mathcal{O}(\log n)$ time and updates in amortized $\mathcal{O}(\log n)$ time.

5 Area-Weighted-Peeling Algorithm

In this section, we describe Algorithm 1 in detail and show that its running time is $\mathcal{O}(n\log n)$.

Figure 4: Using $u$'s neighbors, we can perform two tangent queries on $L^{2}$ to recover the first and last active point of $u$, labeled $u_{s}$ and $u_{e}$ respectively, in $\mathcal{O}(\log n)$ time. Because we represent $L^{2}$ as a leaf-linked tree, we can walk along $L^{2}$ to recover all points of $A(u)$. The shaded part of the figure represents $\sigma(u)$.

At a high level, we want to repeatedly identify and remove the point which causes the area of the convex hull to decrease the most. Such an iteration is a peel, and we call the amount the area would decrease if point $u$ was peeled the sensitivity $\sigma(u)$ of $u$. To efficiently find the point to peel, we maintain a priority queue $Q$ on the sensitivities of hull points. Only points on the convex hull may have positive sensitivity, and in lines 1-1 we compute the initial sensitivities of the points on the convex hull and store them in $Q$. For a hull point $u$, to compute its sensitivity $\sigma(u)$ we find its active points $A(u)$. Note they must be on the second convex layer, and if $u$'s neighbors are $t$ and $v$, then the points $A(u)$ are in the triangle $\triangle tuv$. In line 1 we compute the two outer convex hull layers represented as balanced binary trees. That allows us to compute $A(u)$ using tangent queries on the inner layer from $t$ and $v$. Then $\sigma(u)$ can be found by computing the area of the polygon $\pentagon(t \circ u \circ v \circ A(u))$.
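Once the active points are known, the sensitivity is a single polygon-area computation. The sketch below assumes that `chain` holds $A(u)$ ordered from $t$'s side to $v$'s side, as in the gift-wrapping order described earlier, and therefore walks it in reverse to close the polygon $t, u, v, A(u)$ as a simple cycle. The function names are illustrative and not taken from the paper.

```python
def polygon_area(poly):
    """Shoelace formula for a simple polygon given in cyclic order."""
    m = len(poly)
    twice = sum(poly[i][0] * poly[(i + 1) % m][1]
                - poly[(i + 1) % m][0] * poly[i][1] for i in range(m))
    return abs(twice) / 2

def sensitivity(t, u, v, chain):
    """Area removed from the hull when u is peeled.

    t, v  : hull neighbors of u
    chain : active points A(u), ordered from t's side to v's side
    """
    boundary = [t, u, v] + list(reversed(chain))
    return polygon_area(boundary)
```

With no active points the result is simply the area of $\triangle tuv$. Each call takes time linear in $|A(u)|$.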

As points are peeled (lines 1-1) layers $L^{1}$ and $L^{2}$ must be restored. To restore $L^{1}$ when point $u$ is peeled (line 1) we perform tangent queries on $L^{2}$ as in [24] to find $u$'s active points $A(u)$ (line 1) and move $A(u)$ from $L^{2}$ to $L^{1}$. See Figure 4 for an example of tangent queries from $L^{1}$ to $L^{2}$.

To restore the broken part of $L^{2}$, we perform extreme point queries on the remaining points efficiently using a dynamic convex hull data structure $D_{CH}$ (line 1) as in [18] or [6]. As described in Lemma 6.9, $A(u)$ is always contiguous on $L^{2}$. Therefore, removing $A(u)$ from $L^{2}$ requires us to restore it between two "endpoints" $a$ and $b$. The first extreme point query uses line $\overline{ab}$ in the direction of $u$. If a point $z$ from $D_{CH}$ is found then at least two more queries are performed with lines $\overline{za}$ and $\overline{zb}$. In general, if $k$ points are found then the number of queries is $2k+1$. The $k$ points are deleted from $D_{CH}$. This all happens on line 1.

Next, we compute the sensitivities of the new points on the hull (line 1) and insert them into the priority queue. Finally, we update the sensitivities of $u$'s neighbors $t$ and $v$ (line 1), which, by Lemma 6.1(4), are the only two points already in $Q$ whose sensitivity changes.

Input: A set of $n$ points $P$ in 2D
 1  $L^{1}, L^{2} \longleftarrow$ the first two convex layers of $P$
 2  $Q \longleftarrow$ empty max priority queue
 3  for $i = 1$ to $|L^{1}|$ do
 4      $u \longleftarrow L^{1}_{i}$
 5      Compute sensitivity $\sigma(u)$ for $u$
 6      $Q.insert(u, \sigma(u))$
 7  $D_{CH} \longleftarrow$ a dynamic convex hull data structure on $P \setminus \{L^{1} \cup L^{2}\}$
 8  for $i = 1$ to $n$ do
 9      $u \longleftarrow Q.extractMax$
10      $A(u) \longleftarrow$ $u$'s active points
11      Delete $u$ from $L^{1}$ and update $L^{1}$, $L^{2}$ and $D_{CH}$
12      for $i = 1$ to $|A(u)|$ do
13          $\bar{u} \longleftarrow A(u)_{i}$
14          Compute sensitivity $\sigma(\bar{u})$ for $\bar{u}$
15          $Q.insert(\bar{u}, \sigma(\bar{u}))$
16      $t, v \longleftarrow$ neighbors of $u$ in $L^{1}$
17      Update $Q[t]$ and $Q[v]$
Algorithm 1 Weighted peeling
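The control flow of Algorithm 1 can be mirrored in a short Python sketch. It keeps the key structural idea, namely that after peeling $u$ only the newly exposed points and $u$'s two former neighbors need fresh sensitivities, but replaces every efficient primitive with a naive one: hulls are recomputed from scratch, sensitivities are computed as area differences, and a dictionary scan stands in for the priority queue. It is therefore an assumption-laden illustration with roughly quadratic cost per peel, not the $\mathcal{O}(n\log n)$ algorithm, and all names are hypothetical.

```python
def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def chain(seq):
        out = []
        for p in seq:
            while len(out) >= 2 and cross(out[-2], out[-1], p) <= 0:
                out.pop()
            out.append(p)
        return out
    return chain(pts)[:-1] + chain(reversed(pts))[:-1]

def polygon_area(poly):
    m = len(poly)
    return abs(sum(poly[i][0] * poly[(i + 1) % m][1]
                   - poly[(i + 1) % m][0] * poly[i][1] for i in range(m))) / 2

def sensitivity(u, pts):
    """Naive sensitivity: hull area with u minus hull area without u."""
    rest = [p for p in pts if p != u]
    return polygon_area(convex_hull(pts)) - polygon_area(convex_hull(rest))

def peel_order(points, k):
    """Order in which the first k points are peeled (distinct points assumed)."""
    pts = list(points)
    hull = convex_hull(pts)
    sigma = {u: sensitivity(u, pts) for u in hull}   # initial sensitivities
    order = []
    for _ in range(min(k, len(pts))):
        u = max(sigma, key=sigma.get)                # stands in for extractMax
        i = hull.index(u)
        t, v = hull[i - 1], hull[(i + 1) % len(hull)]
        del sigma[u]
        pts.remove(u)
        hull = convex_hull(pts)
        on_hull = set(hull)
        # New hull points (the former active points of u) plus u's neighbors.
        touched = {p for p in hull if p not in sigma} | ({t, v} & on_hull)
        for p in touched:
            sigma[p] = sensitivity(p, pts)
        order.append(u)
    return order
```

All other hull points keep their stored sensitivity, which is exactly the property stated in Lemma 6.1(4).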

5.1 Analysis

The hardest part of the analysis is showing that the overall time spent on lines 1 and 1 is $\mathcal{O}(n\log n)$. We first show that, excluding the time spent on these lines, the running time of Algorithm 1 is $\mathcal{O}(n\log n)$. In line 1 we compute the first and second convex layers in $\mathcal{O}(n\log n)$ time by running any optimal convex hull algorithm twice. In lines 1 to 1, we compute the initial sensitivities by finding the points active for each $u \in L^{1}$. As described above, we can do this by applying two tangent queries, allowing us to recover the first and last active point for $u$. We can walk along $L^{2}$ between them to recover $A(u)$. Once $A(u)$ is found for each $u$, we find $\sigma(u)$ by computing the area of the polygon $\pentagon(t \circ u \circ v \circ A(u))$, where $t$ and $v$ are $u$'s neighbors. By Lemma 6.1(1), in this initial configuration each point on $L^{2}$ is active in at most three triangles. Thus, we make in total $\mathcal{O}(|L^{2}|) = \mathcal{O}(n)$ tangent queries, each of which costs $\mathcal{O}(\log n)$ time.
Since the area of a simple polygon can be computed in linear time [26], all the area computations take $\sum_{u \in L^{1}} \Theta(1 + |A(u)|) = \mathcal{O}(|L^{1}| + |L^{2}|) = \mathcal{O}(n)$ time. Therefore, the overall time to initialize the priority queue is $\mathcal{O}(n\log n)$.

Initializing $D_{CH}$ in line 1 takes $\mathcal{O}(n\log n)$ time [18]. In line 1, we can perform tangent queries on $L^{2}$ from $t$ and $v$ to find the first and last active points of $u$. In line 1, it will take no more than $\mathcal{O}(n)$ tangent queries to restore $L^{1}$ and $L^{2}$ throughout the algorithm by charging the queries to the points moved from the center convex hull to $L^{2}$ or from $L^{2}$ to $L^{1}$. Using an efficient dynamic convex hull data structure, it takes $\mathcal{O}(\log n)$ amortized time to delete a point and thus $\mathcal{O}(n\log n)$ time overall [18, 6]. We add points to the priority queue $n$ times, delete points from the priority queue $n$ times, and perform $\mathcal{O}(1)$ priority queue update operations for each iteration of the outer loop on line 1. Excluding lines 1 and 1 this establishes the overall $\mathcal{O}(n\log n)$ running time.
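The priority queue operations counted above (insert, extract-max, and key update) can be realized in several ways. One possible realization, sketched here as an assumption rather than as the authors' choice, uses Python's `heapq` with lazy invalidation: an update pushes a fresh entry and the stale one is skipped when it surfaces. Since the algorithm performs $\mathcal{O}(n)$ insertions and updates in total, the heap holds $\mathcal{O}(n)$ entries and every operation costs $\mathcal{O}(\log n)$ amortized time. The class name `MaxPQ` is hypothetical.

```python
import heapq
import itertools

class MaxPQ:
    """Max-priority queue with updatable keys via lazy invalidation."""

    def __init__(self):
        self._heap = []                  # entries: (-priority, stamp, item)
        self._live = {}                  # item -> stamp of its current entry
        self._tick = itertools.count()   # globally unique, increasing stamps

    def insert(self, item, priority):
        stamp = next(self._tick)
        self._live[item] = stamp         # any older entry becomes stale
        heapq.heappush(self._heap, (-priority, stamp, item))

    update = insert                      # an update is a fresh insert

    def extract_max(self):
        while self._heap:
            neg, stamp, item = heapq.heappop(self._heap)
            if self._live.get(item) == stamp:   # skip stale entries
                del self._live[item]
                return item, -neg
        raise KeyError("extract_max from an empty queue")
```

In the peeling loop, `update` would be called for the two neighbors $t$ and $v$ after each peel, and `insert` for each newly exposed hull point.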

To bound the total time spent on line 1 to $\mathcal{O}(n\log n)$, we prove Theorem 6.7, bounding the total number of times points become active to $\mathcal{O}(n)$. Computing $\sigma(\bar{u})$ in line 1 requires us to find $A(\bar{u})$, where $\bar{u}$ is a new point added to the first layer. From the theorem, it takes $\mathcal{O}(n\log n)$ time to compute $A(\bar{u})$ for every $\bar{u}$. In addition, because it takes $\Theta(1+|A(\bar{u})|)$ time to compute $\sigma(\bar{u})$ from $A(\bar{u})$, overall it takes $\mathcal{O}(n)$ time to compute $\sigma(\bar{u})$ for every $\bar{u}$.

To bound the total time spent on line 1 on updating the sensitivities of $u$'s neighbors to $\mathcal{O}(n\log n)$, we prove Lemma 6.9. Together with Theorem 6.7, it implies the desired result.

6 Geometric properties of peeling

In this section, we develop an amortized analysis of peeling to show that lines 1 and 1 can be computed efficiently. We ultimately aim to show that the number of times that any point becomes active for any triangle is $\mathcal{O}(n)$, bounding the amount of work done to initialize new triangles to $\mathcal{O}(n\log n)$. Then we show that the amount of work done to update the sensitivities of neighbor points is proportional to the number of new active points for them and an additive $\mathcal{O}(\log n)$ term. Thus, updating the sensitivities over all $n$ iterations takes $\mathcal{O}(n\log n)$ time.

6.1 Preliminaries

When considering outer hull points, we use the notation $\triangle tuv$ for the triangle formed by $u$, its counterclockwise neighbor $t$, and its clockwise neighbor $v$. For a set of ordered vertices $V$ we let $\pentagon(V)$ be the polygon formed by the points in the (cyclical) order. We say $p \in \pentagon(V)$ if $p$ is strictly inside the polygon.
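For a triangle, strict containment can be tested with three orientation predicates, one per edge, which amounts to checking that the point lies in the intersection of three open half-planes. The sketch below is a standard construction offered for illustration; the names `orient` and `strictly_inside` are assumptions, and the test accepts the triangle's vertices in either clockwise or counterclockwise order.

```python
def orient(a, b, p):
    """Positive if p is left of the directed line a -> b, negative if right."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def strictly_inside(p, t, u, v):
    """True if p lies strictly inside the triangle (t, u, v)."""
    d1 = orient(t, u, p)
    d2 = orient(u, v, p)
    d3 = orient(v, t, p)
    all_left = d1 > 0 and d2 > 0 and d3 > 0
    all_right = d1 < 0 and d2 < 0 and d3 < 0
    return all_left or all_right   # boundary points give a zero and fail both
```

Points on an edge or at a vertex are rejected, matching the strict notion of containment used here.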

The following Lemma 6.1 combines a number of simple but useful propositions.

Lemma 6.1.

For a set of points $P$, the following propositions are true:

  1. Any point $p \in P$ is active for at most three points on the first layer.

  2. Let $\triangle tuv$ be a triangle for consecutive vertices $(t,u,v)$ on the first layer and let $p \neq q$ be points $p \in \triangle tuv$ and $q \in \triangle tpv$. Then $q \notin A(u)$.

  3. Let $p$ be a point on any layer $k$. After deleting any point $q \neq p$ and reconstructing the convex layers, $p$ is on layer $k-1$ or $k$.

  4. Let $(t,u,v)$ be consecutive vertices on the first layer $L^{1}$. Then if $u$ is deleted, among the vertices in $L^{1}$, only the sensitivities of vertices $t$ and $v$ change.

  5. For adjacent points $(u,v)$ on the hull, $|A(u) \cap A(v)| \leq 1$.

Proof 6.2.

See Section 7.2 in the appendix.

6.2 Bounding the active points

We will show that once a point is active for a hull point, it remains active for that hull point until the point is moved to the first layer. This implies a much stronger result by Lemma 6.1(1): over the entire course of the algorithm, a point becomes active for at most three other points. To do so, we first show that for each peel the active points $A(u)$ remain in $u$’s triangle (Lemma 6.3) and second that the points in $A(u)$ remain active (Lemma 6.5).

Lemma 6.3.

Given a set of points $P$, for all adjacent hull points $(u, v)$ and for all points $p \in A(u) \setminus A(v)$, if $v$ is deleted then $p$ still remains within $u$’s triangle.

Proof 6.4.

Let $t$ be $u$’s other neighbor, and w.l.o.g. let the clockwise order on the hull be $(t, u, v)$. Then if $v'$ is $u$’s new neighbor after deleting $v$, the clockwise order on the new hull will be $(t, u, v')$. Because $p$ is active for $u$ before $v$ is deleted, $p \in \triangle tuv$.

First, we consider the case where $v' \notin \triangle tuv$. We want to show that $p \in \triangle tuv'$ or, equivalently, that $p$ is in the intersection of the three half-planes $\overrightarrow{tu}$, $\overrightarrow{uv'}$, and $\overrightarrow{tv'}$. Clearly, $p$ must satisfy the half-planes $\overrightarrow{tu}$ and $\overrightarrow{uv'}$ as these coincide with hull edges. In addition, since $v' \notin \triangle tuv$, the half-plane for $\overrightarrow{tv}$ is a subset of the half-plane for $\overrightarrow{tv'}$. Because $p \in \triangle tuv$, $p$ satisfies $\overrightarrow{tv}$. Therefore, $p$ must satisfy $\overrightarrow{tv'}$.

Now we consider the case where $v' \in \triangle tuv$. Assume that $p \notin \triangle tuv'$. Then because we know that $p \in \triangle tuv$, either $p \in \triangle tv'v$ or $p \in \triangle uv'v$. If $p \in \triangle tv'v$, by Lemma 6.1(2), $p$ could not have been active for $u$ prior to deleting $v$. If $p \in \triangle uv'v$, $p$ is now outside of the convex hull. Either way, this is a contradiction.

The following Lemma 6.5 shows that if $p$ is in $A(u)$, it remains in $A(u)$ until moved to the first layer, after which it never becomes active again. It also shows that the active points $A(u)$ only change by adding or deleting points from either end, and thus can easily be found.

Lemma 6.5.

Given a set of points $P$, for all hull points $u$ and $v$ and for all points $p \in A(u) \setminus A(v)$, upon deleting $v$, $p$ is in $A(u)'$, $u$’s new set of active points.

Proof 6.6.


Case 1 ($u$ is not adjacent to $v$)

If $u$ is not adjacent to $v$, there are no changes to $\triangle u$ upon deleting $v$, and thus, $A(u) = A(u)'$.

For the following cases, assume that $u$ was adjacent to $v$. Then by Lemma 6.3, $p$ is still in the triangle defined by $u$ even after deleting $v$. Also, w.l.o.g. let $(u, v)$ be the clockwise ordering of the points, and let $v'$ be $u$’s new neighbor.

Case 2 ($v' \in A(u)$)

By Lemma 6.1(5), $A(u) \cap A(v) = \{v'\}$. By Lemma 6.3, all points in $A(u) \setminus \{v'\}$ are in $\triangle tuv'$. Because the second layer is a convex hull, each consecutive pair of points $(a, b)$ in $t \circ A(u)$ defines a half-plane $\overrightarrow{ab}$ with only points from the first layer to the left of each half-plane. This is still the case after deleting $v$ by Lemma 3. Since the only new points on the first layer are $A(v)$, all points in $A(u) \setminus \{v'\}$ remain on the second layer. Thus, the gift-wrapping starting from $t$ wraps around all points in $A(u) \setminus \{v'\}$. Gift-wrapping can hit no new points because, if it did, there would be some point on the second layer to the left of one of the half-planes described above. Thus, $A(u)' = A(u) \setminus \{v'\}$.

Case 3 ($v' \notin A(u)$)

Let $u_e$ be the last point in $A(u)$. Similar to the previous case, the gift-wrapping certifies all points in $A(u)$. Again, wrapping will not hit new active points before wrapping around $u_e$ because that would imply the points hit were to the left of the half-planes described previously. When wrapping continues around $u_e$, several new active points may appear, until the wrapping terminates at $v'$. Thus, $A(u) \subseteq A(u)'$.

Theorem 6.7.

For any 2D convex hull peeling process on $n$ points, the total number of times any point becomes active in any triangle is at most $3n$.

Proof 6.8.

This follows directly from the results of Lemma 6.1(1) and Lemma 6.5.

6.3 Updating sensitivities

Next, we show that the total time to update the sensitivities in Algorithm 1 when peeling all $n$ points is $\mathcal{O}(\Delta + n \log n)$. Here $\Delta$ is the number of times any point becomes active for any triangle. Theorem 6.7 proves that $\Delta = \mathcal{O}(n)$. The following lemma shows that the sensitivity of a point $u$ can be updated in time proportional to the increase in $|A(u)|$ plus an additive $\mathcal{O}(\log n)$ term. Figure 5 shows an example of how the sensitivity of a point changes when its neighbor is peeled.

Lemma 6.9.

Let $(u, v)$ be points on the first layer. Consider a peel of $v$ where $\delta_u$ new points become active points for $u$. Then the updated sensitivity $\sigma(u)$ can be computed in $\Theta(\delta_u + \log n)$ time, excluding the time to restore the second and first layer.

Proof 6.10.

The sensitivity $\sigma(u)$ is equal to the area of the polygon $U = \pentagon(t \circ u \circ v \circ A(u))$. By the shoelace formula, the area of $U$ can be computed as the sum $S(U)$ of certain simple terms for each of its edges [5, 8]. We consider how $U$, and thus $S(U)$, changes when $v$ is peeled. Inspecting the proof of Lemma 6.5, we see that at most two vertices are removed from $U$ and at most $1 + \delta_u$ vertices are added to $U$. Furthermore, all the new vertices are located contiguously on the restored second layer and can be found in $\mathcal{O}(\delta_u + \log n)$ time using a tangent query from $u$’s new neighbor, which replaces $v$. To update $\sigma(u) = S(U)$, we simply add and subtract the appropriate $\mathcal{O}(\delta_u)$ terms depending on the removed and added edges.
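The incremental update in this proof can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the helper names (`edge_term`, `shoelace_sum`, `replace_chain`) and the coordinates are hypothetical, the polygon is assumed to be listed in clockwise order, and the new vertices are assumed to be already known (the tangent query on the second layer is not implemented here).

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def edge_term(a: Point, b: Point) -> float:
    """Shoelace term of the directed edge a -> b (sums to a positive area in clockwise order)."""
    return 0.5 * (a[1] * b[0] - a[0] * b[1])


def shoelace_sum(polygon: Sequence[Point]) -> float:
    """S(U): sum of the edge terms over the closed boundary of the polygon."""
    n = len(polygon)
    return sum(edge_term(polygon[i], polygon[(i + 1) % n]) for i in range(n))


def chain_sum(chain: Sequence[Point]) -> float:
    """Sum of the edge terms along an open chain of vertices."""
    return sum(edge_term(a, b) for a, b in zip(chain, chain[1:]))


def replace_chain(s: float, old_chain: Sequence[Point], new_chain: Sequence[Point]) -> float:
    """Update S(U) when old_chain is replaced by new_chain; only the changed edges are touched."""
    if old_chain[0] != new_chain[0] or old_chain[-1] != new_chain[-1]:
        raise ValueError("chains must share both endpoints")
    return s - chain_sum(old_chain) + chain_sum(new_chain)


# Hypothetical coordinates: (t, u, v) clockwise on the hull, a1 the only active point of u.
t, u, v = (-4.0, 0.0), (0.0, 4.0), (4.0, 0.0)
a1 = (0.0, 1.0)
# After peeling v: v_new replaces v as u's neighbor and d1 becomes newly active for u.
v_new, d1 = (3.0, -1.0), (2.0, -0.2)

sigma_before = shoelace_sum([t, u, v, a1])
sigma_after = replace_chain(sigma_before, [u, v, a1], [u, v_new, d1, a1])
```

In this sketch the update touches only the edges between $u$ and the first surviving active point, so its cost grows with the number of newly active points rather than with $|A(u)|$.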

Figure 5: This figure shows how the sensitivity $\sigma(u)$ changes when point $v$ is peeled. The point $q$ is the intersection of the tangent from $v$ to $u_e$ and the tangent from $u$ to $v'$, where $u_e$ is the last active point in $A(u)$ and $v'$ is the first active point in $A(v)$. After the peel, $v'$ replaces $v$ as $u$’s neighbor, and the points $\Delta A$ are newly active for $u$. The sensitivity $\sigma(u)$ before peeling $v$ was equal to the area of $\pentagon(t \circ u \circ v \circ A(u))$. After peeling $v$, the sensitivity $\sigma(u)$ equals the area of $\pentagon(t \circ u \circ v' \circ \Delta A \circ A(u))$.
Note how this can be computed in $\mathcal{O}(|\Delta A|)$ time from $\sigma(u)$ before the peel of $v$ by subtracting the red area of $\triangle uvq$ and adding the green area of $\pentagon(u_e \circ q \circ v' \circ \Delta A)$.
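Summing the bound of Lemma 6.9 over all peels makes the amortization explicit. In the sketch below, $N_i$ denotes the at most two neighbors of the point peeled in iteration $i$, and $\delta_u^{(i)}$ denotes the number of points that become newly active for $u$ in that iteration; both symbols are introduced here for illustration only.

```latex
\begin{align*}
\sum_{i=1}^{n} \sum_{u \in N_i} \mathcal{O}\!\left(\delta_u^{(i)} + \log n\right)
  &= \mathcal{O}\!\left(\sum_{i=1}^{n} \sum_{u \in N_i} \delta_u^{(i)}\right)
     + \mathcal{O}\!\left(2n \log n\right) \\
  &= \mathcal{O}\!\left(\Delta + n \log n\right) \\
  &= \mathcal{O}\!\left(n \log n\right),
     \quad \text{since } \Delta \leq 3n \text{ by Theorem 6.7.}
\end{align*}
```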

7 Generalization and open problems

Theorem 7.1 shows that Algorithm 1 generalizes straightforwardly to other objectives such as peeling the point that causes the perimeter of the convex hull to decrease the most each iteration.

Theorem 7.1.

Let $u$ be a point and $O$ an objective where $\sigma_O(u)$ is the sensitivity of $u$ under $O$. Consider the following three conditions:

  C1. If $u \notin L^1$, then $\sigma_O(u) = 0$.

  C2. If $u \in L^1$, then $\sigma_O(u) > 0$, and $\sigma_O(u)$ depends only on $u$, $u$’s neighbors, and its active points $A(u)$.

  C3. If a single point $p$ is added to or removed from $A(u)$, then given $\sigma_O(u)$ and the neighbors $a_i$ and $a_j$ of $p$ in $A(u)$, the new sensitivity $\sigma_O(u)'$ can be computed in $\mathcal{O}(\log n)$ time.

If $O$ satisfies the above conditions, then Algorithm 1 runs in $\mathcal{O}(n \log n)$ time for objective $O$.

Proof 7.2.

By conditions C1 and C2, it is always a point $u$ on the first layer that is peeled. Furthermore, when $u$ is peeled, only the sensitivities of the new points on the first layer and the neighbors of $u$ must be updated, since they are the only points whose active points or neighbors change. Thus, Algorithm 1 can be used for objective $O$. Now we will show that the runtime of Algorithm 1 remains $\mathcal{O}(n \log n)$.

First, observe that all parts unrelated to computing sensitivities behave the same and still take $\mathcal{O}(n \log n)$ time. By condition C2, for a point $u$ on the first layer, its sensitivity $\sigma_O(u)$ only depends on its neighbors and active points $A(u)$. As described in the proof of Lemma 6.9, when the set of points that affect $\sigma_O(u)$ changes, these points are readily available. The total number of neighbor changes is $\mathcal{O}(n)$ since, in each iteration, only the neighbors of the points adjacent to the peeled point change. The total number of changes to active points is $\mathcal{O}(n)$ by Theorem 6.7. If there are multiple changes to the active points in one iteration, such as when deleting one of $u$’s neighbors, we perform one change at a time and, by condition C3, the total time to update sensitivities is $\mathcal{O}(n \log n)$.

For concrete examples, we show how the three objectives area ($O_A$), perimeter ($O_P$), and number of active points ($O_N$) fit into this framework.

Let $f(\sigma(u), a_i, p, a_j) = \sigma(u) - d(a_i, a_j) + d(a_i, p) + d(p, a_j)$ be a function for computing the sensitivity $\sigma(u)$ when $p$ is added to $A(u)$ between $a_i$ and $a_j$ (the functions where a point is removed from $A(u)$ or a neighbor of $u$ changes are similar). For $f$ to match each of the objectives, it is sufficient to implement $d(\cdot, \cdot)$ as follows for points $a, b \in \mathbf{R}^2$:

  $O_A$: $d(a, b) = \frac{1}{2}(a_2 b_1 - a_1 b_2)$

  $O_P$: $d(a, b) = \sqrt{(b_2 - a_2)^2 + (b_1 - a_1)^2}$

  $O_N$: $d(a, b) = 1$

The case with $O_A$ is based on the shoelace formula. Additionally, for $O_N$ to satisfy condition C2, we add $1$ when computing the sensitivity of $u \in L^1$ to ensure that $\sigma(u) > 0$ even if $|A(u)| = 0$. For the three objectives, $f$ takes $\mathcal{O}(1)$ time to compute, satisfying the $\mathcal{O}(\log n)$ time requirement from condition C3.
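The update function $f$ and the three choices of $d$ can be transcribed directly, as in the sketch below. It treats $\sigma$ as a running sum of per-edge terms over a polygon boundary and only checks the bookkeeping of inserting one point; how that sum relates to the sensitivity for each objective follows the text and is not re-derived here. The helper `boundary_sum` and all function names are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, float]
EdgeTerm = Callable[[Point, Point], float]


def d_area(a: Point, b: Point) -> float:
    """O_A: shoelace edge term."""
    return 0.5 * (a[1] * b[0] - a[0] * b[1])


def d_perimeter(a: Point, b: Point) -> float:
    """O_P: Euclidean length of the edge ab."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


def d_count(a: Point, b: Point) -> float:
    """O_N: every edge contributes one."""
    return 1.0


def f(sigma: float, a_i: Point, p: Point, a_j: Point, d: EdgeTerm) -> float:
    """Constant-time update when p is inserted between a_i and a_j."""
    return sigma - d(a_i, a_j) + d(a_i, p) + d(p, a_j)


def boundary_sum(polygon: Sequence[Point], d: EdgeTerm) -> float:
    """Reference value: sum of d over all edges of the closed polygon."""
    n = len(polygon)
    return sum(d(polygon[i], polygon[(i + 1) % n]) for i in range(n))
```

Passing `d` as a parameter keeps `f` identical across objectives, which mirrors how the theorem separates the algorithm from the objective-specific edge term.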

7.1 Open problems

The first open problem is extending the result to $\mathbf{R}^3$ or higher dimensions. Directly applying our approach requires a dynamic 3D convex hull data structure, and Theorem 6.7 has to be extended to 3D. Second, is it possible to improve the quality of peeling by performing $z$-peels, even for $z = 2$, in ${\scriptstyle\mathcal{O}}(n)$ time? Third, is there an efficient approximation algorithm for $k$-peeling?

Acknowledgement

We thank Asger Svenning for the initial discussions that inspired us to consider this problem.

References

  • [1] A. Aggarwal, H. Imai, N. Katoh, and S. Suri. Finding k points with minimum spanning trees and related problems. In Proceedings of the fifth annual symposium on Computational geometry, pages 283–291, 1989.
  • [2] G. Aloupis. Geometric measures of data depth. DIMACS series in discrete mathematics and theoretical computer science, 72:147, 2006.
  • [3] F. Angiulli and C. Pizzuti. Fast outlier detection in high dimensional spaces. In European conference on principles of data mining and knowledge discovery, pages 15–27. Springer, 2002.
  • [4] R. Atanassov, P. Bose, M. Couture, A. Maheshwari, P. Morin, M. Paquette, M. Smid, and S. Wuhrer. Algorithms for optimal outlier removal. Journal of Discrete Algorithms, 7(2):239–248, 2009. Selected papers from the 2nd Algorithms and Complexity in Durham Workshop ACiD 2006.
  • [5] R. Boland and J. Urrutia. Polygon area problems. In Proc. of the 12th Canadian Conf. on Computational Geometry, Fredericton, NB, Canada, 2000.
  • [6] G. Brodal and R. Jacob. Dynamic planar convex hull. In The 43rd Annual IEEE Symposium on Foundations of Computer Science, 2002. Proceedings., pages 617–626, 2002.
  • [7] B. Chazelle. On the convex layers of a planar set. IEEE Transactions on Information Theory, 31(4):509–517, 1985.
  • [8] F. Contreras. Cutting polygons and a problem on illumination of stages. University of Ottawa (Canada), 1998.
  • [9] R. N. Dave. Characterization and detection of noise in clustering. Pattern Recognition Letters, 12(11):657–664, 1991.
  • [10] M. De Berg. Computational geometry: algorithms and applications. Springer Science & Business Media, 2000.
  • [11] D. P. Dobkin, R. Drysdale, and L. J. Guibas. Finding smallest polygons. Computational Geometry, 1:181–214, 1983.
  • [12] D. Eppstein. New algorithms for minimum area k-gons. In Proceedings of the Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’92, page 83–88, USA, 1992. Society for Industrial and Applied Mathematics.
  • [13] D. Eppstein and J. Erickson. Iterated nearest neighbors and finding minimal polytopes. Discrete & Computational Geometry, 11:321–350, 1994.
  • [14] D. Eppstein, M. Overmars, G. Rote, and G. Woeginger. Finding minimum area k-gons. Discrete & Computational Geometry, 7:45–58, 1992.
  • [15] F. E. Grubbs. Sample criteria for testing outlying observations. University of Michigan, 1949.
  • [16] S. Har-Peled. On the expected complexity of random convex hulls. arXiv preprint arXiv:1111.5340, 2011.
  • [17] A. Harsh, J. E., and P. Wei. Onion-peeling outlier detection in 2-d data sets. International Journal of Computer Applications, 139:26–31, 04 2016.
  • [18] J. Hershberger and S. Suri. Applications of a semi-dynamic convex hull algorithm. BIT Numerical Mathematics, 32:249–267, 1992.
  • [19] P. J. Huber. The 1972 wald lecture robust statistics: A review. The Annals of Mathematical Statistics, 43(4):1041–1067, 1972.
  • [20] J. Hugg, E. Rafalin, K. Seyboth, and D. Souvaine. An experimental study of old and new depth measures. In 2006 Proceedings of the Eighth Workshop on Algorithm Engineering and Experiments (ALENEX), pages 51–64. SIAM, 2006.
  • [21] D. Kirkpatrick and J. Snoeyink. Computing common tangents without a separating line. In S. G. Akl, F. Dehne, J.-R. Sack, and N. Santoro, editors, Algorithms and Data Structures, pages 183–193, Berlin, Heidelberg, 1995. Springer Berlin Heidelberg.
  • [22] E. M. Knorr and R. T. Ng. Finding intensional knowledge of distance-based outliers. In Vldb, volume 99, pages 211–222, 1999.
  • [23] S. K. Kwak and J. H. Kim. Statistical data preparation: management of missing values and outliers. Korean journal of anesthesiology, 70(4):407, 2017.
  • [24] M. Löffler and W. Mulzer. Unions of onions: preprocessing imprecise points for fast onion decomposition. Journal of Computational Geometry, 5(1), 2014.
  • [25] K. Mehlhorn, R. Osbild, and M. Sagraloff. Reliable and efficient computational geometry via controlled perturbation. In International Colloquium on Automata, Languages, and Programming, pages 299–310. Springer, 2006.
  • [26] A. Meister. Generalia de genesi figurarum planarum et inde pendentibus earum affectionibus. 1769.
  • [27] M. H. Overmars and J. Van Leeuwen. Maintenance of configurations in the plane. Journal of computer and System Sciences, 23(2):166–204, 1981.
  • [28] S. Ramaswamy, R. Rastogi, and K. Shim. Efficient algorithms for mining outliers from large data sets. In Proceedings of the 2000 ACM SIGMOD international conference on Management of data, pages 427–438, 2000.
  • [29] M. Segal and K. Kedem. Enclosing k points in the smallest axis parallel rectangle. Information Processing Letters, 65(2):95–99, 1998.
  • [30] M. I. Shamos. Problems in computational geometry. 1975.
  • [31] A. F. Zuur, E. N. Ieno, and C. S. Elphick. A protocol for data exploration to avoid common statistical problems. Methods in ecology and evolution, 1(1):3–14, 2010.

Appendix

7.2 Proof of Lemma 6.1

Lemma 6.1(1)

Fix a point set $P$. Any point $p \in P$ is active in at most three triangles.

Proof 7.3.

First, note that a point can only be active for a hull point $u$ if it is located inside $\triangle u$, so it is sufficient to show that any $p$ is strictly inside at most three triangles. In addition, one can prove this by showing that $\triangle u$ only intersects with its neighbors’ triangles $\triangle t$ and $\triangle v$.

Consider some $\triangle z$ such that $z$ is not a neighbor of $u$. That is, $u$ is not one of the vertices of $\triangle z$. If $\triangle z$ intersects with $\triangle u$, then either a vertex of $\triangle z$ is inside $\triangle u$ or the convex hull is a self-intersecting polygon, both violating convexity.

Lemma 6.1(2)

Let $\triangle tuv$ be a triangle for consecutive vertices $(t, u, v)$ on the first layer and let $p \neq q$ be points with $p \in \triangle tuv$ and $q \in \triangle tpv$. Then $q \notin A(u)$.

Proof 7.4.

By definition, $p \in \pentagon(t \circ v \circ A(u))$ or $p \in A(u)$. Either way, $q \in \triangle tpv$ implies that $q \in \pentagon(t \circ v \circ A(u))$, so $q \notin A(u)$.

Lemma 6.1(3)

Let $p$ be a point on any convex layer $k$. After deleting any point $q \neq p$ and reconstructing the convex layers, $p$ is on layer $k - 1$ or $k$.

Proof 7.5.

First we show that $p$ never moves inward to layer $k' > k$. Consider the outermost layer $L^1$. By a property of convex hulls, every point $v$ inside the convex hull is a convex combination of the hull points whereas any point $u \in L^1$ is not a convex combination of $L^1 - \{u\}$. If deleting $q$ causes $p \in L^1$ to descend to a layer inside $L^1$, that implies that $p$ is a convex combination of some subset of $P - \{q, p\}$. This contradicts the fact that $p$ is not a convex combination of $L^1 - \{p\}$ and by extension is not a convex combination of $P - \{p\}$. Because of the recursive definition of convex layers, the proof for subsequent layers is symmetric.

Now we will show that $p$ never moves up more than one layer at a time. This is clearly true for $L^1$ and $L^2$ because only one point is completely removed from the structure at a time (i.e. shifts to layer 0). For layers $k \geq 3$, consider a point $p$ on layer $k$ that moves to layer $k' \leq k - 2$. Let $L^{*k'}$ be the set of points on layer $k'$ after deleting $q$. Let $L^{k-1}$ be the set of points on layer $k - 1$ before deleting $q$.

Because $p \in L^{*k'}$, no convex combination of the points in $L^{*k'} - \{p\}$ equals $p$ by convexity. By the inductive hypothesis, all points on $L^{k-1}$ are convex combinations of $L^{*k'}$ because upon deleting $q$ no point on $L^{k-1}$ advances above layer $k'$. Furthermore, they are all convex combinations of $L^{*k'} - \{p\}$ as $p$ itself is a convex combination of $L^{k-1}$. But if $p$ is not a convex combination of $L^{*k'} - \{p\}$, and all the points on layer $k - 1$ are convex combinations of $L^{*k'} - \{p\}$, then prior to deleting $q$, $p$ was above layer $k - 1$, which is a contradiction.
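The statement of Lemma 6.1(3) can be checked empirically on small inputs. The sketch below is a brute-force check, not part of the algorithm: it recomputes the convex layers from scratch with Andrew's monotone chain (a technique chosen here for brevity, not taken from the paper) after every single deletion. All function names are hypothetical, and points are assumed to be distinct.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors oa and ob (positive for a counterclockwise turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: Sequence[Point]) -> List[Point]:
    """Hull vertices via Andrew's monotone chain; collinear boundary points are dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convex_layers(points: Sequence[Point]) -> Dict[Point, int]:
    """Map each point to its layer index (1 = outermost) by repeatedly peeling the hull."""
    remaining = set(points)
    layer: Dict[Point, int] = {}
    k = 1
    while remaining:
        hull = convex_hull(list(remaining))
        for p in hull:
            layer[p] = k
        remaining -= set(hull)
        k += 1
    return layer


def layers_shift_by_at_most_one(points: Sequence[Point]) -> bool:
    """Check that deleting any q leaves every other point on layer k or k - 1."""
    before = convex_layers(points)
    for q in points:
        after = convex_layers([p for p in points if p != q])
        for p, k in after.items():
            if k not in (before[p], before[p] - 1):
                return False
    return True
```

Such a check is only a sanity test on specific inputs; the proof above is what establishes the property in general.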

Lemma 6.1(4)

Let $(t,u,v)$ be consecutive vertices on the first layer $L^{1}$. Then if $u$ is deleted, among the vertices in $L^{1}$, only the sensitivities of vertices $t$ and $v$ change.

Proof 7.6.

Consider a vertex $z$ not adjacent to $u$. By the same arguments as in the proof of Lemma 6.1(1), the vertices defining $\triangle z$ do not change upon deleting $u$ because $\triangle z$ does not intersect $\triangle u$. In addition, because their triangles do not intersect, $|A(u) \cap A(z)| = 0$. Therefore, no points are removed from $A(z)$ upon deleting $u$.

Lastly, we will show that no points are added to $A(z)$ upon deleting $u$. Assume that there is some point $p$ added to $A(z)$ when we delete $u$. But if $p$ satisfies the conditions of being active for $z$ and $\triangle z$ did not change upon deleting $u$, it should have been active for $z$ before $u$ was deleted, which is a contradiction.

Because $\triangle z$ and $A(z)$ do not change upon deleting $u$, it must be that $\sigma(z)$ remains the same.
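The locality of sensitivity updates can be illustrated with a small brute-force check. The sketch below assumes that the sensitivity of a hull vertex is the amount by which the hull area shrinks when that vertex is deleted, and it works with doubled areas so that integer inputs give exact comparisons. The function names are hypothetical and the hulls are recomputed from scratch for clarity.

```python
import random


def cross(o, a, b):
    """Signed doubled area of triangle (o, a, b); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower, upper = half(pts), half(reversed(pts))
    return lower[:-1] + upper[:-1]


def doubled_area(poly):
    """Twice the polygon area via the shoelace formula (exact for integers)."""
    n = len(poly)
    return abs(sum(poly[i][0] * poly[(i + 1) % n][1]
                   - poly[(i + 1) % n][0] * poly[i][1] for i in range(n)))


def sensitivity(points, u):
    """Assumed definition: doubled hull area lost when u is deleted."""
    rest = [p for p in points if p != u]
    return doubled_area(convex_hull(points)) - doubled_area(convex_hull(rest))


def changed_sensitivities(points, u):
    """First-layer vertices other than u whose sensitivity changes if u is deleted."""
    rest = [p for p in points if p != u]
    return {z for z in convex_hull(points)
            if z != u and sensitivity(points, z) != sensitivity(rest, z)}


# Assumed test data: random integer points (general position with high probability).
random.seed(1)
points = list({(random.randrange(10**6), random.randrange(10**6)) for _ in range(40)})
hull = convex_hull(points)
for i, u in enumerate(hull):
    t, v = hull[i - 1], hull[(i + 1) % len(hull)]
    # Only the two hull neighbors of u may appear in the changed set.
    print(u, changed_sensitivities(points, u) <= {t, v})
```

Because only $t$ and $v$ can change, a peeling loop needs to recompute a constant number of old sensitivities per deletion, plus those of any newly exposed points.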

Lemma 6.1(5)

For adjacent points $(u,v)$ on the hull, $|A(u) \cap A(v)| \leq 1$.

Proof 7.7.

We assume the contrary. Let $p \neq p'$ be two points such that $p, p' \in A(u) \cap A(v)$. By the definition of active and Lemma 6.1(3), $p$ and $p'$ must be on the second layer. W.l.o.g. let $(u,v)$ be the clockwise ordering of the points on the first layer. In addition, let $t$ be $u$'s counterclockwise neighbor.

Say that $p$ is the first point in $A(v)$. Then we have the tangent line $\overrightarrow{up}$ that defines $p$. By definition of tangent lines, no point on the second layer can be to the left of $\overrightarrow{up}$. But for $p'$ to be active for $v$, $p'$ must be to the left of $\overrightarrow{pv}$. The only way to satisfy both half-plane constraints is for $p'$ to be placed such that $p \in \triangle t p' v$, in which case by Lemma 6.1(2) $p$ cannot be in $A(u)$, which is a contradiction.
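The bound on shared active points can also be checked by brute force. The sketch below assumes that $A(u)$ consists of the points that are not hull vertices but become hull vertices once $u$ is deleted; this reading is an assumption made for illustration and may differ in detail from the formal definition of active points. The function names are hypothetical, and the input is assumed to be distinct integer points in general position.

```python
import random


def cross(o, a, b):
    """Signed doubled area of triangle (o, a, b); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain

    lower, upper = half(pts), half(reversed(pts))
    return lower[:-1] + upper[:-1]


def active_points(points, u):
    """Assumed A(u): non-hull points that join the hull when u is deleted."""
    rest = [p for p in points if p != u]
    return set(convex_hull(rest)) - set(convex_hull(points))


def max_shared_active(points):
    """Largest |A(u) & A(v)| over all pairs of adjacent hull vertices."""
    hull = convex_hull(points)
    return max(len(active_points(points, hull[i - 1]) & active_points(points, hull[i]))
               for i in range(len(hull)))


# Assumed test data: random integer points (general position with high probability).
random.seed(2)
points = list({(random.randrange(10**6), random.randrange(10**6)) for _ in range(50)})
print("largest overlap between adjacent active sets:", max_shared_active(points))
```

A small hand-built case is a square with corners $(0,0)$, $(4,0)$, $(4,4)$, $(0,4)$ and one interior point $(2,1)$: under the assumed definition, that interior point is active for both $(0,0)$ and $(4,0)$, so the bound of one is attained.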