Projection depth for functional data: Theoretical properties
Abstract
We introduce a novel projection depth for data lying in a general Hilbert space, called the regularized projection depth, with a focus on functional data. By regularizing projection directions, the proposed depth does not suffer from the degeneracy issue that may arise when the classical projection depth is naively defined on an infinite-dimensional space. Compared to existing functional depth notions, the regularized projection depth has several advantages: (i) it requires no moment assumptions on the underlying distribution, (ii) it satisfies many desirable depth properties including invariance, monotonicity, and vanishing at infinity, (iii) its sample version uniformly converges under mild conditions, and (iv) it generates a highly robust median. Furthermore, the proposed depth is statistically useful as it (v) does not produce ties in the induced ranks and (vi) effectively detects shape outlying functions. This paper focuses mainly on the theoretical properties of the regularized projection depth.
Summary
This paper introduces a novel "regularized projection depth" (RPD) for functional data analysis, addressing the degeneracy issue that arises when classical projection depth is naively extended to infinite-dimensional Hilbert spaces. The key idea is to regularize the projection directions by excluding those along which the projected data are highly concentrated around their medians. This ensures the depth remains non-degenerate and provides a sensible ranking of data. The paper focuses on establishing the theoretical properties of RPD, including invariance, monotonicity, continuity, and vanishing at infinity. It also proves the consistency of the sample version of RPD and the robustness of the induced median. RPD does not require moment assumptions on the underlying distribution, making it suitable for heavy-tailed data, unlike some existing functional depths. The paper also demonstrates through simulations that RPD is effective in outlier detection.
The authors show that the RPD is quasi-concave, median-symmetric, and monotonic on rays. Crucially, they prove the non-degeneracy of RPD under mild conditions and demonstrate that the sample RPD is uniformly consistent for the population RPD under equicontinuity assumptions. Moreover, they show that the RPD median has a breakdown point of 1/2, the highest value usually attainable for a location estimator. Finally, they illustrate the practical utility of RPD in detecting shape outliers in functional data, where it compares favorably with existing functional depth methods such as regularized halfspace depth, integrated depth, and infimal depth. Together, the results provide a theoretically sound and practically useful tool for functional data analysis.
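The construction described above can be written down compactly. The block below is a sketch in assumed notation (the symbols $V_\beta$, $O_\beta$, $D_\beta$, MED and MAD are not taken from the paper), using the usual outlyingness-based form of projection depth; whether the paper uses a strict or non-strict inequality in the direction set is not stated in this summary.

```latex
% Assumed notation: X is the random element in the Hilbert space H,
% MED and MAD are the univariate median and median absolute deviation.
\[
V_\beta = \bigl\{ v \in \mathcal{H} : \|v\| = 1,\ \mathrm{MAD}(\langle X, v\rangle) \ge \beta \bigr\},
\]
\[
O_\beta(x) = \sup_{v \in V_\beta}
  \frac{\bigl|\langle x, v\rangle - \mathrm{MED}(\langle X, v\rangle)\bigr|}
       {\mathrm{MAD}(\langle X, v\rangle)},
\qquad
D_\beta(x) = \frac{1}{1 + O_\beta(x)}.
\]
% Every retained denominator is at least beta, so O_beta(x) is finite and
% D_beta(x) > 0 for every x, which is the non-degeneracy claimed in the summary.
```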
Key Insights
- The paper identifies the degeneracy of naive projection depth in infinite-dimensional Hilbert spaces due to the possibility of arbitrarily small MADs along certain projection directions (Theorem 1).
- The proposed RPD addresses this degeneracy by restricting the set of projection directions to those with MAD greater than a regularization parameter β, ensuring a positive depth value for all data points (Theorem 2).
- RPD is shown to be Lipschitz continuous with a constant of 1/β, where β is the regularization parameter, implying that larger β leads to a more stable depth function (Theorem 5, N1).
- The paper establishes uniform consistency of the sample RPD under an equicontinuity condition on the maximal outlyingness function, providing a theoretical guarantee for its performance on finite samples (Theorem 7).
- The RPD median is shown to have a breakdown point of 1/2, demonstrating its robustness to outliers in the data (Theorem 10).
- The paper shows that the RPD value of a point depends only on its projection onto the closed linear span of the regularized direction set, indicating an implicit dimension reduction (Theorem 5, N4).
- In an outlier detection simulation, RPD significantly outperforms existing functional depths in identifying shape outliers, achieving a mean rank close to the minimum possible value of 0.045 (Section 4.2).
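The regularized direction set and the 1/β Lipschitz bound listed above can be illustrated with a small sample-level sketch. This is not the authors' implementation or their R package: the function names (`mad`, `fit_directions`, `rpd`) are hypothetical, curves are assumed to be observed on a common grid, the Euclidean inner product on grid values stands in for the Hilbert-space inner product, and the supremum over directions is approximated by a finite set of random unit directions.

```python
import math
import random
from statistics import median


def mad(values):
    """Unscaled median absolute deviation about the median."""
    m = median(values)
    return median(abs(v - m) for v in values)


def dot(x, v):
    """Euclidean inner product (assumed stand-in for the Hilbert inner product)."""
    return sum(a * b for a, b in zip(x, v))


def fit_directions(data, beta, n_dirs=200, seed=0):
    """Draw random unit directions and keep those whose projected MAD is >= beta.

    Returns a list of (direction, projected median, projected MAD) triples.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    kept = []
    for _ in range(n_dirs):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(dot(v, v))
        v = [a / norm for a in v]  # normalise to a unit direction
        proj = [dot(x, v) for x in data]
        scale = mad(proj)
        if scale >= beta:  # regularisation: discard low-MAD directions
            kept.append((v, median(proj), scale))
    if not kept:
        raise ValueError("beta is too large: no direction was retained")
    return kept


def rpd(x, kept):
    """Approximate regularized projection depth of x over the kept directions."""
    outlyingness = max(abs(dot(x, v) - med) / scale for v, med, scale in kept)
    return 1.0 / (1.0 + outlyingness)


# Toy data: 50 "curves" on a 10-point grid (illustrative only).
rng = random.Random(1)
data = [[rng.gauss(0.0, 1.0) for _ in range(10)] for _ in range(50)]
kept = fit_directions(data, beta=0.3)
depths = [rpd(x, kept) for x in data]
```

Because every retained direction has MAD at least β, the Cauchy-Schwarz inequality gives |O(x) − O(y)| ≤ ‖x − y‖/β for the approximated outlyingness, and the map t ↦ 1/(1 + t) is 1-Lipschitz on t ≥ 0, so the approximated depth inherits the same 1/β bound.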
Practical Implications
- The RPD can be used for robust estimation, outlier detection, hypothesis testing, and data visualization in functional data analysis.
- Researchers and practitioners working with functional data can use the RPD to obtain a robust measure of centrality and outlyingness, particularly when dealing with heavy-tailed data or shape outliers.
- The authors provide an R package for efficient computation of the approximated RPD, making the method readily accessible for applied work.
- The theoretical results provide guidance for selecting the regularization parameter β, suggesting the use of quantiles of MAD[⟨X,V⟩] to ensure a non-empty direction set.
- Future research directions include exploring the application of RPD to other statistical inference tasks and developing more sophisticated computational procedures for the projection depth in Hilbert spaces.
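The quantile-based choice of β mentioned in the list above might look like the following sketch. It is an illustration under stated assumptions, not the paper's procedure: the names `direction_mads` and `beta_from_quantile` are hypothetical, the random direction V is assumed to be a normalised Gaussian vector on the observation grid, and the quantile is taken as a simple order statistic.

```python
import math
import random
from statistics import median


def mad(values):
    """Unscaled median absolute deviation about the median."""
    m = median(values)
    return median(abs(v - m) for v in values)


def direction_mads(data, n_dirs=200, seed=0):
    """MAD of the projected data along each of n_dirs random unit directions."""
    rng = random.Random(seed)
    dim = len(data[0])
    mads = []
    for _ in range(n_dirs):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(a * a for a in v))
        # Projection onto the unit vector v / norm.
        proj = [sum(a * b for a, b in zip(x, v)) / norm for x in data]
        mads.append(mad(proj))
    return mads


def beta_from_quantile(mads, level):
    """Order-statistic quantile of the MADs at `level` (0 <= level < 1).

    At least len(mads) - floor(level * len(mads)) directions satisfy
    MAD >= beta, so the retained direction set is never empty.
    """
    ordered = sorted(mads)
    idx = min(len(ordered) - 1, int(level * len(ordered)))
    return ordered[idx]


# Toy data: 40 "curves" on an 8-point grid (illustrative only).
rng = random.Random(2)
data = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(40)]
mads = direction_mads(data)
beta = beta_from_quantile(mads, 0.9)
retained = [m for m in mads if m >= beta]
```

A higher quantile level gives a larger β, which tightens the 1/β Lipschitz bound but keeps fewer directions; a lower level keeps more directions at the cost of admitting ones with small MAD.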