A Unified Inference Method for FROC-type Curves and Related Summary Indices
Abstract
Free-response observer performance studies are of great importance for accuracy evaluation and comparison in tasks related to the detection and localization of multiple targets or signals. The free-response receiver operating characteristic (FROC) curve and many similar curves based on free-response observer performance assessment data are important tools for displaying the accuracy of detection under different thresholds. The true positive rate at a fixed false positive rate and summary indices such as the area under the FROC curve are also commonly used as figures of merit in the statistical evaluation of these studies. Motivated by a free-response observer performance assessment study of a Software as a Medical Device (SaMD), we propose a unified method based on the initial-detection-and-candidate model to simultaneously estimate a smooth curve and derive confidence intervals for summary indices and the true positive rate at a fixed false positive rate. A maximum likelihood estimator is proposed and its asymptotic normality is derived. Confidence intervals are constructed based on the asymptotic normality of our maximum likelihood estimator. Simulation studies are conducted to evaluate the finite sample performance of the proposed method. We apply the proposed method to evaluate the diagnostic performance of the SaMD for detecting pulmonary lesions.
Summary
This paper addresses the problem of statistical inference for free-response receiver operating characteristic (FROC)-type curves, which are widely used to evaluate the accuracy of detection and localization tasks, particularly in medical imaging. The authors argue that existing methods for estimating FROC curves and related summary indices (e.g., the area under the curve, AUC) have limitations: they produce non-smooth curves, lack theoretical justification for their confidence intervals, or offer no inference method for the true positive rate at a fixed false positive rate. To overcome these limitations, the paper proposes a unified method based on the initial-detection-and-candidate model (IDCA). This model divides the diagnostic process into two stages: initial candidate detection and confidence score assignment. The authors derive a maximum likelihood estimator (MLE) for the model parameters and prove its asymptotic normality despite the random sample sizes inherent in FROC data, where the number of candidates is itself random. This result allows them to construct confidence intervals for summary indices such as the AUC and for the true positive rate at a fixed false positive rate. Simulation studies assess the finite sample performance of the proposed method, which is then applied to evaluate the diagnostic performance of a Software as a Medical Device (SaMD) for detecting pulmonary lesions. In short, the method comes with theoretical guarantees and simultaneously yields smooth FROC-type curves and confidence intervals for the related summary indices.
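The two-stage structure can be made concrete with a small simulation. The following is a minimal sketch, not the authors' implementation. It assumes that each lesion becomes a candidate with probability `p`, that the total number of false candidates is Poisson with mean `lam` per image, and that confidence scores are normal with unit variance (mean 0 for false candidates, mean `mu` for lesion candidates). The normal score distributions, the single parameter `mu`, and the function names are illustrative assumptions; the paper's (θ1, θ2) parameterization may differ.

```python
import numpy as np

def simulate_idca(n_images, lesions_per_image, p, lam, mu, rng):
    """Simulate candidate scores under an ASSUMED IDCA-type model."""
    n_lesions = n_images * lesions_per_image
    # Stage 1: initial detection. Both candidate counts are random,
    # which is the "random sample size" feature of FROC data.
    n_tp = rng.binomial(n_lesions, p)      # lesions flagged as candidates
    n_fp = rng.poisson(lam * n_images)     # false candidates over all images
    # Stage 2: confidence scores (assumed unit-variance normal).
    tp_scores = rng.normal(mu, 1.0, size=n_tp)
    fp_scores = rng.normal(0.0, 1.0, size=n_fp)
    return tp_scores, fp_scores, n_lesions

def empirical_froc(tp_scores, fp_scores, n_lesions, n_images, thresholds):
    """Empirical FROC points: (false candidates per image, lesion localization fraction)."""
    thresholds = np.asarray(thresholds, dtype=float)
    # Count candidates whose score exceeds each threshold.
    llf = (tp_scores[None, :] > thresholds[:, None]).sum(axis=1) / n_lesions
    nlf = (fp_scores[None, :] > thresholds[:, None]).sum(axis=1) / n_images
    return nlf, llf

rng = np.random.default_rng(0)
tp, fp, n_les = simulate_idca(n_images=200, lesions_per_image=2,
                              p=0.8, lam=1.5, mu=1.0, rng=rng)
nlf, llf = empirical_froc(tp, fp, n_les, 200, np.linspace(-5, 5, 21))
```

The empirical points produced this way form a step function. A parametric fit of the kind the paper proposes would replace them with a smooth curve determined by the estimated parameters.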
Key Insights
- The paper proposes a novel unified method for estimating smooth FROC-type curves and constructing confidence intervals for related summary indices, addressing limitations of existing empirical and parametric methods.
- The method is based on an IDCA model and derives a maximum likelihood estimator (MLE) with proven asymptotic normality, despite the random sample sizes inherent in FROC data.
- A key theoretical contribution is Theorem 1, which establishes the asymptotic normality of the MLE for the model parameters (p, λ, θ1, θ2).
- The paper provides a new formula for the AUC (Theorem 2), which makes estimation and inference easier.
- Theorem 3 gives the relationship between the lesion location fraction (LLF) at a given false positive fraction (FPF) and the model parameters, enabling inference on the LLF.
- Simulation results show that the confidence intervals achieve good coverage rates, in some settings near 95%, even under mild within-subject correlation, which violates the model's independence assumption.
- A limitation of the method is its independence assumption, which may not hold in all situations. However, the authors argue that the assumption is more plausible for computer-aided detection systems (SaMDs) and demonstrate robustness to mild violations.
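Asymptotic normality of the MLE is what turns a parameter estimate into a confidence interval for a derived index. One standard route is a Wald interval via the delta method, sketched below in Python. This is a hedged illustration rather than the paper's construction: `delta_method_ci` is a hypothetical helper, the index function `llf_at_nlf` uses an assumed unit-variance normal score model with parameters (p, lam, mu) instead of the formula in Theorem 3, and the estimate and covariance values are made up for demonstration.

```python
import numpy as np
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution

def llf_at_nlf(params, x):
    """LLF at x false candidates per image under the ASSUMED model (needs 0 < x < lam)."""
    p, lam, mu = (float(v) for v in params)
    t = _STD.inv_cdf(1.0 - x / lam)        # threshold giving x false candidates per image
    return p * (1.0 - _STD.cdf(t - mu))    # detected lesions scoring above t

def delta_method_ci(g, est, cov, level=0.95, eps=1e-6):
    """Wald interval for g(est), given the estimated covariance of est."""
    est = np.asarray(est, dtype=float)
    cov = np.asarray(cov, dtype=float)
    grad = np.empty_like(est)
    for i in range(est.size):
        step = np.zeros_like(est)
        step[i] = eps
        # Central finite difference approximates the partial derivative.
        grad[i] = (g(est + step) - g(est - step)) / (2 * eps)
    se = float(np.sqrt(grad @ cov @ grad))
    z = _STD.inv_cdf(0.5 + level / 2)
    value = g(est)
    return value, (value - z * se, value + z * se)

# Hypothetical estimate and covariance, for illustration only.
est = [0.8, 1.5, 1.0]
cov = np.diag([0.0004, 0.01, 0.01])
value, (lo, hi) = delta_method_ci(lambda v: llf_at_nlf(v, 0.5), est, cov)
```

In practice the covariance would come from the inverse of the estimated Fisher information of the fitted model, and an analytic gradient could replace the finite-difference approximation.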
Practical Implications
- The proposed method can be directly applied to evaluate the performance of medical imaging systems, particularly SaMDs, used for detection and localization tasks.
- Researchers and engineers working on diagnostic imaging systems can use this method to obtain more accurate estimates of FROC curves and related performance metrics, along with valid confidence intervals.
- Practitioners can use the method to determine confidence sets for multiple indices, which allows for a more comprehensive assessment of diagnostic performance.
- The method opens up future research directions, including relaxing the independence assumption, extending the method to ordinal data, and studying nonparametric estimates of FROC-type curves.
- The method offers a theoretically justified approach that may support regulatory submissions requiring FROC analysis, particularly for SaMDs seeking FDA approval.