Convergence analysis of data augmentation algorithms in Bayesian lasso models with log-concave likelihoods
Abstract
We study the convergence properties of a class of data augmentation algorithms targeting posterior distributions of Bayesian lasso models with log-concave likelihoods. Leveraging isoperimetric inequalities, we derive a generic convergence bound for this class of algorithms and apply it to Bayesian probit, logistic, and heteroskedastic Gaussian linear lasso models. Under feasible initializations, the mixing times for the probit and logistic models are of order $O[(p+n)^3 (pn^{1-c} + n)]$, up to logarithmic factors, where $n$ is the sample size, $p$ is the dimension of the regression coefficients, and $c \in [0,1]$ is determined by the lasso penalty parameter. The mixing time for the heteroskedastic Gaussian model is $O[n(n+p)^3 (p n^{1-c} + n)]$, up to logarithmic factors.
Summary
This paper addresses the problem of quantifying the convergence rate of data augmentation algorithms for Bayesian lasso models, focusing on models with log-concave likelihoods, which are prevalent in statistical applications. The authors derive a generic convergence bound based on isoperimetric inequalities, which relate the probability mass of a set to the measure of its boundary, and apply it to three specific Bayesian lasso models: probit, logistic, and heteroskedastic Gaussian linear regression. The key contribution is a set of explicit mixing time bounds for these algorithms, showing how the convergence rate depends on the sample size $n$, the dimension of the regression coefficients $p$, and the lasso penalty parameter $\lambda$. Specifically, under feasible initializations, the mixing times for the probit and logistic models are of order $O[(p+n)^3 (p n^{1-c} + n)]$, up to logarithmic factors, where $c$ is determined by the lasso penalty parameter; the mixing time for the heteroskedastic Gaussian model is $O[n(n+p)^3 (p n^{1-c} + n)]$, up to logarithmic factors.

The significance of this work lies in providing theoretical guarantees for the practical use of MCMC in Bayesian lasso models. While these models are widely used for variable selection and shrinkage, understanding the convergence properties of the algorithms used to sample from the posterior distribution is crucial for reliable inference. The authors extend existing convergence analyses to non-Gaussian likelihoods and heteroskedastic settings by leveraging isoperimetric inequalities. This contributes to a more comprehensive understanding of the computational behavior of the Bayesian lasso, offers guidance for practitioners on choosing parameters and assessing convergence, and provides a pathway for analyzing other MCMC algorithms that target similar statistical models.
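To make the algorithm class concrete, the sketch below implements one standard data augmentation (two-block Gibbs) sampler for a Bayesian probit lasso model: Albert and Chib's latent-normal augmentation of the probit likelihood combined with Park and Casella's scale-mixture representation of the Laplace prior. It is only an illustration of the style of sampler the paper analyzes, under these assumed modeling choices; it is not the authors' exact algorithm, and the function name and defaults are hypothetical.

```python
import numpy as np
from scipy.stats import truncnorm

def probit_lasso_da(X, y, lam, n_iter=5000, seed=0):
    """Two-block DA sampler: draw latent normals z | beta, y, then beta | z,
    with the Laplace-prior scales tau2 refreshed from their conditional."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta = np.zeros(p)
    tau2 = np.ones(p)                  # scale-mixture latents for the Laplace prior
    draws = np.empty((n_iter, p))
    for t in range(n_iter):
        # Augmentation step: z_i | beta, y_i is N(x_i' beta, 1), truncated to
        # (0, inf) if y_i = 1 and to (-inf, 0) if y_i = 0.
        mu = X @ beta
        lo = np.where(y == 1, -mu, -np.inf)
        hi = np.where(y == 1, np.inf, -mu)
        z = mu + truncnorm.rvs(lo, hi, size=n, random_state=rng)
        # Refresh the prior scales: 1/tau_j^2 | beta_j is inverse-Gaussian
        # with mean lam / |beta_j| and shape lam^2 (Park-Casella).
        tau2 = 1.0 / rng.wald(lam / np.maximum(np.abs(beta), 1e-10), lam ** 2)
        # Conditional step: beta | z, tau2 is N(A^{-1} X'z, A^{-1}).
        A = X.T @ X + np.diag(1.0 / tau2)
        L = np.linalg.cholesky(A)
        m = np.linalg.solve(A, X.T @ z)
        beta = m + np.linalg.solve(L.T, rng.standard_normal(p))
        draws[t] = beta
    return draws
```

The augmentation step samples latent variables given the current coefficients, and the conditional step samples the coefficients from a Gaussian; the mixing time bounds summarized above quantify how many such alternations this type of two-block scheme needs before its marginal distribution is close to the posterior.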
Key Insights
- The paper provides a generic convergence bound for data augmentation algorithms in Bayesian lasso models with log-concave likelihoods, relying on isoperimetric inequalities and a close coupling condition.
- Under feasible initializations, the mixing time bounds for the probit and logistic models are of order $O[(p+n)^3 (p n^{1-c} + n)]$, up to logarithmic factors, where $c$ is determined by the lasso penalty parameter $\lambda$ (a numerical sketch of this scaling follows this list).
- The mixing time for the heteroskedastic Gaussian model is of order $O[n(n+p)^3 (p n^{1-c} + n)]$, up to logarithmic factors.
- The authors leverage perturbation bounds to establish an isoperimetric inequality for the log-concave but non-smooth posterior distribution, addressing challenges arising from the non-Gaussian likelihood and the non-smooth prior.
- The close coupling condition (A1) requires bounding the total variation distance between $g(\cdot \mid \alpha^{(1)}, \beta^{(1)}, y)$ and $g(\cdot \mid \alpha^{(2)}, \beta^{(2)}, y)$, which reflects the sensitivity of the augmentation step to changes in the parameters.
- The authors provide explicit expressions for the constant $D$ in condition (A2), which bounds the ratio between the posterior density and a reference distribution; these expressions depend on the model parameters and the data.
- The analysis relies on the assumption that the likelihood is log-concave, which limits its applicability to models within this class.
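As a purely illustrative aid, the snippet below evaluates the leading-order terms of these two bounds for a few hypothetical $(n, p)$ pairs, ignoring the unspecified constants and logarithmic factors and fixing $c = 0.5$ arbitrarily; the numbers indicate scaling only, not usable iteration counts.

```python
# Leading-order terms of the stated mixing time bounds (constants and log factors dropped).
def mixing_order(n, p, c, heteroskedastic=False):
    base = (n + p) ** 3 * (p * n ** (1 - c) + n)   # probit / logistic bound
    return n * base if heteroskedastic else base    # extra factor of n for the heteroskedastic Gaussian

for n, p in [(1_000, 50), (10_000, 50), (10_000, 500)]:
    print(f"n={n}, p={p}: probit/logistic ~ {mixing_order(n, p, 0.5):.1e}, "
          f"heteroskedastic Gaussian ~ {mixing_order(n, p, 0.5, True):.1e}")
```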
Practical Implications
- The theoretical results offer guidance for practitioners using Bayesian lasso models, especially for assessing the convergence of MCMC algorithms; the derived mixing time bounds can help determine the number of iterations needed for reliable inference (a minimal empirical diagnostic sketch follows this list).
- The paper identifies specific factors that influence the mixing time, namely the sample size, the dimension of the regression coefficients, and the lasso penalty parameter. This knowledge can inform the choice of hyperparameters and data collection strategies.
- The developed framework can be extended to analyze other MCMC algorithms targeting similar statistical models, such as those with different priors or likelihoods.
- The results highlight the importance of feasible initializations for achieving fast convergence; the paper gives specific recommendations for initial distributions that satisfy the required conditions.
- The work opens avenues for future research, such as exploring alternative MCMC algorithms or developing techniques to improve convergence rates in specific application domains.
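Because the bounds are asymptotic orders rather than turnkey iteration counts, practitioners typically complement them with empirical diagnostics. The following is a minimal sketch of a standard autocorrelation-based effective sample size check on a chain of sampled coefficients; it is common MCMC practice rather than a method from the paper, and the commented usage refers to the hypothetical sampler sketched earlier.

```python
import numpy as np

def effective_sample_size(chain, max_lag=200):
    """Crude ESS estimate for a 1-D chain: n / (1 + 2 * sum of positive-lag autocorrelations)."""
    x = np.asarray(chain, dtype=float) - np.mean(chain)
    n = x.size
    acf = np.array([np.dot(x[:n - k], x[k:]) / np.dot(x, x) for k in range(max_lag)])
    cut = np.argmax(acf <= 0) if np.any(acf <= 0) else max_lag   # truncate at first non-positive lag
    return n / (1.0 + 2.0 * acf[1:cut].sum())

# Hypothetical usage with the earlier sketch:
# draws = probit_lasso_da(X, y, lam=1.0)
# ess = [effective_sample_size(draws[:, j]) for j in range(draws.shape[1])]
```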