Overview
This paper is concerned with variable selection in a high-dimensional linear framework when the set of covariates under consideration is highly correlated. Existing methods in the literature generally require the degree of correlation among covariates to be weak; yet, in applied research, covariates are often strongly cross-correlated due to common factors. This paper generalizes the One Covariate at a Time Multiple Testing (OCMT) procedure proposed by Chudik et al. (2018), referred to as GOCMT, to allow the set of covariates under consideration to be highly correlated. We exploit ideas from the latent factor and multiple testing literatures to control the probability of selecting the approximating model. We also establish the asymptotic behavior of the post-GOCMT selected model estimated by least squares. Our results show that the estimation error of the coefficients converges to zero in the limit. Moreover, the mean square error and the mean square forecast error of the estimated model approach their corresponding optimal values asymptotically. The proposed method is shown to be valid under general assumptions and is computationally very fast. Monte Carlo experiments indicate that the newly suggested method has appealing finite-sample performance relative to competing methods across many different settings. The benefits of the proposed method are also illustrated by an empirical application to the selection of risk factors in the asset pricing literature.