Learning Factor Analysis Structures: A Clique Search Method on Correlation Thresholded Graphs and a Piecewise Linear Spline Approach
Dale Kim
PhD, 2022
Qing, Zhou
Factor analysis is a widely used method for modeling a set of observed variables by a set of unobserved latent factors. Despite their widespread application, existing methods for factor analysis suffer from some or all of the following weaknesses: requiring the number of factors to be known, lacking theoretical guarantees for learning the model structure, and nonidentifiability of the parameters due to rotation invariance properties of the likelihood. To address these concerns, this dissertation proposes two main methods. First, we propose a fast correlation thresholding (CT) algorithm that simultaneously learns the number of latent factors and a model structure that leads to identifiable parameters. This approach translates the structure learning problem into a search for so-called independent maximal cliques in a thresholded correlation graph that can be easily constructed from the observed data. Moreover, we present a routine that finds all independent maximal cliques very efficiently by checking the neighborhood of each node in the graph. A finite-sample error bound and high-dimensional consistency for structure learning with this method are also presented. Second, we consider the problem of non-linear factor analysis and propose a piecewise linear spline method under an EM-algorithm framework. In many practical settings, learning a non-linear model may obviate the need for multiple latent factors and also allow the model to avoid the nonidentifiability caused by rotational invariance. This method is explored by simulation, and a preliminary study of the non-linear multidimensional extension is also presented.
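As a rough illustration of the clique search idea on a thresholded correlation graph, the following Python sketch may help. It is not the dissertation's implementation. It assumes that an "independent maximal clique" is a maximal clique containing at least one node that belongs to no other maximal clique, which is equivalent to that node's closed neighborhood forming a clique. The function names, the threshold value 0.5, and the toy data (two latent factors with three noisy indicators each) are all hypothetical choices made for this example.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def thresholded_graph(columns, tau):
    """Connect variables i and j when |corr(i, j)| exceeds tau (assumed rule)."""
    p = len(columns)
    adj = {i: set() for i in range(p)}
    for i in range(p):
        for j in range(i + 1, p):
            if abs(pearson(columns[i], columns[j])) > tau:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def neighborhood_cliques(adj):
    """Return closed neighborhoods that are cliques (assumed definition).

    If the closed neighborhood of v is a clique, it is the only maximal
    clique containing v, so one pass over the nodes is enough.
    """
    found = set()
    for v, nbrs in adj.items():
        closed = nbrs | {v}
        is_clique = all(closed - {u} <= adj[u] for u in closed)
        if is_clique and len(closed) >= 2:  # skip isolated nodes
            found.add(frozenset(closed))
    return found


# Toy data: variables 0-2 load on one latent factor, variables 3-5 on another.
random.seed(0)
n = 500
f1 = [random.gauss(0, 1) for _ in range(n)]
f2 = [random.gauss(0, 1) for _ in range(n)]
columns = [[f + 0.3 * random.gauss(0, 1) for f in f1] for _ in range(3)]
columns += [[f + 0.3 * random.gauss(0, 1) for f in f2] for _ in range(3)]

adj = thresholded_graph(columns, tau=0.5)
cliques = neighborhood_cliques(adj)
print(len(cliques), sorted(sorted(c) for c in cliques))
```

In this toy setting each recovered clique corresponds to one latent factor, so the number of cliques serves as an estimate of the number of factors and the clique memberships suggest which variables load on which factor. How the threshold should be chosen, and how the cliques map to the final model structure, are matters for the dissertation itself and are not captured by this sketch.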