Multilinear Approximation with Kronecker Weights

Wei Tan Tsai
Ph.D., 2011
Advisor: Jan de Leeuw
Many modern real-world datasets are stored as matrices or higher-dimensional arrays, in which several kinds of correlation may coexist. For example, in environmental statistics, temporally and spatially correlated behaviors may be observed simultaneously. This consideration has led to the use of a separable covariance (Kronecker product) structure, by means of an extension of the classical multivariate normal distribution to the matrix normal distribution for a matrix-valued dataset. The General Growth Component Model, with a bilinear structured mean and a Kronecker structured covariance, is more generally applicable than the linear model in the description of such data. This model generalizes the Growth Curve Model proposed by Potthoff and Roy [1964] by adding flexibility in the components of the mean structure and of the dispersion matrices. In many previous publications on the Growth Curve Model, the emphasis is on estimation and hypothesis testing. In this dissertation, we propose the Three-Stage Kronecker Algorithm, implemented in R, which estimates the parameters using a systematic optimization approach. The algorithm allows us to incorporate different variations of the structures of the mean parameters as well as the dispersion parameters. It is not limited to the Growth Curve Model; it can also be applied to other related models, such as the General Linear Model (GLM), Principal Component Analysis (PCA), and Factor Analysis (FA), which can be described by the General Growth Component Model and fit into the same relaxation or alternating least squares schemes to minimize their loss functions. The expectation-maximization (EM) imputation technique, which accounts for the missing values that are inevitable in real-world datasets, is also considered. We demonstrate the application of the algorithm to a longitudinal study of children's dental health and to an environmental study of traffic and ozone data.
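
For orientation, the bilinear mean and Kronecker-structured covariance described above can be written in one common textbook form. This is a sketch only: the symbols Y, X, B, Z, E, Sigma and Omega and the dimensions are assumptions chosen for illustration, not notation taken from the dissertation, and the loss shown is the standard weighted least squares criterion for a matrix normal error, not necessarily the exact loss the algorithm minimizes.

```latex
% Assumed notation: Y is n x p (data), X is n x q and Z is p x r (known
% design/basis matrices), B is q x r (mean parameters), Sigma is n x n
% (row dispersion), Omega is p x p (column dispersion).
\begin{align*}
  Y &= X B Z^{\top} + E, \\
  \operatorname{vec}(E) &\sim \mathcal{N}\!\left(0,\; \Omega \otimes \Sigma\right), \\
  \ell(B, \Sigma, \Omega) &= \operatorname{tr}\!\left[
      \Sigma^{-1} \left(Y - X B Z^{\top}\right)
      \Omega^{-1} \left(Y - X B Z^{\top}\right)^{\top}
    \right].
\end{align*}
```

Under these assumptions, fixing Sigma and Omega turns the minimization over B into a weighted least squares problem, while fixing B leaves a problem in the dispersion matrices alone. This is the kind of block-wise structure that relaxation or alternating least squares schemes exploit.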