Robust Methods for Mean and Covariance Structure Analysis
K. H. Yuan, P. M. Bentler
Covariance structure analysis plays an important role in the social and behavioral sciences in evaluating hypothesized influences among unmeasured latent variables and observed variables. Existing methods for analyzing these data rely on unstructured sample means and covariances estimated under normality, and evaluate a proposed structural model using statistical theory based on normal theory MLE and generalized least squares (GLS) with a weight matrix obtained by inverting a matrix based on sample fourth moments and covariances. Since the influence functions associated with these methods are quadratic, a few outliers can cause these classical procedures to fail completely. Because data collected in the social and behavioral sciences are often not very accurate, robust methods are needed for estimation and testing. Even though the theory for robustly estimating multivariate location and scatter has been developed extensively, very little has been accomplished in robust mean and covariance structure analysis. While robust principal components and canonical variates were described many years ago, this methodology is essentially exploratory in nature and provides neither tests of model fit nor the covariance matrix of the estimator, both of which are essential to covariance structure analysis. In this paper, several robust methods for model fitting and testing are proposed. These include direct M-estimation of the structured parameters and a two-stage procedure based on robust M- and S-estimators of population covariances. The large sample properties of these estimators are obtained. The equivalence between a direct M-estimator and a two-stage estimator based on an M-estimator of population covariance is established when sampling from an elliptical distribution. Two test statistics are presented for judging the adequacy of a hypothesized model: both are asymptotically distribution free when distribution free weight matrices are used.
These test statistics therefore possess both small sample and large sample robustness. The two-stage procedures can be easily incorporated into standard software packages by modifying existing GLS procedures. To demonstrate how easily the two-stage procedure can be applied, M-estimators under six different weight functions are calculated for a real data set. All the weight functions give the smallest weight to the case that had previously been identified as the most influential point.
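As an illustration of the first stage of such a two-stage procedure, the following is a minimal sketch of an M-estimator of multivariate location and scatter computed by iterative reweighting. It is not the paper's implementation: the Huber-type weight function, the function name `huber_m_estimate`, the tuning constant `r`, the scaling constant `tau`, and the simulated data are all assumptions made for this example. In practice `tau` would be chosen so that the scatter estimate is consistent for the covariance matrix under normality; here it is left at 1 for simplicity.

```python
import numpy as np

def huber_m_estimate(X, r=2.5, tau=1.0, tol=1e-8, max_iter=500):
    """Huber-type M-estimate of location and scatter (illustrative sketch).

    r and tau are hypothetical tuning constants, not values from the paper.
    Returns the location, the scatter matrix, and the case weights from
    the final iteration.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    # Start from the classical sample mean and covariance.
    mu = X.mean(axis=0)
    sigma = np.cov(X, rowvar=False, bias=True)
    for _ in range(max_iter):
        diff = X - mu
        # Squared Mahalanobis distance of each case from the current fit.
        d2 = np.einsum("ij,ij->i", diff @ np.linalg.inv(sigma), diff)
        d = np.sqrt(np.maximum(d2, 0.0))
        # Huber-type weights: full weight inside radius r, downweighted outside.
        u1 = np.where(d <= r, 1.0, r / np.maximum(d, 1e-12))
        u2 = u1 ** 2 / tau
        new_mu = (u1[:, None] * X).sum(axis=0) / u1.sum()
        diff = X - new_mu
        new_sigma = (u2[:, None] * diff).T @ diff / n
        delta = max(np.abs(new_mu - mu).max(), np.abs(new_sigma - sigma).max())
        mu, sigma = new_mu, new_sigma
        if delta < tol:
            break
    return mu, sigma, u1

# Simulated example: 200 bivariate normal cases plus one gross outlier.
rng = np.random.default_rng(0)
X = np.vstack([rng.standard_normal((200, 2)), [[15.0, -15.0]]])
mu, sigma, weights = huber_m_estimate(X)
print("location:", mu)
print("index of smallest weight:", int(np.argmin(weights)))
```

In a two-stage analysis, the robust scatter estimate returned here would replace the sample covariance matrix as input to an existing GLS fitting routine, and the case weights can be inspected to see which observations were downweighted.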
1995-09-01