An Introduction to Ensemble Methods for Data Analysis (April 2005 Revision)

Richard A. Berk

This paper provides an introduction to ensemble statistical procedures as a special case of algorithmic methods. The discussion begins with classification and regression trees (CART) as a didactic device to introduce many of the key issues. The material on CART is followed by a consideration of cross-validation, bagging, random forests, and boosting. Major points are illustrated with analyses of real data.

This paper supersedes Preprint number uclastat-preprint-2003:26.

2005-09-01