From Information Scaling of Natural Images to Regimes of Statistical Models

Ying Nian Wu, Song-Chun Zhu, and Cheng-en Guo
Computer vision can be considered a highly specialized data collection and data analysis problem. We need to understand the special properties of image data in order to construct statistical models for representing the wide variety of image patterns. One property that distinguishes image data from other sensory data, such as speech, is the profound role played by distance or scale. More specifically, visual objects and patterns can appear at a wide range of distances or scales, and the same visual pattern appearing at different distances or scales produces different image data with different statistical properties, and therefore entails different regimes of statistical models. In particular, we show that the entropy rate of the image data changes with viewing distance (as well as with camera resolution). The inferential uncertainty changes with viewing distance as well. We call these changes information scaling. From this perspective, we examine, both empirically and theoretically, two prominent yet largely isolated research themes in the image modeling literature, namely wavelet sparse coding and Markov random fields. Our results indicate that the two models are appropriate for two different entropy regimes: sparse coding targets the low-entropy regime, whereas random fields are suitable for the high-entropy regime. Because of information scaling, both models are necessary for representing and interpreting image intensity patterns across the whole entropy range, and information scaling triggers transitions between these two regimes of models. This motivates us to propose a full-zoom primal sketch model that integrates both sparse coding and Markov random fields. In this model, local image intensity patterns are classified into
2004-09-01