Image and Video Processing: Scene Statistics and Visual Neuroscience at work!

Efficient coding of visual information and efficient inference of missing information in images depend on two factors:

  1. The statistical structure of photographic images, and
  2. The nature of the observer that will analyze the result.

Interestingly, these two factors (image regularities and human vision) are deeply related since the evolution of biological sensors seems to be guided by statistical learning (see our work on the Efficient Coding Hypothesis in Visual Neuroscience). However, the simultaneous consideration of these two factors is unusual in the image processing community, particularly beyond Gaussian image models and linear models of the observer.
Our work in image and video processing has run in parallel with our research on the non-Gaussian nature of visual scenes and the nonlinear behavior of the visual cortex. This parallel approach is sensible since these are two sides of the same issue in vision (the Efficient Coding Hypothesis again!). Specifically, the core algorithm used in many of these applications is Divisive Normalization, a canonical computation in sensory neurons with interesting statistical effects (see Neur.Comp.10).
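
As a rough illustration of the kind of computation involved, a generic divisive normalization divides each rectified linear response by a constant plus a weighted pool of the neighbouring responses' energies. The Python/NumPy sketch below uses a common textbook form. The kernel `H`, the constants `b`, the exponent `gamma` and the toy input are assumptions chosen for illustration, not the parameters of the models in the cited papers.

```python
import numpy as np


def divisive_normalization(x, H, b, gamma=2.0):
    """Generic divisive normalization of a vector of linear responses.

    x     : (n,) linear responses (e.g. local-frequency coefficients)
    H     : (n, n) non-negative interaction kernel (illustrative values)
    b     : (n,) positive semi-saturation constants (illustrative values)
    gamma : exponent applied to the rectified responses
    """
    x = np.asarray(x, dtype=float)
    e = np.abs(x) ** gamma                # rectified, exponentiated energy
    return np.sign(x) * e / (b + H @ e)   # each response divided by its pool


# Toy setting: 4 coefficients with a Gaussian neighbourhood over their index.
n = 4
idx = np.arange(n)
H = np.exp(-0.5 * (idx[:, None] - idx[None, :]) ** 2)
H /= H.sum(axis=1, keepdims=True)         # each row of the kernel sums to one
b = np.full(n, 0.1)

x = np.array([1.0, -0.5, 0.0, 2.0])
y = divisive_normalization(x, H, b)       # normalized responses
y10 = divisive_normalization(10 * x, H, b)  # input amplitude scaled by 10
```

In this toy setting the output keeps the sign of each input and saturates: multiplying the input by 10 changes the normalized responses by far less than a factor of 10, because the denominator grows with the energy of the neighbourhood.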

We have used this perceptual (and also statistical) model to propose novel solutions in bit allocation, to identify perceptually relevant motion, to smooth image representations, and to compute distances between images.

Image and Video Processing

Low-level image processing (coding, restoration, synthesis, white balance, color and texture editing, etc.) is all about image statistics in a domain where the metric is non-Euclidean (i.e. induced by the data or the observer).
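
To make the idea of an induced metric concrete, one simple (and here purely illustrative) case is a quadratic metric defined by a positive-definite matrix `M`, for example the inverse covariance of the data. The sketch below assumes a hypothetical 2-D covariance `C`; the function name `induced_distance` and all values are made up for the example and do not come from the cited work.

```python
import numpy as np


def induced_distance(x1, x2, M):
    """Distance sqrt((x1 - x2)^T M (x1 - x2)) for a positive-definite M."""
    d = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    return float(np.sqrt(d @ M @ d))


# Hypothetical data covariance: large variance along axis 0, small along axis 1.
C = np.array([[4.0, 0.0],
              [0.0, 0.25]])
M = np.linalg.inv(C)                      # metric induced by the data

origin = np.zeros(2)
along_0 = np.array([1.0, 0.0])            # unit step in the high-variance direction
along_1 = np.array([0.0, 1.0])            # unit step in the low-variance direction

d0 = induced_distance(along_0, origin, M)
d1 = induced_distance(along_1, origin, M)
```

Both steps have the same Euclidean length, yet under the induced metric the step along the low-variance direction is four times longer (`d1 = 2.0` versus `d0 = 0.5`). An observer-induced metric follows the same pattern, with `M` derived from a perception model instead of the data covariance.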

We proposed original image processing techniques using both perception models and image statistics including:

(i) improvements of the JPEG standard for image coding through nonlinear texture vision models Electr.Lett.95, Electr.Lett.99, IEEE TNN05, IEEE TIP06a, JMLR08, RPSP12, Patent08,
(ii) improvements of the MPEG standard for video coding with a new perceptual quantization scheme and new motion estimation focused on perceptually relevant optical flow LNCS97, Electr.Lett.98, Electr.Lett.00a, Electr.Lett.00b, IEEE TIP01, Redund.Reduct.99,
(iii) new image restoration techniques based on nonlinear contrast perception models and the image statistics in local frequency domains IEEE TIP06b, JMLR10,
(iv) new approaches to color constancy based on relative chromatic descriptors Vis.Res.97, J.Opt.96, on statistically-based chromatic adaptation models Neur.Comp.12, PLoS-ONE14, or on Bayesian estimation of surface reflectance IEEE-TGRS14,
(v) new subjective image and video distortion measures using nonlinear perception models Im.Vis.Comp.97, Disp.99, IEEE ICIP02, JOSA10, Proc.SPIE15,
(vi) image classification and knowledge extraction (or regression) based on our feature extraction techniques IEEE-TNN11, IEEE-TGRS13, Int.J.Neur.Syst.14, IEEE-JSTSP15.

See CODE for image and video processing applications here.
