Vision Science: from Optics to Neuroscience and Statistical Learning
Vision is the ability to interpret the surrounding environment by analyzing the measurements acquired by imaging systems. This ability is particularly impressive in humans when compared with the current state of the art in computer vision.
The study of all phenomena related to vision in biological systems (particularly in humans) is usually referred to as Vision Science. It addresses a variety of issues, ranging from the formation of the visual signal (the physics of the imaging process, which includes Radiometry and Physiological Optics) to the analysis of the visual signal, which is of interest to Neuroscience and Psychology.
This analysis involves the extraction of visual primitives through basic computations along the retina-cortex neural pathway, and the subsequent information processing that leads to scene descriptors of higher abstraction levels. These problems can be approached from different perspectives:
- A mechanistic perspective, which focuses on describing the empirical behavior of the system, based on experimental recordings from Psychophysics and Neurophysiology.
- A normative perspective, which looks for the functional reasons (organization principles) that explain the behavior. This perspective relies on the study of Image Statistics and the use of concepts from Information Theory and Statistical Learning.
The latter approach is known as the Efficient Coding Hypothesis.
Over the years, we have made original contributions in all of the above subdisciplines related to (low-level) Vision Science. Currently, we are shifting our focus to more abstract visual functions.
Experiments in Vision Science
We made experimental contributions in three areas: Physiological Optics, Psychophysics, and Image Statistics.
In the field of Physiological Optics, we measured the optical transfer function of the lens+cornea system in vivo [Opth.Phys.Opt.97]. This work received the European Vistakon Research Award in 1994.
In Psychophysics, we proposed simplified methods to measure the Contrast Sensitivity Function across the entire frequency domain [J.Opt.94], and developed a fast and accurate method to measure the parameters of multi-stage linear+nonlinear vision models [Proc.SPIE15].
In Image Statistics, we gathered spatially and spectrally calibrated image samples to determine the properties of these signals and their variations under changes in illumination, contrast, and motion [Im.Vis.Comp.00, Neur.Comp.12, IEEE-TGRS14, PLoS-ONE14, Rem.Sens.Im.Proc.11, Front.Neurosci.15].
Theory: empirical models in Vision Science
We proposed mathematical descriptions of different visual dimensions: Texture, Color, and Motion.
We used wavelet representations to propose nonstationary Texture Vision models [J.Mod.Opt.97, MScThesis95].
We developed Color Vision models with illumination invariance, which allow for the reproduction of chromatic anomalies, adaptation, and aftereffects [Vis.Res.97, J.Opt.96, J.Opt.98, JOSA04, Neur.Comp.12].
We created Motion Vision models [Alheteia08] that focus on optical flow computation in perceptually relevant moving regions [J.Vis.01, PhDThesis99], and explain the static motion aftereffect [Front.Neurosci.15].
All these psychophysical and physiological models have a parallel linear+nonlinear structure where receptive fields and surround-dependent normalization play an important role.
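For concreteness, here is a minimal sketch of one such stage, assuming a generic filter bank, uniform pooling, and placeholder constants (the published models use fitted receptive fields and pooling kernels):

```python
import numpy as np

def linear_nonlinear_stage(signal, filters, sigma=0.1, gamma=2.0):
    """One linear+nonlinear stage: linear receptive fields followed by
    surround-dependent divisive normalization. sigma (semisaturation)
    and gamma (expansive exponent) are placeholder values."""
    # Linear stage: responses of the receptive fields
    r = filters @ signal                 # shape (n_sensors,)
    e = np.abs(r) ** gamma               # rectified, expansive responses
    # Nonlinear stage: each response is divided by the pooled activity
    # of its neighbors (uniform pooling here, an assumption)
    pool = np.ones((len(e), len(e))) / len(e)
    return np.sign(r) * e / (sigma ** gamma + pool @ e)

# Toy usage: random filters applied to a random 64-sample patch
rng = np.random.default_rng(0)
responses = linear_nonlinear_stage(rng.standard_normal(64),
                                   rng.standard_normal((16, 64)))
print(responses.shape)   # (16,)
```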
Theory: principled models in Vision Science
This category refers to the formulation of organizational laws of sensory systems that explain the empirical phenomena. These principles show that neural function is adapted to (or even determined by) the statistics of visual stimuli.
Derivation of Linear Properties: We worked on deriving the linear properties of the sensors and found that their spatio-chromatic sensitivity, the changes in their receptive fields, and their phase properties arise from optimal solutions to the adaptation problem under noise constraints and manifold matching [PLoS-ONE14, IEEE-TGRS13]. These properties also follow from statistical independence requirements [LNCS11, NeuroImag.Meeting11] and from optimal estimation of object reflectance [IEEE-TGRS14].
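As a toy illustration of this statistics-to-sensors logic, the sketch below obtains linear "receptive fields" as principal components of an image-patch ensemble; PCA on a synthetic 1/f surrogate merely stands in for the adaptation, independence, and reflectance-estimation criteria of the cited papers:

```python
import numpy as np

rng = np.random.default_rng(1)

def synthetic_patches(n=5000, size=8):
    """Smoothed-noise patches with roughly 1/f spatial structure,
    a crude surrogate for calibrated natural images."""
    fx, fy = np.meshgrid(np.fft.fftfreq(size), np.fft.fftfreq(size))
    spectrum = 1.0 / np.maximum(np.hypot(fx, fy), 1.0 / size)
    white = rng.standard_normal((n, size, size))
    pink = np.real(np.fft.ifft2(np.fft.fft2(white) * spectrum))
    return pink.reshape(n, -1)

X = synthetic_patches()
X -= X.mean(axis=0)
# Eigenvectors of the patch covariance play the role of linear sensors
eigvals, eigvecs = np.linalg.eigh(X.T @ X / len(X))
receptive_fields = eigvecs[:, ::-1].T   # rows sorted by decreasing variance
print(receptive_fields.shape)           # (64, 64): one 8x8 field per row
```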
Derivation of Non-Linear Behavior: We also derived the nonlinear behavior of a variety of visual sensors (chromatic, texture, and motion sensors). We found that these nonlinearities are linked to optimal information transmission (entropy maximization) and/or error minimization in noisy systems (optimal vector quantization).
We studied this relationship in the traditional statistics-to-perception direction, deriving the nonlinearity from regularities in the scene [Network06, Neur.Comp.12, Front.Neurosci.15].
We also explored the (more novel) perception-to-statistics direction, examining the statistical effects of perceptually motivated nonlinearities [J.Opt.95, Im.Vis.Comp.00, LNCS00, Patt.Recog.03, Neur.Comp.10, LNCS10, NeuroImag.Meeting11].
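The entropy-maximization link has a particularly clean scalar form: passing a response through its own cumulative distribution function (histogram equalization) yields a uniform, maximum-entropy output for a bounded channel. A minimal sketch, where the gamma-distributed input is an illustrative assumption rather than measured image statistics:

```python
import numpy as np

rng = np.random.default_rng(2)

# Skewed, contrast-like input distribution (illustrative choice)
x = rng.gamma(shape=2.0, scale=1.0, size=100_000)

# Empirical CDF as the entropy-maximizing pointwise nonlinearity
xs = np.sort(x)
def nonlinearity(v):
    return np.searchsorted(xs, v) / len(xs)

y = nonlinearity(x)
# The output is approximately uniform on [0, 1], i.e. it has maximal
# entropy for a bounded response, as the equalization argument predicts
hist, _ = np.histogram(y, bins=20, range=(0.0, 1.0), density=True)
print(np.round(hist, 2))   # all bins close to 1.0
```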
Theory: Statistical Learning for Vision Science
In theoretical neuroscience, deriving properties of biological sensors from the regularities in visual scenes requires novel tools for statistical learning. In this field, we developed new techniques for unsupervised manifold learning, feature extraction (or symmetry detection in datasets), dimensionality reduction, probability density estimation, multi-information estimation, distance learning, and automatic adaptation from optimal dataset matching.
Given our interest in applicability to Vision Science problems, we focused on techniques whose features can be explicitly represented in the image domain and compared with the receptive fields of visual neurons, as opposed to the usual practice in the Machine Learning community. Techniques include:
- Rotation-based Iterative Gaussianization (RBIG) [IEEE-TNN11] (see the sketch after this list).
- Sequential Principal Curves Analysis (SPCA) [Network06, Neur.Comp.12, Front.Neurosci.15].
- Principal Polynomial Analysis (PPA) [Int.J.Neur.Syst.14].
- Dimensionality Reduction based on Regression (DRR) [IEEE-JSTSP15].
- Graph Matching for Adaptation [IEEE-TGRS13].
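To give a flavor of these tools, here is a toy version of the RBIG iteration: alternate a marginal Gaussianization with a rotation until the joint distribution approaches a Gaussian. This bare-bones sketch uses random rotations and SciPy's Gaussian quantile function; it illustrates the idea and is not the reference implementation of [IEEE-TNN11]:

```python
import numpy as np
from scipy.stats import norm

def rbig(X, n_iter=20, seed=3):
    """Toy Rotation-based Iterative Gaussianization: marginal
    Gaussianization (rank -> uniform -> Gaussian quantile) followed
    by a random orthonormal rotation, repeated n_iter times."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    for _ in range(n_iter):
        ranks = X.argsort(axis=0).argsort(axis=0)
        X = norm.ppf((ranks + 0.5) / n)        # marginal Gaussianization
        Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
        X = X @ Q                              # random rotation
    return X

# Usage: Gaussianize a nonlinearly dependent 2D dataset
rng = np.random.default_rng(4)
t = rng.standard_normal(2000)
data = np.column_stack([t, t**2 + 0.1 * rng.standard_normal(2000)])
Z = rbig(data)
print(np.round(np.cov(Z.T), 2))   # near the identity as Z approaches N(0, I)
```

Random rotations are the simplest choice here; data-driven rotations (e.g. PCA-based) can speed up convergence, and the published method comes with the corresponding convergence analysis.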