VistaLab: The Matlab Toolbox for Linear Spatio-Temporal Vision Models

VistaLab is a Matlab toolbox that provides the linear building-blocks to create spatio-temporal vision models and the tools to control the spatio-temporal properties of video sequences. These building blocks include the spatio-temporal receptive fields of LGN, V1, and MT cells, and the spatial and spatio-temporal Contrast Sensitivity Functions (CSFs). Additionally, VistaLab allows accurate spatio-temporal sampling, spatio-temporal Fourier domain visualization, and generation of video sequences with controlled texture and speed. Tools for video sequence generation include noise, random dots, and rigid-body animations with Lambertian reflectance.

The perception and video synthesis tools enable accurate illustrations of the visibility of achromatic spatio-temporal patterns. Linear filters in VistaLab provide rough approximations of pattern visibility, which can be enhanced with non-linear models available in related toolboxes.

The standard tools in VistaLab (and ColorLab) are essential for building the more sophisticated vision models available on the dedicated VistaModels site.

Retina and Lateral Geniculate Nucleus (LGN)

Most retinal ganglion cells and LGN cells can be modelled with center-surround receptive fields with a monophasic or biphasic temporal response. VistaLab comes with a configurable implementation of such receptive fields that follows the general expressions in [Cai, Freeman, DeAngelis, J. Neurophysiol. 97]. With these units it is easy to generate artificial retinas with arbitrary sampling [Martinez-Garcia et al. 16, Martinez-Garcia et al. 17].

The examples below show (a) the receptive field of a representative neuron in the spatio-temporal domain and in the 3D Fourier domain, and (b) the response of a population of such neurons to a natural movie, assuming uniform retinal sampling and spatial invariance of the receptive field. VistaLab also allows each sensor response to be computed explicitly as the scalar product of the stimulus with the corresponding receptive field, which removes the uniform-sampling and convolution assumptions.
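The center-surround description above can be sketched in base Matlab without any VistaLab function. This is a minimal illustration, not the toolbox implementation: it assumes a space-time separable receptive field built from a difference of Gaussians in space and a difference of two gamma-like lobes in time. All names and parameter values (sigma_c, sigma_s, tau1, tau2, the sampling rates) are assumptions chosen for display and are not taken from [Cai, Freeman, DeAngelis, J. Neurophysiol. 97].

```matlab
% Illustrative space-time separable center-surround receptive field.
% All parameter values are assumptions chosen for display only.
fs_x = 64;                         % spatial sampling (samples/deg)
fs_t = 100;                        % temporal sampling (Hz)
x = (-32:31) / fs_x;               % space axis (deg), 64 samples
t = (0:31) / fs_t;                 % time axis (s), 32 samples
[X, Y] = meshgrid(x, x);
R2 = X.^2 + Y.^2;

% Spatial profile: narrow excitatory center minus broad inhibitory surround
sigma_c = 0.05;  sigma_s = 0.15;   % widths (deg)
center   = exp(-R2 / (2*sigma_c^2)) / (2*pi*sigma_c^2);
surround = exp(-R2 / (2*sigma_s^2)) / (2*pi*sigma_s^2);
S = center - 0.8*surround;

% Temporal profile: biphasic, built as a fast lobe minus a slower lobe
tau1 = 0.02;  tau2 = 0.04;         % time constants (s)
T = (t/tau1) .* exp(-t/tau1) - 0.6 * (t/tau2) .* exp(-t/tau2);

% Separable spatio-temporal receptive field, size 64 x 64 x 32
RF = zeros(numel(x), numel(x), numel(t));
for k = 1:numel(t)
    RF(:, :, k) = S * T(k);
end

% Response of this single sensor to a movie patch: explicit scalar product
movie = randn(size(RF));           % stand-in for a natural movie patch
response = sum(RF(:) .* movie(:));
```

Computing each response as a scalar product, as in the last line, is what allows every sensor to have its own location and receptive field. A convolution would instead apply the same receptive field on a uniform grid.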

Primary Visual Cortex (V1)

Simple cells in the V1 cortex can be modelled with Gabor-like receptive fields tuned to certain spatial and temporal frequencies. VistaLab comes with a configurable implementation of such receptive fields that follows the general expressions in [Daugman JOSA A 89, Simoncelli & Heeger Vis. Res. 98]. With these units it is easy to generate an artificial cortex with arbitrary sampling [Martinez-Garcia et al. 17].

The examples below show six representative neurons tuned to the same spatial frequency (7 cpd) but to different temporal frequencies: 2, 7, and 10 Hz, both positive and negative. Even though there is no conclusive tuning to two-dimensional speed because of the aperture problem [Heeger JOSA 87], in the direction perpendicular to the grating these neurons are tuned to approximately 0.3, 1, and 1.5 degrees/sec, respectively (both positive and negative), since that speed is the temporal frequency divided by the spatial frequency. The figures show (a) the receptive fields in the spatio-temporal domain and in the 3D Fourier domain, and (b) the response of a population of such neurons to a natural movie, assuming uniform retinal sampling and spatial invariance of the receptive field. VistaLab also allows each sensor response to be computed explicitly as the scalar product of the stimulus with the corresponding receptive field, which removes the uniform-sampling and convolution assumptions.
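The relation between the tuning frequencies and the speed normal to the grating can be checked with a few lines of base Matlab. The sketch below is an illustration under stated assumptions, not the VistaLab code: it uses one spatial dimension plus time, a cosine-phase Gabor, and sampling rates and envelope widths (fs_x, fs_t, sigma_x, sigma_t) that are hypothetical values chosen so that the frequency axes have a resolution of 1 cpd and 1 Hz.

```matlab
% Speed normal to a grating is temporal frequency over spatial frequency.
fx = 7;                            % spatial frequency (cycles/deg)
ft = [2 7 10];                     % temporal frequencies (Hz)
v  = ft / fx;                      % approx. 0.29, 1.00, 1.43 deg/s

% Illustrative space-time Gabor (one spatial dimension plus time).
% Sampling rates and envelope widths are assumptions, not VistaLab defaults.
fs_x = 64;  fs_t = 64;             % samples/deg and samples/s
x = (-32:31) / fs_x;               % 1 deg of space
t = (-32:31) / fs_t;               % 1 s of time
[X, T] = meshgrid(x, t);           % rows index time, columns index space
sigma_x = 0.15;  sigma_t = 0.15;   % Gaussian envelope widths
envelope = exp(-X.^2/(2*sigma_x^2) - T.^2/(2*sigma_t^2));
G = envelope .* cos(2*pi*(fx*X - ft(2)*T));   % drifts at v(2) = 1 deg/s

% Locate the peak of the amplitude spectrum to confirm the tuning
A = abs(fftshift(fft2(G)));
f_axis_x = (-32:31) * fs_x / 64;   % frequency axes (cycles/deg, Hz)
f_axis_t = (-32:31) * fs_t / 64;
[~, idx] = max(A(:));
[it, ix] = ind2sub(size(A), idx);
peak = abs([f_axis_x(ix), f_axis_t(it)]);     % expected at [7 7]
```

The spectrum of a real Gabor has two symmetric lobes, so the absolute value is taken before reading the peak location.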

Middle Temporal (MT) region

Cells in the MT cortex receive projections from V1 cells whose preferred frequencies lie on a plane in the spatio-temporal Fourier domain. Therefore, they are narrow-band in speed. VistaLab comes with a configurable implementation of such receptive fields that follows the general expressions in [Simoncelli & Heeger Vis. Res. 98]. With these units and a spatio-temporal window it is easy to generate an artificial MT cortex with arbitrary sampling [Martinez-Garcia et al. 17].

The examples below show six representative sets of neurons tuned to speeds of 0.3, 1, and 1.5 degrees/sec (both positive and negative). In this case the figures show (a) the receptive fields in the 3D Fourier domain together with the kind of features these cells are optimally tuned to, and (b) the response of a population of such neurons to a natural movie, assuming uniform retinal sampling and spatial invariance of the receptive field. VistaLab also allows each sensor response to be computed explicitly as the scalar product of the stimulus with the corresponding receptive field, which removes the uniform-sampling and convolution assumptions.
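The plane mentioned above follows from a standard property of the Fourier transform: a pattern translating rigidly with velocity (vx, vy) has all its energy on the plane ft + vx*fx + vy*fy = 0, where the sign depends on the transform convention. The base Matlab sketch below illustrates the idea with a Gaussian slab around that plane. It is not the model of [Simoncelli & Heeger Vis. Res. 98]; the slab thickness sigma_f and the frequency grid are assumptions made for the example.

```matlab
% Illustrative MT-like weighting in the 3D Fourier domain: a Gaussian
% slab around the plane ft + vx*fx + vy*fy = 0. Values are assumptions.
vx = 1;  vy = 0;                   % preferred velocity (deg/s)
sigma_f = 1;                       % slab thickness (Hz, along ft)
f = -16:15;                        % frequency axis (cycles/deg and Hz)
[FX, FY, FT] = meshgrid(f, f, f);
dist = FT + vx*FX + vy*FY;         % signed offset from the plane along ft
W = exp(-dist.^2 / (2*sigma_f^2)); % weight given to each frequency

% Weight assigned to V1-like units tuned to fx = 7 cycles/deg and
% different temporal frequencies (sign chosen to lie near the plane)
ft_units = -[2 7 10];
w_units = exp(-(ft_units + vx*7).^2 / (2*sigma_f^2));
[~, best] = max(w_units);          % the 7 Hz unit lies on the plane
```

For a preferred speed of 1 degree/sec, only the unit with matching temporal and spatial frequency (7 Hz at 7 cpd) gets a large weight, which is the narrow-band speed tuning described above.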

Spatio-temporal Contrast Sensitivities

VistaLab comes with different Contrast Sensitivity Functions (CSFs): (a) the spatial achromatic CSF from the OSA Standard Spatial Observer [Watson & Malo IEEE ICIP 02], (b) the spatial chromatic Red-Green and Yellow-Blue CSFs of K. Mullen [Vis. Res. 85], with appropriate scaling [Gutierrez et al. 12], and (c) the achromatic spatio-temporal CSFs of D. Kelly [JOSA 79] and S. Daly (with object tracking speed compensation) [SPIE 98].
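As an illustration of what a spatio-temporal CSF looks like, the sketch below evaluates the closed-form expression that is usually quoted for Kelly's stabilized spatio-velocity surface. It is written in base Matlab and does not call VistaLab. The constants (6.1, 7.3, 45.9) and the use of 2*pi*fs as the spatial variable are reproduced from how the formula is commonly cited and should be treated as an assumption to be checked against [JOSA 79]. The function handle name kelly is hypothetical.

```matlab
% Kelly-type spatio-temporal sensitivity, as commonly quoted (assumption:
% verify the constants against Kelly's paper before relying on them).
% fs: spatial frequency (cycles/deg), v: retinal speed (deg/s), v > 0.
kelly = @(fs, v) (6.1 + 7.3*abs(log10(v/3)).^3) .* v .* (2*pi*fs).^2 ...
                 .* exp(-2*(2*pi*fs).*(v + 2)/45.9);

fs = logspace(-1, 1.5, 60);        % 0.1 to about 32 cycles/deg
ft = 7;                            % temporal frequency (Hz)
v  = ft ./ fs;                     % speed of a grating with (fs, ft)
S  = kelly(fs, v);                 % sensitivity along the ft = 7 Hz slice

[S_max, i_max] = max(S);
fs_peak = fs(i_max);               % spatial frequency of peak sensitivity
```

Expressing the speed as v = ft/fs turns the spatio-velocity surface into a function of spatial and temporal frequency, so it can be used as a weighting in the same 3D Fourier domain as the receptive fields.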

Controlled spatio-temporal stimuli

The movies below illustrate the abilities of VistaLab for accurate motion control.

  • First row: sequences of a Lambertian rigid body moving in a gravitational field with inelastic constraints, recorded from different points of view. This example allows arbitrary locations of the illumination and the camera. In this case both the actual motion in the 3D world and the optical flow (motion in the retinal plane) are known.

  • Second row: includes an example of random dots moving according to arbitrary optical flow fields.

  • Third row: shows how static pictures can be animated using spatially uniform flows of arbitrary speed, which leads to interesting shape-from-motion effects in the case of noise patterns.

  • Fourth row: shows different movies of the same periodic pattern moving at progressively increasing speeds. Aliasing introduces speed reversal at the expected place, as demonstrated by the Fourier diagrams below.
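The speed reversal in the fourth row can be predicted with simple arithmetic: a grating of spatial frequency fx moving at speed v has temporal frequency ft = v*fx, and once ft exceeds half the frame rate it folds back to a frequency of opposite sign. The base Matlab sketch below works this out. The frame rate, the grating frequency, and the list of speeds are illustrative assumptions, not the values used in the movies.

```matlab
% Apparent speed of a drifting grating after temporal sampling.
% Frame rate and grating frequency are illustrative assumptions.
fs_t = 24;                         % frame rate (Hz)
fx   = 4;                          % grating spatial frequency (cycles/deg)
v    = [1 2 4 5];                  % true speeds (deg/s)

ft     = v * fx;                   % true temporal frequencies: 4 8 16 20
ft_app = ft - fs_t * round(ft / fs_t);   % folded into [-fs_t/2, fs_t/2]
v_app  = ft_app / fx;              % apparent speeds: 1 2 -2 -1 deg/s

% Speeds whose temporal frequency exceeds the Nyquist limit fs_t/2
aliased = abs(ft) > fs_t / 2;      % reversal expected for v > fs_t/(2*fx)
```

With these values the reversal appears for speeds above fs_t/(2*fx) = 3 degrees/sec: the gratings moving at 4 and 5 degrees/sec are seen moving backwards at 2 and 1 degrees/sec.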

Extensions of VistaLab

VistaLab only addresses the linear part of the neural mechanisms that mediate the preattentive perception of spatio-temporal patterns. It does not combine these mechanisms to compute motion (optical flow), it does not include the nonlinear interactions between the linear mechanisms, and it does not include color.

These issues can be addressed with other toolboxes, namely VistaVideoCoding, BioMultiLayer_L_NL_color in VistaModels, and ColorLab.

Key Capabilities

  • Spatio-temporal Modeling: Build models for LGN, V1, and MT neural responses.
  • Contrast Sensitivity: Apply achromatic and chromatic CSFs to video and images.
  • Video Synthesis: Create controlled video sequences with specific spatio-temporal properties.
  • Fourier Domain Tools: Visualize spatio-temporal frequency response of neural models.

References

Download