
Neural Mechanisms of Visual Perception
The interpretation of 2D images in terms of a 3D world is a fundamental task of vision, which, through evolution and experience, has shaped the neural processing mechanisms in our visual system. We perceive 3D shape and the layout of objects in space immediately and without effort, and are hardly aware that vision is based entirely on 2D images, the patterns of activity in an array of receptors. However, machine vision studies have shown that the interpretation of images of natural scenes is very difficult, and no artificial vision system has yet come close to human visual performance.
Neurophysiological studies have given a detailed picture of the initial stages of visual processing in the brain, from the registration of the image in the retina to the local feature representation in primary visual cortex. Further investigations of brain areas in the temporal, parietal, and frontal lobes have given insight into the cognitive processes underlying control of attention, working memory, and object recognition.
However, psychophysical studies have indicated that the processes of recognition and attentional selection do not work on the local feature representation directly, but use some intermediate stage of perceptual organization. This hypothetical stage is thought to provide a structured representation in which surfaces and 3D scene layout are made explicit, and local features are somehow lumped together in terms of perceptual objects. This intermediate stage of processing, and its interface with the central processes of selective attention and recognition, are the focus of our research.
Images are cluttered, because near and distant objects are projected onto adjacent regions (Fig. 1). To interpret an image, it has to be ‘segmented’ first into regions that correspond to different objects. The borders between those regions are called occluding contours because they are generated by objects partially occluding other objects in the scene. As illustrated in Fig. 1, occluding contours carry information about the form of the occluding object, but bear no relationship to the occluded objects, which extend behind the contours. Thus, image segmentation requires not only the detection of the borders between regions corresponding to different objects, but also the correct assignment of ‘border-ownership’. The shape of a border should be linked to the color, texture, and other features of the region on the foreground side, while the shapes and regions on the background side must be labeled as incomplete.

The detection of contours may be achieved by local feature detectors, such as the cortical simple cells, but assignment of border-ownership generally requires global processing. A contrast border alone carries no information about border-ownership: a dark-light border could be the contour of a light object to the right, or the contour of a dark object to the left (Fig. 2).
This is a well-known demonstration of an ambiguous figure that makes perception unstable. In general, however, perception is remarkably stable, despite the fact that 2D displays always have multiple interpretations. The black square in Fig. 3, for example, is usually perceived as a black object on a gray background, with the light-dark borders forming the contours of the square, although physically, it is just an area of low luminance on a flat screen. Of course, the square could also be a window, and the borders would then be the edges of the dark surface, but this interpretation is usually not perceived.
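The point that a contrast border alone carries no border-ownership information can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions: the 1D luminance profiles, the values `DARK` and `LIGHT`, and the function `local_edge_response` are hypothetical and are not a model of cortical simple cells. It builds two scenes in which the same dark-light step is owned by a figure on opposite sides, and shows that a purely local measurement cannot tell them apart.

```python
# Toy illustration: hypothetical luminance values, not data from the experiments.
DARK, LIGHT = 0.2, 0.8


def local_edge_response(luminance, position):
    """Signed luminance step across the border just left of `position`."""
    return luminance[position] - luminance[position - 1]


# Scene A: light figure (indices 4-7) on a dark background.
# The border at index 4 is the left contour of the light figure.
light_figure_right = [DARK] * 4 + [LIGHT] * 4 + [DARK] * 2

# Scene B: dark figure (indices 1-3) on a light background.
# The border at index 4 is the right contour of the dark figure.
dark_figure_left = [LIGHT] * 1 + [DARK] * 3 + [LIGHT] * 6

BORDER = 4  # the dark-to-light step sits between indices 3 and 4 in both scenes

response_a = local_edge_response(light_figure_right, BORDER)
response_b = local_edge_response(dark_figure_left, BORDER)

# A small window around the border stands in for a small receptive field.
window_a = light_figure_right[BORDER - 2:BORDER + 2]
window_b = dark_figure_left[BORDER - 2:BORDER + 2]

print("same local response:", response_a == response_b)
print("same local window:  ", window_a == window_b)
print("same global scene:  ", light_figure_right == dark_figure_left)
```

The local response and the local window are identical in the two scenes, while the full profiles differ. Only the context outside the window distinguishes which side owns the border.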
Neural Mechanisms in Figure-Ground Segregation
To interpret contrast borders correctly, one needs the image context and additional information, for example, memory of the shapes of objects. Therefore, figure-ground segregation has often been thought of as a high-level process. However, to our surprise, we found border-ownership coding at the level of V2, the cortical area next to the primary visual cortex. The neuron of Fig. 4, for example, responded strongly to a light-dark border that belonged to a light square above the receptive field (ellipse), but the same neuron responded hardly at all when the light-dark border was presented as part of a dark square below the receptive field.
Fig. 5 shows the example of a color selective neuron. Various stimuli were tested, each of which displayed the same contrast border in the receptive field. The bar graph on the right shows the responses (mean firing rate) of the neuron. It can be seen that this neuron consistently produced a greater response for those displays in which the border in the receptive field perceptually belonged to a figure on the lower left.
Thus, despite their tiny receptive fields, V2 neurons carry border-ownership information which depends on the global configuration.
About 50% of the edge-selective cells in areas V2 and V4 signal border-ownership.
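One simple way to express how strongly a neuron prefers one side of ownership is a normalized response difference. The sketch below is an assumption for illustration only: the function `ownership_index`, the condition names, and the firing rates are hypothetical, and this generic contrast-style index is not necessarily the measure used in the experiments described here. It compares mean firing rates for two displays that place the same border in the receptive field but the figure on opposite sides.

```python
def ownership_index(rate_side_a, rate_side_b):
    """Normalized difference of two mean firing rates.

    Returns a value in [-1, 1]: positive if the neuron responds more when
    the figure is on side A, negative if it prefers side B, and 0 if the
    two responses are equal (or both are zero).
    """
    total = rate_side_a + rate_side_b
    if total == 0:
        return 0.0
    return (rate_side_a - rate_side_b) / total


# Hypothetical mean firing rates (spikes/s) for identical local borders
# that belong to a figure on one side or the other of the receptive field.
hypothetical_rates = {
    "figure_lower_left": 32.0,
    "figure_upper_right": 8.0,
}

index = ownership_index(
    hypothetical_rates["figure_lower_left"],
    hypothetical_rates["figure_upper_right"],
)
print(f"ownership index: {index:.2f}")
```

An index near zero would describe a cell that signals only the local edge, while values far from zero would describe a cell whose response depends on which side the figure lies.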
Our current experiments in this project focus (a) on the mechanisms that generate border-ownership selectivity, and (b) on the role that this representation plays in tasks that involve selective attention and recognition.
(a) Border-ownership signals emerge with short latency, and even under conditions when attention is directed away from the test figure. Thus, it seems that multiple locations of the retinal image are processed automatically and in parallel all the time. Three general hypotheses have been proposed for how these signals arise: models using simple feed-forward mechanisms, models of cooperative feed-back networks within visual cortex, and the hypothesis of top-down influence from higher-level areas.
(b) Understanding the role of the intermediate representation is crucial to understanding vision and perception in general. Every task in perception requires specific processing of information selected from the onslaught of sensory input (on the order of megabytes per second). Border-ownership coding is evidence for a structured visual representation that might be the basis for deployment of attention and selective visual processing. We test this hypothesis by recording neural signals in monkeys performing a selective visual attention task.
Cortical Representation of 3D Shape
In another project, we study the neural mechanisms of 3D shape perception. Recording neuronal responses to random-dot stereograms revealed that many cells in prestriate cortex are selective for 3D shape primitives, such as roof-edges, step-edges, and flat surfaces.
Stereoscopic mechanisms certainly play an important role in depth perception. However, binocular mechanisms are not the only way of perceiving 3D shape. We study the influence of monocular cues, such as occlusion, perspective, texture, shading, and motion parallax, on the responses of cortical neurons. In parallel psychophysical experiments, we measure the ‘subjective’ perception of depth from monocular cues. We developed a paradigm in which subjects adjust a stereoscopic cursor to a test object in a mixed binocular-monocular display. This technique allows us to study monocular depth perception quantitatively in humans as well as monkeys. As in the project on border-ownership, the goal of these experiments is to understand the general nature of object representation in the brain.
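A stereoscopic cursor specifies depth through binocular disparity, so a cursor setting can in principle be read as a depth estimate. The sketch below shows the standard viewing geometry for points on the midline. It is an illustration under assumptions: the function names, the 0.065 m interocular distance (a typical human value), and the viewing distances are hypothetical and are not parameters of the experiments described here.

```python
import math


def vergence_angle(distance_m, interocular_m=0.065):
    """Angle (radians) between the two lines of sight to a midline point."""
    return 2.0 * math.atan(interocular_m / (2.0 * distance_m))


def relative_disparity(reference_m, target_m, interocular_m=0.065):
    """Disparity (radians) of a target relative to a reference distance.

    Positive values mean the target is nearer than the reference
    (crossed disparity); negative values mean it is farther away.
    """
    return (vergence_angle(target_m, interocular_m)
            - vergence_angle(reference_m, interocular_m))


# Hypothetical example: display plane at 1.0 m, cursor set 5 cm in front of it.
disparity_rad = relative_disparity(reference_m=1.0, target_m=0.95)
disparity_arcmin = math.degrees(disparity_rad) * 60.0
print(f"relative disparity: {disparity_arcmin:.1f} arcmin")
```

Because disparity shrinks roughly with the square of viewing distance, the same depth interval yields a much smaller disparity when the display is farther away.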
| Friedman HS, Zhou H, von der Heydt R (2003) The coding of uniform color figures in monkey visual cortex. J Physiol (Lond) 548: 593-613. |
| von der Heydt R (2003) Image parsing mechanisms of the visual cortex. In: The Visual Neurosciences (Werner JS, Chalupa LM, eds), pp 1139-1150. Cambridge, Mass.: MIT Press. |
| von der Heydt R, Zhou H, Friedman HS (2000) Representation of stereoscopic edges in monkey visual cortex. Vision Research 40: 1955-1967. |
| Zhou H, Friedman HS, von der Heydt R (2000) Coding of border ownership in monkey visual cortex. J Neuroscience 20: 6594-6611. |
| Heitger F, von der Heydt R, Peterhans E, Rosenthaler L, Kübler O (1998) Simulation of neural contour mechanisms: representing anomalous contours. Image and Vision Computing 16: 409-423. |








