Theories of Visual Perception: Problems and Perspectives

Greek theories of visual perception

The Greeks had two clearly opposing views on the way visual perception works:
intromission theories and extramission theories. Intromission theorists, such
as Democritus (c. 425 B.C.) and Epicurus (342-270 B.C.), believed that
objects cast off resemblances of themselves, called eidola, rather in the way
that snakes cast off their skins. These eidola are captured by the eye. It is
the entry of eidola into the eye that allows us to see the shapes of objects. Intromission theorists took
as evidence the fact that objects can be seen to be mirrored in the cornea of
the observer. However, this approach leads to unanswered questions: How do
eidola pass through one another without interference? How do eidola of large
objects shrink to enter the eye? How do eidola from a single object reach
many people simultaneously? Extramission theorists, such as Plato (c. 427-347
B.C.), believed that visual fire emanated from the eye and coalesced with
light to form a conduit that allows "motions" of the object to pass
to the sensorium. However, as Aristotle (384-322 B.C.) points out, it is
unreasonable to think that a ray from the eye could reach as far as the
stars.

These
theories demonstrate a lack of a modern understanding of physics and optics,
but the idea that perception involves the presence of a copy of the object in
the eye or brain is represented in modern theories of template matching.

Johannes Kepler and the retinal image

Modern theories of vision start with Johannes Kepler, who in Ad Vitellionem
paralipomena (1604) first correctly described the formation of the retinal image in the
eye. A few years later Christoph Scheiner (1619) observed the retinal image
by scraping away the sclera of the eye of an ox, which was placed in a hole in
a shutter (reported by Descartes, 1637). However, there was a problem: the
retinal image was upside down. Why do we not see the world upside down? The
answer to this problem is that the retinal image is not itself observed. If there
existed a small man in the brain (a homunculus) looking at the retinal image,
then we would still need to explain how he sees the world, and so on in an
infinite regress.

Kepler's theory of the retinal image is pivotal. Old problems are not solved; they are
explained away, and new problems arise which still set the agenda today. Since
the retinal image is two-dimensional, how do we see a three-dimensional
world? How do we work out the real size of objects from their retinal size?
How do we recognise an object is the same from different views? How can we
see features that are not present in the retinal image?

Perspective ambiguity

Perspective
drawing in art was developed by the fifteenth-century Italian artists and architects
Brunelleschi and Alberti. A convenient way of thinking about perspective
derives from Leonardo's window. This is a technique for
perspective drawing in which the artist views a scene through a pane of glass from a
fixed vantage point. The artist then simply copies onto canvas what he sees in the
window. However, there are many possible three-dimensional scenes that can
give rise to the same two-dimensional image.

This was forcibly brought home by Adelbert Ames's demonstrations in the 1940s. The
Ames chair demonstration involves a collection of rods and shapes in 3D
space, which looks like a chair from one vantage point. The point of the
demonstration is that the visual input to a single eye is ambiguous. We
cannot know the true 3D layout of surfaces in a scene from a single
viewpoint.

Perceptual hypotheses

Constructivists such as Hermann von Helmholtz and Richard Gregory start with the position
that the external world cannot be directly perceived because of the poverty of
the information in the retinal images. Since information is not directly
given, we have to interpret the sensory data in order to construct percepts.
Images are interpreted on the basis of stored knowledge acquired through
learning.

Helmholtz believed the visual system drew "unconscious inferences", which he
later referred to as "inductive conclusions". Induction is the
process of drawing a general conclusion from individual instances: if all
the swans we ever see are white, we draw the conclusion that "all swans
are white". This is the same process as is used in the formation of
scientific hypotheses. Gregory takes this further and argues that perception
is a collection of hypotheses about the world. Evidence for this view comes
from analysis of many visual illusions that can be attributed to calibration
errors (e.g. the tilt illusion) or misplaced assumptions (Kanizsa's triangle)
and to the top-down influence of knowledge and expectation.

The ecological approach to perception

In the 1950s James Gibson challenged this view of visual processing. He
referred to his theory as an ecological approach because, rather than
emphasising the poverty of the retinal image, he emphasised the information
available in the visual environment to an active observer. He believed that
perception was direct, by which he meant that perception is not mediated by a
process of inference, and percepts are not constructed from sensations.
Gibson emphasised relations in the environment. Whereas the constructivists
argue size constancy requires us to scale the
retinal image by the viewing distance, Gibson argues we judge size in
relation to the amount of background texture covered by the object. Motion of
the observer gives rise to optic flow, which specifies how the observer is
moving in relation to the environment. Theories of direct perception, however,
do not provide very satisfactory explanations of visual illusions.

Gestalt
psychologists, such as Wertheimer, Koffka and Kohler, also rejected the
structuralist idea that perceptions are constructed from sensations. They
addressed the question "Why do things look as they do?" (Koffka).
They noted the spontaneous tendency to split scenes into figure and ground. They also studied the rules by
which material is grouped and segmented. The so-called laws of grouping
include good continuation, proximity, symmetry, similarity and common fate.
These laws may simply reflect the statistical regularities of the natural
visual environment - similar patterns normally arise from the same surface.
The core Gestalt idea, that the whole is greater than the sum of the parts,
emphasises relations between parts. For example, the melody of a tune is still
recognisable when it is played on different instruments. Kohler attempted
to explain perception through neural isomorphism, i.e. the idea that what we see reflects
isomorphic patterns in the brain. A good example of this kind of theorising
is Kohler's explanation of phi motion. If two spatially separated
lights flash on and off in sequence, one experiences continuous motion from
the first position to the second position. Kohler supposed that each flash
sets up an electric field in the brain and that the interaction of these fields
causes the perception of motion. Recently, there has been a resurgence of interest
in the difficult problems of grouping, segmentation and perceptual
constancy studied by the Gestalt school.

The computational approach

Computational psychologists, whose approach is well illustrated by the work of
David Marr, aim to understand
visual processes by building computer models of these processes. Vision is
seen as the process of forming a description of what is in the scene from the
retinal images. This process is sometimes referred to as inverse graphics.
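
A toy sketch may help fix the idea of a forward ("graphics") step and its inverse. It assumes a simple pinhole camera as the forward model, which is a simplification introduced here for illustration and not part of the original text; the function name `project` and the focal length are hypothetical. The forward step is easy to compute, but the inverse is underdetermined: points at different depths along the same line of sight produce the same image point.

```python
def project(point, focal_length=1.0):
    """Pinhole projection of a 3D point (x, y, z), with z > 0, onto the image plane.

    Illustrative assumption: the eye is modelled as an ideal pinhole at the
    origin looking along the z axis.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the pinhole (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two different scene points on the same line of sight:
near_point = (1.0, 2.0, 4.0)
far_point = (2.0, 4.0, 8.0)  # twice as far away, and twice as far off-axis

image_near = project(near_point)
image_far = project(far_point)

# Both points land on the same image location, so the image alone
# cannot tell us which of the two scenes produced it.
print(image_near, image_far)
```

Recovering the scene from the image therefore needs extra constraints or assumptions beyond the image itself, which is what makes the inverse problem hard.
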
From the starting point of a description of the geometry of a scene, the
reflectances of surfaces, the position of light sources and the position of a
viewer, it is possible to construct a realistic image of a scene. The task of
the visual system is to reverse this process and recover the scene properties
that caused the images on the retina. Computational vision aims to specify
mathematically how this is done and to assign a functional role to neural
components involved in this computation.

Reading:

Gordon, I.E. (1997) Theories of Visual Perception, John Wiley, Chichester.

Lindberg, D.C. (1976) Theories of Vision from Al-Kindi to Kepler, U. of Chicago Press.

Originally written by:
Prof. Alan Johnston
Division of Psychology and Language Sciences
University College London

Prof. Johnston's original version is available at: http://www.psychol.ucl.ac.uk/alan.johnston/Theories.html

The present version has been edited for typographical errors, and some of the hyperlinks
have been modified.