Hack 13. Understand Visual Processing

The visual system is a complex network of modules and pathways, all specializing in different tasks to contribute to our eventual impression of the world.

When we talk about "visual processing," the natural mode of thinking is of a fairly self-contained process. In this model, the eye would be like a video camera, capturing a sequence of photographs of whatever the head happens to be looking at at the time and sending these to the brain to be processed. After "processing" (whatever that might be), the brain would add the photographs to the rest of the intelligence it has gathered about the world around it and decide where to turn the head next. And so the routine would begin again. If the brain were a computer, this neat encapsulation would be how the visual subsystem would probably work.

With that (admittedly, straw man) example in mind, we'll take a tour of vision that shows just how nonsequential it all really is.

And one need go no further than the very idea of the eyes as passive receptors of photograph-like images to find the first fault in the straw man. Vision starts with the entire body: we walk around, and move our eyes and head, to capture depth information [Hack #22] like parallax and more. Some of these decisions about how to move are made early in visual processing, often before any object recognition or conscious understanding has come into play.

This pattern of vision as an interactive process, including many feedback loops before processing has reached conscious perception, is a common one. It's true there's a progression from raw to processed visual signal, but it's a mixed-up, messy kind of progression. Processing takes time, and there's a definite incentive for the brain to make use of information as soon as it's been extracted; there's no time to wait for processing to "complete" before using the extracted information. All it takes is a rapidly growing dark patch in our visual field to make us flinch involuntarily [Hack #32], as if something were looming over us. That's an example of an effect that occurs early in visual processing.
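As a rough illustration of acting on information the moment it becomes available, here is a minimal sketch of a looming detector. Everything in it is an assumption made for the example: the name `is_looming`, the idea of measuring the dark patch as an area per frame, and the growth threshold of 1.5. It is a toy analogy, not a model of what the brain actually computes.

```python
def is_looming(dark_area_per_frame, growth_threshold=1.5):
    """Return True as soon as the dark patch grows fast enough between two frames.

    dark_area_per_frame: hypothetical measurements of the dark patch's area,
    one per frame. growth_threshold is an illustrative value, not a measured one.
    """
    pairs = zip(dark_area_per_frame, dark_area_per_frame[1:])
    for previous, current in pairs:
        # React immediately; do not wait to examine the remaining frames.
        if previous > 0 and current / previous >= growth_threshold:
            return True
    return False


steady = is_looming([10, 11, 12])      # slow growth
approaching = is_looming([10, 12, 30])  # sudden growth between the last two frames
```

The early `return` is the point of the sketch: the decision to flinch is made without waiting for the rest of the input to be examined.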

But let's look not at the mechanisms of the early visual system, but how it's used. What are the endpoints of all this processing? By the time perception reaches consciousness, another world has been layered on top of it. Instead of seeing colors, shapes, and changes over time (all that's really available to the eyes), we see whole objects. We see depth, and we have a sense of when things are moving. Some objects seem to stand out as we pay attention to them, and others recede into the background. Consciously, we see both the world and the assembled result of the processing the brain has performed, in order to work around constraints (such as the eyes' blind spot [Hack #16]), and to give us a head start in reacting with best-guess assumptions. The hacks in this chapter run the whole production line of visual processing, using visual illusions and anomalies to point out some detail of how vision works.

But before diving straight into all that, it's useful to have an overview of what's actually meant by the visual system. We'll start at the eye, see how signals from there go almost directly to the primary visual cortex on the back of the brain, and from there are distributed in two major streams. After that, visual information distributes and merges with the general functions of the cortex itself.

2.2.1. Start at the Retina

In a sense, light landing on the retina (the sensory surface at the back of the eye) is already inside the brain. The whole central nervous system (the brain and spinal cord [Hack #7]) is contained within a number of membranes, the outermost of which is called the dura mater. The white of your eye, the surface that protects the eye itself, is a continuation of this membrane, meaning the eye is inside the same sac. It's as if two parts of your brain had decided to bulge out of your head and become your eyes, but without becoming separate organs.

The retina is a surface of cells at the back of your eye, containing a layer of photoreceptors, cells that detect light and convert it to electrical signals. For most of the eye, signals are aggregated: a hundred photoreceptors will pass their signal onto a single cell further along in the chain. In the center of the eye, a place called the fovea, there is no such signal compression. (The population density of photoreceptors changes considerably across the retina [Hack #14].) The resolution at the fovea is as high as it can be, with cells packed in, and the uncompressed signal dispatched, along with all the other information from other cells, down the optic nerve. The optic nerve is a bundle of projections from the neurons that sit behind the photoreceptors in the retina, carrying electrical information toward the brain, the path of information out of the eye. The size of the optic nerve is such that it creates a hole in our field of vision, as photoreceptors can't sit over the spot where it quits the eyeball (that's what's referred to as the blind spot [Hack #16]).
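To make the idea of signal compression concrete, here is a minimal sketch that treats the retina as a one-dimensional strip of photoreceptor values. The function name `pool_retina`, the use of simple averaging, and the fovea boundaries are all assumptions chosen for illustration; real retinal circuitry is far more involved.

```python
def pool_retina(photoreceptors, fovea_start, fovea_end, pool_size=100):
    """Toy 1-D retina: compress the periphery, pass the fovea through untouched.

    photoreceptors: list of light readings, one per (hypothetical) cell.
    fovea_start, fovea_end: slice bounds of the foveal region (assumed values).
    pool_size: how many peripheral cells feed one downstream cell.
    """
    def pooled(cells):
        out = []
        for i in range(0, len(cells), pool_size):
            group = cells[i:i + pool_size]
            out.append(sum(group) / len(group))  # many cells -> one signal
        return out

    left = photoreceptors[:fovea_start]
    fovea = photoreceptors[fovea_start:fovea_end]
    right = photoreceptors[fovea_end:]
    # Foveal cells keep a one-to-one line to the next stage.
    return pooled(left) + list(fovea) + pooled(right)


signals = pool_retina([1.0] * 1000, fovea_start=400, fovea_end=600)
```

With these made-up numbers, 800 peripheral cells collapse into 8 signals while the 200 foveal cells each keep their own, so `signals` has 208 entries, most of them describing the small central region.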

2.2.2. Behind the Eyes

Just behind the eyes, in the middle, the optic nerves from each eye meet, split, and recombine in a new fashion, at the optic chiasm. The signals for the right half of the visual field, from both eyes, are dispatched to the left side of the brain, and vice versa (from here on, the two hemispheres of the brain are mirror images of each other). It seems a little odd to divide processing directly down the center of the visual field, rather than by eye, but this allows a single side of the brain to compare the same scene as observed by both eyes, which it needs to get access to depth information.
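The regrouping at the optic chiasm can be sketched as a simple routing step. In this toy version, each eye's view is a list of samples ordered from left to right across the scene; the function name `route_at_chiasm` and the dictionary layout are assumptions for the example only.

```python
def route_at_chiasm(left_eye_view, right_eye_view):
    """Regroup two eyes' views by half of the scene rather than by eye.

    Each view is a list of samples ordered left-to-right across the scene.
    The right half of each view goes to the left hemisphere, and vice versa.
    """
    def halves(view):
        middle = len(view) // 2
        return view[:middle], view[middle:]

    left_eye_lhs, left_eye_rhs = halves(left_eye_view)
    right_eye_lhs, right_eye_rhs = halves(right_eye_view)
    return {
        "left_hemisphere": {"left_eye": left_eye_rhs, "right_eye": right_eye_rhs},
        "right_hemisphere": {"left_eye": left_eye_lhs, "right_eye": right_eye_lhs},
    }


routed = route_at_chiasm(["L0", "L1", "L2", "L3"], ["R0", "R1", "R2", "R3"])
```

Each hemisphere ends up holding the same half of the scene as seen by both eyes, which is the arrangement needed for comparing the two views.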

The route plan now is a dash from the optic chiasm right to the back of the brain, to reach the visual cortex, which is where the real work starts happening. Along the way, there's a single pit stop at a small region buried deep within the brain called the lateral geniculate nucleus, or LGN (there's one of these in each hemisphere, of course).

Already, this is where it gets a little messy. Not every signal that passes through the optic chiasm goes to the visual cortex. Some go to the superior colliculus, which is like an emergency visual system. Sitting in the midbrain, it helps with decisions on head and eye orienting. The midbrain is an evolutionarily ancient part of the brain, involved with more basic responses than the cortex and forebrain, which are both better developed in humans. (See [Hack #7] for a quick tour.) So it looks as if this region is all low-level functioning. But also, confusingly, the superior colliculus influences high-level functions, as when it suddenly pushes urgent visual signals into conscious awareness [Hack #37].


Actually, the LGN isn't a simple relay station. It deals almost entirely with optical information, all 1.5 million cells of it. But it also takes input from areas of the brain that deal with what you're paying attention to, as well as from the cortex in general, and mixes that in too. Before visual features have even been extracted from the raw visual information, sophisticated input from elsewhere is being added; we're not really sure what's happening here.

There's another division of the visual signal here, too. The LGN has processing pathways for two separate signals: coarse, low-resolution data (lacking in color) goes into the magnocellular pathway. High-resolution information goes along the parvocellular pathway. Although there are many subsequent crossovers, this division remains throughout the visual system.
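One way to picture the two pathways is as two copies of the same image, one blurred and colorless, one left at full detail. The sketch below is only that picture turned into code: the function name `split_pathways`, the block size, and the grayscale averaging are illustrative assumptions, and it ignores everything else that distinguishes the real pathways.

```python
def split_pathways(image, block=2):
    """Split an image into a coarse colorless copy and a fine color copy.

    image: rows of (r, g, b) tuples.
    Returns (coarse, fine): coarse is grayscale and averaged over
    block x block patches; fine is the untouched full-resolution image.
    """
    gray = [[sum(pixel) / 3 for pixel in row] for row in image]
    height, width = len(gray), len(gray[0])

    coarse = []
    for y in range(0, height, block):
        coarse_row = []
        for x in range(0, width, block):
            patch = [gray[yy][xx]
                     for yy in range(y, min(y + block, height))
                     for xx in range(x, min(x + block, width))]
            coarse_row.append(sum(patch) / len(patch))
        coarse.append(coarse_row)

    fine = [list(row) for row in image]
    return coarse, fine


image = [[(30, 60, 90)] * 4 for _ in range(4)]
coarse, fine = split_pathways(image)
```

The coarse copy is a quarter of the size and carries no color, so it is cheap to pass along; the fine copy keeps everything.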

2.2.3. Enter the Visual Cortex

From the LGN, the signals are sent directly to the visual cortex. At the lower back of the cerebrum (so about a third of the way up your brain, on the back of your head, and toward the middle) is an area of the cortex called either the striate or primary visual cortex. It's called "striate" simply because it contains a dark stripe when closely examined.

Why the stripes? The primary visual cortex is literally six layers of cells, with a thicker and subdivided layer four where the two different pathways from the LGN land. These projections from the LGN create the dark band that gives the striate cortex its name. As visual information moves through this region, cells in all six layers play a role in extracting different features. It's way more complex than the LGN: the striate cortex contains about 200 million cells.

The first batch of processing takes place in a module called V1. V1 holds a map of the retina as source material, which looks more or less like the area of the eye it's dealing with, only distorted. The part of the map that represents the fovea (the high-resolution center of the eye) is all out of proportion because of the number of cells dedicated to it. It's as large as the rest of the map put together.

Physically standing on top of this map are what are called hypercolumns. A hypercolumn is a stack of cells performing processing that sits on top of an individual location and extracts basic information. So some neurons will become active when they see a particular color, others when they see a line segment at a particular angle, and other more complex ones when they see lines at certain angles moving in particular directions. This first map and its associated hypercolumns constitute the area V1 (V for "vision"); it performs really simple feature extraction.
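Here is a minimal sketch of a stack of orientation-tuned cells sitting over one location. The names `orientation_response` and `hypercolumn`, the four preferred angles, and the cosine-shaped tuning curve are all assumptions picked to keep the example short; they are not measurements of real neurons.

```python
import math


def orientation_response(stimulus_deg, preferred_deg):
    """Response of one hypothetical cell to a line at stimulus_deg.

    Orientation repeats every 180 degrees, so the angle difference is
    doubled before taking the cosine. Negative values are clipped to zero,
    since a cell cannot fire less than not at all.
    """
    delta = math.radians(stimulus_deg - preferred_deg)
    return max(0.0, math.cos(2 * delta))


def hypercolumn(stimulus_deg, preferred_angles=(0, 45, 90, 135)):
    """Responses of a stack of cells that all watch the same location."""
    return {preferred: orientation_response(stimulus_deg, preferred)
            for preferred in preferred_angles}


responses = hypercolumn(45)
most_active = max(responses, key=responses.get)
```

A line at 45 degrees drives the 45-degree cell hardest and leaves the 135-degree cell silent, so the pattern of activity across the stack says what orientation is present at that spot on the map.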

The subsequent visual processing areas named V2 and V3 (again, V for "vision," the number just denotes order), also in the visual cortex, are similar. Information gets bumped from V1 to V2 by dumping it into V2's own map, which acts as the center for its batch of processing. V3 follows the same pattern: at the end of each stage, the map is recombined and passed on.

2.2.4. "What" and "Where" Processing Streams

So far visual processing has been mostly linear. There is feedback (the LGN gets information from elsewhere on the cortex, for example) and there are crossovers, but mostly the coarse and fine visual pathways have been processed separately and there's been a reasonably steady progression from the eye to the primary visual cortex.

From V3, visual information is sent to dozens of areas all over the cortex. These modules send information to one another and draw from and feed other areas. It stops being a production line and turns into a big construction site, with many areas extracting and associating different features, all simultaneously.

There's still a broad distinction between the two pathways though. The coarse visual information, the magnocellular pathway, flows up to the top of the head. It's called the dorsal stream, or, more memorably, the "where" stream. From here on, there are modules to spot motion and to look for broad features.

The fine detail of vision from the parvocellular pathway comes out of the primary visual cortex and flows down the ventral stream, the "what" stream. The destination for this stream is the inferior temporal lobe, the underside of the cerebrum, above and behind the eyes.

As the name suggests, the "what" stream is all about object recognition. On the way to the temporal lobe, there's a stop-off for a little further processing at a unit called the lateral occipital complex (LOC). What happens here is key to what'll happen at the final destination points of the "what" stream. The LOC looks for similarity in color and orientation and groups parts of the visual map together into objects, separating them from the background.
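Grouping by similarity can be sketched as finding connected patches of matching values on a grid. The version below uses a plain flood fill over a grid of characters; the function name `group_regions` and the choice of exact equality as the similarity test are assumptions for the example, and nothing here is a claim about how the LOC does it.

```python
def group_regions(grid):
    """Group neighboring cells with the same value into regions.

    grid: list of equal-length strings (or lists); each character stands
    for a color or orientation label. Returns a list of sets of
    (row, column) positions, one set per region.
    """
    rows, cols = len(grid), len(grid[0])
    seen = set()
    regions = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen:
                continue
            value = grid[r][c]
            region = set()
            stack = [(r, c)]
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                region.add((y, x))
                # Visit the four direct neighbors that share the same label.
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and grid[ny][nx] == value):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            regions.append(region)
    return regions


scene = ["....",
         ".##.",
         ".##.",
         "...."]
regions = group_regions(scene)
```

For this scene the result is two regions: a four-cell square and the twelve-cell background around it, which is the object-versus-background split in miniature.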

Later on, these objects will be recognized as faces or whatever else. This represents a common method: the visual information is processed to look for features. When found, information about those features is added to the pool of data, and the whole lot is sent on.

2.2.5. Processing with Built-in Assumptions

The wiring diagram for all the subsequent motion detection and object recognition modules is enormously complex. After basic feature extraction, there's still number judgment, following moving objects, and spotting biological motion [Hack #77] to be done. At a certain point, the defining characteristic of the cortex as a whole must come into play, and visual information is processed enough to be associated with memory, language, and reading emotions. This is where it blends in to the higher-order functions of the whole brain.

In the hacks that follow, we'll explore the effects of early and late visual processing. A common thread through these effects will be the assumptions the visual system has made about the visual world to expedite its computation, and by looking at the quirks of vision, we can draw some of these out. These include the assumption that the visual world remains relatively stable from second to second (so we don't notice if it doesn't [Hack #40]) and the assumption that dark areas are shadows, which is the quirk that makeup takes advantage of [Hack #20].

In a sense, the fact that we can observe these assumptions suggests that the visual system assumes as much about the external environment as about its own modules. The visual system's expectation that the motion module will report motion correctly (and therefore our confusion when the module doesn't identify motion correctly [Hack #25] ) is much the same as the visual system's expectation that a shadow is reporting 3D shape correctly. While we could think of the visual system as entirely in the brain, really we should include the eyes, the head, the body, and the environment as components in this big, messy, densely connected human visual processing system, all of which report their conclusions into the mix.

And somehow, in all of this, the visual perception we know and love springs into existence. There doesn't seem to be a single place where all this visual processing is reassembled, no internal television screen that we watch (and even if there were, who would watch it?). It's distributed over the whole visual system, and over the environment too. Not just a picture at the retina, after all.



    Mind Hacks. Tips and Tools for Using Your Brain
    ISBN: 596007795
    EAN: N/A
    Year: 2004
    Pages: 159
