Neural Mechanisms for Reference Frame Transformations
As we have already described, most neurons in the early visuomotor pathways show receptive fields that are explicitly gaze-centered. And yet, this information must ultimately be used to control muscles that move effectors in other egocentric frames. From this, one can logically deduce that a reference frame transformation occurs between these levels. How is it done?
*Let's start from the classical view of transforming a target in eye coordinates to a target in head or space coordinates. Clearly, visual information must be compared to eye position in order to compute the output in this scheme. The most important advance in understanding how this might be done came from early work by Richard Andersen and colleagues. Andersen observed that neurons in LIP had gaze-centered receptive fields, but that these were modulated by eye position in a fashion he called a 'gain field'. This is easiest to imagine in one dimension (say horizontal). Suppose you plot the response of a neuron to visual targets along this dimension and find a mountain-shaped 'Gaussian' response, with a peak that drops off in either direction. Then suppose you retest the neuron with the eyes directed left and then right. An example of a gain field would be one in which the entire response scales down when the eye is left and scales up when the eye is right, as if the response (Y axis) were being multiplied by some gain factor of eye position. Since their original discovery, such gain fields have been observed throughout the visuomotor system, down to the level of the superior colliculus and even the brainstem reticular formation. They are not always eye position gain fields: as touched on previously, the gain factor can be a function of vergence angle, gaze (eye + head) angle, hand position, or other variables. Gain fields seem to be an almost universal way of implicitly encoding (or really multiplexing) information within neurons.
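To make the multiplicative idea concrete, here is a minimal one-dimensional sketch of a gain-field neuron in Python. The tuning width, gain slope, and other parameter values are illustrative choices, not fitted to any recorded cell:

```python
import numpy as np

def gain_field_response(target_pos, eye_pos,
                        pref_target=0.0, sigma=10.0,
                        base_gain=1.0, gain_slope=0.02):
    """Firing rate of an idealized gain-field neuron (1-D sketch).

    The Gaussian receptive field is anchored to target position in eye
    coordinates; eye position does not shift the peak, it only scales
    the whole response multiplicatively.
    """
    receptive_field = np.exp(-(target_pos - pref_target) ** 2 / (2 * sigma ** 2))
    gain = base_gain + gain_slope * eye_pos      # planar gain of eye position
    return np.maximum(gain, 0.0) * receptive_field

targets = np.linspace(-40, 40, 9)                # horizontal target positions (deg)
for eye in (-20.0, 0.0, 20.0):                   # eyes left, centered, right
    rates = gain_field_response(targets, eye)
    print(f"eye {eye:+5.1f} deg -> peak response {rates.max():.2f}")
```

Note that the peak location never moves; only the overall response amplitude changes with the eye, which is the signature of a gain field rather than a receptive field shift.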
After the discovery of gain fields, Zipser and Andersen trained a neural network to perform the reference frame transformation described above, and found that the artificial neurons in the intermediate stage of the model developed the same gain fields seen in parietal cortex. Moreover, they demonstrated how this works at the population level: by weighting some neurons more heavily than others (through the implicit gain field effect) and then summing their outputs downstream, the network can reproduce a shifted, explicit spatial code in the downstream neurons, and this shift can be arranged to convert eye coordinates into head coordinates. By inference, one could conclude that the same thing is happening in the real brain.
[*the preceding two paragraphs should likely come earlier, in the intro or spatial updating section, because we refer to gain fields throughout the course].
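A toy version of this kind of network can be built in a few lines of numpy. The sketch below is loosely modeled on the Zipser-Andersen scheme, reduced to one dimension; the layer sizes, input encoding, and training details are illustrative assumptions, not their original architecture. The probe at the end maps each hidden unit's visual response at several fixed eye positions, which is where gain-field-like modulation shows up:

```python
import numpy as np

rng = np.random.default_rng(1)
pref = np.linspace(-40, 40, 15)                    # preferred retinal positions (deg)

def encode_retina(r):
    """Topographic input layer: Gaussian tuning around each preferred position."""
    return np.exp(-(r[:, None] - pref[None, :]) ** 2 / (2 * 10.0 ** 2))

# Training set: head-centered target = retinal target + eye position (1-D).
n = 2000
retinal = rng.uniform(-40, 40, n)
eye = rng.uniform(-20, 20, n)
X = np.column_stack([encode_retina(retinal), eye / 20.0])
y = ((retinal + eye) / 60.0).reshape(-1, 1)        # desired head-centered output

# One hidden layer of sigmoid units, linear output, plain batch backprop.
W1 = rng.normal(0, 0.3, (16, 20)); b1 = np.zeros(20)
W2 = rng.normal(0, 0.3, (20, 1));  b2 = np.zeros(1)
lr = 0.1
for _ in range(5000):
    H = 1 / (1 + np.exp(-(X @ W1 + b1)))           # hidden activations
    err = (H @ W2 + b2) - y
    dH = err @ W2.T * H * (1 - H)                  # backprop through sigmoid
    W2 -= lr * (H.T @ err) / n; b2 -= lr * err.mean(0)
    W1 -= lr * (X.T @ dH) / n;  b1 -= lr * dH.mean(0)

# Probe hidden units the way Andersen probed LIP: map the visual response
# at several fixed eye positions and look for multiplicative gain scaling.
probe = np.linspace(-40, 40, 81)
for eye_fix in (-20.0, 0.0, 20.0):
    Xp = np.column_stack([encode_retina(probe), np.full((81, 1), eye_fix / 20.0)])
    h = 1 / (1 + np.exp(-(Xp @ W1 + b1)))
    print(f"eye {eye_fix:+5.0f} deg: hidden unit 0 peak response {h[:, 0].max():.3f}")
```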
This is the story, but some have questioned the role that gain fields play in physiology.
First, as described previously, the notion of transforming goals into head or space coordinates has fallen out of favor. Only in VIP has such coding been consistently reported, and VIP does not appear to be involved in controlling saccades or reach.
Second, many assumed that if one stuck with a vector displacement code, comparisons with eye and head position would no longer be required. However, as we saw in the last section, this is not mathematically correct once one considers the real geometry of motion.
How are Reference Frame Transformations Done in 3-D?
Fortunately, much of the story told above can still be preserved when we shift to 3-D. When neural networks are trained to transform a topographically encoded 2-D visual target in gaze-centered coordinates into a saccade command in a head-fixed 3-D coordinate system based on brainstem physiology, using the correct 3-D geometry, the intermediate units develop gaze-centered visual receptive fields with modulations very similar to gain fields. Likewise, when neural networks are trained to transform visual codes into a shoulder-centered motor code that satisfies the 3-D geometry discussed in the last section, the intermediate layers develop gaze-centered visual responses with eye and head position modulations similar to gain fields.
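To see why the 3-D version cannot be a simple vector addition, consider the geometry directly: the head-centered direction of a target is the retinal direction rotated by the full 3-D eye orientation, including torsion. The sketch below (with an arbitrary choice of coordinate axes and angles, purely for illustration) shows that the same retinal vector maps to different head-fixed directions depending on ocular torsion, even when the other eye position components are unchanged:

```python
import numpy as np

def rot(axis, angle_deg):
    """Rotation matrix about a unit axis (Rodrigues' formula)."""
    a = np.deg2rad(angle_deg)
    x, y, z = np.asarray(axis, dtype=float) / np.linalg.norm(axis)
    K = np.array([[0, -z, y], [z, 0, -x], [-y, x, 0]])
    return np.eye(3) + np.sin(a) * K + (1 - np.cos(a)) * (K @ K)

# Coordinate convention for this sketch: x = straight ahead, y = left, z = up.
# A retinal target direction 20 deg to the right of the fovea:
target_in_eye = rot([0, 0, 1], -20.0) @ np.array([1.0, 0.0, 0.0])

# Head-centered target direction = 3-D eye orientation applied to the retinal
# direction. Hold the vertical eye position at 15 deg and vary only torsion:
# the head-fixed direction of the SAME retinal vector changes, which is why
# a simple 'retinal + eye position' vector sum cannot work in 3-D.
for torsion in (0.0, 10.0):
    R_eye = rot([1, 0, 0], torsion) @ rot([0, 1, 0], 15.0)
    d_head = R_eye @ target_in_eye
    print(f"torsion {torsion:+5.1f} deg -> head-fixed direction {np.round(d_head, 3)}")
```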
However, some interesting new properties emerge in 3-D. First, the eye position gain fields must be modulated at least partially by eye position components orthogonal to the direction of visual coding (including the orthogonal 2-D component and torsion) in order to produce the non-linear transformations described in the last section, as sketched below. Otherwise, the gain field story is similar (Smith and Crawford; Blohm et al.).
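As a sketch of what such a unit might look like (all parameter values again purely illustrative), the response below is tuned to horizontal target position in eye coordinates, but its gain depends on the orthogonal eye position components: vertical eye position and torsion:

```python
import numpy as np

def gain_field_3d(target_h, eye_v, eye_t,
                  pref=0.0, sigma=10.0, g0=1.0, gv=0.02, gt=0.02):
    """Horizontal receptive field whose gain is modulated by eye position
    components ORTHOGONAL to the tuning axis: vertical position and torsion."""
    rf = np.exp(-(target_h - pref) ** 2 / (2 * sigma ** 2))
    gain = g0 + gv * eye_v + gt * eye_t
    return max(gain, 0.0) * rf

# Same horizontal target, same horizontal eye position; only the
# orthogonal components change, yet the response scales:
for eye_v, eye_t in ((0.0, 0.0), (15.0, 0.0), (0.0, 10.0)):
    r = gain_field_3d(5.0, eye_v, eye_t)
    print(f"vertical {eye_v:+5.1f}, torsion {eye_t:+5.1f} -> response {r:.3f}")
```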
Second, and perhaps most importantly, the same artificial units show different properties depending on how they are tested. Fundamentally, this is because different information is encoded at the afferent and efferent levels. This concept had been described previously in terms of tensor theory (Pellionisz), but the effects are more pervasive and critical when the complete 3-D geometry of the system is considered. At early levels, units show receptive fields organized in the reference frame of the sensory apparatus on the afferent input side (e.g., the retina), and this organization propagates even to some units late in the system. At intermediate-to-late levels, simulated microstimulation produces movements organized in the reference frame of the effector on the efferent output side. (Correlations of activity to movement parameters had intermediate characteristics)*. Importantly, at intermediate levels of the network, most units showed all of these properties simultaneously. Indeed, this is how it has to be: since receptive field mapping reveals the input frame, whereas motor tuning and especially microstimulation reveal the output frame, these measures should only agree when the input and output share the same frame and no transformation is occurring.
[*this is a simplification: many neurons also fit best with imaginary reference frames that are intermediate between the input and output frames].
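The logic of that last point can be captured in a few lines. In the hypothetical unit below, the afferent side carries an eye-centered receptive field while the efferent side drives a head-fixed movement; 'mapping' the receptive field and 'stimulating' the unit then disagree about its reference frame, precisely because a transformation passes through it. All names and numbers here are invented for illustration:

```python
import numpy as np

PREF_RETINAL = 10.0      # receptive-field center, eye coordinates (deg)
OUTPUT_VECTOR = 25.0     # movement driven by the efferent weights, head coords (deg)

def response(target_head, eye_pos):
    """Visual response: tuned to target position relative to the eye."""
    target_eye = target_head - eye_pos
    return np.exp(-(target_eye - PREF_RETINAL) ** 2 / (2 * 8.0 ** 2))

def microstimulate(eye_pos):
    """Evoked movement endpoint: fixed in head coordinates regardless of eye."""
    return OUTPUT_VECTOR

for eye in (-15.0, 0.0, 15.0):
    # Receptive-field mapping: the preferred HEAD-coordinate target shifts
    # with the eye, revealing an eye-centered (input-frame) code...
    best = max(np.linspace(-40, 40, 161), key=lambda t: response(t, eye))
    # ...while stimulation yields the same head-fixed endpoint every time,
    # revealing the head-centered (output-frame) code of the same unit.
    print(f"eye {eye:+5.1f}: RF peak at {best:+6.1f} (head), "
          f"stim endpoint {microstimulate(eye):+6.1f} (head)")
```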
These theoretical properties are consistent with known physiology, but the detailed predictions have largely gone untested, because a) most studies in this area continue to test linear 2-D models, and b) most labs are not equipped to test non-linear 3-D models. What has been tested is the 3-D geometry of head-free gaze shifts evoked from the SC, SEF, FEF, and LIP. From a classical perspective, small gaze shifts evoked from the SC look 'fixed vector', whereas very large gaze shifts (evoked from the posterior SC) look goal-directed. From a 3-D analysis, these are simply eye-fixed vectors. LIP follows the same model, but SEF and FEF show a mixture of responses ranging from eye-fixed to head-fixed to body-fixed, with FEF somewhat intermediate between SC and SEF. These data suggest that eye-centered coding of gaze shifts persists from PPC down to the SC (with perhaps a final transformation in the reticular formation), but that frontal cortex has the capacity for more complex reference frame transformations, for reasons currently unknown.