Wednesday, October 22, 2003

12 noon

Redwood Neuroscience Institute

 

Title: "Compositional memory for recognition of complex objects: A proposal"

 

Ross Gayler

Melbourne, AUSTRALIA

 

Abstract:

The problem of extracting an invariant representation from perceptual inputs has long been recognised (e.g. Lashley, 1942).  More recently, various proposals have been made and implemented for recurrent connectionist systems that simultaneously settle on a mapping and retrieve an item from memory (Arathorn, 2002; Hinton, 1981; Olshausen, Anderson, & Van Essen, 1993).  The mapping (which captures the variant aspect of the input) transforms the input into the cue that retrieves the item (which is the invariant representation) from memory.

Coming from a completely different direction, I have been developing a connectionist memory architecture to support high level cognition (Gayler, 2000).  Surprisingly, this architecture is also based on simultaneous transformation and recognition and is abstractly isomorphic to the perceptual invariance architectures.  The similarity between the perceptual and cognitive architectures suggests that there may be a fundamental unity between them.  The difference lies in the details; the perceptual architectures use localist representations and a fixed palette of geometric transformations, while the cognitive architecture uses distributed connectionist representations capable of representing recursive structures, and transformations that are arbitrary structural substitutions.  I propose that the architectures could be unified and devote the remainder of this presentation to exploring how this may enable the recognition of composite objects.

The perceptual architectures mentioned earlier recognise a single item at a time.  They can be persuaded to attend to multiple items serially, but they do not allow for representation of the relations between items.  These architectures do represent the relations between the elements (pixels or feature vectors) within an item, but these relations are fixed.  Each item is recognised holistically and treated as atomic (having no internal compositional structure).  Thus, multi-level composite items cannot be represented.

The representational advantage offered by the distributed approach is that transformations are “first-class” entities, having the same status as the content mapped by the transformations.  This means that representations of transformations can be included in the representations of objects.  In particular, two serially fixated items and the attentional transformation between the fixations could be represented on the same set of connectionist units used to represent just one item.  Thus, it should be possible to represent complex entities as a network of components with transformations between them.  This leads naturally to graph structures as representations of objects – a common choice in computer vision systems.
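
As a rough illustration of the point above, the following sketch stores two items and the transformation between them on a single set of units. It is a minimal sketch under stated assumptions, not the architecture presented in the talk: the use of random bipolar vectors, elementwise multiplication as the binding operation, the dimensionality DIM, and all names (eye, mouth, shift_down, FIRST, SECOND, SHIFT) are hypothetical choices made for this example only.

```python
import random

DIM = 2048  # assumed dimensionality, chosen only for illustration


def random_vector(rng):
    """Draw a random bipolar (+1/-1) vector of length DIM."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(u, v):
    """Bind two vectors by elementwise multiplication (self-inverse for +1/-1)."""
    return [a * b for a, b in zip(u, v)]


def superpose(*vectors):
    """Superpose vectors by elementwise addition on the same set of units."""
    return [sum(components) for components in zip(*vectors)]


def similarity(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sum(a * a for a in u) ** 0.5
    norm_v = sum(b * b for b in v) ** 0.5
    return dot / (norm_u * norm_v)


rng = random.Random(0)
# Hypothetical codebook: two items, one transformation, and three role markers.
names = ("eye", "mouth", "shift_down", "FIRST", "SECOND", "SHIFT")
codebook = {name: random_vector(rng) for name in names}

# One vector of length DIM holds both items and the transformation between them.
composite = superpose(
    bind(codebook["FIRST"], codebook["eye"]),
    bind(codebook["SECOND"], codebook["mouth"]),
    bind(codebook["SHIFT"], codebook["shift_down"]),
)

# Unbinding with the SHIFT role yields a noisy copy of the stored transformation.
probe = bind(composite, codebook["SHIFT"])
fillers = ("eye", "mouth", "shift_down")
recovered = max(fillers, key=lambda name: similarity(probe, codebook[name]))
print(recovered)
```

The transformation is stored and retrieved in exactly the same way as the items, which is the sense in which it is "first-class" here. The composite occupies the same number of units as a single item, and the probe should match shift_down far more closely than either item.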

The process advantage of such an approach is that it should be possible to build a connectionist memory that simultaneously recalls multiple items while settling on mappings between them.  These mappings would serve to unify the retrieved items into a representation of a novel composite object.  Memory systems of this sort should be able to recognise novel compositions of familiar components as readily as they recognise the components themselves.  The distributed connectionist implementation of this recognition process can be construed as an indirect implementation of Pelillo's (1999) approximate graph matching via replicator equations, by embedding his algorithm in a fixed high-dimensional vector space.
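
For readers unfamiliar with replicator equations, the sketch below shows the discrete-time replicator update used to find a maximal clique in a small graph. It is a plain localist illustration under stated assumptions, not the distributed implementation proposed in the talk. The four-node example graph, the function name replicator_weights, the step count and the threshold are all hypothetical. In Pelillo's formulation the graph would be the association graph built from the two graphs being matched, so that a clique corresponds to a mutually consistent set of node pairings.

```python
def replicator_weights(adjacency, steps=200):
    """Iterate x_i <- x_i * (A x)_i / (x . A x), starting from uniform weights."""
    n = len(adjacency)
    x = [1.0 / n] * n
    for _ in range(steps):
        payoff = [sum(adjacency[i][j] * x[j] for j in range(n)) for i in range(n)]
        mean_payoff = sum(x[i] * payoff[i] for i in range(n))
        if mean_payoff == 0:
            break  # graph without edges: nothing to update
        x = [x[i] * payoff[i] / mean_payoff for i in range(n)]
    return x


# Hypothetical example graph: a triangle on nodes 0, 1, 2 plus node 3 attached to node 2.
adjacency = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

weights = replicator_weights(adjacency)
# Nodes that retain substantial weight form the clique; the threshold is an assumption.
clique = [i for i, w in enumerate(weights) if w > 0.1]
print(clique)
```

In this example the weight should concentrate evenly on the triangle while the weight of node 3 decays towards zero. The update needs only multiplications and a normalisation, which is part of what makes it a natural candidate for a connectionist implementation.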

Arathorn, D. W. (2002). Map-seeking circuits in visual cognition: A computational mechanism for biological and machine vision. Stanford, CA, USA: Stanford University Press.

Gayler, R. W. (2000). Multiplicative Binding, Representation Operators & Analogical Inference. Presented at Cognitive Science Conference. Melbourne, Australia.

Hinton, G. E. (1981). A parallel computation that assigns canonical object-based frames of reference. Proceedings of the Seventh International Joint Conference on Artificial Intelligence, Vol. 2. Vancouver BC, Canada.

Lashley, K. S. (1942). The problem of cerebral organization in vision. Biological Symposia, 7, 301-322.

Olshausen, B. A., Anderson, C. H., & Van Essen, D. C. (1993). A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information. The Journal of Neuroscience, 13, 4700-4719.

Pelillo, M. (1999). Replicator equations, maximal cliques, and graph isomorphism. Neural Computation, 11, 1933-1955.