MAVRIC has four light detectors arrayed so as to provide directional information from bright light sources. We use battery-powered, radio-controlled light towers as sources of stimulation. MAVRIC also has a microphone with four narrow bandpass filters, used to simulate the sense of smell. Each of the four separate tones represents a different odor, and the volume of the tone represents the concentration of the odor in some gradient. Battery-powered, radio-controlled tone generators (sound towers) provide the source of stimulus. These towers can be placed arbitrarily in a large gymnasium.
The robot base is an off-the-shelf Pioneer I robot from ActivMedia (unfortunately a discontinued line!) which we have modified rather extensively. We do use the front sonar array, but only to simulate proximity detection, like the antennae of a snail (MAVRIC is meant to emulate the intelligence of a moronic snail!).
MAVRIC's task is to forage for resource objects (combinations of towers that it learns represent a resource) in a vast space in which the resource is sparsely and stochastically distributed. One of the four tones has been pre-designated as representing 'food'. MAVRIC learns that a combination of another, neutral tone and light is associated with the food tone. Thereafter, MAVRIC uses the light and neutral-tone cue to find the food.
Another tone has been pre-designated as representing poison. Thus MAVRIC must also learn to identify threats and avoid these. MAVRIC's world is thus composed of vast open space, cue signals (that it must learn to recognize) and both resources and threats. Its objective is to find a sufficient number of resources (under nonstationary distribution conditions) to maintain its energy level over time. Every time it encounters food it gets to feed and obtains a variable amount of energy. If it encounters poison it feels pain (causing an escape reaction) and suffers damage which leads to a reduction in energy. Also, just the act of moving in the world reduces its energy level, so it must find adequate food supplies or eventually it dies (energy == 0).
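The energy bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the rules as stated (feeding adds energy, poison and movement subtract it, zero means death); the class name, the starting level and the per-metre movement cost are hypothetical and are not taken from MAVRIC's software.

```python
from dataclasses import dataclass


@dataclass
class EnergyBudget:
    """Hypothetical energy account for a foraging agent."""
    energy: float = 100.0         # assumed starting level
    move_cost_per_m: float = 0.5  # assumed cost of travelling one metre

    @property
    def alive(self) -> bool:
        # The agent is dead once its energy reaches zero.
        return self.energy > 0.0

    def _spend(self, amount: float) -> None:
        # Energy never goes below zero.
        self.energy = max(0.0, self.energy - amount)

    def move(self, distance_m: float) -> None:
        # Merely moving through the world costs energy.
        self._spend(self.move_cost_per_m * distance_m)

    def feed(self, amount: float) -> None:
        # Food yields a variable amount of energy; a dead agent cannot feed.
        if self.alive:
            self.energy += amount

    def poisoned(self, damage: float) -> None:
        # Poison causes damage, modelled here as a direct energy loss.
        self._spend(damage)
```

A simulation loop would call `move` on every step and `feed` or `poisoned` on each encounter, stopping once `alive` is false.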
The operation of the towers is under computer control so that we can simulate an essentially infinite world for MAVRIC to search in. When it comes near the edge of a 20 m × 20 m area, the robot pauses while we 'scroll' the world past the robot. When it wakes up again, it proceeds into a new region of its world, even though the floor space is the same. From MAVRIC's point of view it must search over an infinite plain.
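One way to picture the scrolling is as a 20 m window sliding over a much larger virtual map of towers. The Python sketch below is an assumption-laden illustration, not a description of our tower-control software: the function names, the 1 m edge margin, and the choice to re-centre the window on the robot at each scroll are all hypothetical.

```python
ARENA_M = 20.0  # side length of the square floor area (from the text)


def near_edge(x, y, margin=1.0):
    """True when a floor position is within `margin` metres of any edge.

    The 1 m default margin is an assumed value.
    """
    return min(x, y, ARENA_M - x, ARENA_M - y) < margin


def scroll(origin, local_pos):
    """Shift the window so the robot's virtual position becomes its centre.

    `origin` is the virtual-world coordinate of the floor's (0, 0) corner.
    Returns the new origin and the robot's new floor position.
    Re-centring is an assumption made for this sketch.
    """
    ox, oy = origin
    x, y = local_pos
    half = ARENA_M / 2.0
    return (ox + x - half, oy + y - half), (half, half)


def towers_in_window(towers, origin):
    """Floor coordinates of the virtual towers that fall inside the window."""
    ox, oy = origin
    visible = []
    for tx, ty in towers:
        lx, ly = tx - ox, ty - oy
        if 0.0 <= lx <= ARENA_M and 0.0 <= ly <= ARENA_M:
            visible.append((lx, ly))
    return visible
```

After each scroll, the towers returned by `towers_in_window` are the ones that would need to be switched on at the listed floor positions.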
We are investigating the learning efficacy of a unique learning mechanism called the adaptrode, which encodes causal correlations between any number of 'neutral' signals (stimuli) and signals that matter (physiologically speaking) such as food odor or poison. Our hypothesis is that this form of learning is the basis of classical and operant conditioning and is fundamental to all other higher-order learning phenomena. In the current line of research we are interested in how this kind of learning increases the capacity of an agent to survive in a potentially hostile, realistic world. The latter aspect is covered by the notion of nonstationarity in environmental contingencies, an aspect that has received scant attention in agent research until now. We have speculated that the multi-time-scale learning attributes of the adaptrode mitigate the confounding nature of non-homogeneous nonstationarity in associative relations in environmental objects. MAVRIC will have to deal with changes in distribution density, and even in what signals portend (are cues to) food or poison, as these relations will be changed in a nonstationary manner over the course of MAVRIC's life. One of the more interesting things to look at is how well and how quickly MAVRIC adapts to abrupt changes in relations.
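To give a feel for what "multi-time-scale" learning means, the following Python sketch uses a toy two-level memory trace. It is explicitly not the adaptrode's equations: the function name, the two-level structure, the rate constants and the gating rule are all assumptions chosen for illustration. A fast trace follows the neutral stimulus, and a slow trace is raised toward the fast one only while a meaningful signal (the reinforcer) is present, so a paired cue leaves a lasting residue while an unpaired cue fades.

```python
def trace_step(levels, stimulus, reinforcer,
               rise=(0.5, 0.05), decay=(0.2, 0.01)):
    """Advance a toy two-level trace by one time step.

    levels     -- [fast, slow], each in the range 0..1
    stimulus   -- neutral cue strength, 0..1
    reinforcer -- meaningful signal (e.g. food), 0..1
    All rate constants are assumed values.
    """
    fast, slow = levels
    # Fast level: driven up by the stimulus, relaxes toward the slow level.
    new_fast = (fast + rise[0] * stimulus * (1.0 - fast)
                - decay[0] * (fast - slow))
    # Slow level: pulled toward the fast level only when reinforced,
    # and forgotten very slowly otherwise.
    new_slow = (slow + rise[1] * reinforcer * max(0.0, fast - slow)
                - decay[1] * slow)
    return [new_fast, new_slow]


def run(pairs, levels=None):
    """Apply a sequence of (stimulus, reinforcer) pairs to the trace."""
    levels = [0.0, 0.0] if levels is None else levels
    for stimulus, reinforcer in pairs:
        levels = trace_step(levels, stimulus, reinforcer)
    return levels
```

Running 100 paired steps followed by 50 silent steps leaves the fast level well above zero, whereas the same schedule without the reinforcer lets it decay almost completely.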
In the first set of experiments we were interested in emulating the search pattern used by many kinds of hunting animals as they forage for food. It is known from various ethological studies that animals do not use systematic search strategies when hunting in unfamiliar territory. Nor do they employ Brownian search (pure random motion, as is used in much robot foraging research). Rather, they use a form of quasi-random motion (see Foraging Search) which ensures that they will cover a larger area of the territory and not get caught in a local cycle. This form of search may be of general interest in computational processes such as data mining. We employ a simulated central pattern generator (CPG) network (as in the above reference) to generate a 1/f noise oscillation. This signal, when applied to the motor outputs of the robot, produces a 'drunken sailor' walk which provides considerable novelty but is not Brownian. A report on this movement behavior is forthcoming.
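The flavor of such a walk can be reproduced without a CPG network. The Python sketch below substitutes a different, well-known technique: a simple Voss-style generator that approximates 1/f noise by summing random sources refreshed at progressively slower rates. Feeding that noise into the turning rate is our assumption about how the signal might drive the motors; the function names, gains and step size are hypothetical.

```python
import math
import random


def pink_noise(n, octaves=8, seed=0):
    """Approximate 1/f noise by a Voss-style sum of random sources.

    Source k is redrawn every 2**k steps, so slow sources contribute
    long-lived drifts and fast sources contribute jitter.
    """
    rng = random.Random(seed)
    sources = [rng.uniform(-1.0, 1.0) for _ in range(octaves)]
    samples = []
    for i in range(n):
        for k in range(octaves):
            if i % (1 << k) == 0:
                sources[k] = rng.uniform(-1.0, 1.0)
        samples.append(sum(sources) / octaves)  # stays within -1..1
    return samples


def drunken_walk(n, speed=0.1, turn_gain=0.5, seed=0):
    """Integrate a path whose turning rate is driven by the noise.

    speed     -- distance covered per step (assumed units: metres)
    turn_gain -- radians of heading change per unit of noise (assumed)
    """
    x = y = heading = 0.0
    path = [(x, y)]
    for sample in pink_noise(n, seed=seed):
        heading += turn_gain * sample
        x += speed * math.cos(heading)
        y += speed * math.sin(heading)
        path.append((x, y))
    return path
```

Because the slow sources hold the heading bias for many steps, the path makes long sweeping arcs mixed with short wiggles, rather than the tight tangle produced by independent random turns.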
At the moment, then, MAVRIC knows how to explore its environment but has yet to learn to recognize cue events or stimuli which will cause it to enter a more directed search. That is the stage being implemented now. See MAVRIC's Brain for background on this step. That paper was based on an earlier, more primitive version of the robot.
For more details on the development of MAVRIC's brain architecture and current software implementation see:
Hogg, D. W., Martin, F. & Resnick, M. (1991). Braitenberg Creatures, MIT Media Laboratory.
Pfeifer, R. & Scheier, C. (1999). Understanding Intelligence, The MIT Press, Cambridge MA.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.