2006 Scientific Research and Experimental Development tax credit claim
T661 - Part 2 – Scientific or Technological Project Information
Step 1 - Detailed Project Description
Project Identification: code and name
Project Number – 1
Project Name – Adaptron
Project Type – Basic Scientific Research
Subject Area – Artificial Intelligence, Artificial Life, Simulation of Adaptive Behavior
A. Scientific or technological objectives
This scientific research aims to simulate human learning and thinking using an artificial neural network (ANN; see reference) for pattern recognition, integrated with a behavior network (see reference) for action selection. The resulting agent is called Adaptron. The objective is to devise an ANN that dynamically adds nodes as new experiences are acquired and prunes nodes (forgets them) as new learning replaces old. The research also aims to extend the ANN by connecting recognition events to action sequences.
Adaptron is intended to be general purpose. This means it should be able to learn how to function in any environment that produces quantized stimuli and obeys a set of deterministic rules whenever actions are performed.
Adaptron must begin with the ability to recognize simultaneously a primitive / non-reducible set of stimuli from several senses and the ability to produce a predefined set of primitive actions in parallel on several response devices. The objective is for it to first learn using novelty as a goal and by avoiding boredom. It must also be preprogrammed to recognize a subset of its stimuli as rewarding and another disjoint subset of its stimuli as punishing. With only these predetermined parameters Adaptron should learn to recognize combinations of the primitive stimuli from its environment and to perform primitive actions and combinations of primitive actions so as to minimize punishing stimuli and maximize rewarding stimuli.
Thus Adaptron must be able to “live” in an artificial environment in which it can sense stimuli and produce actions. The environment must be 100% deterministic: from any given initial state, a given action performed by Adaptron must always result in the same final state. All detectable dimensions of the environment must be 100% discrete; there are no continuously measurable quantities. The environment cannot change unless Adaptron performs an action, i.e. there are no other agents in the environment changing its state. The environment must produce rewarding and punishing stimuli. Designing Adaptron to live in a continuous, changing and noisy environment is a goal for subsequent research projects.
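The environmental constraints above can be illustrated with a minimal sketch in Python. It is not one of the project's actual test environments: the corridor layout, the stimulus names ("food", "wall", "empty"), the action names and the `step` function are all assumptions chosen for illustration. The sketch shows a world that is fully discrete, changes only when the agent acts, is deterministic, and emits disjoint sets of rewarding and punishing stimuli.

```python
from dataclasses import dataclass

# Hypothetical 1-D corridor world; every name here is an illustrative assumption.
ACTIONS = ("left", "right", "stay")
REWARDING = {"food"}   # stimuli preprogrammed as rewarding
PUNISHING = {"wall"}   # disjoint set of stimuli preprogrammed as punishing

SIZE = 5        # number of discrete cells in the corridor
FOOD_CELL = 4   # cell that produces the rewarding stimulus


@dataclass(frozen=True)
class State:
    position: int  # discrete cell index in the range 0..SIZE-1


def step(state: State, action: str) -> tuple[State, str]:
    """Deterministic transition: the same (state, action) always gives the same result."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    delta = {"left": -1, "right": 1, "stay": 0}[action]
    target = state.position + delta
    if target < 0 or target >= SIZE:
        return state, "wall"  # bumping into the boundary leaves the state unchanged
    stimulus = "food" if target == FOOD_CELL else "empty"
    return State(target), stimulus
```

Because `step` is a pure function of the current state and the chosen action, repeating an action from the same state always yields the same stimulus, and nothing in the world changes between the agent's actions.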
When the research has proven that the theories are correct and the software design is viable, Adaptron Inc. plans to promote the software for embedding in robots and control systems.
B. Technology or knowledge base or level
Existing ANNs are built from a fixed number of nodes, and the weights on the connections between these nodes are adjusted as the network is trained to recognize a set of input stimuli. Devising a self-organizing ANN that also grows as a means of learning, i.e. adds nodes as it encounters new stimuli, has not been attempted. Such an ANN could be used in an adaptive control system without having to reprogram it with a new set of nodes.
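The idea of a node store that grows on novel input and forgets unused nodes can be sketched as follows. This is a toy illustration only and not the Adaptron design: the class name `GrowingMemory`, the `max_idle` forgetting threshold and the use of one node per distinct set of simultaneous stimuli are all assumptions, and the sketch has no connection weights or hierarchy.

```python
class GrowingMemory:
    """Toy node store that grows on novel input and prunes stale nodes."""

    def __init__(self, max_idle: int = 3):
        self.max_idle = max_idle  # hypothetical forgetting threshold, in time steps
        self.last_seen = {}       # node (stimulus pattern) -> time step of last activation
        self.clock = 0

    def observe(self, stimuli) -> bool:
        """Present one set of simultaneous stimuli; return True if it was novel."""
        self.clock += 1
        pattern = frozenset(stimuli)
        novel = pattern not in self.last_seen
        self.last_seen[pattern] = self.clock  # add a new node or refresh an existing one
        self._prune()
        return novel

    def _prune(self) -> None:
        """Forget nodes that have not been activated within max_idle steps."""
        stale = [p for p, t in self.last_seen.items() if self.clock - t > self.max_idle]
        for p in stale:
            del self.last_seen[p]
```

In this sketch the number of nodes is never fixed in advance: it rises when unfamiliar stimuli arrive and falls when old patterns stop recurring, in contrast to a fixed-topology network whose only adjustable quantities are its weights.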
Existing behavior / action networks are designed to be general-purpose networks with learning rules embedded by the developer. They have not been integrated with ANNs, nor constrained by the topology and learning algorithms of ANNs. Successful integration of behavior networks with ANNs that grow should result in an adaptive system that can build increasingly complicated hierarchical behavior networks.
The area of robotics is a prime candidate for advancement through the use of Adaptron. Artificial Intelligence research into robots has been progressing for several decades and has achieved many successes. Many robots have been developed to perform specific tasks in very narrow environments, and some include limited learning ability. However, they cannot handle general-purpose situations, or if they can (e.g. Brooks’s robot Cog (see reference)), they do not have the ability to combine learnt behavior into more complicated behavior, i.e. they do not scale well. Current research in this field is best followed through the International Society for Adaptive Behavior (see reference). The scientific advancement that Adaptron aims to accomplish is general-purpose learning and thinking software that can be embedded in a robot such that it can learn all its knowledge from, and operate in, the environment in which it is placed.
C. Scientific or technological advancement
An ANN that grows hierarchically as it learns to recognize more complicated patterns of stimuli has not been invented. It is uncertain how the growth should be controlled. It is also unknown whether any node in an ANN can be used to trigger actions. With current fixed-node ANNs the actions would only be associated with final recognition nodes, not hidden nodes. With a growing ANN the actions will end up associated with hidden nodes. Strategies for using novelty, familiarity, punishment and reward as feedback in guiding the growth of the ANN must also be discovered.
Even more scientifically uncertain is how thinking can be introduced into the ANN. Various processes based on signaling between the nodes need to be invented and tested. These are based on the idea that thinking is a stream of expectations which effectively model experiences in a goal-directed fashion in order to predict the consequences of future actions.
This project falls within the fields of Artificial Intelligence, Artificial Life and Simulation of Adaptive Behavior. Results obtained from the field of Cognitive Science are also used as input. More specifically, the areas within Artificial Intelligence are Artificial Neural Networks (ANNs), in particular unsupervised learning in dynamic neural networks, and dynamic hierarchical Behavior Networks.
D. Description of work in the tax year
Adaptron’s success at learning and thinking is determined by observing its actions in test environments and by inspecting its internal memory traces and processes. The tasks performed this year were:
- Explored a new attention algorithm based on changing stimuli versus novelty.
- Tried randomly generated actions versus an ordered set performed sequentially.
- Changed the criteria for when a stimulus becomes permanent, i.e. no longer prompts a response.
- Incorporated a new emotional feeling (good/bad) scale and an algorithm for operant conditioning (learning and extinction).
Testing of Adaptron was done in artificial environments simulated in software.
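The operant conditioning task listed above (learning and extinction on a good/bad feeling scale) can be illustrated with a minimal sketch. It is not the algorithm used in Adaptron: the function name `update_feeling`, the scale from -1 (bad) to +1 (good), the outcome labels and the fixed learning rate are all assumptions, and the update is an ordinary exponential moving average chosen for simplicity.

```python
def update_feeling(value: float, outcome: str, rate: float = 0.5) -> float:
    """Move a good/bad feeling value in [-1, 1] toward the target for this outcome."""
    # Hypothetical targets: reward pulls toward +1, punishment toward -1,
    # and a neutral outcome pulls back toward 0 (extinction).
    targets = {"reward": 1.0, "punish": -1.0, "neutral": 0.0}
    target = targets[outcome]
    return value + rate * (target - value)


# Learning: two rewarded trials raise the feeling attached to an action.
feeling = 0.0
feeling = update_feeling(feeling, "reward")   # 0.5
feeling = update_feeling(feeling, "reward")   # 0.75
# Extinction: an unrewarded trial lets the feeling decay back toward zero.
feeling = update_feeling(feeling, "neutral")  # 0.375
```

Under these assumptions a single rule covers both phases: the feeling strengthens while the outcome keeps recurring and fades once it stops.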
E. Supporting Information
Research notes in a new notebook started 11 July 2005:
- Ideas for recognition based on the changes in stimuli values rather than their absolute values.
- A design for generalization / specialization based on simultaneous stimuli.
- Various feeling scales for good/bad and interesting/uninteresting.
- New design for a hierarchical structure to store and recognize simultaneous stimuli from several senses.
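The first idea in the list above, recognition based on changes in stimulus values rather than their absolute values, can be sketched as follows. The function name `stimulus_changes` and the integer readings are assumptions made for illustration; the notebook's actual design is not reproduced here.

```python
def stimulus_changes(readings):
    """Return the step-to-step changes in a sequence of discrete stimulus values."""
    return [curr - prev for prev, curr in zip(readings, readings[1:])]


# Two sequences with different absolute values but the same pattern of change.
low = [3, 4, 4, 2]
high = [10, 11, 11, 9]
same_pattern = stimulus_changes(low) == stimulus_changes(high)  # both give [1, 0, -2]
```

Encoding the input this way lets one stored pattern match the same sequence of changes at any absolute level, at the cost of discarding the absolute values themselves.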
The versions of the Adaptron software that were developed are named:
- Learn9 – 8 versions in total
- Habits1 through Habits3 – 25 versions in total
Fourteen screen captures of memory dumps from test runs of Adaptron have been kept.
A logbook is kept of the daily experiments performed and the time spent doing research by the specified employee.
Artificial Neural Networks:
http://www.dacs.dtic.mil/techs/neural/neural_ToC.html [website no longer active]
http://www-robotics.usc.edu/~monica/Research/Control/ctrldata.html [website no longer active]
International Society for Adaptive Behavior