Roadmap

This is a living document in the early stages of construction.

Overview

The basic strategy is to first build a functioning system that can take actions, using Reinforcement Learning (RL) with a minimal reward function, and then to elaborate the reward function and supporting networks to enable more sophisticated actions and the perceptions they depend on.

VOR Reflex

When you move your head, for example left-to-right or up and down, your eyes move in the opposite direction to keep the image stable. Several mechanisms contribute to this; an important one is the Vestibulo-Ocular Reflex (VOR), a feedback system from the vestibular system in the inner ears (which provides your sense of balance) to the ocular muscles. In addition, there is feedback from vision itself.
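
As a rough illustration of the idea above, the following minimal sketch models the VOR as an eye velocity command that opposes the measured head velocity, and treats the leftover image motion (retinal slip) as the quantity that visual feedback would observe. The function names, the single-axis simplification and the gain parameter are assumptions made for illustration, not part of the planned design.

```python
def vor_eye_velocity(head_velocity, gain=1.0):
    """Eye velocity command (deg/s) that opposes the head velocity.

    A gain of 1.0 cancels the head motion exactly; the gain is a
    hypothetical tuning parameter for this sketch.
    """
    return -gain * head_velocity


def retinal_slip(head_velocity, eye_velocity):
    """Residual image motion (deg/s); zero means a stable image."""
    return head_velocity + eye_velocity


head = 30.0                              # head turning at 30 deg/s
eye = vor_eye_velocity(head, gain=0.9)   # slightly under-compensating reflex
slip = retinal_slip(head, eye)           # what visual feedback would need to correct
```

With a gain below 1.0 some slip remains, which is the error that the additional feedback from vision could be used to reduce.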

Research visual image stabilization in animals
- Which brain regions are involved and what do they do (VOR and visual feedback)
Hardware
- Initially, use Raspberry Pi and Pi Camera
- Find a fast pan/tilt[/yaw] unit (possibly look at Quadcopter GoPro/stabilization units)
- Find hardware to measure gravity direction and 3-axis angular velocity (e.g. an Arduino BNO055 shield, also available via AliExpress)
- Mounts for camera, servos/motors, sensors and optionally controller
Construct a reward function for stable vision
Implement RL via Q-learning and an ANN for learning the action-value function
Determine whether the ANN can be trained in real time, or whether an alternative training data set or a faster hardware implementation of the ANN is required
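
The reward function and Q-learning steps above can be sketched on a toy problem. This is only an illustration under stated assumptions: head and eye velocities are discretized into five hypothetical levels, the reward is the negative retinal slip, and a lookup table stands in for the ANN that the roadmap plans to use for the action-value function. None of these choices are fixed by the roadmap.

```python
import random

HEAD_VELOCITIES = [-2, -1, 0, 1, 2]  # discretized gyro readings (hypothetical units)
EYE_VELOCITIES = [-2, -1, 0, 1, 2]   # discretized pan commands (hypothetical units)


def stability_reward(head_velocity, eye_velocity):
    """Zero when the eye exactly cancels the head motion, negative otherwise."""
    return -abs(head_velocity + eye_velocity)


def train(steps=5000, alpha=0.1, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning; the table is a stand-in for the planned ANN."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in HEAD_VELOCITIES for a in EYE_VELOCITIES}
    state = rng.choice(HEAD_VELOCITIES)
    for _ in range(steps):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = rng.choice(EYE_VELOCITIES)
        else:
            action = max(EYE_VELOCITIES, key=lambda a: q[(state, a)])
        reward = stability_reward(state, action)
        # In this toy model the next head velocity does not depend on the eye.
        next_state = rng.choice(HEAD_VELOCITIES)
        best_next = max(q[(next_state, a)] for a in EYE_VELOCITIES)
        # Standard Q-learning update.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q


def greedy_policy(q):
    """Best eye command for each head velocity according to the learned table."""
    return {s: max(EYE_VELOCITIES, key=lambda a: q[(s, a)]) for s in HEAD_VELOCITIES}


q_table = train()
policy = greedy_policy(q_table)
```

In this toy setting the greedy policy learns to command the eye velocity opposite to the head velocity. Whether the same holds on real hardware with an ANN, and how fast it can be trained, is the open question in the last item above.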

Orienting reflexes

One building block of attention and consciousness is the ability of some senses to ‘grab’ attention by automatically orienting to particular stimuli. For example, bright flashes or loud noises elicit automatic head and eye orientation responses that focus visual attention toward the source of the stimulus. To orient successfully to a bright flash in the visual field, the agent requires a functioning coordinate transformation from the eye frame to the head frame to the body frame, and from the body frame to the motor actions required for orienting.
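
The chain of coordinate transformations described above can be sketched as follows. This is a minimal illustration, assuming each frame has x forward, y left and z up, that each joint is described by a yaw and an elevation angle in degrees, and that the frames share an origin (translations between eye, head and body are ignored). All function names are hypothetical. In the roadmap's approach this mapping would be learned rather than hand-coded; the sketch only shows what has to be learned.

```python
import math


def direction(az_deg, el_deg):
    """Unit vector for an azimuth/elevation pair (x forward, y left, z up)."""
    a, e = math.radians(az_deg), math.radians(el_deg)
    return [math.cos(e) * math.cos(a), math.cos(e) * math.sin(a), math.sin(e)]


def angles(v):
    """Azimuth and elevation in degrees of a direction vector."""
    x, y, z = v
    return math.degrees(math.atan2(y, x)), math.degrees(math.atan2(z, math.hypot(x, y)))


def rotation(yaw_deg, elev_deg):
    """Orientation of a child frame in its parent frame: Rz(yaw) * Ry(-elev)."""
    a, e = math.radians(yaw_deg), math.radians(elev_deg)
    ca, sa, ce, se = math.cos(a), math.sin(a), math.cos(e), math.sin(e)
    return [[ca * ce, -sa, -ca * se],
            [sa * ce, ca, -sa * se],
            [se, 0.0, ce]]


def rotate(matrix, v):
    """Matrix-vector product for a 3x3 matrix given as nested lists."""
    return [sum(m * c for m, c in zip(row, v)) for row in matrix]


def eye_to_body(target_az, target_el, eye_in_head, head_on_body):
    """Express a target seen in the eye frame as azimuth/elevation in the body frame."""
    d_eye = direction(target_az, target_el)
    d_head = rotate(rotation(*eye_in_head), d_eye)      # eye frame -> head frame
    d_body = rotate(rotation(*head_on_body), d_head)    # head frame -> body frame
    return angles(d_body)


# A flash 5 degrees left of the line of sight, with the eye turned 10 degrees
# left in the head and the head turned 20 degrees left on the body.
body_az, body_el = eye_to_body(5.0, 0.0, eye_in_head=(10.0, 0.0), head_on_body=(20.0, 0.0))
```

The resulting body-frame azimuth and elevation are the angles the agent would have to turn through to face the stimulus.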

Research visual orienting responses in animals
Augment the reward function to reward accurate orienting responses that override VOR behaviour
Train the ANN with the adjusted reward function to adapt the action-value function, starting from the partially pre-trained ANN of the VOR-only network
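
One possible shape for the augmented reward function is sketched below. It assumes that stabilization is rewarded by low retinal slip, that a salient stimulus is reported as an angular offset from the line of sight (or None when nothing salient is present), and that the orienting term simply replaces the stabilization term while a stimulus is present. The names and the weight are hypothetical, and a weighted sum of the two terms would be an equally plausible choice.

```python
def stabilization_reward(retinal_slip):
    """VOR-only reward: zero for a stable image, negative otherwise."""
    return -abs(retinal_slip)


def orienting_reward(target_offset):
    """Zero when the stimulus is centred on the line of sight, negative otherwise."""
    return -abs(target_offset)


def combined_reward(retinal_slip, target_offset=None, orienting_weight=1.0):
    """Orienting overrides stabilization whenever a salient stimulus is present."""
    if target_offset is None:
        return stabilization_reward(retinal_slip)
    return orienting_weight * orienting_reward(target_offset)


no_stimulus = combined_reward(retinal_slip=2.0)                       # stabilization only
with_stimulus = combined_reward(retinal_slip=2.0, target_offset=3.0)  # orienting overrides
```

Because the stabilization term is unchanged when no stimulus is present, an action-value function trained on the VOR-only reward remains a sensible starting point for further training.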