July 8, 2020
My fascination with insect navigation began roughly a couple of weeks ago when I hurt my little toe while aggressively pursuing a hostile housefly wreaking havoc in my bedroom. It buzzed loudly and glided gracefully around my room like it was the lord of all it surveyed, apparently unfazed by my vicious attempts to swat it into a smear of insect innards on the wall. How are such tiny organisms so agile and adept at navigation?
Insects account for more than half of all known living species. If species diversity were the key metric for success, they would without doubt be the most successful group of organisms since life on Earth began. In spite of their miniature brains (on the order of 0.001% of the number of neurons in a human brain), they are able to perform highly complex navigation and coordination tasks. There is an unspoken rule in the neurobiology community that the larger an organism's brain, the more complex the tasks it can perform. Insects, however, plainly defy this rule. How do they squeeze that much computational performance out of so little hardware?
First, let’s look at their sensor stack. The primary organ of perception in insects is the eye. Unlike vertebrate eyes, insect eyes are compound eyes, and they make up the majority of the insect's head.
As depicted in Figure 1, insect eyes are made up of thousands of tiny light detectors called ommatidia. The convex nature of the compound eyes gives the insect a wide field of view, enabling it to spot inimical adversaries in time, regardless of their angle of approach. The dorsal ommatidia help track the sun’s position, enabling the insect to keep track of its direction when navigating to and from food sources. How do insects know where to go to find food and how to get back into their nests after locating a food source?
Insects rely on two major navigation strategies: Path Integration and Visual Memory.
Path Integration involves continuously summing every direction and distance traveled from the starting position, so as to maintain a single vector that leads straight back to the start, as depicted in Figure 2 by the red arrow.
To perform path integration, the insect has to keep track of both the distance it has traveled and its orientation. Insects use a phenomenon called Optic Flow to estimate distance traveled and velocity: optic flow is the pattern of apparent motion of objects in the visual scene caused by the relative motion between the observer and its environment. Insects keep track of their orientation by observing polarization patterns in the sky using the dorsal (upward-facing) ommatidia of their compound eyes. The part of the insect brain responsible for interpreting direction from these polarization patterns is the Central Complex, depicted in Figure 3 below.
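To make the bookkeeping concrete, here is a minimal sketch in Python (my own illustration, with made-up speed and heading readings) of what path integration amounts to: each step's displacement, derived from a heading estimate (the sky compass) and a distance estimate (integrated optic flow), is summed into a single home vector.

```python
import math

def integrate_path(steps):
    """Accumulate (distance, heading) steps into a single home vector.

    Each step is (distance_travelled, heading_in_radians); the heading would
    come from the sky compass, the distance from integrated optic flow.
    Returns the vector pointing from the current position back to the start.
    """
    x, y = 0.0, 0.0
    for distance, heading in steps:
        x += distance * math.cos(heading)
        y += distance * math.sin(heading)
    # The home vector is simply the negation of the accumulated displacement.
    return math.hypot(x, y), math.atan2(-y, -x)

# Hypothetical outbound journey in a few segments (distances in metres):
outbound = [(2.0, 0.0), (1.5, math.pi / 2), (3.0, math.pi / 4)]
dist, heading = integrate_path(outbound)
print(f"fly {dist:.2f} m at {math.degrees(heading):.1f} deg to get home")
```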
Another strategy insects employ for navigation is Visual Memory. This involves memorizing visual patterns on the outbound journey from nest to food source so that, on the return journey, the insect can match its current observation against those memorized patterns and work out exactly which path leads back to the nest. Knowing this, the following natural questions arise:
- What visual processing does the insect perform to extract information from a visual scene that supports navigation, particularly under variable conditions?
- What is the form of the stored memory?
- How are the memory and the current scene compared to recover a movement direction?
These questions are tackled in Thomas Stone’s article on Rotation Invariant Visual Processing for Spatial Memory. According to Stone, the low-resolution nature of the images captured by the ommatidia enables the insect to generalize across different lighting and fog conditions when matching memorized images against retinal images of its present surroundings. Various proposals have been made to explain the format in which these images are stored. Stone proposes that insects convert visual images into rotation-invariant forms. By doing this, an insect can match a retinal image to a memorized one regardless of how the retinal image is oriented, even when its current body orientation differs from the orientation it had when the scene was memorized. At each memorization instance, insects store multiple images at slightly different orientations and positions. Upon return, the insect infers its movement direction from these multiple images by estimating the local gradient of familiarity: it computes the similarity between the current retinal image and the memorized ones, treats it as a kind of loss function, and zeroes in on the right direction through something resembling gradient descent.
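As a concrete illustration of such a scheme (a speculative sketch in Python, not Stone's actual model), the code below reduces each low-resolution panoramic view to a rotation-invariant signature, namely the magnitude of its 1-D Fourier transform, since a body rotation only circularly shifts the panorama. It then estimates a movement direction by sampling the familiarity of views at small positional offsets and heading where familiarity increases most, i.e. a local gradient of familiarity. The toy world, the route, and all constants are invented.

```python
import numpy as np

def signature(view):
    """Rotation-invariant signature of a 1-D panoramic view: a body rotation
    circularly shifts the panorama, which changes only the phase of its
    Fourier transform, leaving the magnitude spectrum untouched."""
    return np.abs(np.fft.rfft(view))

def familiarity(view, memory):
    """Higher when the current view's signature is close to a memorized one."""
    s = signature(view)
    return -min(np.linalg.norm(s - m) for m in memory)

def homing_direction(pos, panorama_at, memory, step=0.2):
    """Local gradient of familiarity: sample familiarity at small positional
    offsets and return the heading along which it increases the most."""
    headings = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
    scores = [familiarity(panorama_at(pos + step * np.array([np.cos(h), np.sin(h)])), memory)
              for h in headings]
    return headings[int(np.argmax(scores))]

# --- Entirely made-up toy world: the low-resolution panorama seen at a position
# is a pair of "landmark" frequencies whose amplitudes drift with x and y. ---
angles = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
def panorama_at(pos):
    return (1 + 0.3 * pos[0]) * np.sin(3 * angles) + (1 + 0.3 * pos[1]) * np.cos(5 * angles)

route = [np.array([x, 0.0]) for x in np.linspace(0.0, 5.0, 10)]  # outbound path
memory = [signature(panorama_at(p)) for p in route]              # memorized views

# Body orientation does not matter: a rotated (circularly shifted) view yields
# the same signature as the original.
assert np.allclose(signature(panorama_at(route[3])),
                   signature(np.roll(panorama_at(route[3]), 17)))

lost = np.array([2.5, 1.5])  # displaced off the familiar route
h = homing_direction(lost, panorama_at, memory)
print(f"steer towards {np.degrees(h):.0f} deg to regain the familiar route")
```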
Delightful.
With these navigation strategies, insects perform some interesting navigation behaviors such as visual course control, communication and extreme pursuit.
In his article on vision in flying insects, Martin Egelhaaf argues that bees control their course in flight by regulating the angular velocity of the retinal image in their eyes: to turn left, a bee flies such that the angular velocity of the retinal image in its right compound eye exceeds that in its left, and to fly straight it balances the overall optic flow between the two eyes.
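A toy steering loop built on that idea might look like the following (my own sketch with invented sign conventions and gains, not Egelhaaf's model): the turning command is proportional to the error between a desired flow imbalance and the imbalance actually measured between the two eyes, so a zero target keeps the flow balanced for straight flight, while a non-zero target deliberately unbalances the flow to produce a turn.

```python
def yaw_command(flow_left, flow_right, target_imbalance=0.0, gain=0.5):
    """Toy balance-of-flow steering (sign conventions and gains are mine).

    flow_left / flow_right: magnitude of image motion on each eye (rad/s).
    target_imbalance:       desired (flow_right - flow_left). Zero means
                            "keep the flow balanced" -> fly straight;
                            a positive value asks for more flow on the right
                            eye, which per the article accompanies a left turn.
    Returns a turning-rate command proportional to the remaining error.
    """
    measured_imbalance = flow_right - flow_left
    return gain * (target_imbalance - measured_imbalance)

print(yaw_command(2.0, 2.0))                        # balanced flow -> 0.0, keep flying straight
print(yaw_command(3.0, 2.0))                        # stronger flow on the left -> corrective turn
print(yaw_command(2.0, 2.0, target_imbalance=1.0))  # commanded left turn
```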
After foraging and learning the location of food sources through path integration and visual memory, bees communicate the distance and direction to a food source to their hive mates through a curious kind of dance called the waggle dance.
As much fun as it is to watch, the waggle dance could be quite ambiguous, given that path integration only yields the 2-dimensional position and direction of the food source and not its elevation. It could be that bees communicate elevation through other means not yet discovered by humans.
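The information content of the dance is small enough to write down in a few lines. The sketch below decodes a waggle run under the usual textbook reading (the run's angle from vertical on the comb gives the bearing relative to the sun's azimuth, and the run's duration scales with distance); the calibration constant is invented, and, as noted above, nothing in it carries elevation.

```python
def decode_waggle(run_duration_s, angle_from_vertical_deg, sun_azimuth_deg,
                  metres_per_second_of_waggle=1000.0):
    """Decode a waggle run into a flight vector (toy calibration constant).

    On the vertical comb, the angle of the waggle run relative to "up"
    encodes the bearing of the food source relative to the sun's azimuth,
    and the duration of the run scales with distance. Note that nothing
    here encodes elevation, which is the ambiguity mentioned above.
    """
    distance_m = run_duration_s * metres_per_second_of_waggle
    bearing_deg = (sun_azimuth_deg + angle_from_vertical_deg) % 360.0
    return distance_m, bearing_deg

# A 1.2 s waggle run, 40 degrees clockwise of vertical, sun at 180 degrees:
print(decode_waggle(1.2, 40.0, 180.0))   # -> (1200.0, 220.0)
```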
Some insects can also chase moving targets, performing extreme acrobatic aerial manoeuvres in the process. Martin Egelhaaf argues that insects employ smooth pursuit when chasing moving targets: they control their forward velocity by the angular size of the target (the smaller the target appears, the greater their velocity and vice versa), while their turning velocity depends on the angle at which the target is seen. They perform 'catch-up saccades' (rapid eye movements between fixation points) only when the target alters its trajectory too rapidly to follow smoothly.
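A back-of-the-envelope version of those pursuit rules might look like this (all gains and thresholds are invented; this is my own sketch of the behaviour Egelhaaf describes, not his model):

```python
import math

def pursuit_command(target_bearing_rad, target_angular_size_rad,
                    k_turn=3.0, k_speed=0.05,
                    saccade_threshold_rad=math.radians(30)):
    """Toy smooth-pursuit controller (all gains are made up).

    - forward speed grows as the target looks smaller (it is far away)
      and shrinks as it looms larger (we are closing in);
    - turning rate is proportional to the bearing at which the target is seen;
    - if the bearing error exceeds a threshold, trigger a catch-up saccade
      instead of turning smoothly.
    """
    forward_speed = k_speed / max(target_angular_size_rad, 1e-3)
    if abs(target_bearing_rad) > saccade_threshold_rad:
        return forward_speed, ("saccade", target_bearing_rad)   # snap towards the target
    return forward_speed, ("smooth", k_turn * target_bearing_rad)

print(pursuit_command(math.radians(5), math.radians(2)))    # small, centred target: fast, smooth
print(pursuit_command(math.radians(45), math.radians(10)))  # target veered off: slow down, saccade
```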
As a roboticist, the major lesson I learnt from my two-week-long fascination with all things 'insectile' is the value of specialization. Insect anatomy is highly specialized for the specific tasks that let the insect survive in its world. By zeroing in on a narrow set of ecological tasks, insect anatomy has, through evolution, been optimized to perform exactly those tasks with remarkably few resources.
This also brings to mind the famous dictum in architecture and industrial design: form follows function. The shape of a building or object should primarily relate to its intended function or purpose. As much as generalization is desirable, only systems that are highly specialized at a specific set of tasks succeed at performing those tasks effectively.
To create high-performance general systems, do we need to build individual specialized systems and then find a way to get them to coordinate with one another? Or do we build a separate system dedicated to coordinating the specialized ones? I don't know.
Current deep learning models require huge amounts of computation and memory to train and use. If, like our insect neighbors, we can work out the crux of what is actually pertinent for training models and performing inference, we will be able to optimize them into something light and easily deployable on less powerful devices.