Food foraging with a simulated ant colony. The nest is located in the center of the space, while the food source is located in the upper-left region. The map is toroidal.

A cloud of pheromones strongly connects the two sites in this snapshot.
Food foraging with a simulated ant colony. The nest is located in the lower-right area of the space, while the food source is located in the upper-right region. The map is not toroidal.

When an obstacle is centered on the shortest path between the nest and the food source, the ants manage to find a way around it (on one of the sides).
When an obstacle is placed between the nest and the food site, the system initially finds both a shorter trail (lower region) and a longer one (upper, with fewer ants) connecting the two sites (the leftmost picture below). However, after some time elapses (center and rightmost pictures below), the ants concentrate their foraging on the approximately shortest path.
Food foraging with a simulated ant colony. The nest is located in the lower-right area of the space, while the food source is located in the upper-left region. The map is not toroidal.

When two obstacles are placed such that they narrow and entangle the shortest path between the nest and the food source, the system starts with an initial exploration phase in which several trails are found (leftmost picture below). Over time, the number of trails decreases based on their cost (the distance between the two sites), such that many disappear (rightmost picture below). The two main remaining trails have approximately equal lengths, so it takes the ants longer to choose the shorter one. However, as the movie above shows, that eventually happens.
When exploring for food, ants do not necessarily find the shortest path. The trails they leave are often convoluted, as many of the images above demonstrate. And when the strategies are greedier, in the sense that ants always choose the location with the most pheromone, one would expect them to exploit suboptimal paths. But that does not seem to be the case with the version of this algorithm that we implemented. We discovered that the ants straighten their paths because a small fixed maximum number of ants (in this case, 10) is enforced at each location. For the following experiment, we deposited some food pheromones before the foraging process was started. The pheromones led the ants onto a clearly suboptimal path. As the following images show, the ants slowly straightened the trail.
[Snapshots after 100, 600, 1100, 1600, 2100, 2600, 3100, 3600, 4100, 4600, 6600, and 8600 steps]
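The per-location cap described above can be sketched as a simple admissibility filter on the ants' greedy move choice (a hypothetical sketch, not the authors' implementation; the function names and data layout are invented for illustration). A full cell on a popular trail forces ants onto adjacent cells, which is what gradually widens and straightens the trail:

```python
import random

MAX_ANTS = 10  # fixed cap per location, as in the experiment above

def admissible_moves(ant_count, neighbors):
    """Return the neighbouring cells an ant is allowed to enter."""
    return [cell for cell in neighbors if ant_count[cell] < MAX_ANTS]

def choose_move(ant_count, pheromone, neighbors, rng):
    """Greedily pick the admissible neighbour with the most food pheromone,
    breaking ties at random; return None if every neighbour is full."""
    options = admissible_moves(ant_count, neighbors)
    if not options:
        return None  # the ant waits in place this step
    best = max(pheromone[c] for c in options)
    return rng.choice([c for c in options if pheromone[c] == best])
```

For example, if the cell with the highest pheromone already holds 10 ants, the ant is diverted to the best non-full neighbour instead of queueing on the marked trail.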
Our foraging mechanisms, inspired by reinforcement learning, work well in dynamic environments too. In the following set of experiments, we set the nest location in the lower area of the environment and allowed the food source to slowly move horizontally (one cell every 50 time steps). The images show how, over time, the ants learned the new locations of the food source and adjusted their foraging paths to keep them minimal.
[Snapshots every 500 steps, from 500 through 10000]
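One way to picture why a reinforcement-learning-style pheromone can track a moving source is a value-iteration-like refresh, sketched below on a 1-D strip of cells (an assumption made for illustration; the authors' actual update rule is not reproduced here). Each non-food cell is repeatedly pulled toward the discounted maximum of its neighbours, so a stale gradient left behind by a moved source decays and a new one forms around the source's current position:

```python
GAMMA = 0.9  # per-cell discount: pheromone shrinks by 10% per cell of distance

def refresh(pher, food_cell):
    """One left-to-right sweep over a 1-D strip of cells: the food cell keeps
    the maximum value, every other cell takes the discounted max of its
    neighbours. Repeated sweeps converge to a gradient pointing at the food."""
    for i in range(len(pher)):
        if i == food_cell:
            pher[i] = 1.0
        else:
            neigh = [pher[j] for j in (i - 1, i + 1) if 0 <= j < len(pher)]
            pher[i] = GAMMA * max(neigh)
```

Starting from zero with food at cell 0, one sweep yields the gradient 1.0, 0.9, 0.81, ...; moving the food to the far end and sweeping a few more times reorients the gradient toward the new location, which is the adjustment visible in the snapshots above.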
When the agents are also allowed to sense the pheromone concentrations of the nearby locations (the difference between the SARSA and Q-Learning algorithms), the adjustment is even faster:
[Snapshots every 500 steps, from 500 through 10000]
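For reference, the two update rules named above differ only in what they bootstrap from; here are their standard textbook forms (how they map onto pheromone sensing is the authors' design, which this generic sketch does not reproduce; the learning rate and discount are illustrative):

```python
from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9  # illustrative learning rate and discount factor

def q_learning_update(Q, s, a, r, s2, actions):
    # off-policy: bootstrap from the best action available in the next state
    Q[s, a] += ALPHA * (r + GAMMA * max(Q[s2, a2] for a2 in actions) - Q[s, a])

def sarsa_update(Q, s, a, r, s2, a2):
    # on-policy: bootstrap from the action the agent actually takes next
    Q[s, a] += ALPHA * (r + GAMMA * Q[s2, a2] - Q[s, a])
```

Q-Learning's max over next-state actions requires evaluating every neighbouring option, which corresponds to the extra sensing of nearby pheromone concentrations mentioned above.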
In the next experiment, we set the nest in the middle of the environment and set up a moving food source that rotates around it. The food source is fixed for the first 200 time steps, and it starts rotating clockwise afterwards at a speed of almost 1.5 degrees every 15 time steps. There is also an obstacle below the nest, so the ants have to find paths around it. We set the evaporation and diffusion rates for the home pheromones to 0.0001, an order of magnitude smaller than usual. Interestingly, the ants find the shorter path around the obstacle whenever their current estimate along the longer path matches the degraded estimate (due to evaporation and diffusion) along the actual shorter path.
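The evaporation and diffusion mentioned above can be sketched as a single per-step grid update (the rates are the ones quoted above; the discrete update scheme itself is an assumption for illustration, not the authors' exact rule). Each cell loses an evaporation fraction outright and spreads a diffusion fraction equally among its neighbours:

```python
EVAPORATION = 0.0001  # rate quoted above for the home pheromone
DIFFUSION = 0.0001

def evaporate_and_diffuse(pher, neighbors_of):
    """One update step over a grid given as {cell: level} with an adjacency
    map {cell: [neighbour, ...]}. Diffusion conserves pheromone mass;
    only evaporation removes it from the system."""
    new = {c: v * (1 - EVAPORATION - DIFFUSION) for c, v in pher.items()}
    for c, v in pher.items():
        share = v * DIFFUSION / len(neighbors_of[c])
        for n in neighbors_of[c]:
            new[n] += share
    return new
```

With rates this small, an estimate stored on a cell degrades very slowly, which is why it takes a while before the degraded estimate on the shorter path falls to the level of the current estimate on the longer one.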

So far, we have seen many examples where two sites need to be connected: the nest and a food source. In this section, we will see how this approach can be extended to several sites that need to be visited in a specific order.


[paper] A Pheromone-Based Utility Model for Collaborative Foraging. 2004. Liviu Panait and Sean Luke. Proceedings of the 2004 Conference on Autonomous Agents and Multiagent Systems.
[paper] Ant Foraging Revisited. 2003. Liviu Panait and Sean Luke. Proceedings of the Ninth International Conference on the Simulation and Synthesis of Living Systems (ALIFE9).
[paper] Learning Ant Foraging Behaviors. 2003. Liviu Panait and Sean Luke. Proceedings of the Ninth International Conference on the Simulation and Synthesis of Living Systems (ALIFE9).