Ant System

Ant System (AS) is the basic form of ant colony optimization (ACO).

Consider a graph search problem. We operate on a graph $G = (V, E)$, where the nodes $V$ are locations and the edges $E$ are the possible moves/transitions. A solution is a path or tour, constructed step by step. An example problem is finding the shortest path from a source node $s$ to a target node $t$.

Information Storage

In ACO, many agents (ants) construct paths on a graph while sharing a memory stored on the edges. Each decision combines a learned desirability $τ$ and a local desirability $η$.

Pheromone memory: Each edge $(i, j)$ has a pheromone level $τ_{ij}$. A high $τ_{ij}$ means that this edge often appears in good solutions. Pheromone is shared across the colony (collective memory).
Heuristic information: Each edge also has a local desirability $η_{ij}$. For example, $η_{ij} = \frac{1}{d_{ij}}$ when $d_{ij}$ is the distance between $i$ and $j$.

Heuristic design

The common choice in routing/TSP is $η_{ij} = \frac{1}{d_{ij}}$. Short edges are then locally attractive, so the larger $β$ is, the more greedily ants follow this local preference.

In general, we just want $η_{ij}$ to be cheap to compute and locally informative.
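As a minimal sketch of the inverse-distance heuristic, the snippet below builds $η$ from a distance matrix. The function name `heuristic_matrix`, the example distances, and the convention that missing edges are encoded as `0` or `inf` are assumptions made for illustration.

```python
import math

def heuristic_matrix(dist):
    """Return eta with eta[i][j] = 1 / dist[i][j] for usable edges, else 0.0."""
    n = len(dist)
    eta = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = dist[i][j]
            # Assumption: self-loops and missing edges are encoded as 0 or inf.
            if i != j and 0 < d < math.inf:
                eta[i][j] = 1.0 / d
    return eta

# Hypothetical symmetric distances between three locations.
dist = [
    [0.0, 2.0, 4.0],
    [2.0, 0.0, 1.0],
    [4.0, 1.0, 0.0],
]
eta = heuristic_matrix(dist)
print(eta[0][1], eta[0][2], eta[1][2])
```

The matrix is computed once before the search starts, which keeps $η_{ij}$ cheap to look up during path construction.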

AS Components

Ants: Construct solutions by moving on the graph
Pheromone matrix $τ$ : Shared memory on edges
Heuristic $η$ : Problem-specific local guidance
Probabilistic choice rule: Balances exploration vs. exploitation
Evaporation $ρ$ : Forgetting mechanism
Reinforcement: Deposit pheromone on good edges

We already saw how $τ$ and $η$ work.

Probabilistic Move Rule

At node $i$, an ant chooses the next node $j$ stochastically. It prefers edges with high pheromone $τ_{ij}$ and edges with a good heuristic value $η_{ij}$. Because the choice is random, edges that are not currently the best can still be selected, which preserves exploration.

The strength of the pheromone influence is controlled by the parameter $α$, and the strength of the heuristic influence by $β$.

The transition probability is then determined as:

P_{ij}^{(k)} = \frac{τ_{ij}^{α} η_{ij}^{β}}{\sum_{ℓ \in N_{i}^{(k)}} τ_{iℓ}^{α} η_{iℓ}^{β}}

This is the edge weight $τ_{ij}^{α} η_{ij}^{β}$ normalized over all feasible next nodes, where:

$P_{ij}^{(k)}$ : probability ant $k$ moves from $i \to j$
$N_{i}^{(k)}$ : feasible next nodes (e.g., unvisited cities)

If $α = 0$, the choice relies purely on the heuristic (probabilistically greedy). If $β = 0$, it relies purely on pheromone (memory only).

We can pick the next node through roulette-wheel sampling. We compute the transition probabilities using the formula above, draw a random threshold $u \sim Uniform(0, 1)$, then go through the feasible nodes in a fixed order and select the first node $j$ whose cumulative probability reaches $u$:

\sum_{h \leq j} P_{ih}^{(k)} \geq u

Numerical example:
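As a sketch with made-up values (they do not come from any real instance), suppose an ant at node $i$ has three feasible next nodes with $τ = (1, 1, 2)$ and $η = (0.5, 0.25, 0.25)$, and we take $α = 1$, $β = 2$. The unnormalized weights are $0.25$, $0.0625$ and $0.125$, so the probabilities are $4/7$, $1/7$ and $2/7$. The helper names below are hypothetical, and the threshold $u$ is fixed instead of drawn at random so that the result is reproducible.

```python
import random

def transition_probabilities(tau_i, eta_i, feasible, alpha, beta):
    """Normalize tau^alpha * eta^beta over the feasible next nodes."""
    weights = {j: (tau_i[j] ** alpha) * (eta_i[j] ** beta) for j in feasible}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

def roulette_select(probs, u):
    """Return the first node whose cumulative probability reaches u."""
    cumulative = 0.0
    chosen = None
    for j, p in probs.items():  # dicts keep insertion order
        chosen = j
        cumulative += p
        if cumulative >= u:
            break
    # If round-off keeps the sum just below u, the last node is returned.
    return chosen

# Made-up pheromone and heuristic values on the edges leaving node i.
tau_i = {1: 1.0, 2: 1.0, 3: 2.0}
eta_i = {1: 0.5, 2: 0.25, 3: 0.25}

probs = transition_probabilities(tau_i, eta_i, [1, 2, 3], alpha=1.0, beta=2.0)
print(probs)  # approximately {1: 0.571, 2: 0.143, 3: 0.286}

u = 0.65  # in a real run: u = random.random()
print(roulette_select(probs, u))  # cumulative sums 0.571, 0.714, 1.0 -> node 2
```

Node 3 has twice the pheromone of node 1, yet node 1 is still the most likely choice because $β = 2$ squares its heuristic advantage.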

Parameter effects:

Increased $α$ means that ants follow pheromone trails more strongly (exploitation), exploiting experience
Increased $β$ means that ants prefer locally good edges more strongly (heuristic greediness), exploiting problem structure

Pheromone Update

Pheromones are updated using some update rule. A generic global update is:

τ_{ij} \leftarrow (1 - ρ) τ_{ij} + \sum_{k=1}^{m} Δ τ_{ij}^{(k)}

$ρ \in (0, 1]$ is the evaporation rate. Higher $ρ$ means faster evaporation (more exploration).
$m$ is the number of ants
$Δ τ_{ij}^{(k)}$ is the deposit (reinforcement) by ant $k$ , which is usually based on solution quality.

For example, a typical deposit rule is:

Δ τ_{ij}^{(k)} = \begin{cases} \frac{Q}{L_{k}} & \text{if ant } k \text{ uses edge } (i, j) \\ 0 & \text{otherwise} \end{cases}

where $L_{k}$ is the tour/path cost, and $Q$ is a pheromone scaling constant.

Thus, we can think of this pheromone update rule as combining the evaporation and reinforcement:

τ_{ij} \leftarrow \underbrace{(1 - ρ) τ_{ij}}_{\text{evaporation}} + \underbrace{\sum_{k=1}^{m} Δ τ_{ij}^{(k)}}_{\text{reinforcement}}
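The two parts of the update can be sketched in code as follows. This is an illustrative version, not a reference implementation: the function name `update_pheromone`, the list-of-lists matrix, the symmetric deposit (which assumes an undirected graph) and the example paths and costs are all assumptions.

```python
def update_pheromone(tau, paths, costs, rho, Q=1.0):
    """Apply evaporation, then the Q / L_k deposit of every ant, in place."""
    n = len(tau)
    # Evaporation: every edge keeps a fraction (1 - rho) of its pheromone.
    for i in range(n):
        for j in range(n):
            tau[i][j] *= 1.0 - rho
    # Reinforcement: ant k adds Q / L_k to each edge on its path.
    for path, cost in zip(paths, costs):
        deposit = Q / cost
        for i, j in zip(path, path[1:]):
            tau[i][j] += deposit
            tau[j][i] += deposit  # assumption: undirected graph
    return tau

# Three nodes, uniform initial pheromone, two hypothetical ants.
tau = [[1.0] * 3 for _ in range(3)]
paths = [[0, 1, 2], [0, 2]]  # ant 1 goes 0 -> 1 -> 2, ant 2 goes 0 -> 2
costs = [4.0, 2.0]           # made-up path costs L_1 and L_2
update_pheromone(tau, paths, costs, rho=0.5)
print(tau[0][1], tau[1][2], tau[0][2])  # 0.75 0.75 1.0
```

The cheaper path of ant 2 deposits $Q / L_2 = 0.5$ on edge $(0, 2)$, twice what ant 1 deposits on each of its edges, so that edge ends the update with the most pheromone.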

Stagnation

Note that a common failure mode is stagnation: if one trail becomes too dominant early, the colony may stop exploring alternatives. Some solutions to this include:

Increasing evaporation rate $ρ$
Limiting pheromone values (Max-Min Ant System)
Restarting pheromones when diversity collapses
Injecting random/exploratory ants

Convergence vs. stagnation: Under convergence, ants increasingly select the same edges, and pheromone concentrates on a few tours. Under stagnation, no new tours are explored at all: every ant follows the same path, and the colony is stuck in a local optimum.
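Of the remedies listed above, limiting pheromone values is the simplest to sketch: after each update, every $τ_{ij}$ is clamped to an interval. The bounds `tau_min` and `tau_max` and the function name are hypothetical, and the snippet shows only the clamping step. Max-Min Ant System additionally restricts which ants deposit pheromone, which is not shown here.

```python
def clamp_pheromone(tau, tau_min, tau_max):
    """Clamp every pheromone value into [tau_min, tau_max], in place."""
    for row in tau:
        for j, value in enumerate(row):
            row[j] = min(tau_max, max(tau_min, value))
    return tau

# Made-up matrix where one edge has nearly evaporated and one dominates.
tau = [[0.001, 5.0],
       [0.5, 12.0]]
clamp_pheromone(tau, tau_min=0.1, tau_max=10.0)
print(tau)  # [[0.1, 5.0], [0.5, 10.0]]
```

Because no edge can drop below `tau_min`, every feasible edge keeps a nonzero selection probability, so some exploration always remains.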

Walkthrough

For one iteration $t$ of ACO:

Each ant constructs a path using $P_{ij}^{(k)}$
Evaluate cost $L_{k}$ for each ant’s solution
Evaporate pheromones
Reinforce edges using good solutions

Early on, this will explore many edges (diversity). Later on, strong trails will emerge (exploitation).
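The four steps above can be put together in a compact sketch. It assumes a symmetric TSP with closed tours, a uniform initial pheromone of 1.0, and the inverse-distance heuristic. All function names, default parameter values and the four-city instance are illustrative choices, not prescribed settings.

```python
import random

def tour_length(tour, dist):
    # Closed tour: the ant returns to its start city at the end.
    n = len(tour)
    return sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))

def construct_tour(tau, dist, alpha, beta, rng):
    n = len(dist)
    current = rng.randrange(n)
    tour = [current]
    unvisited = set(range(n)) - {current}
    while unvisited:
        candidates = sorted(unvisited)
        # Unnormalized tau^alpha * eta^beta with eta = 1 / distance.
        weights = [(tau[current][j] ** alpha) * ((1.0 / dist[current][j]) ** beta)
                   for j in candidates]
        # random.choices normalizes the weights (roulette-wheel sampling).
        current = rng.choices(candidates, weights=weights, k=1)[0]
        tour.append(current)
        unvisited.remove(current)
    return tour

def ant_system(dist, n_ants=10, n_iters=50, alpha=1.0, beta=2.0,
               rho=0.5, Q=1.0, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    tau = [[1.0] * n for _ in range(n)]
    best_tour, best_len = None, float("inf")
    for _ in range(n_iters):
        # 1. Each ant constructs a tour.
        tours = [construct_tour(tau, dist, alpha, beta, rng) for _ in range(n_ants)]
        # 2. Evaluate the cost of each tour.
        lengths = [tour_length(t, dist) for t in tours]
        # 3. Evaporate pheromone on every edge.
        for i in range(n):
            for j in range(n):
                tau[i][j] *= 1.0 - rho
        # 4. Reinforce the edges of each tour with Q / L_k.
        for tour, length in zip(tours, lengths):
            for k in range(n):
                i, j = tour[k], tour[(k + 1) % n]
                tau[i][j] += Q / length
                tau[j][i] += Q / length
            if length < best_len:
                best_tour, best_len = tour, length
    return best_tour, best_len

# Four cities on the corners of a unit square (hypothetical instance).
d = 2 ** 0.5
dist = [
    [0.0, 1.0, d, 1.0],
    [1.0, 0.0, 1.0, d],
    [d, 1.0, 0.0, 1.0],
    [1.0, d, 1.0, 0.0],
]
best_tour, best_len = ant_system(dist)
print(best_tour, best_len)
```

On this instance the shortest closed tour is the perimeter of the square, with length 4. Every ant deposits pheromone in each iteration, as in the generic update rule.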

The basic version introduced here is called the Ant System.
