Aspects of OR - Traveling Salesperson Problem - Metaheuristics (Part 3)
Author: Till Heller
In Part 2 we built construction and improvement heuristics for the TSP. Methods like 2-opt are fast and effective, but they get trapped in local optima — once no single edge swap improves the tour, the search stops, even if much better solutions exist elsewhere.
Metaheuristics are general-purpose search frameworks that overcome this limitation. They use strategies like accepting worse solutions temporarily (simulated annealing), combining features of good solutions (genetic algorithms), or reinforcing successful search paths (ant colony optimization). These methods do not guarantee optimality, but they consistently find near-optimal solutions for large TSP instances.
In this post we cover three of the most popular metaheuristics, each implementing a fundamentally different search philosophy. All implementations reuse the distance matrix setup and the tour_length and two_opt functions from Part 2.
Simulated annealing
Simulated annealing (SA) is inspired by the physical process of annealing in metallurgy — heating a metal and then slowly cooling it to reach a low-energy crystalline state. The key idea: at high "temperatures", the algorithm accepts worsening moves with high probability, enabling it to escape local optima. As the temperature decreases, it becomes increasingly selective, converging toward a good solution.
Algorithm
- Start with an initial tour x and an initial temperature T0.
- Repeat until the stopping criterion is met:
  a. Generate a neighbor x' of x (e.g., by a random 2-opt move).
  b. Compute Δ = length(x') − length(x).
  c. If Δ < 0 (improvement), accept: x ← x'.
  d. If Δ ≥ 0 (worsening), accept with probability exp(−Δ/T), where T is the current temperature.
  e. Reduce the temperature: T ← αT (geometric cooling).
- Return the best tour found.
Key parameters
- Initial temperature T0: Should be high enough that most moves are accepted initially. A common rule of thumb is to set T0 so that the acceptance probability for an average worsening move is around 80%.
- Cooling rate α: Typically between 0.99 and 0.9999. Slower cooling (higher α) gives better solutions but takes longer.
- Stopping criterion: A minimum temperature, a maximum number of iterations, or no improvement for a specified number of steps.
- Iterations per temperature: Often called the "epoch length". More iterations per temperature level allow the search to equilibrate before cooling.
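The 80% rule of thumb can be turned into a short calibration routine: sample random worsening 2-opt deltas on a random tour and solve exp(−Δ̄/T0) = 0.8 for T0. The sketch below is our own illustration (the helper name `calibrate_T0` and its defaults are not from the original SA literature):

```python
import numpy as np

def calibrate_T0(D, n_samples=200, target_accept=0.8):
    """Estimate T0 so that the average worsening 2-opt move is
    accepted with probability ~target_accept:
    T0 = mean(delta) / -ln(target_accept)."""
    n = len(D)
    tour = list(range(n))
    np.random.shuffle(tour)
    deltas = []
    while len(deltas) < n_samples:
        i = np.random.randint(0, n - 1)
        j = np.random.randint(i + 1, n)
        b, d = tour[(i - 1) % n], tour[(j + 1) % n]
        # 2-opt delta: replace edges (b, tour[i]) and (tour[j], d)
        delta = D[b, tour[j]] + D[tour[i], d] - D[b, tour[i]] - D[tour[j], d]
        if delta > 0:  # keep only worsening moves
            deltas.append(delta)
    return np.mean(deltas) / -np.log(target_accept)
```

Calling `simulated_annealing(D, T0=calibrate_T0(D))` then starts the search in the high-acceptance regime regardless of the instance's distance scale.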
Implementation
```python
import numpy as np

def simulated_annealing(D, T0=100.0, alpha=0.9995, min_temp=0.01, epoch=100):
    """Simulated annealing for TSP."""
    n = len(D)
    # Start with a random tour
    tour = list(range(n))
    np.random.shuffle(tour)
    current_length = tour_length(tour, D)
    best_tour = tour[:]
    best_length = current_length
    temp = T0
    while temp > min_temp:
        for _ in range(epoch):
            # Generate neighbor: random 2-opt move
            i = np.random.randint(0, n - 1)
            j = np.random.randint(i + 1, n)
            if i == 0 and j == n - 1:
                continue  # reversing the whole tour changes nothing
            # Compute delta (only the changed edges)
            a, b = tour[i], tour[(i - 1) % n]
            c, d = tour[j], tour[(j + 1) % n]
            # Before: edges (b, a) and (c, d)
            # After: edges (b, c) and (a, d), with the segment reversed
            delta = D[b, c] + D[a, d] - D[b, a] - D[c, d]
            if delta < 0 or np.random.random() < np.exp(-delta / temp):
                # Accept the move
                tour[i:j + 1] = tour[i:j + 1][::-1]
                current_length += delta
                if current_length < best_length:
                    best_length = current_length
                    best_tour = tour[:]
        temp *= alpha
    return best_tour, best_length

np.random.seed(42)
coords = np.random.rand(50, 2) * 100
D = make_distance_matrix(coords)
tour_sa, length_sa = simulated_annealing(D)
print(f'SA tour length: {length_sa:.2f}')
```
Why it works
At high temperatures, SA behaves like a random walk — it explores the solution space broadly. As the temperature drops, it gradually transitions into a greedy local search. This two-phase behavior (exploration then exploitation) is why SA can escape local optima that trap pure local search methods.
The theoretical foundation is the Metropolis criterion: under certain conditions on the cooling schedule, SA converges to the global optimum with probability 1. In practice, we use much faster cooling schedules that do not guarantee convergence but produce excellent solutions in reasonable time.
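A quick numeric check makes the exploration-to-exploitation transition concrete. For a fixed worsening of Δ = 10, the acceptance probability exp(−Δ/T) collapses as the temperature drops:

```python
import numpy as np

delta = 10.0  # cost increase of a worsening move
for T in [100.0, 10.0, 1.0, 0.1]:
    p = np.exp(-delta / T)
    print(f'T = {T:6.1f}: accept with probability {p:.2e}')
```

At T = 100 this move is accepted roughly 90% of the time; at T = 1 it is accepted about once in 20,000 attempts, so the search has effectively become greedy local search.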
Genetic algorithms
Genetic algorithms (GA) are inspired by biological evolution. They maintain a population of candidate solutions (tours) and iteratively evolve the population through selection, crossover (combining two parents into offspring), and mutation (random perturbation).
TSP-specific challenges
Crossover is tricky for the TSP because naively combining two tours often produces invalid solutions (missing or duplicated cities). Several specialized crossover operators have been developed for the TSP:
- Order Crossover (OX): Copy a segment from one parent and fill the remaining positions with cities from the other parent in order.
- Partially Mapped Crossover (PMX): Swap a segment between parents and use a mapping to resolve conflicts.
- Edge Recombination Crossover (ERX): Preserve edges present in either parent.
We implement Order Crossover below, as it is the most commonly used.
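For comparison, here is a minimal PMX sketch. The optional fixed cut-point parameters are our own addition to make the operator easy to demonstrate; in a GA they would be drawn at random:

```python
import numpy as np

def pmx(parent1, parent2, start=None, end=None):
    """Partially Mapped Crossover: copy a segment from parent1 and
    resolve conflicts outside it via the segment's position mapping."""
    n = len(parent1)
    if start is None:
        start, end = sorted(np.random.choice(n, 2, replace=False))
    child = [-1] * n
    child[start:end + 1] = parent1[start:end + 1]
    segment = set(child[start:end + 1])
    # Map each segment city of parent1 to parent2's city at the same position
    mapping = {parent1[k]: parent2[k] for k in range(start, end + 1)}
    for i in list(range(start)) + list(range(end + 1, n)):
        city = parent2[i]
        while city in segment:  # follow the mapping until conflict-free
            city = mapping[city]
        child[i] = city
    return child

# Segment [3, 4, 5, 6] is copied from parent1; conflicts outside it are
# resolved through the mapping 3->5, 4->1, 5->6, 6->8
child = pmx([1, 2, 3, 4, 5, 6, 7, 8], [3, 7, 5, 1, 6, 8, 2, 4], start=2, end=5)
print(child)  # [8, 7, 3, 4, 5, 6, 2, 1]
```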
Algorithm
- Initialize a population of random tours.
- Evaluate the fitness (tour length) of each individual.
- Repeat for a fixed number of generations:
  a. Selection: Choose parents based on fitness (e.g., tournament selection).
  b. Crossover: Combine pairs of parents to create offspring.
  c. Mutation: Apply random perturbations (e.g., a 2-opt move) with probability p_m.
  d. Replacement: Form the new population from the best individuals among parents and offspring.
- Return the best tour found across all generations.
Implementation
```python
def order_crossover(parent1, parent2):
    """Order Crossover (OX) for TSP."""
    n = len(parent1)
    start, end = sorted(np.random.choice(n, 2, replace=False))
    # Copy segment from parent1
    child = [-1] * n
    child[start:end + 1] = parent1[start:end + 1]
    # Fill remaining positions with parent2's cities in order
    segment_set = set(child[start:end + 1])
    fill = [c for c in parent2 if c not in segment_set]
    idx = 0
    for i in range(n):
        if child[i] == -1:
            child[i] = fill[idx]
            idx += 1
    return child

def tournament_select(population, lengths, k=3):
    """Select an individual via tournament selection."""
    candidates = np.random.choice(len(population), k, replace=False)
    best = min(candidates, key=lambda i: lengths[i])
    return population[best]

def genetic_algorithm(D, pop_size=100, generations=500, mutation_rate=0.3):
    """Genetic algorithm for TSP."""
    n = len(D)
    # Initialize population with random tours
    population = []
    for _ in range(pop_size):
        tour = list(range(n))
        np.random.shuffle(tour)
        population.append(tour)
    lengths = [tour_length(t, D) for t in population]
    best_idx = np.argmin(lengths)
    best_tour = population[best_idx][:]
    best_length = lengths[best_idx]
    for gen in range(generations):
        new_population = []
        # Elitism: keep the best individual
        new_population.append(best_tour[:])
        while len(new_population) < pop_size:
            # Selection
            parent1 = tournament_select(population, lengths)
            parent2 = tournament_select(population, lengths)
            # Crossover
            child = order_crossover(parent1, parent2)
            # Mutation: random segment reversal (a 2-opt move)
            if np.random.random() < mutation_rate:
                i, j = sorted(np.random.choice(n, 2, replace=False))
                child[i:j + 1] = child[i:j + 1][::-1]
            new_population.append(child)
        population = new_population
        lengths = [tour_length(t, D) for t in population]
        gen_best_idx = np.argmin(lengths)
        if lengths[gen_best_idx] < best_length:
            best_length = lengths[gen_best_idx]
            best_tour = population[gen_best_idx][:]
    return best_tour, best_length

tour_ga, length_ga = genetic_algorithm(D)
print(f'GA tour length: {length_ga:.2f}')
```
Enhancing the GA
The basic GA above can be improved significantly:
- Local search after crossover: Apply 2-opt to every offspring before inserting it into the population. This combination of GA with local search is called a memetic algorithm and is one of the best-known metaheuristics for the TSP.
- Diverse initial population: Use different construction heuristics (nearest neighbor from various starts, greedy) instead of random tours.
- Adaptive mutation rate: Increase mutation when the population converges (diversity drops) and decrease it when the population is diverse.
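The adaptive-mutation idea can be sketched with a cheap proxy for diversity. The helper below is our own illustration (the name, thresholds, and the coefficient-of-variation proxy are not a standard recipe): it raises the mutation rate when tour lengths have collapsed to nearly identical values and lowers it while the population is still spread out.

```python
import numpy as np

def adaptive_mutation_rate(lengths, base=0.3, low=0.1, high=0.8):
    """Use the spread of tour lengths as a diversity proxy:
    a converged population gets more mutation, a diverse one less."""
    spread = np.std(lengths) / np.mean(lengths)  # coefficient of variation
    if spread < 0.01:   # population has (nearly) converged
        return high
    if spread > 0.10:   # population is still diverse
        return low
    return base
```

Inside `genetic_algorithm`, one would recompute `mutation_rate = adaptive_mutation_rate(lengths)` at the start of each generation instead of keeping it fixed.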
Ant colony optimization
Ant colony optimization (ACO) is inspired by the foraging behavior of ants. Real ants deposit pheromone on paths to food sources. Shorter paths get more pheromone (ants return faster, depositing more per unit time), which attracts more ants, creating a positive feedback loop that converges on short paths.
Algorithm (Ant System)
The original Ant System (AS) by Dorigo (1992) works as follows:
- Initialize the pheromone level τ_ij on every edge to a uniform value τ0.
- Repeat for a fixed number of iterations:
  a. Tour construction: Each of m ants constructs a tour. Starting from a random city, an ant at city i chooses the next unvisited city j with probability proportional to τ_ij^α · η_ij^β, where η_ij = 1/d_ij is the heuristic desirability and α, β control the relative importance of pheromone and heuristic.
  b. Pheromone evaporation: τ_ij ← (1 − ρ) τ_ij for all edges, where ρ is the evaporation rate.
  c. Pheromone deposit: Each ant deposits Q/L on every edge of its tour, where L is the ant's tour length and Q is a constant.
- Return the best tour found across all iterations.
Key parameters
- α: Pheromone importance (typically 1).
- β: Heuristic importance (typically 2–5). Higher β makes ants prefer short edges.
- ρ: Evaporation rate (typically 0.1–0.5). Higher ρ means faster forgetting.
- m: Number of ants (often set to n, the number of cities).
- Q: Pheromone deposit constant. Often chosen so that the per-edge deposit Q/L is on a similar scale to the pheromone values themselves.
Implementation
```python
def ant_colony(D, n_ants=20, n_iterations=200, alpha=1.0, beta=3.0, rho=0.2, Q=100.0):
    """Ant Colony Optimization (Ant System) for TSP."""
    n = len(D)
    # Heuristic information (inverse distance); avoid the division
    # by zero on the diagonal
    eta = np.divide(1.0, D, out=np.zeros_like(D, dtype=float), where=D > 0)
    # Initialize pheromone
    tau = np.ones((n, n)) * (1.0 / (n * np.mean(D)))
    best_tour = None
    best_length = np.inf
    for iteration in range(n_iterations):
        all_tours = []
        all_lengths = []
        for ant in range(n_ants):
            # Construct a tour city by city
            visited = [False] * n
            start = np.random.randint(n)
            tour = [start]
            visited[start] = True
            for step in range(n - 1):
                current = tour[-1]
                # Transition probabilities for unvisited cities
                probs = np.zeros(n)
                for j in range(n):
                    if not visited[j]:
                        probs[j] = (tau[current, j] ** alpha) * (eta[current, j] ** beta)
                prob_sum = probs.sum()
                if prob_sum == 0:
                    # Fallback: pick a random unvisited city
                    unvisited = [j for j in range(n) if not visited[j]]
                    next_city = np.random.choice(unvisited)
                else:
                    probs /= prob_sum
                    next_city = np.random.choice(n, p=probs)
                tour.append(next_city)
                visited[next_city] = True
            length = tour_length(tour, D)
            all_tours.append(tour)
            all_lengths.append(length)
            if length < best_length:
                best_length = length
                best_tour = tour[:]
        # Pheromone evaporation
        tau *= (1 - rho)
        # Pheromone deposit (symmetric TSP: update both directions)
        for tour, length in zip(all_tours, all_lengths):
            deposit = Q / length
            for i in range(n):
                j = tour[(i + 1) % n]
                tau[tour[i], j] += deposit
                tau[j, tour[i]] += deposit
    return best_tour, best_length

tour_aco, length_aco = ant_colony(D)
print(f'ACO tour length: {length_aco:.2f}')
```
Variants
Several improved ACO variants exist:
- MAX-MIN Ant System (MMAS): Bounds the pheromone levels between τ_min and τ_max to prevent premature convergence. Only the best ant deposits pheromone.
- Ant Colony System (ACS): Uses a different transition rule (pseudorandom proportional), local pheromone updates during construction, and only the global-best ant deposits pheromone.
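To make the MMAS bounds concrete, here is a small sketch. The τ_max formula is the standard evaporation fixed point of the best-ant deposit; the τ_min fraction 1/(2n) is our simplification of the original, more involved rule:

```python
import numpy as np

def mmas_bounds(best_length, rho, n):
    """Pheromone bounds for MAX-MIN Ant System."""
    tau_max = 1.0 / (rho * best_length)  # fixed point of evaporation
                                         # plus best-ant deposit 1/L_best
    tau_min = tau_max / (2.0 * n)        # simplified lower bound
    return tau_min, tau_max

# Clamp a pheromone matrix after each iteration's update
tau = np.random.rand(5, 5)
tau_min, tau_max = mmas_bounds(best_length=500.0, rho=0.2, n=5)
tau = np.clip(tau, tau_min, tau_max)
```

In `ant_colony`, the clamp would run right after the deposit loop, with the bounds recomputed whenever `best_length` improves.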
Comparison
Let us compare all methods on our 50-city instance:
```python
import time

methods = {
    'Nearest neighbor': lambda: (nearest_neighbor(D), None),
    'NN + 2-opt': lambda: (two_opt(nearest_neighbor(D), D), None),
    'Simulated annealing': lambda: simulated_annealing(D),
    'Genetic algorithm': lambda: genetic_algorithm(D),
    'Ant colony': lambda: ant_colony(D),
}

print(f'{"Method":<25} {"Length":>10} {"Time (s)":>10}')
print('-' * 47)
for name, method in methods.items():
    start = time.time()
    result = method()
    elapsed = time.time() - start
    tour = result[0]
    length = tour_length(tour, D)
    print(f'{name:<25} {length:>10.2f} {elapsed:>10.3f}')
```
Typical behavior for 50-city random Euclidean instances:
| Method | Solution quality | Runtime |
|---|---|---|
| Nearest neighbor | 20–30% above optimal | milliseconds |
| NN + 2-opt | 3–8% above optimal | milliseconds |
| Simulated annealing | 1–5% above optimal | seconds |
| Genetic algorithm | 2–8% above optimal | seconds |
| Ant colony | 2–6% above optimal | seconds |
The metaheuristics consistently outperform simple heuristics, at the cost of higher runtime. For even better results, combine metaheuristics with local search: run SA or GA and apply 2-opt to the final solution.
When to use what
- Small instances (< 100 cities): Use the exact ILP formulation from Part 1. Modern solvers handle these in seconds.
- Medium instances (100–5,000 cities): Use metaheuristics combined with 2-opt/Or-opt. Simulated annealing with careful parameter tuning tends to be the most robust choice.
- Large instances (> 5,000 cities): Use the Lin-Kernighan-Helsgaun (LKH) heuristic, which is the state-of-the-art heuristic solver for the TSP. It combines a sophisticated variable-depth local search with effective perturbation strategies.
- Need proven optimality: Use the Concorde solver, the most powerful exact TSP solver, which combines branch-and-cut with problem-specific cuts (comb inequalities, blossom inequalities).
Summary of the TSP mini-series
- Part 1 — Problem definition, ILP formulation (MTZ subtour elimination), Gurobi implementation.
- Part 2 — Construction heuristics (nearest neighbor, greedy) and improvement heuristics (2-opt, 3-opt, Or-opt).
- Part 3 — Metaheuristics (simulated annealing, genetic algorithms, ant colony optimization) and practical guidance.
The TSP is a microcosm of combinatorial optimization: it illustrates the tension between exact methods and heuristics, the power of local search, and the creativity of metaheuristic design. The ideas in this series — local search, neighborhood structures, population-based search, constructive heuristics — apply far beyond the TSP to scheduling, routing, packing, and virtually every other combinatorial optimization problem.