Aspects of OR - Traveling Salesperson Problem - Metaheuristics (Part 3)

Author: Till Heller

In Part 2 we built construction and improvement heuristics for the TSP. Methods like 2-opt are fast and effective, but they get trapped in local optima — once no single edge swap improves the tour, the search stops, even if much better solutions exist elsewhere.

Metaheuristics are general-purpose search frameworks that overcome this limitation. They use strategies like accepting worse solutions temporarily (simulated annealing), combining features of good solutions (genetic algorithms), or reinforcing successful search paths (ant colony optimization). These methods do not guarantee optimality, but they consistently find near-optimal solutions for large TSP instances.

In this post we cover three of the most popular metaheuristics, each implementing a fundamentally different search philosophy. All implementations reuse the distance matrix setup and the tour_length and two_opt functions from Part 2.

Simulated annealing

Simulated annealing (SA) is inspired by the physical process of annealing in metallurgy — heating a metal and then slowly cooling it to reach a low-energy crystalline state. The key idea: at high "temperatures", the algorithm accepts worsening moves with high probability, enabling it to escape local optima. As the temperature decreases, it becomes increasingly selective, converging toward a good solution.

Algorithm

  1. Start with an initial tour T and an initial temperature T_0.
  2. Repeat until the stopping criterion is met:
     a. Generate a neighbor T' of T (e.g., by a random 2-opt move).
     b. Compute Δ = length(T') − length(T).
     c. If Δ < 0 (improvement), accept: T ← T'.
     d. If Δ ≥ 0 (worsening), accept with probability e^(−Δ/τ), where τ is the current temperature.
     e. Reduce the temperature: τ ← α · τ (geometric cooling).
  3. Return the best tour found.

Key parameters

  • Initial temperature T_0: Should be high enough that most moves are accepted initially. A common rule of thumb is to set T_0 so that the acceptance probability for an average worsening move is around 80%.
  • Cooling rate α: Typically between 0.99 and 0.9999. Slower cooling (higher α) gives better solutions but takes longer.
  • Stopping criterion: A minimum temperature, a maximum number of iterations, or no improvement for a specified number of steps.
  • Iterations per temperature: Often called the "epoch length". More iterations per temperature level allow the search to equilibrate before cooling.
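
The 80% rule of thumb can be turned into a short calibration routine: sample random worsening 2-opt moves on a random tour and solve e^(−Δ̄/T_0) = 0.8 for T_0. A minimal sketch, assuming a distance matrix D as in Part 2; the function name calibrate_t0 and the sampling scheme are illustrative, not from the series:

```python
import numpy as np

def calibrate_t0(D, n_samples=200, p_accept=0.8, seed=0):
    """Pick T0 so an average worsening 2-opt move is accepted
    with probability ~p_accept at the start of the search."""
    rng = np.random.default_rng(seed)
    n = len(D)
    tour = list(range(n))
    rng.shuffle(tour)

    deltas = []
    for _ in range(n_samples):
        i = int(rng.integers(1, n - 1))
        j = int(rng.integers(i + 1, n))
        a, b = tour[i], tour[i - 1]
        c, d = tour[j], tour[(j + 1) % n]
        delta = D[b, c] + D[a, d] - D[b, a] - D[c, d]
        if delta > 0:                      # keep only worsening moves
            deltas.append(delta)

    # exp(-mean_delta / T0) = p_accept  =>  T0 = mean_delta / (-ln p_accept)
    return np.mean(deltas) / -np.log(p_accept)
```

The value returned by such a routine could replace the fixed T0=100.0 default used below.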

Implementation

import numpy as np

def simulated_annealing(D, T0=100.0, alpha=0.9995, min_temp=0.01, epoch=100):
    """Simulated annealing for TSP."""
    n = len(D)
    # Start with a random tour
    tour = list(range(n))
    np.random.shuffle(tour)
    current_length = tour_length(tour, D)

    best_tour = tour[:]
    best_length = current_length
    temp = T0

    while temp > min_temp:
        for _ in range(epoch):
            # Generate neighbor: random 2-opt move
            # (i >= 1, so tour[0] stays fixed; by rotation symmetry this
            # loses no generality and avoids the degenerate i=0, j=n-1
            # case, where reversing the whole tour changes nothing but
            # the delta formula would still adjust current_length)
            i = np.random.randint(1, n - 1)
            j = np.random.randint(i + 1, n)

            # Compute delta (only the two changed edges)
            a, b = tour[i], tour[i - 1]
            c, d = tour[j], tour[(j + 1) % n]
            # Before: (b, a) and (c, d)
            # After:  (b, c) and (a, d) with the segment reversed
            delta = (D[b, c] + D[a, d]
                     - D[b, a] - D[c, d])

            if delta < 0 or np.random.random() < np.exp(-delta / temp):
                # Accept the move
                tour[i:j + 1] = tour[i:j + 1][::-1]
                current_length += delta

                if current_length < best_length:
                    best_length = current_length
                    best_tour = tour[:]

        temp *= alpha

    return best_tour, best_length

np.random.seed(42)
coords = np.random.rand(50, 2) * 100
D = make_distance_matrix(coords)

tour_sa, length_sa = simulated_annealing(D)
print(f'SA tour length: {length_sa:.2f}')

Why it works

At high temperatures, SA behaves like a random walk — it explores the solution space broadly. As the temperature drops, it gradually transitions into a greedy local search. This two-phase behavior (exploration then exploitation) is why SA can escape local optima that trap pure local search methods.

The theoretical foundation is the Metropolis criterion: under certain conditions on the cooling schedule, SA converges to the global optimum with probability 1. In practice, we use much faster cooling schedules that do not guarantee convergence but produce excellent solutions in reasonable time.
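
To see this transition numerically, we can tabulate the acceptance probability e^(−Δ/τ) for a fixed worsening move at several temperatures (Δ = 5 is an arbitrary value chosen for illustration):

```python
import numpy as np

delta = 5.0  # size of the worsening move (arbitrary, for illustration)
for tau in [100.0, 10.0, 1.0, 0.1]:
    p = np.exp(-delta / tau)
    print(f'tau = {tau:6.1f}: accept with probability {p:.4f}')
```

At τ = 100 the move is accepted about 95% of the time (near-random walk); at τ = 0.1 essentially never (greedy local search).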

Genetic algorithms

Genetic algorithms (GA) are inspired by biological evolution. They maintain a population of candidate solutions (tours) and iteratively evolve the population through selection, crossover (combining two parents into offspring), and mutation (random perturbation).

TSP-specific challenges

Crossover is tricky for the TSP because naively combining two tours often produces invalid solutions (missing or duplicated cities). Several specialized crossover operators have been developed for the TSP:

  • Order Crossover (OX): Copy a segment from one parent and fill the remaining positions with cities from the other parent in order.
  • Partially Mapped Crossover (PMX): Swap a segment between parents and use a mapping to resolve conflicts.
  • Edge Recombination Crossover (ERX): Preserve edges present in either parent.

We implement Order Crossover below, as it is the most commonly used.
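
As a quick trace of how OX works, here it is by hand on two 8-city parents (the parents and cut points below are made up for illustration):

```python
p1 = [0, 1, 2, 3, 4, 5, 6, 7]
p2 = [3, 7, 0, 6, 2, 5, 1, 4]
start, end = 2, 4                 # copy p1[2:5] = [2, 3, 4]

child = [-1] * len(p1)
child[start:end + 1] = p1[start:end + 1]

# Fill the open positions with the remaining cities in p2's order
fill = iter(c for c in p2 if c not in {2, 3, 4})
child = [c if c != -1 else next(fill) for c in child]

print(child)  # [7, 0, 2, 3, 4, 6, 5, 1] -- every city exactly once
```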

Algorithm

  1. Initialize a population of P random tours.
  2. Evaluate the fitness (tour length) of each individual.
  3. Repeat for G generations:
     a. Selection: Choose parents based on fitness (e.g., tournament selection).
     b. Crossover: Combine pairs of parents to create offspring.
     c. Mutation: Apply random perturbations (e.g., a 2-opt move) with probability p_m.
     d. Replacement: Form the new population from the best individuals among parents and offspring.
  4. Return the best tour found across all generations.

Implementation

def order_crossover(parent1, parent2):
    """Order Crossover (OX) for TSP."""
    n = len(parent1)
    start, end = sorted(np.random.choice(n, 2, replace=False))

    # Copy segment from parent1
    child = [-1] * n
    child[start:end + 1] = parent1[start:end + 1]

    # Fill remaining from parent2 in order
    segment_set = set(child[start:end + 1])
    fill = [c for c in parent2 if c not in segment_set]
    idx = 0
    for i in range(n):
        if child[i] == -1:
            child[i] = fill[idx]
            idx += 1

    return child

def tournament_select(population, lengths, k=3):
    """Select an individual via tournament selection."""
    candidates = np.random.choice(len(population), k, replace=False)
    best = min(candidates, key=lambda i: lengths[i])
    return population[best]

def genetic_algorithm(D, pop_size=100, generations=500, mutation_rate=0.3):
    """Genetic algorithm for TSP."""
    n = len(D)

    # Initialize population with random tours
    population = []
    for _ in range(pop_size):
        tour = list(range(n))
        np.random.shuffle(tour)
        population.append(tour)

    lengths = [tour_length(t, D) for t in population]
    best_idx = np.argmin(lengths)
    best_tour = population[best_idx][:]
    best_length = lengths[best_idx]

    for gen in range(generations):
        new_population = []

        # Elitism: keep the best individual
        new_population.append(best_tour[:])

        while len(new_population) < pop_size:
            # Selection
            parent1 = tournament_select(population, lengths)
            parent2 = tournament_select(population, lengths)

            # Crossover
            child = order_crossover(parent1, parent2)

            # Mutation: random 2-opt move
            if np.random.random() < mutation_rate:
                i, j = sorted(np.random.choice(n, 2, replace=False))
                child[i:j + 1] = child[i:j + 1][::-1]

            new_population.append(child)

        population = new_population
        lengths = [tour_length(t, D) for t in population]

        gen_best_idx = np.argmin(lengths)
        if lengths[gen_best_idx] < best_length:
            best_length = lengths[gen_best_idx]
            best_tour = population[gen_best_idx][:]

    return best_tour, best_length

tour_ga, length_ga = genetic_algorithm(D)
print(f'GA tour length: {length_ga:.2f}')

Enhancing the GA

The basic GA above can be improved significantly:

  • Local search after crossover: Apply 2-opt to every offspring before inserting it into the population. This combination of GA with local search is called a memetic algorithm and is one of the best-known metaheuristics for the TSP.
  • Diverse initial population: Use different construction heuristics (nearest neighbor from various starts, greedy) instead of random tours.
  • Adaptive mutation rate: Increase mutation when the population converges (diversity drops) and decrease it when the population is diverse.
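
A minimal sketch of the memetic idea, using a single first-improvement 2-opt pass as a cheap stand-in for Part 2's full two_opt (the helper names two_opt_pass and educate are illustrative, not from the series):

```python
import numpy as np

def tour_length(tour, D):
    """Total length of a closed tour (mirrors the Part 2 helper)."""
    return sum(D[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def two_opt_pass(tour, D):
    """One first-improvement 2-opt pass over the tour, in place."""
    n = len(tour)
    for i in range(1, n - 1):
        for j in range(i + 1, n):
            a, b = tour[i - 1], tour[i]
            c, d = tour[j], tour[(j + 1) % n]
            if D[a, c] + D[b, d] < D[a, b] + D[c, d] - 1e-12:
                tour[i:j + 1] = tour[i:j + 1][::-1]
                return tour
    return tour

def educate(child, D):
    """Memetic step: polish an offspring with local search
    before it enters the population."""
    return two_opt_pass(child, D)
```

Calling educate(child, D) right after mutation in genetic_algorithm would give a (very light) memetic algorithm; applying full 2-opt to every offspring is stronger but correspondingly slower.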

Ant colony optimization

Ant colony optimization (ACO) is inspired by the foraging behavior of ants. Real ants deposit pheromone on paths to food sources. Shorter paths get more pheromone (ants return faster, depositing more per unit time), which attracts more ants, creating a positive feedback loop that converges on short paths.

Algorithm (Ant System)

The original Ant System (AS) by Dorigo (1992) works as follows:

  1. Initialize pheromone levels τ_ij on all edges to a uniform value τ_0.
  2. Repeat for T iterations:
     a. Tour construction: Each of m ants builds a tour. Starting from a random city, an ant at city i chooses the next city j with probability proportional to τ_ij^α · η_ij^β, where η_ij = 1/d_ij is the heuristic desirability and α, β control the relative importance.
     b. Pheromone evaporation: τ_ij ← (1 − ρ) · τ_ij for all edges, where ρ ∈ (0, 1) is the evaporation rate.
     c. Pheromone deposit: Each ant k deposits Δτ_ij = Q / L_k on every edge of its tour, where L_k is its tour length and Q is a constant.
  3. Return the best tour found across all iterations.

Key parameters

  • α: Pheromone importance (typically 1).
  • β: Heuristic importance (typically 2–5). Higher β makes ants prefer short edges.
  • ρ: Evaporation rate (typically 0.1–0.5). Higher ρ means faster forgetting.
  • m: Number of ants (often set to n, the number of cities).
  • Q: Pheromone deposit constant. Often set so that Q/L* is on the order of τ_0, where L* is a typical tour length.

Implementation

def ant_colony(D, n_ants=20, n_iterations=200, alpha=1.0, beta=3.0, rho=0.2, Q=100.0):
    """Ant Colony Optimization (Ant System) for TSP."""
    n = len(D)

    # Heuristic information (inverse distance); suppress the harmless
    # divide-by-zero warning from the zero diagonal
    with np.errstate(divide='ignore'):
        eta = np.where(D > 0, 1.0 / D, 0.0)

    # Initialize pheromone
    tau = np.ones((n, n)) * (1.0 / (n * np.mean(D)))

    best_tour = None
    best_length = np.inf

    for iteration in range(n_iterations):
        all_tours = []
        all_lengths = []

        for ant in range(n_ants):
            # Construct tour
            visited = [False] * n
            start = np.random.randint(n)
            tour = [start]
            visited[start] = True

            for step in range(n - 1):
                current = tour[-1]
                # Compute probabilities for unvisited cities
                probs = np.zeros(n)
                for j in range(n):
                    if not visited[j]:
                        probs[j] = (tau[current, j] ** alpha) * (eta[current, j] ** beta)

                prob_sum = probs.sum()
                if prob_sum == 0:
                    # Fallback: pick random unvisited
                    unvisited = [j for j in range(n) if not visited[j]]
                    next_city = np.random.choice(unvisited)
                else:
                    probs /= prob_sum
                    next_city = np.random.choice(n, p=probs)

                tour.append(next_city)
                visited[next_city] = True

            length = tour_length(tour, D)
            all_tours.append(tour)
            all_lengths.append(length)

            if length < best_length:
                best_length = length
                best_tour = tour[:]

        # Pheromone evaporation
        tau *= (1 - rho)

        # Pheromone deposit
        for tour, length in zip(all_tours, all_lengths):
            deposit = Q / length
            for i in range(n):
                j = tour[(i + 1) % n]
                tau[tour[i], j] += deposit
                tau[j, tour[i]] += deposit

    return best_tour, best_length

tour_aco, length_aco = ant_colony(D)
print(f'ACO tour length: {length_aco:.2f}')

Variants

Several improved ACO variants exist:

  • MAX-MIN Ant System (MMAS): Bounds the pheromone levels between τ_min and τ_max to prevent premature convergence. Only the best ant deposits pheromone.
  • Ant Colony System (ACS): Uses a different transition rule (pseudorandom proportional), local pheromone updates during construction, and only the global-best ant deposits pheromone.
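
The MMAS pheromone update can be sketched in a few lines; the bounds tau_min and tau_max below are illustrative constants rather than the theoretically derived values, and the helper name mmas_update is our own:

```python
import numpy as np

def mmas_update(tau, best_tour, best_length, rho=0.2,
                tau_min=0.01, tau_max=5.0):
    """MAX-MIN Ant System pheromone update: evaporate everywhere,
    let only the best ant deposit, then clamp to [tau_min, tau_max]."""
    tau *= (1 - rho)                      # evaporation on all edges
    deposit = 1.0 / best_length           # only the best tour deposits
    n = len(best_tour)
    for i in range(n):
        a, b = best_tour[i], best_tour[(i + 1) % n]
        tau[a, b] += deposit
        tau[b, a] += deposit
    np.clip(tau, tau_min, tau_max, out=tau)
    return tau
```

This replaces steps 2b and 2c of the Ant System loop; the construction step is unchanged.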

Comparison

Let us compare all methods on our 50-city instance:

import time

methods = {
    'Nearest neighbor': lambda: (nearest_neighbor(D), None),
    'NN + 2-opt': lambda: (two_opt(nearest_neighbor(D), D), None),
    'Simulated annealing': lambda: simulated_annealing(D),
    'Genetic algorithm': lambda: genetic_algorithm(D),
    'Ant colony': lambda: ant_colony(D),
}

print(f'{"Method":<25} {"Length":>10} {"Time (s)":>10}')
print('-' * 47)
for name, method in methods.items():
    start = time.time()
    result = method()
    elapsed = time.time() - start
    tour = result[0]
    length = tour_length(tour, D)
    print(f'{name:<25} {length:>10.2f} {elapsed:>10.3f}')

Typical behavior for 50-city random Euclidean instances:

Method                   Solution quality        Runtime
--------------------------------------------------------------
Nearest neighbor         20–30% above optimal    milliseconds
NN + 2-opt               3–8% above optimal      milliseconds
Simulated annealing      1–5% above optimal      seconds
Genetic algorithm        2–8% above optimal      seconds
Ant colony               2–6% above optimal      seconds

The metaheuristics consistently outperform simple heuristics, at the cost of higher runtime. For even better results, combine metaheuristics with local search: run SA or GA and apply 2-opt to the final solution.

When to use what

  • Small instances (< 100 cities): Use the exact ILP formulation from Part 1. Modern solvers handle these in seconds.
  • Medium instances (100–5,000 cities): Use metaheuristics combined with 2-opt/Or-opt. Simulated annealing with careful parameter tuning tends to be the most robust choice.
  • Large instances (> 5,000 cities): Use the Lin-Kernighan-Helsgaun (LKH) heuristic, which is the state-of-the-art heuristic solver for the TSP. It combines a sophisticated variable-depth local search with effective perturbation strategies.
  • Need proven optimality: Use the Concorde solver, the most powerful exact TSP solver, which combines branch-and-cut with problem-specific cuts (comb inequalities, blossom inequalities).

Summary of the TSP mini-series

  1. Part 1 — Problem definition, ILP formulation (MTZ subtour elimination), Gurobi implementation.
  2. Part 2 — Construction heuristics (nearest neighbor, greedy) and improvement heuristics (2-opt, 3-opt, Or-opt).
  3. Part 3 — Metaheuristics (simulated annealing, genetic algorithms, ant colony optimization) and practical guidance.

The TSP is a microcosm of combinatorial optimization: it illustrates the tension between exact methods and heuristics, the power of local search, and the creativity of metaheuristic design. The ideas in this series — local search, neighborhood structures, population-based search, constructive heuristics — apply far beyond the TSP to scheduling, routing, packing, and virtually every other combinatorial optimization problem.