Aspects of OR - Traveling Salesperson Problem - Heuristics (Part 2)

In Part 1 we formulated the Traveling Salesperson Problem as an integer linear program and solved it exactly with Gurobi. Exact methods guarantee an optimal solution, but they hit a wall as instance sizes grow — the ILP formulation has $O(n^2)$ binary variables and $O(n^2)$ subtour elimination constraints, and solve times can grow exponentially.

For large instances (hundreds or thousands of cities), we need heuristics: algorithms that trade optimality for speed. A good heuristic finds a solution that is close to optimal in a fraction of the time an exact method would need. In this post we cover two families of heuristics:

Construction heuristics — build a tour from scratch (nearest neighbor, greedy).
Improvement heuristics — start from an existing tour and iteratively improve it (2-opt, 3-opt, Or-opt).

In Part 3 we go further with metaheuristics that combine these building blocks in more sophisticated search frameworks.

Setup

Throughout this post, we assume a complete graph with $n$ cities and a symmetric distance matrix $d_{ij}$ . We represent a tour as a list of city indices. All Python implementations use NumPy and a precomputed distance matrix:

import numpy as np

def make_distance_matrix(coords):
    """Compute Euclidean distance matrix from (x, y) coordinates."""
    n = len(coords)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dx = coords[i][0] - coords[j][0]
            dy = coords[i][1] - coords[j][1]
            D[i, j] = D[j, i] = np.sqrt(dx * dx + dy * dy)
    return D

def tour_length(tour, D):
    """Total length of a tour."""
    return sum(D[tour[i], tour[(i + 1) % len(tour)]] for i in range(len(tour)))

We will use a random 50-city instance for demonstrations:

np.random.seed(42)
coords = np.random.rand(50, 2) * 100
D = make_distance_matrix(coords)

Nearest neighbor

The nearest neighbor heuristic is the simplest construction method. Start at some city, then repeatedly visit the closest unvisited city until all cities have been visited. Finally, return to the starting city.

Algorithm

Choose a starting city $c_0$ .
Set the current city to $c_0$ and mark it as visited.
While there are unvisited cities:
- Find the nearest unvisited city $c$ to the current city.
- Move to $c$ and mark it as visited.
Return to $c_0$ .

Analysis

Nearest neighbor runs in $O(n^2)$ time — at each of the $n$ steps, we scan up to $n$ cities to find the nearest one. It is fast and easy to implement, but the solution quality can be poor. The heuristic tends to make good local choices early on but may be forced into long edges at the end when few unvisited cities remain.

The approximation ratio of nearest neighbor is $O(\log n)$ — on worst-case instances it can produce tours up to $\Theta(\log n)$ times longer than optimal. In practice, the typical overhead is 20–30% above optimal.

A simple improvement: run nearest neighbor from every city as the starting point and keep the best tour. This costs $O(n^3)$ but often yields noticeably better results.

Implementation

def nearest_neighbor(D, start=0):
    """Nearest neighbor heuristic for TSP."""
    n = len(D)
    visited = [False] * n
    tour = [start]
    visited[start] = True

    for _ in range(n - 1):
        current = tour[-1]
        best_next = -1
        best_dist = np.inf
        for j in range(n):
            if not visited[j] and D[current, j] < best_dist:
                best_dist = D[current, j]
                best_next = j
        tour.append(best_next)
        visited[best_next] = True

    return tour

tour_nn = nearest_neighbor(D)
print(f'Nearest neighbor tour length: {tour_length(tour_nn, D):.2f}')

Greedy heuristic

The greedy heuristic takes a different approach: instead of building the tour city by city, it builds it edge by edge. Sort all edges by distance and greedily add the shortest edge that does not violate tour constraints.

Algorithm

Sort all $\binom{n}{2}$ edges by distance.
For each edge (shortest first):
- Add the edge to the tour if it does not:
  - Give any city degree > 2 (at most two edges per city).
  - Create a cycle of length < $n$ (premature subtour).
The result is a Hamiltonian cycle.

Analysis

The greedy heuristic runs in $O(n^2 \log n)$ due to the sorting step. It tends to produce better solutions than nearest neighbor because it considers the globally shortest edges first, rather than making locally greedy decisions from a single starting point.

Implementation

def greedy_tsp(D):
    """Greedy edge-insertion heuristic for TSP."""
    n = len(D)

    # Generate and sort all edges
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            edges.append((D[i, j], i, j))
    edges.sort()

    degree = [0] * n
    adj = [[] for _ in range(n)]

    # Union-Find for cycle detection
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    edges_added = 0
    for dist, i, j in edges:
        if edges_added == n:
            break
        if degree[i] >= 2 or degree[j] >= 2:
            continue
        # Avoid premature cycle (unless it would complete the tour)
        if find(i) == find(j) and edges_added < n - 1:
            continue

        adj[i].append(j)
        adj[j].append(i)
        degree[i] += 1
        degree[j] += 1
        union(i, j)
        edges_added += 1

    # Extract tour from adjacency list
    tour = [0]
    prev = -1
    current = 0
    for _ in range(n - 1):
        for neighbor in adj[current]:
            if neighbor != prev:
                tour.append(neighbor)
                prev = current
                current = neighbor
                break

    return tour

tour_greedy = greedy_tsp(D)
print(f'Greedy tour length: {tour_length(tour_greedy, D):.2f}')

2-opt

The construction heuristics above give us a decent starting tour, but we can almost always improve it with local search. The most widely used improvement heuristic for the TSP is 2-opt.

The idea: take two edges in the tour and check whether reconnecting them (reversing the segment between them) would shorten the tour. If so, make the swap. Repeat until no improving 2-opt move exists.

The 2-opt move

A 2-opt move removes two edges $(i, i+1)$ and $(j, j+1)$ from the tour and reconnects them as $(i, j)$ and $(i+1, j+1)$ . This reverses the segment between positions $i+1$ and $j$ :

Before: ... → a → b → ... → c → d → ...
After:  ... → a → c → ... → b → d → ...
              (segment b...c is reversed)

The move improves the tour if:

$d_{a,c} + d_{b,d} < d_{a,b} + d_{c,d}.$

Algorithm

Start with an initial tour.
For each pair of edges $(i, j)$ $(i, j)$ with $i < j$ $i < j$ :
- Compute the change in tour length from the 2-opt move.
- If the change is negative (improvement), apply the move.
Repeat from Step 2 until a full pass finds no improvement (locally optimal).

Analysis

Each pass takes $O(n^2)$ time (checking all pairs of edges). The number of passes is typically small in practice — often $O(n)$ or less. The resulting tour is 2-optimal: no single 2-opt move can improve it.

2-opt typically improves a nearest neighbor tour by 15–25% and brings the solution within a few percent of optimal for moderate-sized instances.

Implementation

def two_opt(tour, D):
    """Improve a tour using 2-opt local search."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if j == n - 1 and i == 0:
                    continue  # skip: same as no change
                # Cost of current edges
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                # Cost change
                delta = D[a, c] + D[b, d] - D[a, b] - D[c, d]
                if delta < -1e-10:
                    # Reverse the segment between i+1 and j
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour

tour_2opt = two_opt(tour_nn.copy(), D)
print(f'2-opt tour length: {tour_length(tour_2opt, D):.2f}')

3-opt

3-opt generalizes 2-opt by removing three edges and reconnecting the three resulting segments. This allows moves that 2-opt cannot make — for example, rearranging the order of segments rather than just reversing one.

With three edges removed, there are 8 possible ways to reconnect the segments (including the original tour). After excluding the original and the moves that are equivalent to one or two 2-opt moves, there are a few genuinely new reconnection patterns.

Analysis

Each pass of 3-opt takes $O(n^3)$ time — significantly more expensive than 2-opt. For this reason, full 3-opt is rarely used in practice. Instead, solvers often use restricted 3-opt or the closely related Lin-Kernighan heuristic, which uses a variable number of edge swaps guided by a clever search strategy.

The additional improvement from 3-opt over 2-opt is typically 1–3% — meaningful for high-quality solutions but at a much higher computational cost.

Implementation (simplified)

A full 3-opt implementation is quite involved due to the many reconnection cases. Here is a simplified version that considers the most common non-sequential 3-opt moves:

def three_opt_pass(tour, D):
    """One pass of 3-opt. Returns True if an improvement was found."""
    n = len(tour)
    for i in range(n - 2):
        for j in range(i + 2, n - 1):
            for k in range(j + 2, n + (1 if i > 0 else 0)):
                # Segment endpoints
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[j + 1]
                e, f = tour[k % n], tour[(k + 1) % n]

                d0 = D[a, b] + D[c, d] + D[e, f]

                # Try all reconnections and pick the best
                # Reconnection: a-d, e-b, c-f (segment reversal + reorder)
                d1 = D[a, d] + D[e, b] + D[c, f]
                if d1 < d0 - 1e-10:
                    # Apply the move: reverse and reorder segments
                    seg1 = tour[i + 1:j + 1]
                    seg2 = tour[j + 1:k % n + 1] if k % n > j else tour[j + 1:]
                    tour[i + 1:i + 1 + len(seg2)] = seg2
                    tour[i + 1 + len(seg2):i + 1 + len(seg2) + len(seg1)] = seg1
                    return True
    return False

In practice, most implementations skip full 3-opt and use the Or-opt move instead, which is a special case that is much faster.

Or-opt

Or-opt is a fast improvement heuristic that relocates short segments of the tour. Specifically, it takes a segment of 1, 2, or 3 consecutive cities and moves it to a different position in the tour.

The Or-opt move

Remove a segment of $k$ consecutive cities ( $k = 1, 2, 3$ ) from the tour and insert it between two other consecutive cities. For $k = 1$ , this is simply relocating a single city.

Analysis

Or-opt runs in $O(n^2)$ per pass — the same as 2-opt — but considers moves that 2-opt cannot make (relocating a city or segment without reversing). It is an excellent complement to 2-opt: applying 2-opt followed by Or-opt (and iterating) often reaches solutions close to what full 3-opt would achieve, at much lower cost.

Implementation

def or_opt(tour, D, segment_sizes=(1, 2, 3)):
    """Improve a tour using Or-opt (relocate segments of size 1, 2, 3)."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for size in segment_sizes:
            for i in range(n):
                # Segment to relocate: tour[i], ..., tour[i+size-1]
                if i + size > n:
                    continue

                prev_i = (i - 1) % n
                next_seg = (i + size) % n

                # Cost of removing the segment
                removal_gain = (
                    D[tour[prev_i], tour[i]]
                    + D[tour[(i + size - 1) % n], tour[next_seg]]
                    - D[tour[prev_i], tour[next_seg]]
                )

                # Try inserting the segment at every other position
                for j in range(n):
                    if j == prev_i or j == i or j == (i + size - 1) % n:
                        continue

                    next_j = (j + 1) % n
                    insertion_cost = (
                        D[tour[j], tour[i]]
                        + D[tour[(i + size - 1) % n], tour[next_j]]
                        - D[tour[j], tour[next_j]]
                    )

                    if insertion_cost < removal_gain - 1e-10:
                        # Extract segment and reinsert
                        segment = [tour[(i + s) % n] for s in range(size)]
                        remaining = [tour[t] for t in range(n)
                                     if t not in set(range(i, i + size)) and
                                     (i + size <= n or t not in set(range(0, (i + size) % n)))]
                        # Simpler approach: rebuild tour
                        new_tour = []
                        seg_inserted = False
                        for t in range(n):
                            idx = tour[t]
                            if t >= i and t < i + size:
                                continue
                            new_tour.append(idx)
                            if idx == tour[j] and not seg_inserted:
                                new_tour.extend(segment)
                                seg_inserted = True

                        if len(new_tour) == n:
                            tour = new_tour
                            improved = True
                            break
                if improved:
                    break
            if improved:
                break

    return tour

Combining heuristics

The best results come from combining construction and improvement heuristics. A typical pipeline is:

Construct an initial tour with nearest neighbor (from multiple starting cities).
Improve with 2-opt until locally optimal.
Improve further with Or-opt.
Repeat steps 2–3 until no improvement.

def solve_tsp_heuristic(D):
    """Combined heuristic: best nearest neighbor + 2-opt + Or-opt."""
    n = len(D)
    best_tour = None
    best_length = np.inf

    for start in range(n):
        tour = nearest_neighbor(D, start=start)
        tour = two_opt(tour, D)
        length = tour_length(tour, D)
        if length < best_length:
            best_length = length
            best_tour = tour[:]

    return best_tour, best_length

tour_best, length_best = solve_tsp_heuristic(D)
print(f'Best heuristic tour length: {length_best:.2f}')

Comparison

Let us compare the heuristics on our 50-city instance:

# Run all methods
tour_nn = nearest_neighbor(D, start=0)
tour_gr = greedy_tsp(D)
tour_nn_2opt = two_opt(nearest_neighbor(D, start=0), D)
tour_best, _ = solve_tsp_heuristic(D)

print(f'{"Method":<30} {"Tour length":>12}')
print('-' * 44)
print(f'{"Nearest neighbor":<30} {tour_length(tour_nn, D):>12.2f}')
print(f'{"Greedy":<30} {tour_length(tour_gr, D):>12.2f}')
print(f'{"NN + 2-opt":<30} {tour_length(tour_nn_2opt, D):>12.2f}')
print(f'{"Best NN + 2-opt (all starts)":<30} {tour_length(tour_best, D):>12.2f}')

Typical results for random Euclidean instances: nearest neighbor produces tours 20–30% above optimal, greedy is slightly better, 2-opt brings it down to 5–10% above optimal, and the multi-start combination reaches 2–5% above optimal.

Limitations

Improvement heuristics like 2-opt and Or-opt are local search methods: they stop at the first local optimum they encounter. If the initial tour happens to lead to a poor local optimum, no amount of 2-opt moves will escape it.

To overcome this, we need strategies that can escape local optima — metaheuristics. In Part 3, we explore simulated annealing, genetic algorithms, and ant colony optimization, all of which use randomization and population-based search to explore the solution space more broadly.