# Verifying Gallai's Path Conjecture for Small Graphs

*Till Heller*
Gallai's Conjecture, formulated by Tibor Gallai in 1968, is a foundational problem in graph theory concerning the decomposition of graphs into paths. A path decomposition of a graph $G$ is a set of paths such that every edge of $G$ belongs to exactly one path. The conjecture states that:

**Conjecture (Gallai, 1968).** Every connected graph on $n$ vertices can have its edges decomposed into at most $\lceil n/2 \rceil$ paths.
This conjecture highlights a deep connection between the structure of a graph and its decomposition into a minimal number of paths. Over the years, Gallai's Conjecture has been verified for specific classes of graphs:
- Lovász proved that the conjecture holds for graphs in which every vertex has odd degree. In this case the bound is tight: the decomposition requires exactly $n/2$ paths.
- Favaron et al. showed that the conjecture holds for 2-regular and 4-regular graphs.
- Botler et al. generalized this to graphs with maximum degree at most 5.
- Pyber proved the conjecture for graphs where the subgraph induced by even-degree vertices is a forest.
- Fan generalized Pyber's result to graphs where the even-degree induced subgraph is triangle-free with maximum degree at most 3.
Despite these advancements, the conjecture remains unproven in its full generality, presenting an enduring challenge in combinatorics.
In this article, we computationally verify Gallai's Conjecture for all connected graphs with up to 11 vertices — a total of over one billion graphs. No counterexample exists for graphs of this size.
## Approach
Our strategy is a three-stage pipeline: preprocessing, heuristics, and an exact ILP solver. The pipeline short-circuits at each stage — the first method that proves the conjecture for a given graph returns immediately, so the expensive ILP is only invoked as a last resort.
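The short-circuit dispatch can be sketched as follows. This is an illustration only: the stub stage functions stand in for the real preprocessing, heuristic, and ILP routines.

```python
def verify_graph(graph, stages):
    """Try each stage in order and stop at the first one that
    certifies the conjecture for this graph (short-circuit pipeline)."""
    for name, stage in stages:
        if stage(graph):  # a stage returns True if it proves the bound
            return name
    return None  # would flag a potential counterexample for closer inspection

# hypothetical usage with stub stages
stages = [
    ("preprocessing", lambda g: False),  # known sufficient conditions
    ("heuristic", lambda g: True),       # greedy decomposition within the bound
    ("ilp", lambda g: True),             # exact solver as a last resort
]
```

Because the list is ordered from cheapest to most expensive, the ILP stage is reached only when everything before it has failed.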
### Preprocessing: Exploiting Known Results
The theoretical results listed above give us sufficient conditions that are cheap to check. For each graph, we test four conditions:
- Induced forest: Do the even-degree vertices induce a forest? (Pyber's result)
- Induced triangle-free: Is the even-degree induced subgraph triangle-free with maximum degree at most 3? (Fan's result)
- Maximum degree at most 5: Does the graph have maximum degree at most 5? (Botler et al.)
- All degrees 2 or 4: Are all vertex degrees in $\{2, 4\}$? (Favaron et al.)
If any of these conditions holds, the conjecture is already proven for that graph and we skip it entirely.
```python
def preprocessing(g, graph_name):
    induced_forest = subblocks_induce_forest(g)
    induced_triangle_free = subblocks_induce_triangle_free(g)
    maximum_degree_at_most_5 = has_maximum_degree_at_most_5(g)
    all_vertices_have_degree_two_or_four = vertices_have_degree_two_or_four(g)
    return PreprocessingResult(g, graph_name, induced_forest, induced_triangle_free,
                               maximum_degree_at_most_5, all_vertices_have_degree_two_or_four)
```
This turns out to be remarkably effective: for graphs with $n \le 6$, every single graph is covered by preprocessing. Even at $n = 11$, preprocessing eliminates about 34% of the one billion connected graphs.
### Heuristics: Greedy Path Decomposition
For graphs that survive preprocessing, we try two greedy heuristics before resorting to the ILP.
**Highest-degree heuristic.** This heuristic repeatedly picks the vertex with the highest remaining degree and greedily extends a path from it by following edges until no more are available. The intuition is that starting from high-degree vertices tends to produce long paths, which uses up many edges per path and keeps the total count low.
```python
def highest_degree_heuristic(graph, graph_name):
    adj = _build_adjacency(graph)
    degree = {n: len(adj[n]) for n in graph.nodes()}
    paths = []
    while any(degree[n] > 0 for n in degree):
        start_node = max(degree, key=degree.get)
        current_path = []
        current_node = start_node
        visited = {start_node}  # track visited vertices: we need a simple path, not a trail
        while True:
            candidates = [n for n in adj[current_node] if n not in visited]
            if not candidates:
                break
            neighbor = candidates[0]
            current_path.append((current_node, neighbor))
            _remove_edge(adj, current_node, neighbor)
            degree[current_node] -= 1
            degree[neighbor] -= 1
            visited.add(neighbor)
            current_node = neighbor
        paths.append(current_path)
    return Solution(Algorithm.HIGHEST_DEGREE_HEURISTICS, graph, graph_name,
                    len(paths), [len(p) for p in paths])
```
**Random start heuristic.** If the deterministic heuristic fails to stay within the $\lceil n/2 \rceil$ bound, we try a randomized variant: pick a random vertex with remaining edges, extend a path by following random edges, and repeat until all edges are used. We run 10 independent attempts and return the best result.
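A minimal, self-contained sketch of the randomized variant is below. It works directly on an undirected edge list rather than the project's graph objects, so the details differ from the actual implementation:

```python
import random

def random_start_heuristic(edges, attempts=10, seed=0):
    """Randomized greedy path decomposition of an undirected edge list.
    Returns the best (fewest-paths) decomposition over `attempts` runs."""
    rng = random.Random(seed)
    best = None
    for _ in range(attempts):
        # rebuild the adjacency structure for each independent attempt
        adj = {}
        for u, v in edges:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        paths = []
        while any(adj.values()):
            # start at a random vertex that still has an unused edge
            start = rng.choice([v for v in adj if adj[v]])
            path, node, visited = [], start, {start}
            while True:
                choices = [w for w in adj[node] if w not in visited]
                if not choices:
                    break
                nxt = rng.choice(choices)
                adj[node].discard(nxt)
                adj[nxt].discard(node)
                path.append((node, nxt))
                visited.add(nxt)
                node = nxt
            paths.append(path)
        if best is None or len(paths) < len(best):
            best = paths
    return best
```

On a triangle, for instance, every attempt yields two paths, which matches the bound $\lceil 3/2 \rceil = 2$.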
These heuristics are fast (linear in the number of edges) and solve the vast majority of graphs. At $n = 11$, 87.4% of the remaining graphs are solved by the highest-degree heuristic and 12.6% by the random heuristic, leaving zero graphs for the ILP.
### Exact ILP Formulation
For any graph that resists both heuristics, we formulate the path decomposition problem as an integer linear program (ILP) and solve it with Gurobi.
Let $G = (V, E)$ be a connected graph on $n$ vertices. We set $k = \lceil n/2 \rceil$, the conjectured upper bound on the number of paths. We introduce three sets of binary variables:
- $x_{e,p}$ — edge $e$ is assigned to path $p$
- $z_{v,p}$ — vertex $v$ participates in path $p$
- $y_p$ — path $p$ is used in the decomposition

The objective is to minimize the number of paths: $\min \sum_{p} y_p$.
We start by setting up the model and defining the variables:
```python
from gurobipy import GRB, Model

model = Model("Gallai's Path Conjecture")
x = model.addVars([(e[0], e[1], p) for e in edges for p in range(k)],
                  vtype=GRB.BINARY, name="x")
z = model.addVars(vertices, range(k), vtype=GRB.BINARY, name="z")
y = model.addVars(range(k), vtype=GRB.BINARY, name="y")
model.setObjective(y.sum(), GRB.MINIMIZE)
```
Edge coverage. Every edge must appear in exactly one path:
```python
for e in edges:
    model.addConstr(sum(x[e[0], e[1], p] for p in range(k)) == 1,
                    name=f"edge_cover_{e[0]}_{e[1]}")
```
Path continuity. If an edge is in a path, both its endpoints must participate in that path:
```python
for e in edges:
    u, v = e
    for p in range(k):
        model.addConstr(x[u, v, p] <= z[u, p])
        model.addConstr(x[u, v, p] <= z[v, p])
```
Degree constraint. Each vertex has degree at most 2 in any path (since a path visits each vertex at most once, with at most one edge entering and one leaving):
```python
for v in vertices:
    for p in range(k):
        model.addConstr(
            sum(x[u, v1, p] for u, v1 in edges if v1 == v) +
            sum(x[v1, u, p] for v1, u in edges if v1 == v) <= 2 * z[v, p])
```
**Acyclicity.** A path is a tree, so its number of edges must be strictly less than its number of vertices. We enforce this per path: if path $p$ is used ($y_p = 1$), then the number of edges in $p$ is at most the number of participating vertices minus 1:
```python
for p in range(k):
    model.addConstr(
        sum(x[u, v, p] for u, v in edges) <= sum(z[v, p] for v in vertices) - y[p])
```
Combined with the degree-2 constraint, this prevents a used path from closing into a cycle.
Path usage. An edge can only be assigned to a path that is marked as used:
```python
for e in edges:
    for p in range(k):
        model.addConstr(x[e[0], e[1], p] <= y[p])
```
Symmetry breaking. Since paths are interchangeable, we break symmetry by requiring that paths are used in order:
```python
for p in range(k - 1):
    model.addConstr(y[p] >= y[p + 1])
```
This concludes the formulation. We solve it with:
```python
model.optimize()
```
If the model finds a feasible solution (by construction it uses at most $k = \lceil n/2 \rceil$ paths), the conjecture holds for this graph.
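Independent of the solver, any decomposition that comes back is cheap to double-check. The helper below is a hypothetical sketch, not part of the project's code: it verifies that a list of paths (each a list of `(u, v)` edge tuples) covers every edge exactly once and that each piece is a simple path.

```python
def is_valid_decomposition(edges, paths):
    """Check that `paths` partition `edges` and that each path is simple."""
    seen = set()
    for path in paths:
        if not path:
            return False
        # walk the path and collect its vertex sequence
        sequence = [path[0][0]]
        for u, v in path:
            if u != sequence[-1]:
                return False  # edges are not chained head-to-tail
            sequence.append(v)
        if len(set(sequence)) != len(sequence):
            return False  # a vertex repeats: this is a walk, not a simple path
        for u, v in path:
            key = frozenset((u, v))
            if key in seen:
                return False  # edge assigned to more than one path
            seen.add(key)
    # every edge must be covered exactly once
    return seen == {frozenset(e) for e in edges}
```

A check like this turns the verification from "trust the solver" into "trust a 20-line certificate checker".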
## Scaling Up: Parallel Graph Generation
For $n \le 10$, we can enumerate all connected graphs, store them, and process them. But at $n = 11$, there are over one billion connected graphs, so storing them all at once is impractical.
We use nauty's geng tool to generate all non-isomorphic connected graphs, and pipe them directly through our preprocessing filter. The geng tool supports a res/mod partitioning scheme that lets us split the work across parallel workers:
```bash
#!/bin/bash
# generate_filtered.sh <n_vertices> <n_workers> <output_file>
N=$1
WORKERS=$2
OUTPUT=$3

for i in $(seq 0 $((WORKERS - 1))); do
    geng -c "$N" "$i/$WORKERS" | python filter_graphs.py >> "$OUTPUT" &
done
wait
```
Each worker generates its partition of graphs, filters out those covered by preprocessing, and appends the survivors to the output file. For $n = 11$, this produces a 7.4 GB file of 660 million graphs that need to be checked by heuristics.
The main computation then processes this file in parallel using Python's multiprocessing, with results streamed to a SQLite database. The system supports resume on interruption — essential when processing hundreds of millions of graphs.
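The resume logic can be sketched as follows. This is a simplified, sequential illustration with a hypothetical `solve` callback and table layout, not the project's actual schema or its multiprocessing driver:

```python
import sqlite3

def process_stream(conn, graph6_lines, solve):
    """Stream results into SQLite, skipping graphs already recorded
    so an interrupted run can resume where it left off."""
    conn.execute("CREATE TABLE IF NOT EXISTS results "
                 "(graph6 TEXT PRIMARY KEY, n_paths INTEGER)")
    done = {row[0] for row in conn.execute("SELECT graph6 FROM results")}
    for g6 in graph6_lines:
        if g6 in done:
            continue  # already processed in a previous run
        n_paths = solve(g6)  # hypothetical callback: heuristics, then ILP
        conn.execute("INSERT INTO results VALUES (?, ?)", (g6, n_paths))
        conn.commit()  # commit per graph so progress survives a crash
```

Keying the table on the graph6 string makes the skip check a set lookup, so restarting after a crash costs one table scan rather than recomputing anything.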
## Results
We verified Gallai's Conjecture for all connected graphs with up to 11 vertices. Here is a summary:
| $n$ | Connected graphs | After preprocessing | Solved by heuristics | ILP calls |
|---|---|---|---|---|
| 4 | 6 | 0 | 0 | 0 |
| 5 | 21 | 0 | 0 | 0 |
| 6 | 112 | 0 | 0 | 0 |
| 7 | 853 | 109 | 109 | 0 |
| 8 | 11,117 | 2,854 | 2,854 | 0 |
| 9 | 261,080 | 109,647 | 109,647 | 0 |
| 10 | 11,716,571 | 6,539,574 | 6,539,573 | 1 |
| 11 | 1,006,700,565 | 660,907,582 | 660,907,582 | 0 |
Several observations stand out:
Preprocessing is remarkably effective. For $n \le 6$, every connected graph satisfies at least one of the known sufficient conditions; no computation beyond the preprocessing checks is needed. Even at $n = 11$, preprocessing eliminates about 34% of graphs upfront.
The heuristics handle almost everything else. Of the 660 million graphs remaining after preprocessing at $n = 11$, the highest-degree heuristic solves 87.4% and the random heuristic covers the remaining 12.6%. Together, they achieve a 100% success rate at $n = 11$.
The ILP is almost never needed. Across all graph sizes from $n = 4$ to $n = 11$, the ILP solver was invoked exactly once: for a single 10-vertex graph with 25 edges (graph6 encoding: `I?AFnZ^no`). The ILP decomposed it into 4 paths, well within the upper bound of $\lceil 10/2 \rceil = 5$.
### The Single ILP Instance
The lone graph requiring the ILP solver is worth examining. With 10 vertices and 25 edges, it has an edge density of $25/45 \approx 0.56$ (out of a maximum of $\binom{10}{2} = 45$ possible edges). It is dense enough that the greedy heuristics struggle to find an efficient decomposition, but not so structured that the preprocessing conditions apply. The ILP found a decomposition into just 4 paths.
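The vertex and edge counts can be read straight off the graph6 string. The snippet below decodes the header byte and the upper-triangle bit vector of the format (for graphs with at most 62 vertices); it is a standalone illustration, not project code.

```python
def graph6_stats(s):
    """Return (n_vertices, n_edges) for a small-graph graph6 string (n <= 62)."""
    data = [ord(c) - 63 for c in s]
    n = data[0]  # first byte encodes the vertex count as n + 63
    # remaining bytes hold the upper triangle of the adjacency matrix,
    # 6 bits per byte, most significant bit first
    bits = []
    for val in data[1:]:
        bits.extend((val >> shift) & 1 for shift in range(5, -1, -1))
    n_bits = n * (n - 1) // 2
    edges = sum(bits[:n_bits])  # trailing padding bits are ignored
    return n, edges

print(graph6_stats("I?AFnZ^no"))  # (10, 25)
```

Running it on `I?AFnZ^no` confirms the 10 vertices and 25 edges quoted above.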
The fact that this is the only graph across all 11.7 million 10-vertex graphs (and zero out of one billion 11-vertex graphs) that needed exact optimization speaks to the effectiveness of the heuristic approach.
## Conclusion
We have computationally verified that Gallai's Path Decomposition Conjecture holds for all connected graphs with up to 11 vertices — over one billion graphs in total. The three-stage pipeline of preprocessing, heuristics, and exact ILP solving proved highly effective:
- Known theoretical results eliminate a significant fraction of graphs before any computation.
- Simple greedy heuristics solve virtually all remaining cases.
- The ILP solver serves as a safety net that was needed for exactly one graph.
These results provide strong computational evidence for the conjecture and suggest that counterexamples, if they exist, must occur at substantially larger graph sizes. The code is available on GitHub.