Reduction of HAMILTONIAN CIRCUIT to Perl Regular Expression Matching
Credit: Mike Rosulek
The HAMILTONIAN CIRCUIT Problem
A Hamiltonian Circuit in a graph G=(V,E) is a cycle that visits every vertex in V exactly once. The Hamiltonian Circuit problem is: given a graph as input, decide if it contain a Hamiltonian Circuit.
Transformation to Regex Match
Let's represent the vertices of the graph as numbers between 1 and $V. We also suppose that we have an adjacency list for the graph. The adjacency list contains [u, v] if the graph contains an edge uv. So for example, the following graph:

is represented as follows:
$V = 5;
@E = ([1,3],[1,5],[2,3],[2,4],[2,5],[4,5]);
Construct a temporary list of all valid possible edges. This list is used to check that the vertices chosen by the regular expression engine are all pairwise distinct:
@all_edges = map { my $x = $_; map { [$x, $_] } $x+1 .. $V } 1 .. $V-1;
Construct a string as follows:
$string = (join(' ', 1 .. $V) . "\n") x $V
. "\n"
. (join(' ', map { join "-", @$_ } @all_edges ) . "\n") x @all_edges
. "\n"
. (join(' ', map { join "-", @$_ } @E ) . "\n") x $V;
Now construct a regex as follows:
$regex = "^ "
. ".* \\b (\\d+) \\b .* \\n\n" x $V
. "\\n\n"
. join("", map { my ($x, $y) = @$_;
".* \\b (?: \\$x-\\$y | \\$y-\\$x ) \\b .* \\n\n"
} @all_edges)
. "\\n\n"
. join("", map { my ($x, $y) = ($_, $_+1);
".* \\b (?: \\$x-\\$y | \\$y-\\$x ) \\b .* \\n\n"
} 1 .. ($V-1))
. ".* \\b (?: \\$V-\\1 | \\1-\\$V ) \\b .* \\n \$\n";
Now the graph has a Hamiltonian Circuit if and only if
$string =~ /$regex/x;
is true. If so, the backreference variables $1, $2, etc., will contain the vertices of V in the order visited by a Hamiltonian Circuit.
Example
Suppose the graph is as depicted above. Then the value of $string is as follows: (newlines are significant)
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1 2 3 4 5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-2 1-3 1-4 1-5 2-3 2-4 2-5 3-4 3-5 4-5
1-3 1-5 2-3 2-4 2-5 4-5
1-3 1-5 2-3 2-4 2-5 4-5
1-3 1-5 2-3 2-4 2-5 4-5
1-3 1-5 2-3 2-4 2-5 4-5
1-3 1-5 2-3 2-4 2-5 4-5
and the value of $regex is: (whitespace in the regular expression is ignored by using the /x modifier)
^ .* \b (\d+) \b .* \n
.* \b (\d+) \b .* \n
.* \b (\d+) \b .* \n
.* \b (\d+) \b .* \n
.* \b (\d+) \b .* \n
\n
.* \b (?: \1-\2 | \2-\1 ) \b .* \n
.* \b (?: \1-\3 | \3-\1 ) \b .* \n
.* \b (?: \1-\4 | \4-\1 ) \b .* \n
.* \b (?: \1-\5 | \5-\1 ) \b .* \n
.* \b (?: \2-\3 | \3-\2 ) \b .* \n
.* \b (?: \2-\4 | \4-\2 ) \b .* \n
.* \b (?: \2-\5 | \5-\2 ) \b .* \n
.* \b (?: \3-\4 | \4-\3 ) \b .* \n
.* \b (?: \3-\5 | \5-\3 ) \b .* \n
.* \b (?: \4-\5 | \5-\4 ) \b .* \n
\n
.* \b (?: \1-\2 | \2-\1 ) \b .* \n
.* \b (?: \2-\3 | \3-\2 ) \b .* \n
.* \b (?: \3-\4 | \4-\3 ) \b .* \n
.* \b (?: \4-\5 | \5-\4 ) \b .* \n
.* \b (?: \5-\1 | \1-\5 ) \b .* \n $
The test $string =~ /$regex/x succeeds, so the graph does contain a Hamiltonian Circuit. Additionaly, the match assigns to the special variables $1 through $5 the values (5, 4, 2, 3, 1), indicating the order in which the cycle visited all the vertices. For an example where the graph does not contain a Hamiltonian Circuit, remove the edge [2,4] from the list.
Conclusion
The Hamiltonian Circuit problem is NP-hard. If there were an efficient (polynomial-time) algorithm for computing whether or not a regex matched a certain string, we could use it to quickly compute solutions to the graph 3-colorability problem, and, by extension, to the knapsack problem, the travelling salesman problem, etc. etc.
For A More Detailed Discussion
See this thread on perlmonks.org. A slightly improved regex which uses lookaheads is also given.