Reduction of VERTEX COVER to Perl Regular Expression Matching

Credit: Mike Rosulek

The VERTEX COVER Problem

A k-vertex cover in a graph G=(V,E) is a set A of k vertices, such that for every edge u-v in E, one or both of u and v is in A. Informally, the vertices of A "cover" their incident edges, and we want all edges covered by some vertex. The vertex cover problem is: given a graph G and a positive integer k, does G contain k-vertex cover?

Transformation to Regex Match

Let's represent the vertices of the graph as numbers between 1 and $V. We also suppose that we have an adjacency list for the graph. The adjacency list contains [u, v] if the graph contains an edge u-v. So for example, the following graph:

is represented as follows:

        $V = 5;
        @E = ([1,3],[1,5],[2,3],[2,4],[2,5],[4,5]);

with k stored in $k. Construct a string as follows:

        $string = ('x' x $V) . ':' . ('x' x $k) . ':' . join(',', ('xx') x @E);

Now construct a regex as follows:

        $regex  = '^' . ('(x?)' x $V) . 'x*:'
                . join('', map { "\\$_" } 1 .. $V) . ':'
                . join(',', map { my ($x,$y) = @$_; "\\$x\\${y}x?" } @E) . '$';

Now the graph has a k-vertex cover if and only if

        $string =~ /$regex/;

is true. If so, the backreference variables $1, $2, etc., will be assigned an x or the empty string, to indicate each vertex's inclusion or exclusion from the vertex cover, respectively.

Example

Suppose the graph is as depicted above and k=3. Then the value of $string is as follows:

        xxxxx:xxx:xx,xx,xx,xx,xx,xx

and the value of $regex is:

        ^(x?)(x?)(x?)(x?)(x?)x*:\1\2\3\4\5:\1\3x?,\1\5x?,\2\3x?,\2\4x?,\2\5x?,\4\5x?$

The test $string =~ /$regex/ succeeds, so the graph does contain a 3-vertex cover. Additionaly, the match assigns "x" to the special variables $1, $2, and $4, indicating that vertices 1, 2, and 4 are a 3-vertex cover.

Conclusion

The vertex cover problem is NP-hard. If there were an efficient (polynomial-time) algorithm for computing whether or not a regex matched a certain string, we could use it to quickly compute solutions to the graph 3-colorability problem, and, by extension, to the knapsack problem, the travelling salesman problem, etc. etc.