A Brief on "An Empirical Analysis of Search in GSAT" by Ian P. Gent and Toby Walsh

Research into propositional satisfiability (SAT) matters to AI because SAT is both a natural (i.e. mathematically well-understood) way of representing knowledge and a technique for verifying that knowledge. In addition, because SAT is the archetypal hard problem in computer science (i.e. it is NP-complete), it is a natural target for research into how (in)tractable such hard problems are.

GSAT is an algorithm for solving SAT based on greedy hill-climbing, using randomization to select among variable assignments of equal utility. The problem space that Gent and Walsh selected for their empirical analysis consists of randomly generated k-SAT problems with N variables and L clauses, more specifically 3-SAT with L ~ 4.3N (which was determined to be the transition point between satisfiable and unsatisfiable problems, thus representing a "difficult" search space). Gent and Walsh present their results graphically, followed by a numerical analysis of those results.
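To make the procedure concrete, here is a minimal sketch of GSAT together with a random k-SAT generator. It is an illustration under stated assumptions, not the code Gent and Walsh used: the signed-integer encoding of literals, the function names, and the defaults for `max_tries` and `max_flips` are all my own choices.

```python
import random


def random_ksat(n_vars, n_clauses, k=3, rng=random):
    """Random k-SAT: each clause has k distinct variables, each negated with probability 1/2."""
    clauses = []
    for _ in range(n_clauses):
        variables = rng.sample(range(1, n_vars + 1), k)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in variables))
    return clauses


def num_satisfied(clauses, assignment):
    """Score of an assignment: the number of clauses with at least one true literal."""
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause) for clause in clauses
    )


def gsat(clauses, n_vars, max_tries=10, max_flips=None, rng=random):
    """Greedy hill-climbing with random restarts; returns a model or None."""
    if max_flips is None:
        max_flips = 5 * n_vars  # arbitrary default, not a value from the paper
    for _ in range(max_tries):
        assignment = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
        for _ in range(max_flips):
            if num_satisfied(clauses, assignment) == len(clauses):
                return assignment
            # Evaluate every single-variable flip and keep the best-scoring ones.
            best_score, best_vars = -1, []
            for v in range(1, n_vars + 1):
                assignment[v] = not assignment[v]
                score = num_satisfied(clauses, assignment)
                assignment[v] = not assignment[v]
                if score > best_score:
                    best_score, best_vars = score, [v]
                elif score == best_score:
                    best_vars.append(v)
            # Ties are broken at random; the flip is made even if the score does not improve.
            chosen = rng.choice(best_vars)
            assignment[chosen] = not assignment[chosen]
        if num_satisfied(clauses, assignment) == len(clauses):
            return assignment
    return None  # no model found; this does not prove unsatisfiability
```

A problem at the transition point could then be generated with something like `random_ksat(50, 215)` (L ~ 4.3N) and passed to `gsat`. Note that a `None` result only means the search gave up, since GSAT is incomplete.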

One of the things that surprised me the most about their results was the very consistent shape of the plots of the mean change in the "score" (i.e. the number of clauses that become satisfied by "flipping" a variable's assigned value) and the mean change in "poss-flips" (the number of variables available for "flipping" that yield the same increase in the score). The fact that these graphs maintain their shape over a significant range of values of N indicates that the shapes are at least a property of a class of problems rather than of a specific problem instance. Although the specific shapes and behavior involved are certainly a result of GSAT itself, deeper meanings can be derived from these observations. One of the more important is the potential for determining the optimal value of "Max-flips" (i.e. the limit on the number of variable value changes) based on the configuration of the problem.
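The two quantities discussed above can be computed for a single search state as in the following sketch. The function names and the signed-integer clause encoding are my own assumptions for illustration; "poss-flips" is taken here to be the number of variables tied for the best change in score, which is how I read the definition.

```python
def num_satisfied(clauses, assignment):
    """Number of clauses with at least one true literal under the assignment."""
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause) for clause in clauses
    )


def best_flips(clauses, assignment):
    """Return (best change in score, list of variables whose flip achieves it)."""
    base = num_satisfied(clauses, assignment)
    best_delta, candidates = None, []
    for var in sorted(assignment):
        trial = dict(assignment)
        trial[var] = not trial[var]  # flip one variable in a copy
        delta = num_satisfied(clauses, trial) - base
        if best_delta is None or delta > best_delta:
            best_delta, candidates = delta, [var]
        elif delta == best_delta:
            candidates.append(var)
    return best_delta, candidates


# Change in score and poss-flips for one state of a tiny formula.
delta, candidates = best_flips([(1, 2), (-1, 2), (1, -2)], {1: False, 2: False})
poss_flips = len(candidates)
```

Recording `delta` and `poss_flips` at every flip of a run, and averaging over many runs, would give curves of the kind the paper plots.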

Gent and Walsh point out that, due to the difficulty of solving the version of 3-SAT that they used, they could not provide graphs or analysis of the differences between satisfiable and unsatisfiable results. (It would have helped if Gent and Walsh had provided a plot of the frequency of satisfiable instances found.) It seems possible that a "signature" or pattern arising during the search could be identified that distinguishes between satisfiable and unsatisfiable instances of a class of problems (perhaps satisfiable and unsatisfiable problems approach different asymptotes). The existence of a runtime signature for GSAT (or some other algorithm) would be an advantage similar to the pre-runtime knowledge that a problem is in Horn-clause form.

I do have a serious question about the wider applicability of the results that Gent and Walsh have found. Averaging over a large number of randomly generated problem instances may be deceptive, in that it may hide artifacts of distinct subclasses of 3-SAT (or of SAT in general). I make this conjecture because known subclasses of SAT, such as Horn clauses, are solvable in polynomial time, so it is possible that other such classes exist. As an example, the "interconnectedness" of a set of clauses (i.e. the number of variables shared between clauses, or the grouping of clauses based on shared variables) would be expected to have a strong impact on the behavior of GSAT and of other algorithms. Another possible issue is the existence of multiple plateaus or local maxima and their (potential) impact on GSAT in this context. This concern may, of course, be dispelled by showing that these results are characteristic of a larger class of SAT problems than 3-SAT.
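As one way of making "interconnectedness" measurable, the sketch below counts the clause pairs that share a variable and groups clauses into connected components linked by shared variables. Both measures and all names are my own illustrative choices, not anything defined by Gent and Walsh; clauses are assumed to be tuples of signed integers.

```python
from itertools import combinations


def shared_variable_pairs(clauses):
    """Count pairs of clauses that have at least one variable in common."""
    var_sets = [{abs(lit) for lit in clause} for clause in clauses]
    return sum(1 for a, b in combinations(var_sets, 2) if a & b)


def clause_groups(clauses):
    """Group clause indices into components connected through shared variables."""
    parent = list(range(len(clauses)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # variable -> index of the first clause that mentions it
    for idx, clause in enumerate(clauses):
        for lit in clause:
            var = abs(lit)
            if var in first_seen:
                parent[find(idx)] = find(first_seen[var])
            else:
                first_seen[var] = idx

    groups = {}
    for idx in range(len(clauses)):
        groups.setdefault(find(idx), []).append(idx)
    return sorted(groups.values())
```

A formula whose clauses fall into several independent groups decomposes into smaller subproblems, which is the kind of structure that averaging over random instances could conceal.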

In conclusion, knowing that a problem is hard or intractable in general does not rule out the possibility that information about specific sub-classes or instances of the problem might make them tractable (i.e. an instance might belong to some subset of easier versions of the problem). For particularly complex problems it can be very difficult to get a handle on which elements of a problem make it tractable (or even on whether a class of more tractable versions exists), so it becomes important to make empirical explorations into the intractable wilderness of problems and hope that empirical observation will lead to theories that allow tractable maps of the problem space.