MAXEkSAT

MAXEkSAT is a problem in computational complexity theory that is a maximization version of the Boolean satisfiability problem 3SAT. In MAXEkSAT, each clause has exactly k literals, each with distinct variables, and is in conjunctive normal form. These are called k-CNF formulas. The problem is to determine the maximum number of clauses that can be satisfied by a truth assignment to the variables in the clauses.

We say that an algorithm A provides an α-approximation to MAXEkSAT if, for some fixed positive α less than or equal to 1, and every kCNF formula φ, A can find a truth assignment to the variables of φ that will satisfy at least an α-fraction of the maximum number of satisfiable clauses of φ.

Because the NP-hard k-SAT problem (for k ≥ 3) is equivalent to determining if the corresponding MAXEkSAT instance has a value equal to the number of clauses, MAXEkSAT must also be NP-hard, meaning that there is no polynomial time algorithm unless P=NP. A natural next question, then, is that of finding approximate solutions: what's the largest real number α < 1 such that some explicit P (complexity) algorithm always finds a solution of size α·OPT, where OPT is the (potentially hard to find) maximizing assignment. While the algorithm is efficient, it's not obvious how to remove its dependence on randomness. There are problems related to the satisfiability of conjunctive normal form Boolean formulas.

Approximation Algorithm[edit]

There is a simple randomized polynomial-time algorithm that provides a $\textstyle \left(1-{\frac {1}{2^{k}}}\right)$ -approximation to MAXEkSAT: independently set each variable to true with probability 1/2, otherwise set it to false.

Any given clause c is unsatisfied only if all of its k constituent literals evaluates to false. Because each literal within a clause has a 1⁄2 chance of evaluating to true independently of any of the truth value of any of the other literals, the probability that they are all false is $\textstyle ({\frac {1}{2}})^{k}={\frac {1}{2^{k}}}$ . Thus, the probability that c is indeed satisfied is $\textstyle 1-{\frac {1}{2^{k}}}$ , so the indicator variable $\textstyle 1_{c}$ (that is 1 if c is true and 0 otherwise) has expectation $\textstyle 1-{\frac {1}{2^{k}}}$ . The sum of all of the indicator variables over all $\textstyle |C|$ clauses is $(\textstyle 1-{\frac {1}{2^{k}}})|C|$ , so by linearity of expectation we satisfy a $\textstyle \left(1-{\frac {1}{2^{k}}}\right)$ fraction of the clauses in expectation. Because the optimal solution can't satisfy more than all $\textstyle |C|$ of the clauses, we have that ${\textit {ALG}}=\left(1-{\frac {1}{2^{k}}}\right)\cdot |C|>\left(1-{\frac {1}{2^{k}}}\right)\cdot {\textit {OPT}}$ , so the algorithm finds a $\textstyle \geq \left(1-{\frac {1}{2^{k}}}\right)$ approximation to the true optimal solution in expectation.

Despite its high expectation, this algorithm may occasionally stumble upon solutions of value lower than the expectation we computed above. However, over a large number of trials, the average fraction of satisfied clauses will tend towards $\textstyle \left(1-{\frac {1}{2^{k}}}\right)$ . This implies two things:

There must exist an assignment satisfying at least a $\textstyle \left(1-{\frac {1}{2^{k}}}\right)$ fraction of the clauses. If there weren't, we could never attain a value this large on average over a large number of trials.
If we run the algorithm a large number of times, at least half of the trials (in expectation) will satisfy some $\textstyle (1-{\frac {2}{2^{k}}})$ fraction of the clauses. This is because any smaller fraction would bring down the average enough that the algorithm must occasionally satisfy more than 100% of the clauses to get back to its expectation of $\textstyle \left(1-{\frac {2}{2^{k}}}\right)$ , which cannot happen. Extending this using Markov's inequality, at least some $\textstyle 1-\left({\frac {1}{1+2^{k}\epsilon }}\right)$ -fraction of the trials (in expectation) will satisfy at least an $\textstyle \left(1-{\frac {1}{2^{k}}}-\epsilon \right)$ -fraction of the clauses. Therefore, for any positive $\textstyle \epsilon$ , it takes only a polynomial number of random trials until we expect to find an assignment satisfying at least an $\textstyle \left(1-{\frac {1}{2^{k}}}-\epsilon \right)$ fraction of the clauses.

A more robust analysis (such as that in ^[1]) shows that we will, in fact, satisfy at least a $\textstyle \left(1-{\frac {1}{2^{k}}}\right)$ -fraction of the clauses a constant fraction of the time (depending only on k), with no loss of $\textstyle \epsilon$ .

Derandomization[edit]

While the above algorithm is efficient, it's not obvious how to remove its dependence on randomness. Trying out all possible random assignments is equivalent to the naive brute force approach, so may take exponential time. One clever way to derandomize the above in polynomial time relies on work in error correcting codes, satisfying a $\textstyle \left(1-{\frac {1}{2^{k}}}\right)$ fraction of the clauses in time polynomial in the input size (although the exponent depends on k).

We need one definition and two facts to find the algorithm.

Definition[edit]

$S\subseteq \{0,1\}^{n}$ is an $ℓ$ -wise independent source if, for a uniformly chosen random $(x 1, x 2, ..., x n) \in S,$ $x 1, x 2, ..., x n$ are $ℓ$ -wise independent random variables.

Fact 1[edit]

Note that such an assignment can be found among elements of any $ℓ$ -wise independent source over n binary variables. This is easier to see once you realize that an $ℓ$ -wise independent source is really just any set of binary vectors over ${0, 1} n$ with the property that all restrictions of those vectors to $ℓ$ co-ordinates must present the 2^ℓ possible binary combinations an equal number of times.

Fact 2[edit]

Recall that BCH_2,m,d is an $[n=2^{m},n-1-\lceil {d-2}/2\rceil m,d]_{2}$ linear code.

There exists an $ℓ$ -wise independent source of size $O(n^{\lfloor \ell /2\rfloor })$ , namely the dual of a $BCH 2,log n, ℓ +1$ code, which is a linear code. Since every BCH code can be presented as a polynomial-time computable restriction of a related Reed Solomon code, which itself is strongly explicit, there is a polynomial-time algorithm for finding such an assignment to the x_i's. The proof of fact 2 can be found at Dual of BCH is an independent source.

Outline of the Algorithm[edit]

The algorithm works by generating $BCH 2,log n, ℓ +1$ , computing its dual (which as a set is an $ℓ$ -wise independent source) and treating each element (codeword) of that source as a truth assignment to the n variables in φ. At least one of them will satisfy at least $1 - 2 - ℓ$ of the clauses of φ, whenever φ is in kCNF form, $k = ℓ$ .

References[edit]

^ "Max-SAT" (PDF). Archived from the original (PDF) on 2015-09-23. Retrieved 2014-09-01.
^ Josep Argelich and Felip Manyà. Exact Max-SAT solvers for over-constrained problems. In Journal of Heuristics 12(4) pp. 375-392. Springer, 2006.
^ Jaulin, L.; Walter, E. (2002). "Guaranteed robust nonlinear minimax estimation" (PDF). IEEE Transactions on Automatic Control. 47 (11): 1857–1864. doi:10.1109/TAC.2002.804479.

External links[edit]

Coding Theory notes at MIT

[1] "Max-SAT" (PDF). Archived from the original (PDF) on 2015-09-23. Retrieved 2014-09-01.

[2] Josep Argelich and Felip Manyà. Exact Max-SAT solvers for over-constrained problems. In Journal of Heuristics 12(4) pp. 375-392. Springer, 2006.

[3] Jaulin, L.; Walter, E. (2002). "Guaranteed robust nonlinear minimax estimation" (PDF). IEEE Transactions on Automatic Control. 47 (11): 1857–1864. doi:10.1109/TAC.2002.804479.

[1]

[2]

[3]