A Practical Introduction to NP Completeness

by Michalis Faloutsos

Version 1.0
-----------------------------------------------
Why do we want to classify problems?
The main reason: we want to be able to show that a problem is "difficult" and
a polynomial algorithm is highly unlikely to exist. This way we can stop trying
to find such a solution.
A minor reason: scientists love classifying things.

Polynomial complexity
A problem has a polynomial algorithm if it can be solved in time that is polynomial
in the size of its input. For example, a weighted connected graph has size O(E) in an edge-list
representation: we need a file of E lines, each of the form:  v_1    v_2    weight
Note: here we assume that representing an integer takes a constant number of bits.
We will say that a "problem can be solved in polynomial time" if there exists a polynomial algorithm
for that problem.
A term of endearment for these problems is "easy".
For example, Maximum Flow is a polynomial problem.

This area of research revolves around the concept of polynomial complexity.

A difficult problem is a problem that most likely does not have a polynomial algorithm.
The Steiner tree problem is such a problem.

Polynomial Reduction (intuitively)
A problem A can be polynomially reduced to problem B if and only if
any instance I_A of problem A can be transformed to an instance I_B of problem B in polynomial time,
and a solution of I_B can be transformed to a solution of I_A in polynomial time.
Notation: if A reduces to B, we write A <= B.
Intuitively, we could say that B is a more difficult problem than A, but that is slightly wrong.
The accurate expression is that B is at least as difficult as A.

Important observation: if A <= B and B can be solved in polynomial time, then A can also be solved in polynomial time.
The complexity of A is bounded above by:
    O(transformation I_A -> I_B) + O(B) + O(transformation of the solution S_B -> S_A)
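
The three terms of this bound can be made concrete with a small sketch. It uses a textbook reduction that is not part of these notes and is chosen only for illustration: A = Independent Set (is there a set of at least k pairwise non-adjacent vertices?) and B = Vertex Cover (is there a set of at most n - k vertices touching every edge?). All function names are hypothetical, and the solver for B is a brute-force stand-in, so only the two transformation steps are polynomial here.

```python
from itertools import combinations

def reduce_instance(n, edges, k):
    """I_A -> I_B: (G, k) for Independent Set becomes (G, n - k) for Vertex Cover."""
    return n, edges, n - k

def solve_vertex_cover(n, edges, limit):
    """Stand-in solver for B (brute force, exponential time).
    Returns a cover with at most `limit` vertices, or None."""
    for size in range(0, limit + 1):
        for cover in combinations(range(n), size):
            chosen = set(cover)
            if all(u in chosen or v in chosen for u, v in edges):
                return chosen
    return None

def map_solution_back(n, cover):
    """S_B -> S_A: the vertices outside a vertex cover form an independent set."""
    return set(range(n)) - cover

def solve_independent_set(n, edges, k):
    """Solve A by: transform the instance, solve B, transform the solution."""
    n_b, edges_b, limit = reduce_instance(n, edges, k)
    cover = solve_vertex_cover(n_b, edges_b, limit)
    if cover is None:
        return None
    return map_solution_back(n, cover)

# A path 0 - 1 - 2 - 3: it has an independent set of size 2 but not of size 3.
path = [(0, 1), (1, 2), (2, 3)]
print(solve_independent_set(4, path, 2))
print(solve_independent_set(4, path, 3))
```

If the brute-force solver were replaced by a polynomial algorithm for B, the whole pipeline would be polynomial, which is exactly the observation above.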

Intuitive definitions, P and NPC
Class P: contains all the easy problems (those that can be solved in polynomial time).
Class NP-Complete (NPC): contains all the difficult problems (all known algorithms for them
require exponential time).

General properties/facts of NPC problems
1. If an NPC problem has a polynomial algorithm, then all NPC problems have a polynomial algorithm.
2. It is highly unlikely that an NPC problem can be solved by a polynomial algorithm.
3. If you can solve an NPC problem polynomially, you will become instantly famous.

How do we deal with a problem
(in progress)

------------------------------------------------------------------

Versions of Problems
We have three versions of a problem. We consider problems where we have
one "cost" that we want to minimize.

Optimization: Given a problem instance, find the minimum cost solution.
Evaluation: Given a problem instance, find the cost of the min-cost solution.
Recognition: Given a problem instance and an integer L, is there a solution with
                         cost no greater than L?
Note that the third version requires only a "yes-no" answer. This makes the discussion
easier, and we will focus on Recognition problems. Depending on assumptions about the
cost function, it can be shown that the three problem formulations are equivalent.
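
One direction of this equivalence can be sketched as follows: if we can answer the Recognition question, we can answer the Evaluation question by binary search over L. This is a minimal sketch under stated assumptions, not a general proof: costs are integers, the minimum cost lies in a known range [low, high], and `recognize` is a hypothetical function that answers "is there a solution with cost no greater than L?".

```python
def evaluate_with_recognition(recognize, low, high):
    """Return the minimum cost in [low, high], or None if even `high` gets a "no".

    recognize(L) answers the Recognition question for the bound L.
    Assumes integer costs; a "yes" for L implies a "yes" for every larger L.
    """
    if not recognize(high):
        return None
    while low < high:
        mid = (low + high) // 2
        if recognize(mid):
            high = mid        # a solution of cost <= mid exists
        else:
            low = mid + 1     # every solution costs more than mid
    return low

# Toy instance: pretend the feasible solutions have these costs.
costs = [17, 9, 12]
recognize = lambda L: any(c <= L for c in costs)
print(evaluate_with_recognition(recognize, 0, 100))  # prints 9
```

The search asks about log2(high - low) Recognition questions, so a polynomial Recognition algorithm gives a polynomial Evaluation algorithm as long as the cost range has polynomially many bits.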
 

The NP Class of problems
    Intuitively, this is the class of problems for which, given a problem instance and a guess for a solution,
    we can verify in polynomial time whether the guess is a correct solution.

   A.  Polynomial verification of a solution
   For a recognition problem, if we are given a guess of a solution, we want to check whether this guess
   lets us answer "yes". If we can "double-check" that the guess is a solution in polynomial time,
   we say that we can verify the solution in polynomial time.
   Example: is there a Steiner tree of cost at most L?
   If we guess a tree, we can verify in polynomial time whether the tree spans the nodes we want, and whether its
   cost is less than or equal to L.
   Note that if the tree has cost higher than L, then we can reject it as a solution, but we cannot say "no"
   to the problem: there may exist some other tree with cost <= L.
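
   The verification step for the Steiner example can be sketched as below. This is one possible checker, not a prescribed one. It assumes a simple undirected graph given as an edge list (v_1, v_2, weight), at least two nodes to connect (called `terminals` here), and a guess given as a list of (v_1, v_2) pairs; all names are hypothetical.

```python
def verify_steiner_guess(graph_edges, terminals, limit, guess):
    """Polynomial-time check: is `guess` a tree in the graph that spans
    all terminals and has total cost at most `limit`?"""
    weight = {frozenset((u, v)): w for u, v, w in graph_edges}
    parent = {}  # union-find forest over the nodes touched by the guess

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total = 0
    for u, v in guess:
        key = frozenset((u, v))
        if key not in weight:
            return False              # guessed edge is not in the graph
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return False              # edge closes a cycle: not a tree
        parent[root_u] = root_v
        total += weight[key]

    if len({find(x) for x in parent}) > 1:
        return False                  # guess is a forest, not one tree
    if not all(t in parent for t in terminals):
        return False                  # some terminal is not spanned
    return total <= limit

graph = [("a", "c", 1), ("b", "c", 1), ("a", "b", 3), ("c", "d", 5)]
print(verify_steiner_guess(graph, {"a", "b"}, 2, [("a", "c"), ("c", "b")]))  # True
print(verify_steiner_guess(graph, {"a", "b"}, 2, [("a", "b")]))              # False
```

   The second call rejects the guess (its cost is 3), yet the answer to the problem with L = 2 is still "yes", as the first call shows: a rejected guess says nothing about the instance.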
   B. Non Deterministic Algorithms
     Imagine an algorithm for a recognition problem which, for a given instance and value L, somehow comes
     up with a guess and then checks whether the guess leads to a "yes" answer, i.e. whether the guess is really
     a solution of cost at most L.
     The algorithm can therefore be decomposed into two parts: guessing and verifying.
     Let us call every problem instance for which the answer is "yes" a yes-instance (in a Steiner problem,
     there exists a tree of cost at most L).
 
     Assume that we are really good at guessing: if the problem is a yes-instance, we will find a solution on the
     first try. Let us call this algorithm the Lucky Non-deterministic algorithm. This way, if we are given
     a yes-instance, we can solve the problem in polynomial time:
                           guessing + poly. verification.
     However, if the problem is not a yes-instance (there is no tree with cost at most L), our approach
     has no way to prove that there is no such solution. All we can do is try guesses, and each one
     will fail verification, for example by turning out to cost more than L.
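
     Without luck, a deterministic program can only imitate the guessing step by trying every guess in turn. The sketch below illustrates this asymmetry. It is an illustration under assumptions: guesses are subsets of a list of items, and the verifier is for Vertex Cover (is there a set of at most L vertices touching every edge?), a problem chosen only because its verifier is one line. All names are hypothetical.

```python
from itertools import combinations

def all_guesses(items):
    """Every subset of `items`: 2**len(items) candidates in total."""
    for size in range(len(items) + 1):
        for subset in combinations(items, size):
            yield subset

def decide_by_exhaustive_guessing(items, verify):
    """Return (answer, guesses_tried).
    A "yes" stops at the first verified guess; a "no" can be reported
    only after every single guess has been rejected."""
    tried = 0
    for guess in all_guesses(items):
        tried += 1
        if verify(guess):
            return True, tried
    return False, tried

def cover_verifier(edges, limit):
    """Polynomial verifier: the guess has at most `limit` vertices
    and touches every edge."""
    return lambda guess: (len(guess) <= limit and
                          all(u in guess or v in guess for u, v in edges))

triangle = [(0, 1), (1, 2), (0, 2)]
print(decide_by_exhaustive_guessing([0, 1, 2], cover_verifier(triangle, 2)))  # (True, 5)
print(decide_by_exhaustive_guessing([0, 1, 2], cover_verifier(triangle, 1)))  # (False, 8)
```

     The Lucky algorithm would reach the successful guess directly. The "no" answer, in contrast, needed all 2^3 = 8 guesses, and this count doubles with every additional item.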
 
 
    C. The NP class
       A problem belongs to the Non-deterministic Polynomial (NP) class if, for every yes-instance,
       a solution (a correct guess) can be verified in polynomial time.

Now we can make some more rigorous definitions.

Definition of NPC
        A problem P is an NP-Complete problem if and only if:
        a. P is an NP problem.
        b. Every NP problem A can be reduced to problem P.
Note that it is not feasible to show that every problem in NP can be transformed to P
by carrying out all the reductions one by one. Instead, to prove that a problem is NPC, we only need to prove
the following equivalent statements:
         a. P is an NP problem.
         b. A known NPC problem can be reduced to problem P.
How does this work? Given an NPC problem, A, we know that all NP problems, Z_i, reduce to it:
                             Z_i <= A.
If we can prove that A reduces to P:
                              A <= P
Then, because reductions compose (first transform Z_i to A, then A to P, both in polynomial time),
we have proven that all NP problems can reduce to P:
                           Z_i <= P
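
The composition step can be sketched on instance transformations. The example uses two textbook reductions that are not part of these notes and serve only as an illustration: Clique to Independent Set (take the complement graph) and Independent Set to Vertex Cover (replace k by n - k). Instances are hypothetical triples (n, edges, k) over vertices 0..n-1.

```python
from itertools import combinations

def compose(first, second):
    """If `first` maps Z-instances to A-instances and `second` maps
    A-instances to P-instances, the result maps Z-instances to P-instances."""
    return lambda instance: second(first(instance))

def clique_to_independent_set(instance):
    """A clique of size k in G is an independent set of size k
    in the complement of G."""
    n, edges, k = instance
    present = {frozenset(e) for e in edges}
    complement = [(u, v) for u, v in combinations(range(n), 2)
                  if frozenset((u, v)) not in present]
    return n, complement, k

def independent_set_to_vertex_cover(instance):
    """An independent set of size >= k exists iff a vertex cover
    of size <= n - k exists."""
    n, edges, k = instance
    return n, edges, n - k

clique_to_vertex_cover = compose(clique_to_independent_set,
                                 independent_set_to_vertex_cover)

# Triangle 0-1-2 plus a pendant edge 2-3; ask for a clique of size 3.
print(clique_to_vertex_cover((4, [(0, 1), (1, 2), (0, 2), (2, 3)], 3)))
# prints (4, [(0, 3), (1, 3)], 1)
```

Each step runs in polynomial time and produces an output of polynomial size, so the composed transformation is polynomial as well. This is why one reduction from a known NPC problem is enough.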