Given two sequences, a *shared subsequence*
is a sequence that is a subsequence of both sequences.

A *maximum-size shared subsequence* is a shared subsequence
of maximum size.

For example, the maximum-size shared subsequence of (3,2,8,2,3,9,4,3,9) and (1,3,2,3,7,9) is (3,2,3,9).
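The definition can be checked mechanically. The following is a minimal sketch, assuming sequences are stored as `std::vector<int>`; the function names `isSubsequence` and `isSharedSubsequence` are illustrative and not part of the assignment. It tests only whether a candidate is shared, not whether it has maximum size.

```cpp
#include <cstddef>
#include <vector>

// Returns true if `s` can be obtained from `seq` by deleting zero or more
// elements without reordering the remaining ones.
bool isSubsequence(const std::vector<int>& s, const std::vector<int>& seq) {
    std::size_t k = 0;  // number of elements of s matched so far
    for (std::size_t i = 0; i < seq.size() && k < s.size(); ++i) {
        if (seq[i] == s[k]) ++k;  // greedily match the next element of s
    }
    return k == s.size();
}

// A shared subsequence is a subsequence of both inputs.
bool isSharedSubsequence(const std::vector<int>& s,
                         const std::vector<int>& a,
                         const std::vector<int>& b) {
    return isSubsequence(s, a) && isSubsequence(s, b);
}
```

For the example above, `isSharedSubsequence({3,2,3,9}, {3,2,8,2,3,9,4,3,9}, {1,3,2,3,7,9})` holds, while the candidate `{3,2,8}` is rejected because 8 does not occur in the second sequence.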

In this homework we will develop a dynamic programming algorithm to find a maximum-size shared subsequence of two sequences (the two sequences will be given as input).

1. What is a maximum-size shared subsequence of each of the following pairs of sequences?

1A: (10, 3,2,8,2,3,9,4,3,9) and (10, 1,3,2,3,7,9)

*10, 3, 2, 3, 9*

1B: (1, 3,2,8,2,3,9,4,3,9) and (10, 1,3,2,3,7,9)

*1, 3, 2, 3, 9*

2. Prove the following claims:

Given two sequences A[1..n] = (A[1], A[2], ..., A[n]) and B[1..m] = (B[1], B[2], ..., B[m]), define MCS(A, n, B, m) to be the maximum size of any subsequence shared by A[1..n] and B[1..m].

2A: If A[n] = B[m], then MCS(A, n, B, m) ≥ 1 + MCS(A, n-1, B, m-1).

- Let S be a maximum common subsequence of A[1..n-1] and B[1..m-1], so that |S| (the size of S) is MCS(A, n-1, B, m-1).
- Let S' be the sequence containing S followed by A[n].
- Then S' is a common subsequence of A[1..n] and B[1..m]: S occurs within A[1..n-1] and within B[1..m-1], and the appended element equals both A[n] and B[m].
- Thus, MCS(A, n, B, m) ≥ |S'| = |S| + 1 = 1 + MCS(A, n-1, B, m-1).

2B: In all cases, MCS(A, n, B, m) ≤ 1 + MCS(A, n-1, B, m-1).

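- Let S be a maximum common subsequence of A[1..n] and B[1..m].
- If S is empty (of size 0), then MCS(A, n, B, m) = 0 and the inequality holds, because MCS is never negative.
- Otherwise, let S' be the sequence obtained from S by removing the last element of S.
- Then S' is a common subsequence of A[1..n-1] and B[1..m-1]: the last element of S is matched at some position at most n in A and at most m in B, so every earlier element of S is matched strictly before those positions.
- Thus, MCS(A, n-1, B, m-1) ≥ |S'| = |S| - 1 = MCS(A, n, B, m) - 1.
- Rewriting gives MCS(A, n, B, m) ≤ 1 + MCS(A, n-1, B, m-1).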

2C: If A[n] = B[m] then MCS(A, n, B, m) = 1 + MCS(A, n-1, B, m-1). (You may use the facts you proved in 2A and 2B.)

- From 2A and 2B, it follows directly that:
- If A[n] = B[m] then MCS(A, n, B, m) ≤ 1 + MCS(A, n-1, B, m-1) and MCS(A, n, B, m) ≥ 1 + MCS(A, n-1, B, m-1).

- Thus,
- If A[n] = B[m] then MCS(A, n, B, m) = 1 + MCS(A, n-1, B, m-1).

2D: In all cases, MCS(A, n, B, m) ≥ max(MCS(A, n-1, B, m), MCS(A, n, B, m-1)).

- Let S be a maximum common subsequence of A[1..n-1] and B[1..m].
- Then S is also a common subsequence of A[1..n] and B[1..m].
- Thus MCS(A,n,B,m) ≥ |S| = MCS(A, n-1, B, m).
- A similar argument shows MCS(A,n,B,m) ≥ MCS(A, n, B, m-1).
- Thus, MCS(A,n,B,m) ≥ max { MCS(A, n, B, m-1), MCS(A,n-1,B,m) }.

2E: If A[n] ≠ B[m], then MCS(A, n, B, m) ≤ max(MCS(A, n-1, B, m), MCS(A, n, B, m-1)).

- Assume A[n] ≠ B[m].
- Let S be a maximum common subsequence of A[1..n] and B[1..m].
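- If S is empty, the inequality holds trivially, so assume S is non-empty.
- Since A[n] ≠ B[m], the last element of S differs from A[n] or from B[m] (or both).
- If the last element of S differs from A[n], then no occurrence of S in A[1..n] can use position n, so S is a subsequence of A[1..n-1]; the same argument applies to B[m].
- Thus, either S is a subsequence of A[1..n-1], or S is a subsequence of B[1..m-1] (or both).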
- Suppose S is a subsequence of A[1..n-1].
- Since S is also a subsequence of B[1..m], it follows that S is a common subsequence of A[1..n-1] and B[1..m].
- Thus, MCS(A,n,B,m) = |S| ≤ MCS(A, n-1, B, m).

- If S is not a subsequence of A[1..n-1], then S is a subsequence of B[1..m-1].
- In this case, a similar argument shows MCS(A,n,B,m) = |S| ≤ MCS(A, n, B, m-1).

- We conclude that either MCS(A,n,B,m) ≤ MCS(A, n-1, B, m)
- or MCS(A,n,B,m) ≤ MCS(A, n, B, m-1) (or both).
- Thus, MCS(A,n,B,m) ≤ max { MCS(A, n, B, m-1), MCS(A, n-1, B, m) }.

2F: If A[n] ≠ B[m], then MCS(A, n, B, m) = max(MCS(A, n-1, B, m), MCS(A, n, B, m-1)). (You may use the facts you proved in 2D and 2E.)

- From 2D and 2E it follows that
- If A[n] ≠ B[m], then MCS(A, n, B, m) ≤ max{ MCS(A, n-1, B, m), MCS(A, n, B, m-1) } AND MCS(A, n, B, m) ≥ max{ MCS(A, n-1, B, m), MCS(A, n, B, m-1) }.

- This is equivalent to
- If A[n] ≠ B[m], then MCS(A, n, B, m) = max{ MCS(A, n-1, B, m), MCS(A, n, B, m-1) }.
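Taken together, 2C and 2F can be summarized as a single recurrence. The base case below is an added convention (an empty prefix shares only the empty subsequence); it is not proved in problem 2, but it matches the definition of MCS.

```latex
\mathrm{MCS}(A, n, B, m) =
\begin{cases}
0 & \text{if } n = 0 \text{ or } m = 0, \\
1 + \mathrm{MCS}(A, n-1, B, m-1) & \text{if } n, m \ge 1 \text{ and } A[n] = B[m], \\
\max\{\mathrm{MCS}(A, n-1, B, m),\ \mathrm{MCS}(A, n, B, m-1)\} & \text{if } n, m \ge 1 \text{ and } A[n] \ne B[m].
\end{cases}
```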

3. The facts proved in 2C and 2F lead to the following recursive algorithm to compute MCS(A, n, B, m):

    MCS(A, n, B, m)
        if (n == 0 or m == 0) return 0;
        if (A[n] == B[m]) return 1 + MCS(A, n-1, B, m-1);
        return max(MCS(A, n, B, m-1), MCS(A, n-1, B, m));
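As a rough illustration, the recursive algorithm can be written in C++ as follows. This is a sketch under the assumption that the sequences are stored in 0-based `std::vector<int>` containers, so the element written A[n] in the text is `a[n - 1]` in the code; the name `mcsRecursive` is illustrative.

```cpp
#include <algorithm>
#include <vector>

// Maximum size of a subsequence shared by the first n elements of a and the
// first m elements of b. Exponential time in the worst case.
int mcsRecursive(const std::vector<int>& a, int n,
                 const std::vector<int>& b, int m) {
    if (n == 0 || m == 0) return 0;           // an empty prefix shares nothing
    if (a[n - 1] == b[m - 1]) {               // case A[n] == B[m] (fact 2C)
        return 1 + mcsRecursive(a, n - 1, b, m - 1);
    }
    return std::max(mcsRecursive(a, n, b, m - 1),   // case A[n] != B[m] (fact 2F)
                    mcsRecursive(a, n - 1, b, m));
}
```

Called on the sequences from the introductory example with n = 9 and m = 6, it returns 4, the size of (3,2,3,9).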

3A: Give the best big-Ω lower bound you can on worst-case running time of MCS(A, n, B, m) as a function of n and m. Explain your reasoning.

- In the case when A[i] ≠ B[j] for all i, j, every call with n > 0 and m > 0 makes two recursive calls, each of which decreases exactly one of n and m by 1. Hence every node of the recursion tree at depth less than min(n, m) has exactly two children.
- Thus, the recursion tree contains at least 2^{min(n,m)} nodes.
- Thus, the worst-case running time is Ω(2^{min(n,m)}).

3B: Give the best big-O upper bound you can on worst-case running time of MCS(A, n, B, m) as a function of n and m. Explain your reasoning.

- The recursion tree for MCS(A, n, B, m) has branching factor at most 2 at every node, and has depth at most n+m, because each recursive call decreases n + m by at least 1.
- Thus, the recursion tree contains O(2^{n+m}) nodes.
- Since O(1) work is done for each node in the recursion tree, the worst-case running time is O(2^{n+m}).
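To make the bounds in 3A and 3B concrete, the number of calls can be counted directly in the worst case where no element of A equals any element of B. The sketch below mirrors the shape of the recursion without looking at any data; the function name `countCallsAllMismatch` is an illustrative assumption.

```cpp
#include <cstdint>

// Number of calls made by the recursive MCS on inputs of lengths n and m when
// A[i] != B[j] for all i, j, so every call with n, m > 0 recurses twice.
std::uint64_t countCallsAllMismatch(int n, int m) {
    if (n == 0 || m == 0) return 1;  // base case: one call, no recursion
    return 1 + countCallsAllMismatch(n, m - 1)
             + countCallsAllMismatch(n - 1, m);
}
```

For n = m = 3 this gives 39 calls, which lies between the lower bound 2^3 = 8 and the upper bound 2^6 = 64.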

4A: Precisely describe a faster algorithm (running in time O(n m)) for computing MCS(A, n, B, m).

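- Modify the recursive algorithm above to cache answers: the first time MCS(A, i, B, j) is computed, store it in a table indexed by (i, j), and return the stored value on every later call with the same (i, j).
- Alternatively, use dynamic programming to compute MCS(A, i, B, j) for 0 ≤ i ≤ n and 0 ≤ j ≤ m "bottom up": set every entry with i = 0 or j = 0 to 0, then fill in the remaining entries in order of increasing i and j using the relations from 2C and 2F.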
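The caching variant might look like the following sketch. It assumes 0-based `std::vector<int>` inputs and uses -1 as a "not yet computed" marker; the names `mcsMemo` and `mcsMemoHelper` are illustrative, not taken from the assignment.

```cpp
#include <algorithm>
#include <vector>

// Computes MCS for the first i elements of a and the first j elements of b,
// storing each result in memo[i][j] so it is computed at most once.
int mcsMemoHelper(const std::vector<int>& a, int i,
                  const std::vector<int>& b, int j,
                  std::vector<std::vector<int>>& memo) {
    if (i == 0 || j == 0) return 0;
    int& cached = memo[i][j];          // memo is never resized, so this stays valid
    if (cached != -1) return cached;   // already computed
    if (a[i - 1] == b[j - 1]) {
        cached = 1 + mcsMemoHelper(a, i - 1, b, j - 1, memo);
    } else {
        cached = std::max(mcsMemoHelper(a, i - 1, b, j, memo),
                          mcsMemoHelper(a, i, b, j - 1, memo));
    }
    return cached;
}

// Entry point: allocates an (n+1) x (m+1) table filled with -1.
int mcsMemo(const std::vector<int>& a, const std::vector<int>& b) {
    const int n = static_cast<int>(a.size());
    const int m = static_cast<int>(b.size());
    std::vector<std::vector<int>> memo(n + 1, std::vector<int>(m + 1, -1));
    return mcsMemoHelper(a, n, b, m, memo);
}
```

Each table entry is filled at most once and needs O(1) work beyond its recursive calls, which is where the O(n m) bound comes from.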

4B: Explain why your algorithm is correct.

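- The algorithm computes each table entry using exactly the relations proved in 2C and 2F, together with the base case MCS = 0 when n = 0 or m = 0, so every entry (i, j) equals MCS(A, i, B, j). Correctness therefore follows from problem 2.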

4C: Give the best big-O upper bound you can on the worst-case running time of your algorithm, in terms of n and m. Explain your reasoning.

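- O(n m), because there are at most (n+1)(m+1) distinct subproblems (one per pair (i, j) with 0 ≤ i ≤ n and 0 ≤ j ≤ m), and each subproblem requires O(1) work once the subproblems it depends on have been solved.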

5. Implement your algorithm and use it to find the maximum size of any subsequence shared by the following two sequences:

int A[100] = {48, 29, 25, 7, 21, 32, 32, 13, 38, 16, 13, 29, 8, 28, 0, 21, 11, 27, 17, 44, 28, 10, 49, 23, 20, 33, 35, 40, 4, 15, 40, 34, 23, 40, 3, 39, 26, 45, 16, 23, 22, 39, 25, 32, 2, 34, 3, 46, 16, 19, 4, 25, 36, 14, 37, 30, 34, 49, 5, 9, 32, 19, 19, 6, 33, 9, 28, 32, 1, 29, 41, 42, 11, 12, 31, 13, 33, 5, 31, 6, 35, 10, 27, 36, 45, 48, 38, 5, 27, 21, 34, 23, 11, 20, 22, 25, 11, 44, 3, 32};

int B[100] = {33, 31, 9, 41, 49, 35, 12, 3, 43, 2, 47, 43, 11, 29, 11, 24, 4, 15, 28, 48, 3, 28, 9, 20, 10, 0, 1, 26, 35, 37, 48, 26, 32, 8, 14, 48, 9, 45, 16, 27, 13, 21, 6, 28, 36, 1, 16, 4, 41, 33, 49, 36, 20, 44, 46, 26, 36, 42, 22, 29, 29, 24, 30, 3, 20, 42, 3, 36, 14, 1, 44, 26, 35, 9, 47, 32, 43, 47, 29, 45, 36, 20, 0, 48, 10, 18, 40, 20, 41, 42, 11, 5, 30, 32, 46, 20, 38, 9, 19, 24};

You may use C++, Python, or Perl to implement your algorithm. Include a print-out of your algorithm and the length of the shared subsequence that it finds.

*The answer I get is 21.*

*Here is my code:*

    #include <iostream>

    #define N 100

    int A[N] = {48, 29, 25, 7, 21, 32, 32, 13, 38, 16, 13, 29, 8, 28, 0, 21, 11, 27, 17, 44, 28, 10, 49, 23, 20, 33, 35, 40, 4, 15, 40, 34, 23, 40, 3, 39, 26, 45, 16, 23, 22, 39, 25, 32, 2, 34, 3, 46, 16, 19, 4, 25, 36, 14, 37, 30, 34, 49, 5, 9, 32, 19, 19, 6, 33, 9, 28, 32, 1, 29, 41, 42, 11, 12, 31, 13, 33, 5, 31, 6, 35, 10, 27, 36, 45, 48, 38, 5, 27, 21, 34, 23, 11, 20, 22, 25, 11, 44, 3, 32};

    int B[N] = {33, 31, 9, 41, 49, 35, 12, 3, 43, 2, 47, 43, 11, 29, 11, 24, 4, 15, 28, 48, 3, 28, 9, 20, 10, 0, 1, 26, 35, 37, 48, 26, 32, 8, 14, 48, 9, 45, 16, 27, 13, 21, 6, 28, 36, 1, 16, 4, 41, 33, 49, 36, 20, 44, 46, 26, 36, 42, 22, 29, 29, 24, 30, 3, 20, 42, 3, 36, 14, 1, 44, 26, 35, 9, 47, 32, 43, 47, 29, 45, 36, 20, 0, 48, 10, 18, 40, 20, 41, 42, 11, 5, 30, 32, 46, 20, 38, 9, 19, 24};

    int max(int i, int j) { return i > j ? i : j; }

    int main() {
        // MCS[i][j] is the maximum size of any subsequence shared by the
        // first i elements of A and the first j elements of B.
        int MCS[N+1][N+1];
        int best = 0;              // largest table entry seen so far
        int besti = 0, bestj = 0;  // 1-based table indices where best was first reached

        // Base case: an empty prefix shares only the empty subsequence.
        for (int i = 0; i <= N; ++i) {
            MCS[0][i] = MCS[i][0] = 0;
        }

        for (int i = 1; i <= N; ++i)
            for (int j = 1; j <= N; ++j)
                if (A[i-1] == B[j-1]) {
                    // Fact 2C: the last elements match.
                    MCS[i][j] = 1 + MCS[i-1][j-1];
                    if (MCS[i][j] > best) {
                        best = MCS[i][j];
                        besti = i;
                        bestj = j;
                    }
                } else
                    // Fact 2F: the last elements differ.
                    MCS[i][j] = max(MCS[i-1][j], MCS[i][j-1]);

        std::cout << best << std::endl;
        // Report 0-based array positions for both A and B.
        std::cout << "ending at A[" << besti-1 << "] = " << A[besti-1]
                  << ", B[" << bestj-1 << "] = " << B[bestj-1] << std::endl;
        return 0;
    }
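The program above reports only the length. One possible way to also recover a shared subsequence of that length is to walk back through the filled table from entry (n, m). The sketch below assumes `std::vector<int>` inputs rather than the fixed-size arrays used above, and the function name `sharedSubsequence` is illustrative; when several maximum-size shared subsequences exist, it returns just one of them.

```cpp
#include <algorithm>
#include <vector>

// Returns one maximum-size shared subsequence of a and b.
std::vector<int> sharedSubsequence(const std::vector<int>& a,
                                   const std::vector<int>& b) {
    const int n = static_cast<int>(a.size());
    const int m = static_cast<int>(b.size());

    // t[i][j] = maximum size shared by the first i elements of a and first j of b.
    std::vector<std::vector<int>> t(n + 1, std::vector<int>(m + 1, 0));
    for (int i = 1; i <= n; ++i) {
        for (int j = 1; j <= m; ++j) {
            t[i][j] = (a[i - 1] == b[j - 1])
                          ? 1 + t[i - 1][j - 1]
                          : std::max(t[i - 1][j], t[i][j - 1]);
        }
    }

    // Walk back from (n, m), collecting matched elements in reverse order.
    std::vector<int> out;
    int i = n, j = m;
    while (i > 0 && j > 0) {
        if (a[i - 1] == b[j - 1]) {
            out.push_back(a[i - 1]);
            --i;
            --j;
        } else if (t[i - 1][j] >= t[i][j - 1]) {
            --i;  // the optimum is preserved after dropping a's last element
        } else {
            --j;  // otherwise it is preserved after dropping b's last element
        }
    }
    std::reverse(out.begin(), out.end());
    return out;
}
```

On the pair from problem 1A this yields (10, 3, 2, 3, 9).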
