Friday, September 28, 2012

Dynamic Programming Practice Problems : Strings and Interleaving

Dynamic Programming Practice Problems : Strings and Interleaving


For bit strings X = x1 . . . xm, Y = y1 . . . yn and Z = z1 . . . zm+n, we say that Z is an interleaving of X and Y if it can be obtained by interleaving the bits in X and Y in a way that maintains the left-to-right order of the bits in X and Y .

For example if X = 101 and Y = 01 then x1x2y1x3y2 = 10011 is an interleaving of X and Y , whereas 11010 is not. Give the most efficient algorithm you can to determine if Z is an interleaving of X and Y . Prove your algorithm is correct and analyze its time complexity as a function m = |X| and n = |Y |.

Solution :

Its a simple sober problem and one can easily write a program to solve it. However, the dynamic representation and proof is important in this case.

  • Let z(1), . . . , z(i+j) is an interleaving of x1, . . . , xi and y1, . . . yj for 0 < i <= m and 0 < j <= n. 
  • Let c[i, j] be true if and only if z(1) . . . , z(i+j) is an interleaving of x(1), . . . , x(i) and y(1), . . . , y(j) . 
  • If i = 0 then x(i) = (the empty string) and if j = 0 then y(j) = (the empty string).
  • The subproblem c[i, j] can be recursively defined as shown (where c[m, n] gives the answer to the original problem.
  • i starts from m and j starts from n. We start from the back of the interleaving string i,e, at  i+j position and moves towards the 0th position to see if all strings are tested concurrently and are empty.
  • First the case where i = j = 0 is when both X and Y are empty and then by definition Z (which is also empty) is a valid interleaving of X and Y .
  • If x(i) != z(i+j) and y(j) == z(i+j) then there could only be a valid interleaving in which xi appears last in the interleaving, and hence c[i, j] is true exactly when z1, . . . , z ( i+j−1) is a valid interleaving of x1, . . . , x(i−1) and y1, . . . yj which is given by c[i , j -1]. 
  • Similarly, when xi == zi+j and yj != zi+j then c[i, j] = c[i − 1, j].
  • If xi = yj = z(i+j), the interleaving (if it exists) must either end with xi (in which case c[i − 1, j] is true) or must end with yi (in which case c[i, j − 1] is true). Thus returning c[i − 1, j]  OR c[i, j − 1] gives the correct answer.
  • Finally, since in all cases the value of c[i, j] comes directly from the answer to one of the subproblems, we have the optimal substructure property.
  • The time complexity is clearly O(nm) since there are n ·m subproblems each of which is solved in constant time. 
  • Finally, the c[i, j] matrix can be computed in row major order.


PROOF

We now argue this recursive definition is correct. 

No comments:

Post a Comment