REVIEW FOR CS141 FINAL EXAM                              Winter, 2009

The final exam will be comprehensive, covering all topics we have
discussed in class, but more weight will be given to topics discussed
after the midterm test. Roughly, 50% of the exam will be on topics
discussed before the midterm and 50% on things covered after the
midterm. The style of the final exam will be similar to that of the
midterm. Below we list all topics covered in the course. Some practice
exams may be conducted in the AEW hosted by Robert, which are highly
recommended.

TOPICS COVERED -- before the midterm

1. Fundamentals of algorithms and analysis
   Concepts of algorithms, pseudocode, time complexity/efficiency,
   worst case, average case, asymptotic notation, big-O (for upper
   bounds), big-Omega (for lower bounds), Theta (for tight bounds),
   manipulation of the notations, and the connection to limits.
   Math tools (mostly learned in Math 111): log/exp/poly functions,
   summations and their asymptotic expressions, max/min operators,
   arithmetic/geometric progressions, recurrence relations and
   techniques for solving recurrences (backward substitution, the
   Master Theorem, linear homogeneous recurrences). How to use these
   concepts and tools in analyses.
   Main data structures learned in CS14 and their efficient use in
   algorithms, especially adjacency lists and adjacency matrices,
   trees, and binary trees.
   Properties of graphs and digraphs: the maximum number of edges in
   a graph/digraph, the Handshaking Theorem (the relation between the
   sum of degrees and the number of edges), sparse graphs, rooted and
   unrooted trees (and their definitions), the number of edges in a
   tree (|E| = |V| - 1), binary trees, complete binary trees, etc.

2. Brute force and exhaustive search algorithms
   Selection sort, bubble sort, sequential search, the naive string
   matching algorithm, closest pair, convex hull, TSP, knapsack, and
   assignment. The worst-case analyses of these algorithms.
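As a concrete reminder of the brute-force style, here is a short
selection sort sketch. Python is used only for illustration (the
exam expects pseudocode), and the function name is my own.

```python
def selection_sort(a):
    """Brute-force sort: repeatedly select the minimum of the
    unsorted suffix.  Theta(n^2) comparisons in every case."""
    a = list(a)                      # work on a copy
    n = len(a)
    for i in range(n - 1):
        m = i                        # index of the smallest key so far
        for j in range(i + 1, n):
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]      # swap it into position i
    return a
```

Note that the number of comparisons does not depend on the input
order, which is why its best case is no better than its worst case.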
3. Divide-and-conquer algorithms
   Mergesort, quicksort, binary tree traversals, fast multiplication
   of integers and matrices, and closest pair and convex hull
   revisited. The analyses of these algorithms and why they are more
   efficient than the brute-force solutions.

4. Decrease-and-conquer algorithms
   DFS and BFS in graphs and directed graphs, topological sorting,
   Euclid's algorithm, selection, and interpolation search. How would
   you solve the problem otherwise (without using the
   reduction/induction technique)? What data structures are used in
   the most efficient algorithms for DFS/BFS?

TOPICS COVERED -- after the midterm

5. Transform-and-conquer algorithms
   The presorting (and, in general, preprocessing) technique, data
   structures that speed up dictionary operations, such as binary
   search trees and balanced binary search trees (in particular AVL
   trees), Horner's rule, and counting the number of paths in a
   (di)graph.

6. Dynamic programming
   Recursion versus iteration, the basic framework (the steps) of
   dynamic programming, computing Fibonacci numbers and binomial
   coefficients (or counting combinations), knapsack, longest common
   subsequence, Warshall's algorithm for transitive closure, Floyd's
   algorithm for all-pairs shortest paths, and optimal binary search
   trees. Given a recurrence relation, can you derive a D.P.
   algorithm in pseudocode? How would you use these methods to design
   efficient (new) algorithms to solve (simple) problems?

7. The greedy method
   (Combinatorial) optimization problems, the greedy strategy, change
   making, minimum spanning trees and Prim's and Kruskal's
   algorithms, Dijkstra's algorithm for single-source shortest paths,
   Huffman codes, and how to tell whether a greedy solution is
   optimal.

8. String matching
   The concept of time-space trade-off (or the use of preprocessing),
   the brute-force algorithm, Horspool's algorithm, the Boyer-Moore
   algorithm, the KMP algorithm and its failure/prefix function, and
   the worst-case analysis of KMP.
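To illustrate the decrease-and-conquer idea in topic 4, here is a
minimal sketch of Euclid's algorithm (Python only for concreteness;
the exam expects pseudocode).

```python
def gcd(m, n):
    """Euclid's algorithm: decrease-and-conquer, replacing the
    pair (m, n) by (n, m mod n) until the remainder is 0."""
    while n != 0:
        m, n = n, m % n
    return m
```

Each iteration strictly decreases the second argument, which is what
makes a decrease-and-conquer analysis of its running time possible.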
9. Lower bound techniques
   Decision tree and adversary arguments, reducibility, NP and
   NP-completeness.

Note that several fundamental graph algorithms are included above,
such as DFS (and topological sort, checking acyclicity and
connectivity), BFS, Warshall's algorithm, Floyd's algorithm, Prim's
algorithm, Kruskal's algorithm, and Dijkstra's algorithm. For these
algorithms, not only do you need to know how they work and how fast
they run, you should also know their assumed data structures
(adjacency matrix vs. adjacency lists) and how to use them to solve
other problems.

TYPES OF FINAL EXAM QUESTIONS

* Short statements.
  What are the differences between an algorithm and a program? What
  factors determine the speed of a program? When is a
  divide-and-conquer algorithm very efficient and when is it not?
  What kind of strings would make the brute-force string matching
  algorithm run in its worst-case (i.e. quadratic) time? How would
  you represent a graph in order to make DFS run in linear time?
  What is backtracking in dynamic programming? Do greedy algorithms
  always give optimal solutions? If not, give an example where the
  greedy algorithm does not give the optimal solution. Do Floyd's and
  Dijkstra's algorithms work on digraphs with negative weights? If
  not, can you find a counterexample?

* Given a math function/summation, find its asymptotic bound in
  Theta.

      3n^2 + 2n log n + 3^n = ?

       n
      ---
      \
      /    i^2 + i log(i+1) = ?
      ---
      i=1

* Given a recurrence relation, solve it by finding its closed form.

      f(n) = 2,            if n <= 1
      f(n) = f(n-2) + n,   if n > 1

* Given a recurrence relation, design an efficient dynamic
  programming algorithm.
  Write a dynamic programming algorithm to compute f(n) above.
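The bottom-up idea can be checked with a short Python sketch
(illustration only; the exam expects pseudocode, and the function
name f is mine):

```python
def f(n):
    """Bottom-up dynamic programming for the recurrence
       f(n) = 2 if n <= 1, and f(n) = f(n-2) + n if n > 1."""
    table = [0] * (max(n, 1) + 1)    # table[i] will hold f(i)
    table[0] = 2
    if n >= 1:
        table[1] = 2
    for i in range(2, n + 1):        # fill the table in order
        table[i] = table[i - 2] + i
    return table[n]
```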
  Answer:

      var f[0..n]: array of integer;
      f[0] <- 2;  f[1] <- 2;
      for i <- 2 to n do
          f[i] <- f[i-2] + i;
      output(f[n]);

* Analyze the time complexity of a given algorithm (written in
  pseudocode).

  (a) x = 0; y = 1;
      for( i = 0; i < n; i++ )
          for( j = 0; j < n; j++ )
              x++;
      y = x * x;

  (b) x = 0;
      for( i = 0; i < n; i++ )
          for( j = 0; j < n * n; j++ )
              x++;

  (c) x = 0;
      for( i = 0; i < 2 * n; i++ )
          for( j = 0; j < 3 * i; j++ )
              x++;

  (d) x = 0;
      for( i = 1; i < n; i++ )
          for( j = 0; j < i * log n; j++ )
              x++;

  (e) x = 0; i = 1;
      while( i < n ) {
          i = 2 * i;
          x++;
      }

  (f) algorithm Test(A[1..n]);
      for i := 1 to ...
          for j := 1 to ...
              while ....
                  call Test(A[i..j]);

  (g) Prove that in Euclid's algorithm, the parameter n is reduced by
      at least a factor of two after two iterations.

  (h) Prove that DFS runs in time O(|V|+|E|) when the input graph is
      represented as adjacency lists, and in time O(|V|^2) when the
      input graph is represented as an adjacency matrix.

* Run an existing/known algorithm on a given instance.
  Apply the divide-and-conquer integer multiplication algorithm to
  two input integers.
  Apply Euclid's algorithm to two input integers.
  Run the topological sorting algorithm on some DAG.
  Trace the KMP algorithm on a given text and pattern.
  Trace the greedy Huffman code algorithm.
  Trace Dijkstra's algorithm on a given digraph of 5 vertices and
  show the d_u values and p_u values after each iteration.
  Given the adjacency matrix of a digraph, compute its transitive
  closure using Warshall's algorithm.

* Design an efficient (new) algorithm in pseudocode for a given
  (simple) problem, perhaps with a target efficiency/running time.

  1) Write a brute force algorithm for finding a Hamiltonian
     circuit.

  2) Write an algorithm to sort 4 keys in 5 comparisons.
     (hint: divide-and-conquer)

  3) Given two sequences A and B, write an algorithm to decide if A
     is a subsequence of B in O(|B|) time. E.g. 32123 is a
     subsequence of 123123123123, but 321321 is not.
     (hint: perform a sequential search of the letters of A in B.
     Note that the resulting algorithm is in fact greedy.)

     Answer: Suppose A = a_1 ... a_n and B = b_1 ... b_m.

         j <- 1;
         for i <- 1 to n do begin
             while j <= m and b_j <> a_i do
                 j++;
             if j > m then output false and halt;
             j++;                   { move past the matched letter }
         end;
         output true;

  4) Given two sets of n integers A and B, determine if A is equal
     to B. Can you do this in O(n log n) time?
     Answer: Heapsort/mergesort A and B, and compare the sorted
     lists sequentially as in the Merge procedure.

  5) Let A = a_1, a_2, ..., a_n and B = b_1, b_2, ..., b_n be two
     lists of integers, and x another integer. Write an efficient
     algorithm to decide if x = a_i + b_j for some i and j. Surely
     you can do it in O(n^2) time, but can you do it in O(n log n)
     time?
     (Hint: Define another list B' = x - b_1, x - b_2, ..., x - b_n.
     Sort both A and B', and then compare sequentially. This is a
     good example of the transform-and-conquer paradigm.)

  6) Levitin, p. 171, Question 8 (using BFS to check if a graph is
     bipartite, i.e. 2-colorable).

  7) Levitin, p. 187, Question 3 (3-way algorithm to find a fake
     coin).

  8) Given two graphs G_1 = (V_1,E_1) and G_2 = (V_2,E_2), determine
     if G_1 is a subgraph of G_2.
     Suppose G_1 and G_2 are represented as adjacency matrices; can
     you do this in O(|V_2| + |V_1|^2) time?
     Answer: If V_1 is contained in V_2, then check for every
     u,v in V_1 that A_1[u,v] = 1 implies A_2[u,v] = 1.
     Suppose G_1 and G_2 are represented as adjacency lists; how
     fast can you do this?
     Answer: If V_1 is contained in V_2, then for every u in V_1
     check if Adj_1[u] is contained in Adj_2[u]. This takes
     O(|V_2|) + O(|E_2|) time if the lists are sorted.

  9) Write an O(|V|+|E|) time algorithm to determine if a given
     digraph is in fact a DAG.
     (hint: perform DFS and watch for back edges)

  10) Given a weighted graph G = (V,E) and an edge e in E, check if
      e appears in some MST of G, i.e. check if {e} is expandable to
      some MST.
      (hint: run Kruskal's (or Prim's) algorithm both from scratch
      and with A = {e} as the initial set, and then compare costs.)

  11) Given a digraph G = (V,E), compute the transpose of G.
      (The transpose is defined as the digraph obtained by reversing
      every edge of G.) How would your algorithm change if G is
      represented as adjacency lists or as an adjacency matrix?

  12) Levitin, p. 106, Question 9. (Counting substrings starting
      with A and ending with B.)

  13) Levitin, p. 171, Questions 6 and 7. (Does DFS or BFS find a
      cycle faster than the other?)

  14) Levitin, p. 292, Questions 2 and 9. (A counterexample for
      Floyd's algorithm on graphs with negative weights.)

  15) Levitin, p. 327, Questions 4 and 7. (Is the Dijkstra tree an
      MST? Can you design a linear time algorithm for single-source
      shortest paths on DAGs? Hint: grow the tree in topologically
      sorted order.)

  16) Questions in the homeworks and midterm.

  How? Use the design techniques learned, your basic training in
  math and programming (and sometimes common sense), and work on a
  few small instances to develop intuition before you start
  pseudocoding.

* Use fundamental algorithms learned in class to design new
  algorithms (here you don't have to explain how the fundamental
  algorithms work; you can just refer to them as subroutines).

  1) Use DFS to determine if a given graph is connected.

  2) Use BFS to determine if a given graph is acyclic.

  3) Use Warshall's algorithm to determine if a digraph is strongly
     connected. (Note: a digraph is strongly connected iff for every
     pair of vertices u,v, there is a path from u to v and there is
     a path from v to u.)
     Answer: Run Warshall's algorithm to obtain the transitive
     closure matrix T, then check that for every pair u,v,
     T[u,v] = 1 and T[v,u] = 1.
     Time: O(|V|^3) + O(|V|^2) = O(|V|^3).
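For self-checking, this last answer can be sketched in a few lines of
Python (illustration only; the function names are my own).

```python
def warshall(adj):
    """Warshall's algorithm on an adjacency matrix:
    T[u][v] = 1 iff there is a nontrivial path from u to v.
    O(|V|^3) time."""
    n = len(adj)
    t = [row[:] for row in adj]       # copy the adjacency matrix
    for k in range(n):                # allow vertex k as intermediate
        for u in range(n):
            for v in range(n):
                if t[u][k] and t[k][v]:
                    t[u][v] = 1
    return t

def strongly_connected(adj):
    """A digraph is strongly connected iff T[u][v] = 1 for every
    ordered pair of distinct vertices u, v."""
    t = warshall(adj)
    n = len(adj)
    return all(t[u][v] == 1
               for u in range(n) for v in range(n) if u != v)
```

For example, a directed 3-cycle is strongly connected, while a
directed path is not, since no edge leads back to the start.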