Cs141 Home/Lecture4

/Lecture3 /Lecture5

logarithms, polynomials, exponentials, Moore's law, big-O notation, a trick for bounding sums, geometric sums

math for asymptotic analysis

(under construction)

Review of chapter 3.

polynomials:

for example, n^2 + 3*n, or n^{1/2} (the square root of n)

For any polynomial p(n), multiplying n by a constant factor changes p(n) by at most a constant factor.

If your algorithm has worst-case running time bounded by a polynomial in the size of the input, that is good!
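
To make the constant-factor claim above concrete, here is a small numeric check (a sketch in Python; the polynomial p(n) = n^2 + 3*n and the sample values are chosen just for illustration):

    # For p(n) = n^2 + 3n (degree 2), replacing n by 2n changes p(n)
    # by at most a factor of 2^2 = 4.
    def p(n):
        return n**2 + 3*n

    for n in [10, 100, 1000, 10000]:
        print(n, p(2*n) / p(n))   # ratios approach 4 but never exceed it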

exponentials:

b^k = b*b*b*...*b (k times).
b^i * b^j = b^{i+j}
(b^i)^j = b^{ij}
b^{-i} = (1/b)^i

Exponentials with base > 1 (e.g. 2^n) grow large very fast. Exponentials with base < 1 (e.g. 2^{-n}) shrink toward zero very fast.

For any exponential function f(n) (such as 2^n), if you increase n by a constant factor (say a factor of 2), how much can the function increase by?

If your algorithm takes time exponential in the size of the input, it won't be useful for solving very large problems!
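
For contrast with the polynomial case, a quick check (again a Python sketch with illustrative values) of what happens to 2^n when n merely doubles:

    # If f(n) = 2^n, then f(2n) = 2^(2n) = (2^n)^2: doubling n SQUARES the value.
    for n in [10, 20, 40]:
        print(n, 2**n, 2**(2*n) == (2**n)**2)   # True: f(2n) = f(n)^2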

logarithms:

log_b(n) is roughly the number of times you have to divide n by b to get it down to 1 or less.
log_2(n) is proportional to log_10(n) --- the base of the logarithm doesn't matter (up to constant factors).
log(a*b) = log(a) + log(b)

log(n) grows quite slowly. e.g. log_10(10^{100}) is only 100.

If you increase n by a constant factor (say a factor of 2) how much does log(n) increase by?
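
One way to see the answer (a Python sketch; the sample values are arbitrary):

    # log_2(2n) = log_2(2) + log_2(n) = 1 + log_2(n), so doubling n adds only 1.
    from math import log2

    for n in [10, 100, 10**6]:
        print(n, log2(2*n) - log2(n))   # always 1.0 (up to rounding)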

summations: Σ_{i=a..b} f(i)

geometric sums such as Σ_{i=0..n} 2^i = 1+2+4+ ... + 2^n are proportional to their largest term
for other sums, such as Σ_{i=1..n} i^2, we get upper and lower bounds that match up to constant factors.

Worst-case analysis of running times of algorithms

1. We try to bound the running time from above by a function of the size of the input.

For example, "to bubblesort an n-item list takes at most time proportional to n2". Here we take the number of items in the list as a measure of the input size.

"Euclid's algorithm takes at most time proportional to log(min(i,j))"? What is the size of the input? (Trick question. Generally, by the size of an input instance, we mean the number of characters it takes to write down the instance. So, the size of an integer n, in this sense is proportional to log(n).) Euclid's algorithm runs in polynomial time.

2. We usually neglect constant factors in the running time. ("the time is at most proportional to..." something.)

3. Running time is proportional to the number of basic operations made by the algorithm. What is a basic operation? We need to think about what the machine has to do to support the operation: basic arithmetic, array accesses, expression evaluations, etc.
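
To tie points 1-3 together, here is a sketch in Python (the input pairs are arbitrary illustrative values) that counts the basic operations (loop iterations) of Euclid's algorithm and compares the count to log_2(min(i,j)):

    from math import log2

    def gcd_steps(i, j):
        # Euclid's algorithm; returns (gcd, number of loop iterations).
        steps = 0
        while j != 0:
            i, j = j, i % j
            steps += 1
        return i, steps

    for a, b in [(10**6, 10**6 - 1), (1234567, 7654321), (2**40, 3**25)]:
        g, s = gcd_steps(a, b)
        print("gcd =", g, "steps =", s, "log2(min) =", round(log2(min(a, b)), 1))

The step counts never grow faster than proportionally to log_2(min(i,j)), consistent with the bound quoted above.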


Moore's law: Every few years, computers get faster by a constant factor.

If you get a computer that is faster by a constant factor than your old one, the new one can solve bigger problems than your old one. How much bigger? Depends on the algorithm:

polynomial time --- you can solve problems bigger by a constant factor
exponential time --- you can solve problems bigger only by an additive constant
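
A concrete comparison (a Python sketch; the operation budget T is an arbitrary illustrative number):

    from math import isqrt, log2

    T = 10**12   # basic operations the old machine can do in our time budget

    # n^2-time algorithm: solvable size is about sqrt(budget), so doubling
    # the budget multiplies the solvable size by sqrt(2) -- a constant factor.
    print("n^2 time:", isqrt(T), "->", isqrt(2*T))

    # 2^n-time algorithm: solvable size is about log2(budget), so doubling
    # the budget adds just 1 to the solvable size.
    print("2^n time:", int(log2(T)), "->", int(log2(2*T)))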

Review of big-O notation

We've been using the phrase proportional to. To be more precise about what we mean, we will start using the big-O notation. Here we define it.

big-O: Suppose f(n) and g(n) are two functions.
Then f(n) is O(g(n)) if there exist positive constants n_0 and c such that
for all integers n ≥ n_0 it is the case that f(n) ≤ c g(n).
Or, we might say that f(n) is at most proportional to g(n).

Illustration: 3(sin(x/10)+2) is O(x) (take n_0 = 100 and c = 4). On the other hand, x is not O(3(sin(x/10)+2)): the function 3(sin(x/10)+2) never exceeds 9, while x grows without bound.

upload:big_O.gif
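
A quick numeric sanity check of the illustration (a Python sketch; it spot-checks sample points rather than proving anything):

    from math import sin

    f = lambda x: 3 * (sin(x / 10) + 2)

    # f oscillates between 3 and 9, so f(x) <= 4x certainly holds once x >= 100.
    print(all(f(x) <= 4 * x for x in range(100, 10000)))   # True
    print(max(f(x) for x in range(10000)) <= 9)            # True: f never exceeds 9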

For example, let's apply the definition to try to show 3n^2 + 10 is O(n^2).

We need to find constants n_0 and c such that (∀ n ≥ n_0) 3n^2 + 10 ≤ c n^2. Let's work backwards to find constants that will work.

We want 3n^2 + 10 ≤ c n^2 for n ≥ n_0.

Dividing through by n^2 and solving for c, we want c ≥ 3 + 10/n^2 for n ≥ n_0.

Let's try n_0 = 10. Since 10/n^2 gets smaller as n gets larger, for n ≥ 10 we have 3 + 10/n^2 ≤ 3 + 10/100, so any c ≥ 3.1 will do; in particular, c = 4 works.

So, we've worked out that 3n^2 + 10 is O(n^2). If we wanted to make a clean proof out of our reasoning, it might look like this:

claim: 3n^2 + 10 is O(n^2)
proof:
(1) 10 ≤ n^2 for n ≥ 10 (trivially).
(2) 3n^2 + 10 ≤ 4n^2 for n ≥ 10 (by adding 3n^2 to both sides of line (1)).
(3) 3n^2 + 10 is O(n^2) (from (2) and the definition of big-O, taking n_0 = 10 and c = 4).

Or, if we are confident of our algebra and confident that the intended reader of the proof can work it out without help, we could just say:

claim: 3n^2 + 10 is O(n^2)
proof: Verify using algebra from the definition of big-O, taking n_0 = 10 and c = 4.
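
If you want to sanity-check the witnesses numerically (a Python sketch; a finite spot check, whereas the algebra covers every n ≥ n_0):

    # n_0 = 10, c = 4: check 3n^2 + 10 <= 4n^2 on a range of n >= 10.
    print(all(3*n**2 + 10 <= 4*n**2 for n in range(10, 100000)))   # True
    print(3*1**2 + 10 <= 4*1**2)   # False: the inequality can fail below n_0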

exercise: Using the definition, prove that 1000 n^2 is O(n^{10}).

exercise: Using the definition (and contradiction), prove that n^2 is not O(n).

exercise: Define f(n) = 100. Prove that f(n) is O(1).

Rules of thumb:

Any polynomial in n is O(n^k) provided every term in the polynomial has exponent k or less.
(transitivity) If f(n) is O(g(n)), and g(n) is O(h(n)), then f(n) is O(h(n)).
(For more, see proposition 3.16 in the text.)

We also discussed:

f(n) is Ω(g(n)) - f(n) is at least proportional to g(n). Equivalently, g(n) is O(f(n)).
f(n) is Θ(g(n)) - f(n) is proportional to g(n). Equivalently, f(n) is O(g(n)) and f(n) is Ω(g(n)).
o(g(n)), ω(g(n)) - the strict ("little-o" and "little-omega") variants.

bounding sums

Recall that Σ_{i=1..n} i^2 is 1 + 2^2 + 3^2 + ... + n^2.

This looks at first like a quadratic polynomial, so you might be tempted to say it is O(n^2). But that's wrong: the sum has n terms, not a fixed number of them. If you try to apply the definition of big-O, you will see why.

It is possible to use clever tricks to get "closed-form" expressions for sums such as this. For example, you may know that 1+2+3+...+n = n(n+1)/2.

But in this class, we generally are happy to get upper and lower bounds on the sum that are within constant factors of each other. Here's a useful trick for doing that:

Consider Σ_{i=1..n} i^2 = 1 + 2^2 + 3^2 + ... + n^2. To get the upper bound, note that there are n terms, and each term is at most n^2. Thus,

Σ_{i=1..n} i^2 ≤ n*n^2 = n^3.
To get the lower bound is a little trickier: note that the last n/2 terms are each at least (n/2)^2, so
Σ_{i=1..n} i^2 ≥ (n/2)*(n/2)^2 = n^3/8.
Thus, the value of the sum is between n^3/8 and n^3. We can conclude that
Σ_{i=1..n} i^2 = Θ(n^3).

This trick will give tight bounds (up to constant factors) for sums where the largest n/2 terms are proportional to the maximum term.
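
Here is a numeric check of the sandwich we just derived (a Python sketch; the sample sizes are arbitrary):

    # Check n^3/8 <= sum_{i=1..n} i^2 <= n^3. (The exact ratio tends to 1/3.)
    for n in [10, 100, 1000]:
        s = sum(i**2 for i in range(1, n + 1))
        print(n, n**3 / 8 <= s <= n**3, s / n**3)   # True, ratio near 1/3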

Geometric sums do not have this property. For example, consider the sum

Σ_{i=0..n} 2^i = 1 + 2 + 2^2 + 2^3 + ... + 2^n.
This is a geometric sum because each term is a constant factor larger than the previous one.

The upper bound trick gives us that this sum is at most n*2^n; the lower bound trick gives us that the sum is at least (n/2)*2^{n/2}.

Note that 2^{n/2} is MUCH SMALLER than 2^n. To see this, consider their ratio: 2^n / 2^{n/2} = 2^{n - n/2} = 2^{n/2}. So the lower bound, while valid, is far from tight. A better lower bound comes from simply taking the largest term:

Σ_{i=0..n} 2^i = 1 + 2 + 2^2 + 2^3 + ... + 2^n ≥ 2^n.
A general principle is that any geometric sum is proportional to its largest term. We gave a clever "penny-shifting" argument in class that the number of interior nodes in a complete binary tree is one less than the number of leaves. That corresponds to
Σ_{i=0..n} 2^i = 2^{n+1} - 1.
Here is an algebraic argument that generalizes that "penny-shifting" argument. Fix any base b other than 1. Let
X = 1 + b + b^2 + b^3 + ... + b^n
Multiply both sides by b-1:
X(b-1) = (1 + b + b^2 + b^3 + ... + b^n)(b-1)
Expanding the right-hand side, we get
X(b-1) = (b + b^2 + b^3 + ... + b^n + b^{n+1}) - (1 + b + b^2 + b^3 + ... + b^n).
Note that all but two of the terms on the right cancel, so we get
X(b-1) = b^{n+1} - 1.
Solving for X:
X = (b^{n+1} - 1)/(b-1).
We conclude that
Σ_{i=0..n} b^i = (b^{n+1} - 1)/(b-1).
So, if b is a constant > 1, then
Σ_{i=0..n} b^i = Θ(b^n),
and if b is a positive constant < 1, then
Σ_{i=0..n} b^i = Θ(1).

The punch line: a geometric sum has value proportional to its largest term.
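
As a final check of the punch line (a Python sketch; the sample values are arbitrary), compare the geometric sum to its closed form and to its largest term:

    # sum_{i=0..n} 2^i equals 2^(n+1) - 1, which is less than twice the
    # largest term 2^n -- so the sum is proportional to its largest term.
    for n in [5, 10, 20]:
        s = sum(2**i for i in range(n + 1))
        print(n, s == 2**(n + 1) - 1, s / 2**n)   # True, ratio just under 2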


see also:

S04_CS141:BoundingSums
S04_CS141:GeometricSums

