Again, we analyze things differently for recursive and iterative programs. For an iterative program, it's usually just a matter of looking at the variable declarations (and storage allocation calls such as malloc() in C). For instance, algorithm 2 declares only an array of n numbers. Analysis of recursive program space is more complicated: the space used at any time is the total space used by all recursive calls active at that time. Each recursive call in algorithm 1 takes a constant amount of space: some space for local variables and function arguments, but also some space for remembering where each call should return to. The calls active at any one time form a path in the tree we drew earlier, in which the argument at each node in the path is one or two units smaller than the argument at its parent. The length of any such path can be at most n, so the space needed by the recursive algorithm is again (some constant factor times) n. We abbreviate the "some constant factor times" using "O" notation: O(n).
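To see this path of active calls concretely, here is a hypothetical instrumented version of the recursive algorithm (the depth counters and the main driver are my additions, not part of algorithm 1). The largest number of simultaneously active calls it reports stays close to n, matching the O(n) space bound:

#include <stdio.h>

static int depth = 0, max_depth = 0;   /* calls active right now / the most at any time */

int fib(int n)
{
    depth++;                           /* one more call becomes active */
    if (depth > max_depth) max_depth = depth;
    int result = (n <= 2) ? 1 : fib(n - 1) + fib(n - 2);
    depth--;                           /* this call is about to return */
    return result;
}

int main(void)
{
    printf("F(20) = %d, at most %d calls active at once\n", fib(20), max_depth);
    return 0;
}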
It turns out that algorithm 2 can be modified to use a much smaller amount of space. Each step through the loop uses only the previous two values of F(n), so instead of storing these values in an array, we can simply use two variables. This requires some swapping around of values so that everything stays in the appropriate places:
Algorithm 3:
int fib(int n)
{
    int a, b, c;
    a = b = 1;                 /* a holds F(i-2), b holds F(i-1) */
    if (n <= 2) return 1;
    for (int i = 3; i <= n; i++) {
        c = a + b;             /* c = F(i) */
        a = b;                 /* shift the two saved values forward */
        b = c;
    }
    return b;
}
Here c represents f[i], b represents f[i-1], and a represents f[i-2]. The two extra assignments after the sum shift those values over in preparation for the next iteration. This algorithm uses roughly 4n lines to compute F(n), so it is slower than algorithm 2, but uses much less space.
A problem with the analysis of the two algorithms above: what is a line of code? If I use whitespace to break a line into two, it doesn't change the program speed but does change the number of lines executed. And as mentioned before, if I buy a faster computer, it does change program speed but doesn't change the analysis.
To avoid extraneous details like whitespace and computer type, we use "big O" notation. The idea: we already write the times as a function of n. Big O notation treats two functions as being roughly the same if one is c times the other where c is a constant (something that doesn't depend on n). So for instance we would replace 3F(n)-2 by O(F(n)) and both 2n and 4n by O(n).
Formally, we say that

f(n) = O(g(n))

if there is some constant c such that, for all sufficiently large n, f(n) <= c g(n).
It is true that 4n = O(n), but it is also true that n = O(4n). Note, however, that O is not in general a symmetric relation: n = O(F(n)), but it is not true that F(n) = O(n). In practice we will usually use O notation only to simplify formulas, by ignoring constant factors and other extraneous details.
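As a quick check of the definition:

\[
4n \le 4 \cdot n \text{ for every } n, \text{ so } 4n = O(n) \text{ (take } c = 4\text{)}; \qquad n \le 1 \cdot 4n, \text{ so } n = O(4n).
\]
\[
n \le F(n) \text{ once } n \ge 5, \text{ so } n = O(F(n)); \text{ but } F(n)/n \text{ grows without bound, so no constant } c \text{ gives } F(n) \le c\,n.
\]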
What is the point of O notation? First, it makes life easier by allowing us to be less careful about the fine details of an algorithm's behavior. But it also lets us compare two algorithms easily. Algorithms 2 and 3 are both O(n). According to the number of lines executed, one is twice as fast as the other, but this ratio does not change as a function of n. Other factors (like the time needed to allocate the large array in algorithm 2) may mean that in actual time the algorithms are closer to each other; a more careful analysis is needed to determine which of the two to use. On the other hand, we know that 4n is much better than 3F(n)-2 for any reasonable value of n -- and this doesn't depend on the factor of 4 in the 4n time bound; it would be the same for 7n or 12n. As n grows, the ratio of F(n) to n gets very large, so very quickly any O(n) algorithm will be faster than any O(F(n)) algorithm. Replacing 4n by O(n) is an abstraction that lets us compare it to other functions without details like the 4 getting in the way.
Recursive powering
Even algorithms 2 and 3 above aren't the best!
Here's a mathematical trick with matrices:
\[
\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^{n}
=
\begin{pmatrix} F(n+1) & F(n) \\ F(n) & F(n-1) \end{pmatrix}
\]
(You don't have to remember much linear algebra to understand this -- just the formula for multiplying two symmetric 2x2 matrices:

\[
\begin{pmatrix} a & b \\ b & c \end{pmatrix}
\begin{pmatrix} d & e \\ e & f \end{pmatrix}
=
\begin{pmatrix} ad+be & ae+bf \\ bd+ce & be+cf \end{pmatrix}
\]
You can then prove the result above by induction: let

\[
A = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix},
\]

assume by induction that the equation above is true for some n, multiply both sides by one more factor of A using the formula for matrix multiplication, and verify that the terms you get are the same as the formula defining the Fibonacci numbers.)
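Written out, the induction step is a single application of the multiplication formula:

\[
\begin{pmatrix} F(n+1) & F(n) \\ F(n) & F(n-1) \end{pmatrix}
\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}
=
\begin{pmatrix} F(n+1)+F(n) & F(n+1) \\ F(n)+F(n-1) & F(n) \end{pmatrix}
=
\begin{pmatrix} F(n+2) & F(n+1) \\ F(n+1) & F(n) \end{pmatrix},
\]

which is the claimed identity with n replaced by n+1.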
We can use this to define another iterative algorithm, using matrix multiplication. Although I will write this in C syntax, we are starting to get to pseudo-code, since C does not have matrix multiplication built in to it the way I have it written below. The following algorithm initializes a matrix M to the identity matrix (the "zeroth power" of A) and then repeatedly multiplies M by A to form the (n-1)st power. Then by the formula above, the top left corner holds F(n), the value we want to return.
Algorithm 4:
int fib(int n)
{
    int M[2][2] = {{1,0},{0,1}};
    for (int i = 1; i < n; i++)
        M = M * {{1,1},{1,0}};
    return M[0][0];
}
This takes time O(n) (so much better than algorithm 1) but is probably somewhat slower than algorithm 2 or algorithm 3. (The big O notation hides the difference between these algorithms, so you have to be more careful to tell which is better.) Like algorithm 3, this uses only O(1) space.
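If you want something that actually compiles, the matrix can be kept as four scalar variables and the 2x2 product written out entry by entry. This is just a sketch of algorithm 4 in real C; the name fib4 and the temporaries are my own, not part of the notes:

int fib4(int n)
{
    /* M starts as the identity matrix, the "zeroth power" of A */
    int m00 = 1, m01 = 0, m10 = 0, m11 = 1;
    for (int i = 1; i < n; i++) {
        /* M = M * {{1,1},{1,0}}, written out entry by entry */
        int t00 = m00 + m01;   /* m00*1 + m01*1 */
        int t01 = m00;         /* m00*1 + m01*0 */
        int t10 = m10 + m11;
        int t11 = m10;
        m00 = t00; m01 = t01; m10 = t10; m11 = t11;
    }
    return m00;                /* top left corner holds F(n) */
}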
But we can compute M^n more quickly. The basic idea: if you want to compute, say, 3^8, you can multiply eight 3's together one at a time (3*3*3*3*3*3*3*3), or you can repeatedly square: 3^2 = 9, 9^2 = 3^4 = 81, 81^2 = 3^8 = 6561. Repeated squaring uses many fewer multiplications, since each one doubles the exponent rather than simply adding one to it. With some care, the same idea works for matrices, and can be extended to exponents other than powers of two.
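Here is the repeated-squaring idea written for ordinary integers first (a small illustrative sketch; the function power is not part of the algorithms in these notes), which may make the matrix version below easier to follow:

long power(long x, int n)          /* compute x^n by repeated squaring */
{
    if (n == 0) return 1;
    long half = power(x, n / 2);   /* x^(n/2), rounded down */
    long result = half * half;     /* squaring doubles the exponent */
    if (n % 2 == 1) result *= x;   /* fix up odd exponents with one more factor of x */
    return result;                 /* about log n multiplications in total */
}

For example, power(3, 8) performs the three squarings described above instead of seven separate multiplications by 3.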
Algorithm 5:
int M[2][2] = {{1,0},{0,1}};

int fib(int n)
{
    matpow(n-1);
    return M[0][0];
}

void matpow(int n)
{
    if (n > 1) {
        matpow(n/2);
        M = M*M;
    }
    if (n is odd) M = M * {{1,1},{1,0}};
}
Basically all the time is in matpow, which is recursive: it computes the nth power of A by squaring the (n/2)th power. If n is odd, rounding n/2 down and squaring that power of A gives the (n-1)st power, which we "fix up" by multiplying in one more factor of A.
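For reference, here is one way to flesh the pseudo-code out into compilable C. The mat_mult helper, the explicit reset of M inside fib, and the main driver are my additions for illustration, not part of the algorithm above:

#include <stdio.h>

static int M[2][2] = {{1, 0}, {0, 1}};   /* global matrix, starts as A^0 = identity */
static int A[2][2] = {{1, 1}, {1, 0}};

static void mat_mult(int B[2][2])        /* M = M * B */
{
    int r[2][2];
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            r[i][j] = M[i][0] * B[0][j] + M[i][1] * B[1][j];
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            M[i][j] = r[i][j];
}

static void matpow(int n)                /* set M = A^n by repeated squaring */
{
    if (n > 1) {
        matpow(n / 2);                   /* M = A^(n/2), rounded down */
        mat_mult(M);                     /* square it: M = M * M */
    }
    if (n % 2 == 1)
        mat_mult(A);                     /* odd n: multiply in one more factor of A */
}

int fib(int n)
{
    M[0][0] = M[1][1] = 1;               /* reset M to the identity so that */
    M[0][1] = M[1][0] = 0;               /* fib can be called more than once */
    matpow(n - 1);
    return M[0][0];
}

int main(void)
{
    for (int n = 1; n <= 10; n++)
        printf("F(%d) = %d\n", n, fib(n));
    return 0;
}

Because M is a global variable, it has to be reset to the identity at the start of each call; otherwise a second call to fib would start from the previous result.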
This is a recursive algorithm, so as usual we get a recurrence relation defining time, just by writing down the time spent in a call to matpow (O(1)) plus the time in each recursive call (only one recursive call, with argument n/2). So the recurrence is
time(n) = O(1) + time(n/2)
It turns out that this solves to O(log n). For the purposes of this class, we will use logarithms base 2, and round all logarithms to integers, so log n is basically the number of bits needed to write n down in binary. An equivalent way of defining it is the smallest value of i such that n < 2^i. But clearly if n < 2^i, n/2 < 2^(i-1) and conversely, so log n satisfies the recurrence log(n) = 1 + log(n/2). The recurrence defining the time for matpow is basically the same except with O(1) instead of 1. So the solution to the recurrence is just the sum of log n copies of O(1), which is O(log n).
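One way to see this is to unroll the recurrence one level at a time:

\[
\mathrm{time}(n) = O(1) + \mathrm{time}(n/2) = O(1) + O(1) + \mathrm{time}(n/4) = \cdots = \underbrace{O(1) + \cdots + O(1)}_{\log n \text{ terms}} = O(\log n).
\]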
If n is 1 billion, log n would only be 30, and this algorithm would be better than algorithms 2 and 3 in the same way that they are better than algorithm 1.
(This is actually somewhat cheating: to be able to use this for n equal to a billion, you need to be able to write down the answer, which will have O(n) digits, and you need to be able to store variables with that many digits. Manipulating such large numbers would take more like O(n) steps per operation, whereas here we are only counting one step per integer multiplication or addition. But even if you used a special library for dealing with large numbers, algorithm 5 would be much faster than the other ones.)
Actually, you can get the original formula 1.618^n to work using a similar repeated squaring trick, also in time O(log n). So to tell which method is better you have to be more careful and not just use O notation: dealing with an integer matrix is somewhat simpler than computing floating point square roots, so the matrix method wins.
Which is the sort of comparison that analysis of algorithms is all about...