Answers to "questions1"
-----------------------

Analysis of Algorithms
----------------------

1. A recursive design for the merge is:

   To merge a[1]...a[i] and a[i+1]...a[n] into b[1]...b[n]
   -------------------------------------------------------
      if either sequence is exhausted, copy the other into b and return
      if a[1] < a[i+1] then
         b[1] := a[1]
         merge a[2]...a[i], a[i+1]...a[n] into b[2]...b[n]
      else
         b[1] := a[i+1]
         merge a[1]...a[i], a[i+2]...a[n] into b[2]...b[n]

   This is followed by copying b back into a.  This recursive design
   easily yields a sequential one:

   To merge a[1]...a[i] and a[i+1]...a[n] into b[1]...b[n]
   -------------------------------------------------------
      initialise x := 1, y := i+1, z := 1
      while x <= i and y <= n do
         if a[x] < a[y] then
            b[z] := a[x]
            x := x + 1
         else
            b[z] := a[y]
            y := y + 1
         z := z + 1
      copy any remaining items into b
      copy b back to a

   The inner part of the "while" loop takes O(1) time and is repeated
   at most once for each x from 1 to i, and once for each y from i+1
   to n, making n times in all.  Thus the while loop takes O(n) time,
   as does the copying of b into a, so the total is O(n).

   If we only have b[1]...b[i], then we can use the algorithm:

      copy a[1]...a[i] into b[1]...b[i]
      merge b[1]...b[i] and a[i+1]...a[n] into a[1]...a[n]

   The merging can be done by the algorithm above.  The items in
   a[i+1]...a[n] will not be over-written before they are dealt with,
   since if k items remain in a[n-k+1]...a[n], then at most n-k items
   have been dealt with and have been copied into a[1]...a[n-k].
   Clearly the algorithm is still O(n).

   We can always do the merging using only b[1]...b[n/2].  If
   i <= n/2, then we can use the above method with b[1]...b[i].  If
   not, then a[i+1]...a[n] contains n-i <= n/2 items, so we copy
   a[i+1]...a[n] into b[1]...b[n-i], then move a[1]...a[i] into
   a[n-i+1]...a[n], and merge b[1]...b[n-i] and a[n-i+1]...a[n] into
   a[1]...a[n] as before.
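The "copy the first run into b" trick can be sketched in Python (a
minimal, 0-based sketch; the function name is mine, and ties go to the
right-hand run, matching the "<" test in the pseudocode):

```python
def merge_with_half_buffer(a, i):
    """Merge the sorted runs a[0:i] and a[i:n] in place, using a
    scratch buffer of only i cells."""
    n = len(a)
    b = a[:i]              # copy the left run into the buffer
    x, y, z = 0, i, 0      # x scans b, y scans a[i:n], z writes into a
    while x < i and y < n:
        if b[x] < a[y]:    # smaller head first; ties take from the right run
            a[z] = b[x]
            x += 1
        else:
            a[z] = a[y]
            y += 1
        z += 1
    # items left in b are copied back; leftovers in the right run of a
    # are already in their final positions
    a[z:z + (i - x)] = b[x:i]
    return a
```

The write position z never overtakes the read position y, which is the
over-writing argument made above.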
   You were not expected to know it, but the problem can still be
   solved in O(n) time without using b at all - if you want to find
   the algorithm, try it first using b[1]...b[sqrt(n)].

2. We can set b[i] to TRUE if i is in the set, and FALSE otherwise.
   Thus for n = 9 and set {1,2,4,5,7,8,9} we have:

          1     2     3      4     5     6      7     8     9
   b:  TRUE  TRUE  FALSE  TRUE  TRUE  FALSE  TRUE  TRUE  TRUE

   To insert, delete or test i for membership, the operations are:

      insert i:              b[i] := TRUE
      delete i:              b[i] := FALSE
      if i in set then ...:  if b[i] then ...

   which are clearly O(1).

   If a[1]...a[n] represents S, b[1]...b[n] represents T and
   c[1]...c[n] is to represent the union, then c is calculated by:

   To calculate union
   ------------------
      for i := 1 to n do
         c[i] := a[i] or b[i]

   which is O(n).  Equality is tested by:

   To test equality
   ----------------
      for i := 1 to n do
         if not (a[i] = b[i]) then return FALSE
      return TRUE

   which is again O(n).

3. We use s as the size of the set.  Then a[1]...a[s] contain a list
   of its elements in any order (with a[s+1]...a[n] arbitrary).  The
   array b contains the sequence numbers of the elements in
   a[1]...a[s] as follows.  If i is in the set, and appears in the
   list as a[k] = i, then b[i] = k.  If i is not in the set, then b[i]
   is arbitrary.

   This representation can be initialised in O(1) time by setting
   s := 0, the entire contents of a and b being arbitrary.
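The boolean-array representation and its O(n) union and equality tests
can be sketched in Python (names are mine; index 0 is left unused so
that b[i] answers "is i in the set?" directly):

```python
def make_set(n):
    """Boolean-array representation of a subset of {1,...,n}."""
    return [False] * (n + 1)

def insert(b, i):
    b[i] = True            # O(1)

def delete(b, i):
    b[i] = False           # O(1)

def member(b, i):
    return b[i]            # O(1)

def union(a, b):
    """c[i] = a[i] or b[i] for every i -- a single O(n) pass."""
    return [x or y for x, y in zip(a, b)]

def equal(a, b):
    """Element-by-element comparison, again O(n)."""
    return all(x == y for x, y in zip(a, b))
```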
   Insertion, deletion and membership testing can be carried out in
   O(1) time by the algorithms:

   To insert i
   -----------
      a[s+1] := i
      b[i] := s+1
      s := s + 1

   To delete i
   -----------
      k := b[i]          (find out where it is in the list)
      last := a[s]       (find last item)
      a[k] := last       (reposition last item)
      b[last] := k
      s := s - 1

   To test if i in set
   -------------------
      pos := b[i]                  (attempt to find position in list)
      if (pos < 1) or (pos > s)    (check pos was not arbitrary)
      then return FALSE
      if not (a[pos] = i)          (check pos not accidentally in range)
      then return FALSE
      return TRUE

   If we want U to be the union of sets S and T represented as above,
   we can use the algorithm:

   To form union U of S and T
   --------------------------
      U := S
      for each item i in T do
         if i not in S then insert i in U

   The first step "U := S" can be done in time O(n), the inside of the
   loop involves membership tests and insertions as above and can be
   done in O(1) time, and thus the loop takes O(n) time.  Thus the
   total time taken is O(n), and in fact is proportional to the number
   of items in S plus the number of items in T.

   Equality can be tested by:

   To test equality
   ----------------
      if sizes unequal then return FALSE
      for each item i in S do
         if i not in T then return FALSE
      return TRUE

   which similarly takes O(n) time (actually proportional to the set
   sizes).

4. Suppose the mode of a[1]...a[i-1] is a number appearing (say) 5
   times.  Then the mode of a[1]...a[i] will only be different if it
   contains a number appearing 6 times.  The number must be a[i], and
   it must appear in the sequence a[i-5], a[i-4], ..., a[i].  Thus the
   mode changes only if a[i-5] = a[i].  This is one comparison.  If
   the frequency of a[1]...a[i-1] was f, we test whether
   a[i-f] = a[i].
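The sequence-number representation can be sketched as a Python class
(a sketch, 0-based, class and method names mine; note that in Python
allocating the arrays is itself O(n), so the zero-fill merely stands
in for the "arbitrary contents" that a language with uninitialised
arrays would give in O(1) time):

```python
class SparseSet:
    """Subset of {1,...,n}: a[0:s] lists the members in any order,
    b[i-1] records where i sits in that list.  Entries outside a[0:s]
    are arbitrary -- membership cross-checks them."""
    def __init__(self, n):
        self.a = [0] * n      # member list; only a[0:s] is meaningful
        self.b = [0] * n      # b[i-1] = position of i in a, if i is a member
        self.s = 0            # current size

    def insert(self, i):
        self.a[self.s] = i
        self.b[i - 1] = self.s
        self.s += 1

    def delete(self, i):
        k = self.b[i - 1]           # where i is in the list
        last = self.a[self.s - 1]   # move the last item into i's slot
        self.a[k] = last
        self.b[last - 1] = k
        self.s -= 1

    def member(self, i):
        pos = self.b[i - 1]         # may be arbitrary: verify both ways
        return 0 <= pos < self.s and self.a[pos] == i
```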
   A recursive algorithm is thus:

   To find mode and frequency of a[1]...a[i]
   -----------------------------------------
      find mode m and frequency f of a[1]...a[i-1]
      if a[i-f] = a[i] then return mode a[i] and frequency f+1
      return m and f

   The time taken by the second and third lines is O(1), so if the
   total time is f(n), we have

      f(n) = f(n-1) + O(1)  =>  f(n) = O(n)

   This can be turned into a non-recursive algorithm using a loop:

   To find mode and frequency of a[1]...a[n]
   -----------------------------------------
      m := a[1]
      f := 1
      for i := 2 to n do
         if a[i-f] = a[i] then
            m := a[i]
            f := f + 1

5. If the time taken by the Fibonacci algorithm is f(n), we have:

      f(n) > f(n-1) + f(n-2)

   This shows in particular that f(n) > f(n-1), and f(n-1) > f(n-2)
   etc.  Thus

      f(n) > f(n-1) + f(n-2) > 2*f(n-2)

   Now applying this repeatedly gives:

      f(n) > 2*f(n-2) > 2^2 * f(n-4) > 2^3 * f(n-6) > 2^4 * f(n-8) ...

   Repeating this n/2 times gives f(n) > 2^(n/2) * f(0), i.e. f(n)
   grows at least as fast as 2^(n/2).  The time taken thus rises
   exponentially quickly, and the algorithm will only be of use for
   very small values of n.  It is thus a very poor algorithm.

   A recursive algorithm to find both F(n) and F(n+1) is:

   To find F(n) and F(n+1)
   -----------------------
      find F(n-1) and F(n) recursively
      calculate F(n+1) = F(n-1) + F(n)
      return F(n) and F(n+1)

   If the time for this is f(n), then f(n) = f(n-1) + O(1) giving f(n)
   is O(n).  This is clearly far, far better than the previous
   algorithm.  A non-recursive version is:

   To find F(n)
   ------------
      x := F(0)
      y := F(1)
      for i := 2 to n do
         z := x + y
         x := y
         y := z
      return y

6. The letters to be permuted are held in l[1]...l[n] in alphabetical
   order.  The current partial permutation is held in a[1]...a[k] and
   a boolean array b[1]...b[n] is used to indicate which letters
   appear in the partial permutation.  The values b[i] are initially
   all FALSE.
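The two loops above can be sketched in Python (0-based indices,
function names mine; the mode function assumes, as the question does,
that the input list is sorted):

```python
def mode_and_frequency(a):
    """Mode of a sorted list in O(n) time: the mode changes only when
    a[i-f] == a[i], i.e. a[i] extends a run beyond the old frequency f."""
    m, f = a[0], 1
    for i in range(1, len(a)):
        if a[i - f] == a[i]:   # a[i-f..i] are f+1 equal items
            m, f = a[i], f + 1
    return m, f

def fib(n):
    """Iterative Fibonacci with F(0) = 0, F(1) = 1, in O(n) time."""
    if n == 0:
        return 0
    x, y = 0, 1
    for _ in range(2, n + 1):
        x, y = y, x + y
    return y
```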
   The algorithm is:

   To permute remaining k letters
   ------------------------------
      if k=0 then print out the permutation a[1]...a[n] and return
      for i := 1 to n do
         if not b[i] then
            b[i] := TRUE
            a[n-k+1] := l[i]
            permute remaining k-1 letters
            b[i] := FALSE

   (When k letters remain, positions a[1]...a[n-k] are already filled,
   so the next free slot is a[n-k+1].)

7. Note that the last permutation of a set of letters such as ACDF
   consists of those letters in reverse alphabetical order,
   i.e. FDCA.  If we examine a permutation like CAFD in which the A, F
   and D are about to change to give CDAF, this is because we have
   reached the last permutation of the letters FD (they are in reverse
   order), but we have not reached the last permutation of AFD (they
   are not in reverse order).  We can tell this by finding the longest
   "suffix" which is in reverse order.  Thus given a permutation
   a[1]...a[n] we can find the following one by:

   Find permutation following a[1]...a[n]
   --------------------------------------
      find longest "suffix" a[k+1]...a[n] in reverse alphabetical order
      find item a[j] among a[k+1]...a[n] which is next in order after a[k]
      swap a[j] with a[k]   (a[k+1]...a[n] will still be in reverse order)
      reverse the order of a[k+1]...a[n]

   The longest reverse suffix can be found by scanning a backwards
   from the end.  The reversal in the last line can be done by
   swapping a[k+1] with a[n], a[k+2] with a[n-1] etc.  Thus the whole
   process takes an amount of time proportional to the length of the
   suffix.

   The average time taken is thus the average length of the suffix.
   Now in 1/2 of all cases, the suffix will be of length 1, as a[n-1]
   will be less than a[n].  In 1/2 of the remaining cases (1/4 of the
   time), the suffix will be of length 2, as a[n-2] will be less than
   a[n-1].  Thus the average length is approximately:

      1/2 * 1 + 1/4 * 2 + 1/8 * 3 + ... = sum of i/2^i = O(1)

   Thus the algorithm takes an average of O(1) time on a random
   permutation.  This is an improvement over the previous algorithm,
   which takes O(n) time to scan the array b even when k = 1, and so
   takes at least O(n) time per permutation.
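Both permutation algorithms can be sketched in Python (0-based, names
mine; `used` plays the role of the boolean array b, and the successor
function returns False in place of "no following permutation"):

```python
def permutations_of(letters):
    """Recursive generation: a[] holds the partial permutation,
    used[] records which letters it contains."""
    n = len(letters)
    a = [None] * n
    used = [False] * n
    out = []

    def permute(k):                     # k = letters still to place
        if k == 0:
            out.append("".join(a))
            return
        for i in range(n):
            if not used[i]:
                used[i] = True
                a[n - k] = letters[i]   # 0-based: slot n-k is the next free one
                permute(k - 1)
                used[i] = False

    permute(n)
    return out

def next_permutation(a):
    """Advance list a to the following permutation in place; returns
    False if a was already the last one.  Average cost is O(1) since
    the scanned suffix is short on average."""
    n = len(a)
    k = n - 2
    while k >= 0 and a[k] >= a[k + 1]:  # find longest reverse-order suffix
        k -= 1
    if k < 0:
        return False                    # whole array in reverse order
    j = n - 1
    while a[j] <= a[k]:                 # rightmost item greater than a[k]
        j -= 1
    a[k], a[j] = a[j], a[k]             # suffix stays in reverse order
    a[k + 1:] = reversed(a[k + 1:])     # make the suffix ascending again
    return True
```

For example, starting from CAFD the successor function produces CDAF,
as in the worked example above.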