Answers to "questions1" Analysis of Algorithms
---------------------- -----------------------
1. A recursive design for the merge is:
To merge a[1]...a[i] and a[i+1]...a[n] into b[1]...b[n]
-------------------------------------------------------
   if either sequence is exhausted, copy the other into b and return
   if a[1] < a[i+1] then
      b[1] := a[1]
      merge a[2]...a[i], a[i+1]...a[n] into b[2]...b[n]
   else
      b[1] := a[i+1]
      merge a[1]...a[i], a[i+2]...a[n] into b[2]...b[n]
This is followed by copying b back into a. This recursive design easily
yields a sequential one:
To merge a[1]...a[i] and a[i+1]...a[n] into b[1]...b[n]
-------------------------------------------------------
   initialise x := 1, y := i+1, z := 1
   while x <= i and y <= n do
      if a[x] < a[y] then
         b[z] := a[x]
         x := x + 1
      else
         b[z] := a[y]
         y := y + 1
      z := z + 1
   copy any remaining items into b
   copy b back to a
The inner part of the "while" loop takes O(1) time and is repeated at most
once for each x from 1 to i, and once for each y from i+1 to n, making n
times in all. Thus the while loop takes O(n) time, as does the copying of
b into a, so the total is O(n).
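As a concreteness check, the sequential merge can be sketched in Python (a
minimal sketch using 0-based indexing, so the two runs are a[0:i] and a[i:n]):

```python
def merge(a, i):
    # Merge the sorted runs a[0:i] and a[i:] into a, via an auxiliary list b.
    n = len(a)
    b = []
    x, y = 0, i
    while x < i and y < n:
        if a[x] < a[y]:
            b.append(a[x])
            x += 1
        else:
            b.append(a[y])
            y += 1
    b.extend(a[x:i])   # copy any remaining items into b
    b.extend(a[y:n])
    a[:] = b           # copy b back to a
```

Each loop iteration does O(1) work and advances x or y, so the whole merge
is O(n), matching the analysis above.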
If we only have b[1]...b[i], then we can use the algorithm:
copy a[1]...a[i] into b[1]...b[i]
merge b[1]...b[i] and a[i+1]...a[n] into a[1]...a[n]
The merging can be done by the algorithm above. The items in a[i+1]...a[n]
will not be over-written before they are dealt with, since if k items remain
in a[n-k+1]...a[n], then at most n-k items have been dealt with and have
been copied into a[1]...a[n-k]. Clearly the algorithm is still O(n).
We can always do the merging using only b[1]...b[n/2]. If i <= n/2, then
we can use the above method with b[1]...b[i]. If not, then a[i+1]...a[n]
contains <= n/2 items, so we copy a[i+1]...a[n] into b[1]...b[n-i] (at most
n/2 cells), then move a[1]...a[i] into a[n-i+1]...a[n], and merge
b[1]...b[n-i] and a[n-i+1]...a[n] into a[1]...a[n] as before.
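The b[1]...b[i] variant can be sketched in Python (0-based; only the left
run is copied into the buffer, and by the argument above the unread
right-run items are never overwritten):

```python
def merge_half_buffer(a, i):
    # Merge sorted runs a[0:i] and a[i:] using a buffer of only i cells.
    n = len(a)
    b = a[:i]               # copy a[0:i] into b[0:i]
    x, y, z = 0, i, 0
    while x < i and y < n:
        if b[x] < a[y]:
            a[z] = b[x]
            x += 1
        else:
            a[z] = a[y]
            y += 1
        z += 1
    a[z:z + (i - x)] = b[x:i]   # copy any remaining buffered items back
```

Since z = x + (y - i) at every step, a write to a[z] always lands strictly
before the next unread position y, so no item is lost.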
You were not expected to
know it, but the problem can still be solved in O(n) time without using
b at all - if you want to find the algorithm, try it first using
b[1]...b[sqrt(n)].
2. We can set b[i] to TRUE if i is in the set, and FALSE otherwise. Thus
for n = 9 and set {1,2,4,5,7,8,9} we have:
          1     2      3     4     5      6     7     8     9
   b:  TRUE  TRUE  FALSE  TRUE  TRUE  FALSE  TRUE  TRUE  TRUE
To insert, delete or test i for membership, the operations are:
insert i: b[i] := TRUE
delete i: b[i] := FALSE
if i in set then ...: if b[i] then ...
which are clearly O(1). If a[1]...a[n] represents S, b[1]...b[n] represents
T and c[1]...c[n] is to represent the union, then c is calculated by:
To calculate union
------------------
   for i := 1 to n do
      c[i] := a[i] or b[i]
which is O(n). Equality is tested by:
To test equality
----------------
   for i := 1 to n do
      if not (a[i] = b[i]) then return FALSE
   return TRUE
which is again O(n).
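These operations can be sketched in Python, with a list of booleans standing
in for b (index 0 is left unused so the code matches the 1-based text):

```python
class BoolSet:
    # A subset of {1, ..., n} represented as a boolean array b[1..n].
    def __init__(self, n):
        self.n = n
        self.b = [False] * (n + 1)   # b[0] unused

    def insert(self, i):             # O(1)
        self.b[i] = True

    def delete(self, i):             # O(1)
        self.b[i] = False

    def member(self, i):             # O(1)
        return self.b[i]

    def union(self, other):          # O(n)
        u = BoolSet(self.n)
        for i in range(1, self.n + 1):
            u.b[i] = self.b[i] or other.b[i]
        return u

    def equal(self, other):          # O(n)
        return self.b == other.b
```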
3. We use s as the size of the set. Then a[1]...a[s] contain a list of its
elements in any order (with a[s+1]...a[n] arbitrary). The array b contains
the sequence numbers of the elements in a[1]...a[s] as follows.
If i is in the set, and appears in the list as a[k] = i, then b[i] = k.
If i is not in the set, then b[i] is arbitrary.
This representation can be initialised in O(1) time by setting s := 0, the
entire contents of a and b being arbitrary. Insertion, deletion and
membership testing can be carried out in O(1) time by the algorithms:
To insert i
-----------
a[s+1] := i
b[i] := s+1
s := s + 1
To delete i
-----------
k := b[i] (find out where it is in the list)
last := a[s] (find last item)
a[k] := last (reposition last item)
b[last] := k
s := s - 1
To test if i in set
-------------------
   pos := b[i]                 (attempt to find position in list)
   if (pos < 1) or (pos > s)   (check pos was not arbitrary)
      then return FALSE
   if not (a[pos] = i)         (check pos not accidentally in range)
      then return FALSE
   return TRUE
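The representation can be sketched in Python as follows (the arrays are
pre-filled with zeros for safety, since Python has no truly uninitialised
memory; insert and delete assume, as the text does, that i is absent
respectively present):

```python
class SparseSet:
    # Subset of {1, ..., n}: a[1..s] lists the elements in any order, and
    # b[i] gives the position of i in that list.  Index 0 is a dummy slot.
    def __init__(self, n):
        self.s = 0
        self.a = [0] * (n + 1)
        self.b = [0] * (n + 1)

    def insert(self, i):             # O(1); i assumed not present
        self.a[self.s + 1] = i
        self.b[i] = self.s + 1
        self.s += 1

    def delete(self, i):             # O(1); i assumed present
        k = self.b[i]                # find out where it is in the list
        last = self.a[self.s]        # find last item
        self.a[k] = last             # reposition last item
        self.b[last] = k
        self.s -= 1

    def member(self, i):             # O(1)
        pos = self.b[i]              # attempt to find position in list
        return 1 <= pos <= self.s and self.a[pos] == i
```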
If we want U to be the union of sets S and T represented as above, we
can use the algorithm:
To form union U of S and T
--------------------------
   U := S
   for each item i in T do
      if i not in S then insert i in U
The first step "U := S" can be done in time O(n), the inside of the loop
involves membership tests and insertions as above and can be done in
O(1) time, and thus the loop takes O(n) time. Thus the total time taken is
O(n), and in fact is proportional to the number of items in S plus the
number of items in T. Equality can be tested by:
To test equality
----------------
   if sizes unequal then return FALSE
   for each item i in S do
      if i not in T then return FALSE
   return TRUE
which similarly takes O(n) time (actually proportional to the set sizes).
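A standalone Python sketch of union and equality under this style of
representation (here each set is a pair (elems, pos), with a dict pos
standing in for the array b of sequence numbers):

```python
def member(S, i):
    elems, pos = S
    k = pos.get(i, -1)               # attempt to find position in list
    return 0 <= k < len(elems) and elems[k] == i

def insert(S, i):
    elems, pos = S
    pos[i] = len(elems)
    elems.append(i)

def union(S, T):
    # U := S, then add each item of T that is missing from S: O(|S| + |T|).
    U = (list(S[0]), dict(S[1]))
    for i in T[0]:
        if not member(S, i):
            insert(U, i)
    return U

def equal(S, T):
    # Compare sizes first, then one O(1) membership test per element of S.
    if len(S[0]) != len(T[0]):
        return False
    return all(member(T, i) for i in S[0])
```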
4. Suppose the mode of a[1]...a[i-1] is a number appearing (say) 5 times.
Then the mode of a[1]...a[i] will only be different if it contains a number
appearing 6 times. The number must be a[i], and since a is sorted its six
copies must be contiguous, occupying a[i-5], a[i-4], ... , a[i]. Thus the
mode changes only if a[i-5] = a[i]. This is one comparison. If the frequency of a[1]...a[i-1]
was f, we test whether a[i-f] = a[i]. A recursive algorithm is thus:
To find mode and frequency of a[1]...a[i]
-----------------------------------------
find mode m and frequency f of a[1]...a[i-1]
if a[i-f] = a[i] then return mode a[i] and frequency f+1
return m and f
The time taken by the second and third lines is O(1), so if the total time
is f(n), we have
f(n) = f(n-1) + O(1) => f(n) = O(n)
This can be turned into a non-recursive algorithm using a loop:
To find mode and frequency of a[1]...a[n]
-----------------------------------------
   m := a[1]
   f := 1
   for i := 2 to n do
      if a[i-f] = a[i] then
         m := a[i]
         f := f + 1
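The loop translates directly to Python (0-based, and assuming as the
question does that a is sorted):

```python
def mode_and_frequency(a):
    # Mode and frequency of a sorted, non-empty list in O(n):
    # one comparison per element.
    m, f = a[0], 1
    for i in range(1, len(a)):
        if a[i - f] == a[i]:   # a[i] extends some run to length f+1
            m = a[i]
            f += 1
    return m, f
```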
5. If the time taken by the Fibonacci algorithm is f(n), we have:
f(n) > f(n-1) + f(n-2)
This shows in particular that f(n) > f(n-1), and f(n-1) > f(n-2) etc.
Thus
f(n) > f(n-1) + f(n-2) > 2*f(n-2)
Now applying this repeatedly gives:
f(n) > 2*f(n-2) > 2^2 * f(n-4) > 2^3 * f(n-6) > 2^4 * f(n-8) ....
Repeating this n/2 times gives f(n) > 2^(n/2) * f(0), i.e. f(n) grows at
least in proportion to 2^(n/2).
The time taken thus rises exponentially quickly, and the algorithm
will only be of use for very small values of n. It is thus a very
poor algorithm.
A recursive algorithm to find both F(n) and F(n+1) is:
To find F(n) and F(n+1)
-----------------------
find F(n-1) and F(n) recursively
calculate F(n+1) = F(n-1) + F(n)
return F(n) and F(n+1)
If the time for this is f(n), then f(n) = f(n-1) + O(1) giving f(n)
is O(n). This is clearly far, far better than the previous algorithm.
A non-recursive version is:
To find F(n)
------------
   x := F(0)
   y := F(1)
   for i := 2 to n do
      z := x + y
      x := y
      y := z
   return y
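In Python, assuming the usual base values F(0) = 0 and F(1) = 1 (the
question leaves them unspecified):

```python
def fib(n):
    # Iterative Fibonacci in O(n), mirroring the loop above.
    x, y = 0, 1            # F(0), F(1)
    if n == 0:
        return x
    for _ in range(2, n + 1):
        x, y = y, x + y    # slide the pair (F(i-1), F(i)) forward
    return y
```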
6. The letters to be permuted are held in l[1]...l[n] in alphabetical order.
The current partial permutation is held in a[1]...a[k] and a boolean array
b[1]...b[n] is used to indicate which letters appear in the partial
permutation. The values b[i] are initially all FALSE. The algorithm is:
To permute remaining k letters
------------------------------
   if k=0 then print out the permutation a[1]...a[n] and return
   for i := 1 to n do
      if not b[i] then
         b[i] := TRUE
         a[n-k+1] := l[i]
         permute remaining k-1 letters
         b[i] := FALSE
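The backtracking scheme above can be sketched in Python (0-based, and
returning the permutations as a list rather than printing them):

```python
def permutations(letters):
    # All permutations of the given letters, in the order the loop visits
    # them (alphabetical order when the letters are sorted).
    n = len(letters)
    a = [''] * n            # current partial permutation
    b = [False] * n         # which letters are already used
    out = []

    def permute(k):         # k letters remain to be placed
        if k == 0:
            out.append(''.join(a))
            return
        for i in range(n):
            if not b[i]:
                b[i] = True
                a[n - k] = letters[i]
                permute(k - 1)
                b[i] = False

    permute(n)
    return out
```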
7. Note that the last permutation of a set of letters such as ACDF consists
of those letters in reverse alphabetical order, i.e. FDCA.
If we examine a permutation like CAFD in which the A, F and D are about to
change to give CDAF, this is because we have reached the last permutation
of the letters FD (they are in reverse order), but we have not reached the
last permutation of AFD (they are not in reverse order). We can tell this
by finding the longest "suffix" which is in reverse order. Thus given
a permutation a[1]...a[n] we can find the following one by:
Find permutation following a[1]...a[n]
--------------------------------------
find longest "suffix" a[k+1]...a[n] in reverse alphabetical order
find item a[j] among a[k+1]...a[n] which is next in order after a[k]
swap a[j] with a[k] (a[k+1]...a[n] will still be in reverse order)
reverse the order of a[k+1]...a[n]
The longest reverse suffix can be found by scanning a backwards from the
end. The reversal in the last line can be done by swapping a[k+1] with
a[n], a[k+2] with a[n-1] etc. Thus the whole process takes an amount of
time proportional to the length of the suffix. The average time taken is
therefore proportional to the average length of the suffix.
Now in 1/2 of all cases, the suffix will be of length 1 as a[n-1] will
be less than a[n]. In 1/2 of the remaining cases (1/4 of the time), the
suffix will be of length 2 as a[n-2] will be less than a[n-1].
Thus the average length is approximately:
1/2 * 1 + 1/4 * 2 + 1/8 * 3 + ...
= sum of i/2^i = 2
= O(1)
Thus the algorithm takes an average of O(1) time on a random permutation.
This is an improvement over the previous algorithm which takes O(n) time
to scan the array b even when k = 1, so takes at least O(n) time per
permutation.
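The four steps can be sketched in Python (0-based; the function advances the
list in place and returns False once the last permutation has been reached):

```python
def next_permutation(a):
    # Step to the following permutation, as described above.
    n = len(a)
    k = n - 2
    while k >= 0 and a[k] >= a[k + 1]:   # find longest reversed suffix
        k -= 1
    if k < 0:
        return False                     # whole list reversed: last one
    j = n - 1
    while a[j] <= a[k]:                  # item next in order after a[k]
        j -= 1
    a[k], a[j] = a[j], a[k]              # swap; suffix stays reversed
    a[k + 1:] = reversed(a[k + 1:])      # reverse the suffix
    return True
```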