Kruskal’s algorithm


In general, a sequence of instructions that solves all instances of a well-defined problem is called an algorithm. The algorithm used to find a minimum cost spanning tree in a graph is a greedy algorithm: we greedily choose the edges of least weight to build our tree, as long as they do not create a circuit.

Kruskal’s algorithm starts with a forest that consists of n trees, each consisting of a single node and nothing else. In every step of the algorithm, two different trees of this forest are joined into a bigger tree.

Therefore, we keep having fewer and bigger trees in our forest until we end up with a single tree, which is the minimum spanning tree.

In every step we choose the edge with the least cost, which means that we are still following the greedy policy. If the chosen edge connects nodes that already belong to the same tree, the edge is rejected and not examined again, because it would produce a cycle and the result would no longer be a tree.

Either this edge or a later one in order of least cost will connect nodes of different trees, and by inserting it we join two smaller trees into a bigger one.

  1. Set i = 1 and let E_0 = {}.
  2. Select an edge e_i of minimum weight not in E_(i-1) such that T_i = <E_(i-1) ∪ {e_i}> is acyclic, and define E_i = E_(i-1) ∪ {e_i}. If no such edge exists, let T = <E_(i-1)> and stop.
  3. Replace i by i+1. Return to Step 2.

The time required by Kruskal’s algorithm is O(|E| log |V|).
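
A minimal C sketch of this procedure is shown below, assuming the graph is given as an explicit edge list; it uses qsort to order the edges by weight and a simple union-find (disjoint-set) structure for the cycle test. The small graph in main is invented purely for illustration and is not part of the text above.

#include <stdio.h>
#include <stdlib.h>

typedef struct { int u, v, w; } Edge;

/* Union-find: return the representative (root) of x's tree, with path halving. */
static int find_root(int *parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

/* Order edges by increasing weight for qsort. */
static int by_weight(const void *a, const void *b) {
    return ((const Edge *)a)->w - ((const Edge *)b)->w;
}

/* Kruskal: choose n-1 edges of least total weight that connect all n nodes. */
static int kruskal(Edge *edges, int m, int n) {
    int *parent = malloc(n * sizeof *parent);
    int total = 0, chosen = 0;

    for (int i = 0; i < n; i++) parent[i] = i;    /* start with a forest of n one-node trees */
    qsort(edges, m, sizeof(Edge), by_weight);     /* examine edges in order of least cost */

    for (int i = 0; i < m && chosen < n - 1; i++) {
        int ru = find_root(parent, edges[i].u);
        int rv = find_root(parent, edges[i].v);
        if (ru == rv) continue;                   /* same tree: edge would close a cycle, reject it */
        parent[ru] = rv;                          /* different trees: join them into one */
        total += edges[i].w;
        chosen++;
        printf("take edge (%d,%d) with weight %d\n", edges[i].u, edges[i].v, edges[i].w);
    }
    free(parent);
    return total;
}

int main(void) {
    /* A made-up graph with 4 nodes and 5 weighted edges. */
    Edge edges[] = { {0, 1, 4}, {0, 2, 3}, {1, 2, 1}, {1, 3, 2}, {2, 3, 5} };
    printf("total MST weight = %d\n", kruskal(edges, 5, 4));
    return 0;
}

Sorting the edges dominates the running time of this sketch, which is where the O(|E| log |V|) bound comes from.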


Asymptotic Analysis


Consider a function T(n) = O(f(n)). This means that for some constant c, once n exceeds some value, cf(n) will exceed the value of T(n). Does this really mean anything useful? I might correctly say that n^2 + 2n = O(n^25), though we don’t get a lot of information from that; n^25 is simply too big.

When we use Big Oh analysis, we choose the function f(n) to be as small as possible while still satisfying the definition of Big Oh. Thus, it is more meaningful to say that n^2 + 2n = O(n^2) (indeed n^2 + 2n ≤ 2n^2 for every n ≥ 2); this tells us something about the growth pattern of the function n^2 + 2n, namely that the n^2 term will dominate the growth as n increases. The following functions are often encountered in computer science Big Oh analysis:

The growth patterns that follow are listed in order of increasing “size.” That is,

    O(1), O(lg(n)), O(n), O(n lg(n)), O(n^2), O(n^3), … , O(2^n).

  • T(n) = O(1). This is called constant growth. T(n) does not grow at all as a function of n; it is a constant. It is pronounced “Big Oh of one.” For instance, array access has this characteristic: A[i] takes the same time independent of the size of the array A.
  • T(n) = O(lg(n)). This is called logarithmic growth. T(n) grows proportional to the base 2 logarithm of n. Actually, the base doesn’t matter; it’s just traditional to use base 2 in computer science. It is pronounced “Big Oh of log n.” For instance, binary search has this characteristic (a sketch follows this list).
  • T(n) = O(n). This is called linear growth. T(n) grows linearly with n, which is pronounced “Big Oh of n.” For instance, looping over all the elements in a one-dimensional array would be an O(n) operation.
  • T(n) = O(n log n). This is called “n log n” growth. T(n) grows proportional to n times the base 2 logarithm of n, which is pronounced “Big Oh of n log n.” For instance, Merge Sort has this characteristic. In fact, no sorting algorithm that uses comparisons between elements can be faster than n log n.
  • T(n) = O(n^k). This is called polynomial growth. T(n) grows proportional to the k-th power of n. We rarely consider algorithms that run in time O(n^k) where k is greater than 5, because such algorithms are very slow. For instance, selection sort is an O(n^2) algorithm, which is pronounced “Big Oh of n squared.”
  • T(n) = O(2^n). This is called exponential growth. T(n) grows exponentially, which is pronounced “Big Oh of 2 to the n.” Exponential growth is the most-feared growth pattern in computer science; algorithms that grow this way are essentially useless for anything but very small problems.
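
As an illustration of logarithmic growth, here is a minimal binary search sketch in C; the array contents and the helper name binary_search are made up for this example, not taken from the text above.

#include <stdio.h>

/* Return the index of key in the sorted array a[0..n-1], or -1 if it is absent.
 * Each iteration halves the remaining range, so the loop runs O(lg(n)) times. */
static int binary_search(const int *a, int n, int key) {
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;   /* written this way to avoid overflow of lo + hi */
        if (a[mid] == key) return mid;
        if (a[mid] < key) lo = mid + 1;
        else hi = mid - 1;
    }
    return -1;
}

int main(void) {
    int a[] = {2, 3, 5, 7, 11, 13, 17, 19};
    printf("index of 11 = %d\n", binary_search(a, 8, 11));  /* prints 4 */
    printf("index of 4  = %d\n", binary_search(a, 8, 4));   /* prints -1 */
    return 0;
}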

Note that it is not true that if f(n) = O(g(n)) then g(n) = O(f(n)). The “=” sign does not mean equality in the usual algebraic sense—that’s why some people say “f(n) is in Big Oh of g(n)” and we never say “f(n) equals Big Oh of g(n).”

Now suppose that you have a choice of two approaches to writing a program. Both approaches have the same asymptotic performance (for example, both are O(n lg(n))). Why select one over the other? They’re both the same, right? They may not be the same.

There is still the small matter of the constant of proportionality. Suppose algorithms A and B have the same asymptotic performance, TA(n) = TB(n) = O(g(n)). Now suppose that A does ten operations for each data item, but algorithm B only does three. It is reasonable to expect B to be faster than A even though both have the same asymptotic performance. The reason is that asymptotic analysis ignores constants of proportionality.

As a specific example, let’s say that algorithm A is:

 

{
    set up the algorithm, taking 50 time units;
    read in n elements into array A;   /* 3 units per element */

    for (i = 0; i < n; i++) {
        do operation1 on A[i];   /* takes 10 units */
        do operation2 on A[i];   /* takes 5 units */
        do operation3 on A[i];   /* takes 15 units */
    }
}

Let’s now say that algorithm B is:

 

{
    set up the algorithm, taking 200 time units;
    read in n elements into array A;   /* 3 units per element */

    for (i = 0; i < n; i++) {
        do operation1 on A[i];   /* takes 10 units */
        do operation2 on A[i];   /* takes 5 units */
    }
}

Algorithm A sets up faster than B, but does more operations on the data. The execution time of A and B will be

 

    TA(n) = 50 + 3*n + (10 + 5 + 15)*n = 50 + 33*n

and

 

    TB(n) = 200 + 3*n + (10 + 5)*n = 200 + 18*n

 

respectively. The following graph shows the execution time for the two algorithms as a function of n. Algorithm A is the better choice for small values of n. For values of n > 10, algorithm B is the better choice. Remember that both algorithms have time complexity O(n).

 

[Figure: execution time of algorithms A and B as a function of n]
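
To see the crossover point numerically, the short C sketch below tabulates the two cost formulas from the text; the particular n values printed are arbitrary.

#include <stdio.h>

int main(void) {
    /* Cost formulas from the text: TA(n) = 50 + 33n, TB(n) = 200 + 18n. */
    for (int n = 0; n <= 20; n += 5) {
        int ta = 50 + 33 * n;
        int tb = 200 + 18 * n;
        printf("n = %2d  TA = %4d  TB = %4d  %s\n",
               n, ta, tb,
               ta == tb ? "tie" : (ta < tb ? "A is cheaper" : "B is cheaper"));
    }
    return 0;
}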