ALGORITHM ANALYSIS


Important point analysis The important point in the algorithm is how to take out the k items of each transaction itemset in an orderly fashion. Let the item number of the current transaction be n(n>k). Through combinatorics, we know that the formula for taking k items out of n is . Owing to k not being certain, the method should be realized by recursion, which has been reflected in the above function MSPP. If the function is correct, the formula for taking k items out of n by the function should also be C k h . The proof is as follows :

Proof:

Let g(n,k) be the number of k items taken from n by the function MSPP, then:

  1. g(n,1): Owing to jygs=n, k=1, then g (n, 1) =n= C 1 n

  2. g(n,2): Owing to jygs=n, k=2, then g (n, 2)=

  3. g(n,k ): Use induction

    Let i=k ˆ’ 1, g (n, k ˆ’ 1)=C k ˆ’ 1 n is tenable, then prove that when i=k, g (n, k)= C k n is tenable also.

    • The knowledge of combinatorics g (n, k) = g (n ˆ’ 1, k ˆ’ 1)+g (n ˆ’ 1, k) is tenable

    • C k n = C k ˆ’ 1 n ˆ’ 1 + C k n ˆ’ 1 is tenable

    • The supposition is correct; the proposition is proved.

Efficiency Analysis

Let there be m rows in the candidate k-itemsets, and take the k-itemsets of transaction in the transaction data set to match the candidate k-itemsets set.

When using the ordinary pattern match method, we must also use the sequential lookup method owing to having not one times compare with the candidate k-itemsets set. The sequential lookup method's average comparison time is (m+1)/2, and the expense of the comparison between the k-itemsets of transaction in the transaction data set and the candidate k-itemsets set is k(m+1)/2.

When using this pattern match method, the match objects fall into an alphabetic linear array after being disposed because of their own alphabetization. The bisearch method is chosen because of the characteristics of alphabetization and linearity . Its expense is log 2 m.

The expense can be known clearly from the above comparison.




(ed.) Intelligent Agents for Data Mining and Information Retrieval
(ed.) Intelligent Agents for Data Mining and Information Retrieval
ISBN: N/A
EAN: N/A
Year: 2004
Pages: 171

flylib.com © 2008-2017.
If you may any questions please contact us: flylib@qtcs.net