CPS 432/562 Homework #7 Solution Sketches


  1. (5 points) Exercise 20.5.1 (only part b) on p. 1087 from [DBCB]: What is the ratio of the size of CUBE(F) to the size of F if fact table F has ten dimension attributes, each with two different values?

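      Assuming F contains every combination of values, F has 2^10 tuples and CUBE(F) has 3^10 (each attribute takes one of its two values or *), so the ratio is (3/2)^10 ≈ 57.67.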
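
      As a sanity check, the count can be reproduced by brute force. The sketch below is only illustrative: it assumes F holds all 2^10 value combinations, ignores the dependent (measure) attributes, and uses a hypothetical helper named cube_size.

```python
from itertools import product

def cube_size(fact_rows):
    """Count distinct tuples in CUBE(F), ignoring measure attributes.

    Each fact row contributes every variant obtained by replacing any
    subset of its dimension values with '*'.
    """
    cube = set()
    for row in fact_rows:
        # For each dimension, either keep the value or roll it up to '*'.
        for variant in product(*[(v, "*") for v in row]):
            cube.add(variant)
    return len(cube)

n_dims = 10
# Assumption: F contains every one of the 2^10 combinations of values.
fact = list(product((0, 1), repeat=n_dims))
size_f = len(fact)
size_cube = cube_size(fact)
ratio = size_cube / size_f
print(size_f, size_cube, round(ratio, 2))
```

      If F is sparser than assumed, the ratio can be larger, because a small F can still populate many aggregated tuples.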

  2. (5 points) Exercise 20.5.6 on p. 1088 from [DBCB]: Suppose that a cube has 10 dimensions, and each dimension has 5 options for granularity of aggregation, including `no aggregation' and `aggregate fully.' How many different views can we construct by choosing a granularity in each dimension?

      There are 5^10 = 9,765,625 possible views.
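
      The count follows from the product rule: one independent choice among 5 granularities in each of the 10 dimensions. A minimal sketch, with a hypothetical function name count_views, checks the formula by enumeration on a smaller cube and then evaluates it for the exercise:

```python
from itertools import product

def count_views(n_dims, n_levels):
    """Number of views when each dimension independently picks one granularity."""
    return n_levels ** n_dims

# Brute-force check on a small case: 3 dimensions, 5 granularities each.
enumerated = sum(1 for _ in product(range(5), repeat=3))
assert enumerated == count_views(3, 5)

total_views = count_views(10, 5)
print(total_views)
```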

  3. (5 points) Exercise 20.5.7 on p. 1088 from [DBCB]: Show how to add the following time units to the lattice of Fig. 20.20: hours, minutes, seconds, fortnights (two-week periods), decades, and centuries.

  4. (35 points) For the dataset given here, compute all association rules with a confidence of at least 0.7 and a support of at least 0.1. Each row is a transaction of items. For full credit, show your code (and e-mail it to the instructor with a README describing how to run it) and the association rules mined.

      The following 16 association rules meet the specified constraints:

        f -> a
        g -> a
        s -> t
        s -> u
        s -> t, u
        s, t -> u
        s, u -> t
        t -> s
        t -> u
        t -> s, u
        t, u -> s
        v -> u
        z -> k
        p, r -> t
        p, t -> r
        r, t -> p
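
      One way such rules can be mined is a level-wise (Apriori-style) search for frequent itemsets followed by rule generation. The sketch below is not the code used to produce the list above, and the homework dataset is not reproduced here: the transactions in toy, the thresholds applied to them, and the function names are all illustrative assumptions.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    frequent = {}
    current = [frozenset([i]) for i in items]
    while current:
        level = {}
        for candidate in current:
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_support:
                level[candidate] = support
        frequent.update(level)
        # Join step: merge frequent k-itemsets that differ in exactly one item.
        keys = list(level)
        current = list({a | b for a, b in combinations(keys, 2)
                        if len(a | b) == len(a) + 1})
    return frequent

def association_rules(transactions, min_support, min_confidence):
    """Return sorted (lhs, rhs, support, confidence) tuples meeting both thresholds."""
    frequent = frequent_itemsets(transactions, min_support)
    rules = []
    for itemset, support in frequent.items():
        for k in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), k):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent.
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((sorted(lhs), sorted(itemset - lhs),
                                  support, confidence))
    return sorted(rules)

# Hypothetical toy transactions, NOT the homework dataset.
toy = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
rules = association_rules(toy, min_support=0.3, min_confidence=0.7)
for lhs, rhs, support, confidence in rules:
    print(", ".join(lhs), "->", ", ".join(rhs))
```

      For the homework itself, the same functions would be called with min_support=0.1 and min_confidence=0.7 on the transactions read from the dataset file.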

  5. (10 points) Required only for CPS 562 students
    Read the article titled `Mining the Web's Link Structure' by Chakrabarti et al. in the section on data mining in the August 1999 issue of IEEE Computer (if you are seeking access from a computer outside of the UD network, you may need to authenticate yourself here) and write a one-page (11pt Times font, single-spaced, 1" margins) critical summary of it. Identify what you think is the single most striking contribution of this paper. To get you thinking in the right direction, strip away the mathematical details and try to home in on the main idea that this paper articulates which had not been expressed until this point. Also, address the main difference between CLEVER and Google. Why didn't CLEVER take off the way Google did? This is a very interesting article; enjoy it.

      The main difference between Google and CLEVER is ... The greatest contribution of this paper is its view that the web contains two kinds of pages: hubs and authorities. Previously, the design of search engines proceeded with the assumption that all web pages were created equal.
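
      The hub/authority idea can be illustrated with a small iteration in the style of Kleinberg's HITS algorithm, on which CLEVER builds. This is a bare sketch and not CLEVER itself: it omits refinements such as weighting links by surrounding text, and the function name hits and the toy link graph are assumptions made for illustration.

```python
def hits(links, iterations=50):
    """links maps each page to the list of pages it points to.

    Returns (hub, authority) score dictionaries after a fixed number
    of mutually reinforcing update rounds.
    """
    pages = set(links) | {q for targets in links.values() for q in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        auth = {p: 0.0 for p in pages}
        for p, targets in links.items():
            for q in targets:
                auth[q] += hub[p]
        # A page's hub score is the sum of the authority scores it links to.
        hub = {p: sum(auth[q] for q in links.get(p, ())) for p in pages}
        # Normalize so the scores stay bounded across rounds.
        a_norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth

# Hypothetical link graph: three hub-like pages pointing at two candidate authorities.
toy_links = {"h1": ["x", "y"], "h2": ["x", "y"], "h3": ["x"]}
hub, auth = hits(toy_links)
print(max(auth, key=auth.get))
```

      In this toy graph, x collects links from all three hubs and ends up with the highest authority score, while h1 and h2 outrank h3 as hubs because they point to both authorities.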


