CPS 432/562 Project #1Coverage: [DBCB] Chapter 16, §16.6, pp. 847-858Assigned: February 8 Due: February 22, 4:30p, in class Assume that you are given n relations: R1, R2, ..., Rn. Each of the Ri's has only two attributes. The second attribute of a relation is the same as the first attribute of the next relation. For instance, R1 could be R1(X,Y), then R2 would be R2(Y,Z), R3 is R3(Z,L) and so on. Thus, the total number of distinct attributes in all the relations is n+1. Write a program (in any programming language) which determines an optimal join strategy for computing R1 R2
...
Rn.
Assume that all relations are of size 1000.
The program will accept from standard input
the value count information, e.g., V(R,a),
for only the common
attributes.
The number of Vi's needed is 2(n-2) + 2.
For example, one invocation of your program could be:
$ order_joins 200 100 500 20 200 100 500 20 50 1000The first test case attempts to do a join of three relations: R(X,Y), S(Y,Z), T(Z,L), where T(R) = T(S) = T(Y) = 1000, and V(R,Y) = 200, V(S,Y) = 100, V(S,Z) = 500, and V(T,Z) = 20. Restrict your answer to only left-deep joins. Your program should output two groupings:
Dynamic Programming Ordering: ((R2 join R3) join R1) Greedy Ordering: ((R2 join R3) join R1) Dynamic Programming Ordering: (((R3 join R4) join R2) join R1) Greedy Ordering: (((R3 join R4) join R2) join R1)In this case, the results are the same, but you cannot generally assume that they will be identical. We will not provide any test data nor tell you how many n's will appear. Feel free to share test cases. You may have some surprises when you go from three relations to four relations. Requirements
HintUse of an appropriate programming language can simply the implementation of these algorithms. I advise you to use a language with good capabilities for list/matrix processing (e.g., MATLAB, Lisp, or Python). You also may want to use an interpreted language, rather than one which is compiled, to help deal with debugging, testing, and incremental development. |