CPS 430/542 Project #1

Coverage: [FCDB] §§3.4-3.5 (pp. 82-102)
Assigned: October 3
Due: October 31 (extended to November 7), 4:30pm, in class

Problem

(100 points) Write a program (in any language) which reasons with FD's. Your program must:

  • compute the closure of a set of FD's,
  • determine if one set of FD's entails the other and vice versa, and
  • compute all minimal bases from a full set of FD's.

Input

The first line of input will define the relation schema. It will be a string of characters (each corresponding to an attribute), with no leading or trailing whitespace. Each line after the first will contain one FD with a single space before and after the arrow (->), with no leading or trailing whitespace, and no extraneous whitespace anywhere else in the line. For instance, the following stream corresponds to the input from [FCDB] exercise 3.5.1(a), p. 100:

ABCD
AB -> C
C -> D
D -> A
Do not assume that i) the attributes in the relation schema will be sorted, ii) the set of FD's will be sorted by first attribute on each lhs, and iii) the attributes on each side of the arrow in an FD will be sorted. In short, make no assumptions about the input beyond those given here.

Output

When computing the closure of a set of FD's, simply write only the completely nontrivial FD's which follow from the given set, one per line, with no leading or trailing whitespace, using the input format. In addition, write each FD with its lhs and rhs sorted in ascending order, and write the closed set of FD's sorted in ascending order based on the first attribute on the lhs of each FD (progressively moving right to break ties). By default, only write FD's whose rhs is maximally-expanded (we will use a command-line option for printing FD's with only a single attribute on the rhs). Do not print any messages, labels, or other data.

When determining if one set of FD's entails the other and vice versa, write only one of the following lines, if necessary (with no leading or trailing whitespace and no extraneous output):

F1 |= F2.
F2 |= F1.
F1 and F2 are equivalent.
You may assume that F1 is the set of FD's given first in the input stream.

When computing all minimal bases from a full set of FD's, write each minimal base using the same format used for writing a closed set of FD's and delimit each with a line containing only -- (with no leading or trailing whitespace).

Your output must not contain any blank lines.

Command-line options

Your program must support the following command-line options (with associated semantics given below):

  • -s: write FD's with only a single attribute on the rhs (e.g., print the A -> BCD FD as three individual FD's: A -> B, A -> C, and A -> D).
  • -2: indicates that the input contains two sets of FD's (each given using the format above, with no blanks lines between the two) and the output should be one of the three lines given above.
  • -m: indicates that the input is a full set of FD's for the given relation schema and that all minimal bases should be written.
If no command-line options are given, then simply print the closed set of FD's. The -s and -2 options are mutually exclusive; and the -2 and -m options are mutually exclusive. You may assume the input supplied for each test run will be consistent with the command-line options used in that run.

Sample runs

(in the following, $ is the prompt for input, fdinf is the executable, ^D (<crtl-D>) is the EOF marker on UNIX systems; note: ^Z (<crtl-Z>) is the EOF marker on Windows systems)
$ fdinf
ABCD
AB -> C
C -> D
D -> A
^D
AB -> CD
ABC -> D
ABD -> C
AC -> D
BC -> AD
BCD -> A
BD -> AC
C -> AD
CD -> A
D -> A
$ fdinf -s
ABCD
AB -> C
C -> D
D -> A
^D
AB -> C
AB -> D
ABC -> D
ABD -> C
AC -> D
BC -> A
BC -> D
BCD -> A
BD -> A
BD -> C
C -> A
C -> D
CD -> A
D -> A
$ fdinf -2
ABC
A -> B
B -> C
--
A -> B
A -> C
^D
F1 |= F2.
$ fdinf -2
ABC
A -> B
B -> C
--
A -> B
AB -> C
^D
F1 |= F2.
$ fdinf -2
ABC
A -> B
A -> C
--
A -> B
AB -> C
^D
F1 and F2 are equivalent.
$ fdinf -2
ABC
A -> B
--
B -> C
^D
$ fdinf -m
ABC
B -> AC
A -> B
C -> B
A -> C
C -> A
AB -> C
AC -> B
BC -> A
^D
A -> BC
B -> A
C -> A
--
A -> B
B -> AC
C -> B
--
A -> C
B -> C
C -> AB
--
A -> B
B -> C
C -> A
--
A -> C
B -> A
C -> B
$ fdinf -m -s
ABC
B -> AC
A -> B
C -> B
A -> C
C -> A
^D
A -> B
A -> C
B -> A
C -> A
--
A -> B
B -> A
B -> C
C -> B
--
A -> C
B -> C
C -> A
C -> B
--
A -> B
B -> C
C -> A
--
A -> C
B -> A
C -> B
$

Additional requirements

  • Your program must read from standard input and write to standard output. Do not open any files in your program.
  • Submit your program as a single source code file. You are welcome (and encouraged) to develop your program in multiple source files, but for purposes of simplicity, submit only one source file.
  • Include comments in your code to clarify your approach
  • Follow the CPS 444 programming style guide (which is tailored to C, but generally applicable to any language).
  • In the comment header of your program, indicate which compiler or interpreter must be used to compile/interpret your program. Give the complete name and version number as well as web link if available.
  • E-mail only your source code file as p1.(use an extension appropriate for the programming language used) to the instructor by 4:30pm.
  • Turn in a pretty-printed hard copy of your source code (preferably using a2ps) in-class at 4:30pm.
  • The hard-copy of your source code which you submit must have be generated from the source code file which you e-mail to the instructor.
Any student who submits without following these requirements will be assessed a 10% penalty. Ninety percent of your score will come from correctness and 10% from following the programming style guide. Applicable submission/late penalties will then be assessed. No credit will be given to i) a program which does not compile without warnings/errors, ii) a program which produces a run-time error (e.g., core dump or segmentation fault), or iii) a program which fails to terminate normally.

Test data

The exercises for [FCDB] §3.5 (pp. 100-102) provide copious test cases for this program.

Hint

Use of an appropriate programming language will significantly simply the implementation of this program. PROLOG is an appropriate language for this project since it comes with a built-in reasoning and inference engine. SWI-PROLOG is installed on the CPS computer lab network and is also available for free download. Using PROLOG will not only help you learn a new style of programming (i.e., declarative/logic programming), but will also give you a head start to learning Datalog, a query language, with similar syntax and semantics, we will study in the next module of the course. If you decide to use PROLOG, start from template.pl. We have also rendered the test cases in PRLOG (testcases.pl) for your convenience. Please note PROLOG is not required for this project (it is sufficient, but not necessary). You are strongly advised to use whatever programming language you are most comfortable with.



Return Home