CPS 430/542 Lecture notes:
Introduction to the Relational Database Model
Coverage: [FCDB] §§3.1-3.3 (pp. 61-82)
Introduction to the relational data model
- proposed by Edgar Codd in 1969
- centered around its primary element: the relation
- everything is a relation in this model
Why the relational model?
- clean separation between physical and logic layers;
ANSI Sparc architecture
- lead to powerful and declarative query languages such as SQL
- runaway success of simple theoretical concepts put into practice
- why not design using the relational model from the start?
- E/R good basis for design
- E/R not good basis for DML
- since the relational model has only
one main element (the relation) in contrast
to the E/R model (which has entity sets,
relationships, and so on), it is less
flexible for design
- these inflexibilities are best handled
after a design has been constructed
- relational model has its own design
theory (normalization)
- there are many normal forms
- depend on design goals
Overview
Overview of what we will study
- E/R (or ODL) → R
- R → better R (process called normalization)
- functional dependencies (generalize concept of key)
- design theory, normal forms
Essential elements of the relational model
From E/R diagrams to relations
(a relation can represent more than just entity sets)
- convert each entity set into a relation;
make all attributes of the entity set attributes of the relation
- convert each relationship into a relation
- make the key attributes of the participating
entity sets as well as the attributes of the
relationship attributes of the relation
(renaming attributes as appropriate)
- what to do with multiple roles?
Some impurities
- many-one relationships
(are source of consolidation)
- weak entity sets
- isa relationships
Combining relations
- many-one relationships
- relations for the many entity set [E] and
the many-one relationship <R> each should
have a key for [E] in their relation
- non-key attributes of [E], key attributes of [F],
and attributes of <R>, are uniquely determined
by the key of [E]
- combine
- all attributes of [E]
- the key attributes of [F]
- the attributes of <R>
(2) and (3) might be NULL if the one
in [F] does not exist!
- more efficient to answer queries involving
attributes from one relation than attributes
distributed across more than one relation
- example:
Players(ssn, name, no)
---
PlaysFor(ssn, tname)
Teams(name, city)
----
Playsfor(ssn, name, no, tname)
---
Teams(name, city)
----
why not add city to combined relation as well?
- risky
- source of redundancy
Converting weak entity sets
- incorporate all key attributes of
weak entity set [[E]] in relation for E
- no relation necessary for any supporting relationship
unless it has its own attributes
Eliminating relations
- what is a proper subset? proper superset?
- subset is necessary, but not sufficient
- IRS example: People(name, ssn) and
TaxPayers(name, ssn, amount)
- movies example: Stars(name, address) and
Studios(name, address)
- moral of the story: cannot consolidate blindly
ISA relationships
- why no relation necessary for isa relationship?
- 3 approaches
- E/R style: one relation per entity set
- OO style: one relation per sub-tree (including the root)
- closed form: 2n,
where n is number of nodes, excluding the root
- there are 4 sub-trees in the movies example
- can combine the first two,
though we loose some info, i.e., which movies are cartoons
- can combine the last two, though, again,
though we loose some info, i.e., which movies are cartoons
- NULLs approach: use only one big relation
- advantages and disadvantages of each approach?
- consider queries
- NULLs approach minimizes number of relations required to
produce an answer
- tradeoffs between E/R and OO
- number of relations (small number preferred)
- storage (prefer to minimize)
- 1 tuple per entity
- OO
- nulls, though the tuples might be long
- more than 1 tuple per entity:
E/R, but only key duplicated
Formula possible?
Can we develop a formula for the number
of relations necessary for an E/R diagram having
e entity sets and r relations?
- may need to partition e into weak and non-weak entity sets
- may need to partition r into supporting, isa, and
non-supporting-or-non-isa relationships
References
| [FCDB] |
J.D. Ullman and J. Widom. A First Course in Database Systems.
Prentice Hall, Upper Saddle River, NJ, Second edition, 2002.
|
|