CPS 430/542 Lecture notes: Joins
Coverage: [FCDB] §§5.2 (pp. 191-214),
6.2 (pp. 254-264), and 10.2 (pp. 471-480)
Legend: RA = relational algebra, DL = Datalog
assumes set semantics
Overview
- cartesian product (aka cross-join): ×
- theta-join: |><|C
- natural join: |><|
- outerjoin:
- full outerjoin: |>o<|
- left outerjoin: |>o<|L
- right outerjoin: |>o<|R
- join tuples
- (left and/or right) dangling tuples
- concept of NULL (⊥)
Cartesian product
- use it whenever you want to pair items
- RA: Actors × Actresses
(n*m tuples)
- SQL:
SELECT *
FROM Actors, Actresses;
- DL: Pairs(x,y,z,a,b,c) ← Actor(x,y,z) AND Actress(a,b,c)
sometimes requires renaming
- RA: Students × Students
(n*n tuples)
- RA: πi,id
((ρS(i,n) Students) × Students)
use AS keyword in SQL
- SQL:
SELECT S1.name, S2.name
FROM Students AS S1, Students AS S2;
- DL: Students2(s1,n1,s2,n2) ←
Students(s1,n1) AND Students(s2,n2)
how can we prune out duplicates?
Theta-join
Find names of all students
with gpa > 3.0
RA: πname (Students
|><|Students.id =
Grades.id AND
gpa > 3.0 Grades)
SQL:
SELECT name
FROM Students, Grades
WHERE Students.id = Grades.id AND gpa > 3.0;
DL: HighGPA(n) ← Students(s,n) AND Grades(s,g) AND g > 3.0
Natural join
- R |><| S = πL
(σC(R × S))
- Cartesian product, followed by a select on tuples which
agree on the common attributes, followed projecting out
one copy of each common attribute
find names of all students with gpa > 3.0
- RA: πname (σgpa > 3.0
(Students |><| GPA))
- SQL: same as above
- DL: same as above
References
| [FCDB] |
J.D. Ullman and J. Widom. A First Course in Database Systems.
Prentice Hall, Upper Saddle River, NJ, Second edition, 2002.
|
|