CPS 430/542 Lecture notes: Extended Operators

Coverage: [FCDB] §§5.3-5.4 (pp. 214-230), 6.1 (pp. 251-252), and 6.4 (pp. 277-285)

Relational operations on bags

∪, π, σ, ×, |><| no surprises

∩ min(m,n)

- max(0, n-m) think of tuples canceling each other out

Set vs. bag semantics in SQL

• SELECT-FROM-WHERE construct uses bag semantics by default
• insert DISTINCT between SELECT and attribute to force set semantics, e.g.,
```SELECT DISTINCT name
FROM Students
WHERE GPA > 3.0
```
• UNION, INTERSECT, and MINUS use set semantics by default
• bags converted to sets and then set operator is applied
• add ALL after (UNION | INTERSECT | MINUS) to force bag semantics, e.g.,
```(SELECT name FROM Students) UNION ALL (SELECT name FROM Faculty)
```

Other things to know

• which SQL operator has higher precedence: AND or OR?
• how can we alter precedence these operators in a query?
• INTERSECT and MINUS not implemented in mySQL; need to use a subquery (use postgreSQL as an alternative where AS keyword is required)
• see Ullman's notes on differences between mySQL and standard SQL

Relational algebra toolkit

(listed in decreasing order of abstraction/expressivity)
×, |><|C, |><|

π, σ, ρ, :=

∪, ∩, -,

Extended relational algebra (aggregation and grouping)

• δ is the duplicate elimination operator (= SELECT DISTINCT)
• γ is the grouping operator
• example (courtesy [FCDB] example 5.23, p. 225): consider relation schema StarsIn(title, year, starName) and query for each star who has appeared in a least three movies, the earliest year in which they appeared:

γstarName, MIN(year) → minYear, COUNT(title) → ctTitle (StarsIn)
• δ is a special case of γ. why?
• can view γ as an extension of the projection operator π on sets. why?
• γ sometime referred to as generalized projection
• extended projection
• τ is sort operator (results in a list, rather than a set)
• equijoin (!= theta-join) example

Final relational algebra toolkit

(listed in decreasing order of abstraction/expressivity)
τ

δ, γ

×, |><|C, |><|

π, σ, ρ, :=

∪, ∩, -,

Quotient operator

```1 2
3 4

×

a b
c d

=

1 2 a b
1 2 c d
3 4 a b
3 4 c d

div

1 2
3 4

=

a b
c d

--

1 2 a b
1 2 c d
3 4 a b

div

1 2
3 4

=

(a b) remainder (1 2 c d)
```
• when is quotient useful?
• often necessary when a query uses the term every
• consider finding all students who have fulfilled all of the graduation requirements
• consider Suppliers(sid), Parts(pid), and Catalog(sid, pid); find the sids of all supplier who supply every product, i.e., compute Catalog(sid,pid) / Parts(pid)
• implement the quotient operator using primitive relational algebra operators (see [FCDB] exercise 5.2.12 on p. 213)

SQL (and RA) aggregation operators

```SUM
AVG
MIN
MAX
COUNT
```
```COUNT(*)
COUNT(DISTINCT *)
```

Grouping in SQL

```GROUP BY
HAVING (a SELECT over the groups)
```
usually group for subsequent aggregation (through grouping is not always used in aggregations)
• grouping with aggregation (duplicates pruned)
• grouping without aggregation (duplicates retained

Ordering the output

```SELECT *
FROM Students
ORDER BY id;
```
default ordering is ASCending

use the keyword DESC for descending order

```SELECT *
FROM Students
ORDER BY gpa DESC;

SELECT name, SUM(length)
FROM Movie, MovieExec
WHERE producerC# = cert#
GROUP BY name
HAVING MIN(year) < 1980
ORDERBY name
ORDERBY SUM(length), DESC name;
```

SQL order of execution

(only first and last are required)
(HAVING illegal without a GROUP BY)
```FROM
WHERE
GROUP BY
HAVING
ORDER BY
SELECT
```

Writing queries

```SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY
```

[FCDB] example 5.23 in SQL

```SELECT starName, MIN(year)
FROM StarsIn
GROUP BY starName
HAVING COUNT(starName) >= 3;
```

References

 [FCDB] J.D. Ullman and J. Widom. A First Course in Database Systems. Prentice Hall, Upper Saddle River, NJ, Second edition, 2002.