CPS 430/542 Lecture notes: Introduction to Query languages



Coverage: [FCDB] §§5.1-5.2 (pp. 189-214), 6.1 (pp. 239-254), and 10.1-10.2 (pp. 463-480)


We will study multiple query languages

  • Relational algebra ([FCDB] Ch5)
  • SQL (Structured Query Language; sequel; [FCDB] Ch6)
  • Datalog ([FCDB] Ch10)
  • OQL ([FCDB] Ch9)
(why study multiple langauges?)


Comparision of QLs

    QL paradigm implemented efficient recursion
    RA procedural no - no
    SQL declarative yes yes maybe
    DL declarative no - yes
    PROLOG declarative yes no (DFS) yes
    N/W procedural yes yes yes


Closure

  • closure property - a function is said to be closed if the result of the function can be feed back into the function, e.g.,
    • f(x)=x+1
    • int f(int) {...}
    • is compilation closed?
  • closure property in SQL: results of any query can be used in another query, e.g.,
    • subqueries
    • queries (SFWs) in the FROM clause


Relational algebra

  • originally 5 operators: ∪, -, ×, π, σ (why is ∩ missing?)
  • algebbraic expressions are composed with parentheses
  • projection (π)
  • selection (σ)
  • combining operations to form queries; 2 equivalent expressions
  • πlistcondition (relation))
  • linear notation for expressions (answer := ...)
  • use renaming (ρ) to avoid conflicts, e.g., ρS(a,b,c) (R)
  • not Turing complete


SQL

  • success of SQL due to its declarative nature and efficiency
  • more than just a query language (DDL + DML)
  • DDL (define relation schemas)
  • DML (query relation instances)
  • ANSI SQL-92 or SQL2
  • SQL-99 (previously SQL3) adds object-relation stuff and recursion
  • case insensitive, case-sensitive within '...'
  • there are some queries you cannot write in SQL
  • syntax of a query
    SELECT L(ist)
    FROM R(elation)
    WHERE C(ondition)
    
  • always start an SQL query with the FROM part
    1. FROM
    2. WHERE
    3. SELECT
  • SQL SELECT = RA project (π)
  • SQL WHERE = RA select (σ)
  • strings
    • 'literal'
    • (lexical) comparison < >
    • || is concatenation operator
    • regular expressions
      • LIKE
      • NOT LIKE
      • _ = . (any 1)
      • % = * (0 or more)
      • can specify your own escape character with ESCAPE
  • WHERE condition, same as C++, except <> = != (not equal)
  • renaming, e.g., title AS name, length AS duration
  • time and date datatypes
  • NULL are an impurity (or in other words a hack)
    • NULL has multiple semantics
    • NULL <arithmetic op> x = NULL
    • NULL <comparison op> x = UNKNOWN (use IS and IS NOT comparison operators instead)
    • NULL is not a constant and, thus, cannot be used explicitly as operand
  • UNKNOWN = 1/2
    • x AND y = min(x, y)
    • x OR y = max(x, y)
    • NOT x = 1-v
  • ORDERBY <list of attributes>
  • a1, DESC a2 ...


Datalog

  • resembles PROLOG (series of if-then rules); has slightly different semantics
  • has roots in relational/tuple calculus
  • relation = predicate (function which returns T or F)
  • atom = relation schema
  • extensional predicate (EDB)
  • intensional predicate (IDB)
  • more expressive than SQL2 ... SQL-99
  • relational atoms vs. arithmetic atoms
  • queries must be safe so that result is a finite relation, and so that rules with arithmetic subgoals or with negated subgoals make inuitive sense
  • safe query: every variable which appears anywhere (head, negated relation subgoal, or arthmetic subgoal) in the rule must also appear in some nonnegated, relation subgoal, e.g., LongMovie(t,y) ← Movie(t,y,l,_) AND l >= 100


Review of Properties of Query Languages

  • Relational algebra: procedural
  • SQL
    • declarative
    • implemented
  • Datalog
    • declarative
    • most expressive


References

    [FCDB] J.D. Ullman and J. Widom. A First Course in Database Systems. Prentice Hall, Upper Saddle River, NJ, Second edition, 2002.

Return Home