CPS 343/543 Lecture notes: Representation strategies

Coverage: [EOPL2] §2.1 (pp. 39-42) & §§2.3-2.4 (pp. 55-68)

Data abstraction

  • involves factoring a data structure into
    • interface (which is implementation-neutral),
    • implementation, and
    • application (client code which is also implementation-neutral)
  • if a data type is developed in this way, it is called an abstract data type or ADT
  • opaque vs. transparent data types
  • Scheme does not provide direct support for creating opaque data types (i.e., it is not built into the language)
  • how about C, C++, or Java?
  • ML has one of the most elegant type systems of any programming language
  • what are the semantics of the static keyword in C?
    • at file scope, it gives the variable or function it precedes internal linkage
    • inside a function, the variable is bound statically to a location and that binding does not change during run-time
    • its scope is still local, but its lifetime is the entire run of the program, so its value persists between calls
    • simple usage: a function which monitors the number of times it is called
    • the static keyword in C is the analog of what in Scheme?
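
One possible answer to the last question, sketched in Scheme: a variable bound by a let that encloses a lambda behaves much like a static local in C. The name count-calls is an illustration only and does not come from [EOPL2].

```scheme
;; A procedure that reports how many times it has been called.
;; count is visible only inside the lambda (like a static local
;; in C), yet it persists from one call to the next.
(define count-calls
  (let ((count 0))                ; created once, when count-calls is defined
    (lambda ()
      (set! count (+ count 1))    ; side effect on the hidden state
      count)))
```

The first call (count-calls) returns 1 and the second returns 2; no other code can read or reset count.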

Example: non-negative integers

  • representation-independent
  • [.] means `representation of .'
  • interface, implementation, and client
  • what about possible choices for representations?
    • unary representation: represent n by a list of n #t's
    • others? e.g., two's complement
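
A minimal sketch of the unary representation, assuming the interface consists of zero, iszero?, succ, and pred (the interface names are an assumption here, as is the client procedure plus):

```scheme
;; Implementation: [n] is a list of n #t's.
(define zero '())
(define iszero? null?)
(define succ (lambda (n) (cons #t n)))
(define pred cdr)

;; Client: uses only the four interface procedures, so it never
;; depends on the fact that numbers are lists.
(define plus
  (lambda (x y)
    (if (iszero? x)
        y
        (succ (plus (pred x) y)))))
```

With this representation, (plus (succ (succ zero)) (succ zero)) evaluates to (#t #t #t), i.e., [3].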

Advantages to an ADT

  • client code is completely independent from the representation
  • and, therefore, we are free to substitute any other implementation of the interface without requiring modifications to the client
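
To illustrate the substitution, here is a sketch of a second implementation that represents [n] by the built-in Scheme number n. The interface names zero, iszero?, succ, and pred and the client plus are assumptions for illustration; the point is that the text of plus is unchanged from what a unary-representation client would use.

```scheme
;; Implementation: [n] is the Scheme number n.
(define zero 0)
(define iszero? zero?)
(define succ (lambda (n) (+ n 1)))
(define pred (lambda (n) (- n 1)))

;; Client: written only in terms of the interface.
(define plus
  (lambda (x y)
    (if (iszero? x)
        y
        (succ (plus (pred x) y)))))
```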

Choices of representation

  • data structure representation (e.g., list)
  • abstract syntax representation
  • procedural representation (fundamentally different from prior two)
  • others?

Environments

  • an environment is a mapping which associates symbols with values in a programming language implementation
  • a symbol table is an example of an environment
  • a symbol table is used in a compiler to associate variable names with lexical address information
  • an environment is a mapping (a set of pairs)
    • domain: a finite set of Scheme symbols
    • range: the set of all Scheme values
  • interface and sample client code on p. 56
  • constructors create: empty-env and extend-env
  • observers extract: apply-env
  • how might we represent an environment?
    • as a list of lists
    • as abstract syntax
    • procedurally

Procedural representation

  • since procedures are first-class objects in Scheme, `it is often advantageous to represent data as procedures, particularly when the data type has multiple constructors, but only a single observer' [EOPL2] p. 56
  • note: this may seem like a non-intuitive use of procedures; we usually do not think of data as a program
  • how can we represent an environment (which we think of as a data structure) as a procedure?
  • how about using a Scheme procedure which takes a symbol and returns its associated value?
  • with such a representation, we can define the interface procedurally
  • implementation in Fig. 2.3 (p. 57)
  • do you remember what the procedures returned by empty-env and extend-env are called? think back to the first day of class: closures
  • this example brings us face to face with the fact that a program is nothing more than data, and therefore a data structure can be represented as a program
  • very often the set of values in a data type can be represented as a set of procedures
  • how can we extract the interface for an ADT and the (procedural representation) implementation from the client? see pp. 58
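
A minimal sketch of a procedural representation, in the spirit of the implementation referenced above but not copied from it. The helper index-of is hypothetical, and error is assumed to be available (it is in R7RS and most Scheme implementations, though not in R5RS).

```scheme
;; Hypothetical helper: position of sym in syms, or #f if absent.
(define index-of
  (lambda (sym syms)
    (let loop ((syms syms) (pos 0))
      (cond ((null? syms) #f)
            ((eq? sym (car syms)) pos)
            (else (loop (cdr syms) (+ pos 1)))))))

;; The empty environment is a procedure that fails on every symbol.
(define empty-env
  (lambda ()
    (lambda (sym)
      (error "apply-env: no binding for" sym))))

;; An extended environment is a closure over syms, vals, and env.
(define extend-env
  (lambda (syms vals env)
    (lambda (sym)
      (let ((pos (index-of sym syms)))
        (if pos
            (list-ref vals pos)
            (apply-env env sym))))))

;; The single observer just applies the environment to the symbol.
(define apply-env
  (lambda (env sym)
    (env sym)))
```

For example, after (define e (extend-env '(d x) '(6 7) (extend-env '(y) '(8) (empty-env)))), the call (apply-env e 'x) returns 7 and (apply-env e 'y) returns 8.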

List of lists environment representation

  • grammar and interpretation on p. 61
  • sample client code and corresponding list representation on p. 62
  • implementation on p. 62
  • the lists that make up the list of lists are called ribs
    • the car of each rib is a list of symbols
    • the cadr of each rib is the corresponding list of values
  • this is called the ribcage representation:

    (regenerated from [EOPL2] Fig. 2.4 on p. 64)

  • how can we make this representation more efficient?
    • use a vector instead of a list
      • lookup in a list (list-ref ...): linear time (sequential access)
      • lookup in a vector (vector-ref ...): constant time (direct access)
    • represent each rib as a single pair (e.g., (((d x) 6 7) ((y) 8)))
  • revised implementation on p. 63
  • if lookup is based on lexical, rather than symbolic, information
    • we can eliminate the symbol lists, and
    • represent environments as a list of vectors (e.g., (#2(6 7) #1(8)))
    • re-revised implementation on p. 63
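
A minimal sketch of the first (unoptimized) ribcage representation, in which each rib is a two-element list of symbols and values. It is not the textbook's code; the helper index-of is hypothetical, and error is assumed to be available.

```scheme
;; Hypothetical helper: position of sym in syms, or #f if absent.
(define index-of
  (lambda (sym syms)
    (let loop ((syms syms) (pos 0))
      (cond ((null? syms) #f)
            ((eq? sym (car syms)) pos)
            (else (loop (cdr syms) (+ pos 1)))))))

(define empty-env
  (lambda () '()))

;; Each rib is (syms vals); a new rib goes on the front.
(define extend-env
  (lambda (syms vals env)
    (cons (list syms vals) env)))

;; Search the ribs front to back; within a rib, the car holds the
;; symbols and the cadr holds the corresponding values.
(define apply-env
  (lambda (env sym)
    (if (null? env)
        (error "apply-env: no binding for" sym)
        (let* ((rib (car env))
               (pos (index-of sym (car rib))))
          (if pos
              (list-ref (cadr rib) pos)
              (apply-env (cdr env) sym))))))
```

Here (extend-env '(d x) '(6 7) (extend-env '(y) '(8) (empty-env))) builds the list (((d x) (6 7)) ((y) (8))).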

Abstract syntax representation

  1. take expressions of the form
    (extend-env syms_n vals_n
      ...
        (extend-env syms_1 vals_1
          (empty-env)) ...)
    and define them using EBNF (a concrete syntax)
    <env-rep> ::= ???
    <env-rep> ::= ???
  2. represent that concrete syntax as an abstract syntax
  3. define the data type
  4. define the environment implementation
  5. see §2.3.3 (pp. 59-61) when done
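
One possible outcome of steps 3 and 4, sketched with tagged lists standing in for a proper data type definition. The variant names empty-env-record and extended-env-record, the helper index-of, and the use of error are all assumptions made for this illustration.

```scheme
;; Hypothetical helper: position of sym in syms, or #f if absent.
(define index-of
  (lambda (sym syms)
    (let loop ((syms syms) (pos 0))
      (cond ((null? syms) #f)
            ((eq? sym (car syms)) pos)
            (else (loop (cdr syms) (+ pos 1)))))))

;; Step 3: one variant per production, each tagged with its name.
(define empty-env-record
  (lambda () (list 'empty-env-record)))
(define extended-env-record
  (lambda (syms vals env) (list 'extended-env-record syms vals env)))

;; Step 4: the constructors build variants; the observer dispatches
;; on the tag.
(define empty-env
  (lambda () (empty-env-record)))
(define extend-env
  (lambda (syms vals env) (extended-env-record syms vals env)))

(define apply-env
  (lambda (env sym)
    (case (car env)
      ((empty-env-record)
       (error "apply-env: no binding for" sym))
      ((extended-env-record)
       (let ((pos (index-of sym (cadr env))))
         (if pos
             (list-ref (caddr env) pos)
             (apply-env (cadddr env) sym))))
      (else (error "apply-env: not an environment" env)))))
```

The client code is the same as for the other representations; only the structure behind empty-env, extend-env, and apply-env differs.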

Queue abstraction

  • brings us closer to object-oriented programming (OOP)
  • implementation in a purely functional setting would require passing and returning queues from procedure to procedure
  • an alternative is to use a queue shared among all of the procedures
  • this sounds imperative and it is
  • we still want the representation of the queue to be hidden
  • we can create an interface with procedures that will return each of the operations that will act on the shared hidden state of the queue
  • each of those returned procedures is a closure (recall analogy to OOP)
  • interface on p. 66
  • single queue is simulated by two lists
  • imperative features in the implementation:
    • (set! ...) (Scheme's assignment statement; works by side effect)
    • (begin ...) creates statement blocks
    • create-queue is the queue constructor; returns a vector of queue operations
  • implementation in Fig. 2.5 (p. 67)
  • this queue is an object and the operations on it are called methods
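
A sketch of such a queue, written for these notes rather than copied from the textbook figure. The local names q-in and q-out, the order of the operations in the returned vector, and the use of error are assumptions.

```scheme
(define create-queue
  (lambda ()
    (let ((q-in '())      ; recently enqueued items, newest first
          (q-out '()))    ; items ready to dequeue, oldest first
      (letrec
          ((reset-queue
            (lambda ()
              (set! q-in '())
              (set! q-out '())))
           (empty-queue?
            (lambda ()
              (and (null? q-in) (null? q-out))))
           (enqueue
            (lambda (x)
              (set! q-in (cons x q-in))))
           (dequeue
            (lambda ()
              (if (empty-queue?)
                  (error "dequeue: queue is empty")
                  (begin
                    ;; Refill q-out only when it runs dry.
                    (if (null? q-out)
                        (begin
                          (set! q-out (reverse q-in))
                          (set! q-in '())))
                    (let ((ans (car q-out)))
                      (set! q-out (cdr q-out))
                      ans))))))
        ;; The four closures share q-in and q-out; nothing else can.
        (vector reset-queue empty-queue? enqueue dequeue)))))
```

Given (define q (create-queue)), the call ((vector-ref q 2) 'a) enqueues a and ((vector-ref q 3)) dequeues the oldest item. Each call to create-queue produces a fresh pair of hidden lists, so separate queues do not interfere.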


    [EOPL2] D.P. Friedman, M. Wand, and C.T. Haynes. Essentials of Programming Languages. MIT Press, Cambridge, MA, Second edition, 2001.
