CPS 343/543 Lecture notes: Representation strategies
Coverage: [EOPL2] §2.1 (pp. 39-42) and §§2.3-2.4 (pp. 55-68)
Data abstraction
- involves factoring a data structure into
- interface (which is implementation-neutral),
- implementation, and
- application (client code which is also implementation-neutral)
- if a data type is developed in this way, it is called an
abstract data type or ADT
- opaque vs. transparent data types
- Scheme does not provide direct support for creating opaque
data types (i.e., it is not built into the language)
- how about C, C++, or Java?
- ML has one of the most elegant type systems of any
programming language
- what are the semantics of the static keyword in C?
- at file scope, it gives the variable or function it precedes internal linkage
- the variable is bound statically to a location
and that binding does not change during run-time
- its lifetime spans the entire run of the program, but its value can change at run-time
- simple usage: a function which monitors the
number of times it is called
- the static keyword in C is the analog of what in Scheme?
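One possible answer to the last question, offered as a sketch rather than the definitive one: a local variable captured by a closure behaves much like a function-scope static variable in C. The procedure below monitors the number of times it is called. The name count-calls is hypothetical and chosen only for illustration.

```scheme
;; count-calls is a hypothetical name chosen for illustration.
;; The variable `count` is created once, when the outer `let` is evaluated,
;; and is visible only to the lambda that closes over it.
(define count-calls
  (let ((count 0))
    (lambda ()
      (set! count (+ count 1))
      count)))

(count-calls) ; => 1
(count-calls) ; => 2
```

As with a static local in C, no other code can read or modify count directly, yet its value persists from one call to the next.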
Example: non-negative integers
- representation-independent
- [.] means `representation of .'
- interface, implementation, and client
- what about possible choices for representations?
- unary representation: represent n by a list of n #t's
- others? two's complement
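A minimal sketch of the unary representation together with a small client. The interface names zero, iszero?, succ, and pred are assumed here (the notes do not list them), and plus is a hypothetical client written only against that interface.

```scheme
;; Implementation: the integer n is represented by a list of n #t's.
(define zero '())
(define iszero? null?)
(define succ (lambda (n) (cons #t n)))
(define pred cdr)

;; Client: uses only the four interface procedures above,
;; so it never depends on the list structure itself.
(define plus
  (lambda (x y)
    (if (iszero? x)
        y
        (succ (plus (pred x) y)))))

(plus (succ (succ zero)) (succ zero)) ; => (#t #t #t), i.e., [3]
```

Replacing the four definitions with ones built on Scheme numbers (zero as 0, succ as adding 1, and so on) would leave plus unchanged.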
Advantages to an ADT
- client code is completely independent of the representation
- therefore, we are free to substitute any other implementation of the interface
without requiring modifications to the client
Choices of representation
- data structure representation (e.g., list)
- abstract syntax representation
- procedural representation (fundamentally different from prior two)
- others?
Environment
- an environment is a mapping which associates symbols
with values in a programming language implementation
- a symbol table is an example of an environment
- a symbol table is used in a compiler to associate variable
names with lexical address information
- an environment is a mapping (a set of pairs)
- domain: the finite set of Scheme symbols
- range: the set of all Scheme values
- interface and sample client code on p. 56
- constructors create: empty-env and extend-env
- observers extract: apply-env
- how might we represent an environment?
- as a list of lists
- as abstract syntax
- procedurally
Procedural representation
- since procedures are first-class objects in Scheme,
`it is often advantageous to represent data as procedures,
particularly when the data type has multiple constructors,
but only a single observer' [EOPL2] p. 56
- note: this may seem like a non-intuitive use of procedures;
we usually do not think of data as a program
- how can we represent an environment (which we
think of as a data structure) as a procedure?
- how about using a Scheme procedure which takes a
symbol and returns its associated value
- with such a representation, we can define the interface
procedurally
- implementation in Fig. 2.3 (p. 57)
- do you remember what the procedures returned by empty-env
and extend-env are called? think back to the
first day of class: closures
- this example brings us face to face with the fact that
a program is nothing more than data, and therefore a data
structure can be represented as a program
- very often the set of values in a data type can be
represented as a set of procedures
- how can we extract the interface for an ADT and the (procedural
representation) implementation from the client? see pp. 58
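The procedural representation described above can be sketched as follows. This is written in the spirit of the textbook figure, not copied from it: index-of is a hypothetical helper, dxy-env is a hypothetical sample environment, and error stands in for whatever error-reporting procedure the course library provides.

```scheme
;; Hypothetical helper: position of sym in syms, or #f if absent.
(define index-of
  (lambda (sym syms)
    (let loop ((rest syms) (pos 0))
      (cond ((null? rest) #f)
            ((eq? (car rest) sym) pos)
            (else (loop (cdr rest) (+ pos 1)))))))

;; An environment is a procedure that maps a symbol to its value.
(define empty-env
  (lambda ()
    (lambda (sym)
      (error "apply-env: no binding for" sym))))

(define extend-env
  (lambda (syms vals env)
    (lambda (sym)
      (let ((pos (index-of sym syms)))
        (if pos
            (list-ref vals pos)
            (apply-env env sym))))))

;; The single observer just applies the environment to the symbol.
(define apply-env
  (lambda (env sym)
    (env sym)))

;; Sample client
(define dxy-env
  (extend-env '(d x) '(6 7)
    (extend-env '(y) '(8)
      (empty-env))))

(apply-env dxy-env 'x) ; => 7
(apply-env dxy-env 'y) ; => 8
```

Each constructor returns a closure, and all of the lookup logic lives inside those closures, which is why a single observer suffices.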
List of lists environment representation
- grammar and interpretation on p. 61
- sample client code and corresponding list representation on p. 62
- implementation on p. 62
- the lists in the list of lists are called ribs
- the car of each rib is a list of symbols
- the cadr of each rib is the corresponding list of values
- this is called the ribcage representation:
(regenerated from [EOPL2] Fig. 2.4 on p. 64)
- how can we make this representation more efficient?
- use a vector instead of a list
- lookup in a list (list-ref ...):
linear time (sequential access)
- lookup in a vector (vector-ref ...):
constant time (direct access)
- represent each rib as a single pair (e.g.,
(((d x) 6 7) ((y) 8)))
- revised implementation on p. 63
- if lookup is based on lexical, rather
than symbolic, information
- we can eliminate the symbol lists, and
- represent environments as a list of vectors (e.g.,
(#2(6 7) #1(8)))
- re-revised implementation on p. 63
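A minimal sketch of the basic list-of-lists (ribcage) representation, before any of the efficiency revisions. It is an illustration under assumptions, not the textbook's code: index-of is a hypothetical helper, dxy-env is a hypothetical sample environment, and error stands in for the course's error-reporting procedure.

```scheme
;; Hypothetical helper: position of sym in syms, or #f if absent.
(define index-of
  (lambda (sym syms)
    (let loop ((rest syms) (pos 0))
      (cond ((null? rest) #f)
            ((eq? (car rest) sym) pos)
            (else (loop (cdr rest) (+ pos 1)))))))

;; An environment is a list of ribs; each rib is (syms vals).
(define empty-env
  (lambda () '()))

(define extend-env
  (lambda (syms vals env)
    (cons (list syms vals) env)))

;; Search the ribs from newest to oldest.
(define apply-env
  (lambda (env sym)
    (if (null? env)
        (error "apply-env: no binding for" sym)
        (let* ((rib (car env))
               (pos (index-of sym (car rib))))
          (if pos
              (list-ref (cadr rib) pos)
              (apply-env (cdr env) sym))))))

(define dxy-env
  (extend-env '(d x) '(6 7)
    (extend-env '(y) '(8)
      (empty-env))))

dxy-env                ; => (((d x) (6 7)) ((y) (8)))
(apply-env dxy-env 'x) ; => 7
```

The vector-based revision would store each rib's values with list->vector and replace list-ref with vector-ref. The client code that builds and queries dxy-env would not change.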
Abstract syntax representation
(self-study)
- take expressions of the form
(extend-env syms_n vals_n
...
(extend-env syms_1 vals_1
(empty-env)))
and define them using EBNF (a concrete syntax)
<env-rep> ::= ???
<env-rep> ::= ???
- represent that concrete syntax as an abstract syntax
- define the data type
- define the environment implementation
- see §2.3.3 (pp. 59-61) when done
Queue abstraction
- brings us closer to object-oriented programming (OOP)
- implementation in a purely functional setting would require
passing and returning queues from procedure to procedure
- an alternative is to use a queue shared among all
of the procedures
- this sounds imperative and it is
- we still want the representation of the queue to be hidden
- we can create an interface with a constructor that returns each
of the operations that act on the shared hidden state of the queue
- each of those returned procedures is a closure (recall analogy
to OOP)
- interface on p. 66
- single queue is simulated by two lists
- imperative features in the implementation:
- (set! ...) (Scheme's assignment statement; works via
side effect)
- (begin ...) creates statement blocks
- create-queue is the queue constructor; returns a
vector of queue operations
- implementation in Fig. 2.5 (p. 67)
- this queue is an object and the operations on it are
called methods
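The queue object outlined above might look like the following sketch. It follows the description in these notes (two lists, set!, begin, a vector of closures) but is not a copy of the textbook figure. The internal names q-in and q-out, the operation names, and the order of the operations in the vector are assumptions, and error stands in for the course's error-reporting procedure.

```scheme
(define create-queue
  (lambda ()
    (let ((q-in '())    ; newly enqueued items, most recent first
          (q-out '()))  ; items ready to be dequeued, oldest first
      (letrec
          ((reset-queue
            (lambda ()
              (set! q-in '())
              (set! q-out '())))
           (empty-queue?
            (lambda ()
              (and (null? q-in) (null? q-out))))
           (enqueue
            (lambda (x)
              (set! q-in (cons x q-in))))
           (dequeue
            (lambda ()
              (if (empty-queue?)
                  (error "dequeue: queue is empty")
                  (begin
                    ;; refill q-out from q-in only when q-out is exhausted
                    (if (null? q-out)
                        (begin
                          (set! q-out (reverse q-in))
                          (set! q-in '())))
                    (let ((front (car q-out)))
                      (set! q-out (cdr q-out))
                      front))))))
        (vector reset-queue empty-queue? enqueue dequeue)))))

;; Client: extract the "methods" from the vector returned by the constructor.
(define q (create-queue))
(define enqueue (vector-ref q 2))
(define dequeue (vector-ref q 3))
(enqueue 'a)
(enqueue 'b)
(dequeue) ; => a
```

The two lists are reachable only through the four closures, so the queue's state is hidden from the client even though it is mutated by side effect.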
References
[EOPL2] D.P. Friedman, M. Wand, and C.T. Haynes.
Essentials of Programming Languages.
MIT Press, Cambridge, MA, Second edition, 2001.