CPS 343/543 Lecture notes: Binding and scope
Coverage: [EOPL2] §1.3 (pp. 28-38)
Checkpoint
Maintain perspective: this is a course on the concepts of programming
languages.
What is a PL concept?
Our approach: study those concepts by building (in Scheme) interpreters for
languages which implement them.
Why Scheme?
- you do not know it and therefore will learn something new
- ideal vehicle to study programming language
concepts because it forces us to focus on
fundamental language concepts
- very simple and consistent, yet powerful language (see HW2...DSs <= 100 LOC)
Powerful languages are those which support the creation of new languages.
- LISP is a language for designing languages.
- XML is a language for designing languages (e.g., VoiceXML).
- Usually implies no syntactic distinction between programs and data
(such languages are called homoiconic).
Binding times
- a useful concept for studying language concepts
- a binding is a mapping: representation → intended meaning
The times of our lives (and their bindings)
- birth
- dob
- sex
- parents
- older siblings (if any)
- life
- death: beneficiary of a will
Times in the study of programming languages
- language definition time:
- keyword int bound to meaning of integer
- N. Wirth (designer of Pascal): program mypgm () begin end
- language implementation time:
int datatype bound to a size (e.g., 4 bytes)
- compile time: identifier x bound to an integer variable
- link time: printf is bound to its definition
- load time:
- variable x bound to memory cell at address 7cd7
- could be at run-time as well (consider a variable local to a function)
- run-time: x bound to value 10
- some are static, some are dynamic
- static: before run-time, and unchangeable
- dynamic: at or during run-time, and modifiable
- earlier times imply
- safety
- reliability
- predictability, no surprises
- efficiency
- later times imply flexibility
- interpreted languages (e.g., Scheme): most bindings happen at run-time
- compiled languages (e.g., C, C++, FORTRAN): many bindings are static
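A minimal sketch of late (dynamic) binding follows. It is written in Python rather than Scheme purely for convenience, and the names x and greet are hypothetical. It shows only that, in a language where most bindings happen at run-time, both a variable and a function name can be re-bound while the program runs.

```python
# Late (dynamic) binding: names are bound to values, and even to
# function definitions, while the program runs, and may be re-bound.

def greet():
    return "hello"

x = 10                  # run-time: x bound to the value 10
first = (x, greet())

x = "ten"               # x re-bound to a value of a different type

def greet():            # the name greet is re-bound to a new definition
    return "goodbye"

second = (x, greet())

print(first)    # (10, 'hello')
print(second)   # ('ten', 'goodbye')
```

In C, by contrast, the type of x and the definition of greet are fixed before run-time, which buys safety and efficiency at the cost of this flexibility.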
Early vs. late binding
- human analogy
- parents and birthday are bound statically (at birth)
- marital status and height are bound and re-bound
dynamically (throughout life)
- a programming language should be an algorithm for algorithm development,
rather than just a tool to implement an algorithm (recall oil
painting)
- do not get too wedded to a design or you will be
forced to use duct tape to patch things later should the specifications
change (which they always do!)
(it's like a bad movie which never ends)
- now even Microsoft is trying to get a piece of the pie with
F#
Bindings of variables
- references vs. declarations
- denotation
- a reference is bound to a declaration
- declarations have limited scope
- references are (statically or dynamically) bound to declarations (which have
limited scope)
- the scope of a declaration is the
region of the program (a range of statements)
where that variable is visible (i.e., can be referenced)
- local vs. nonlocal references
- binding rules or scope rules
specify to which declaration a reference is bound
- languages where that binding can be determined by analyzing the program
text are said to use static scoping
- languages where that binding cannot be determined until run-time
are said to use dynamic scoping
- binding rule for λ-calculus expressions ([EOPL2] Definition 1.3.1
on p. 29)
- qualifiers, concepts, and operators such as
private, public, friend, and the scope resolution
operator (::) in C++ give the programmer
finer control over scope
Static scoping
- introduced in ALGOL 60
- local + ancestor blocks:
sometimes called lexical scoping
- declaration associated with a referenced variable
can be determined statically (i.e., before run-time)
- scope of a variable reference is the code
constituting its static ancestors
- advantages of static scoping
- readability
- predictability
- type checking/validation
- disadvantages of static scoping
- scope of a variable tends to be larger than necessary;
see [COPL9] p. 232
- sometimes leads to several globals or all subprograms
residing at the same level
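Static scoping can be illustrated with a small sketch. Python is used here only as a convenient statically scoped language; the names show, caller, and make_counter are hypothetical. The point is that a reference is resolved against the blocks that enclose it in the program text, not against the blocks of whoever calls it.

```python
x = "global x"

def show():
    # x is bound to the declaration visible where show is *written*
    # (the module-level x), not where show is called.
    return x

def caller():
    x = "caller's x"    # a local declaration; invisible to show
    return show()

def make_counter():
    count = 0           # declared in the enclosing (ancestor) block
    def increment():
        nonlocal count  # refers to the statically enclosing count
        count += 1
        return count
    return increment

result = caller()
counter = make_counter()
counts = [counter(), counter(), counter()]

print(result)   # global x
print(counts)   # [1, 2, 3]
```

Because the binding of every reference is visible in the text, a reader (or a type checker) can resolve it without running the program.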
Dynamic scoping
- scope determined "based on the calling sequence of
subprograms, not on their spatial relationship to each other"
[COPL9] p. 232; implies run-time
- used in McCarthy's original version of LISP as
well as APL and SNOBOL4
- Scheme, a popular dialect of LISP, adopted static scoping;
an example of mutation in the evolution of programming
languages
- Perl and COMMON LISP leave it up to the programmer
- example in Perl:
$l = 10;
$d = 10;
# reads an integer from standard input
$input = <STDIN>;
if ($input == 5) {
    print "Before the call to sub1 -- l: $l, d: $d\n";
    &sub1();
    print "After the call to sub1 -- l: $l, d: $d\n";
} else {
    print "Before the call to sub2 -- l: $l, d: $d\n";
    &sub2();
    print "After the call to sub2 -- l: $l, d: $d\n";
}
exit(0);
sub sub1 {
    my $l; # only in this block (statically scoped)
    local $d; # accessible to children (dynamically scoped)
    $l = 5;
    $d = 20;
    print "Inside the call to sub1 -- l: $l, d: $d\n";
    print "Before the call to sub2 -- l: $l, d: $d\n";
    &sub2();
    print "After the call to sub2 -- l: $l, d: $d\n";
}
sub sub2 {
    print "Inside the call to sub2 -- l: $l, d: $d\n";
}
- advantages of dynamic scoping
- flexibility
- sometimes makes things easy (e.g., no need to pass parameters
if they are present in an outer scope)
- often parameters passed from one subprogram to
another are simply variables local to the caller
- disadvantages of dynamic scoping
- readability
- reliability
- type checking; can we use static type checking in a
dynamically scoped language?
- can be less efficient to implement than static scoping
- difficult to debug
- no locality of access
- no way to protect local variables
- subprograms are always executed in the environment
of all previously called subprograms which have not
yet completed their execution
- can have unintended consequences
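The behavior of local $d in the Perl example above can be approximated in a statically scoped language by keeping dynamically scoped variables in one shared table and saving and restoring their values around a call. The sketch below is an emulation under that assumption, not how Perl is implemented; the names dynamic, local, and log are hypothetical.

```python
from contextlib import contextmanager

# Dynamically scoped variables live in one shared table.
dynamic = {"d": 10}
log = []

@contextmanager
def local(name, value):
    """Temporarily re-bind a dynamic variable, restoring it on exit."""
    saved = dynamic[name]
    dynamic[name] = value
    try:
        yield
    finally:
        dynamic[name] = saved   # binding undone when the block exits

def sub2():
    # Which d is seen depends on who called sub2, not on where it is written.
    log.append(("sub2", dynamic["d"]))

def sub1():
    with local("d", 20):
        sub2()                  # sees d = 20

sub2()      # called from the top level: sees d = 10
sub1()      # sub2 called via sub1: sees d = 20
sub2()      # binding restored: sees d = 10 again

print(log)  # [('sub2', 10), ('sub2', 20), ('sub2', 10)]
```

The same call to sub2 yields different values of d depending on the calling sequence, which is exactly why dynamically scoped programs are harder to read and debug.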
Referencing environment
- the referencing environment is
the set of variables (and their bindings)
which are visible at any given point in a program
- examples from [COPL9] pp. 235-237
- scope and referencing environments are
inverses of each other
- scope(<declaration>) = {a set of program points}
- refenv(a program point) = {a set of variable bindings}
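The inverse relationship between scope and referencing environment can be made concrete with a toy model. In the sketch below the program, its line numbers, and the labels such as "x@1" (meaning "the x declared on line 1") are all invented for illustration.

```python
from collections import defaultdict

# A toy program: each declaration is mapped to the set of program
# points (here, line numbers) at which it is visible.
scope = {
    "x@1": {1, 2, 3, 6},          # outer x, hidden on lines 4-5
    "y@1": {1, 2, 3, 4, 5, 6},
    "x@4": {4, 5},                # inner x shadows the outer one
}

def referencing_environments(scope):
    """Invert scope: program point -> set of visible declarations."""
    refenv = defaultdict(set)
    for declaration, points in scope.items():
        for point in points:
            refenv[point].add(declaration)
    return dict(refenv)

refenv = referencing_environments(scope)

print(sorted(refenv[3]))    # ['x@1', 'y@1']
print(sorted(refenv[5]))    # ['x@4', 'y@1']
```

A declaration appears in refenv(p) exactly when p appears in the scope of that declaration.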
Free or bound?
- for any programming language (see [EOPL2] Definition 1.3.2
on p. 29)
- value of an expression depends only on its free variables
- value of an expression is independent of its bound
variables
- value of an expression with no free variables is fixed
- such expressions are called combinators
- for instance, identity function or application combinator
- for λ-calculus (see [EOPL2] Definition 1.3.3 on p. 31)
- occurs-free? and occurs-bound? (see [EOPL2] Fig. 1.1
on p. 32)
Determining the declaration associated with a reference
- notion of a block-structured language
- a block is a group of statements with associated
declarations (scope)
- sometimes involves nested subprograms
- Scheme, C, and Perl are each block-structured,
statically-scoped languages
- lexical binding
- scope of a variable declaration is the text within which
references to the variable refer to the declaration [EOPL2] p. 33
- scope is therefore a subset of the program
- one (inner) declaration may shadow
another (outer) declaration of the same name; equivalently,
- the (inner) declaration creates a scope hole
in the scope of the (outer) one
- visibility
- procedure for determining the declaration to which a variable
reference is bound
- lexical depth; use zero-based
indexing
- declaration position; also use zero-based indexing
- can associate each variable reference with a (lexical depth,
declaration position) pair (i.e., (v: d p))
- lexical address makes variable name unnecessary
- replace formal parameter lists with their length
- identifiers are necessary for writing programs, but
unnecessary for executing them
- contour diagrams
Evolution of computer languages
Overview of lecture
You may not have realized it, but in learning
let, let*, and letrec,
you have been studying a concept called scope.
Identifiers may appear in two different contexts:
as references:
in (f x y), f, x, and y are references
as declarations:
in (lambda (x) ...) or (let ((x ...)) ...)
the occurrence of x is a declaration
The value named by an identifier is called its denotation.
Each reference is (statically or dynamically) bound to a declaration
(which has limited scope in most languages).
Languages have binding rules.
In Scheme, the relationship between a reference and its declaration
is a static property.
Static scoping: can determine scope by examining the text of the program.
Dynamic scoping: can only determine scope at run-time.
McCarthy's original version of LISP used dynamic scoping.
Perl and COMMON LISP let the programmer choose the scoping method used
for each variable.
Examples in Perl: dynamic.pl and dynamic2.pl.
Binding rule for lambda calculus: [EOPL2] p. 29.
free or bound (in general for any PL)?
((lambda (x) x) y) (x bound, y free)
(lambda (y)
((lambda (x) x) y))
(x and y now both bound)
The meaning of an expression with no free variables is fixed.
Lambda calculus expressions without
free variables are called combinators
and are useful programming tools.
;;; application combinator
(lambda (f)
(lambda (x)
(f x)))
Free or bound in lambda calculus: [EOPL2] p. 31.
occurs-free? and occurs-bound? ([EOPL2] p. 32) implement those rules.
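The free/bound rules for lambda calculus can be sketched as two mutually dependent predicates. The version below is a Python approximation in the spirit of occurs-free? and occurs-bound?, not a transcription of the textbook code. It assumes a hypothetical representation: a variable is a string, an abstraction is the triple ("lambda", parameter, body), and an application is a pair (operator, operand).

```python
def is_lambda(exp):
    return isinstance(exp, tuple) and len(exp) == 3 and exp[0] == "lambda"

def occurs_free(var, exp):
    """True if var has a free occurrence in exp."""
    if isinstance(exp, str):                 # variable reference
        return exp == var
    if is_lambda(exp):                       # ("lambda", param, body)
        _, param, body = exp
        return param != var and occurs_free(var, body)
    rator, rand = exp                        # application
    return occurs_free(var, rator) or occurs_free(var, rand)

def occurs_bound(var, exp):
    """True if var has a bound occurrence in exp."""
    if isinstance(exp, str):
        return False
    if is_lambda(exp):
        _, param, body = exp
        return occurs_bound(var, body) or (
            param == var and occurs_free(var, body))
    rator, rand = exp
    return occurs_bound(var, rator) or occurs_bound(var, rand)

identity = ("lambda", "x", "x")
exp1 = (identity, "y")                       # ((lambda (x) x) y)
exp2 = ("lambda", "y", exp1)                 # (lambda (y) ((lambda (x) x) y))

print(occurs_bound("x", exp1), occurs_free("y", exp1))   # True True
print(occurs_bound("y", exp2), occurs_free("y", exp2))   # True False
```

Since neither x nor y occurs free in exp2, it has no free variables among them, which matches the observation that wrapping the expression in (lambda (y) ...) binds y.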
Relationship between references and declarations:
(lambda (x) ...)
(define x ...)
nesting
block-structured language
language rules: scoping rules
> (define x                 ; line 1
    (lambda (x)             ; line 2
      (map
        (lambda (x)         ; line 4
          (+ x 1))          ; line 5; reference x refers to declaration x on line 4
        x)))                ; line 6; reference x refers to declaration x on line 2
> (x '(1 2 3))              ; line 7; reference x refers to declaration x on line 1
(2 3 4)
scope of x on line 1: {line 7}
scope of x on line 2: {line 6}
scope of x on line 4: {line 5}
Scope of a variable declaration is the text within which references
to the variable refer to that declaration.
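Scope of a declaration of v includes all references to v which occur free
in the region associated with that declaration.
References to v in that region which are bound by an inner declaration of v
refer to the inner declaration, which shadows the outer one.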
Algorithm: search the regions enclosing the reference inside-out
(i.e., from the innermost block to the outermost block).
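The inside-out search can be sketched as a walk over the blocks that enclose a reference, innermost first. The function name find_declaration and the block labels below are hypothetical; the two lists describe the references to x on lines 5 and 6 of the example above.

```python
def find_declaration(name, enclosing_blocks):
    """Return the label of the innermost enclosing block declaring name.

    enclosing_blocks lists the blocks around a reference, innermost
    first; each block is a pair (label, set of names declared there).
    """
    for label, declared in enclosing_blocks:
        if name in declared:
            return label        # first hit wins: inner shadows outer
    return None                 # no enclosing block declares the name

# Blocks enclosing the reference to x on line 5.
blocks_line5 = [("lambda on line 4", {"x"}),
                ("lambda on line 2", {"x"}),
                ("define on line 1", {"x"})]

# The reference to x on line 6 sits outside the inner lambda.
blocks_line6 = blocks_line5[1:]

print(find_declaration("x", blocks_line5))   # lambda on line 4
print(find_declaration("x", blocks_line6))   # lambda on line 2
print(find_declaration("z", blocks_line5))   # None
```

Stopping at the first match is what makes an inner declaration shadow an outer one of the same name.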
Lexical depth (use zero-based indexing):
(lambda (x y)
  ((lambda (a)
     (x (a y)))     ; line 3
   x))              ; line 4
0: x on line 4 and a on line 3
1: x and y on line 3
Declaration position (use zero-based indexing)
Variable's lexical address: (v : d p)
(lambda (x y)
  ((lambda (a)
     ((x : 1 0) ((a : 0 0) (y : 1 1))))
   (x : 0 0)))
Lexical address is all we need; (identifier) names are superfluous!
Formal parameter lists are replaced by their length.
(lambda 2
  ((lambda 1
     ((: 1 0) ((: 0 0) (: 1 1))))
   (: 0 0)))
Lexically-bound identifiers are useful for writing and understanding
programs, but are unnecessary for executing programs.
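The two transformations above (annotating each reference with its lexical address, then discarding the names) can be sketched as follows. This is an illustrative Python version under an assumed representation, not code from [EOPL2]: a variable is a string, an abstraction is ("lambda", tuple_of_parameters, body), and an application is ("app", operator, operand, ...). It also assumes no variable is itself named "lambda" or "app".

```python
def lexical_address(exp, scopes=()):
    """Replace each variable reference v by (v, depth, position).

    scopes holds the parameter tuples of the enclosing lambdas,
    innermost first, so the index into scopes is the lexical depth.
    """
    if isinstance(exp, str):
        for depth, params in enumerate(scopes):
            if exp in params:
                return (exp, depth, params.index(exp))
        return (exp, "free")
    if exp[0] == "lambda":
        _, params, body = exp
        return ("lambda", params, lexical_address(body, (params,) + scopes))
    return ("app",) + tuple(lexical_address(e, scopes) for e in exp[1:])

def strip_names(exp):
    """Drop identifiers: references become (depth, position) and
    parameter lists are replaced by their length."""
    if exp[0] == "lambda":
        _, params, body = exp
        return ("lambda", len(params), strip_names(body))
    if exp[0] == "app":
        return ("app",) + tuple(strip_names(e) for e in exp[1:])
    return exp[1:]              # (v, depth, position) -> (depth, position)

# (lambda (x y) ((lambda (a) (x (a y))) x))
example = ("lambda", ("x", "y"),
           ("app",
            ("lambda", ("a",), ("app", "x", ("app", "a", "y"))),
            "x"))

addressed = lexical_address(example)
nameless = strip_names(addressed)
print(addressed)
print(nameless)
```

The addresses computed for the example agree with the hand-annotated version above: x and y inside the inner lambda are at depth 1, a is at depth 0, and the final x is at depth 0.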
References
[COPL9] R.W. Sebesta. Concepts of Programming Languages. Addison-Wesley, Ninth edition, 2010.
[EOPL2] D.P. Friedman, M. Wand, and C.T. Haynes. Essentials of Programming Languages. MIT Press, Cambridge, MA, Second edition, 2001.