CPS 343/543 Lecture notes: Binding and scope



Coverage: [EOPL2] §1.3 (pp. 28-38)


Checkpoint

Maintain perspective: this is a course on the concepts of programming languages.

What is a PL concept?

Our approach: study those concepts by building, in Scheme, interpreters for languages which implement them.

Why Scheme?

  • you do not know it and therefore will learn something new
  • ideal vehicle to study programming language concepts because it forces us to focus on fundamental language concepts
  • very simple and consistent, yet powerful language (see HW2...DSs <= 100 LOC)

Powerful languages are those which support the creation of new languages.

  • LISP is a language for designing languages.
  • XML is a language for designing languages (e.g., VoiceXML).
  • Usually implies no syntactic distinction between programs and data (such languages are called homoiconic).


Binding times

  • binding time is a useful concept for studying language concepts
  • a binding is a mapping from a representation to its intended meaning (representation → intended meaning)


The times of our lives (and their bindings)

  • conception
    • sex
    • parents
    • older siblings (if any)
  • birth: dob
  • life
    • height
    • degree
  • death: beneficiary of a will


Times in the study of programming languages

  • language definition time:
    • keyword int bound to meaning of integer
    • N. Wirth (designer of Pascal): program mypgm () begin end
  • language implementation time: int datatype bound to a size (e.g., 4 bytes)
  • compile time: identifier x bound to an integer variable
  • link time: printf is bound to the definition
  • load time:
    • variable x bound to memory cell at address 7cd7
    • could be at run-time as well (consider a variable local to a function)
  • run-time: x bound to value 10

  • some are static, some are dynamic
    • static: before run-time, and unchangeable
    • dynamic: at or during run-time, and modifiable
  • earlier times imply
    • safety
    • reliability
    • predictability, no surprises
    • efficiency
  • later times imply flexibility
  • interpreted languages (e.g., Scheme): most bindings are dynamic (i.e., happen at run-time)
  • compiled languages (e.g., C, C++, FORTRAN): most bindings are static (i.e., happen before run-time)


Early vs. late binding

  • human analogy
    • parents and sex are bound statically (at conception)
    • birthday is bound statically (at birth)
    • height is bound and re-bound dynamically (throughout life)
  • a programming language should be an algorithm for algorithm development, rather than just a tool to implement an algorithm (recall oil painting)
  • do not get too wedded to a design, or you will be forced to use duct tape to patch things later should the specifications change (which they always do!) (it's like a bad movie which never ends)
  • now even Microsoft is trying to get a piece of the pie with F#


Bindings of variables

  • references vs. declarations
  • denotation
  • a reference is bound to a declaration
  • declarations have limited scope
  • references are (statically or dynamically) bound to declarations (which have limited scope)
  • the scope of a declaration is the region of the program (a range of statements) where the declared variable is visible (i.e., can be referenced)
  • local vs. nonlocal references
  • binding rules or scope rules specify to which declaration a reference is bound
  • languages where that binding can be determined by analyzing the program text are said to use static scoping
  • languages where that binding cannot be determined until run-time are said to use dynamic scoping
  • binding rule for λ-calculus expressions ([EOPL2] Definition 1.3.1 on p. 29)
  • qualifiers, concepts, and operators such as private, public, friend, and the scope resolution operator (::) in C++ give the programmer finer control over scope


Static scoping

  • introduced in ALGOL 60
  • local + ancestor blocks: sometimes called lexical scoping
  • declaration associated with a referenced variable can be determined statically (i.e., before run-time)
  • scope of a variable reference is the code constituting its static ancestors
  • advantages of static scoping
    • readability
    • predictability
    • type checking/validation
  • disadvantages of static scoping
    • scope of a variable tends to be larger than necessary; see [COPL9] p. 232
    • sometimes leads to several globals or all subprograms residing at the same level


Dynamic scoping

  • scope determined "based on the calling sequence of subprograms, not on their spatial relationship to each other" [COPL9] p. 232; implies run-time
  • used in McCarthy's original version of LISP as well as APL and SNOBOL4
  • Scheme, a popular dialect of LISP, adopted static scoping; an example of mutation in the evolution of programming languages
    • Perl and COMMON LISP leave it up to the programmer
    • example in Perl:
      $l = 10;
      $d = 10;
      
      # reads an integer from standard input
      $input = <STDIN>;
      
      if ($input == 5) {
         print "Before the call to sub1 -- l: $l, d: $d\n";
         &sub1();
         print "After the call to sub1 --  l: $l, d: $d\n";
      } else {
         print "Before the call to sub2 -- l: $l, d: $d\n";
         &sub2();
         print "After the call to sub2 --  l: $l, d: $d\n";
      }
      
      exit(0);
      
      sub sub1 {
         my $l; # only in this block (statically scoped)
      
         local $d; # visible here and in any subs called from here (dynamically scoped)
      
         $l = 5;
         $d = 20;
      
         print "Inside the call to sub1 -- l: $l, d: $d\n";
      
         print "Before the call to sub2 -- l: $l, d: $d\n";
         &sub2();
         print "After the call to sub2 --  l: $l, d: $d\n";
      
      }
      
      sub sub2 {
         print "Inside the call to sub2 -- l: $l, d: $d\n";
      }
      
  • advantages of dynamic scoping
    • flexibility
    • sometimes makes things easy (e.g., no need to pass parameters if they are present in an outer scope)
    • often parameters passed from one subprogram to another are simply variables local to the caller
  • disadvantages of dynamic scoping
    • readability
    • reliability
    • type checking; can we use static type checking in a dynamically scoped language?
    • can be less efficient to implement than static scoping
    • difficult to debug
    • no locality of access
    • no way to protect local variables
    • subprograms are always executed in the environment of all previously called subprograms which have not yet completed their execution
    • can have unintended consequences
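
The two lookup rules can be contrasted with a toy model. The following Python sketch is not from the original notes: the names Frame, lookup_static, and lookup_dynamic are hypothetical, and the frames are built by hand to mirror the shape of the Perl example above (sub1 and sub2 are both declared at top level, and sub1 calls sub2).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Frame:
    """One activation record in a toy model (all names are hypothetical)."""
    name: str
    bindings: dict = field(default_factory=dict)
    static_link: Optional["Frame"] = None   # frame of the textually enclosing block
    dynamic_link: Optional["Frame"] = None  # frame of the caller


def lookup_static(frame, var):
    """Follow static links: the answer depends only on the program text."""
    while frame is not None:
        if var in frame.bindings:
            return frame.bindings[var]
        frame = frame.static_link
    raise NameError(var)


def lookup_dynamic(frame, var):
    """Follow dynamic links: the answer depends on the calling sequence."""
    while frame is not None:
        if var in frame.bindings:
            return frame.bindings[var]
        frame = frame.dynamic_link
    raise NameError(var)


# Hand-built frames: top level, sub1 called from top, sub2 called from sub1.
top = Frame("top", {"l": 10, "d": 10})
sub1 = Frame("sub1", {"l": 5, "d": 20}, static_link=top, dynamic_link=top)
sub2 = Frame("sub2", {}, static_link=top, dynamic_link=sub1)

static_view = (lookup_static(sub2, "l"), lookup_static(sub2, "d"))
dynamic_view = (lookup_dynamic(sub2, "l"), lookup_dynamic(sub2, "d"))
print(static_view)   # (10, 10)
print(dynamic_view)  # (5, 20)
```

In the Perl program itself, my $l follows the static rule and local $d follows the dynamic rule, so sub2 called from sub1 sees l: 10 and d: 20.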


Referencing environment

  • the referencing environment is the set of variables (and their bindings) which are visible at any given point in a program
  • examples from [COPL9] pp. 235-237
  • scope and referencing environments are inverses of each other
    • scope(<declaration>) = {a set of program points}
    • refenv(a program point) = {a set of variable bindings}
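
The inverse relationship can be made concrete with a small table. This is only a sketch in Python: the program points 1 to 4, the declaration labels, and the function name referencing_environments are made up for illustration and do not come from [COPL9].

```python
from collections import defaultdict

# Hypothetical program with four program points and three declarations.
# scope maps each declaration to the set of program points where it is visible.
scope = {
    "x@global": {1, 2, 4},
    "y@f": {2, 3},
    "x@g": {3},
}


def referencing_environments(scope):
    """Invert scope: map each program point to the declarations visible there."""
    refenv = defaultdict(set)
    for declaration, points in scope.items():
        for point in points:
            refenv[point].add(declaration)
    return dict(refenv)


refenv = referencing_environments(scope)
for point in sorted(refenv):
    print(point, sorted(refenv[point]))
```

At point 3 the inner declaration x@g is visible and x@global is not, which is how shadowing shows up in this representation.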


Free or bound?

  • for any programming language (see [EOPL2] Definition 1.3.2 on p. 29)
  • value of an expression depends only on its free variables
  • value of an expression is independent of its bound variables
  • value of an expression with no free variables is fixed
    • such expressions are called combinators
    • for instance, identity function or application combinator
  • for λ-calculus (see [EOPL2] Definition 1.3.3 on p. 31)
  • occurs-free? and occurs-bound? (see [EOPL2] Fig. 1.1 on p. 32)
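
The λ-calculus rules can be sketched as two mutually dependent predicates. The version below is a Python transliteration written for these notes, not the Scheme code of [EOPL2] Fig. 1.1. The expression encoding is an assumption: a variable is a string, an abstraction is the 3-tuple ("lambda", param, body), and an application is the 2-tuple (rator, rand).

```python
def occurs_free(var, exp):
    """True if var occurs free in the lambda-calculus expression exp."""
    if isinstance(exp, str):              # variable reference
        return exp == var
    if len(exp) == 3:                     # ("lambda", param, body)
        _, param, body = exp
        return param != var and occurs_free(var, body)
    rator, rand = exp                     # application
    return occurs_free(var, rator) or occurs_free(var, rand)


def occurs_bound(var, exp):
    """True if var occurs bound in the lambda-calculus expression exp."""
    if isinstance(exp, str):              # a lone reference is never bound
        return False
    if len(exp) == 3:                     # ("lambda", param, body)
        _, param, body = exp
        return occurs_bound(var, body) or (
            param == var and occurs_free(var, body))
    rator, rand = exp                     # application
    return occurs_bound(var, rator) or occurs_bound(var, rand)


# ((lambda (x) x) y): x is bound, y is free
e1 = (("lambda", "x", "x"), "y")
# (lambda (y) ((lambda (x) x) y)): x and y are both bound
e2 = ("lambda", "y", e1)

print(occurs_bound("x", e1), occurs_free("y", e1))   # True True
print(occurs_bound("y", e2), occurs_free("y", e2))   # True False
```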


Determining the declaration associated with a reference

  • notion of a block-structured language
    • a block is a group of statements with associated declarations (scope)
    • sometimes involves nested subprograms
    • Scheme, C, and Perl are each block-structured, statically-scoped languages
  • lexical binding
  • scope of a variable declaration is the text within which references to the variable refer to the declaration [EOPL2] p. 33
  • scope is therefore a subset of the program
  • one (inner) declaration may shadow another (outer) declaration; equivalently, the inner declaration creates a scope hole in the scope of the outer one
  • visibility
  • procedure for determining the declaration to which a variable reference is bound
  • lexical depth; use zero-based indexing
  • declaration position; also use zero-based indexing
  • can associate each variable reference with a (lexical depth, declaration position) pair (i.e., (v : d p))
  • lexical address makes variable name unnecessary
  • replace formal parameter lists with their length
  • identifiers are necessary for writing programs, but unnecessary for executing them
  • contour diagrams


Evolution of computer languages


Overview of lecture

You may not have realized it, but in learning let, let*, and letrec, you have been studying a concept called scope.

Identifiers may appear in two different contexts:

    as references: in (f x y), f, x, and y are references

    as declarations: in (lambda (x) ...) or (let ((x ...)) ...) the occurrence of x is a declaration

The value named by an identifier is called its denotation.

Each reference is (statically or dynamically) bound to a declaration (which has limited scope in most languages).

Languages have binding rules.

In Scheme, the relationship between a reference and its declaration is a static property.

Static scoping: can determine scope by examining the text of the program.

Dynamic scoping: can only determine scope at run-time.

McCarthy's original version of LISP used dynamic scoping.

Perl and COMMON LISP let the programmer choose the scoping method used for each variable.

Examples in Perl: dynamic.pl and dynamic2.pl.

Binding rule for lambda calculus: [EOPL2] p. 29.

Free or bound (in general, for any PL)?

((lambda (x) x) y)
(x bound, y free)

(lambda (y)
   ((lambda (x) x) y))
(x and y now both bound)

The meaning of an expression with no free variables is fixed.

Lambda calculus expressions without free variables are called combinators and are useful programming tools.

;;; application combinator
(lambda (f)
   (lambda (x)
      (f x)))

free or bound in lambda calculus, [EOPL2] p. 31

occurs-free? and occurs-bound? on [EOPL2] p. 32 implement those rules.

Relationship between references and declarations:

(lambda (x) ...)

(define x ...)

nesting

block-structured language

language rules: scoping rules

> (define x               ; line 1
   (lambda (x)            ; line 2
      (map
         (lambda (x)      ; line 4
            (+ x 1))      ; line 5; reference x refers to declaration x on line 4
         x)))             ; line 6; reference x refers to declaration x on line 2

> (x '(1 2 3))            ; line 7; reference x refers to declaration x on line 1
(2 3 4)

scope of x on line 1 ? {line 7}

scope of x on line 2 ? {line 6}

scope of x on line 4 ? {line 5}

Scope of a variable declaration is the text within which references to the variable refer to that declaration.

Scope of a declaration of v includes all references to v which occur free in the region associated with that declaration.

References to v which occur bound in that region are shadowed by inner declarations of v.

Algorithm: search the regions enclosing the reference inside-out (i.e., from the innermost block to the outermost block).
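
The inside-out search can be sketched in a few lines of Python. The function name find_declaration and the (label, declared names) encoding of regions are hypothetical; the regions below are transcribed by hand from the listing above rather than computed from the program text.

```python
def find_declaration(name, enclosing_regions):
    """Return the label of the innermost region that declares name.

    enclosing_regions lists (label, declared_names) pairs, ordered from the
    innermost region to the outermost one.
    """
    for label, declared in enclosing_regions:
        if name in declared:
            return label
    return None  # not declared in any of the listed regions


# Regions enclosing the reference to x on line 5, innermost first.
regions_line5 = [("line 4", {"x"}), ("line 2", {"x"}), ("line 1", {"x"})]
# Regions enclosing the reference to x on line 6, innermost first.
regions_line6 = [("line 2", {"x"}), ("line 1", {"x"})]
# Regions enclosing the reference to x on line 7.
regions_line7 = [("line 1", {"x"})]

print(find_declaration("x", regions_line5))  # line 4
print(find_declaration("x", regions_line6))  # line 2
print(find_declaration("x", regions_line7))  # line 1
```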

Lexical depth (use zero-based indexing):

(lambda (x y)
   ((lambda (a)
      (x (a y)))    ; line 3
   x))              ; line 4

0: x on line 4 and a on line 3

1: x and y on line 3

Declaration position (use zero-based indexing)

Variable's lexical address: (v : d p)

(lambda (x y)
   ((lambda (a)
      ((x : 1 0) ((a : 0 0) (y : 1 1))))
   (x : 0 0)))

Lexical address is all we need; (identifier) names are superfluous!

Formal parameter lists are replaced by their length.

(lambda 2
   ((lambda 1
      ((: 1 0) ((: 0 0) (: 1 1))))
   (: 0 0)))

Lexically-bound identifiers are useful for writing and understanding programs, but are unnecessary for executing programs.
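
The translation to lexical addresses can be sketched as a single recursive pass that carries the enclosing parameter lists, innermost first. The Python below is an illustration written for these notes, not code from [EOPL2]. The function name lexical_address and the tuple encoding are assumptions: a variable is a string, an abstraction is ("lambda", params, body), any other tuple is an application, and no variable is itself named lambda.

```python
def lexical_address(exp, scopes=()):
    """Replace each variable reference with a (":", depth, position) triple.

    scopes is a tuple of parameter tuples, innermost first; depth and
    position both use zero-based indexing.
    """
    if isinstance(exp, str):                       # variable reference
        for depth, params in enumerate(scopes):    # search inside-out
            if exp in params:
                return (":", depth, params.index(exp))
        return ("free", exp)                       # no enclosing declaration
    if exp[0] == "lambda":                         # ("lambda", params, body)
        _, params, body = exp
        inner = (tuple(params),) + scopes
        # the parameter list is replaced by its length
        return ("lambda", len(params), lexical_address(body, inner))
    # application: translate the operator and every operand
    return tuple(lexical_address(e, scopes) for e in exp)


# (lambda (x y) ((lambda (a) (x (a y))) x))
exp = ("lambda", ("x", "y"),
       (("lambda", ("a",), ("x", ("a", "y"))),
        "x"))

result = lexical_address(exp)
print(result)
```

The printed structure corresponds to the name-free form above: ("lambda", 2, ...) for (lambda 2 ...), (":", 1, 0) for (: 1 0), and so on.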


References

    [COPL9] R.W. Sebesta. Concepts of Programming Languages. Addison-Wesley, Ninth edition, 2010.
    [EOPL2] D.P. Friedman, M. Wand, and C.T. Haynes. Essentials of Programming Languages. MIT Press, Cambridge, MA, Second edition, 2001.
