CPS 343/543 Lecture notes: Lazy evaluation and thunks



Coverage: [EOPL] Chapter 3 (pp. 115-119)


Introduction to being lazy

  • yet another parameter passing mechanism
  • main idea: only evaluate an operand when it is needed
  • this idea has both compelling and complex consequences
  • advantage: if operand is never needed in the body of the procedure, then we have saved the (perhaps long) time it would have taken to evaluate it
  • furthermore, evaluation of an operand can also result in an error or never terminate; if the operand is never needed, lazy evaluation avoids that too
    (define mystery
       (lambda (x y)
          (cond
             ((eqv? x 0) 1)
             (else y))))
    
    ; use lazy evaluation to avoid error
    (mystery 0 (/ 1 0))
    
  • seems to be a safeguard against stupid programmers; not so fast, keep reading
  • eager evaluation = applicative-order evaluation = strict
  • lazy evaluation = normal-order evaluation = nonstrict
  • applicative-order and normal-order usually apply to languages, while strict and nonstrict apply to specific procedures (e.g., cons is strict)
  • short-circuit evaluation is a form of lazy evaluation (e.g., in false && (true || false) the right operand is never evaluated)
  • consider how inefficient (or even non-terminating, in a recursive procedure) an if would be if it were implemented as a procedure (rather than a syntactic form) in an applicative-order language, since both branches are evaluated on every call
    ;; unreasonable in applicative-order languages
    (define my-if
       (lambda (condition usual-value exceptional-value)
          (cond 
             (condition usual-value)
             (else exceptional-value))))
    
  • call-by-name and lazy evaluation are each a form of late binding and illuminate the tradeoff between efficiency and software engineering (recall course themes)
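
One way to repair my-if in an applicative-order language is to delay each branch by hand. The sketch below wraps each branch in a procedure of no arguments, so only the chosen branch is ever evaluated. The names lazy-if and fact are invented here for illustration and do not come from [EOPL] or [PLPP].

```scheme
; lazy-if: a procedural `if' whose branches arrive as zero-argument procedures
; (hypothetical name, for illustration only)
(define lazy-if
   (lambda (condition then-thunk else-thunk)
      (cond
         (condition (then-thunk))
         (else (else-thunk)))))

; with my-if, the recursive branch would be evaluated before every call,
; even when n is 0, so the recursion would never stop;
; with lazy-if, the recursive branch runs only when it is selected
(define fact
   (lambda (n)
      (lazy-if (eqv? n 0)
               (lambda () 1)
               (lambda () (* n (fact (- n 1)))))))

(fact 5)   ; => 120

; the division is never performed because its procedure is never called
(lazy-if #t (lambda () 'ok) (lambda () (/ 1 0)))   ; => ok
```

The price is that every caller must write the lambda wrappers explicitly, which is exactly the bookkeeping a lazy language does automatically.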


Thunks

  • implementing lazy evaluation is easy in a language with first-class procedures
  • use a procedure with no arguments called a thunk (closely tied to Jensen's device, a call-by-name programming technique named after its inventor, Jørn Jensen)
    • when thunk is called it should return the value of the argument as if it had been evaluated at the time of the procedure invocation
    • therefore, the thunk must contain the necessary information required to produce the value of the argument when needed
      • need the argument expression, and
      • the environment at the time of the call
    • thus, a thunk is an (argument expression, environment at time of call) pair
  • in other words, a thunk is a shell for a delayed expression [PLPP]
  • forming a thunk = freezing (or delaying; the result is also called a promise)
  • evaluating a thunk = thawing (or forcing or force)
  • compilers generate thunks all the time (e.g., C++ compilers emit small thunks to adjust the this pointer in virtual calls under multiple inheritance; for more information, take the compilers course)
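
In Scheme, a lambda expression with no parameters already is such a pair, because a closure records both its body and the environment in which it was created. A minimal sketch of freezing and thawing follows; make-thunk, th, and x are names invented for this illustration.

```scheme
; freezing: the returned closure holds the expression (* x 2)
; together with the environment in which x is bound
(define make-thunk
   (lambda (x)
      (lambda () (* x 2))))

(define x 100)                ; an unrelated global x
(define th (make-thunk 3))    ; freeze: nothing is multiplied yet

; thawing: the result uses x = 3 from the environment captured at freezing,
; not the global x
(th)   ; => 6
```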


Examples (string substitution)

    We can think of call-by-name as a simple string substitution of actuals for formals (akin to #define).

    For instance,
    (define square
       (lambda (x)
          (* x x)))
    
    ; applicative-order evaluation
    > (square (* 3 2))
      (square 6)
      (* 6 6)
      36
    
    ; normal-order evaluation
    > (square (* 3 2))
      (* (* 3 2) (* 3 2))
      (* 6 6)
      36
    
    Conundrum: by trying to be efficient, we end up doing more work (i.e., we evaluated (* 3 2) twice).

    Therefore, implementations of lazy evaluation differ in how they handle multiple references to the same parameter as in square above.
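
The repeated work can be observed directly by simulating call-by-name with an explicit thunk and counting how often the operand is evaluated. This is only a sketch; evaluations, operand, and square-by-name are names invented here.

```scheme
(define evaluations 0)

; the operand (* 3 2), instrumented to count its evaluations
(define operand
   (lambda ()
      (set! evaluations (+ evaluations 1))
      (* 3 2)))

; square with its parameter passed as a thunk:
; every reference to the parameter calls the thunk again
(define square-by-name
   (lambda (x-thunk)
      (* (x-thunk) (x-thunk))))

(square-by-name operand)   ; => 36
evaluations                ; => 2
```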


Two implementations of lazy evaluation

  • call-by-name: invoke thunk every time parameter is referenced; disadvantage: it is a waste of time when side-effects are not possible because same value is returned each time
  • call-by-need: `records the value of each thunk the first time it is called, and thereafter refers to the saved value rather than re-invoking the thunk; this is an example of a more general technique called memoization' [EOPL] p. 115, which is also used in dynamic programming (for more information take Dr. Sri's algorithms course)
  • in a language without side effects, call-by-name = call-by-need (same results; call-by-need simply avoids re-evaluation)
  • however, in a language with side-effects, the two methods are clearly different; see example on p. 115 of [EOPL]
  • convention (notice impurity):
    • special syntactic forms such as if and cond should use normal-order evaluation
    • arithmetic primitives such as + and > should use applicative-order evaluation
    • note: Scheme macros let the programmer change this!
    • means `programmers cannot use standard mechanisms to extend the language' [PLPP], p. 507
  • in summary,
    • call-by-name is non-memoized lazy evaluation
    • call-by-need is memoized lazy evaluation
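
The difference under side effects can be sketched with hand-built thunks. Here memoize is a hypothetical helper (not the [EOPL] code) that turns a call-by-name thunk into a call-by-need one, and the operand deliberately has a side effect.

```scheme
; memoize: wrap a thunk so that it runs at most once (call-by-need)
(define memoize
   (lambda (thunk)
      (let ((done #f) (value #f))
         (lambda ()
            (cond
               ((not done)
                (set! value (thunk))
                (set! done #t)))
            value))))

; an operand with a side effect: each evaluation bumps and returns a counter
(define make-counter-thunk
   (lambda ()
      (let ((n 0))
         (lambda ()
            (set! n (+ n 1))
            n))))

; the body references its parameter twice
(define add-twice
   (lambda (x-thunk)
      (+ (x-thunk) (x-thunk))))

(add-twice (make-counter-thunk))             ; call-by-name => 3  (1 + 2)
(add-twice (memoize (make-counter-thunk)))   ; call-by-need => 2  (1 + 1)
```

If the operand had no side effect, both calls would return the same value and only the amount of work would differ.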


Child's play

  • lazy evaluation in a language without side effects supports a very simple view of a program --- one that is extremely close to mathematics
    • replace every reference to the formal parameter in the body with the corresponding operand
    • replace every procedure call with its (properly replaced) body
  • for example:
    (define square
       (lambda (x)
          (* x x)))
    
    (square 2)
    
    =>
    
    ((lambda (x)
          (* x x)) 2)
    
    =>
    
    (* 2 2)
    
    =>
    
    4
    
  • this evaluation strategy in λ-calculus is called β-reduction
  • other languages sometimes call it the copy rule


Call-by-name examples

  • recall, we can think of call-by-name as a simple string substitution of actuals for formals (akin to #define)
  • (examples courtesy the VT PL gang and [PLPP], pp. 321-324)
  • example 1: back to our familiar swap procedure
      /* call-by-value will not work here
         call-by-reference will always work here
         call-by-name works sometimes */

      void swap(int x, int y) {
         int temp = x;
         x = y;
         y = temp;
      }
      
      • main program a:
           i = 1;
           j = 2;
           a[i] = 3;
           a[j] = 4;
           swap(a[i], a[j]);
        
        textually replaced swap procedure (call-by-name):
        temp = eval(a[i], env);             // temp = 3 
        addr(a[i], env) = eval(a[j], env);  // a[i] = 4
        addr(a[j], env) = temp;             // a[j] = 3
        
        swap worked!

      • main program b:
           i = 1;
           a[1] = 5;
           swap(i, a[i]);
        
        textually replaced swap procedure (call-by-name):
        temp = eval(i, env);                // temp = 1 
        addr(i, env) = eval(a[i], env);     // i = 5 
        addr(a[i], env) = temp;             // a[5] = 1
        
        note: value of a[1] is not changed! swap did not work!

      • main program c:
        i = 1;
        j = 1;
        a[1] = 5;
        swap(i, a[j]);
        
        textually replaced swap procedure (call-by-name):
        temp = eval(i, env);                // temp = 1
        addr(i, env) = eval(a[j], env);     // i = 5
        addr(a[j], env) = temp;             // a[1] = 1

        swap worked! (a[j] does not depend on i, so assigning to i does not redirect it)
        
  • example 2:
    void sub2(int x, int y) {
       x = 1;
       y = 2;
       x = 2;
       y = 3;
    }
    
       sub2(i, a[i]);
    
    textually replaced sub2 procedure (call-by-name):
    addr(i,env) = 1;                    // i = 1 
    addr(a[i],env) = 2;                 // a[1] = 2 
    addr(i,env) = 2;                    // i = 2 
    addr(a[i],env) = 3;                 // a[2] = 3 
    

  • example 3:
    void sub3(int x, int y, int z) {
       k = 1;
       y = x;
       k = 5;
       z = x;
    }
    
       sub3(k+1,j,i);
    
    textually replaced sub3 procedure (call-by-name):
    k = 1;                              // k = 1
    addr(j,env) = eval(k+1, env);       // j = 2
    k = 5;                              // k = 5 
    addr(i,env) = eval(k+1,env);        // i = 6
    
  • does call-by-name encapsulate all other parameter-passing methods?
    • if actual is a scalar variable: it behaves just like call-by-reference
    • if actual is a constant expression: it behaves just like call-by-value
    • if actual is an array element, it behaves like ?
    • if actual is an expression with a reference to a variable which is also accessible within the program, it behaves like ?
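
Main program b of example 1 can be replayed in Scheme by modeling each name parameter as a pair of procedures closed over the caller's environment: a getter playing the role of eval and a setter playing the role of addr. This is a sketch under that assumption; swap-by-name and the getter/setter convention are invented here.

```scheme
; each name parameter is passed as a getter and a setter
(define swap-by-name
   (lambda (get-x set-x! get-y set-y!)
      (let ((temp (get-x)))
         (set-x! (get-y))
         (set-y! temp))))

; main program b: i = 1; a[1] = 5; swap(i, a[i]);
(define i 1)
(define a (make-vector 6 0))
(vector-set! a 1 5)

(swap-by-name
   (lambda () i)                  (lambda (v) (set! i v))
   (lambda () (vector-ref a i))   (lambda (v) (vector-set! a i v)))

i                  ; => 5
(vector-ref a 1)   ; => 5  (unchanged, so the swap failed)
(vector-ref a 5)   ; => 1  (temp landed in a[5], because i is 5 by then)
```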


Macros in C use call-by-name

    #include <stdio.h>
    
    /* max of two ints */
    #define MAX(a,b) ((a) > (b) ? (a) : (b))
    
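    /* swap macro (call-by-name); it declares t in the caller's scope,
       so it breaks if an argument is itself named t
       (try the commented-out lines in main) */
    #define swap(x, y) int t = (x); (x) = (y); (y) = t;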
    
    /* swap function (call-by-value) */
    void swap_cbv (int x, int y) {
       int temp = x;
       x = y;
       y = temp;
    }
    
    /* swap function (call-by-reference) */
    void swap_cbr (int* x, int* y) {
       int temp = *x;
       *x = *y;
       *y = temp;
    }
    
    int main(void) {
    
       int a = 1;
       int b = 2;
       /* int t = 5; */
    
       printf ("The max of %d and %d is %d.\n", 1, 2, MAX(1,2));
       printf ("The max of %d and %d is %d.\n", a, b, MAX(a,b));
       printf ("The max of %d and %d is %d.\n", b, a, MAX(b,a));
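       /* MAX(++a,++b) expands to ((++a) > (++b) ? (++a) : (++b)), so the larger
          argument is incremented twice; reading a and b elsewhere in the same
          call is not well defined in C, so the output may vary by compiler */
       printf ("The max of %d and %d is %d.\n", a+1, b+1, MAX(++a,++b));
       a--; b--;
       /* likewise, a++ is evaluated once or twice, depending on which argument is larger */
       printf ("The max of %d and %d is %d.\n", a+1, b, MAX(a++,b));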
    
       printf ("\n\n");
    
       printf ("before call-by-value: a = %d, b = %d\n\n", a, b);
       swap_cbv (a, b);
       printf ("after call-by-value: a = %d, b = %d\n\n", a, b);
    
       printf ("\n\n");
    
       printf ("before call-by-reference: a = %d, b = %d\n\n", a, b);
       swap_cbr (&a, &b);
       printf ("after call-by-reference: a = %d, b = %d\n\n", a, b);
    
       printf ("before swap macro: a = %d, b = %d\n\n", a, b);
       swap(a, b)
       /* swap(a, t) */
       printf ("after swap macro: a = %d, b = %d\n", a, b);
    }
    


Thunks in C

    #include <stdio.h>
    
    /* courtesy [PLPP], p. 508 */
    
    typedef int (*IntProc) (void);
    
    /* this is a thunk (or a shell) for the expression 1/0 */
    /* easier in a functional language like Scheme or ML */
    int divbyzero (void) {
       return 1/0;
    }
    
    int f(int x, IntProc y) {
       if (x)
          return 1;
       else
          return y();
    }
    
    int main(void) {
       printf("The result is: %d\n", f(1, divbyzero));
    }
    


Thunks in Scheme

    ; courtesy [PLPP], pp. 508-509
    (define f
       (lambda (x y)
          (cond
             (x #t)
             (else (y)))))
    
    ; second argument is a thunk
    (f #t (lambda () (/ 1 0)))
    
    ; delay is a special syntactic form
    ; force is a procedure
    
    ; delay and force are a call-by-need implementation of lazy evaluation
    ; Scheme requires every `delayed expression to be enclosed in force'
    
    (define f2
      (lambda (x y)
        (cond
          (x #t)
          (else (force y)))))
    
    (f2 #t (delay (/ 1 0)))
    
    ; courtesy [PLP], pp. 294-295 (see also pp. 509-510)
    
    ; eager evaluation would be inefficient here (in fact, non-terminating):
    ; only part of the stream is needed in any one computation
    (define naturals
       (letrec ((next (lambda (n) (cons n (delay (next (+ n 1)))))))
          (next 1)))
    
    (define head car)
    (define tail (lambda (stream) (force (cdr stream))))
    
    (head naturals)
    (head (tail naturals))
    (head (tail (tail naturals)))
    


Implementation of lazy evaluation in Scheme

  • use the special syntactic form delay and the procedure force
  • Scheme requires every `delayed expression to be enclosed in force'
  • delay and force use memoization (i.e., call-by-need semantics)
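
The memoization can be checked with an operand that counts its own evaluations. A minimal sketch; evaluations and p are names invented here.

```scheme
(define evaluations 0)

; delay freezes the expression; nothing is evaluated yet
(define p
   (delay
      (begin
         (set! evaluations (+ evaluations 1))
         (* 3 2))))

evaluations               ; => 0
(* (force p) (force p))   ; => 36
evaluations               ; => 1  (the second force reuses the saved value)
```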


Adding lazy evaluation to our interpreter

  • we want to retain:
    • expressed value = number + procval
    • denoted value = ref(expressed value)
  • we will extend our reference datatype with a third type of target: a thunk target
  • a thunk target will act like a direct target, except that rather than containing an expressed value, it will contain a thunk which evaluates to an expressed value
  • see code in Figure 3.19 on p. 117
  • see code in Figure 3.20 on p. 117
  • notice we are implementing call-by-need, see eval-thunk procedure in Figure 3.20 on p. 118
  • lastly, of course, we must update eval-rand; see Figure 3.21 on p. 119
  • see illustrative example on p. 117; notice primitives, such as + are strict
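
The idea of a thunk target can be sketched without the interpreter's datatypes. The code below is not the code of Figures 3.19-3.21; it is a simplified model in which a reference is a one-element vector, a target is a tagged list, and a Scheme closure stands in for the (expression, environment) thunk. All names except deref are invented here.

```scheme
; a target is (direct value) or (thunk procedure)
(define make-direct-ref (lambda (value) (vector (list 'direct value))))
(define make-thunk-ref  (lambda (proc)  (vector (list 'thunk proc))))

; deref: return the expressed value; a thunk target is evaluated once
; and then overwritten with a direct target (call-by-need)
(define deref
   (lambda (ref)
      (let ((target (vector-ref ref 0)))
         (cond
            ((eq? (car target) 'direct) (cadr target))
            (else
             (let ((value ((cadr target))))
                (vector-set! ref 0 (list 'direct value))
                value))))))

(define evaluations 0)
(define r
   (make-thunk-ref
      (lambda ()
         (set! evaluations (+ evaluations 1))
         (+ 3 4))))

(deref r)     ; => 7
(deref r)     ; => 7
evaluations   ; => 1
```

Leaving out the vector-set! line would turn this model into call-by-name, since the thunk would then run on every dereference.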


Strictness

``A (side-effect-free) function is said to be strict if it requires all of its arguments to be defined, so that its result will not depend on evaluation order. A function is said to be nonstrict if it does not impose this requirement. A language is said to be strict if it requires all functions to be strict. A language is said to be nonstrict if it permits the definition of nonstrict functions. Expressions in a strict language can be safely evaluated in applicative order. Expressions in a nonstrict language cannot. ML and (with the exception of macros) Scheme are strict. Miranda and Haskell are nonstrict'' [PLP], p. 541.


Three Properties of Lazy Evaluation

(courtesy [PIH], pp. 129-132)
  • if there exists any evaluation sequence which terminates for a given expression, then call-by-name evaluation will also terminate for this expression, and produce the same final result
  • arguments are evaluated precisely once using call-by-value evaluation, but may be evaluated many times using call-by-name
  • using lazy evaluation, expressions are only evaluated as much as required by the context in which they are used


Same-fringe problem

  • classical problem from functional programming which requires a generator-filter style of programming [PLPP]
  • demonstrates the power of lazy evaluation and the streams it enables
  • consider determining whether two S-expressions (nested lists) contain the same non-null atoms in the same order
  • approach:
    • flatten both lists and recurse down the two flattened lists in step
    • if a mismatch is found, the lists do not have the same fringe; otherwise,
    • if both lists are exhausted together, the fringes are equal
  • problem: if the first non-null atoms in each list are different, we flattened the lists for naught
  • lazy evaluation, however, will only realize as much of each flattened list as is needed to find a mismatch
  • if the lists have the same fringe, each flattened list must be fully generated
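
One possible realization of the lazy approach uses delay and force to flatten each list on demand. This is a sketch rather than the [PLPP] solution; lazy-flatten and same-fringe? are names invented here, and a lazy stream is taken to be either '() or a pair of an atom and a promise for the rest.

```scheme
; lazy-flatten: the fringe of tree, followed by the stream promised by rest;
; work stops as soon as the first atom has been found
(define lazy-flatten
   (lambda (tree rest)
      (cond
         ((null? tree) (force rest))
         ((pair? tree)
          (lazy-flatten (car tree)
                        (delay (lazy-flatten (cdr tree) rest))))
         (else (cons tree rest)))))

; same-fringe?: walk both lazy fringes in step, stopping at the first mismatch
(define same-fringe?
   (lambda (t1 t2)
      (let loop ((s1 (lazy-flatten t1 (delay '())))
                 (s2 (lazy-flatten t2 (delay '()))))
         (cond
            ((and (null? s1) (null? s2)) #t)
            ((or (null? s1) (null? s2)) #f)
            ((eqv? (car s1) (car s2))
             (loop (force (cdr s1)) (force (cdr s2))))
            (else #f)))))

(same-fringe? '(1 (2 3) 4) '((1 2) (3 (4))))   ; => #t
(same-fringe? '(1 2 3) '(1 (3 2)))             ; => #f
(same-fringe? '(1 2) '(1 2 3))                 ; => #f
```

In the second call the comparison stops at the second atom, so the remainder of each list is never flattened.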


Advantages to lazy evaluation

  • can save the work required to evaluate an unused argument and, as a result, might prevent a run-time error
  • leads to a uniform language; no need for special syntactic forms (e.g., if)
  • language can be extended by programmer in standard ways


Disadvantages to lazy evaluation

  • in its call-by-name incarnation, it can be inefficient if a formal is used more than once in the body
  • (so use call-by-need version then)
  • call-by-need makes a program difficult to understand in the presence of side-effects


Why isn't call-by-name popular?

  • overhead of freezing and thawing (can be reduced however with memoization, but requires freedom from side-effects)
  • `generally makes it difficult to determine the flow of control (order of evaluation), which in turn is essential to understanding a program with side effects' [EOPL] p. 116
  • however, in a language with no side effects, flow of control has absolutely no effect on the result of a program
  • result: `lazy evaluation is most popular in purely functional programming languages and rarely found elsewhere' [EOPL] p. 116
  • Haskell is about as close as it gets to a purely functional programming language
  • Algol60: first language to introduce/use call-by-name (why? recall historical context, assembly language programming, and macros), and then adopted (and dropped) by all descendant languages such as AlgolW, Algol68, C, and Pascal
  • Haskell (circa 1980s): first practical language to use call-by-need


Applications of lazy evaluation

The power of lazy evaluation lies less in its ability to obviate errors (or to make bad programs run properly, e.g., division by zero) than in the form of the solutions to problems that it enables.
  • potentially infinite (or so called lazy) data structures or streams
  • same-fringe problem
  • combinatorial search (of, for example, a tree) in AI
  • lends itself to straightforward implementation of important recursive algorithms
    • quicksort in 4 lines of code
    • Sieve of Eratosthenes in 1 line!
  • overall, generators, streams, and filters are used in concert in a generator-filter style of programming ([PLPP] p. 511) akin to the use of pipes in the UNIX shell (concurrent programming)
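
The generator-filter style can be sketched in Scheme with delay and force. The example below builds a stream of primes in the sieve style mentioned above; it is longer than the one-liner a lazy language allows, because the delays must be written by hand. The names integers-from, filter-stream, sieve, and primes are invented here.

```scheme
(define head car)
(define tail (lambda (stream) (force (cdr stream))))

; generator: the integers n, n+1, n+2, ...
(define integers-from
   (lambda (n)
      (cons n (delay (integers-from (+ n 1))))))

; filter: the elements of an infinite stream that satisfy keep?
(define filter-stream
   (lambda (keep? stream)
      (cond
         ((keep? (head stream))
          (cons (head stream)
                (delay (filter-stream keep? (tail stream)))))
         (else (filter-stream keep? (tail stream))))))

; sieve: keep the head, then sieve the tail with the head's multiples removed
(define sieve
   (lambda (stream)
      (let ((p (head stream)))
         (cons p
               (delay
                  (sieve
                     (filter-stream
                        (lambda (n) (not (eqv? 0 (remainder n p))))
                        (tail stream))))))))

(define primes (sieve (integers-from 2)))

(head primes)                        ; => 2
(head (tail primes))                 ; => 3
(head (tail (tail (tail primes))))   ; => 7
```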


Parameter passing: language examples

  • FORTRAN:
    • before 77: call-by-reference
    • 77 and after: scalar variables are often passed by value-result
  • LISP, ML, and Smalltalk: call-by-sharing (since these languages use a reference model of variables)
    • call-by-value for immutable objects (numbers, characters)
    • call-by-reference for addresses
  • ALGOL 60:
    • call-by-name is default
    • call-by-value is optional
  • ALGOL W: call-by-value-result
  • Simula-67: call-by-value-result
  • C: call-by-value
  • Pascal and Modula-2:
    • call-by-value is default
    • call-by-reference is optional
  • C++:
    • like C, uses call-by-value
    • but also allows reference type parameters, which provide the efficiency of pass-by-reference with in-mode semantics
  • Ada:
    • all three semantic modes are available (by-value, by-result, by-value-result)
    • if out, it cannot be referenced
    • if in, it cannot be assigned
  • Java: like C++, except mostly references, with a few primitives
  • Miranda (predecessor of Haskell): uses call-by-name
  • Haskell: uses call-by-need
  • R scripting language (statistical package used by scientists, including biologists): uses call-by-need (arguments are passed as promises that are forced at most once)


References

    [COPL] R.W. Sebesta. Concepts of Programming Languages. Addison-Wesley, Sixth edition, 2003.
    [EOPL] D.P. Friedman, M. Wand, and C.T. Haynes. Essentials of Programming Languages. MIT Press, Cambridge, MA, Second edition, 2001.
    [PLP] M.L. Scott. Programming Language Pragmatics. Morgan Kaufmann, Amsterdam, Second edition, 2006.
    [PLPP] K.C. Louden. Programming Languages: Principles and Practice. Brooks/Cole, Pacific Grove, CA, Second edition, 2002.
    [SICP] H. Abelson and G.J. Sussman. Structure and Interpretation of Computer Programs. MIT Press, Cambridge, MA, Second edition, 1996.
