CPS 343/543 Lecture notes: Inductive data types and abstract syntax



Coverage: [EOPL2] §2.2 (pp. 42-55)


Aggregate data types

  • array (indexed with integers)
  • record (aka struct; indexed with field names) (e.g., in C:
    struct {
       int x;
       double y;
    };
    
    )
  • undiscriminated union (holds a value of only one of its member types at a time) (e.g., in C:
    /* the compiler allocates only enough memory for the largest member */
    union {
      int x;
      double y;
    };
    
    )
  • discriminated union (contains a tag field that indicates which type the union currently holds) (e.g., in C:
     
    /* The C compiler does not check or enforce that the tag matches the member in use. */
    struct {
       enum tag {i, d} flag;
       union {
          int x;
          double y;
       } U;
    };
    
    )
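
Because the C compiler does not enforce the tag, the programmer must set and check it by hand. The following is a minimal sketch of that discipline, not a definitive implementation. The names number, NUM_INT, NUM_DOUBLE, make_int, make_double, get_int, and as_double are hypothetical and were chosen only for this illustration.

```c
#include <assert.h>

/* Hypothetical names for illustration; only flag, U, x, and y come from the notes. */
typedef enum { NUM_INT, NUM_DOUBLE } num_tag;

typedef struct {
    num_tag flag;          /* records which member of U is currently valid */
    union {
        int x;
        double y;
    } U;
} number;

/* Constructors set the tag and the matching member together. */
number make_int(int x) {
    number n;
    n.flag = NUM_INT;
    n.U.x = x;
    return n;
}

number make_double(double y) {
    number n;
    n.flag = NUM_DOUBLE;
    n.U.y = y;
    return n;
}

/* Checked accessor: the check is written by the programmer, not the compiler. */
int get_int(const number *n) {
    assert(n->flag == NUM_INT);
    return n->U.x;
}

/* Tag-directed dispatch: read only the member the tag says is live. */
double as_double(const number *n) {
    switch (n->flag) {
    case NUM_INT:    return (double)n->U.x;
    case NUM_DOUBLE: return n->U.y;
    }
    return 0.0;  /* not reached when the tag is valid */
}
```

Routing all access through constructors and tag-checking accessors is what turns the undiscriminated union into a usable discriminated one.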


Inductively defined data types

can be represented as unions of record types (i.e., unions of structs)
  • called variant records
  • each record type is called a variant of the union type
  • example of a variant record for a binary tree in C:
    union bintree {
    
       struct {
          int number;
       } leaf;
    
       struct {
          int key;
          /* why are pointers necessary? */
          union bintree* left;
          union bintree* right;
       } interior_node;
    };
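
The union above carries no tag, so a program cannot tell which variant a given value holds. One possible refinement, sketched here under that assumption, wraps the union in a struct with a tag and adds one constructor per variant. The names bintree_tag, LEAF, INTERIOR_NODE, make_leaf, make_interior_node, and free_bintree are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tagged version of the bintree variant record. */
typedef enum { LEAF, INTERIOR_NODE } bintree_tag;

typedef struct bintree {
    bintree_tag tag;                 /* which variant this value is */
    union {
        struct { int number; } leaf;
        struct {
            int key;
            /* Pointers keep the size finite: a struct cannot contain itself by value. */
            struct bintree *left;
            struct bintree *right;
        } interior_node;
    } u;
} bintree;

/* One constructor per variant; each sets the tag and that variant's fields. */
bintree *make_leaf(int number) {
    bintree *t = malloc(sizeof *t);
    if (t == NULL) abort();
    t->tag = LEAF;
    t->u.leaf.number = number;
    return t;
}

bintree *make_interior_node(int key, bintree *left, bintree *right) {
    bintree *t = malloc(sizeof *t);
    if (t == NULL) abort();
    t->tag = INTERIOR_NODE;
    t->u.interior_node.key = key;
    t->u.interior_node.left = left;
    t->u.interior_node.right = right;
    return t;
}

/* Release a whole tree; the tag decides whether there are children to free. */
void free_bintree(bintree *t) {
    if (t == NULL) return;
    if (t->tag == INTERIOR_NODE) {
        free_bintree(t->u.interior_node.left);
        free_bintree(t->u.interior_node.right);
    }
    free(t);
}
```

For example, make_interior_node(1, make_leaf(5), make_leaf(6)) builds a three-node tree whose root has tag INTERIOR_NODE.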
    


(define-datatype ...) and (cases ...)

  • we need a tool for specifying ADTs
  • (define-datatype ...) creates variant records (it is not part of standard Scheme; it is an extension provided with [EOPL2])
  • a constructor is created for each variant to create data values belonging to that variant
  • binary tree ADT example:
    • interface:
      • a 1-argument procedure leaf-node (to create a leaf node)
      • a 3-argument procedure interior-node (to create an interior node)
      • a 1-argument predicate bintree?
    • using the constructors:
      > (leaf-node 5)
      #(struct:leaf-node 5)
      
      > (define myleaf (leaf-node 5))
      
      > (interior-node 'a (leaf-node 5) (leaf-node 6))
      #(struct:interior-node a #(struct:leaf-node 5) #(struct:leaf-node 6))
      
      > (bintree? myleaf)
      #t
      
      > (bintree? (interior-node 'a (leaf-node 5) (leaf-node 6)))
      #t
      
  • data types can be mutually recursive (e.g., recall the grammar for S-expressions)
  • (cases ...) provides a convenient way to manipulate data types created with (define-datatype ...)
  • (cases ...) can be thought of as pattern matching: the fields of the matching variant are bound to symbols
  • make (define-datatype ...) and (cases ...) your friends
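
To connect (cases ...) back to the earlier C examples, a rough C analogue is a switch on the tag of a variant record, with one case per variant and local variables playing the role of the bound symbols. The sketch below sums the leaves of a binary tree this way. It is an illustration in C rather than the Scheme form itself, and the names LEAF_NODE, INTERIOR_NODE, datum, and leaf_sum are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C counterpart of a bintree datatype with two variants. */
typedef enum { LEAF_NODE, INTERIOR_NODE } bintree_tag;

typedef struct bintree {
    bintree_tag tag;
    union {
        struct { int datum; } leaf_node;
        struct {
            char key;
            const struct bintree *left;
            const struct bintree *right;
        } interior_node;
    } u;
} bintree;

/* One case per variant; each case first "binds" the fields of that variant. */
int leaf_sum(const bintree *tree) {
    assert(tree != NULL);
    switch (tree->tag) {
    case LEAF_NODE: {
        int datum = tree->u.leaf_node.datum;
        return datum;
    }
    case INTERIOR_NODE: {
        const bintree *left = tree->u.interior_node.left;
        const bintree *right = tree->u.interior_node.right;
        return leaf_sum(left) + leaf_sum(right);
    }
    }
    return 0;  /* not reached when the tag is valid */
}
```

Applied to the C counterpart of (interior-node 'a (leaf-node 5) (leaf-node 6)), leaf_sum returns 11. Unlike (cases ...), the switch does not stop the programmer from reading the fields of the wrong variant inside a case.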


Abstract syntax

  • simple grammar for λ-calculus expressions revisited
  • concrete vs. abstract syntax (external vs. internal representation)
  • expression datatype
  • one-to-one mapping between production rules and constructors



    (regenerated from [EOPL2] p. 49)
  • use of the expression datatype
  • makes occurs-free? more readable; eliminates obscure and lengthy car-cdr chains
  • abstract syntax tree (AST): like a parse tree, except uses abstract syntax rather than concrete syntax
  • AST for (lambda (x) (f (f x))):


    (regenerated from [EOPL2] Fig. 2.1, p. 49)
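
As a hedged illustration of how an expression datatype removes car-cdr chains, the sketch below represents λ-calculus expressions as a tagged variant record in C, with one variant per production rule, and defines occurs_free by dispatching on the tag. The C names (expression, VAR_EXP, LAMBDA_EXP, APP_EXP, make_var_exp, make_lambda_exp, make_app_exp, occurs_free) are assumptions for this example and are not taken from [EOPL2].

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* One variant per production:
   <expression> ::= <identifier>
                  | (lambda (<identifier>) <expression>)
                  | (<expression> <expression>)            */
typedef enum { VAR_EXP, LAMBDA_EXP, APP_EXP } expression_tag;

typedef struct expression {
    expression_tag tag;
    union {
        struct { const char *id; } var;
        struct { const char *id; const struct expression *body; } lambda;
        struct { const struct expression *rator, *rand; } app;
    } u;
} expression;

/* One constructor per variant. */
expression make_var_exp(const char *id) {
    expression e;
    e.tag = VAR_EXP;
    e.u.var.id = id;
    return e;
}

expression make_lambda_exp(const char *id, const expression *body) {
    expression e;
    e.tag = LAMBDA_EXP;
    e.u.lambda.id = id;
    e.u.lambda.body = body;
    return e;
}

expression make_app_exp(const expression *operator_exp, const expression *operand_exp) {
    expression e;
    e.tag = APP_EXP;
    e.u.app.rator = operator_exp;
    e.u.app.rand = operand_exp;
    return e;
}

/* Does var occur free in exp? Fields are reached by name, not by position. */
bool occurs_free(const char *var, const expression *exp) {
    switch (exp->tag) {
    case VAR_EXP:
        return strcmp(exp->u.var.id, var) == 0;
    case LAMBDA_EXP:
        return strcmp(exp->u.lambda.id, var) != 0
            && occurs_free(var, exp->u.lambda.body);
    case APP_EXP:
        return occurs_free(var, exp->u.app.rator)
            || occurs_free(var, exp->u.app.rand);
    }
    return false;  /* not reached when the tag is valid */
}
```

Building the AST for (lambda (x) (f (f x))) with these constructors, f occurs free in the whole expression while x does not, because the lambda binds x.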

  • programs which process other programs, such as interpreters and compilers, are syntax directed
  • parsing is the process of converting a string of characters representing a program into an AST
    • performed by a program called a parser (or syntactic analyzer)
    • independent of what you are going to do with the tree
  • it is easier to parse list expressions (S-expressions) into abstract syntax than to parse raw strings
  • Scheme (read) facility
  • parse-expression: converts concrete syntax to abstract syntax (or S-expression to abstract data type)
  • unparse-expression: converts abstract syntax to concrete syntax (or abstract data type to S-expression)
  • use of abstract syntax makes data representing code easier to manipulate, and makes a program that processes code (i.e., that processes another program) more readable
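
The two conversions can be sketched in C for the simple λ-calculus grammar. This is a minimal illustration under several assumptions: input is well formed, identifiers are shorter than the hypothetical limit ID_MAX, and the parser reads characters directly because C has no counterpart to Scheme's (read). The C names parse_expression and unparse_expression mirror the bullets above; all other names are hypothetical. Freeing the tree is omitted for brevity.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ID_MAX 16  /* assumed maximum identifier length, including '\0' */

typedef enum { VAR_EXP, LAMBDA_EXP, APP_EXP } expression_tag;

typedef struct expression {
    expression_tag tag;
    union {
        struct { char id[ID_MAX]; } var;
        struct { char id[ID_MAX]; struct expression *body; } lambda;
        struct { struct expression *rator, *rand; } app;
    } u;
} expression;

static void skip_spaces(const char **s) {
    while (isspace((unsigned char)**s)) (*s)++;
}

/* Consume one expected delimiter; well-formed input is assumed. */
static void expect(const char **s, char c) {
    skip_spaces(s);
    assert(**s == c);
    (*s)++;
}

/* Copy an identifier into out, silently truncating past ID_MAX - 1 characters. */
static void read_id(const char **s, char *out) {
    size_t n = 0;
    skip_spaces(s);
    while (**s != '\0' && **s != '(' && **s != ')' && !isspace((unsigned char)**s)) {
        if (n + 1 < ID_MAX) out[n++] = **s;
        (*s)++;
    }
    out[n] = '\0';
}

static expression *new_expression(expression_tag tag) {
    expression *e = calloc(1, sizeof *e);
    if (e == NULL) abort();
    e->tag = tag;
    return e;
}

/* Concrete syntax (text) to abstract syntax (tree); advances *s past the expression. */
expression *parse_expression(const char **s) {
    expression *e;
    skip_spaces(s);
    if (**s != '(') {                        /* <identifier> */
        e = new_expression(VAR_EXP);
        read_id(s, e->u.var.id);
        return e;
    }
    expect(s, '(');
    skip_spaces(s);
    if (strncmp(*s, "lambda", 6) == 0
        && ((*s)[6] == '(' || isspace((unsigned char)(*s)[6]))) {
        e = new_expression(LAMBDA_EXP);      /* (lambda (<identifier>) <expression>) */
        *s += 6;
        expect(s, '(');
        read_id(s, e->u.lambda.id);
        expect(s, ')');
        e->u.lambda.body = parse_expression(s);
    } else {                                 /* (<expression> <expression>) */
        e = new_expression(APP_EXP);
        e->u.app.rator = parse_expression(s);
        e->u.app.rand = parse_expression(s);
    }
    expect(s, ')');
    return e;
}

/* Append text to the NUL-terminated string in out without overflowing cap bytes. */
static void append(char *out, size_t cap, const char *text) {
    size_t used = strlen(out);
    snprintf(out + used, cap - used, "%s", text);
}

/* Abstract syntax to concrete syntax; out must start as an empty string. */
void unparse_expression(const expression *e, char *out, size_t cap) {
    switch (e->tag) {
    case VAR_EXP:
        append(out, cap, e->u.var.id);
        break;
    case LAMBDA_EXP:
        append(out, cap, "(lambda (");
        append(out, cap, e->u.lambda.id);
        append(out, cap, ") ");
        unparse_expression(e->u.lambda.body, out, cap);
        append(out, cap, ")");
        break;
    case APP_EXP:
        append(out, cap, "(");
        unparse_expression(e->u.app.rator, out, cap);
        append(out, cap, " ");
        unparse_expression(e->u.app.rand, out, cap);
        append(out, cap, ")");
        break;
    }
}
```

Parsing the text (lambda (x) (f (f x))) and then unparsing the resulting tree into a sufficiently large buffer reproduces the same text. The parser knows nothing about what will later be done with the tree, which matches the point above that parsing is independent of later processing.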


Summary

  • discriminated union
  • inductive data types (e.g., variant record: a union of structs)
  • define-datatype constructs inductive data types (specifically, variant records)
  • cases decomposes inductive data types
  • concrete syntax vs. abstract syntax


References

    [EOPL2] D.P. Friedman, M. Wand, and C.T. Haynes. Essentials of Programming Languages. MIT Press, Cambridge, MA, Second edition, 2001.
