CPS 343/543 Lecture notes: Introduction to Haskell



Coverage: [COPL6] §15.8 (pp. 607-611), [PIH], and [PLPP] §§11.5-11.6 (pp. 507-520)


Key language concepts in Haskell

  • a statically-scoped, strongly-typed, purely functional language with a rich type system, built-in type inference algorithm, and lazy evaluation
  • named after the logician Haskell B. Curry, a pioneer of combinatory logic, which is closely related to λ-calculus (the mathematical theory of functions on which all functional languages are based)
  • developed at Yale University and University of Glasgow
  • descendant of Miranda
  • designed to be purely functional and thus brings programming closer to mathematics
  • intended for those who want to get work done quickly and, thus, designed to have a crisp, terse syntax that aids writability (e.g., there are no define or lambda keywords; cons has been reduced from cons in Scheme to :: in ML to : in Haskell; and the programmer almost never needs to type a ; (semicolon) in a Haskell program)
  • pattern-directed invocation (pattern matching, pattern-action rule oriented style of programming)
  • higher-order functions (map, ., foldl, foldr)
  • currying (curry, uncurry)
  • strong typing (Haskell is a strongly-typed programming language)
  • type inference (even functions have types!)
  • rich and expressive type system (ML and Haskell have arguably the most powerful and elegant type system in all of programming languages)
  • a useful general-purpose programming language, it incorporates
    • functional features from LISP,
    • rule-based programming from PROLOG,
    • terse syntax, and
    • data abstraction from Smalltalk and C++
  • Haskell is a functional language with declarative features (e.g., pattern-directed invocation, guards, list comprehensions, mathematical notation)
  • the language pH is a parallel dialect of Haskell


Core Haskell

  • simple expressions
  • primitive types: Bool, Char, String (a synonym for [Char]), Int (fixed precision), Integer (arbitrary precision), Float (single precision), Double (double precision)
  • homogeneous lists
  • list operators: : (cons), ++ (append)
  • higher-order functions
    • functions that take functions as arguments or return a function as a value
    • for instance, map, ., foldl, foldr
    • `allow common programming patterns to be encapsulated as functions' [PIH], p. 61
    • can be used to define domain-specific languages
  • powerful type system
  • :: operator associates a type with a value or expression (e.g., 1 :: Int)


Primitive types in Haskell

    :type expression (also :t expression) returns the type of expression
    Prelude> :t 3
    3 :: Num a => a
    Prelude> :t 3.3
    3.3 :: Fractional a => a
    Prelude> :t True
    True :: Bool
    Prelude> :t 'a'
    'a' :: Char
    Prelude> :t "hello world"
    "hello world" :: String
    Prelude>
    


Essentials

  • character conversions: ord and chr functions
  • string concatenation: ++ operator (e.g., "hello " ++ "world")
  • basic arithmetic: +, -, *, / (for reals), div (prefix; for integers) and mod
  • comparison operators: ==, <, >, <=, >=, and /= to compare integers, reals, characters, or strings
  • boolean operators: ||, &&, and not
  • if-then-else expressions: there is no if without an else, why?
  • converting a prefix operator to infix (use backquotes or grave quotes):
    div 7 2 = 7 `div` 2
    mod 7 2 = 7 `mod` 2
  • converting an infix operator to prefix (enclose operator in parentheses):
    7 + 2 = (+) 7 2
    7 - 2 = (-) 7 2
  • comments:
    • single line: -- this is a comment until the end of the line
    • multi-line:
      {-
      a
      multi-line
      comment
      -}
      
    • multi-line comments can nest:
      {- nested
         {-
         comment
         -}
      -}
      
  • :? displays all available commands
  • running a Haskell program
    • $ hugs program.hs # from the command line
    • Prelude> :load program.hs from within the system (or :l)
    • :reload (or :r) reloads the program
  • :quit (or :q) quits the interpreter (or use the EOF character; on UNIX systems <ctrl-d>)
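
The essentials above can be collected into one small file. This is a minimal sketch: the names classify, half, sumPrefix, and nextChar are hypothetical, chosen only for illustration, and ord and chr are imported from Data.Char, which GHC requires.

```haskell
import Data.Char (chr, ord)

-- if-then-else is an expression, so both branches must yield a value
classify :: Int -> String
classify n = if n `mod` 2 == 0 then "even" else "odd"

-- a prefix function written infix with backquotes
half :: Int -> Int
half n = n `div` 2

-- an infix operator written prefix with parentheses
sumPrefix :: Int
sumPrefix = (+) 7 2

-- character conversion: move to the next character code
nextChar :: Char -> Char
nextChar c = chr (ord c + 1)
```

Loading this file with :load and evaluating classify 7 at the prompt is one way to experiment with each construct.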


Lists

  • [] is the empty list
  • lists are homogeneous (e.g., [2,3,4])
  • tuples are heterogeneous (see below)
  • can have lists of tuples, but each tuple in the list must have the same type
  • : is the cons operator
    • takes a head (element) and a tail (list)
    • x:xs is a list of at least one element
    • x:[] is a list of exactly one element; same as [x]
    • x:y:excess is a list of at least two elements
    • x:y:[] is a list of exactly two elements; same as [x,y]
  • head and tail functions (analogs of car and cdr, respectively)
  • !! selection of the nth element from a list using zero-based indexing (e.g., [1,2,3,4,5]!!3)
  • ++ is the append operator
    • takes two lists
    • runs in time proportional to the length of the first list, so it is also inefficient, as in Scheme
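
The list patterns and operators above can be tried out with a short sketch. The names shape, built, third, and joined are hypothetical and exist only for this illustration.

```haskell
-- dispatch on the shape of a list using the cons patterns described above
shape :: [a] -> String
shape []       = "empty"
shape (_:[])   = "exactly one element"
shape (_:_:[]) = "exactly two elements"
shape (_:_:_)  = "at least three elements"  -- two-element lists were caught above

-- building a list with cons; the same list as [2,3,4]
built :: [Int]
built = 2 : 3 : 4 : []

-- zero-based selection
third :: Int
third = built !! 2

-- append takes two lists
joined :: [Int]
joined = [1,2] ++ built
```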


Tuples

  • can be thought of as a heterogeneous list or struct
  • idea from relational databases
  • a two-element tuple is called a pair
  • a three-element tuple is called a triple:
    Prelude> :t (1, "Lewis", 3.76)
    (1,"Lewis",3.76) :: (Fractional a, Num b) => (b,[Char],a)
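
A brief sketch of working with tuples follows. The synonym StudentRec and the functions nameOf and swapPair are hypothetical names introduced for illustration; fst and snd are Prelude functions that apply to pairs only.

```haskell
-- a hypothetical record as a triple: (id, name, grade point average)
type StudentRec = (Int, String, Double)

lewis :: StudentRec
lewis = (1, "Lewis", 3.76)

-- components of a triple are extracted by pattern matching
nameOf :: StudentRec -> String
nameOf (_, n, _) = n

-- fst and snd select the components of a pair
swapPair :: (a, b) -> (b, a)
swapPair p = (snd p, fst p)
```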
    


Functions

  • have types
  • use of patterns in function arguments is called pattern-directed invocation
  • a function defined on a tuple pattern takes a single argument (the tuple), and Haskell reports its type accordingly:
    add (x,y) = x+y
    Main> :t add
    add :: Num a => (a,a) -> a
    
  • concept of a type variable which must begin with a lower case letter, typically a, b, and so on
  • polymorphism
  • concept of a class constraint, C a, where C is the name of a class and a is a type variable:
    Prelude> :t (+)
    (+) :: Num a => a -> a -> a
    Prelude>
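
The two types shown above, (a,a) -> a and a -> a -> a, correspond to the tupled and curried styles of definition. The sketch below contrasts them; all names other than curry and uncurry (which are Prelude functions) are hypothetical.

```haskell
-- tupled form: a single argument that happens to be a pair
addPair :: Num a => (a, a) -> a
addPair (x, y) = x + y

-- curried form: takes its arguments one at a time
addCurried :: Num a => a -> a -> a
addCurried x y = x + y

-- partial application of the curried form yields a new function
increment :: Int -> Int
increment = addCurried 1

-- curry and uncurry convert between the two forms
addPair' :: Int -> Int -> Int
addPair' = curry addPair

addCurried' :: (Int, Int) -> Int
addCurried' = uncurry addCurried
```

The curried form is usually preferred in Haskell because it supports partial application directly, as increment shows.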
    


Some user-defined functions

function, parameter, and value names must begin with a lower case letter
    {-
    simple functions
    pattern-directed invocation
    pattern matching
    like cases in Scheme
    -}
    square x = x*x
    
    fact 0 = 1
    fact n = n * fact (n-1)
    
    -- alternatively, without explicit recursion
    fact n = product [1..n]
    
    sumlist [] = 0
    sumlist (x:xs) = x + sumlist xs
    
    fib 0 = 1
    fib 1 = 1
    fib n = fib (n-1) + fib (n-2)
    
    -- gcd is built in Prelude.hs
    gcd1 u 0 = u
    gcd1 u v = gcd1 v (mod u v)
    
    -- alternatively, with a conditional expression
    gcd1 u v = if v == 0 then u else gcd1 v (mod u v)
    
    -- with pattern-directed invocation
    -- reverse is built-in Prelude.hs
    reverse1 [] = []
    reverse1 (h:t) = reverse1 t ++ [h]
    
    -- without pattern-directed invocation need a head and tail,
    -- or an if-then-else and head and tail;
    -- head and tail are the analogs of car and cdr, respectively
    reverse2 [] = []
    reverse2 lst = reverse2(tail(lst)) ++ [head(lst)]
    
    reverse3 (lst) =
       if lst == [] then []
       else reverse3 (tail(lst)) ++ [head(lst)]
    
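    -- an as-pattern (lst@) names the whole argument as well as its parts;
    -- (x:xs) never matches [], so the empty list needs its own equation
    reverse4 [] = []
    reverse4 lst@(x:xs) = reverse4 xs ++ [x]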
    
    -- inspired by [EMLP] pp. 84-88;
    -- use difference lists technique
    rev1 [] m = m
    rev1 (x:xs) ys = rev1 xs (x:ys)
    
    reverse5 lst = rev1 lst []
    
    -- elem is the Haskell member function in Prelude.hs
    member _ []  = False
    member e (x:xs) = (x == e) || member e xs
    
    insertineach _ [] = []
    insertineach item (x:xs) = (item:x):insertineach item xs
    
    -- notice how use of "let"
    -- prevents re-computation
    powerset [] = [[]]
    powerset (x:xs) =
       let
          temp = powerset xs
       in
          (insertineach x temp) ++ temp
    
    -- can be similarly written with where
    powerset [] = [[]]
    powerset (x:xs) = (insertineach x temp) ++ temp
                      where temp = powerset xs
    


Mergesort

    -- ref. [EMLP] section 3.4 
    split [] = ([], [])
    split [x] = ([], [x])
    split (x:y:excess) =
       let
          (left, right) = split excess
       in
          (x:left, y:right)
    
    merge lst [] = lst
    merge [] lst = lst
    merge (l:ls) (r:rs) =
       if l < r then l:(merge ls (r:rs))
       else r:(merge (l:ls) rs)
    
    mergesort [] = []
    mergesort [x] = [x]
    mergesort lst =
       let
          -- split it
          (left, right) = split lst
    
          -- mergesort each side
          leftsorted = mergesort left
          rightsorted = mergesort right
       in
          -- merge
          merge leftsorted rightsorted
    
    -- alternatively
    mergesort [] = []
    mergesort [x] = [x]
    mergesort lst =
        -- merge
       merge leftsorted rightsorted
       where
          -- split it
          (left, right) = split lst
    
          -- mergesort each side
          leftsorted = mergesort left
          rightsorted = mergesort right
    


Lazy Evaluation

    -- main theme: lazy evaluation separates control from data in computations
    -- and thereby enables 'modular programming'
    
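    -- f never evaluates its argument, so it is guaranteed to return
    -- successfully, e.g., f (1 / 0) or f (div 1 0)
    -- (1 / 0 by itself yields Infinity on floating-point types; div 1 0 raises an error)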
    f x = 2
    
    noerror = f (1 / 0)
    
    g x y = x+y
    
    -- g (9-3) (g 34 3)
    -- (9-3)+(g 34 3)
    -- 6+(34+3)
    -- 6+37
    -- 43
    
    -- lazy evaluation leads to potentially infinite lists (so called 'streams') or,
    -- more generally, potentially infinite data structures, such as trees
    ones = 1 : ones
    
    -- one for the money,
    -- two for the show,
    -- three to get ready,
    -- four to go
    
    -- arithmetic sequences such as [1,1..] are shorthand (syntactic sugar)
    -- cannot re-define ones, why?
    -- ones1 = [1,1..]
    
    nonnegatives = [0..]
    --positives = [1,2..]
    positives = [1..]
    evens = [2,4..]
    odds = [1,3..]
    
    take1 0 _ = []
    take1 _ [] = []
    take1 n (h:t) = h : take1 (n-1) t
    
    
    drop1 0 l = l
    drop1 _ [] = []
    drop1 n (_:t) = drop1 (n-1) t
    
    squares = [n*n | n <- positives]
    
    result = elem 16 squares
    -- does not terminate: 15 is not a square, so elem searches the infinite list forever
    badresult = elem 15 squares
    
    -- guarded equations are an alternative
    -- to conditional expressions
    -- guarded equations tend to be more readable
    -- than conditional expressions
    sortedmember (x:xs) e
       | x < e = sortedmember xs e
       | x == e = True
       | otherwise = False
    
    goodresult = sortedmember squares 15
    
    -- Sieve of Eratosthenes for enumerating prime #'s
    -- see wikipedia page for imperative version of algorithm
    sieve (two:lon) = two : sieve [n | n <- lon, (mod n two) /= 0]
    primes = sieve [2..]
    
    first100primes = take 100 primes
    
    -- use caution; don't try this at home in C, C++, or Java
    quicksort [] = []
    quicksort (h:t) = quicksort [x | x <- t, x <= h] 
                      ++ [h] ++
                      quicksort [x | x <- t, x > h]
    
    sorted = quicksort [9,6,8,7,10,3,4,2,1,5]
    
    first300primes = take 300 primes
    unsorted = reverse first300primes
    
    cool = quicksort (reverse first100primes)
    cooler = quicksort unsorted
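
The badresult example above never terminates because elem cannot know that squares is ascending. One possible remedy, sketched here, is to cut the infinite list off with takeWhile before searching. The name elemSorted is hypothetical, and the approach assumes the list is in ascending order.

```haskell
-- an infinite list of squares, produced lazily
squares :: [Integer]
squares = [n * n | n <- [1 ..]]

-- assumes xs is ascending: only the finite prefix of values <= e is searched
elemSorted :: Ord a => a -> [a] -> Bool
elemSorted e xs = e `elem` takeWhile (<= e) xs

-- take demands only the first five elements of the infinite list
firstFive :: [Integer]
firstFive = take 5 squares
```

Here takeWhile supplies the control (when to stop) while squares supplies the data, which is the separation of control from data that lazy evaluation makes possible.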
    


Strict functions

  • preface any function argument with $! to force its evaluation and thereby use applicative-order evaluation:
    square $! (3*2) =
    square $! 6 =
    square 6 = 
    36
    
  • strict version of foldl to force the evaluation of the accumulator (ref. [PIH] p. 136):
    foldl' f v [] = v
    foldl' f v (x:xs) = ((foldl' f) $! (f v x)) xs
    
  • `strict application mainly used to improve the space performance of programs' [PIH], p. 135
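
The following sketch contrasts lazy and strict left folds using the library version of foldl' from Data.List. The names sumLazy, sumStrict, constTwo, and lazyCall are hypothetical.

```haskell
import Data.List (foldl')

-- lazy left fold: without optimization, may build a long chain of
-- unevaluated additions before collapsing it
sumLazy :: [Integer] -> Integer
sumLazy = foldl (+) 0

-- strict left fold: forces the accumulator at every step
sumStrict :: [Integer] -> Integer
sumStrict = foldl' (+) 0

-- a function that ignores its argument
constTwo :: a -> Int
constTwo _ = 2

-- the argument is never evaluated, so this is simply 2;
-- constTwo $! (undefined :: Int) would instead raise an exception
lazyCall :: Int
lazyCall = constTwo (undefined :: Int)
```

Both folds compute the same sum; the difference lies in how much memory is needed along the way.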


Anonymous or literal functions

    (\n -> n+1) (5)
    addtwo n = n + 2
    map addtwo [1,2,3]
    map (\n -> n+2) [1,2,3]
    
    why use an anonymous function? see definition of string2int below


Mapping

    ourmap f [] = []
    ourmap f (x:xs) = (f x):(ourmap f xs)
    
    square n = n*n
    
    ans = ourmap square [1,2,3,4,5,6]
    
    squarelist lon = map square lon
    
    ans2 = squarelist [1,2,3,4,5,6]
    
    -- vs. the point-free form: squarelist = map square
    


Functional composition

    Haskell's composition operator is .
    add3 x = x+3
    mult2 x = x*2
    
    add3_then_mult2 = mult2 . add3
    mult2_then_add3 = add3 . mult2
    
    ans3 = add3_then_mult2 3
    
    ans4 = mult2_then_add3 3
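
To see the order in which composed functions are applied, the two results can be worked out step by step. This sketch repeats add3 and mult2 so that it is self-contained; compose is a hypothetical name for a user-defined equivalent of the . operator.

```haskell
add3, mult2 :: Int -> Int
add3 x = x + 3
mult2 x = x * 2

-- (f . g) x = f (g x): the right-hand function is applied first
ans3 :: Int
ans3 = (mult2 . add3) 3      -- mult2 (add3 3) = mult2 6 = 12

ans4 :: Int
ans4 = (add3 . mult2) 3      -- add3 (mult2 3) = add3 6 = 9

-- a user-defined version of composition
compose :: (b -> c) -> (a -> b) -> a -> c
compose f g = \x -> f (g x)
```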
    


Fooooolding lists

    use foldr and foldl

    best to think of foldl and foldr non-recursively [PIH] p. 68
    -- foldl and foldr take a prefix binary function,
    -- a base value of recursion, and a list, in that order
    
    -- foldl :: (a -> b -> a) -> a -> [b] -> a
    -- foldl associates from the left:
    -- +(+(+(+(0,1),2),3),4) = 10
    -- think of foldl as using the accumulator approach
    ans5 = foldl (+) 0 [1,2,3,4]
    
    -- -(-(-(-(0,1),2),3),4) = -10
    ans6 = foldl (-) 0 [1,2,3,4]
    
    -- foldr :: (a -> b -> b) -> b -> [a] -> b
    -- foldr associates from the right:
    -- replace each : in (1 : (2 : (3 : (4 : [])))) with - and [] with 0:
    -- (1 - (2 - (3 - (4 - 0)))) = -2
    ans7 = foldr (-) 0 [1,2,3,4]
    
    -- for reasons of efficiency, prefer a left fold
    -- (in practice the strict foldl', since lazy foldl can accumulate
    -- unevaluated thunks) to foldr when they produce the same result
    
    sumlist = foldl (+) 0
    
    -- ref. [PIH] pp. 65-66
    length1 = foldr (\_ n -> n+1) 0
    
    -- cons reversed
    snoc x xs = xs ++ [x]
    
    reverse6a [] = []
    reverse6a (x:xs) = snoc x (reverse6a xs)
    
    reverse6 = foldr snoc []
    
    -- ref. [PIH] p. 148
    reverse7 = foldl (\xs x -> x : xs) []
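
Although it is best to think of the folds non-recursively, their behavior can be checked against straightforward recursive definitions. This is a sketch for lists only; myFoldr and myFoldl are hypothetical names, not the library definitions.

```haskell
-- right fold: the function is applied to the head and the folded tail
myFoldr :: (a -> b -> b) -> b -> [a] -> b
myFoldr _ v []     = v
myFoldr f v (x:xs) = f x (myFoldr f v xs)

-- left fold: the second argument acts as an accumulator
myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl _ v []     = v
myFoldl f v (x:xs) = myFoldl f (f v x) xs

rightResult :: Int
rightResult = myFoldr (-) 0 [1,2,3,4]   -- 1 - (2 - (3 - (4 - 0))) = -2

leftResult :: Int
leftResult = myFoldl (-) 0 [1,2,3,4]    -- (((0 - 1) - 2) - 3) - 4 = -10
```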
    


Putting it all together: higher-order functions

Use these concepts (higher-order functions) to define a function string2int which converts a string representation of an integer to the corresponding integer:
    Main> :t string2int
    string2int :: [Char] -> Int
    Main> string2int "0"
    0
    Main> string2int "123"
    123
    Main> string2int "321"
    321
    Main> string2int "5643452"
    5643452
    
Hint: only requires 2 lines of code, or only 1 if you use a literal function (see below).
    -- ord is defined in Data.Char; GHC needs: import Data.Char (ord)
    helper oursum initChar  = (ord initChar) - (ord '0') + 10*oursum
    
    string2int l = foldl helper 0 l
    
    string2int2 = foldl (\r c -> (ord c) - (ord '0') + 10*r) 0
    


Overloading

    contrast square function in Haskell with that in ML
    square n = n*n
    
    Main> :t square
    square :: Num a => a -> a
    -- the type of square is a 'qualified type'
    -- and Num is a 'type class'
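
Because the type of square is qualified by Num, one definition serves every numeric type. The sketch below uses it at two types; squareInt and squareDouble are hypothetical names. In Standard ML, by contrast, the analogous definition is assigned the single type int -> int.

```haskell
square :: Num a => a -> a
square n = n * n

-- used at type Int
squareInt :: Int
squareInt = square 7

-- used at type Double
squareDouble :: Double
squareDouble = square 1.5
```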
    


Types

    Prelude> :t [1,2,3,4]
    [1,2,3,4] :: Num a => [a]
    Prelude> :t [1.1,2.2,3.3,4.4]
    [1.1,2.2,3.3,4.4] :: Fractional a => [a]
    Prelude> :t head
    head :: [a] -> a
    Prelude> :t isDigit
    isDigit :: Char -> Bool
    
  • type introduces a new name for an existing type:
    type String = [Char]
    type Point = (Int, Int)
    
    -- can be parameterized (like a template in C++)
    type Mapping a b = [(a,b)]
    
    -- recursive type synonyms are not permitted [PIH] p. 99
    -- type Tree = (Int, [Tree]) 
    
  • data introduces a new type:
    -- a variant record or a union of structs
    -- comparable to define-datatype
    data Bool = True | False
    data Colors = Red | Green | Blue | Orange | Yellow
    
    decorate :: Mapping Colors Int
    decorate = [(Red,1), (Blue,2)]
    
    -- can be parameterized (like a template in C++)
    data Student a = New | Id a
    
    -- can be recursive
    data Natural = Zero | Succ Natural
    data IntTree = Leaf Int | Node IntTree Int IntTree
    
    -- can be parameterized and recursive
    data List a = Nil | Cons a (List a)
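
Functions over recursive data types are typically written with one equation per constructor. A minimal sketch using the Natural and List declarations above follows; toInt, addNat, and toList are hypothetical helper names.

```haskell
data Natural = Zero | Succ Natural

data List a = Nil | Cons a (List a)

-- convert a Peano-style natural number to an Int
toInt :: Natural -> Int
toInt Zero     = 0
toInt (Succ n) = 1 + toInt n

-- addition by recursion on the first argument
addNat :: Natural -> Natural -> Natural
addNat Zero m     = m
addNat (Succ n) m = Succ (addNat n m)

-- convert the user-defined list to a built-in list
toList :: List a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```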
    


Haskell's type system

    Haskell has a very powerful type system, which includes language support for creating new types.
    -- like typedef in C
    -- type and constructor names must begin with a capital letter
    type Id = Int
    
    type Name = String
    
    type Age = Int
    
    type Gender = Char
    
    type Rate = Float
    
    --type Employee = (Id, Name, Gender, Age, Float)
    type Employee = (Id, Name, Gender, Age, Rate)
    
    lucia :: Employee
    lucia = (1, "Lucia", 'f', 46, 45.56)
    
    lewis :: Employee
    lewis = (2, "Lewis", 'm', 64, 7.25)
    
    type Company = [Employee]
    
    udcps :: Company
    udcps = [lucia, lewis]
    
    type Point = (Float, Float)
    
    type Rectangle = (Point, Point, Point, Point)
    
    type Mapping a b = [(a,b)]
    
    emp_mapping :: Mapping Int [Char]
    emp_mapping = [(1, "Lucia"), (2, "Lewis")]
    
    flr :: Mapping Float Int
    flr = [(2.1, 2), (2.2, 2)]
    
    data Daysofourlives = Sun | Mon | Tue | Wed | Thu | Fri | Sat
                          deriving (Show,Eq)
    
    onholiday :: Daysofourlives -> Bool
    onholiday day = (day == Sun) || (day == Sat)
    
    ans = onholiday Mon
    ans2 = onholiday Sat
    
    -- like the define-datatype construct from [EOPL2] or ML's datatype
    data Bintreeofints = Empty | Node Bintreeofints Int Bintreeofints
                         deriving (Show,Eq)
    
    ourbintreeofints :: Bintreeofints
    ourbintreeofints = (Node
       (Node
          (Node Empty 1 Empty)
          7
          (Node Empty 2 Empty))
       6
       (Node
          (Node Empty 3 Empty)
          8
          (Node
             (Node Empty 5 Empty)
             4
             (Node Empty 10 Empty)
          )
       ))
    
    -- if inorder returns a sorted list,
    -- then its parameter is a binary search tree
    -- inorder :: Bintreeofints -> [Int]
    inorder Empty = []
    inorder (Node left i right) =
       (inorder left) ++ [i] ++ (inorder right)
    
    preorder Empty = []
    preorder (Node left i right) =
       [i] ++ (preorder left) ++ (preorder right)
    
    postorder Empty = []
    postorder (Node left i right) =
       (postorder left) ++ (postorder right) ++ [i]
    
    ans3 = inorder ourbintreeofints
    ans4 = preorder ourbintreeofints
    ans5 = postorder ourbintreeofints
    
    -- parameterized datatype 
    data Bintree a = Empty2 | Node2 (Bintree a) a (Bintree a)
                     deriving (Show,Eq)
    
    ourbintree = (Node2 
       (Node2
          (Node2 Empty2 "the" Empty2)
          "type"
          (Node2 Empty2 "is" Empty2))
       "cat"
       (Node2
          (Node2 Empty2 "called" Empty2)
          "bintree"
          (Node2 Empty2 "and" Empty2)))
    
    -- declaring the type of a function is not required, but
    -- 'can be used to resolve ambiguities or restrict the type
    -- of the function beyond what the Hindley-Milner type checking
    -- would infer' [PLPP], p. 514
    inorder2 :: Bintree a -> [a]
    inorder2 Empty2 = []
    inorder2 (Node2 left i right) =
       (inorder2 left) ++ [i] ++ (inorder2 right)
    
    preorder2 Empty2 = []
    preorder2 (Node2 left i right) =
       [i] ++ (preorder2 left) ++ (preorder2 right)
    
    postorder2 Empty2 = []
    postorder2 (Node2 left i right) =
       (postorder2 left) ++ (postorder2 right) ++ [i]
    
    ans6 = inorder2 ourbintree
    ans7 = preorder2 ourbintree
    ans8 = postorder2 ourbintree
    


Haskell type classes and instances

  • a type class is a collection of the names and types of the functions that every instance of the class must support
  • like a (Java) interface from object-oriented programming
  • some pre-defined Haskell classes: Eq, Show, Ord, Num, Real
  • the Haskell numeric type class hierarchy (figure not reproduced here; regenerated from [PLPP] Fig. 11.10, p. 519)
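
A user-defined class and its instances might look like the following sketch. The class Shape, the types Circle and Rect, and the function doubleArea are all hypothetical and serve only to illustrate the interface analogy.

```haskell
-- every instance must supply area; describe has a default definition
class Shape a where
  area :: a -> Double
  describe :: a -> String
  describe x = "shape with area " ++ show (area x)

data Circle = Circle Double
data Rect = Rect Double Double

instance Shape Circle where
  area (Circle r) = pi * r * r

-- this instance overrides the default describe
instance Shape Rect where
  area (Rect w h) = w * h
  describe _ = "rectangle"

-- a class constraint lets one function work for any instance
doubleArea :: Shape a => a -> Double
doubleArea s = 2 * area s
```

As with a Java interface, code such as doubleArea depends only on the operations the class promises, not on any particular instance.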


Monads: side-effect-free I/O

An extension of the idea of potentially infinite data structures or streams made possible through lazy evaluation.

The term monad comes from category theory, a branch of mathematics.
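
As a small illustration of monadic sequencing, the sketch below uses do-notation first with the Maybe monad and then with I/O actions. The names safeDiv, calc, and greet are hypothetical.

```haskell
-- division that signals failure with Nothing instead of raising an error
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- do-notation sequences the steps; any Nothing short-circuits the rest
calc :: Int -> Int -> Int -> Maybe Int
calc a b c = do
  q <- safeDiv a b
  r <- safeDiv q c
  return (q + r)

-- the same notation sequences I/O actions; greet name is a value of
-- type IO () that describes the output without performing it
greet :: String -> IO ()
greet name = do
  putStrLn ("hello, " ++ name)
  putStrLn "goodbye"
```

An action such as greet "world" is carried out only when it is run, for example from main or at the interpreter prompt.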


Summary

(ref. [WFPM])

Higher-order functions (which allow functions to be glued together) and lazy evaluation (which allows whole programs to be glued together) are the glue that lets us combine program components in creative ways to produce concise, malleable, and reusable programs. They thereby enable modular programming, which makes programs easier to debug, maintain, and reuse.


Comparison of ML and Haskell

    concept                  ML                                   Haskell
    -----------------------  -----------------------------------  --------------------------------------------------
    lists                    homogeneous                          homogeneous
    cons                     ::                                   :
    append                   @                                    ++
    renaming parameters      lst as (x::xs)                       lst@(x:xs)
    functional redefinition  permitted                            not permitted
    dynamic dispatch         |
    parameter passing        call-by-value;                       call-by-need;
                             strict;                              non-strict;
                             applicative-order evaluation         normal-order evaluation
    functional composition   o                                    .
    infix → prefix           (op +)                               (+)
    curried form             omit parentheses                     omit parentheses
    type declaration         :                                    ::
    datatype definition      datatype                             data
    type variables           prefaced with ';                     not prefaced with ';
                             written before the datatype name     written after the datatype name
    function type            optional, but if used,               optional, but if used,
                             embedded within function definition  precedes the function definition
    type checking            Hindley-Milner                       Hindley-Milner
    function overloading     not supported                        supported through qualified types and type classes


References

    [COPL6] R.W. Sebesta. Concepts of Programming Languages. Addison-Wesley, Sixth edition, 2003.
    [EOPL2] D.P. Friedman, M. Wand, and C.T. Haynes. Essentials of Programming Languages. MIT Press, Cambridge, MA, Second edition, 2001.
    [EMLP] J.D. Ullman. Elements of ML Programming. Prentice Hall, Upper Saddle River, NJ, Second edition, 1997.
    [PIH] G. Hutton. Programming in Haskell. Cambridge University Press, Cambridge, 2007.
    [PLPP] K.C. Louden. Programming Languages: Principles and Practice. Brooks/Cole, Pacific Grove, CA, Second edition, 2002.
    [WFPM] J. Hughes. Why Functional Programming Matters. The Computer Journal, 32(2), 98-107, 1989.
