CS 463 - Final Exam Review Spring 2024

Final Exam: Tuesday, May 7th 2024, 1:30pm-4:15pm
Location: Usual Classroom (E 201)

Table of Contents

Preparation

  • allowed: one 8.5x11" cribsheet of self-generated notes (both sides). I strongly suggest writing out this crib sheet, as it's a valuable way to drive your studying. If you're typing things out, keep a reasonable font size (no smaller than 10pt), and you may not include any copied images (including images of text). If you bring that kind of sheet, I reserve the right to respond accordingly, either before the exam (taking the sheet) or after (perhaps by deducting ten points from the assessment for failure to follow instructions, or other actions depending on the content of the cribsheet).
  • any lambda calculus questions will sufficiently define the language being used, as seen on our quizzes/midterm before.
  • the final exam is cumulative. Make sure you refer back to our midterm review, as this document will not revisit that material beyond the "suggested question topics"/estimated point spreads.
    • roughly 50% will be old material, 50% new.
    • for the old material, it is reasonable to expect at most about the same level of depth as before; I'm not suddenly going to reach for obscure subtopics on old material.

After the midterm, our efforts have mostly focused on the simply typed lambda calculus, more Haskell (including monads), and concurrency. Our test aims to reflect that same breakdown of effort with no surprises in weighting.

Suggested Question Topics

The exam will likely cover questions related to the following, so you should make sure you are comfortable with whatever kinds of calculations or concepts we dwelled upon when learning about these.

  • Defining Languages
    • sentence derivations
    • parse trees
    • language properties (associativity, precedence, etc)
    • attribute grammars (top level concepts)
    • dynamic semantics (operational, axiomatic, and especially denotational)
  • Lisp:
    • reading and writing code. always-prefix operators.
    • structurally recursive definitions
    • using if, cond, let/let*, defun
    • lists (cons, car/cdr or first/rest, null, etc.)
  • Haskell:
    • reading and writing code
    • structurally recursive definitions
    • pattern matching
    • map, folds, filter, zip, zipWith usage
    • data types (e.g. lists, Maybe, Either, tuples, user-defined datatypes)
      • creating your own datatypes
    • type class usage/implementations.
    • monads: IO, Maybe, State.
      • reason about IO monadic code similar to the concurrency assignment
  • Lambda Calculus (untyped as well as simply typed):
    • evaluation rules
    • typing rules
    • reducing expressions (using evaluation rules, clearly naming what is used)
    • typing proof trees (showing well-typedness or errors)
    • encodings (writing "programs")
      • use the fix extension for recursion
    • extending the definition (term/value/type extensions)
  • Concurrency:
    • basic definitions/ideas
    • discuss/reason about classical concurrency problems: producer/consumer, reader-writer, dining philosophers.
    • inspect short blocks of code along the lines of our class examples (C, Java, Haskell)
    • answer questions related to our concurrency assignments (I'm assuming you completed the assignments and gained enough background to understand the challenges and situations they presented).
    • concept of monitors, use of synchronized in Java
    • concept of critical sections (need for it and implementations we saw in C/Java/Haskell)
    • MVar / Chan / IORef concepts from Haskell
  • subprograms (this section will be minimal points!)
    • parameter passing rules
      • pass by value, reference, name/need
      • in, out, in/out styles
    • different scoping rules (dynamic vs static)
    • implementation details of various frames (we looked at three scenarios)
      • what is a return address used for? what kind of value is a return address? (it holds a code address)
      • what is a dynamic link used for? what kind of value is stored there? (it holds an address of a stack location of the previous frame)
      • what is a static link used for? what kind of value is stored there? (it stores a pointer to the parent scope's most recent frame)
        • how do we resolve a variable in these circumstances? (how many static link hops, and what offset from within that frame?)
    • finding variables via static link chains (see chainingX examples)
    • when can we, and when can we not, perform general or tail recursion? (general recursion: we need a stack).

The rest of this document gives some more thorough treatment of the newer topics, or points us in the right direction.

rough points breakdown:

I haven't written the test yet, but here is my basic points break-down plan:

points (of 100)  topic
15 language basics
15 untyped lambda calculus
15 Lisp Code
20 Haskell Code
15 simply typed lambda calculus
15 concurrency
5 subprograms

1. Simply Typed Lambda Calculus λ

We explored adding types to the lambda calculus. Evaluation was essentially unchanged. See the homework and rules document for plenty of information about STLC. The homework should be a great source of sample question styles.

STLC Rules

  • Be ready to write out proof trees within a given calculus.
    • both successful and failed ones (including which rule/reason makes it fail)
  • Be ready to extend the language.
    • add new terms, new types, new values.
    • add new typing rules. (exactly one rule per new term).
    • add new evaluation rules, just like in the untyped calculus, if requested (often more than one per term).
  • be ready to write encodings (small definitions) in a specific language.
    • careful! don't use things not actually in the given language. But also, don't miss out on what is there. Take a quick peek at the language to remind yourself what tools you have available before answering these questions.
    • helper functions are still quite useful, and encouraged.
    • recursive definitions (via the fix extension) are expected on the test.

2. Haskell, part 2

Be able to:

  • define and use datatypes. Some examples:
    data IntTree    = ILeaf | IT IntTree Int IntTree    deriving (Show, Eq)
    data Tree a     = Leaf  | B (Tree a) a (Tree a)     deriving (Show, Eq)
    data Coin       = P | N | D | Q
    
  • define a typeclass, and write instances of it. Examples:
    class Show a where
      show :: a -> String
    
    instance Show Coin where
       show P = "1¢"
       show N = "5¢"
       show D = "10¢"
       show Q = "25¢"
    
  • use list comprehensions. Examples:
    [ 2^x | x <- [0..10]]
    flatCoords n = [ (i,j) | i <- [0..n], j<-[0..n]]
    
  • read and write (or modify) code in the IO and (to a lesser degree) Maybe and State monads (proportional to the efforts spent on assignments and in class). See our class examples for more detail.
    • only do-notation will be expected, not direct use of >>= ; but if you choose to use >>= notation, it is accepted.

basic basics of monads

  • the Monad type class:
    class Monad m where
      (>>=) :: m a -> (a -> m b) -> m b
      return :: a -> m a
    
  • do-notation is just syntactic sugar for uses of >>= and return.
    • understand basic idea of bind/return version and converting to do-notation.
    • lines that look like this:
      v <- funcM args
      

      funcM args is of some monadic type, such as IO a, and then v::a is the result of running that monadic computation. Think of it as having an action described on the righthand side; the "binding" (left arrow) requests that we perform the action, and its resulting value gets stored to the variable on the lefthand side. This is much like a Java assignment statement, a = funcM(a,r,g,s): we have a described action on the right, we perform it, and we store the result into a.
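
      As a small illustration (a sketch, not code from class; the function names greetDo/greetBind are made up), here is the same tiny IO action written with do-notation and with explicit >>= :

      -- do-notation version
      greetDo :: IO ()
      greetDo = do
        name <- getLine
        putStrLn ("hi, " ++ name)

      -- the same action desugared into >>= : getLine's result is fed to the lambda as name
      greetBind :: IO ()
      greetBind = getLine >>= (\name -> putStrLn ("hi, " ++ name))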

other monads

  • IO Monad: Understand that this is where all side-effectful things go in Haskell. This is also the only monad we've seen that doesn't have a runTheMonad function (such as runState). This is because we don't want to escape the IO monad by calling a run function and then getting to do other things; then, we'd be allowed to have side effects anywhere! With no run function for IO, any function that deals with IO will thus have to have IO a in its type somewhere.
    • We actually get to run one of these things either by invoking it within the ghci interpreter (which forces the evaluation and performs the side-effects), or by it being the main function of a program run as an executable (same behavior). For a compiled program, we get one chance to do side effects: the main function. It likely uses all the other fun code we've written.
      • we saw a bit of this on our assignments. We could avoid actually compiling the code by using the runhaskell command, which does on-the-spot interpretation of a file's main function.
    • fun fact: the interactive loop is just an IO do-notation block that reports back immediately as you give it commands. That's why the let syntax is do-ish when used in the interactive loop (versus inside a file), without the "in" keyword we learned originally. You can even bind IO actions right there if you want!
      Prelude> ans <- getLine
      this is just a test
      Prelude> 
      Prelude> ans
      "this is just a test"
      Prelude> 
      Prelude> writeFile "Desktop/example.txt" ans
      Prelude> 
      Prelude> :! cat Desktop/example.txt 
      this is just a testPrelude>
      
  • Maybe monad: chaining together fail-possible actions. Each step might fail, and immediately bail out with Nothing; or, when a step succeeds in generating Just someval, we give someval a name and continue, attempting the next fail-possible action.
    • example code (basically all functions here are made up):
      manySteps name = do
        name       <- tryLookup name directory
        person     <- tryLookup name
        bestFriend <- maybeHead (friendsOf person)
        phoneNum   <- tryLookup bestFriend phoneDirectory
        return phoneNum
      
  • List Monad: like our list comprehensions; can also have let-expressions (as any do-notation can), and guard expressions, so truly just like list comprehensions.
    • example: if we have some generators, those are like nested loops; when we have guards, those are like nested if-statements; and when we use the surviving combinations of items, that's like appending to a list of results. Compare this python code to the haskell code after it:
      # PYTHON-esque CODE
      vals = []
      for x in xs:
          for y in ys:
              if x<y:
                  if y%2==0:
                      vals.append( (x,y) )
      
      -- HASKELL CODE (list comprehension; note Haskell uses `mod`, not Python's %)
      vals = [ (x,y)  |  x<-xs,  y<-ys,  x<y,  y `mod` 2 == 0 ]
      
      -- do-style (guard comes from Control.Monad)
      vals' = do
        x <- xs
        y <- ys
        guard $ x<y
        guard $ y `mod` 2 == 0
        return (x,y)
      
  • State monad: threading some piece of state through the background (our stack in the class example, or a pair of ints in our fib examples). The runState function accepts the state-monadic action (a State s a value) and the initial state (e.g. an empty list to represent our empty stack), and then supplies the state to each chained action, getting back a value and a next state to supply to the following action, eventually producing the final version of the state as well as a final answer-value. This is similar to a function call in some imperative language, in that we get back both a returned value and some modified state; the state just happens to be the actual state of all memory in your program :)
    • It's common to want to get a copy of the current state, and to want to replace the current state with some new value. These two operations, get and put, are provided in a typeclass:
      class (Monad m) => MonadState s m | m -> s where
        get :: m s
        put :: s -> m ()
      

      First, get. It needs no arguments, and gives you a representation of computation that reports an s value (the current state that was being threaded through the background). In order to use any represented computation such as this, we need to bind it to perform the action and "unpack" the answer:

      s <- get
      

      Second, put. We want to discard the current value being threaded through in the background and replace it with a new one. This method accepts the new state s, and gives back a representation of computation that (does some reassigning and then) returns (). All the excitement happens in the state. To call it, we put it on its own line in the do-block (once it's been given its argument):

      put (s+1)
      

      Although we could bind it, the returned value must be unit, so there's not much value added by doing so:

      () <- put (s+1)
      

      So by putting it on a line by itself, we are asking Haskell to perform the monadic action and discard the resulting () value.

    • state-monadic code can be recursive, e.g. when processing a list. Since the recursive call yields a monadic computation, we need to sequence it (put it on its own line in the do-block) so that its intended actions are actually performed.
      primesM :: [Int] -> State [Int] ()
      primesM [] = return ()
      primesM (x:xs) = do
        -- deal with current item
        if (isPrime x) 
            then do
              currentList <- get
              put (x:currentList)
            else return ()
        -- recursive call; based on its type we need to bind it to perform its intended actions
        primesM xs     
      
      primes xs = case runState (primesM xs) [] of
                     (ans, lastState) -> lastState
      
      goStack :: [Command] -> State [Int] ()
      goStack [] = return ()
      goStack (Push v:rest) = do
        stackVals <- get
        put $ v:stackVals
        goStack rest
      goStack (Add:rest) = do
        -- this pattern could fail when there aren't enough values...
        (a:b:restOfStack) <- get
        put ((a+b):restOfStack)
        goStack rest
      ...
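
      For completeness, a small sketch (not from class) of running goStack. The Command type here is a trimmed-down assumption with just the two constructors shown above, and (like the code above) this assumes import Control.Monad.State:

      data Command = Push Int | Add    deriving (Show, Eq)

      runStack :: [Command] -> [Int]
      runStack cmds = snd (runState (goStack cmds) [])
      -- e.g.  runStack [Push 2, Push 3, Add]  ==>  [5]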
      

Abstract Data Types

  • the user can create their own types (like IntTree, Tree a, and RoseTree).
    data IntTree    = ILeaf | IT IntTree Int IntTree    deriving (Show, Eq)
    data Tree a     = Leaf  | B (Tree a) a (Tree a)     deriving (Show, Eq)
    data RoseTree a = RNil  | RT a [RoseTree a]         deriving (Show, Eq)
    

    Give it a name, as many uniquely named constructors as desired, as many parameters to each constructor as desired, and then they are all collected into the new given type.

    • can be parameterized over type variables (we have seen this but haven't used it much). Both Tree a and RoseTree a showcase this, with the a type parameter. You can use Tree a at all kinds of specific types, e.g. Tree Int, Tree String, or even weirder things like Tree (Maybe (Tree Int)). This type of polymorphism is called parametric polymorphism: absolutely nothing is known about the type, so any type may be used there (and preserved for all related locations in one particular usage of this thing). It is like generics in Java (Java borrowed it from functional languages when they carefully added it to the language in version 5).
    • just add deriving (Show,Eq) to make printing and equality checks available. This uses type classes, and these are two of the short list of typeclasses for which Haskell can automatically implement instances for you.
    • More examples (some are built into the Prelude already):
      data Bool = True | False
      data Maybe a = Nothing | Just a
      data Either a b = Left a | Right b
      data Color = RGB Int Int Int | Grey Int | Black | White
      data MyList a = MyNil | MyCons a (MyList a)
      
  • functions can be written over these types, pattern-matching on the constructors.
    sum :: IntTree -> Int
    sum ILeaf = 0
    sum (IT left val right) = (sum left) + val + (sum right)
    
    catMaybes :: [Maybe a] -> [a]
    catMaybes [] = []
    catMaybes (Nothing:rest) = catMaybes rest
    catMaybes ((Just v):rest) = v : catMaybes rest
    

type classes

  • similar to a Java interface, in that it defines a group of methods that can be implemented at various types.
    • this is ad-hoc polymorphism.
  • but, it is "open": we can write the instance of a type class for some particular type wherever we choose, instead of only inside the class definition as Java requires for interface implementations.

class definition examples

  • The Show typeclass is very simple, defining just one method over the target type:
    class Show a where
       show :: a -> String
    
  • The Monad typeclass provides just a few methods, though we learned that quite a few different uses can all arise from them! Note the default definition of (>>); much like a Java default method, Haskell typeclasses can provide an implementation in terms of the others (which is still replaceable by you when you implement an instance of the typeclass).
    class Monad m where
      return :: a -> m a
      (>>=) :: m a -> (a -> m b) -> m b
      (>>) :: m a -> m b -> m b
      ma >> mb = ma >>= (\ _ -> mb)
  • the Functor typeclass only provides fmap but we saw it could be useful to dig through some structure. Note that f isn't a plain old type, but is something of kind * -> *.
    class Functor f where
       fmap :: (a->b) -> f a -> f b
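
    For instance (a quick sketch, not from the slides), an instance for the Tree a type defined earlier lets fmap apply a function at every value stored in the tree:

    instance Functor Tree where
      fmap _ Leaf      = Leaf
      fmap g (B l v r) = B (fmap g l) (g v) (fmap g r)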
    

instance examples

An instance is a stand-alone piece of code that claims that a particular type provides the methods of a particular type class (and defines how). It selects a specific concrete type for the type class's type variables. This is akin to the methods in a Java class that provide the implementation of an interface (not necessarily the entire class!). Instead of needing to be defined in that one class, the instances can be just about anywhere. This adds some convenience and causes some headaches here and there; it's just a different approach.

We must describe what class is being instantiated, by what particular type, and then provide the definitions.

  • simple instances
    data MyList = Nil | Cons Int MyList
    instance Show MyList where
      show (Nil) = "Nil"
      show (Cons x xs) = "Cons " ++ (show x) ++ " " ++ (show xs)
    
    • note that the show (Cons x xs) line makes two further calls to show; the first, (show x), relies upon the instance Show Int, while the second relies upon instance Show MyList recursively. Because show is a method in a typeclass, each usage anywhere might not be from the current instance!
  • Ord

    Look at the Ord typeclass for more examples. We worked on this in class code.
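
    As a reminder of the shape of such an instance, here is a small sketch (ordering Coins by a made-up value helper; in class we looked at the real Ord typeclass):

    -- 'value' is an assumed helper for this illustration
    value :: Coin -> Int
    value P = 1
    value N = 5
    value D = 10
    value Q = 25

    instance Eq Coin where          -- Ord requires Eq, so provide it too
      c1 == c2  =  value c1 == value c2

    instance Ord Coin where
      compare c1 c2 = compare (value c1) (value c2)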

3. Concurrency

Be prepared to:

  • discuss concurrency primitives in general: semaphore, mutex, threads, waiting
  • discuss common issues: deadlock, live lock, race conditions
  • discuss classical concurrency problems:
    • producer/consumer, reader/writer, dining philosopher examples
  • inspect short blocks of Java, C, Haskell code that involve concurrency. Especially:
    • synchronized keyword usage in Java
    • MVar, IORef, Chan usage in Haskell
      • MVar is a box that can be empty or hold one value. Access automatically blocks.
      • IORef is like a variable, with read/write capability. No blocking.
        • the possibility of race conditions exists, and must be accounted for.
      • Chan is like a buffer (FIFO) with put/take operations (annoyingly, named read/write…)
      • note: STM will not be tested on the exam. It's quite interesting, but the items above are more important basic things we should test. Also, a lot of the functions/datatypes have confusingly similar names to the above.

Basics

  • levels of concurrency:
    • machine instructions (e.g., multiple cores)
    • language statements
    • unit level (e.g., subprograms)
    • program level (multiple programs running at once)
  • only the statement and unit levels are interesting for our purposes
  • Categories of Concurrency
    • physical: multiple independent processors
    • logical: simulate physical concurrency (e.g., time-sharing a processor)
    • co-routines: not really concurrency, because we still have one thread of control

Some Definitions

  • liveness: a task's ability to make progress towards completion
  • deadlock: all tasks have lost their liveness.
    • example: the dining philosophers are all holding one chopstick, each waiting for the second one while unwilling to release the first. It's a stalemate.

Tasks

task:

Some unit of code that can run concurrently with others.

  • often work together (coordinate), occasionally compete for resources.
  • initiating a task and completing a task: doesn't behave quite like a normal function call. (the 'main' function regains its own control as soon as the spawning is complete; the child thread might not return to main in any sense when it completes.)
  • kinds of tasks:
    • heavy-weight: own address spaces (like separate processes)
    • light-weight: share an address space.
      • each thread gets its own stack (which isn't particularly large in general - often about 8MB), with its own dedicated address range, but all threads tend to be able to access all memory.
      • easier to implement than heavy-weight tasks
      • easier to share data (whether on purpose or not)

synchronization:

  • modes of communication:
    • shared state (issues of read/write timing, i.e. race conditions)
    • passed parameters (at thread initialization)
    • message passing

Cooperation and Competition

Cooperation: Multiple tasks are all trying to help solve the same problem. They coordinate and communicate their progress, sharing resources. They need to work together and understand each others' algorithms somewhat.

Competition: Multiple tasks all want access to the same resource. Given the unpredictable timing, they end up fighting over access to the resources.

Approaches to Synchronization

Semaphores

Semaphores: a way to signal other tasks that some condition exists.

  • implementation: counter (number of signals to send) and queue (a list of tasks that are waiting for a signal).
  • wait and release functions. (many other names depending on the language)

Issues:

  • brittle. We rely on correct calls of wait/release by the code that uses the semaphore. If we forget one, anywhere, generally everything breaks.
    • missing waits: underflow/overflow of buffers could occur.
    • missing releases: deadlock can occur.

Language support:

  • often provided as libraries, and not as language primitives.

Mutexes

mutexes: "mutual exclusion". Represents a lock for some resource. Whoever holds the lock has sole access to the resource.

  • not quite a semaphore with a maximum count of one, because there's no queue: every attempt to grab the lock might fail, and a particular task might just always be too slow to acquire the lock.
  • often paired with condition variables that can be signaled and waited for.
  • intention: a mutex is the lock for a specific resource, not just for any one of multiple identical resources. The task that acquires the lock should be the one that releases the lock, but implementations might not enforce this intent.

Monitors

monitors: An object can have sole control over some resource, and require all tasks that want access to the resource to use its own public API (methods) to request interactions with the resource. This one object, called a monitor, can then internally embody all synchronization necessary to make use of the resource safe.

  • example: A Java object is always a monitor. Any of its methods that are labeled synchronized (or blocks synchronized on this object) will not run concurrently with each other. If you want to control access to a resource, hide all accesses to it behind the methods of one object and synchronize those methods. Java takes care of the rest!
    • if you have a Buffer class with synchronized get and put methods, each Buffer is individually monitoring its own state; they are not synchronized with each other. This means that every one of your hundreds of buffers might all be getting or putting at the same time, but no individual buffer is experiencing multiple get/put operations at once.
  • mutual exclusion is guaranteed (from outside the monitor) as long as it's implemented correctly inside the monitor and only the intended methods are visible outside the monitor.
  • programmer still must coordinate between tasks that use the monitor. (Deadlock is still a possibility).
  • concurrent access requests are implicitly blocked.
  • Notes:
    • Competition: straightforward; mutual exclusion is automated.
    • Coordination: programmer still does bookkeeping.
    • just as powerful as semaphores (we can implement each in the other)

Message Passing

message passing: It's like having mailboxes for each task. The delivery of messages can be asynchronous. Each task is responsible for reading its mail frequently enough that communication is maintained.

  • Coordination is achieved through messages that share the information
  • one task might choose to have multiple mailboxes for various purposes, so that messages stay organized. (It helps to deal with the issue of the ordering of messages).

Some Language Approaches to Concurrency

  • C: pthreads library
  • Java:
    • Thread objects (run() methods are tasks)
    • synchronized methods and blocks
    • java.util.concurrent.Lock: locks
    • java.util.concurrent.atomic: variable-level synchronization.
  • Haskell:
    • MVar, IORef, Chan
    • STM (software transactional memory)

Haskell: MVar

An MVar gives us a single spot in which to put or take a value - like a length-one buffer. It automatically blocks when putting into a full MVar or taking from an empty MVar; the blocked thread will be woken back up when the buffer is now in the right state for their attempted action.

  • some operations:
    newEmptyMVar :: IO (MVar a)
    newMVar :: a -> IO (MVar a)
    putMVar :: MVar a -> a -> IO ()
    takeMVar :: MVar a -> IO a
    tryTakeMVar :: MVar a -> IO (Maybe a)
    tryPutMVar :: MVar a -> a -> IO Bool
    isEmptyMVar :: MVar a -> IO Bool
    
  • it can represent a lock by putting a value (really, any value) in the MVar; when your take operation succeeds, you've got the lock. Put it back when you're done though. Forgetting the unlock step in any language is a bug. (A small sketch of this follows the list.)
  • threads can coordinate by putting messages in each others' "mailboxes" (just MVar's, or perhaps Chan's like below).
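
A minimal sketch of that lock idea (the names here are made up, not our assignment code): an MVar () holds a token; takeMVar acquires the lock, putMVar releases it.

  import Control.Concurrent (forkIO, threadDelay)
  import Control.Concurrent.MVar

  withLock :: MVar () -> IO a -> IO a
  withLock lock action = do
    takeMVar lock        -- acquire: blocks if another thread currently holds the lock
    result <- action     -- the critical section
    putMVar lock ()      -- release: forgetting this step is the classic bug
    return result

  main :: IO ()
  main = do
    lock <- newMVar ()   -- start "unlocked": the token is sitting in the box
    _ <- forkIO (withLock lock (putStrLn "child thread in its critical section"))
    withLock lock (putStrLn "main thread in its critical section")
    threadDelay 100000   -- crudely give the child a moment to finish before main exits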

Haskell: Chan, BoundedChan

A Chan behaves like an MVar that can hold an unlimited number of values. It thus won't block on put-operations, but take-operations can still block if there are no values in the Chan at the moment.

  • some operations:
    newChan :: IO (Chan a)
    writeChan :: Chan a -> a -> IO ()
    readChan :: Chan a -> IO a
    
  • Chan's are useful when we don't need a lock-step put-take-put-take usage, e.g. when the mail can pile up for awhile before we get around to reading it. It preserves the original order. It was useful in our concurrency examples for the announcer thread, which accepts multiple printing requests and handles them in the order received (a small sketch of that pattern follows this list).
  • A BoundedChan behaves like a fixed-size buffer with automatic blocking on reads/writes when necessary. It is like a cross between an MVar and a Chan - it has a fixed number of spots available like an MVar (which only has one spot), but can accept multiple puts in a row when there's space (like a Chan, which always has enough space as long as your computer does).
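
A small sketch of that announcer pattern (names made up; our assignment code was more involved): a few workers write printing requests into a Chan, and one announcer thread prints them in the order received.

  import Control.Concurrent (forkIO)
  import Control.Concurrent.Chan
  import Control.Monad (forM_, replicateM_)

  main :: IO ()
  main = do
    requests <- newChan
    -- a few worker threads each send one printing request
    forM_ [1..3::Int] $ \i ->
      forkIO (writeChan requests ("worker " ++ show i ++ " says hi"))
    -- the lone announcer handles exactly three requests, in the order received
    replicateM_ 3 $ do
      msg <- readChan requests
      putStrLn msg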

Software Transactional Memory

On the test, you will not be required to write any code in this style. The main idea of "arbitrary blocks that run atomically, recording all changes from that block of code to the global state in one fell swoop" is something you should understand.

  • atomically guarantees atomicity on arbitrary expressions. Either an entire atomically labeled block is successful and all side-effectful things happen at once ("atomically" committing all those changes to the shared global state), or else we might retry or try alternates via orElse. GHC's runtime can selectively wake up only the threads whose relevant shared state variables have actually been modified, instead of a competitive "wake-the-world" approach (e.g. notifyAll in Java).
  • we saw an example of the Dining Philosophers, where the combined actions of taking the left and right utensil, eating, and putting them back were all considered as one atomic group.

4. Subprograms

basics

  • subprogram: collection of statements to run, maybe with arguments and return value. We use subprogram as the general term for various flavors:
    • function: subprogram with parameters and return value.
    • subroutine: subprogram with no parameters or return value.
    • method: 'function' attached to specific object.
  • signature: all structured info in "first line": visibilities, return type, name, formal parameters list, etc.

parameter lists

More terminology to help discuss things:

  • formal parameter: part of subprogram definition (declares it)
  • actual parameter: part of subprogram call (instantiates it for a particular call)

Sometimes we call formals/actuals "parameters" and "arguments".

allowed arguments

  • "first class values": types that may be used as arguments. Usually functions aren't allowed (but it's very useful when they can be).
  • candidates:
    • primitives
    • arrays
    • addresses/references
    • types
    • functions
    • more definitions (class definitions, etc)

Parameter Modes

  • in: info flows from caller to called
  • out: info flows from called to caller
  • in-out: info flows both ways through this particular parameter

Parameter-Passing Conventions

Various ways to implement in-, out-, or in-out parameters. Be careful not to confuse your most familiar version of one style as the true meaning of that style! E.g., passing a pointer as a parameter lets information effectively enter and exit the call, but it's just one way to achieve in-out.

  • pass by value: copy actual value, send to subprogram.
    • in mode
    • ex: Java primitives
  • pass by reference: send copy of address to subprogram.
    • in-out mode (via these "access paths")
    • ex: Java parameters of any reference (non-primitive) type
    • keep in mind that we have some data in mind; you might consider a pointer in C to give us pass-by-reference access to that data. Many sticklers will complain that we're actually passing the pointer by value. It's a matter of perspective, and not worth getting worked up over.
  • pass by name: substitute argument expression for name directly.
    • leads to re-evaluation of the expression each time parameter is evaluated (can be good/desired, can be very wasteful)
    • issue of where transported names come from: original env? env when called? env when actually evaluated?
    • implementation: a closure/thunk. (some code-as-a-value that will evaluate when finally needed/touched).
  • pass by need: call-by-name that memoizes the result.

    With referential transparency, this is one of the rare times a pass-by-name flavor is appropriate. If nobody ever needs the result, it's never calculated (we're lazy). If it's needed multiple times, the first usage touches the thunk and evaluates it to the answer, which is stored. Successive uses find the answer instead of recalculating (allowed since we have referential transparency, a.k.a. purity).

    Haskell does this.
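
    A tiny illustration of the "evaluate at most once, and only if needed" behavior (names made up):

    -- 'expensive' stays an unevaluated thunk until someone actually needs its value
    expensive :: Int
    expensive = sum [1..10000000]

    f :: Bool -> Int -> Int
    f False _ = 0        -- the argument is never needed, so it is never computed
    f True  x = x + x    -- needed twice, but computed once and the result is shared

    -- f False expensive  ==>  0, instantly
    -- f True  expensive  ==>  computes the big sum once, then doubles it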

Subprograms as Parameters

Referencing Environment: effectively the collection of name-to-value bindings for all variables used inside the subprogram that is being passed as a parameter.

Be prepared to look at code and differentiate what occurs with different styles of binding. There was an explicit example comparing these styles in our chapter 9 slides.

  • shallow binding: use the local environment when the subprogram is actually executed
    • dynamic scoping; uses a dynamic link in stack frame to look back through frames until you find the variable.
    • X86-64 has dynamic links (unless there was nothing in the frame at all).
  • deep binding: use the environment from the subprogram's original definition. You can "statically" (at compile time, without executing) look for the lexical structure that contains a definition to know where to look for variables that weren't defined locally. We don't care who called us, we only care where we're defined.
    • static scoping; requires a static link in stack frame, which points back not to the caller's frame but to the most recent frame of the surrounding scope's call (the frame of the parent scope).

some Approaches to Closures

  • Haskell thunks (all sub-expressions)
  • C function pointers
  • Python anonymous functions

Kinds of Polymorphism

  • subtype polymorphism: one type extends another, is guaranteed to provide at least all definitions from parent type.
    • ex: Java classes (really any OO classes)
  • parametric polymorphism: "forall" types. Never inspects value, so can work at any type.
    • ex: haskell types like map::(a->b)->[a]->[b].
    • ex: Java Generics
  • ad hoc polymorphism: functionality manually provided at specific types. Usage at just these types thus works.
    • ex: Java interfaces
    • ex: Haskell type classes

Overloading methods (providing alternate versions)

Some various examples:

  • OO: replace inherited definitions
  • Haskell: typeclass instances
  • C++: templates

5. Implementing Subprogram Calls

Definitions

  • Activation Record: structure of values needed to track one subprogram call.
    • some of: return address, dynamic link, static link, parameters, locals
  • Activation Record Instance: actual values shaped like the corresponding activation record.
    • we've been calling these frames, too.
  • active frame: the frame of a subprogram that has been called but has not yet returned. (Might be paused, awaiting return from a function call.)

Scenario 1: Early languages: Static Everything!

Description/Implications:

  • all variables are static: they get a permanent address in memory forever.
  • no need for runtime stack!
  • can place each code segment and frame in known-at-compile-time spots.
    • or leave to linker to allow cross-compilation.
  • still need a return address (one subprogram can still be called from various places)
  • unable to provide general recursion (there's no stack to place extra ARI's of that same subprogram), but we saw it was possible to achieve tail-recursion. It behaved somewhat like a loop, which is exactly what we want.
  • Frame structure:
    • return address
    • parameters
    • local variables

Scenario 2: Many modern languages: Stack-Dynamic Locals

Description/Implications:

  • store locals/params in frame
  • choice: disallow nested subprogram definitions
    • just local (this frame) and global (permanent location) scopes
    • thus we don't need static links. (see below when they're needed).
  • need runtime stack of active frames
    • can have multiple frames of one subprogram definition: must keep separate.
    • thus, can have general recursion. We are used to these kinds of languages.
  • need dynamic links (to remember where previous frame begins on stack)
  • Frame structure:
    • return address
    • dynamic link
    • parameters
    • locals

Scoping issues (for stack-dynamic locals and no nested subprograms)

  • choosing Static Scoping

    can only see locals (known offset within frame) or globals (known permanent address).

    • all known statically at compile time
  • choosing Dynamic Scoping

    must be able to walk back through the dynamic links ("dynamic chain") to find most recent occurrence of named variable

    • requires both name and value on stack. Naive approach: dig through frames. Would cause slow lookups, and also we'd need to be storing the actual parameter names on the stack - weird…
      • alternative: keep stack of values for each unique variable name.
        • call/return now requires push/pop (more expensive calls), but lookups are fast.
      • alternative: keep table of values for each unique variable name.
        • each frame stores any variables it will shadow, and restores upon returning. This treats all variables sort of like callee-save register values.
    • note: dynamic links are also used when we return from one function call to its caller: the dynamic link helps us reset the frame pointer to the beginning of the previous frame on the stack.

Scenario 3: Stack-Dynamic Locals and Allowed Nested Subprograms

Description/Implications:

  • still choose stack-dynamic locals (keep recursion)
  • subprogram definitions may be nested now
  • still use stack of active frames (general recursion available)
    • still need dynamic links
  • also need static links (when scoping is static): pointer of most recent active frame of the direct parent-scope.
    • guaranteed to exist (or we couldn't have called the subprogram at all; that's the only time & place the function was in scope).
  • Frame Structure:
    • return address
    • dynamic link
    • static link (when using static scoping)
    • parameters
    • locals

Scoping (stack-dynamic locals and nested subprograms)

  • choosing Static Scoping:

    Also need static links (when scoping is static): pointer of most recent active frame of the direct parent-scope.

    • parent-scope's frame guaranteed to exist (or we couldn't have called the subprogram at all; that's the only time & place the function was in scope).
    • implementation detail: static chains. We can statically (at compile time) calculate the (chain, offset) values of any variable, since we know the contents of a frame: return address, old frame pointer (dynamic link), static link, parameters, locals. We had a thorough example of this in class (the chaining.txt samples). We could look at the syntactic structure to figure out how many scoping levels away we had to look to find a variable (locally was zero steps away), and then also look at how much of an offset was needed within that frame to access the variable. (We assumed the return address, dynamic link, and static link always started off the frame; then parameters were placed in order, and then locals in order, all within the current frame.) This pair of (static-chain hops, local frame offset) is all we need to find a variable's value at runtime, and that pair can be pre-calculated statically (at compile time).
  • choosing Dynamic Scoping:

    Again, can use dynamic chain to find each variable. (And don't need the static links). Behaves the same as scenario 2 above.