CS 463 - Final Exam Review Spring 2024
Final Exam:
Tuesday, May 7th 2024, 1:30pm-4:15pm
Location: Usual Classroom (E 201)
Table of Contents
Preparation
- allowed: one 8.5x11" cribsheet of self-generated notes (both sides). I strongly suggest writing out this crib sheet, as it's a valuable way to drive your studying. If you're typing things out, you need to keep a reasonable font size (no smaller than 10pt), and you may not include any copied images (including images of text). If you bring that kind of sheet I reserve the right to respond accordingly either before (taking sheet) or after (perhaps by deducting ten points from the assessment for failure to follow instructions or other actions depending on the content of the cribsheet).
- any lambda calculus questions will sufficiently define the language being used, as seen on our quizzes/midterm before.
- the final exam is cumulative. Make sure you refer back to our
midterm review, as this document will not revisit that material
beyond the "suggested question topics"/estimated point spreads.
- roughly 50% will be old material, 50% new.
- it is reasonable to expect about the same level of depth on previous topics at most for the old material; I'm not suddenly going to reach for obscure subtopics on old material.
After the midterm, our efforts have mostly focused on the simply typed lambda calculus, more Haskell (including monads), and concurrency. Our test aims to reflect that same breakdown of effort with no surprises in weighting.
Suggested Question Topics
The exam will likely cover questions related to the following, so you should make sure you are comfortable with whatever kinds of calculations or concepts we dwelled upon when learning about these.
- Defining Languages
- sentence derivations
- parse trees
- language properties (associativity, precedence, etc)
- attribute grammars (top level concepts)
- dynamic semantics (operational, axiomatic, and especially denotational)
- Lisp:
- reading and writing code. always-prefix operators.
- structurally recursive definitions
- using if, cond, let/let*, defun
- lists (cons, car/cdr or first/rest, null, etc.)
- Haskell:
- reading and writing code
- structurally recursive definitions
- pattern matching
- map, folds, filter, zip, zipWith usage
- data types (e.g. lists, Maybe, Either, tuples, user-defined
datatypes)
- creating your own datatypes
- type class usage/implementations.
- monads: IO, Maybe, State.
- reason about IO monadic code similar to the concurrency assignment
- Lambda Calculus (untyped as well as simply typed):
- evaluation rules
- typing rules
- reducing expressions (using evaluation rules, clearly naming what is used)
- typing proof trees (showing well-typedness or errors)
- encodings (writing "programs")
- use the
fix
extension for recursion
- use the
- extending the definition (term/value/type extensions)
- Concurrency:
- basic definitions/ideas
- discuss/reason about classical concurrency problems: producer/consumer, reader-writer, dining philosophers.
- inspect short blocks of code along the lines of our class examples (C, Java, Haskell)
- answer questions related to our concurrency assignments (I'm assuming background knowledge of you completing the assignments enough to understand the challenges and situations it presented).
- concept of monitors, use of
synchronized
in Java - concept of critical sections (need for it and implementations we saw in C/Java/Haskell)
MVar
/Chan
/IORef
concepts from Haskell
- subprograms (this section will be minimal points!)
- parameter passing rules
- pass by value, reference, name/need
- in, out, in/out styles
- different scoping rules (dynamic vs static)
- implementation details of various frames (we looked at three scenarios)
- what is a return address used for? what kind of value is a return address? (it holds a code address)
- what is a dynamic link used for? what kind of value is stored there? (it holds an address of a stack location of the previous frame)
- what is a static link used for? what kind of value is stored
there? (it stores a pointer to the parent scope's most recent
frame)
- how do we resolve a variable in these circumstances? (how many static link hops, and what offset from within that frame?)
- finding variables via static link chains (see chainingX examples)
- when can we, and can we not, perform general or tail recursion? (general: we need a stack).
- parameter passing rules
The rest of this document gives some more thorough treatment of the newer topics, or points us in the right direction.
rough points breakdown:
I haven't written the test yet, but here is my basic points break-down plan:
points(100) | topic |
---|---|
15 | language basics |
15 | untyped lambda calculus |
15 | Lisp Code |
20 | Haskell Code |
15 | simply typed lambda calculus |
15 | concurrency |
5 | subprograms |
1. Simply Typed Lambda Calculus λ→
We explored adding types to the lambda calculus. Evaluation was essentially unchanged. See the homework and rules document for plenty of information about STLC. The homework should be a great source of sample question styles.
- Be ready to write out proof trees within a given calculus.
- both successful and failed ones (including which rule/reason makes it fail)
- Be ready to extend the language.
- add new terms, new types, new values.
- add new typing rules. (exactly one rule per new term).
- adding new evaluation rules just like in the untyped calculus if requested (often more than one per term).
- be ready to write encodings (small definitions) in a specific language.
- careful! don't use things not actually in the given language. But also, don't miss out on what is there. Take a quick peek at the language to remind yourself what tools you have available before answering these questions.
- helper functions are still quite useful, and encouraged.
- recursive definitions (via the fix exension) are expected on the test.
2. Haskell, part 2
Be able to:
- define and use datatypes. Some examples:
data IntTree = ILeaf | IT IntTree Int IntTree deriving (Show, Eq) data Tree a = Leaf | B (Tree a) a (Tree a) deriving (Show, Eq) data Coin = P | N | D | Q
- define a typeclass, and write instances of it. Examples:
class Show a where show :: a -> String instance Show Coin where show P = "1¢" show N = "5¢" show D = "10¢" show Q = "25¢"
- use list comprehensions. Examples:
[ 2^x | x <- [0..10]] flatCoords n = [ (i,j) | i <- [0..n], j<-[0..n]]
- read and write (or modify) code in the IO and (to a lesser degree)
Maybe and State monads (proportional to the efforts spent on
assignments and in class). See our class examples for more detail.
- only do-notation will be expected, not use of
>>=
. But if you choose to use>>=
notation it is accepted.
- only do-notation will be expected, not use of
basic basics of monads
- the Monad type class:
class Monad m where (>>=) :: m a -> (a -> m b) -> m b return :: a -> m a
- do-notation is just syntactic sugar for uses of
>>=
andreturn
.- understand basic idea of bind/return version and converting to do-notation.
- lines that look like this:
v <- funcM args
funcM args
is of some monadic type, such asIO a
, and thenv::a
is the result of running that monadic computation. Think of it as having an action described on the righthand side, the "binding" (left arrow) is requesting we perform the action, and its resulting value is getting stored to the variable on the lefthand side. This is much like a Java assignment statement,a = funcM(a,r,g,s)
; we have a described action on the right, we perform it and store the result intox
.
other monads
- IO Monad: Understand that this is where all side-effectful things
go in Haskell. This is also the only monad we've seen that doesn't
have a
runTheMonad
function (such asrunState
). This is because we don't want to escape theIO
monad by calling a run function and then getting to do other things; then, we'd be allowed to have side effects anywhere! With no run function forIO
, any function that deals withIO
will thus have to haveIO a
in its type somewhere.- We actually get to run one of these things by either invoking it
within the ghci interpreter (which forces the evaluation and gets
to perform the side-effects), or else it is the main function and
gets run as an executable (same behavior). For a compiled
program, we get one chance to do side effects, as the main
method. It likely uses all the other fun code we've written.
- we saw a bit of this on our assignments. We could avoid
actually compiling the code by using the
runhaskell
command, which does on-the-spot interpretation of a file's main function.
- we saw a bit of this on our assignments. We could avoid
actually compiling the code by using the
- fun fact: the interactive loop is just an
IO
do-notation block that reports back immediately as you give it commands. That's why thelet
syntax is do-ish when used in the interactive loop (versus inside a file), without the "in" keyword we learned originally. You can even bindIO
actions right there if you want!Prelude> ans <- getLine this is just a test Prelude> Prelude> ans "this is just a test" Prelude> Prelude> writeFile "Desktop/example.txt" ans Prelude> Prelude> :! cat Desktop/example.txt this is just a testPrelude>
- We actually get to run one of these things by either invoking it
within the ghci interpreter (which forces the evaluation and gets
to perform the side-effects), or else it is the main function and
gets run as an executable (same behavior). For a compiled
program, we get one chance to do side effects, as the main
method. It likely uses all the other fun code we've written.
- Maybe monad: chaining together fail-possible actions. Each step
might fail, and immediately bail out with
Nothing
; or, when a step succeeds in generatingJust someval
, we givesomeval
a name and continue, attempting the next fail-possible action.- example code (basically all functions here are made up):
manySteps name = do name <- tryLookup name directory person <- tryLookup name bestFriend <- maybeHead (friendsOf person) phoneNum <- tryLookup bestFriend phoneDirectory return phoneNum
- example code (basically all functions here are made up):
- List Monad: like our list comprehensions; can also have
let-expressions (as any do-notation can), and
guard
expressions, so truly just like list comprehensions.- example: if we have some generators, those are like nested loops;
when we have guards, those are like nested if-statements; and
when we use the surviving combinations of items, that's like
appending to a list of results. Compare this python code to the
haskell code after it:
# PYTHON-esque CODE vals = [] for x in xs: for y in ys: if x<y: if y%2==0: vals.append( (x,y) ) -- HASKELL CODE vals = [ (x,y) | x<-xs, y<-ys, x<y, y%2==0 ] -- do-style vals' = do x <- xs y <- ys guard $ x<y guard $ y%2==0 return $ (x,y)
- example: if we have some generators, those are like nested loops;
when we have guards, those are like nested if-statements; and
when we use the surviving combinations of items, that's like
appending to a list of results. Compare this python code to the
haskell code after it:
- State monad: threading through some piece of state in the background
(our stack in the class example, or a pair of ints in our fib
examples). The
runState
function accepts the state-monadic action (State s a
value), the initial state (e.g. an empty list to represent our empty stack), and then proceeds to supply the state to each chained action, getting a value and a next state, to supply to the following action, eventually resulting in the last version of state as well as some last answer-value. This is similar to a function call in some imperative language, in that we get back both a returned value as well as some modified state; the state just happens to be the actual state of all memory in your program :)- It's common to want to get a copy of the current state, and to
want to replace the current state with some new value. These two
operations,
get
andput
, are provided in a typeclass:class (Monad m) => MonadState s m | m -> s where get :: m s put :: s -> m ()
First,
get
. It needs no arguments, and gives you a representation of computation that reports ans
value (the current state that was being threaded through the background). In order to use any represented computation such as this, we need to bind it to perform the action and "unpack" the answer:s <- get
Second,
put
. We want to discard the current value being threaded through in the background and replace it with a new one. This method accepts the new states
, and then gives back a representation of computation that (does some reassigning and then) returns()
. All the excitement happened in the state. To call it, we also must bind it (once it's been given its argument):put (s+1)
Although we could bind it, the returned value must be unit, so there's not much value added by doing so:
() <- put (s+1)
So by putting it on a line by itself, we are asking Haskell to perform the monadic action and discard the resulting
()
value. - state-monadic code can be recursive, e.g. when processing a
list. Since the function's type may yield a monadic computation,
we would need to bind the output.
primesM :: [Int] -> State [Int] () primesM [] = return () primesM (x:xs) = do -- deal with current item if (isPrime x) then do currentList <- get put (x:currentList) else return () -- recursive call; based on its type we need to bind it to perform its intended actions primesM xs primes xs = case runState (primesM xs) [] of (ans, lastState) -> lastState
goStack :: [Command] -> State [Int] () goStack [] = return () goStack (Push v:rest) = do stackVals <- get put $ v:stackVals goStack (Add:rest) = do -- this pattern could fail when there aren't enough values... (a:b:restOfStack) <- get put((a+b):restOfStack ...
- It's common to want to get a copy of the current state, and to
want to replace the current state with some new value. These two
operations,
Abstract Data Types
- the user can create their own types (like
IntTree
,Tree a
, andRoseTree
).data IntTree = ILeaf | IT IntTree Int IntTree deriving (Show, Eq) data Tree a = Leaf | B (Tree a) a (Tree a) deriving (Show, Eq) data RoseTree a = RNil | RT a [RoseTree a] deriving (Show, Eq)
Give it a name, as many uniquely named constructors as desired, as many parameters to each constructor as desired, and then they are all collected into the new given type.
- can be parameterized over type variables (we have seen this but
haven't used it much). Both
Tree a
andRoseTree a
showcase this, with thea
type parameter. You can useTree a
at all kinds of specific types, e.g.Tree Int
,Tree String
, or even weirder things likeTree (Maybe (Tree Int))
. This type of polymorphism is called parametric polymorphism: absolutely nothing is known about the type, so any type may be used there (and preserved for all related locations in one particular usage of this thing). It is like generics in Java (Java borrowed it from functional languages when they carefully added it to the language in version 5). - just add
deriving (Show,Eq)
to make printing and equality checks available. This uses type classes, and these are two of the shortlist of typeclasses that Haskell can automatically implement instances for you. - More examples (some are built into the
Prelude
already):data Bool = True | False data Maybe a = Nothing | Just a data Either a b = Left a | Right b data Color = RGB Int Int Int | Grey Int | Black | White data MyList a = MyNil | MyCons a (MyList a)
- can be parameterized over type variables (we have seen this but
haven't used it much). Both
- functions can be written over these types, pattern-matching on the
constructors.
sum :: IntTree -> Int sum ILeaf = 0 sum (IT left val right) = (sum left) + val + (sum right) catMaybes :: [Maybe a] -> [a] catMaybes [] = Nothing catMaybes (Nothing:rest) = catMaybes rest catMaybes ((Just v):rest) = v : catMaybes rest
type classes
- similar to a Java interface, in that it defines a group of methods
that can be implemented at various types.
- this is ad-hoc polymorphism.
- but, it is "opened": we can write the instance of a type class for some particular type wherever we choose, instead of just inside the class definition as Java requires for interface implementations.
class definition examples
- The
Show
typeclass is very simple, defining just one method over the target type:
class Show a where show :: a -> String
- The
Monad
typeclass provides just a few methods, though we learned that quite a bit of different uses can all arise from these methods! Note the default definition of(>>)
; much like a Java default method, Haskell typeclasses can provide an implementation in terms of the others (which is still replaceable by you when you implement an instance of the typeclass).
class Monad m where return :: a -> m a (>>=) :: m a -> (a -> m b) -> m b (>>) :: m a -> m b -> m b ma >> mb = ma >>= (\ _ -> mb)
- the
Functor
typeclass only providesfmap
but we saw it could be useful to dig through some structure. Note thatf
isn't a plain old type, but is something of kind* -> *
.class Functor f where fmap :: (a->b) -> f a -> f b
instance examples
An instance is a stand-alone piece of code that claims (and provides a definition of how) a particular type provides the methods of a particular type class. It selects a specific concrete type for the type class's type variables. This is akin to the methods in a Java class that provide the implementation of an interface (not necessarily the entire class!). Instead of needing to be defined in that one class, the instances can be just about anywhere. This adds some convenience and causes some headaches here and there; it's just a different approach.
We must describe what class is being instantiated, by what particular type, and then provide the definitions.
- simple instances
data MyList = Nil | Cons Int MyList instance Show MyList where show (Nil) = "Nil" show (Cons x xs) = "Cons " ++ (show x) ++ " " ++ (show xs)
- note that the
show (Cons x xs)
line has two recursive calls toshow
; the first (show x
) relies upon theinstance Show Int
, while the second relies uponinstance Show MyList
recursively. Becauseshow
is a method in a typeclass, each usage anywhere might not be from the current instance!
- note that the
- Ord
Look at the
Ord
typeclass for more examples. We worked on this in class code.
3. Concurrency
Be prepared to:
- discuss concurrency primitives in general: semaphore, mutex, threads, waiting
- discuss common issues: deadlock, live lock, race conditions
- discuss classical concurrency problems:
- producer/consumer, reader/writer, dining philosopher examples
- inspect short blocks of Java, C, Haskell code that involve
concurrency. Especially:
synchronized
keyword usage in JavaMVar
,IORef
,Chan
usage in Haskell- MVar is a box that can be empty or hold one value. Access automatically blocks.
- IORef is like a variable, with read/write capability. No blocking.
- the possibility of race conditions exists, and must be accounted for.
- Chan is like a buffer (FIFO) with put/take operations (annoyingly, named read/write…)
- note: STM will not be tested on the exam. It's quite interesting, but the items above are more important basic things we should test. Also, a lot of the functions/datatypes have confusingly similar names to the above.
Basics
- levels of concurrency:
- machine instructions (e.g., multiple cores)
- language statements
- unit level (e.g., subprograms)
- program level (multiple programs running at once)
- only the statement and unit levels are interesting for our purposes
- Categories of Concurrency
- physical: multiple independent processors
- logical: simulate physical concurrency (e.g., time-sharing a processor)
- co-routines: not really concurrency, because we still have one thread of control
Some Definitions
- liveness: a task's ability to make progress towards completion
- deadlock: all tasks have lost their liveness.
- example: dining philosophers are all holding on to one chopstick, and are waiting for the second one without wiilling to release the first one. It's a stalemate.
Tasks
task:
Some unit of code that can run concurrently with others.
- often work together (coordinate), occasionally compete for resources.
- initiating a task and completing a task: doesn't behave quite like a normal function call. (the 'main' function regains its own control as soon as the spawning is complete; the child thread might not return to main in any sense when it completes.)
- kinds of tasks:
- heavy-weight: own address spaces (like separate processes)
- light-weight: share an address space.
- each thread gets its own stack (which isn't particularly large in general - often about 8MB), with its own dedicated address range, but all threads tend to be able to access all memory.
- easier to implement than heavy-weight threads
- easier to share data (whether on purpose or not)
synchronization:
- modes of communication:
- shared state (issues of read/write timing, i.e. race conditions)
- passed parameters (at initial thread initialization)
- message passing
Cooperation and Competition
Cooperation: Multiple tasks are all trying to help solve the same problem. They coordinate and communicate their progress, sharing resources. They need to work together and understand each others' algorithms somewhat.
Competition: Multiple tasks all want access to the same resource. Given the unpredictable timing, they end up fighting over access to the resources.
Approaches to Synchronization
Semaphores
Semaphores: can send signals to other tasks that some condition exists.
- implementation: counter (number of signals to send) and queue (a list of tasks that are waiting for a signal).
wait
andrelease
functions. (many other names depending on the language)
Issues:
- brittle. We rely on correct calls of wait/release by the code that
uses the semaphore. If we forget one, anywhere, generally
everything breaks.
- missing waits: underflow/overflow of buffers could occur.
- missing releases: deadlock can occur.
Language support:
- often provided as libraries, and not as language primitives.
Mutexes
mutexes: "mutual exclusion". Represents a lock for some resource. Whoever holds the lock has sole access to the resource.
- not quite a semaphore with a maximum count of one, because there's no queue: every attempt to grab the lock might fail, and a particular task might just always be too slow to acquire the lock.
- often paired with condition variables that can be signaled and waited for.
- intention: a mutex is the lock for a specific resource, not just for any one of multiple identical resources. The task that acquires the lock should be the one that releases the lock, but implementations might not enforce this intent.
Monitors
monitors: An object can have sole control over some resource, and require all tasks that want access to the resource to use its own public API (methods) to request interactions with the resource. This one object, called a monitor, can then internally embody all synchronization necessary to make use of the resource safe.
- example: A Java object is a monitor, always. Any of its methods
that are all labeled
synchronized
(or blockssynchronized
on this object) will not be concurrently run. If you want to control access to a resource, hide all accesses to it behind the methods of one object andsynchronized
those methods. Java takes care of the rest!- if you have a
Buffer
class with synchronizedget
andput
methods, eachBuffer
is individually monitoring its own state; they are not synchronized with each other. This means that every one of your hundreds of buffers might all be getting or putting at the same time, but no individual buffer is experiencing multiple get/put operations at once.
- if you have a
- mutual exclusion is guaranteed (from outside the monitor) as long as it's implemented correctly inside the monitor and only the intended methods are visible outside the monitor.
- programmer still must coordinate between tasks that use the monitor. (Deadlock is still a possibility).
- concurrent access requests are implicitly blocked.
- Notes:
- Competition: straightforward; mutual exclusion is automated.
- Coordination: programmer still does bookkeeping.
- just as powerful as semaphores (we can implement each in the other)
Message Passing
message passing: It's like having mailboxes for each task. The delivery of messages can be asynchronous. Each task is responsible for reading its mail frequently enough that communication is maintained.
- Coordination is achieved through messages that share the information
- one task might choose to have multiple mailboxes for various purposes, so that messages stay organized. (It helps to deal with the issue of the ordering of messages).
Some Language Approaches to Concurrency
- C: pthreads library
- Java:
- Thread objects (
run()
methods are tasks) - synchronized methods and blocks
- java.util.concurrent.Lock: locks
- java.util.concurrent.atomic: variable-level synchronization.
- Thread objects (
- Haskell:
- MVar, IORef, Chan
- STM (software transactional memory)
Haskell: MVar
An MVar
gives us a single spot in which to put or take a value -
like a length-one buffer. It automatically blocks when putting into a
full MVar
or taking from an empty MVar
; the blocked thread will be
woken back up when the buffer is now in the right state for their
attempted action.
- some operations:
newEmptyMVar :: IO (MVar a) newMVar :: a -> IO (MVar a) putMVar :: MVar a -> a -> IO () takeMVar :: MVar a -> IO a tryTakeMVar :: MVar a -> IO (Maybe a) tryPutMVar :: MVar a -> a -> IO Bool isEmptyMVar :: MVar a -> IO Bool
- it can represent a lock by putting a value (really, any value) in
the
MVar
; when your take operation succeeds, you've got the lock. Put it back when you're done though. Forgetting the unlock step in any language is a bug. - threads can coordinate by putting messages in each others'
"mailboxes" (just
MVar
's, or perhapsChan
's like below).
Haskell: Chan, BoundedChan
A Chan
behaves like an MVar
that can hold an unlimited number of
values. It thus won't block on put-operations, but take-operations can
still block if there are no values in the Chan
at the moment.
- some operations:
newChan :: IO (Chan a) writeChan :: Chan a -> a -> IO () readChan :: Chan a -> IO a
Chan
's are useful when we don't need a lock-step put-take-put-take usage, e.g. when the mail can pile up for awhile before we get around to reading it. It preserves the original order. It is/was useful on our concurrency examples for the announcer thread, which accepts multiple printing-requests, and handles them in the order received.- A
BoundedChan
behaves like a fixed-size buffer with automatic blocking on reads/writes when necessary. It is like a cross between anMVar
and aChan
- it has a fixed number of spots available like anMVar
(which only has one spot), but can accept multiple puts in a row when there's space (like aChan
, which always has enough space as long as your computer does).
Software Transactional Memory
On the test, you will not be required to write any code in this style. The main idea of "arbitrary blocks that run atomically, recording all changes from that block of code to the global state in one fell swoop" is something you should understand.
atomic
guarantees atomicity on arbitrary expressions. Either an entire atomically labeled block is successfull and all side-effectful things happen all at once ("atomically" committing all those changes to the shared global state), or else we mightretry
or try alternates viaorElse
.GHCI
can selectively only wake up threads when one of the relevant shared state variables has actually been modified, instead of any competitive "wake-the-world" approach (e.g.notifyAll
in Java).- we saw an example of the Dining Philosophers, where the combined actions of taking the left and right utensil, eating, and putting them back were all considered as one atomic group.
4. Subprograms
basics
- subprogram: collection of statements to run, maybe with arguments
and return value. We use subprogram as the general term for various
flavors:
- function: subprogram with parameters and return value.
- subroutine: subprogram with no parameters or return value.
- method: 'function' attached to specific object.
- signature: all structured info in "first line": visibilities, return type, name, formal parameters list, etc.
parameter lists
More terminology to help discuss things:
- formal parameter: part of subprogram definition (declares it)
- actual parameter: part of subprogram call (instantiates it for a particular call)
Sometimes we call formals/actuals "parameters" and "arguments".
allowed arguments
- "first class values": types that may be used as arguments. Usually functions aren't allowed (but it's very useful when they can be).
- candidates:
- primitives
- arrays
- addresses/references
- types
- functions
- more definitions (class definitions, etc)
Parameters Modes
- in: info flows from caller to called
- out: info flows from called to caller
- in-out: info flows both ways through this particular parameter
Parameter-Passing Conventions
Various ways to implement in-, out-, or in-out parameters. Be careful not to confuse your most familiar version of one style as the true meaning of that style! E.g., passing a pointer as a parameter lets information effectively enter and exit the call, but it's just one way to achieve in-out.
- pass by value: copy actual value, send to subprogram.
- in mode
- ex: Java primitives
- pass by reference: send copy of address to subprogram.
- in-out mode (via these "access paths")
- ex: Java parameters of any reference (non-primitive) type
- keep in mind that we have some data in mind; you might consider a
pointer in
C
to give us pass-by-reference access to that data. Many sticklers will complain that we're actually passing the pointer by value. It's a matter of perspective, and not worth getting worked up over.
- pass by name: substitute argument expression for name directly.
- leads to re-evaluation of the expression each time parameter is evaluated (can be good/desired, can be very wasteful)
- issue of where transported names come from: original env? env when called? env when actually evaluated?
- implementation: a closure/thunk. (some code-as-a-value that will evaluate when finally needed/touched).
- pass by need: call-by-name that memoizes the result.
With referential transparency, a rare appropriate time for a pass-by-name flavor. If nobody ever needs the result, it's never calculated (we're lazy). If it's needed multiple times, the first usage touches the thunk/evaluates it to the answer, which is stored. Successive uses find the answer instead of recalculating (allowed since we have referential transparency, a.k.a. purity).
Haskell does this.
Subprograms as Parameters
Referencing Environment: effectively the group of names-to-values for all variables used inside the subprogram that gets used as a parameter.
Be prepared to look at code and differentiate what occurs with different styles of binding. There was an explicit example comparing these styles in our chapter 9 slides.
- shallow binding: use the local environment when the subprogram is
actually executed
- dynamic scoping; uses a dynamic link in stack frame to look back through frames until you find the variable.
- X86-64 has dynamic links (unless there was nothing in the frame at all).
- deep binding: use the environment from the subprogram's original
definition. You can "statically" (at compile time, without
executing) look for the lexical structure that contains a
definition to know where to look for variables that weren't defined
locally. We don't care who called us, we only care where we're
defined.
- static scoping; requires a static link in stack frame, which points back not to the caller's frame but to the most recent frame of the surrounding scope's call (the frame of the parent scope).
some Approaches to Closures
- Haskell thunks (all sub-expressions)
- C function pointers
- Python anonymous functions
Kinds of Polymorphism
- subtype polymorphism: one type extends another, is guaranteed to
provide at least all definitions from parent type.
- ex: Java classes (really any OO classes)
- parametric polymorphism: "forall" types. Never inspects value, so
can work at any type.
- ex: haskell types like
map::(a->b)->[a]->[b]
. - ex: Java Generics
- ex: haskell types like
- ad hoc polymorphism: functionality manually provided at specific
types. Usage at just these types thus works.
- ex: Java interfaces
- ex: Haskell type classes
Overloading methods (providing alternate versions)
Some various examples:
- OO: replace inherited definitions
- Haskell: typeclass instances
- C++: templates
5. Implementing Subprogram Calls
Definitions
- Activation Record: structure of values needed to track one
subprogram call.
- some of: return address, dynamic link, static link, parameters, locals
- Activation Record Instance: actual values shaped like the
corresponding activation record.
- we've been calling these frames, too.
- active frame: frame of subprogram has been called, but has not yet returned. (Might be paused, awaiting return from function-call)
Scenario 1: Early languages: Static Everything!
Description/Implications:
- all variables are static: they get a permanent address in memory forever.
- no need for runtime stack!
- can place each code segment and frame in known-at-compile-time
spots.
- or leave to linker to allow cross-compilation.
- still need a return address (one subprogram can still be called from various places)
- unable to provide general recursion (there's no stack to place extra ARI's of that same subprogram), but we saw it was possible to achieve tail-recursion. It behaved somewhat like a loop, which is exactly what we want.
- Frame structure:
- return address
- parameters
- local variables
Scenario 2: Many modern languages: Stack-Dynamic Locals
Description/Implications:
- store locals/params in frame
- choice: disallow nested subprogram definitions
- just local (this frame) and global (permanent location) scopes
- thus we don't need static links. (see below when they're needed).
- need runtime stack of active frames
- can have multiple frames of one subprogram definition: must keep separate.
- thus, can have general recursion. We are used to these kinds of languages.
- need dynamic links (to remember where previous frame begins on stack)
- Frame structure:
- return address
- dynamic link
- parameters
- locals
Scoping issues (for stack-dynamic locals and no nested subprograms)
- choosing Static Scoping
can only see locals (known offset within frame) or globals (known permanent address).
- all known statically at compile time
- choosing Dynamic Scoping
must be able to walk back through the dynamic links ("dynamic chain") to find most recent occurrence of named variable
- requires both name and value on stack. Naive approach: dig through
frames. Would cause slow lookups, and also we'd need to be storing
the actual parameter names on the stack - weird…
- alternative: keep stack of values for each unique variable name.
- call/return now requires push/pop (more expensive calls), but lookups are fast.
- alternative: keep table of values for each unique variable name.
- each frame stores any variables it will shadow, and restores upon returning. This treats all variables sort of like callee-save register values.
- alternative: keep stack of values for each unique variable name.
- note: dynamic links are also used when we return from one function call to its caller: the dynamic link helps us reset the frame pointer to the beginning of the previous frame on the stack.
- requires both name and value on stack. Naive approach: dig through
frames. Would cause slow lookups, and also we'd need to be storing
the actual parameter names on the stack - weird…
Scenario 3: Stack-Dynamic Locals and Allowed Nested Subprograms
Description/Implications:
- still choose stack-dynamic locals (keep recursion)
- subprogram definitions may be nested now
- still use stack of active frames (general recursion available)
- still need dynamic links
- also need static links (when scoping is static): pointer of most
recent active frame of the direct parent-scope.
- guaranteed to exist (or we couldn't have called the subprogram at all; that's the only time & place the function was in scope).
- Frame Structure:
- return address
- dynamic link
- static link (when using static scoping)
- parameters
- locals
Scoping (stack-dynamic locals and nested subprograms)
- choosing Static Scoping:
Also need static links (when scoping is static): pointer of most recent active frame of the direct parent-scope.
- parent-scope's frame guaranteed to exist (or we couldn't have called the subprogram at all; that's the only time & place the function was in scope).
- implementation detail: static chains. We can statically (at
compile time) calculate the
(/chain/, /offset/)
values of any variable (we know the contents of a frame: return address, old frame pointer (dynamic link), static link, parameters, locals.). We had a thorough example of this in class (thechaining.txt
samples). We could look at the syntactic structure to figure out how many scoping levels away we had to look to find a variable (locally was zero steps away), and then also looked at how much of an offset was needed within that frame to access the variable. (We assumed the return address, dynamic link, and static link, always started off the frame; then, parameters were placed in order, and then locals in order, all within the current frame). This pair of (static-chain offset, local frame offset) is all we need to find a variable's values at runtime, and that pair can be pre-calculated statically (at compile time).
- choosing Dynamic Scoping:
Again, can use dynamic chain to find each variable. (And don't need the static links). Behaves the same as scenario 2 above.