CS463 - Midterm Review - Spring 2024

The test will be administered in person, during class time.

Our midterm will be Thursday, February 29th, 2024 in class.

Allowed: one 8.5x11" cribsheet of self-generated notes (both sides). I strongly suggest writing out this crib sheet, as it's a valuable way to drive your studying. If you're typing things out, you need to keep a reasonable font size (no smaller than 10pt), and you may not include any copied images (including images of text). If you bring that kind of sheet I reserve the right to respond accordingly, perhaps by deducting ten points from the assessment for failure to follow instructions or other actions depending on the content of the cribsheet.

There will be some theory questions as well as some applications of ideas (such as reading/writing snippets of code). I will assume you are comfortable enough with Java to discuss features and terms of that language, and also that you are moderately comfortable with Haskell (using it in ways similar to our homework and sample code thus far).

You will have the entire class time to take the test (75 minutes).

Table of Contents

1 Preparation

This summary section suggests what you should be prepared to do for each of the major portions of our class so far. Generally, look at the homeworks and quizzes we've had so far to get an idea of the kinds of work you're expected to now be comfortable performing.

  • lambda calculus:
    • simplify lambda expressions, listing the rules you've used and without skipping steps.
    • identify which terms are also values (versus terms that are not values)
    • extend the language with new terms, select new values, and create new evaluation rules.
    • write little "programs" (encodings) that implement some logic, using only what is actually in the language. This might be done on the core lambda calculus, with encodings for true/false like we're practiced, and it might also be requested on a lambda calculus that has multiple extensions already added, similar to the last portion of our homework.
    • you don't need to be prepared to use the Y-combinator at all, but using recursion is possible. (via the fix t and E-Fix extension)
  • defining languages:
    • answer some open-ended prompts about language basics
    • read/write basic grammar definitions, based on production rules.
    • perform a sentence derivation (perhaps specifically left- or right-derivation). This is where you begin with the Start symbol, keep replacing a non-terminal, and all your sentential forms eventually end up with a sentence of all terminals, showing that that sentence is in your language. There is no tree drawing involved here, be careful not to confuse derivations with parse trees!
    • draw the parse tree form of a sentence
      • each produced terminal or non-terminal gets its own separate branch downwards, in the order of appearance in the sentence. (example: an if expression could have seven branches if the syntax was verbose enough: if, expr, then, expr, else, expr, endif. We tend to just use "if t t t" styles of representation, meaning three branches below).
      • be careful not to confuse parse trees with derivations! Derivations are all text, parse trees are drawings/diagrams.
    • discuss attribute grammars very briefly
    • discuss dynamic semantics, and differentiate the various kinds we'd discussed. (especially denotational, far more than operational/axiomatic).
  • Haskell:
    • writing/reading function definitions:
      • use helper functions! (as necessary)
      • pattern matching, which often involves multiple top-level equations in a row (that must be visited in the listed order to find the first compatible pattern)
      • pattern guards
    • structural recursion: pattern matching over e.g. a list with a recursive call, where those recursive calls match the data structure's recursive structure. Similarly, creating a list based on those calls (examples: map, filter generate lists while making recursive calls along the shape of a list-argument)
    • expression flavors:
      • if-expressions
      • let-bindings
      • case expressions (pattern matching anywhere an expression can go)
      • lambdas
    • higher-order functions:
      • map, filter, zip/zipWith, foldr/foldl'
    • usage of existing data types:
      • data Bool = True | False
      • type String = [Char]
      • lists (the type uses []'s as syntax such as [Int], just like the values use []'s as syntax such as [1,2,3])
        • the constructors are "cons" (:) and "nil" ([]), or hardcoded lists, e.g. [1,2,3]
        • the pretty syntax of comma-separated values (e.g. [1,2,3,4]) is just a convenient way to build lists, but (:) and [] are still ultimately how each cell of this singly-linked list is made, such as (1:(2:(3:(4:[]))))
      • tuples
        • parenthesized, items separated by commas.
        • the type of a tuple records the type at each position (and implicitly the length of the tuple), e.g. (Int, Char, Bool).
        • useful functions: fst/snd. Or, preferably just use pattern matching.
      • data Maybe a = Just a | Nothing
        • be prepared to read/write functions using Maybe.
      • data Either a b = Left a | Right b
        • be prepared to read/write functions using Either.
  • in general, if you know more about the haskell language (other than importing stuff) you're welcome to use it in code you write, but this basic set of things should easily box in the kinds of things I'd like you to be comfortable writing on our test. No importing things though. And be careful to follow instructions - if it says to explicitly perform the structural recursion, then calling map instead will not earn you much credit.

2 Lambda Calculus

2.1 The Untyped Lambda Calculus

  • The untyped lambda calculus is a simple definition of terms (and a way to evaluate them, discussed later). Consider our shared file:

2.1.1 Terms

Definition of the core lambda calculus' terms;

t ::= x | λx.t | (t t)
  • x represents a variable.
  • λx.t represents a "lambda" (a function, an abstraction). The function is anonymous, because it has no name. The parameter has a name, though (here, we've named it x).
  • (t t) represents an application of one term onto another - assumedly the left term evaluates to a λ value, or else we won't be able to reduce this application term (nor reach a value overall with evaluation).
    • when application is implied we often omit the parentheses and rely on adjacency. For example, f a b c is the same as (((f a) b) c).

2.1.2 Values

We also define what terms in our language are values (terms that cannot be further evaluated and represent a "normal form"). For the core untyped lambda calculus, just lambas are values. (We often extend our language, adding more terms and more values).

v ::= λ x . t

2.1.3 Evaluation

Evaluating a term in the untyped lambda calculus means applying evaluation rules upon a term until the result is a value.

  • Lambdas are by definition already values. (thus there are no evaluation rules that look at just a lambda all by itself, turning it into some even-simpler forms).
    • This is why you'll never look inside a lambda for evaluation steps to take, even if you see a nice "1+2" hiding in there; no evaluation rules will give you the chance to peek inside.
  • We never actually evaluate a variable, because they should be substituted away before we ever reach them. Finding a variable that is "free" (not enclosed by an accompanying lambda that introduces the variable name) would essentially be a runtime error during evaluation: the variable is undefined there.
  • Applications are not values and thus may be evaluatable. They need a lambda as the left term; if it isn't yet a lambda, evaluate it until it becomes one.
    • Progress towards a function argument:
                    t1  ->  t1'
      E-App1  -----------------------
               (t1 t2)  ->  (t1' t2)
      
    • assumedly when t1 has no more evaluation possible, it's both a value and is specifically a lambda. If not, our application is not able to simplify to a value overall; for example, (5 6) makes no sense as a request to use 5 as a function. So let's think just about cases where t1 did eventually end up as a lambda.
  • we also will choose (this time) to require that in our language, only values are supplied as arguments to lambdas; so we need to make progress on the second subterm in an application once we've already got a value for the first subterm:
    • Progress towards a value as the argument to our lambda:
                    t2  ->  t2'
      E-App2  -----------------------
               (v t2)  ->  (v t2')
      
    • again, we're hoping/assuming that v is actually a lambda, but this rule doesn't need to be explicit about it. For now, any value will do there, and we enforce the "actually a lambda" requirement later.
  • evaluation of an application of a lambda and any other value means to substitute that value (the function's argument) for the variable that the lambda introduces, throughout the body-term of the lambda.
                
    E-App-Abs  ----------------------------
                ((λ x . t) v)  ->  t[x ↦ v]
    

2.1.4 lazy semantics

We can embody lazy semantics (where we don't have to evaluate arguments to value before substituting them) by throwing out E-App2 and modifying our E-App-Abs rule to allow arbitrary terms as the argument. Note below that we have t2 instead of v, meaning that we can substitute some arbitrarily large term (perhaps into more than one usage of the parameter if the variable had shown up in the lambda more than once), instead of only being able to substitute values.

            
E-App-Abs-Lazy  ---------------------------------
                 ((λ x . t1) t2)  ->  t1[x ↦ t2]
  • this rule is not a usual part of the lambda calculi we've looked at! This is a variant (just E-App-1 and E-App-Abs-Lazy would be present).

2.1.5 Encoding Other Values (without extending the language definition)

We looked at how to represent some things by effectively writing code in our language (instead of extending the language with extra terms and values and evaluation rules), primarily what are called "Church Booleans" and operations on them. This shows us that we can use this very terse language as-is to start describing more familiar calculations, but we learned that extending the language with boolean primitives was far more inviting. We could still encode functions and other useful things using extensions, giving us the best of both worlds. Here were the expressions we used to represent common true/false values and operations.

Encoding booleans and operations in the core untyped lambda calculus

true  = λ x . λ y . x
false = λ x . λ y . y
not   = λ x . ((a false) true)
or    = λ a . λ b . ((a a) b)
and   = λ a . λ b . ((a b) a)
if    = λ b . λ x . λ y . ((b x) y)
  • a closing comment: if you extend the language to natively include booleans, you should not use the above encodings, because in that case true is not the same thing as λx.λy.x; one's an atomic token (true) term, one's a lambda term. Above, we are simulating having booleans by using lambdas, but we never had actual booleans in the language itself.

2.2 Extending the Language

We can make our language much more interesting by extending our terms and values (and then evaluation/typing rules).

2.2.1 Booleans

  • Language extensions:
    t ::= ... | true | false | if t t t | equals t t
    v ::= ... | true | false
    
  • evaluation extensions:
                          t1  ->  t1'
    E-If        -------------------------------
                 if t1 t2 t3  ->  if t1' t2 t3
    
    E-If-true   -----------------------
                 if true t2 t3  ->  t2
    
    E-If-false  ------------------------
                 if false t2 t3  ->  t3
    
    • rules for equals might need to be exhaustive as our language grows. We could also just leverage some other total predicate function over terms, such as alpha-equivalence α (see our implementation in class; honestly, I will be avoiding equality on our test).
                                 (λx1.t1) =α (λx2.t2)
    E-Equal-lambda-true  ------------------------------------------
                          (equal (λ x1 . t1) (λ x2. t2)  -->  true
    
                                 (λx1.t1) ≠α (λx2.t2)
    E-Equal-lambda-false -------------------------------------------
                          (equal (λ x1 . t1) (λ x2. t2)  -->  false
    

2.2.2 Numbers

There are two approaches here to extending the language with numbers. We can try to implement our own numbers with zero and successor and predecessor, or we can borrow from our host language.

2.3 defining zero/successor (this is kindof painful - it's base one counting!):

t ::= ... | zero | succ t | pred t
v ::= ... | zero | succ v

2.4 borrowing the host's definition (this is far more useful!):

We can also just borrow our host language's implementation (and also here add some operations):

t ::= ... | <Ints> | t + t | t - t | t * t
v ::= ... | <Ints>

2.4.1 Other Extensions

  • Many other extensions are simple to add, such as pairs, lists, records, and so on. We performed some of these in class (see class notes), and you might want to be prepared to either describe the process or carry it out. For instance, we might show the standard definitions of evaluation and substitution for the lambda calculus, and have you extend the language somehow. Or, we might need to write evaluation rules for some extra feature of the language. (later in the semester we would also want to write new type checking rules).
  • many much more significant language features can also be added in this fashion: assignments (and state); classes/objects; case statements; abstract data types; type inference. We will not tackle some giant new extension on the test!

2.5 Recursion

2.5.1 Recursion without extensions

  • the Ω-combinator always diverges - it evaluates to itself.
    Ω = ( (λ x . x x)  (λ x . x x) )
    
  • the Y-combinator helps us define recursive definitions in a language that doesn't directly support recursion:
    Y = λf. ( (λx. (f (λy.x x y)))
              (λx. (f (λy.x x y)))
            )
    
    • define a function whose first argument is assumedly going to be an entire copy of itself. Feed that "stub" of a function to the Y combinator and it will "tie the recursive knot" for you.
    factStub = λself.  λn. if (n=1) 1 (n*(self (n-1)))
    fact = (Y factSub)
    
    • you can test definitions like this in our class-provided code:
    ULC>  eval (App fact (Num 5))
    (Num 120)
    

Note that the y-combinator uses untyped lambdas. When we explore the simply typed lambda calculus, without extending our language significantly, we won't be able to represent recursion such as this. There, only the extension version will suffice when we have types; there's a hidden point at which the types need to be ignored for the Y combinator to do its trick.

2.5.2 Recursion as an extension

We also explored the fix extension. Happily, it expects something of the exact same shape that the Y-combinator does (see factStub above).

t ::= ... | fix t
(no new values!)

We use it by creating a term like fix factStub. Often this may just be used as part an entire encoding definition, without bothering to name the stub.

fact = fix (λself. λn. if (n=1) 1 (n*(self (n-1))))

And then we call it like any other function: (fact 5).

If you are working on values that are recursively built up, like navigating lists, you will likely need to use recursion in order to write functionality using those terms.

3 Language Basics: Syntax and Semantics

3.1 Chapter 1: Basics

3.1.1 Language Evaluation Criteria

  • readability (ease of understanding progs)
    • simplicity, orthogonality, available data types, syntax
  • writability (ease in creating/modifying progs)
    • simplicity and orthogonality, abstractions available, expressivity (ways to write same task)
  • reliability (conformance to specs)
    • type checking, exception handling, aliasing, minimal implicit coercions, good readability/writability
  • cost (time, training, money…)
    • training time, compilation time, runtime, cost of tools, cost of maintenance

3.2 Chapter 2: History

Maybe some day you'll have time to read this if you haven't already. It's a nice read! We won't be testing on this chapter though.

3.3 Chapter 3: Syntax and Semantics

3.3.1 Some Definitions

  • sentence: string of characters from an alphabet
  • language: set of sentences
  • lexeme: the characters from a sentence that correspond to a token (e.g. "foo", "123")
  • token: representation of a lexical unit, e.g. (Identifier "foo"), (Number 123)
  • recognizer: device that finds out if string-inputs are sentences of a specific language
  • generator: device that generates sentences of a language. (Hard to build/use well)

3.3.2 Chomsky Hierarchy of Languages

  • Type 3: regular languages.
  • Type 2: context-free languages.
  • Type 1: context-sensitive languages.
  • Type 0: recursively enumerable languages.
  1. Focus

    We mostly focused on type 2 languages, as those match most programming languages. We don't discuss the others here, but regular expressions fit into Type 3 and are also useful for programming tasks.

3.3.3 Context-Free Grammars

  • abstractions (non-terminals) for groups of syntactic structures. They can build on each other/themselves to completely define specific languages.
  • terminals are lexemes (specific characters)
  • production rules: define possible patterns that can be used to replace non-terminals.
    • our syntax:
      • use UPPERCASE or CamelCase style for non-terminals
        • and just notice that if something is on the left side of a production rule, it needs to be a non-terminal thanks to the rules of how to generate production rules in a Type 2 grammar.
      • lefthand side: exactly one non-terminal
      • righthand side: any mixture of terminals and non-terminals (can also just be epsilon, ε, to indicate the empty string)
      • for convenience, you can put multiple production rules of the same non-terminal in one line, e.g. S -> S+T | S-T | T
  • derivation: repeated substitutions of non-terminals (one at a time please) to arrive at a sentence in a language. May be a left-, right-derivation (or neither) based on which non-terminal we choose to replace in each step of the derivation.
    • sentential form: the string of symbols in one step of a derivation. Only all-terminal forms are sentences.

3.3.4 Parse Trees

  • a hierarchical representation of a derivation. Inner nodes (including the root) are non-terminals, leaves are the terminals. Since there's no record of which part of the tree you drew when, there's no notion of a "leftmost parse tree generation" that would correspond to a leftmost sentence derivation.
  • ambiguity: a grammar is ambiguous IFF there exists a sentence that has 2+ unique parse trees for it. You can also show this sufficiently with 2+ unique leftmost derivations (or 2+ unique rightmost derivations) of the sentence, because it will force you to show at which step there were two different choices for the same non-terminal (the place where the two parse trees differ from each other).
  • associativity: operators can associate left or right. Left-only recursion of production rules results in left-associativity, as well as parse trees that are "heavier" on the left.
    • Example: the plus operator is left-associative in this example because the production rule for S puts S to the left of the plus: S -> S+T

3.3.5 Attribute Grammars - Static Semantics

AGs allow us to decorate a parse tree with values, encoding some (but not all) semantic information. We place these attributes at any node in the parse tree (both leaves and interior nodes), based upon parent or child nodes' attributes or structure, and generated by use of semantic functions. Lastly, we also may enforce predicates about these attribute values at any node.

  • specific attribute values are associated with specific symbols (non-terminals/terminals)
  • all production rules get "semantic functions" that define attributes of the non-terminals in that rule
    • some are synthetic: building up attributes from leaves to root.
    • some are inherited: deciding attributes from root to leaves.
    • some are intrinsic: only dependent upon the node itself.
  • production rules also get predicates that must be true for the sentence to be in the grammar. These compare attribute values.

We had an example that looked at two attribute values - representing the actual type of a term, and separately the expected type of a term. We saw some of the semantic functions for specific production rules to see how the attributes were added, and we also looked at some attribute predicates e.g. that the actual type and expected type values at a node needed to be compatible.

You will not need to perform these steps; however, you should be able to discuss some of the high-level reasons to use AGs.

3.3.6 Dynamic Semantics

  • There's no 'best' approach, because we want to encode the meaning of programs, which can be better described in different ways for different situations. The intended audience can greatly affect what style is most useful: e.g. language implementers, programmers, or someone proving properties about a specific program.
  • three approaches discussed: operational, denotational, axiomatic
  1. Operational Semantics
    • essentially, define meaning by showing how it runs on a machine.
    • too much detail on a real machine, so we usually show how to 'run' it on an idealized (simplified) virtual machine.
    • great for informal descriptions: "it works like this simpler thing"; learning by example.
    1. Natural Operational Semantics
      • "big-step"; focuses on final result
    2. Structural Operational Semantics
      • "small-step"; focuses on the sequence of state transitions
  2. Denotational Semantics
    • maps each non-terminal to a mathematical object (value). Meaning is the result of calling these functions.
    • Based on recursive function theory.
    1. Approach:
      • choose mathematical object (choose a type) for each language entity. E.g., choose integers or strings or booleans as the target output of evaluating them. Could also choose some combination of them (as a union of those sets of values).
      • define functions mapping from those entities to the chosen types of values.
      • we must track the current state: it could be represented by a table of names-to-values. A list of pairs was a direct way for us to represent this (where the first occurrence of a variable is the chosen one in case of ties). [("a",5),("b",12),("c",[1,2,3]),("other",True)...]
      • include the error value, so that failing computations have a value to represent them. This value is often called "bottom" (⊥), and is considered an element of every single type. (It looks like an upside-down T, which means "Top").
        • though not the error value, null in Java also is a value that "fits" into every reference type in that language; it's not too different that ⊥ is being used "anywhere" here.
      • loops are converted to recursion (we use recursion to find the value that results from evaluationg a loop; this value is probably the state of all variables).
      • rigorous, can express correctness of programs; can be used in compiler generation systems. Not as useful for language users though.
  3. Axiomatic Semantics
    • define axioms/inference rules (called assertions) for each production rule. Can reason about programs with these rules.
    • Intended for formal program verification.
    • define assertions as preconditions and postconditions around a statement.
    • "weakest precondition": a minimal precondition that allows us to prove the postcondition.
    • prove a program's correctness: define what correct means as a postcondition of entire program; work backwards finding weak preconditions until you can reach the start of a program with a precondition that can be shown to be true.
    • good for correctness proofs, but really hard to use in a full language definition or by casual language users.
    1. Some Rules
      • Rule of Consequence: lets us modify pre-post conditions as long as they are 'stronger':
        {P} S {Q},  P'=>P,  Q=>Q'
        -------------------------
               {P'} S {Q'}  
        
      • Sequences: chaining statements together.
         {P1}S1{P2},  {P2}S2{P3}
        -------------------------
            {P1} S2; S3 {P3}
        
      • selection: if-expression. (not an if-statement!)
        {B and P} S1 {Q},   {(not B) and P} S2 {Q}
        ------------------------------------------
              {P} if B then S1 else S2 {Q}
        

4 Language: Haskell

You don't need to be a prolific experienced programmer yet in this language, but you should be able to use structural recursion over lists, use helper functions as necessary/useful, and generally inspect pieces of Haskell code and answer questions about it or extend it.

4.1 The Language

  • functional: focuses on defining functions and calling them, instead of constructing elaborate control flows with iteration. We can also partially apply a function, meaning we don't give all of its expected arguments; this yields another (fewer-arguments-needed) function that can also be passed around as a first-class value. More focus on data and transforming data than on the steps of state modification common to imperative languages. (Haskell is data flow oriented rather than control flow oriented).
  • non-strict ("lazy"): every single portion of a program is only evaluated when actually needed, and only as much as needed. Think of it like navigating the bare minimum of the parse tree in order to get to the answer. (Often for us, printing the answer is what "forces" an evaluation to occur). Often, we force the computation because we ask for something to be printed.
  • pure: no side effects allowed in virtually the entire language (everywhere except the IO monad, which we haven't seen yet). Side effects include things like re-assignment to a variable or printing. Seeing expressions' answers printed in the interpreter is the only "side effect" we're arguably seeing so far.
    • referential transparency: the equals sign is truly Liebniz equality, not (re)assignment. This means the compiler is free to replace these definitions at will when trying to simplify or come up with better generated code. The programmer can hopefully also reason about their code more safely. Our manual simplifications in the lambda calculus are also a useful way to trace your Haskell code down to a value. Our code samples have a few such evaluations embedded in comments.
  • strongly statically typed: every single (sub)expression has a type that is discovered at compile time.
    • Haskell also infers the types - pretty handy, but we often want a simpler (and "narrower") type than what it finds, so adding type ascriptions is still quite common and good practice. We have often narrowed down a type to only Int when Haskell would have been happy with any numeric type or something even more general.
  • can be compiled and interpreted.
  • indentation is significant (though braces and semicolons can be used to get around this). Pretty forgiving on layouts.

4.2 Operators, Functions

  • operators are truly just functions that are naturally used infix.
  • functions can be first-class values: they can be parameters to other functions and the result of a function can be another function.
  • internally, all functions truly have only one parameter and might return another function; "multi-parameter" functions are just syntactic sugar for multiple function definitions that each build upon each other.
    • so, zipWith::(a->b->c) -> [a] -> [b] -> [c] is actually zipWith::(a->b->c) -> ([a] -> ([b]->[c])). Those added parentheses aren't needed, because -> is right-associative in Haskell.
  • binary operators can be used like functions - just put in parentheses (changing from 1+2 to become (+) 1 2).
  • functions can be used infix like operators - just surround with back-ticks (changing from mod x y to become x `mod` y).

4.3 Lists

  • two constructors: an empty list [] and a non-empty list (infix operator :, like x:xs).
  • syntactic sugar: [1,2,3] is the same as 1:(2:(3:[])). Either way, we get a singly-linked list built out of cons-cells (:) and an empty list ([]). You can change syntactic representation throughout your code at will to suit the situation.
  • some useful operations: the concatenation operator ++ (as in xs ++ ys), head, tail, last, init, reverse, drop, elem, repeat, cycle.
  • lists are monomorphic. For some specific type, every value in the list must fit into that type. (note, that type might currently be specified by a type variable, e.g.the a in forall a . [a] -> Int. We don't explicitly write the forall but this is parametric polymorphism, much like Java's generics).
  • infinite lists are possible.
    cats = cat : cats
    fibs = 1 : 1 : (zipWith (+) fibs (drop 1 fibs))
    
  • list comprehensions: much like set notation. We generate a list out of other lists' elements, boolean guards, and an expression to use for each item that came from the other generated list(s) and passed all guards. Can be very succinct or highly unreadable.

4.4 Some Other Constructs

  • if-then-else expression (not a statement - must have all three parts)
  • let-expression (a local scope; convenient to give something a name)
  • String values (truly just [Char]) use double-quotes to capture the intended sequence of characters. As String is an alias, whenever you need to show the type, you can either write [Char] or String. Haskell knows they're the same thanks to the alias definition (type String = [Char]). Thus you also can pattern-match using standard list pattern matching approaches, character by character. (Individual characters are surrounded by single-quotes, not double-quotes).
  • case-expression (pattern matching anywhere, not just at a function definition site). The right-arrows replace the equals-signs when we use top-level pattern matching in multiple equations that define our function. Otherwise, it's the same idea and we can still have pattern guards.
  • where-clause (scopes another definition to only be available in the host definition; helps not make our helper function definitions fill up the overall namespace unnecessarily)
    • we didn't do much of this; the same will be true on the midterm.
  • functions: pattern match over parameter(s) and define the simplified result.
    • may have multiple definition lines ("equations"), which are considered in the order they're written, until a valid pattern is found that is compatible with the arguments. Each recursive call starts with the first definition line each time, looking for the first matching pattern.
  • type synonyms: doesn't create new values, just syntactic convenience to make our type signatures more readable.
    type String = [Char]
    type Coord = (Double,Double,Double) -- represent a 3D coordinate as a 3-tuple
    annoyingDistance :: (Double,Double,Double)->(Double,Double,Double)->Double
    distance :: Coord -> Coord -> Double
    
  • type ascriptions (optional). add :: and a type after an expression, or as a header-line before a function's definitions. In more advanced situations, they are useful or needed. They really help get clearer error messages.
    overTen :: Int -> Bool   -- function's type signature
    overTen x = if (x::Int) > 10 then True else (False :: Bool)   -- some unnecessary ascriptions that are allowed.
    

4.5 Some Provided Data Types

  • Bool: data Bool = True | False
  • lists (see above)
  • tuples: specific unchanging length, each spot has its own type.
    • syntax: parentheses around the values, separated by commas.
  • Unit: the type with just one value (a "don't care" placeholder type). The type and value are both represented by ().
    • very similar to Python's None value.
  • Maybe: data Maybe a = Just a | Nothing
  • Eithers: data Either a b = Left a | Right b

4.6 Abstract Data Types

  • the user can create their own types (like IntTree or Tm). Give it a name, as many uniquely named constructors as desired, as many parameters to each constructor, and then they are all collected into the new given type.
    • can be parameterized over type variables (we have seen this but haven't used it much). This type of polymorphism is called parametric polymorphism: absolutely nothing is known about the type, so any type may be used there (and preserved for all related locations in one particular usage of this thing). It is like generics in Java (Java borrowed it from functional languages when they carefully added it to the language in version 5).
    • for now, just add deriving (Show,Eq) to make printing and equality testing available. This uses type classes, which we haven't discussed in depth yet.
  • functions can be written over these types, pattern-matching on the constructors.

4.7 Type Variables

  • represent "forall". zipWith works "for all types a, for all types b, and for all type c" such that zipWith::(a->b->c)->[a]->[b]->[c]. Actual uses of a function with type variables in its type will mean that specific types are chosen to instantiate the type variables. (We use it with "concrete types").

4.8 Coding Hints

  • writing helper functions that have more parameters allows us to simulate state in a direct fashion.
    • example: if an iterative version had three variables in use from iteration to iteration, your helper might have these three extra parameters.
    • we had a class example showing a simplistic way to convert a while loop to recursion. Not always the prettiest output but it's a valid way to get semantically correct code.
  • phrasing your algorithm by defining what you end up with rather than how you might iteratively calculate it tends to give you the exact structure that your code will end up with. Thinking in terms of "milestone" values that split the task down into smaller tasks pays off well.
  • recursion replaces loops, sometimes almost seemingly ridiculously so. The compiler is more active in finding tail recursion than you may be used to occurring, so it's not necessarily as expensive as you've been taught in other classes that only considered imperative style languages. Often you'd get the same kind of assembly output as a loop despite writing your code as recursion.
  • try to think in terms of pattern-matching, as that's the first line of offense. What type are we inspecting? What are all the possible constructors/shapes that we need to address? Often, the only (or the main) argument is a list, and we should consider the empty and non-empty cases. Handle base cases (more specific pattern matches) first. Don't-cares can just match with an _. You can have multiple don't-cares in a single pattern match.
    • look over the code samples and our homework for ideas of what might be asked.
    • often, we will pattern match over empty and non-empty lists as the two options for that parameter.