Table of Contents
- 1. CS131 Notes
- 1.1. Functional Programming
- 1.2. OCaml
- 1.3. Grammars
- 1.4. Parsing
- 1.5. Types
- 1.5.1. Primitive vs. constructed types
- 1.5.2. float
- 1.5.3. Type Usage and Properties
- 1.5.4. Type Equivalence
- 1.5.5. Abstract vs. Exposed types
- 1.5.6. Subtypes
- 1.5.7. Polymorphism
- 1.5.8. Subtype Invariance
- 1.5.9. Duck Typing
- 1.5.10. Static typing (Generics/Java) vs. Dynamic typing (Duck typing/Python)
- 1.6. Java
- 1.7. Logic Programming and prolog
- 1.7.1. Quicksort in prolog
- 1.7.2. prolog syntax
- 1.7.3. Some prolog syntactic sugar
- 1.7.4. Review: Clauses
- 1.7.5. Three Kinds of Clauses
- 1.7.6. append
- 1.7.7. reverse
- 1.7.8. Simple predicates
- 1.7.9. What can go wrong in prolog?
- 1.7.10. Unification, two-way matching, and infinite (cyclic) terms
- 1.7.11. Peano Arithmetic
- 1.7.12. Control mechanisms in prolog: proof trees and backtracking
- 1.7.13. Control mechanisms in prolog: the cut operator (!)
- 1.7.14. Philosophy of prolog
- 1.7.15. Logic
- 1.8. Scheme
- 1.8.1. Basic Scheme syntax
- 1.8.2. Scheme vs. other languages
- 1.8.3. Scheme syntax
- 1.8.4. Quoting in Scheme
- 1.8.5. Special form: lambda
- 1.8.6. Scoping
- 1.8.7. Bonus Scheme
- 1.8.8. Primitives vs. Library
- 1.8.9. Categorization of errors in Scheme
- 1.8.10. Continuations
- 1.8.11. Storage Management (from Albert's Notes)
- 1.9. Storage Management
- 1.10. Information hiding (for modularity)
- 1.11. Errors, faults, and failures
- 1.12. Parameter Passing, Object-Oriented Programming, Cost Models
- 1.13. Cost Models
- 1.14. Rust
- 1.15. Programming Language Semantics
- 1.16. Notes on Python asyncio and Future
- 2. TODO:
1. CS131 Notes
1.1. Functional Programming
1.1.1. The quiz
tr -cs A-Za-z '\n' | tr A-Z a-z | sort | uniq -c | sort -nr
Why should we care about functional programming?
- Clarity: functional programs are easier to write and maintain. Functional programming guarantees referential transparency: an identifier (name) always refers to the same value, so if you write the same expression twice with the same inputs you get the same value. This is not true for C, C++, and other non-functional programming languages. It feels a lot like mathematics.
- Performance: want our programs to run faster (to be more easily optimizable - compiler should do the heavy lifting)
For example for performance under functional programming assumptions, we could call functions in the following way
```
return f(x) + f(x)

    movl x, %ax
    call f
    addl %rax, %rax   // With side effects, you cannot make this simplification!
    ret
```
since \(f(x)\) is functional, we can rely on it not affecting program state, allowing us to make this simplification in our "pseudo" assembly.
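The same idea in OCaml terms (a small sketch of mine, not from the lecture): because a pure f x has no side effects, evaluating it once and reusing the result is guaranteed to give the same answer.

```
(* With a pure f, these two functions always produce the same value,
   so a compiler is free to rewrite the first into the second. *)
let f x = x * x + 1

let twice_direct x = f x + f x
let twice_shared x = let y = f x in y + y   (* f evaluated only once *)
```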
1.1.2. Approaches to programming languages
Imperative:
- The basic unit of an imperative programming language is the command/statement, i.e., S1, S2, etc.
- These languages offer significant control over the computer; the "glue" is statements joined by semicolons, like S1; S2; S3
- It relies on mutable variables: variables with state, modified via assignment.
- Note that the sequential, command-based nature of imperative languages makes it inherently difficult to scale programs written in them to modern, multi-threaded and distributed computing models.
- The reason for this is that modifying the program state makes multi-threaded environments difficult. For example, race conditions modify shared state in an unpredictable order. Execution order is crucial because statements mutate state.
- There is also a performance overhead that comes with mutexes, locks, etc to prevent this.
- On the other hand, functional programming helps since immutability removes race conditions (no shared mutable state), referential transparency allows parallel execution without worrying about side effects.
Functional:
- The basic unit of a functional programming language is the function, i.e., F1, F2, F3, etc., based on the idea of mathematical functions.
- For example, A(B(x, y), C(z)). Since we are guaranteed the calls don't change program state, we can easily parallelize B(x, y) and C(z) "for free".
- Functional languages give us referential transparency: we always know what an identifier refers to.
- This comes at the cost of giving up side effects, e.g. assignment. However, without the side effect of I/O, we can't ever get work done.
- This means that functional languages need an "escape hatch". OCaml's REPL allows its CLI to communicate with the user; in the real world, functional languages aren't purely functional but generally encourage the user to write in a functional style.
- The "glue" in this case is function application, i.e., F1 (F2 (x), F3 (y, w), w).
- Functional languages also lend themselves to functional forms or higher-order functions: functions that take other functions as arguments, or return other functions as results.
- In OCaml, functions are first-class values, meaning they can be assigned to variables, passed as arguments, returned as results, or stored in data structures.
- We can return a function directly, i.e., let make_adder x = fun y -> x + y;;, and use partial application, let add5 = make_adder 5, which returns a function.
Logic:
- Logic languages are a subset of functional languages based on predicates, which are simply logical statements. Using logical operations, we can combine predicates to reason about things. The logic model may not be used as much as the functional/imperative models, but the logical way of thinking can be applied to things like database queries and static analysis of other languages.
1.2. OCaml
Some important features of ML: compile-time type checking, type inference, garbage collection, higher-order functions. OCaml is an object-oriented dialect of ML.
1.2.1. Values
Everything in OCaml has a value, and every value has a type. OCaml also has type inference, and can typically infer the type of a value automatically. Below we bind a value to the identifier ("name") x
. Importantly, bindings in OCaml are immutable: the value assigned to some name never changes.
```
let x = 37*37;;
val x : int = 1369
(* We have: val, name, :, type, value *)
```
1.2.2. Conditionals
We can write conditionals like so: if x < 2000 then "a" else "b";;. This is basically a ternary operator, and it is an expression, not a statement: it evaluates to a value. In a conditional, you cannot have two different return types for the two branches.
Since they are expressions rather than statements, conditionals are fundamentally different from imperative if statements. A conditional always produces a value and does not cause side effects, whereas in imperative languages an if statement controls the execution flow. It also requires both then and else, to ensure all branches return a value.
1.2.3. Equality
OCaml provides two different equality operators:
- =: structural equality, which simply compares values, i.e., "hello" = "hello" is true even if the two strings are different objects in memory.
- For lists and tuples, the comparison is element by element. They are equal if 1) they have the same length, and 2) corresponding elements are structurally equal.
- ==: physical equality, which checks if two expressions point to the same object in memory. Even if they have the same content but are different objects, it will return false. This checks the address, which is fast and efficient, especially compared to traversing something like a list. See the example below.
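For instance, a quick REPL session (my own illustration, not from the original notes) shows the difference:

```
let a = [1; 2; 3];;
let b = [1; 2; 3];;
a = b;;   (* - : bool = true,  same structure *)
a == b;;  (* - : bool = false, different objects in memory *)
a == a;;  (* - : bool = true,  same object *)
```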
1.2.4. Data Structures
There are two main data structures to consider in OCaml: tuples (heterogeneous) and lists (homogeneous).
Tuples:
Tuples are simply fixed length collections of elements of any type. Note that tuples are denoted using the *
symbol and that elements are separated using commas. We have for example
let tj = (1, "a");; val tj : int * string = (1, "a") let tj = (1, "a", ("c", 5.3));; val tj : int * string * (string * float) = (1, "a", ("c", 5.3))
The empty tuple has the type unit = () and is typically associated with side effects; it indicates the absence of useful data.
Lists:
Lists in OCaml require that every element of a list be the same type. Lists are implemented as immutable singly linked lists, where each list element consists of a value (the head), and a pointer to the next element (the tail). The last element points to an empty list ([]
) which acts as the base case for recursion.
Take for example the following:
```
let empty = [];;
val empty : 'a list = []

let notempty = [1; 3];;
val notempty : int list = [1; 3]

empty = notempty;;
- : bool = false
```
note that 'a
denotes a type variable standing for any type. When we compare []
with an int list
, it is inferred that []
is an int list
in order to do the comparison.
There are two important operations on lists: prepending and appending. Note that lists cannot be modified after creation, so operations like @ and :: return new lists instead of modifying existing ones.
- :: is \(O(1)\) and returns a list with a new head prepended at the front.
- @ is \(O(n)\) in the length of its left operand and returns a new list with a new tail appended.
A function that conses head::tail has the following type signature.
```
let ocons head tail = head::tail;;
val ocons : 'a -> 'a list -> 'a list = <fun>
```
1.2.5. Functions
Functions are simply values in OCaml, meaning we define them with let
. Importantly, every function in OCaml takes a single argument and returns a single result. To "pass" multiple arguments or return "multiple" results, we are essentially using tuples (or currying). From the language point of view, there's always one argument. In a function's type, only the final component (after the last arrow) is the result type; everything before it is an argument type.
OCaml applies functions using left-associative application, which allows it to write f x y
easily. This also means that f a b c
is syntactic sugar for (((f a) b) c)
, and each function call returns a function unless it reaches full application. Function types are right associative. This means that the function type int -> int -> int
is interpreted as int -> (int -> int)
. This is related to currying, since this implies that functions in OCaml take one argument at a time and return another function until all arguments are supplied. For example,
```
let add x y = x + y;;
val add : int -> int -> int
(* equivalent to *)
val add : int -> (int -> int)
```
this means that add is a function that takes an int (the first argument) and returns a new function that takes another int and returns an int.
This is directly related to currying/partial application, which also allows us to guarantee that each function only has one argument. The process of turning a function that takes multiple arguments into a sequence of single-argument functions is known as currying. Namely, let f x1 x2 ... xn = e
is syntactic sugar for
let f = fun x1 -> (fun x2 -> (... (fun xn -> e)...))
i.e., they are semantically equivalent.
fun
is used to create anonymous functions and is inherently curried. fun x y -> x + y;;
is equivalent to fun x -> fun y -> x + y;;
.
function
is primarily used for defining functions with pattern matching:
function | 0 -> -3 | x -> x + 1;;
which is syntactic sugar for
fun x -> match x with | 0 -> -3 | x -> x + 1;;
we can also return functions (in general). Take the following
```
let nnn = function
  | 0 -> (fun x -> x)    (* Identity function if input is 0 *)
  | _ -> (fun _ -> 1)    (* Function always returning 1 otherwise *)
```
1.2.6. Pattern matching
Pattern matching is a super powerful way to decompose data structures, destructure values, and handle conditional logic. Take the following map
function:
```
let rec map f u = match u with
  | [] -> []
  | x :: y -> f x :: map f y;;
val map : ('a -> 'b) -> 'a list -> 'b list = <fun>

map (fun x -> x * x) [1; 2; 3; 4];;
- : int list = [1; 4; 9; 16]
```
the match
statement allows us to take care of the two cases. To match lists, we use h::t
, where the head is just an element ('a
) and the tail is a list ('a list
). They can also be thought of as conditional statements, as seen below. Note that matches are done top-down, and _
is a wildcard that is catch-all: matches with anything and discards the value.
let g x = if x = "foo" then 1 else if x = "bar" then 2 else 0;; val g : string -> int = <fun> let g' x = match x with | "foo" -> 1 | "bar" -> 2 | _ -> 0;; val g' : string -> int = <fun>
when a pattern match is not exhaustive, OCaml will detect this and warn us. In a general sense, the idea is
match E with | P1 -> E1 | P2 -> E2 | ... | Pn -> En
Look at your variable and see if it matches pattern \(P_i\), if so evaluate \(E_i\). A list of basic patterns is below.
| Pattern | Value that it matches |
|---|---|
| 0 (any constant) | itself |
| "a" | itself |
| a (any identifier) | any value (binds the identifier to that value) |
| _ | any value, discarding it (_ is never bound to anything; the "don't care" pattern) |
| (P) | whatever P matches |
| P1,P2,...,Pn | any n-tuple such that its first component matches P1, its second matches P2, ... |
| [P1;P2;...;Pn] | any list of length n, with corresponding submatches (rare) |
| P1::P2 | any list of length > 0, such that P1 matches the first item in the list and P2 matches the rest of the list |
match
can be thought of as syntactic sugar in the following way:
```
let cons (a, b) = a::b
let cons x = match x with | (a, b) -> a::b
let cons = (fun x -> (match x with | (a, b) -> (a::b)))
```
The first is syntactic sugar for the second which is syntactic sugar for the third. No fallthrough cases because (a, b)
matches any 2-tuple that can be passed to cons
. Below we access members of a tuple by pattern matching
```
let second p = match p with | (_, y) -> y;;
val second : 'a * 'b -> 'b = <fun>

second (42, "apple");;
- : string = "apple"
```
We can do a similar pattern matching with lists, but we need to be careful to take care of every possible case as lists can have a variable number of elements.
1.2.7. Recursive Functions
In OCaml, we declare recursive functions using the rec
keyword. When defining recursive functions, use let rec
, since the general rule is that you can't have let ID = EXPR
where ID
is in EXPR
. I.e., let n = n + 1
is not allowed. For example, the following function removes odd-indexed elements:
```
let rec eo = fun x -> match x with
  | [] -> []
  | h::_::t -> h::(eo t)
  | [h] -> [h]
;;
```
OR, if you wanted to be smarter:
```
let rec eo = fun x -> match x with
  | h::_::t -> h::(eo t)
  | x -> x
;;
```
where the last pattern matches both the empty list and the single-element list. We can also use the function keyword as syntactic sugar for fun x -> match x with ..., where x is implicit.
Take for example the following reverse list functions:
```
(* This is of type 'a list list -> 'a list !  It reverses a list of lists
   of something, and flattens the result. *)
let rec reverse = function
  | [] -> []
  | h::t -> (reverse t) @ h;;

(* This will reverse the list correctly, but it's slow!  Lists are linked lists:
   A @ B is O(|A|), while A :: B is cheap since we just need to make a new link. *)
let rec reverse = function
  | [] -> []
  | h::t -> (reverse t) @ [h];;

(* Accumulator version: we run rev_helper with a = [], making reverse a curried function. *)
let rec rev_helper = fun a -> fun l -> match l with
  | [] -> a
  | h::t -> rev_helper (h::a) t

let reverse =
  (let rec rev = fun a -> function
     | [] -> a
     | h::t -> rev (h::a) t
   in rev [])
```
Suppose we want to write a function where we supply a comparator function. Take the \(\min\) operation, for example:
```
let rec gmin lt idval = function
  | [] -> idval
  | h::t ->
      let tmin = gmin lt idval t in
      if lt h tmin then h else tmin

(* Use it like so: *)
gmin (<) 99999999999 [5; -9; 27; -100; 12];;
(* returns : int = -100 *)

(* Has type: *)
val gmin : ('a -> 'a -> bool) -> 'a -> 'a list -> 'a = <fun>
```
1.2.8. Types
Declare type alias for a function type with i.e., type myfntype = int -> int
.
A discriminated union type allows a type to hold multiple distinct forms. Consider the following general example
type shape = | Circle of float | Rectangle of float * float | Square of float;;
Discriminated unions can be thought of as a class that has both a type tag and a value. One common example is the option type, which we could implement as
type 'a option = | None | Some of 'a
the option
type is useful because it replaces the idea of having null pointers, removing the idea of "dereferencing a null pointer." It essentially indicates if a value does or does not exist, and is useful for pattern matching.
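As a small illustration (my own, not from the original notes), a function that would return a null pointer in C can instead return an option, forcing the caller to pattern match on both cases:

```
(* Safe head: returns None instead of raising an exception on the empty list. *)
let hd_opt = function
  | [] -> None
  | h :: _ -> Some h;;

match hd_opt [1; 2; 3] with
| Some h -> h   (* 1 *)
| None -> 0     (* the caller must decide what "absent" means *)
```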
We can also compare and contrast the discriminated union type in OCaml vs. the union type in C/C++:
```
type mydutype =
  | Foo
  | Bar of int
  | Baz of int * string
(* Every mydutype value has a tag field, which can't be overwritten.
   It can be relied on by the application. *)

(* You can create a value of a discriminated union type by using a constructor
   named in the type: *)
Bar 3

(* You can use pattern matching to examine a value of a discriminated union. *)
match myval with
| Foo -> 1
| Bar n -> n + 7
| Baz (n, _) -> n - 3
;;
(* Pattern matching guarantees that the way you access the value is safe. *)
```
vs. union of C/C++
```
union mydutype {
    char dummy;                      /* nothing here */
    int bar;                         /* single int */
    struct { int i; char *s; } baz;  /* int, string pair */
};
// Unions are "stacked": all member addresses are the same.
// It is on the programmer to make sure you do not access the wrong member.  Error prone!
union mydutype u;
&u == &u.bar
&u == &u.baz.i
...
```
Importantly, unions in C/C++ are dangerous since they are "stacked", and it is on the programmer to only access one field.
Generic types: we can define generic types in OCaml, parameterized over a type variable. For example,
```
type 'a option =
  | None
  | Some of 'a   (* tag field is Some *)

let x = Some "abc"
let y = None

match x with
| Some n -> ...
```
1.3. Grammars
Why grammars? We need a specification for the programs we write, especially their syntax. We expect things to be consistent and unambiguous.
1.3.1. Syntax vs. Semantics
Syntax is about form. Semantics is about meaning. Syntax is the "lesser of the two", an easier problem.
Specification is about putting constraints on how a program is supposed to behave, and syntax is just a formal way to describe this. In English, we can write something syntactically correct but semantically nonsense ("Colorless green ideas sleep furiously"). You can also get an ambiguous sentence ("time flies": timing a fly, or time going by fast?). The same idea in programming languages depends on the compiler and how the compiler parses syntax. In order to resolve ambiguity, we need an airtight context-free grammar.
Successful syntax is easier to use and read, safer for the programmer, has some useful redundancy, and most importantly is unambiguous. A grammar is simply a syntax specification. We want some redundancy in order for software to be reliable. Extra constraints in the syntax help catch mistakes early. For example, explicit type declarations in statically typed languages catch type errors at compile time (let x : int = "hello";;). Another example would be parentheses, to avoid ambiguity, i.e., let y = (3 + 4) * 2;;
1.3.2. Tokens
Tokens are the lowest-level components of most programming languages, but not necessarily things "we just make up" (e.g. in English, we can make up tokens). A program is a byte sequence that represents a character sequence, and this character sequence represents a token sequence. Just as not every byte sequence is a valid character sequence, not every character sequence represents a valid token sequence.
Lexical analysis is the process of taking the bytes of some input program and figuring out where the boundaries between the words are. This allows the compiler to look at “tokens”, instead of individual characters. Combining the tokens and any additional information (think “number” and “42”) gives the set of lexemes.
Generally, we use greedy tokenizers, i.e. we continuously find the longest set of characters that could be a token and choose it.
In addition, consider issues of internationalization (support for non-ASCII characters and the security concerns that could follow) as well as the issue of reserved words: having reserved words makes it harder to extend a language as new features break old programs.
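A minimal sketch of this maximal-munch idea (my own illustration, not from the notes), over a toy alphabet of identifiers and the + / ++ operators; it shows why an input like b+++++c lexes greedily as b, ++, ++, +, c:

```
(* A toy maximal-munch tokenizer: keep extending the current token while the
   extension still forms a valid token, so the longest match always wins. *)
type token = Ident of string | PlusPlus | Plus

let tokenize (s : string) : token list =
  let n = String.length s in
  let rec go i acc =
    if i >= n then List.rev acc
    else if s.[i] = '+' then
      if i + 1 < n && s.[i + 1] = '+'
      then go (i + 2) (PlusPlus :: acc)   (* longest match wins *)
      else go (i + 1) (Plus :: acc)
    else begin
      let j = ref i in
      while !j < n && s.[!j] <> '+' do incr j done;
      go !j (Ident (String.sub s i (!j - i)) :: acc)
    end
  in
  go 0 []

let () =
  assert (tokenize "b+++++c"
          = [Ident "b"; PlusPlus; PlusPlus; Plus; Ident "c"])
```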
1.3.3. BNF (Backus-Naur Form)
A BNF grammar rule looks like LHS ::= RHS
. The left hand side is a non-terminal symbol and the right hand side is any sequence of terminals or non-terminals. For example
```
NT = {S}              # non-terminals
T  = {a, b, c, d, e}  # terminals
S ::= a S
S ::= S b
S ::= c d e
```
where we denote non-terminals in capital letters and terminals in lowercase. acdeb is a valid string that satisfies this grammar. If some text is valid under a grammar, it will be possible to build a parse tree such that each of the skeleton nodes (non-terminals) follows a defined grammar rule and the leaves (terminals) represent the text. BNF is a metasyntax notation for context-free grammars.
1.3.4. Extended BNF
Extended BNF extends BNF by adding metanotations which shorthand for BNF, in something similar to a Regex style way. Whatever grammar you can write in BNF you can write in EBNF (and vice versa). For example, to exclude the characters <
and &
from CharData
, we could write
CharData ::= [^<&]*
which is equivalent to
```
X ::= a
X ::= b
X ::= c
...                      # X is any character other than '<' and '&'
CharData ::=             # empty
CharData ::= X CharData
```
Another example below, for the definition of floating-point literals in Python:
```
floatnumber   ::= pointfloat | exponentfloat
pointfloat    ::= [intpart] fraction | intpart "."
exponentfloat ::= (intpart | pointfloat) exponent
intpart       ::= digit+
fraction      ::= "." digit+
exponent      ::= ("e" | "E") ["+" | "-"] digit+
```
we extend BNF with optional elements [ ... ]
, repetitions { ... }
and grouping ( ... )
. For example,
```
<expr>   ::= <number> | <expr> ("+" | "*") <expr>
<number> ::= [ "0"-"9" ]+
```
where [ "0"-"9" ]+
means one or more digits, and ("+" | "*")
means either +
or *
. Translating EBNF back down to BNF can be difficult in some cases. For example,
```
<no-fold-literal> ::= "[" *dtext "]"
# This is EBNF; the * in front means repeat that 0 or more times.
# To do this in BNF, we have to introduce a new non-terminal:
<sdtext> ::=                  # has to be able to be empty, since we have *0* or more times
<sdtext> ::= dtext <sdtext>
<no-fold-literal> ::= "[" <sdtext> "]"
# Now they are equivalent.
```
continuing, we have the following EBNF to BNF translation
```
# EBNF (RFC style)
atext ::= /ALPHA/DIGIT/ ...
dot-atom-text ::= 1*atext *("." 1*atext)
# The 1* prefix means "repeat 1 or more times" (like +); a bare * means "repeat 0 or more times".
# The ( ) grouping means the * controls the whole group.
# So we have 1 or more atext, followed by 0 or more groups of a dot followed by
# 1 or more atext.  For example: sub.example.com
# To express this in BNF we could have:
1atext = /ALPHA/DIGIT/...
1atext = /ALPHA/DIGIT/... 1atext
0atext =
0atext = "." 1atext 0atext
dot-atom-text = 1atext 0atext
```
We can match grammars like this easily with, e.g., grep: even though it is written as a context-free grammar, it has no nested recursion, so it corresponds to a regular expression.
Every BNF (and EBNF) grammar is context-free. But you can't turn every EBNF grammar into a regular expression. This is because regular expressions "can't count". More specifically, they cannot enforce arbitrary nesting or ensure balanced constructs. For example, we could have
<balanced> ::= "(" <balanced> ")" | ε # valid strings are "", "()", "(())" etc
since regular languages are defined by finite automata, they do not have a memory/stack to track such dependencies. They cannot remember how many ( have been opened or how deep they are inside parentheses.
For a grammar to be parsed by a regular expression, there should not be any (nested) recursion: every expansion must make the expression simpler.
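To make the "regexes can't count" point concrete, here is a small sketch of mine (not from the notes): recognizing balanced parentheses needs a counter (or stack), which is exactly what a finite automaton lacks, but a few lines of recursive code handle it easily.

```
(* Balanced-parentheses check: the depth counter is the "memory" that
   regular expressions / finite automata do not have. *)
let balanced (s : string) : bool =
  let rec go i depth =
    if depth < 0 then false
    else if i = String.length s then depth = 0
    else match s.[i] with
      | '(' -> go (i + 1) (depth + 1)
      | ')' -> go (i + 1) (depth - 1)
      | _   -> go (i + 1) depth
  in
  go 0 0

let () =
  assert (balanced "(())");
  assert (not (balanced "(()"))
```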
XML Grammar
The XML grammar looks like the following:
```
CharData      ::= [^<&]*
CharRef       ::= '&#' ([0-9]+ | 'x' [0-9a-fA-F]+) ';'
EntityRef     ::= '&' Name ';'
Reference     ::= EntityRef | CharRef
Eq            ::= S? '=' S?
S             ::= (#x20 | #x9 | #xD | #xA)+
AttValue      ::= '"' ([^<&"] | Reference)* '"' | "'" ([^<&'] | Reference)* "'"
NameChar      ::= NameStartChar | "-" | "." | [0-9]
NameStartChar ::= ":" | [A-Za-z] | "_"
Name          ::= NameStartChar (NameChar)*
Attribute     ::= Name Eq AttValue
STag          ::= '<' Name (S Attribute)* S? '>'
ETag          ::= '</' Name S? '>'
EmptyElemTag  ::= '<' Name (S Attribute)* S? '/>'
content       ::= CharData? ((element | Reference) CharData?)*
element       ::= EmptyElemTag | STag content ETag
```
in the XML grammar, lower-case means a non-terminal that is recursively defined. Uppercase is not recursive.
EBNF has an ISO standard, which includes the following
"terminal" # Can write it with single or double quotes 'terminal' [optional] # Anything in brackets is optional, 0 or 1 times {repetition} # Anything in braces can appear 0 or more times (grouping) # Group things together N*X : repeat N times # Can also write it this way instead of repetition X-Y : set difference # Instance of X that is not an instance of Y; exception X,Y : concatenation # Instance of X followed by an instance of Y X|Y : disjunction (or) X = Y; # How to write a rule in ISO-EBNF
ISO EBNF syntax is defined in ISO EBNF syntax! See the following
```
syntax      = syntax rule, {syntax rule};   # Can't have an empty syntax rule.
syntax rule = meta id, '=', defns list, ';';
defns list  = defn, {'|', defn};
defn        = term, {',', term};
term        = factor, ['-', exception];
```
Note that syntax rule
describes itself. Similar to how int x = x
is not feasible in most programming languages. We could do int* x = &x
which is OK because by the time we get to the RHS we have already allocated storage for x
, so it's OK to take the address.
Back to ISO EBNF in terms of ISO EBNF, why is this OK?
The way the authors of ISO EBNF resolve this is to specify ISO EBNF in English, and then put the ISO EBNF definition of ISO EBNF in an appendix. The appendix is not the definition; it's documentation.
1.3.5. Debugging Grammars
The next question we ask ourselves is what can go wrong?
What can go wrong with tokenization?
- Confused about characters: i.e., o vs. cyrillic o (i.e., auditing code, international developers)
- Keywords in programming languages (reserved words, depends on the programming languages, case matters in some programming languages, in some they don't, etc)
- Tokenizers are greedy!
- A tokenizer keeps extending the current token for as long as it can, and cuts the token off and moves on to the next one at the moment the extension would no longer be valid.
- e.g.
a = b+++++c;
- Long tokens, e.g. a character string. Can run into issues like buffer overflow or memory limitations. It could overwrite adjacent memory, or there may be some identifier length limit.
- Give people ways to write shorter tokens that mean the same thing as a longer token.
- In C, we can attack this problem with a backslash at the end of a line (backslash-newline continuation).
- This can also run into issues! Consider a \ elsewhere in your code: you may not be sure what is going on, and the compiler may interpret it in a way you are not familiar with.
- Comments and whitespace.
- This part of the program is ignored, but you need to recognize it in order to ignore it.
What can go wrong with (context-free) grammars?
First, we should know the components of a context-free grammar: a set of tokens (the terminal symbols), a set of nonterminals, a start symbol (which must be a non-terminal), and a set of production rules (at least one of which must have the start symbol as its LHS).
What is a rule? A rule is a pair: a LHS which is a non-terminal, and a RHS which is a finite sequence of symbols (terminals or nonterminals).
Now we move onto examples of grammars that have issues. Consider the following grammar, where S
is the start symbol, lowercase letters are terminals and uppercase are nonterminals.
```
S -> a
S -> b
T -> c
S -> S
```
Our third and fourth rules are useless. We can never get "c" in a valid sentence and our last rule is simply redundant. Now consider
```
S -> aTb
S -> Sa
S -> b
```
in this case, our first rule doesn't make sense: there are no rules defining T in this grammar, so it can never derive a terminal string. We also have
```
S -> Sa
S -> b
```
this grammar is left-recursive, and these are difficult to parse for top-down parsers. The reason for this is that a recursive descent parser works by calling functions recursively for each non-terminal. The parser calls itself indefinitely. For example, trying to parse 1 + 2 + 3
using
<expr> ::= <expr> "+" <term> | <term>
will result in the parser calling itself indefinitely on <expr>, never consuming any input. Well-behaved parsers consume tokens as they expand rules.
parse_expr() → calls parse_expr() → calls parse_expr() → infinite loop!
to prevent this, we need to convert our left-recursive grammar to right recursive. From our example earlier, we turn it into right-recursive form in the following way:
<expr> ::= <term> <expr_tail> <expr_tail> ::= "+" <term> <expr_tail> | ε
this avoids infinite recursion by ensuring each step consumes a token before recursing. From our simpler example from much earlier, we would change it in the following way:
```
S -> bT
T -> aT
T -> ε
```
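As a sketch (my own, assuming the tokens have already been split into a string list), the right-recursive expr / expr_tail grammar above maps directly onto mutually recursive functions, each of which consumes a token before recursing:

```
(* Recursive descent for:
     expr      ::= term expr_tail
     expr_tail ::= "+" term expr_tail | ε
   Tokens are assumed pre-lexed, e.g. ["1"; "+"; "2"; "+"; "3"].
   Each function returns the unconsumed suffix of the input. *)
let rec parse_expr tokens =
  let rest = parse_term tokens in
  parse_expr_tail rest

and parse_expr_tail = function
  | "+" :: rest -> parse_expr_tail (parse_term rest)  (* consume "+", then a term *)
  | rest -> rest                                       (* ε: nothing to consume *)

and parse_term = function
  | t :: rest when t <> "+" -> rest   (* a term is any single non-"+" token here *)
  | _ -> failwith "expected a term"

(* All input is consumed, so the parse succeeds. *)
let () = assert (parse_expr ["1"; "+"; "2"; "+"; "3"] = [])
```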
We also have to consider that we need to capture the entirety of our language. For instance: in English, we need verbs and nouns to agree in plurality. A grammar for English would thus need to capture this complexity. However, at the same time, we need to consider the level of detail we actually want: for instance, if we tried to capture the constraints of plurality, gender, and tense, we would have an incredibly large grammar.
Another issue: consider things like typedef
in C++: this means that when we parse, we can’t do a
simple tokenization, as the typedef affects whether a token is a variable or a type.
In Python, the parser would also need to take note of indentations, adding complication as it can’t simply do something like looking for braces.
What can go wrong, cont.
- Useless rules:
- Unreachable non-terminals: non-terminals that cannot be derived from the start symbol.
- e.g. S -> A; T -> B; (if S is the start symbol, T is unreachable)
- Blind alleys: rules that lead to derivations that cannot produce terminal strings.
- e.g. S -> a; S -> bT; T -> cT; (T can only derive more T's and c's, never terminating)
- A blind alley is a rule that can never be used in any complete derivation from the start symbol of the grammar. Because in our example T recursively expands into itself, it never reaches a terminal string. Any valid sentence must eventually consist entirely of terminals, and since T only expands into more T's it never produces a valid string; such a rule appears in no valid sentence and serves no purpose in the grammar.
- Redundant rules: rules that are unnecessary as they don't add to the language defined by the grammar
- e.g. S -> a; S -> bS; S -> ba; (S -> ba is redundant)
- Impact of useless/redundant rules: grammars become more complex and harder to understand, but the language defined remains the same.
- Extra constraints the grammar cannot capture
- Grammars (BNF/EBNF) are context-free grammars, and CFGs have limitations.
- Fundamentally, they cannot capture all syntactic rules of programming languages
- e.g. Declaration before Use: CFGs cannot enforce that variables must be declared before being used (semantic rule, not purely syntactic)
- e.g. Type Checking: CFGs cannot enforce type correctness (semantic rule)
- Workaround: programming language standards use "English prose" to specify constraints that CFGs cannot express (semantic rules, context-sensitive rules)
Example in English: take for example the following grammar:
```
S  -> NP VP .   # A sentence is a noun phrase, followed by a verb phrase, followed by a period.
NP -> n         # Terminal noun
NP -> adj NP    # Terminal adjective followed by a noun phrase
VP -> v         # Terminal verb
VP -> VP adv    # Verb phrase followed by terminal adverb
```
in this grammar, the sentences "dogs bark." and "Maxwell meows loudly." are allowed. However, the sentence "Maxwell meow loudly." is valid in the grammar but not in English, our grammar is too generous. We can try to fix this by complicating the grammar:
```
S   -> SNP SVP   # A sentence can be a singular noun phrase followed by a singular verb phrase
S   -> PNP PVP   # or a plural noun phrase followed by a plural verb phrase
SNP -> sn
SNP -> adj SNP
PNP -> pn
PNP -> adj PNP
...              # would have to do the same for verb phrases
```
so we can fix this problem, but the fix comes at the price of the grammar becoming more complicated. For programming languages, this can be very difficult, so we instead introduce "English" (i.e., constraints stated outside of the grammar).
- Ambiguity
- Ambiguous grammar: a grammar that allows more than one parse tree (derivation) for the same input sentence (token sequence)
- Undesirable in programming languages: ambiguity leads to uncertainty about program meaning and behavior; compilers might interpret code differently than intended.
- Example of ambiguous grammar (arithmetic expression):
S ::= id | num | S "+" S | S "*" S | "(" S ")" Example ambiguous sentence: id "+" num "*" id Two possible parse trees (different precedence): (id "+" num) "*" id vs. id "+" (num "*" id)
Suffers from precedence and associativity problems. We can fix this in one way by defining the order of operations, or define grouping for operators of the same precedence (e.g. left-associativity for subtraction: a - b - c
is ((a - b) - c)
). Or, we can complicate the grammar. For the example above, we can do the following:
```
S ::= T + T
S ::= T
T ::= F * F
T ::= F
F ::= id
F ::= num
F ::= (S)
```
which gives multiplication higher precedence than addition. However, we still have a problem: for example, a + b + c cannot be parsed because S can only have two subterms.
To fix this, we make our grammar recursive:
```
S ::= S + T
S ::= T
T ::= T * F
T ::= F
F ::= id
F ::= num
F ::= ( S )
```
and now we can parse a + b + c
. With this fix, we have fixed both precedence and associativity (matters because of overflow, rounding errors, subtraction). If we wanted to e.g. introduce exponentiation, we would do
```
S ::= S + T
S ::= T
T ::= T * F
T ::= F
F ::= E ^ F   (* Right-associative exponentiation *)
F ::= E
E ::= id
E ::= num
E ::= ( S )
```
Concrete grammar vs. abstract grammar
We could call the grammar where we introduce much more complications as the concrete grammar, and the abstract grammar which does not worry about associativity, precedence, ambiguity, etc. The concrete grammar is complicated, and the abstract grammar is easier to describe, but would not be used for implementation.
Dangling-Else Ambiguity
When we have nested if-else statements, there's ambiguity! For example:
if (condition1): if (condition2): statement1; else: statement2; # To which 'if' does this 'else' belong to?
there are two possible interpretations that the parse tree could take:
- else is associated with the inner if: if (condition1) { if (condition2) statement1; else statement2; } (the correct interpretation in C-like languages)
- else is associated with the outer if: if (condition1) { if (condition2) statement1; } else statement2; (the incorrect interpretation)
- We can resolve this with an "English" rule that is not directly in the grammar: "a dangling else associates with the nearest preceding if that lacks an else"
- Or we can fix it by complicating the grammar: introduce new non-terminals to enforce the correct association in the grammar itself, i.e., distinguish between "matched" and "unmatched" if statements in the grammar rules.
Consider the following grammar for statements in C
:
```
stmt:
  ;
  expr;
  break;
  continue;
  return expr_opt;
  goto ID;
  while (expr) stmt
  do stmt while (expr);
  for (expr_opt; expr_opt; expr_opt) stmt
  switch (expr) stmt
  if (expr) stm else stmt        # we use stm to differentiate
# To fix the dangling else, we rename the top-level stmt: to stm:, and redefine
# stmt:
#   stm
#   if (expr) stmt
```
1.4. Parsing
Parsing is the process of taking a sequence of tokens (program code) and determining its syntactic structure according to a grammar. The goal of parsing is to construct a parse tree (abstract syntax tree - AST) that represents the syntactic structure of the input code, or to detect syntax errors if the input is not valid according to the grammar.
1.4.1. Three issues in parsing
- Recursion
- Grammars are typically recursive
- Parsers need to handle recursion to parse arbitrarily nested structures
- Recursion handling in parsers (relatively easy if your technology is recursive): recursive descent parsing is a common technique that directly implements grammar rules as recursive functions
- Disjunction (OR)
- Grammars use alternation (
|
) to define choices (multiple possible rules for a non-terminal) - Parsers need to handle disjunction by exploring different parsing paths when encountering alternatives in the grammar
- Disjunction handling in parsers requires trying alternatives with backtracking. Parsers may need to try different alternatives and backtrack if a choice leads to a dead end.
- Matchers and acceptors are designed to handle disjunction
- Grammars use alternation (
- Concatenation (sequence)
- Grammar rules define sequences of symbols (concatenation, usually written as implicit juxtaposition: symbols are concatenated without an explicit operator and just appear next to each other, as in S -> AB)
- Parsers need to process input tokens sequentially according to the order defined in grammar rules.
- Concatenation handling in parsers (sequential processing): parsers read input tokens in order, matching them against the expected sequence of symbols in grammar rules.
- Matchers and acceptors handle concatenation by sequential matching
- Grammar rules define sequences of symbols (concatenation -, or implicit juxtaposition: concatenation of symbols without an explicit operators, they just appear next to each other
Sequential matching example:
S ::= A B A ::= "hello" B ::= "world"
the parser starts at S and expects A followed by B. It reads "hello" and matches A; it reads "world" and matches B. Since all symbols were matched in the correct order, the input is accepted.
1.4.2. Matchers and Acceptors
- Matchers are functions that try to match a prefix of the input token sequence against a part of a grammar rule. It returns the matched part and the remaining (unmatched) suffix.
- Acceptors are functions that check if the remaining suffix (from a matcher) is "acceptable" according to the rest of the grammar rule.
- Matcher matches a part, acceptor validates the rest; combined, they parse larger structures (a sketch follows below).
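A minimal sketch of this idea in OCaml (my own illustration; the exact way CS131 homework frames matchers may differ): a matcher consumes a prefix of the input token list and hands the unmatched suffix to an acceptor, which decides whether that suffix is acceptable. Returning None makes the matcher backtrack and try another alternative.

```
(* A matcher matches a prefix of the input and passes the remaining suffix to
   an acceptor; the acceptor returns Some result if the suffix is acceptable,
   or None to force the matcher to try another alternative. *)
type token = string
type 'a acceptor = token list -> 'a option
type 'a matcher = token list -> 'a acceptor -> 'a option

(* Match a single literal token. *)
let match_tok (t : token) : 'a matcher = fun input accept ->
  match input with
  | x :: rest when x = t -> accept rest
  | _ -> None

(* Concatenation: m1, then m2 on whatever suffix m1 left over. *)
let match_seq (m1 : 'a matcher) (m2 : 'a matcher) : 'a matcher =
  fun input accept -> m1 input (fun suffix -> m2 suffix accept)

(* Disjunction: try m1; if it fails, backtrack and try m2. *)
let match_or (m1 : 'a matcher) (m2 : 'a matcher) : 'a matcher =
  fun input accept ->
    match m1 input accept with
    | Some r -> Some r
    | None -> m2 input accept

(* Acceptor that only accepts an empty suffix (whole input consumed). *)
let accept_empty = function [] -> Some [] | _ -> None

(* S ::= "a" "b" | "a" "c" *)
let s = match_or (match_seq (match_tok "a") (match_tok "b"))
                 (match_seq (match_tok "a") (match_tok "c"))

let () = assert (s ["a"; "c"] accept_empty = Some [])
```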
1.5. Types
What is a type?
- Definition 1: predefined or user-defined data structure
- Initial thought: a type is a predefined or user-defined data structure.
- Critique:
- Types are related to data structures but not always the same.
- Example: Enum types in C/C++ (e.g.,
enum colors {A, B, C}
) are types but not complex data structures. - Data structures can be complex without necessarily being a single type.
- Definition 2: a set of things the compiler knows about a value
- Second attempt: A type is the set of things the compiler knows about a value.
- Critique:
- Too broad and potentially inaccurate.
- Example: int n; ... n + 1. While the type of n + 1 is int, the compiler might deduce a more precise range than the full int range, especially considering overflow behavior.
- Python example: In dynamically typed languages like Python, every variable has a type (dynamically checked), but the “compiler” (interpreter) may know very little statically about the value’s specifics.
- Definition 3 (more accurate perspective): a set of values and associated operations
- Refined definition: a type is a set of values (and, optionally, associated operations)
- Explanation:
- A type defines the possible values a variable or expression can hold.
- In object-oriented languages (C++, Python, Java), types often include associated operations (methods)
- Example: int in a quasi-C language can be seen as the set of values within a certain range.
- Set theory connection: Types can be conceptually viewed as sets of values.
1.5.1. Primitive vs. constructed types
Primitive types:
- Definitions: basic types built into the language.
- Examples: int, char, boolean, float, double.
- Seemingly “solved,” but implementations vary across languages.
- Variations in
int
type- C/C++
int
:- Minimum requirement: Must store values from at least -32,767 to 32,767.
- Implementation-defined size (e.g., 16-bit, 32-bit, 64-bit).
- Macros like INT_MIN and INT_MAX provide platform-specific limits.
- Mathematical integers (e.g., in some languages like Haskell, Python implicitly for large ints):
- No fixed size limit (limited only by available memory).
- No overflow issues.
- JavaScript
int
- int is a subset of float.
- Integers are floating-point numbers without a fractional part.
- Maximum integer size is limited by the float representation, but much larger than C/C++ int.
- Rounding behavior can occur when integers become very large.
- C/C++
1.5.2. float
Most used data type in terms of CPU/GPU cycles!
IEEE754 single-precision (32-bit) representation:
- 32 bits divided into: 1 sign bit, 8 exponent bits, 23 fraction bits
- We represent numbers in the form \(+/- 2^{E - 127} \cdot (1.F)_2\)
- Sign bit: 0 for non-negative, 1 for non-positive (not strictly negative)
- Biased exponent: \(E - 127\)
- Mantissa/fraction: \(1.F\) where \(F\) is the 23-bit fraction field. The '1' is a hidden bit for normalized numbers.
- Normalization: like scientific notation, ensures a unique representation for most numbers
Special cases:
- Case 1: \(E\) is not all zeros or all ones (normal case): use \(+/- 2^{E - 127} \cdot (1.F)_2\)
- Example: represent
5.75
:- Convert 5 to binary:
101
- Fraction part:
0.75
in binary is0.11
- Full binary:
101.11
- Normalize: \(1.0111 \cdot 2^2\)
01110....0
for mantissa, exponent is2 + 127 = 129
, 0 for sign bit
- Convert 5 to binary:
- Example: represent
-0.15625
:- Convert
0.15625
in binary:0.00101
- Normalized: \(1.01 \cdot 2^{-3}\)
- 1 for sign bit, exponent is -3 + 127 = 124, mantissa is 0100....0
- Case 2: \(E\) is all zeros
- Case 2.1: tiny numbers (denormalized/subnormal)
- When \(E = 0\) and \(F \neq 0\)
- New formula: \(+/- 2^{-126} \cdot (0.F)_2\)
- Not normalized, leading zeros in the fraction
- Closer to zero than the smallest normalized number
- Can lead to underflow if normalization fails due to exponent limits
- Provides gradual underflow rather than abrupt zeroing
- Case 2.2: zero
- \(E = F = 0\)
- There are two representations:
+0
and-0
+0
and-0
compare as numerically equal (x = y
even if different sign bits) but representations differ
- Case 3: Infinity and NaN
- Case 3.1: Infinity
- Represented when \(E = 255\) (all ones) and \(F = 0\)
- Two infinities: negative and positive
- Result of overflow or division by zero (1.0 / 0.0 = +infty)
- It is up to the designer to decide whether this should be trapped (program crash) or whether an exceptional value should be returned on overflow (cf. GCC's -ftrapv flag, which traps on integer overflow)
- Case 3.2: Not-a-Number
- Represented when \(E = 255\) and \(F \neq 0\)
- Represents results of undefined operations (e.g. \(\infty - \infty, \sqrt{-1}\))
- Many NaN values (about \(2^{24}\) possibilities in single precision, due to the different sign and \(F\) values), but generally treated as the same
- NaN is not equal to anything, including itself. Can use this to check for NaN: bool isNan(float x) { return x != x; }
Implications of float design:
- Complexity: the "simple" float type has many nuances and special cases; designing types requires careful consideration of edge cases and error handling
- Operations on float: standard operations must be defined for special values, i.e., \(-0\), NaN, etc.
- Trade-off: choose to trap or use exceptional values. When errors occur (overflow, invalid ops), should the program crash (trap/exception) or return a special value (infinity, NaN)? IEEE 754 chose to provide exceptional values as the default, for continuous operation.
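To make the bit layout concrete, here is a small OCaml illustration of mine (not from the notes) that prints the single-precision bit pattern of a value using the standard-library function Int32.bits_of_float:

```
(* Int32.bits_of_float gives the IEEE 754 single-precision bit pattern.
   5.75 = 1.0111_2 * 2^2, so sign = 0, exponent = 2 + 127 = 129, fraction = 0111... *)
let show x =
  let bits = Int32.bits_of_float x in
  Printf.printf "%-10g -> %08lx\n" x bits

let () =
  show 5.75;           (* 40b80000: 0 | 10000001 | 0111000... *)
  show (-0.15625);     (* be200000: 1 | 01111100 | 0100000... *)
  show 0.0;
  show (-0.0);         (* 80000000: only the sign bit differs from +0.0 *)
  show (1.0 /. 0.0);   (* 7f800000: +infinity (E all ones, F = 0) *)
  show (0.0 /. 0.0)    (* NaN: E all ones, nonzero F *)
```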
1.5.3. Type Usage and Properties
Annotations
- Types as comments with teeth: provide information for human readers and compilers
- Compiler optimization: type information enables register allocation, efficient code generation
- Runtime effects: type annotations can trigger runtime operations (e.g. casting between
float
andint
)
Inference
- Inferring types based on context
- Example:
int i; float f; i * f;
. Compiler infers the result type ofi * f
asfloat
due to mixed-type operation rules (coercion), has to change the assembly/machine code.
Coercion: coercion occurs when we convert values from one type to another. Ideally, coercion doesn't lose information, but this is frequently not the case in real code.
We can have issues with this paradigm. For example:
float f = ____; double d = cos(f);
should we call the float
variant of cos
and convert the result to a double, or should we convert f
to a double and then return the result of using the double
variant of cos
?
Another example of an issue is when we have multiple definitions of an overloaded function: one that takes a double and a float, and one that takes a float and a double. If we provide two floats, which function should be called?
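For contrast, a strongly typed language like OCaml simply refuses to coerce, which side-steps these questions; a small illustrative REPL session of mine:

```
(* OCaml has no implicit numeric coercion: int and float operators are distinct. *)
(* 1 + 1.5;;            type error: 1.5 has type float but + expects int *)
float_of_int 1 +. 1.5;;  (* - : float = 2.5; the conversion must be written explicitly *)
```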
Checking
We do type checking to catch errors early and prevent dumb mistakes and serious consequences. There are two main types of checking: static and dynamic checking.
- Static checking: performed by the compiler before program execution
- Advantages:
- Guarantees no type errors at runtime (for statically checked parts)
- Faster execution as runtime type checks are minimized
- Improved reliability
- Disadvantages:
- Less flexible
- Can be more strict and reject valid programs (false positives!)
- Compiler errors can be complex to understand
- Advantages:
- Dynamic type checking: performed during program execution (runtime)
- Advantages:
- More flexible and forgiving
- Allows for programs that might be rejected by static type checkers
- Easier to get started with (less upfront type declaration)
- Disadvantages:
- Type errors occur at runtime (can be unexpected)
- Slower execution due to runtime type checks
- Less reliable (errors found later in development/production)
- Advantages:
- Hybrid approaches
- Combine static and dynamic checking
- e.g. Java is mostly static, but you can do dynamic casting:
- take
(String) o;
: the static type system assumeso
could be aString
, but dynamic check at runtime to ensure it actually is; throw an exception if not
- take
- Strong typing
- Definition: Language where you cannot "cheat" about types. System prevents treating a value of one type as another without explicit conversion (and even then, conversions are controlled)
- Strongly typed languages can be static (OCaml) or dynamic (Python)
- Weakly typed languages (C, C++) allow type "cheating" (e.g. type punning via pointers and casts)
1.5.4. Type Equivalence
Given two type definitions, are they considered the same type? There are two main approaches: name equivalence and structural equivalence.
Under name equivalence, two types are the same only if they have the same name. This is used in C and C++ for struct and class types. For example,
```
struct S { int val; struct S *next; };
struct T { int val; struct T *next; };

struct S v;
struct T w;
v = w;  // Type error: S and T are different types despite identical structure.
```
Under structural equivalence, two types are the same if they have the same structure or representation, regardless of name. Used in C
for typedef
, e.g.
```
typedef int S;
typedef int T;

S x;
T w;
x = w;  // Allowed: S and T are structurally equivalent (both are int)
```
more complex structural equivalence (e.g. for recursive types) can be difficult to define and implement.
1.5.5. Abstract vs. Exposed types
Abstract (name equivalence is the way to go)
- Implementation details are hidden from users
- Users know the type name and operations but not internal structure
- Name equivalence is natural for abstract type (structure is hidden anyway)
Exposed (structural equivalence is the way to go)
- Implementation details are visible (structure is known)
- Structural equivalence might seem more appropriate as implementation is part of the type definition
- Exposed types can offer efficiency but reduce flexibility (tight coupling to implementation)
float is partly exposed: the IEEE 754 standard defines the bit layout, but bit order is implementation-defined (partly abstract). The float on most modern computers is stored in such a way that we can avoid catastrophic cancellation, i.e., a - b == 0 && a > b can never be true.
1.5.6. Subtypes
We have types T
and U
and want to know if \(T \subseteq_T U\), i.e., if every value of type T
is also a value of type U
and every operation on U
is also an operation on T
. For example, we could consider the C++ int
type to be a subtype of long long int
. The caveat here is that there is a change in representation, meaning we need some machine code to convert from an int
to a long long int
.
The set-based interpretation is that a subtype is a subset of values. The object-oriented interpretation (C++ classes) is that a subclass is a subtype, and typically has more operations than the supertype. Take for example:
```
type day = {Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday}
type weekday = {Monday, Tuesday, Wednesday, Thursday, Friday}
// weekday is a subtype of day.
```
Function argument type compatibility: if a function f
expects an argument of type day
, it is typically safe to pass a value of type
weekday.
Another classic example of subtyping is inheritance. For instance, if we have some class T
that extends a class U
, doing something like T a; U b; b = a;
requires only a pointer assignment and no extra machine code. Another example of subtyping occurs when the programmer puts some constraint on a parent type. For instance, in Pascal: type alpha = 'a'..'z';
or in Lisp (declare (type (and integer (satisfies evenp)) x))
.
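OCaml can express this kind of value-set subtyping with polymorphic variants; here is a small sketch of mine (not from the notes) of the day/weekday example above:

```
(* Polymorphic variants let us say that every weekday value is also a day value. *)
type weekday = [ `Mon | `Tue | `Wed | `Thu | `Fri ]
type day = [ weekday | `Sat | `Sun ]

let is_weekend (d : day) = match d with
  | `Sat | `Sun -> true
  | #weekday -> false

let w : weekday = `Mon
(* A weekday can be used where a day is expected, via an explicit coercion: *)
let _ = is_weekend (w :> day)   (* false *)
```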
1.5.7. Polymorphism
Polymorphism means "many forms", and occurs when a function can accept many forms (types).
Ad-hoc polymorphism "Messy" and "idiosyncratic" polymorphism. Rules for polymorphism are often complex, special-case driven and not always intuitive.
There are two types of ad-hoc polymorphism: overloading and coercion.
Overloading refers to having multiple functions with the same name, but differentiated by argument types. The compiler chooses the correct function based on argument types in the call context.
For compiled languages, this could be an issue: take some polymorphic sin
function. In the compiled assembly code, there needs to be separate code to deal with each of these cases. We can solve this ambiguity by mangling the names of each of
these functions (think sin$f
and sin$d
) but this mangling must be consistent.
Coercion is automatic, implicit type conversions performed by the compiler. The compiler inserts conversions to make operation type-compatible.
Earlier we saw float x; int i; x * i;
:
- There is no direct
float * int
operation - Compiler coerces
i
tofloat
(implicitly convertingi
to a floating point representation) - Then performs
float * float
operations
There are some potential issues!
- Unexpected conversions can lead to bugs
- For example: comparing
uid_t x
(unsigned) with-1
(int), the coercion tounsigned
results inUINT_MAX
Another issue: ambiguity in combined overloading and coercion. Take for example:
overloaded function f: int f(float x) int f(int y, float z) call: f(3, 7) // both 3 and 7 are integers.
there are multiple possible coercions:
- Coerce 3 to
float
: callf(3.0f)
, matchesint f(float x)
- Coerce 7 to
float
: callf(3, 7.0f)
, matchesint f(int y, float z)
there is ambiguity: the compiler might not know which coercion to apply if both are valid and lead to different function calls. In C++, if ambiguity remains after considering coercions, the compiler typically reports an error. Adding a new overloaded function can break existing code due to newly introduced ambiguity.
Parametric polymorphism
A more "organized" and "academic" polymorphism. Designed for constructed (user-defined) types. Types are parametrized (type parameters). Polymorphism is achieved by instantiating type parameters with specific types. Examples: generics in Java, templates in C++, type constructors in OCaml (list<'a>
). THe idea that you can
"substitute" a type for the type variable without requiring a runtime action/additional code.
Traditional Java (pre-generics):
Object
is the root type, and everything can be treated as anObject
- Casting (
(String) o
) used to downcast fromObject
to specifric types, with runtime checks. - Example: collection of
Object
. When retrieving from the collection, we need to cast to aString
even if you know it should be strings.
Take for example
```
Collection c = ...;
Iterator i = c.iterator();
while (i.hasNext()) {
    if (((String) i.next()).length() == 1) {
        i.remove();   // Cast to String required
    }
}
```
the problem is that the cast (String) i.next() is needed, even though the programmer knows it's a collection of Strings. This subverts the type checking, and type safety is no longer guaranteed at compile time, as a runtime cast exception is possible.
Modern Java (has generics)
- Generic types allow parameterizing types (like type constructors in OCaml)
- For example,
Collection<String>
is a collection specifically onStrings
- Type safety is enforced at compile time, and we can avoid casts in most cases.
Take for example
```
Collection<String> c = ...;   // Collection, specifically of Strings
Iterator<String> i = c.iterator();
while (i.hasNext()) {
    if (i.next().length() == 1) {
        i.remove();   // No cast needed, i.next() is a String
    }
}
```
the advantages of generics (parametric polymorphism in Java) are
- Improved type safety (compile-time checks)
- Reduced need for casts
- Potentially faster execution (fewer runtime type checks)
Generics vs. Templates (Java vs. C++)
- Key differences in implementation and behavior
- Generic types (Java, OCaml)
- Have type erasure (in Java, at runtime). Compile once, run anywhere. This works under the assumption that all types "smell the same": that you can treat all variables the same. Generics are erased to Object, meaning you must use wrapper classes, and you cannot check the actual type parameter at runtime.
- Every generic value is 64-bit (a reference)
- The compiler checks types and generates a single version of machine code for a generic class/function, regardless of type parameters.
- Type information (like <String>) is mostly used for compile-time checking and is erased or abstracted away at runtime (type erasure)
- Have type erasure (Java at runime). Compile once, run anywhere. This works under the assumption that all types "smell the same": that you can treat all variables the same. Generics are erased to
- Templates (C++)
- We have code instantiation/specialization
- For each different type parameter instantiation (e.g. list<int> and list<float>), the compiler generates separate machine code and performs type checking specifically for that instantiation.
- More flexible: templates can be specialized for particular types, allowing optimization for specific cases.
- Can lead to code bloat
- Lower-level, more focused on efficiency and control. Potentially more complex type errors during template instantiation.
1.5.8. Subtype Invariance
Consider the following list subtyping issue in Java:
```
List<String> ls = ...;
List<Object> lo = ls;    // Not allowed in Java: compile-time error!
lo.add(new Thread());    // Adding a Thread to what was originally a list of Strings: type violation!
String s = ls.get(0);    // Attempting to get a String from a list that might now contain a Thread:
                         // runtime ClassCastException
// The core problem is Java's mutability: lists are mutable (there is an add method),
// so treating List<String> as List<Object> would be unsafe.
```
- The problem: the covariant subtype relationship breaks down. Although String is a subtype of Object, List<String> is not a subtype of List<Object>.
. - The reason: mutability and type safety violation. Allowing
List<String>
to be treated asList<Object>
would break type safety in Java due to list mutability. You could add a non-String object (likeThread
) to a list that was supposed to hold onlyStrings
, leading to runtime errors (ClassCastException
) when you later try to retrieve and use the elements asStrings
. - Subtype invariance of generic types in Java: Java generics are invariant by default.
List<String>
andList<Object>
are considered unrelated types in terms of subtyping, even ifString
is a subtype ofObject
.
The standard rule of subtypes: if \(X \subseteq_T Y\), then every value of type \(X\) should be useful in a context that wants type \(Y\). Or conversely, every operation on \(Y\) values should work on \(X\) values. There's a similar problem in C/C++. Consider the following
```
// Quick definitions
char*;         // pointer to a character
char const*;   // pointer to a constant character
const char*;   // same as above
char* const;   // constant pointer to a mutable character

// Example
char c = 'A';
char const *p = &c;
printf("%c", *p);
f(&c);            // what we pass here is of type char*, so f is allowed to modify c
printf("%c", *p); // in this case, the two prints may print different things

// Then what we really have is:
char const *p;    // "pointer to a character that can't be modified via this pointer"

// As a result: char* \subseteq char const *
// Assigning from a subtype to a supertype is OK.
// Every value of type char* can be used in a context that wants char const *,
// but not necessarily the other way around!
```
How to tackle this problem: wildcards
void printall(Collection<?> c) {
    for (Object i : c)
        System.out.println(i);
}
We can use an unbounded wildcard: List<?>. The ? represents an unknown type; the list can hold elements of any object type. This is useful for methods that only read from a list and don't care about the specific type. For example, printList(List<?> l) can print any list of objects (or subtypes), but cannot safely add elements to it (due to type uncertainty).

The type safety tradeoff is that List<?> is more flexible (accepts lists of various types), but less type-safe for operations that modify the list.
Now take for example the following scenario:
public void printShapes(Collection<?> shapes) {
    for (Shape s : shapes) {   // assume objects of type Shape have a printShape method
        printShape(s);
    }
}
// The problem is that by using <?>, the elements could be any Object, but only Shapes have printShape,
// so this version doesn't type-check.
to fix this, we introduce a bounded wildcard: use printShapes(Collection<? extends Shape> shapes), where the unknown type ? must be a subtype of Shape (useful for reading out).

If we instead wanted supertypes to be accepted (but not subtypes), use <? super Shape> (useful for storing stuff), where anything from Shape up through Object inclusive is allowed, but nothing else. Java does not let you put both an extends bound and a super bound on the same wildcard. A sketch of both kinds of bound in use follows.
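As a hedged illustration of the two kinds of bound (the usual "producer extends, consumer super" rule), here is a minimal sketch; Shape and Circle are hypothetical classes invented for the example:

import java.util.ArrayList;
import java.util.List;

class Shape { double area() { return 0; } }
class Circle extends Shape { }

class WildcardDemo {
    // "? extends Shape": we can read elements as Shape, but cannot add (except null).
    static double totalArea(List<? extends Shape> shapes) {
        double sum = 0;
        for (Shape s : shapes) sum += s.area();
        return sum;
    }

    // "? super Circle": we can store Circles, but reads only give us Object.
    static void addCircles(List<? super Circle> out) {
        out.add(new Circle());
    }

    public static void main(String[] args) {
        List<Circle> circles = new ArrayList<>();
        addCircles(circles);                        // List<Circle> is a "? super Circle"
        List<Shape> shapes = new ArrayList<>(circles);
        addCircles(shapes);                         // so is List<Shape>
        System.out.println(totalArea(shapes));      // read side uses "? extends Shape"
    }
}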
Problems with bounded wildcards
Consider the following:
// pseudo-Java: a sketch of what we'd like to write
void cvt(?[] ar, Collection<?> co) {
    for (Object o : ar)
        co.add(o);
}
this code is too generous: you can copy from an array of strings to a collection of threads! What we need is some way to indicate that both unknown types are the same. We can accomplish this with type variables.
<T> void cvt(T[] ar, Collection<T> co) {
    for (T o : ar)        // hidden iterator is Iterator<T>
        co.add(o);
}
this will work, but only if the types match exactly. However, there is a limitation: we cannot, e.g., copy from Square[] to Collection<Shape>, because T would have to be both Square and Shape, which are different types, even though Square is a subtype of Shape.
To solve this, we could try the upper-bounded wildcard: <T> void cvt(T[] ar, Collection<? extends T> co) { ... }. Note briefly that wherever we use an unknown type only once (like ?) we can give it an actual name; ? is just syntactic sugar. But this is incorrect for our use case (read from the array, write to the collection): it still doesn't allow copying from Square[] to Collection<Shape>, because the type parameter T must be exactly the same for both the array and the collection.
Instead, we use the lower-bounded wildcard (super): <T> void cvt(T[] ar, Collection<? super T> co) { ... }. Now,

- Collection<? super T> co: the collection co can hold elements of type T or any supertype of T (e.g. if T is Square, we can pass Collection<Square>, Collection<Shape>, or Collection<Object>).
- Now cvt works for copying from Square[] to Collection<Shape>, because Shape is a supertype of Square (see the sketch below).
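A minimal usage sketch of the lower-bounded cvt above, assuming hypothetical Shape/Square classes:

import java.util.ArrayList;
import java.util.Collection;

class Shape { }
class Square extends Shape { }

class CvtDemo {
    // Lower-bounded version from the notes: read T's from the array,
    // store them into any collection that can hold a T (or a supertype of T).
    static <T> void cvt(T[] ar, Collection<? super T> co) {
        for (T o : ar) co.add(o);
    }

    public static void main(String[] args) {
        Square[] squares = { new Square(), new Square() };
        Collection<Shape> shapes = new ArrayList<>();
        cvt(squares, shapes);   // OK: Collection<Shape> is a Collection<? super Square>
        System.out.println(shapes.size());   // prints 2
    }
}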
Implementation note:
- We can think about Java (and other OOP languages) as having the following implementation:
  - Every object is represented by a pointer to the value of that object. In a strongly typed language, there is a type field there.
  - In languages with user-defined types, that type field points to another piece of storage, which we could call a type descriptor (what am I a subtype of? what's my name? etc).
  - This descriptor won't change as you run; it's there mostly for type checking.
- What does the type descriptor look like when you use ?, or e.g. List<Integer> or List<String>, etc.?
- In Java, the design decision is to have only one type descriptor for List, for efficiency.
- Instead, Java does type erasure: all we know at runtime is effectively List<?>. Any checks on the element type have to be done at runtime by the program (see the sketch below).
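A small sketch of what erasure means in practice (standard library calls only; the observation is that both lists share one runtime type descriptor):

import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    public static void main(String[] args) {
        List<String> ls = new ArrayList<>();
        List<Integer> li = new ArrayList<>();
        // Only one runtime type descriptor (Class object) exists for ArrayList:
        System.out.println(ls.getClass() == li.getClass());   // true
        // The element type is gone at runtime, so a check like the following is rejected:
        // if (ls instanceof List<String>) { ... }   // element type not available at runtime
    }
}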
Note on tree-like class hierarchy
In Java, the type hierarchy is typically a tree, where:
- Every class has a single direct superclass, and all classes ultimately inherit from Object.
- There is an exceptional case where it can form a DAG: interfaces and multiple inheritance through interfaces.
1.5.9. Duck Typing
Motivation: can we do something simpler? Yes: duck typing: "if it quacks like a duck, it's a duck."
The type of an object is a bad notion - we shouldn't care about types. Focus on behavior (methods an object supports) rather than type (class hierarchy, type declarations). Type is determined by runtime behavior, not static declarations.
Take the following example in python:
def process_object(obj): obj.quack() obj.waddle() obj.hasRoundedBeak()
Notice that there are no type declarations and no type annotations for the obj parameter. At runtime, Python calls these methods on the object and resolves them then. There are no compile-time type errors if obj doesn't have these methods, but there are runtime errors (exceptions): if obj does not have these methods when process_object is called, a runtime exception occurs: a "duck-typing error" - the object doesn't behave like a duck at runtime.
Advantages
- Simplicity: very simple type system (or lack thereof). No need for complex generic type declarations or subtype rules.
- Flexibility: Highly flexible. Functions can work with objects of various types as long as they support the required methods
- Code reusability: Encourages code reuse. Functions can be more generic and work with a wider range of objects.
- Rapid development: easier to write code quickly without being constrained by strict type systems.
Disadvantages
- Runtime errors: Type errors are detected at runtime, not compile time. Errors can be delayed and harder to catch early in development.
- Reduced reliability: Less type safety compared to statically typed languages. Potential for unexpected runtime errors due to type mismatches
- Debugging challenges: Debugging type-related errors can be more difficult as errors surface only at runtime, potentially far from the source of the type issue.
- Performance overhead (Dynamic Dispatch): Method calls might be slightly slower due to dynamic method resolution at runtime (the interpreter needs to look up methods at runtime).
1.5.10. Static typing (Generics/Java) vs. Dynamic typing (Duck typing/Python)
- Static typing (Java):
- Pros: Compile-time type safety, early error detection, potentially better performance (less runtime overhead), improved reliability for large, complex systems.
- Cons: Less flexible, more verbose code, can be stricter and reject valid programs, steeper learning curve for complex type features (like generics and wildcards).
- Dynamic Typing (Python):
- Pros: More flexible, concise code, faster development, easier to get started, more forgiving type system
- Cons: Runtime type errors, reduced reliability, potential performance overhead, harder to debug type issues in large projects, less suitable for safety-critical applications
1.6. Java
At a glance,
A first few notes about Java: it's object-oriented. Java was meant to be slightly higher level than C++, which buys us portability, reliability, and performance (with a small loss compared to C++). A big plus over C++ and C is that concurrency is easier in Java.

Java is closely tied to the idea of bytecodes: a standard bytecode representation for Java programs that is intended to be portable. It is like machine code, except for an abstract machine that doesn't really exist: a stack-oriented machine that is run by an interpreter plus a JIT compiler. A major difference between Java and C++ is that Java has single inheritance, for simplicity and performance (vs. C++ having multiple inheritance); think of the class hierarchy as a tree. A single class cannot have two superclasses; it can only have one parent.

One special difference between Java and C++ is arrays. Arrays in Java are objects, so they live on the heap and are garbage collected. That takes a performance hit but has a reliability advantage; additionally, you can return an array, since it lives on the heap.

This, however, gives the Java compiler the opportunity to optimize: it could put an array on the stack. Also, the size of an array can be chosen at run time (vs. C++, where the size generally must be available at compile time). However, the size is fixed once allocated. A tiny sketch follows.
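A tiny sketch of a runtime-sized (but fixed-length) Java array:

class ArrayDemo {
    public static void main(String[] args) {
        int n = Integer.parseInt(args.length > 0 ? args[0] : "4");
        int[] a = new int[n];           // size decided at run time; the array lives on the heap
        System.out.println(a.length);   // length is fixed once allocated
    }
}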
1.6.1. abstract
A subclass method can "shadow" (override) a method of its parent (superclass). It's considered good practice for the shadowing method to be an "implementation" of the super method.

There are some exceptions. For example, abstract classes are classes that come without an implementation. An abstract method forces the children to implement the marked method. Any class with an abstract method can't be instantiated: we need to instantiate a subclass instead.

The opposite of abstract is a final class or method, which means it cannot be subclassed or overridden. This helps with inlining, since we can substitute the body of the method directly. Another reason we have this is trust: to make sure your code stays clear, you can use final to ensure bad things can't happen through bad subclass implementations. The other reason to use final is debugging: you can guarantee that your method won't be overridden and won't cause errors that way, which can be helpful when debugging. A minimal sketch follows.
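A minimal sketch of abstract and final together; Animal and Dog are hypothetical names:

abstract class Animal {
    abstract String sound();           // subclasses are forced to implement this

    final String describe() {          // final: cannot be overridden; easy to inline and trust
        return "makes the sound " + sound();
    }
}

class Dog extends Animal {
    String sound() { return "woof"; }
}

class AbstractFinalDemo {
    public static void main(String[] args) {
        // Animal a = new Animal();    // error: cannot instantiate an abstract class
        Animal a = new Dog();
        System.out.println(a.describe());
    }
}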
1.6.2. interface
This brings us to a different idea. We have interfaces, which define only APIs and no implementations. In contrast to (single-inheritance) classes, Java allows a class to implement multiple interfaces. This allows us to place constraints on objects from an API perspective instead of an inheritance perspective. An abstract class can be thought of as bundling a concrete class and an interface together.
interface I {
    int m();
    int n(float f);
}

class E extends C implements I, J, K { ... }

void p(I x) {
    x.m();
}

E o = new E();
p(o);
1.6.3. Java object
class Object {
    // Why did we do this? One reason could be *sentinel values*: a special value used in
    // algorithms to mark a boundary or to signal a special condition. It's chosen because it's
    // guaranteed not to occur as a normal element in your data. When you create a new Object(),
    // you get an instance that is unique and does not carry any additional data. Because of its
    // generic nature, you can be sure it won't naturally appear in your collection of meaningful data.
    // By inserting this unique, "empty" object into a data structure (like an array),
    // you can use it as a sentinel - a flag that indicates something special.
    public Object();

    // For any object o, you have an equals method.
    // Distinct from the == operator! Lets the designer dictate equality (semantic vs. address).
    // By default it falls back to ==. Up to the designer!
    public boolean equals(Object obj);

    // The class Class is an object that represents a compile-time class.
    // A capital 'C' Class is an object that stands for the lowercase 'c' class that you wrote
    // in your program, and these live in the object hierarchy!
    // No lying about your type! Good for debugging.
    // Final because of efficiency and *trust* (debugging, types).
    public final Class<? extends X> getClass();  // where X is the type of the object o, i.e., o.getClass()

    // By default hashes the object's internal address, but you can redefine it.
    // But! equals and hashCode should be consistent: if equals returns true,
    // then the two objects should have the same hash code.
    public int hashCode();

    // Get a string representation of the object.
    public String toString();

    // For any object, you can clone it. Except, for some objects,
    // we can throw an exception against cloning.
    // The idea is that we get a copy of the top level of the original object:
    // figure out how big the object is, copy all the bytes into a new object.
    // Protected means "not intended for general use."
    protected Object clone() throws CloneNotSupportedException;

    // The finalize method is a hook into the GC; it can throw whatever it likes.
    // Right before the GC reclaims the object, it calls finalize().
    // The GC can call it... others should probably not.
    // Apparently protected and finalize() don't jive well.
    protected void finalize() throws Throwable;

    // Soon: wait(), notify(), notifyAll()
}
Note on sentinel
:
Object sentinel = new Object();        // our unique sentinel value
data[data.length - 1] = sentinel;      // place it at the end

int i = 0;
while (data[i] != target && data[i] != sentinel) {
    i++;
}
if (data[i] == target) {
    // found the target
} else {
    // target not found (because we hit the sentinel)
}
1.6.4. Java threads
Java implements threads using a Thread object, which lives in the object hierarchy. Each Thread object represents a distinct thread of execution: it has its own PC / virtual CPU. Subclasses carry the same idea (distinct threads).
We could write thread methods like so:
class T extends Thread {
    public void run() { doSomething(); }
}

T t = new T();
t.start();
note that we override Thread's run method. This way is too restrictive, because every class that needs to do multithreading must "match up" in the object hierarchy, i.e., it has to be a subclass of Thread. Instead, we use the Runnable interface
class RT implements Runnable {
    public void run() { doSomething(); }
}

Thread t = new Thread(new RT());
t.start();
all you need is a run() method! This is the more common and flexible way.
A thread's life cycle begins when it is created with new. At this point, the thread exists but hasn't started yet (NEW); we can fiddle with the thread, change its priority, etc. Once we call .start(), the OS allocates resources (a virtual CPU, an instruction pointer/PC, a stack, etc.) and the thread can be run (RUNNABLE; once it's here it can start doing stuff, and it stays RUNNABLE).

At this point, the thread can sleep (TIMED_WAITING), wait (WAITING), block on I/O or on a lock (BLOCKED), yield (still RUNNABLE - this just "yields" to some other thread), or execute code (also RUNNABLE).

Finally, we can exit (return from the run() method) and the thread becomes TERMINATED. Note that the thread object still exists: "someone else" may come along and examine the "grave". A small sketch of observing these states follows.
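A small, hedged sketch that observes these states via Thread.getState() (the exact states printed can vary with scheduling):

class LifeCycleDemo {
    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> {
            try { Thread.sleep(50); } catch (InterruptedException e) { }
        });
        System.out.println(t.getState());   // NEW: exists, not started
        t.start();
        System.out.println(t.getState());   // typically RUNNABLE (or TIMED_WAITING once it sleeps)
        t.join();                            // wait for it to finish
        System.out.println(t.getState());   // TERMINATED: the object still exists and can be examined
    }
}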
1.6.5. Race conditions
A race condition is when the order of execution changes behavior.
The classic example is that thread A writes and thread B reads at the same time (inconsistent state, partial writes).
- How can you reproduce a race condition? In general it's very difficult. You can go into a debugging environment and note the timestamps where everything has occurred and then investigate the log. However, when you turn on logging, you are changing the timing! You slowed the program down, and so may not reproduce the same race condition.
Synchronized methods are one solution. We can mark a method as synchronized, which causes the Java compiler to insert code to grab/release a lock at the beginning/end of the method's execution.
- These locks are spin locks (while-loop style locks): unlike traditional locks that might put a thread to sleep if it cannot acquire the lock immediately, a spin lock causes the thread to repeatedly check (or "spin") in a tight loop until the lock becomes available.
  - Because spinning avoids the overhead of system calls for putting threads to sleep and waking them up, spin locks are efficient when the wait time is expected to be very short.
- The tradeoff is that spinning consumes CPU cycles. In scenarios where the wait is long, spin locks can lead to inefficient CPU usage, as the spinning thread does no useful work while waiting.
- A typical strategy would be that any reads/writes to an object in a threaded environment are done from synchronized methods. Many basic classes in Java have synchronized methods (e.g. synchronized hash tables). This is fine if accesses are rare or if clients are slow.
- Note that even if we only do reads, we still need to lock the object in order to prevent writes from occurring during a read.
- A thread that is trying to acquire a spin lock is RUNNABLE, and stays RUNNABLE. We want to add support for some of these other states.
In Java, every object is associated with a monitor. This monitor acts as a built-in lock mechanism that controls access to the object’s synchronized code blocks and methods.
When you declare a method or a block of code as synchronized
, the thread executing that code must acquire the object's monitor. Only one thread can hold that monitor at a time, which prevents other threads from entering any other synchronized block that locks on the same object.
This mechanism is built into the Java language and the java.lang.Object
class, meaning every Java object automatically has this synchronization capability without requiring explicit lock objects.
An object's monitor is essentially the internal lock that Java uses to coordinate synchronized access and communication between threads.
When a thread calls wait()
from within a synchronized context, it temporarily releases the lock it holds. This allows other threads to enter the synchronized block or method and work with the object.
The thread that calls wait()
goes into a waiting state until some other thread signals that the condition it was waiting for might now be true. The thread is not actively consuming CPU resources while waiting.
Once notified, the waiting thread must re-acquire the lock before it can resume execution. This means that even after being woken up, it might have to wait if another thread currently holds the lock.
o.notify()
: This method wakes up a single thread that is waiting on the object’s monitor. The exact choice of which waiting thread to wake up is often left to the operating system’s thread scheduler and may seem random (or sometimes based on how long the threads have been waiting). The key point is that the thread calling notify()
is not the one waiting; it’s another thread that signals that it’s done using the object.
o.notifyAll()
: In contrast, notifyAll()
wakes up every thread that is waiting on the object’s monitor. This gives the awakened threads a chance to compete for the lock, which can allow the application to implement its own policy about which thread should proceed. This is more flexible but can be harder to manage because all waiting threads become active, potentially leading to a burst of contention.
With notify()
, you let the operating system (or rather the underlying thread scheduler) choose which thread to wake up. With notifyAll()
, you hand over more control to the application because all waiting threads are awakened and then must re-compete for the lock based on the application’s logic and their own scheduling.
1.6.6. Semaphores
Semaphores are higher-level synchronization primitives built on top of the lower-level wait/notify mechanisms in Java. A semaphore maintains a count (its capacity) that represents how many permits (or resources) are available. A binary semaphore is simply a semaphore with a capacity of 1, while a counting semaphore can have a capacity greater than 1. In Java, we have
s.acquire()
: which blocks the calling thread until a permit is available. When a permit is acquired, the semaphore’s internal count is decremented.s.tryAcquire()
: which attempts to acquire a permit without blocking. It returns a Boolean value indicating whether the acquisition was successful.s.release()
: releases a permit, incrementing the semaphore’s count and potentially unblocking a waiting thread.
Semaphores are useful when you want to limit access to a shared resource (for example, a pool of database connections or a fixed number of identical resources). If the resource pool has a capacity of 10, for instance, up to 10 threads can acquire a permit concurrently. When an 11th thread attempts to acquire a permit, it must wait until one of the existing threads calls release()
.
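A hedged sketch using java.util.concurrent.Semaphore with the three operations named above (the resource pool itself is left abstract):

import java.util.concurrent.Semaphore;

class PoolDemo {
    private static final Semaphore permits = new Semaphore(10);   // capacity 10

    static void useResource() throws InterruptedException {
        permits.acquire();              // blocks until a permit is available
        try {
            // ... use one of the 10 resources ...
        } finally {
            permits.release();          // give the permit back, possibly unblocking a waiter
        }
    }

    public static void main(String[] args) throws InterruptedException {
        if (permits.tryAcquire()) {     // non-blocking attempt
            permits.release();
        }
        useResource();
    }
}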
1.6.7. Exchanges
An exchanger is a high-level synchronization object that provides a simple rendezvous point where two threads can exchange data with each other.
Specifically, it is designed for pairs of threads. One thread calls the exchange method with its data (for example, calling x.exchange(v)
), and it then waits at the rendezvous point until a partner thread also calls the same method with its data (for example, x.exchange(A)
).
When both threads have reached the exchange point, the exchanger swaps the objects they provided. The operation is atomic: both threads are blocked until the exchange can occur. This ensures that neither thread proceeds until it has successfully received the object from its partner, making it useful for coordinated data transfers.
This pattern is especially useful when two threads need to collaborate by exchanging results or resources without having to resort to more complex synchronization techniques. It guarantees that both threads know that the exchange has happened before they continue.
Take the following example
T1:  a = e.exchange(b);
T2:  c = e.exchange(d);
// afterwards: a == d && c == b
- Thread 1 calls
exchange(b)
with its own valueb
- Thread 2 calls exchange(d) with its own value d.
The exchanger works as a rendezvous point. When one thread reaches the exchange()
call (say, T1), it waits
for a partner to arrive. Once thread 2 reaches its exchange()
call, the exchanger pairs them together and swaps the values:
- T1 receives the value
d
from T2, soa
becomes equal tod
. - T2 receives the value
b
from T1, soc
becomes equal tob
.
Then, after the exchange, the relationship a == d && c == b holds. One thread must wait; the other doesn't have to.
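A runnable sketch of the same rendezvous using java.util.concurrent.Exchanger (thread names follow the T1/T2 example above):

import java.util.concurrent.Exchanger;

class ExchangeDemo {
    public static void main(String[] args) throws InterruptedException {
        Exchanger<String> e = new Exchanger<>();

        Thread t2 = new Thread(() -> {
            try {
                String c = e.exchange("d");          // T2 offers "d" ...
                System.out.println("T2 got " + c);   // ... and receives "b"
            } catch (InterruptedException ignored) { }
        });
        t2.start();

        String a = e.exchange("b");                  // T1 offers "b", blocks until T2 arrives
        System.out.println("T1 got " + a);           // prints "T1 got d"
        t2.join();
    }
}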
1.6.8. Countdown Latch
A countdown latch (barrier) makes all the threads wait, runs a little code, then says "go" and lets them continue running. There is a related notion called a cyclic barrier, which is a countdown latch that repeats. Think of it as a race where, after each lap, all horses stop at the barrier. They wait until every horse has completed the lap before starting the next lap together. A sketch using CountDownLatch follows.
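A minimal sketch of the "start gate" use of java.util.concurrent.CountDownLatch described above:

import java.util.concurrent.CountDownLatch;

class LatchDemo {
    public static void main(String[] args) {
        CountDownLatch go = new CountDownLatch(1);      // "say go" once

        for (int i = 0; i < 3; i++) {
            int id = i;
            new Thread(() -> {
                try {
                    go.await();                         // every thread waits at the barrier
                    System.out.println("thread " + id + " running");
                } catch (InterruptedException ignored) { }
            }).start();
        }

        // run a little code, then release everyone at once
        go.countDown();
    }
}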
How can we make Java go fast? Spin locks are slow. The compiler and CPU may execute machine instructions out of order (OoO execution). The rule in Java is that if a single thread can't tell the difference, then the compiler can do things in any order (the "as-if" rule).

If we do this in a synchronized method, there could be issues! So there are exceptions: lock and unlock prevent reordering (generally).
We could use the volatile
keyword, which indicates to Java not to optimize access to a variable.
1.6.9. Java Memory Model
Preamble: imagine a universe with only one spatial dimension (an X-axis) and one time dimension (a t-axis). In this simplified universe, your "location" and the evolution of events over time are graphed much like a Minkowski diagram in special relativity. In physics, the speed of light (set to 1 in the analogy) is the ultimate limit for the transmission of information. This creates three regions:
- Future: The region that can be influenced by an event at (0,0).
- Past: The region from which events could have affected the present.
- Elsewhere: Regions that lie outside the light cone, where no causal influence is possible.
The analogy maps onto multithreaded programming in that one thread “sends a message” by modifying a shared object. Just as an event in spacetime is limited by the speed of light, changes in shared memory are subject to constraints on visibility and ordering—if two threads try to communicate via a shared variable, they must respect these “causal” rules. Much like in relativity, there is a “region” (or a set of rules) within which messages (i.e., memory updates) can reliably be seen by other threads.
In a multithreaded Java program, one thread may modify a shared object while another thread is reading or also writing to it. Without a strict ordering or synchronization, these operations can “step on each other’s toes,” resulting in unpredictable behavior. Just as you cannot send a message to a region outside your light cone, threads can only reliably “send messages” (i.e., share memory updates) if the operations are properly ordered. Race conditions occur when these messages get garbled because the timing and ordering of operations are not controlled.
This in some ways motivates the Java memory model, which was developed to define precisely when and how threads can reliably communicate through shared memory. It specifies rules for the ordering of reads and writes and tells compiler writers what optimizations (like reordering instructions) are allowed.
First, compilers and CPUs are allowed to reorder instructions (e.g., caching values in registers) as long as the changes are not observable in a single-threaded context. This is also known as the as-if rule.
This will not work in a multithreaded context. In multithreaded programs, one thread’s view might be different from another’s if reordering leads to race conditions.
1.6.10. Digression on volatile
The keyword volatile in Java tells the compiler and runtime that a variable's value can change unexpectedly (like a volatile chemical that evaporates). Every access to a volatile variable must reflect the actual value in memory - no caching or reordering of those accesses is allowed.
Volatile variables are used for simple message passing between threads. For example, one thread can spin in a loop waiting for a volatile variable to change, ensuring that each check reads the latest value. Take the following
// Shared variable (if not volatile)
boolean done = false;

// Thread A:
while (!done) {
    // Busy-wait until done becomes true
}
// Continue when done is true

// Thread B:
done = true;
if done
is a normal (non-volatile) variable, the compiler or processor might cache its value in a register. This means that Thread A might read the cached value and never see the change made by Thread B. The loop might never terminate.
By declaring the variable as volatile (volatile boolean done = false;
), every read of done
must check the actual memory value. Thread A will see the update made by Thread B, and the loop will eventually terminate.
There are some key limitations. Volatile prevents certain kinds of reordering, but it is not sufficient to implement full locking mechanisms (like a synchronized block), because race conditions can still occur if multiple operations need to be atomic. For example, a naive spin lock using only a volatile flag can still fail due to timing issues:
class SpinLock {
    // Using volatile so that changes are visible across threads
    private volatile boolean locked = false;

    public void lock() {
        // Spin until we see that locked is false
        while (locked) {
            // Busy-wait (spin)
        }
        // Once we exit, we assume locked is false, so we set it to true
        locked = true;
    }

    public void unlock() {
        locked = false;
    }
}
the volatile declaration ensures that each read of locked
gets the up-to-date value. However, we can still construct a race condition:
- Thread A checks
while (locked)
and seesfalse
. - Before Thread A sets
locked = true;
, Thread B also checkswhile (locked)
and seesfalse
. - Both threads then set
locked = true
, leading to a situation where both believe they hold the lock.
The problem is not visibility (volatile provides that), but atomicity: the check-and-set sequence isn't atomic. For proper locking, you need hardware-supported atomic operations (or higher-level constructs like synchronized) that ensure the check and update happen as one indivisible step. A sketch using an atomic compare-and-set follows.
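A hedged sketch of one such fix: replacing the volatile flag with an AtomicBoolean, whose compareAndSet performs the check and the update as one atomic step:

import java.util.concurrent.atomic.AtomicBoolean;

class CasSpinLock {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    public void lock() {
        // compareAndSet does the check and the update as one indivisible step
        while (!locked.compareAndSet(false, true)) {
            // busy-wait (spin)
        }
    }

    public void unlock() {
        locked.set(false);
    }
}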
1.6.11. Returning to the JMM
First, we have to start with definitions. Recall that every Java object has an associated monitor (intrinsic lock, part of its definition like its type).
When you enter a synchronized block or method, the thread must acquire this monitor. These actions are conceptualized as
- Enter monitor (Lock): The thread grabs the lock before executing the critical section.
- Exit monitor (Unlock): The thread releases the lock upon leaving the critical section.
These synchronized blocks ensure that only one thread can execute the protected code at a time, thereby preventing race conditions. Whereas volatile enforces ordering on individual variable accesses, synchronized blocks guarantee mutual exclusion and a stronger form of visibility. They also involve two primitives (enter and exit monitor) that form a part of the memory model’s rules.
At a high level, the Java memory model is trying to describe "causal" relationships between memory operations across threads. Operations that are “in the future” (or outside the allowed reordering boundaries) cannot affect the present computation reliably.
To formalize, what reordering of computation is allowed, the JMM conceptually divides operations into three categories:
- A: Ordinary (non-volatile) loads and stores. These are the most common and are heavily optimized.
- Operations within category A can be freely reordered relative to one another if doing so does not change the observable behavior in a single-threaded context.
- Simple, common example would be two non-dependent assignments, where the compiler might swap them or even eliminate redundant loads/stores.
- B: Volatile loads and “enter monitor” operations. These operations have stronger ordering constraints because they must fetch the latest value or acquire the lock.
- A volatile load means reading a volatile variable, which requires getting the up-to-date value from memory.
- Entering a monitor (or acquiring a lock via a synchronized block) is a synchronization operation that establishes a “happens-before” relationship.
- In terms of reordering constraints, these operations create a barrier. When a normal operation (category A) is immediately followed by a category B operation, the compiler is allowed to swap them as long as it doesn’t change what an observer (another thread) can see.
- For example, a normal load that precedes an acquire (enter monitor) might be moved after the monitor entry—this effectively “expands” the critical section. However, this movement is allowed only if it doesn’t let an external observer see a result that violates the expected ordering.
- C: Volatile stores and “exit monitor” operations. These also enforce ordering, ensuring that changes are visible once the lock is released or the volatile variable is written.
- A volatile store means writing to a volatile variable and ensuring that the value is immediately visible to all threads.
- Exiting a monitor (or releasing a lock) is also a synchronization action that “publishes” the changes made in a critical section.
- In terms of reordering constraints, if you have a category C operation followed by a normal operation (category A), the normal operation can sometimes be moved to occur before the category C operation. This also has the effect of “expanding” the critical section.
- The key is that such reordering must not allow another thread to observe a state that contradicts the intended synchronization order.
The Java memory model dictates that (as "rules"):
- Within category A
- Allowed: Normal loads and stores can be reordered if they are not dependent (e.g., a load of X followed by a store to Y can be swapped if no dependency exists).
- Not Allowed: If there is a data dependency (for instance, reading a variable and then immediately writing a computed value based on it), reordering is disallowed because the outcome could change.
- Between category A and B
- Allowed: A normal (category A) operation that precedes a volatile load or an enter monitor (category B) can be swapped so that the category B operation comes first.
- Effect: This moves the normal operation into the critical section. For example, moving a load after acquiring the lock might be beneficial because it ensures the value read is synchronized with the critical section’s state.
- Between category A and C
- Allowed: Similarly, a normal operation (category A) that follows a volatile store or an exit monitor (category C) can be moved to occur before the volatile store or exit monitor.
- Effect: This expands the critical section in the other direction. In effect, the update to a normal variable is “protected” by the lock even though it appears after the explicit synchronization in the source code.
- Between categories B and C (or within B or within C)
- Careful! Not generally allowed: Operations that are already marked as volatile loads/stores or that involve entering/exiting monitors have stricter ordering because they are the points where the happens-before relationships are established.
- Result: They effectively serve as memory barriers. For instance, a volatile store cannot be delayed or swapped with operations that follow it if that would allow another thread to see stale or inconsistent data.
Even though the JMM allows aggressive reordering within category A for efficiency, it imposes strict limits at the boundaries (Categories B and C) to ensure that once a thread enters a critical section (or reads a volatile value), it sees a consistent state.
We also have a notion of "expanding critical sections". A compiler might move normal loads into a synchronized block (or move stores out of it) if it can do so without affecting the program’s behavior as observed by other threads. This “expansion” of the critical section is allowed by the reordering rules.
Importantly, all these reordering permissions are subject to the as-if rule: the observable behavior of the program in a single-threaded context must remain the same. In a multithreaded context, however, the ordering provided by volatile and synchronized constructs (the barriers) is preserved.
In short:
There are some notable exceptions to this rule. One of them is volatile: every access to a volatile variable has to be done at the machine level in the same order that it appears in your program. Compiler writers like making everything volatile, but programmers hate it because it makes implementation harder/unreliable.
How can we do spin-locking reliably at the hardware level? A re-order table. Which has the following categories of operations:
- A: normal load + store (most common)
- Reading into and storing into non-volatile variables
- B: volatile load + enter monitor (entering the critical section)
- Reading from a volatile variable, monitor means grab a lock
- C: volatile store + exit monitor (exiting the critical section)
- Writing into a volatile variable, letting go of a lock
This gives the following valid optimizations (a labeled sketch follows this list):
- Reordering multiple elements of category A
- Reordering type A (first op) and B (second op) in any order (it is okay to grow the critical section)
- Type C (first op) and type A (second op) in any order.
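A small sketch, assuming a simple writer/reader pair of methods, that labels each access with the categories above (A for the normal field, B/C for the volatile load/store):

class Publish {
    private int data;                   // category A: normal load/store
    private volatile boolean ready;     // volatile: categories B (load) and C (store)

    void writer() {
        data = 42;                      // A: may be reordered among other A ops...
        ready = true;                   // C: ...but cannot be moved after this volatile store
    }

    void reader() {
        if (ready) {                    // B: volatile load, acts as an acquire barrier
            System.out.println(data);   // guaranteed to see 42, not a stale value
        }
    }
}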
1.6.12. Java Project
This is just the Java project code
// State.java interface State { int size(); long[] current(); void swap(int i, int j); } // NullState.java // This is a dummy implementation, useful for // deducing the overhead of the testing framework. class NullState implements State { private long[] value; NullState(int length) { value = new long[length]; } public int size() { return value.length; } public long[] current() { return value; } public void swap(int i, int j) { } } // SwapTest.java import java.util.concurrent.ThreadLocalRandom; class SwapTest implements Runnable { private long nTransitions; private State state; SwapTest(long n, State s) { nTransitions = n; state = s; } public void run() { var n = state.size(); if (n <= 1) return; var rng = ThreadLocalRandom.current(); var id = Thread.currentThread().threadId(); for (var i = nTransitions; 0 < i; i--) state.swap(rng.nextInt(0, n), rng.nextInt(0, n)); } } // Synchronized state class SynchronizedState implements State { private long[] value; SynchronizedState(int length) { value = new long[length]; } public int size() { return value.length; } public long[] current() { return value; } public synchronized void swap(int i, int j) { value[i]--; value[j]++; } } // Unsynchronized state class UnsynchronizedState implements State { private long[] value; UnsynchronizedState(int length) { value = new long[length]; } public int size() { return value.length; } public long[] current() { return value; } public void swap(int i, int j) { value[i]--; value[j]++; } } // UnsafeMemory.java class UnsafeMemory { public static void main(String args[]) { if (args.length != 5) usage(null); try { boolean virtual; if (args[0].equals("Platform")) virtual = false; else if (args[0].equals("Virtual")) virtual = true; else throw new Exception(args[0]); var nValues = (int) argInt(args[1], 0, Integer.MAX_VALUE); State s; if (args[2].equals("Null")) s = new NullState(nValues); else if (args[2].equals("Synchronized")) s = new SynchronizedState(nValues); else if (args[2].equals("Unsynchronized")) s = new UnsynchronizedState(nValues); else if (args[2].equals("AcmeSafe")) s = new AcmeSafeState(nValues); else throw new Exception(args[2]); var nThreads = (int) argInt(args[3], 1, Integer.MAX_VALUE); var nTransitions = argInt(args[4], 0, Long.MAX_VALUE); dowork(virtual, nThreads, nTransitions, s); test(s.current()); System.exit(0); } catch (Exception e) { usage(e); } } private static void usage(Exception e) { if (e != null) System.err.println(e); System.err.println("Arguments: [Platform|Virtual] nvalues model" + " nthreads ntransitions\n"); System.exit(1); } private static long argInt(String s, long min, long max) { var n = Long.parseLong(s); if (min <= n && n <= max) return n; throw new NumberFormatException(s); } private static void dowork(boolean virtual, int nThreads, long nTransitions, State s) throws InterruptedException { var builder = virtual ? Thread.ofVirtual() : Thread.ofPlatform(); var test = new SwapTest[nThreads]; var t = new Thread[nThreads]; for (var i = 0; i < nThreads; i++) { var threadTransitions = (nTransitions / nThreads + (i < nTransitions % nThreads ? 
1 : 0)); test[i] = new SwapTest(threadTransitions, s); t[i] = builder.unstarted(test[i]); } var realtimeStart = System.nanoTime(); for (var i = 0; i < nThreads; i++) t[i].start(); for (var i = 0; i < nThreads; i++) t[i].join(); var realtimeEnd = System.nanoTime(); long realtime = realtimeEnd - realtimeStart; double dTransitions = nTransitions; System.out.format("Total real time %g s\n", realtime / 1e9); System.out.format("Average real swap time %g ns\n", realtime / dTransitions * nThreads); } private static void test(long[] output) { long osum = 0; for (var i = 0; i < output.length; i++) osum += output[i]; if (osum != 0) error("output sum mismatch", osum, 0); } private static void error(String s, long i, long j) { System.err.format("%s (%d != %d)\n", s, i, j); System.exit(1); } }
1.7. Logic Programming and prolog
At a glance, (TODO)
In a logic language, the basic entity is a predicate (like a function which returns true or false - but don't think of it as a function, since that's framing it in functional terms).
We glue together predicates using , (AND), ; (OR), and \+ (negation as failure, discussed more later). We give up functions and side effects. Think declaratively: you specify what answers you want.
We want to split an algorithm into two parts: logic (what you want; spec; correctness) and control (efficiency; implementation).
Rules have conditions. Facts don't have conditions and are always true.
1.7.1. Quicksort in prolog
In prolog
, you don't "say" what type of sort (quicksort, heapsort, etc) you're doing. You just indicate to prolog
that "I have a sorting problem. You figure it out." This is considered thinking declaratively.
Since we have two predicates, we give two arguments to sort:
L
is the list to be sortedS
is the sorted version ofL
we need to specify what sorting means. S has to have the same elements as L; if L has duplicates, S has to have the same duplicates (mathematically: S is a permutation of L). We can write
sort(L, S) :- perm(L, S), sorted(S)
we can think of :-
as "if" and ,
as "and". In English, this reads as "if S
is a permutation of L
and S
is sorted, then S
is a sorted version of L
.
We can also write some "for all L
and for all S
" but these are implicit to all predicates.
Capital letters are logical variables. The scope of a logical variable is just the clause in which it is defined. No nested scope, etc. No need to declare, we are implicitly declaring a variable by writing a capital letter.
At this point, we still have two more predicates that we have yet to define (perm
, sorted
).
First we define sorted
. We can look at sorted
as the following:
- Fact: a singleton list is sorted: sorted([_])
- Fact: empty list is sorted:
sorted([])
- Rule: in a sorted list, X <= Y for X before Y:
sorted([X, Y]) :- X =< Y
- Rule: a list [X, Y|Z] is sorted if X =< Y and the rest [Y|Z] (similar to Y::Z in OCaml) is sorted: sorted([X, Y|Z]) :- X =< Y, sorted([Y|Z])
Next we define perm
. We have the following:
perm([], [])
perm([X], [X])
perm([X, Y], [X, Y])
perm([X, Y], [Y, X])
but we can write a simpler recursive rule to capture all the above base cases (instead of manually enumerating…)
perm([X|L], R) :- perm(L, PL), append(P1, P2, PL), append(P1, [X|P2], R)
Rule: if PL is a permutation of L, and appending P1 and P2 gives you PL, and appending P1 and [X|P2] gives you R, then R is a permutation of [X|L]
append([], L, L)
Fact: appending the empty list to any list L
will give you L
append([X|L], M, [X|LM]) :- append(L, M, LM)
one list is a permutation of another if we can split a permutation of its tail L into two parts P1 and P2, and P1, [X], and P2 appended to each other give us the resulting list.
By the way, this implementation is \(O(n!)\). The search is a depth-first search, and performance also depends on how you write the logic. Who the $*@& is using this?
If we wanted to write a faster sort
, we could do
% Base case: an empty list is already sorted.
fast_sort([], []).

% Recursive case: sort the tail, then insert the head into the sorted tail.
fast_sort([H|T], Sorted) :-
    fast_sort(T, SortedTail),
    insert(H, SortedTail, Sorted).

% insert(+Elem, +SortedList, -NewSortedList)
% Inserts Elem into the correct position in SortedList.
insert(X, [], [X]).
insert(X, [Y|T], [X,Y|T]) :- X =< Y, !.
insert(X, [Y|T], [Y|Rest]) :- insert(X, T, Rest).
1.7.2. prolog
syntax
Prolog is built out of terms, which are one of the following:
atom      := [a-z][a-zA-Z0-9_]*
variable  := [A-Z_][a-zA-Z0-9_]*
structure := atom(T1, ..., Tn)   where n > 0 (arity n)
Atoms are individual values that only have the properties that you assign to them. They are simply names; atoms can never be equal to numbers. You can write atoms in quotes and not have naming restrictions.
A logical variable will stand for any value that you want it to stand for, except every occurrence of the same logical variable in a clause has to be the same thing. Logical variables are bound to terms on success, and unbound on failure (will happen to all occurrences of the logical variable).
A structure f(T1,...,Tn), for \(n > 0\) and \(f\) an atom, where T1, ..., Tn are terms, is a simple data structure in memory (just a vector holding the atom and then the arguments).
1.7.3. Some prolog
syntactic sugar
Syntactic sugar means that while Prolog allows you to write things in a very natural, human‐friendly way, these forms are formally equivalent to more “raw” or explicit term representations. You could write them out in full if you wanted to; the meaning wouldn’t change.
Infix operators: Prolog supports built-in operators (like +, -, =, :-, etc.) with precedences similar to those in languages like C++ or Java. For example, when you write X = 2 + 2., Prolog interprets this as a term that is equivalent to =(X, +(2, 2)).
- In this form the operator
=
is just a functor that tries to make its two arguments equal (via unification), and+
is a functor that constructs a term representing "2 plus 2". - Internally, when you call the equals predicate (
=
), Prolog attempts to unify its two arguments. In our example, it doesn’t actually perform arithmetic addition; it simply constructs the term+(2,2)
and bindsX
to that structure. Later, when Prolog printsX
, it pretty-prints it as2+2
, which is a neat representation of the underlying data structure. - Prolog does not automatically evaluate arithmetic expressions. If you want to perform arithmetic, you must explicitly ask it to evaluate (for example, using the
is/2
predicate). Thus,X
is2 + 2
would compute the sum and bindX
to 4, whileX = 2 + 2
just constructs the term.
Data structures and terms
- Every Prolog expression is a term. Operators and infix notations are just ways to build these terms. For instance, the expression
2+2
is internally the compound term+(2,2)
, which is exactly what you’d write if you were manually constructing the term.
List notation
- The square bracket notation
[a, b, c]
is shorthand for explicit construction using the.
operator. Internally, this list is represented as.(a, .(b, .(c, []))).
Rules as terms
- When you write a rule in Prolog, such as
head :- body.
, this is not some separate language construct but is itself a term. The binary operator:-
(which has low precedence) connects the head of the rule with its body. In other words, a rule is just a compound term that represents both a fact and a rule simultaneously.
1.7.4. Review: Clauses
Clauses: logic statements about the world that are universally quantified. For example: \(\forall x\, \forall y\, \forall z\, (BC(z, F(y, x)) \rightarrow AB(x, F(y, x)))\). In English, this reads "For every x, every y and every z, if BC holds for z and F applied to y and x, then AB holds for x and F applied to y and x."
In prolog
we would express this as ab(X, Y) :- bc(Z, f(Y, X))., which means "For all X and Y, AB(X, Y) is true if there exists some Z such that BC(Z, F(Y, X)) is true."
Sometimes, we might write down two clauses that are very similar, but one is "more specific" (has extra conditions) than the other. The more general clause applies in a wider range of situations because it has fewer constraints.
Substitution test: to decide if one clause is more general than another, you check whether you can take the more general clause and, by consistently substituting its variables with specific terms, get the more specific clause.
- If yes, then the general clause covers all cases of the specific clause (and maybe more)
- If no, then the specific clause cannot be derived from the general one via substitution
For example:
- sibling(X, Y) :- parent(P, X), parent(P, Y), X \= Y. says "X and Y are siblings if they share a parent P and are not the same person." (general clause)
- sibling(john, mary) :- parent(P, john), parent(P, mary), john \= mary. If we substitute X = john and Y = mary in the general clause, we obtain the specific clause.
A substitution is a set of assignments to logical variables. prolog is based on the idea of coming up with substitutions: if you ask prolog a question, you get back some series of substitutions.
- In a substitution, the LHS must be a single variable. e.g.
{F(y) = B}
is not valid.
1.7.5. Three Kinds of Clauses
- Facts: statements that are unconditionally true. Like raw data or assertions. (
prereq(cs31, cs131)
)- Facts can also include variables, making them more general. We also have a "don't care" variable:
prereq(intro101, _)
- Ground term: a fact that does not involve any logical (unbound) variables, e.g.
prereq(cs31, cs131)
- Facts can also include variables, making them more general. We also have a "don't care" variable:
- Rules: rules are clauses that have conditions - written with a
:-
(reads as "if") and define relationships based on other clauses
prt(A, B) :- prereq(A, B)
as our base caseprt(A, Z) :- prereq(A, B), prt(B, Z)
as our recursive case
- Transitive closure: we can combine facts and rules to define more complex predicates. e.g.
- Queries: questions you ask
prolog
to find out what substitutions (answers) make the statement true.- When you pose a query,
prolog
uses a depth first search to look for proofs based on the facts and rules. - You can view a query in a logical sense as saying, "it is not the case that there is no solution." Formally, a query
is equivalent to asserting that if there were no valid subsitutions making the query true, then then false would follow. Prolog works by trying to disprove this "negative" until it finds a counter example (a valid solution). e.g. it using a form of proof by contradiction.
- You can type
;
to keep getting more "counterexamples".
- You can type
- Prolog scans its list of facts and rules from the top, and picks the first clauses that matches. Use the match to set up further subgoals.
- When you pose a query,
Subgoals and proof tree
- When a clause has multiple conditions (subgoals) connected by an "and" (
,
), Prolog tries to satisfy them one after another from left to right- If the first subgoal fails, Prolog will not even attempt the later ones in that branch
- For a proof tree, we have that
- For an and node, all the children (subgoals) must be proven for the node to succeed.
- For an or node, only one branch (one way of proving the goal) needs to succeed. Prolog backtracks if needed.
- Prolog explores the proof space in a DFS manner:
- Dives down one branch (resolving subgoals in l-r order) before trying alternative branches
- If a branch fails at any point, it backtracks to the last decision point (an OR node) to try a different alternative
Backtracking Take for example the following:
% A fact: X is a member if it is the head of the list.
member(X, [X|_]).

% A rule: X is a member if it is in the tail of the list.
member(X, [_|Tail]) :- member(X, Tail).
- You can query a specific element:
?- member(a, [a, b, c]).
which matches the factX = a
and returns true immediately. - You can query for all possible elements:
?- member(X, [a, b, c]).
which first matchesX = a
from the fact. On a semicolon, it backtracks and would next findX = b
. - You can do a proof by contradiction in queries i.e. a query that logically asserts "there is no solution" by negating the possibility of any valid substitution. For example,
?- \+ member(q, [a,b,c]).
, this query asks "is it false thatq
is a member of[a,b,c]
?". Prolog attempts to prove this. Since no substitution mkakes this true, the negation succeeds.
Prolog does not compute all answers before returning the first one. Instead, it returns the first answer it finds immediately, and then backtracks to the next valid answer on ;. Technically, this means that if the logic allows for an infinite number of answers, Prolog will keep generating answers as long as you keep requesting more with semicolons.
Simple vs. complex substitutions
- (Simple substitutions) When prolog finds a straightforward match (binding a variable to a ground term), it often doesn't bother displaying the substitution explicitly because it's trivial, i.e.,
?- member(a, [a,b,c]).
- (Complex substitutions) When the match involves more elaborate structures (such as lists built with "cons"), prolog must print out these substitutions to show how the variables are instantiated. i.e. when the result for query is a compound data structure, prolog displays the entire structure.
- When you ask a query that logically asserts "member is always false", prolog will generate substitutions showing a list structure that contradicts your assumption.
?- member(Q, R).
, hereprolog
might first bindQ
to a variable, and then construct a list whereQ
is the head. This gives you a concrete counterexample.
- e.g.
Finite vs. infinite answers
- In some cases, there are infinitely many ways to construct a list that satisfies the query.
- e.g.
?- member(Q, R).
,prolog
can generate an infinite number of answers because R can be any list containing Q in various positions. Practically, you will run out of memory since we will backtrack indefinitely.
- e.g.
1.7.6. append
The predicate append/3
is commonly defined with two clauses.
- Base case:
append([], L, L).
- Recursive case:
append([X|L1], L2, [X|L3]) :- append(L1, L2, L3).
For example, on querying ?- append([a], [b,c], R).
prolog will
- Match the head of the first list (
a
) - Create a fresh copy of the variable for the tail
- Solve the subgoal
append(L1, [b, c], L3)
, eventually matching the base case - Return
R = [a, b, c]
Scope
- Every time a clause is used, prolog treats the variables in that clause as locally scoped.
Reversing append
You can also use append to split a list into two sublists, e.g. ?- append(R, S, [a,b,c]).
. However, there are possibly infinite answers.
1.7.7. reverse
We want to define a predicate such that reverse(List, ReversedList)
is true if the second argument is the first list in reverse order.
As a base case, the simplest fact is that the empty list is the reverse of itself: reverse([], []).
One idea is to handle non-empty lists by saying that if a list starts with a head X
and a tail L
, then its reverse is the reverse of L
with X
appended to the end:
reverse([X|L], R) :- reverse(L, RevL), append(RevL, [X], R).
. The predicate recursively computes reverse(L, RevL)
, and it then appends [X]
to the end of RevL
by using the append/3
predicate. However, this is inefficient: each recursive call uses append/3, which itself takes time proportional to the length of its first argument, so the whole reversal is \(O(N^2)\).
Improving performance
In order to improve performance, we can avoid repeated appending by defining a helper predicate (rev/3
) that carries an extra argument (accumulator) to build up the reversed list as we traverse the original list.
The helper predicate rev(L, A, R)
is intended to be true when reversing list L
and then appending accumulator A
produces R
. Then,
rev([], Acc, Acc).
rev([X|L], Acc, R) :- rev(L, [X|Acc], R).

reverse(L, R) :- rev(L, [], R).
this gives us an \(O(N)\) algorithm.
reverse
is often used as a benchmark in prolog systems to measure how many logical inferences per second (sometimes called "LIPS") can be made.
1.7.8. Simple predicates
The simplest predicates are
fail/0
, which has no clause defined forfail.
. Querying?- fail.
immediately returnsno
.true/0
, which is defined as a single fact:true
. Querying immediately returnsyes
.
Infinite loops:
loop/0
which is simplyloop :- loop
, which basically has prolog infinitely setting up a subgoalloop
to prove, which sets up a subgoalloop
… etc.repeat/0
is a slightly more comlicated. We haverepeat.
repeat :- repeat.
- When we query this, it will prove
repeat
immediately via the fact. But typing a semicolon makes prolog backtrack and repreat infinitely. - We can use
repeat
to print and debug:?- repeat, write('Ouch!'), fail.
will create an infinite loop of printing.
We can also write predicates poorly. Take for example repeats :- repeats, repeat.
- The rule is self-referential without any grounding fact, which leads to an infinite loop.
- Always ensure that recursive predicates have a well-defined base case or are written in a way that allows the search to eventually terminate.
1.7.9. What can go wrong in prolog
?
- In
prolog
, you may write "wrong code" in the sense that the logic you express does not match your intent.- For example, a predicate intended to decide a relationship may have its rules ordered improperly or be missing a necessary base case.
Debugging in prolog
Prolog provides a few built-in facilities for debugging:
- Use
trace/0
: invokingtrace
puts "hooks" into the interpreter - Uses the 4-port debugging model: every goal (or subgoal) in a proof has four "ports":
- Call: when the goal is first invoked
- Exit: when the goal succeeds
- Fail: when the goal fails
- Redo: when the goal is re-invoked (backtracking) to seek another solution
Take for example the following model:
[Call Q]
   │ (goal processing)
   ├─ Q succeeds ──► [Exit Q]
   │       │ (if further solutions are requested)
   │       └─► [Redo Q] ──► (eventually Exit again, or ...)
   └─ (no alternatives remain) ──► [Fail Q]
- When you execute a query, Prolog will show (if tracing is turned on) each time it enters or leaves a goal.
- You can also "instrument" your code with print statements, e.g. you can write a predicate that, for every success of a goal, prints the bound values and then calls
fail
to force backtracking. This is often more efficient than using the built-in debugger.
This complicated model exists because in prolog
, things can succeed more than once (vs. just returning in other languages). For built-in predicates and their pitfalls, reference the "Simple predicates" section
1.7.10. Unification, two-way matching, and infinite (cyclic) terms
Unification is the process of making two terms identical by finding a substitution (i.e. binding variables to values), that, when applied, makes the terms the same. In prolog
, unification is used
to match a goal against the head of a clause. Every time you execute a goal, prolog unifies the goal with a clause head to see if the clause applies.
In OCaml or other functional languages, we have one-way pattern matching: we have some fixed data structure and we match it against patterns that may include variables. The data is "given" and cannot contain variables. In Prolog, we have two-way pattern matching:
- Both the goal and the clause head may contain logical variables
- Variables can be instantiated during unification in either direction
- Advantage: more powerful; allows for bidirectional data flow
- Disadvantage: can lead to unintended consequences
Take the following examples:
- Basic two-way matching: ?- X = f(Y).
  - Here the left side is the variable X and the right side is the compound term f(Y), where Y is unbound.
  - Unification succeeds by binding X to f(Y).
  - Even though Y remains unbound, the binding is two-way: Prolog now "knows" that X and f(Y) are the same term.
  - Unification succeeds, result: X = f(Y).
- Matching two compound terms: ?- f(X, a) = f(b, Y).
  - Both terms have the same functor f/2.
  - Prolog compares the corresponding arguments:
    - First arguments: X and b -> binds X = b.
    - Second arguments: a and Y -> binds Y = a.
  - Unification succeeds, result: X = b, Y = a.
- More complex unification: ?- f(g(X), Y) = f(Z, g(a)).
  - Both terms start with the functor f/2, so Prolog attempts to unify the arguments:
    - For the first argument: g(X) must equal Z. So, Z is bound to g(X).
    - For the second argument: Y must equal g(a). So, Y is bound to g(a).
  - Unification succeeds: Z = g(X), Y = g(a).
Now we consider cyclic terms and discuss the "occurs check". Consider the fact p(X, X).
- Querying ?- p(a, Y). successfully unifies X with a and Y with a.
- However, trying a more complicated unification like ?- p(Z, f(Z)). will have prolog attempting to set Z = f(Z).
In the second (bad) case, internally, prolog binds Z to the term f(Z), creating a cycle (a term that refers to itself). This is cheap to do (just setting a pointer), but when you try to print or process this structure, you may run into an infinite loop. These cyclic terms can lead to unintended infinite loops.
How can we fix this? For efficiency, standard unification in prolog does not perform the "occurs check" (i.e. it does not verify that a variable does not occur within the term it is being bound to).
- We can use unify_with_occurs_check/2: a built-in predicate that performs unification but fails if the unification would create an infinite cyclic term.
  - For example, ?- unify_with_occurs_check(Z, f(Z)). returns false. The tradeoff is that it is slower, since it must traverse the entire data structure in order to check for cycles.
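A short session contrasting plain unification with the occurs-check version (how, or whether, a cyclic term gets printed varies by implementation):

?- X = f(X).
X = f(X).          % many systems silently build the cyclic term; printing it may loop on others

?- unify_with_occurs_check(X, f(X)).
false.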
1.7.11. Peano Arithmetic
The representation of numbers in prolog is interesting.
- Atoms and successor function: some prolog systems may not have built-in arithmetic. So, we represent numbers using the atom z to represent 0 (lowercase, since a capital Z would be a Prolog variable) and the functor s/1 to represent the successor (i.e. plus one). So the number 2 is represented as s(s(z)).
- Defining addition:
  - Base case: adding 0 to any number yields that number
plus(z, X, X).
  - Recursive case: adding the successor of X to Y results in the successor of (X plus Y). Similar to list append logic
plus(s(X), Y, s(Z)) :- plus(X, Y, Z).
  - To compute 2 + 2, we would have ?- plus(s(s(z)), s(s(z)), R).
- Defining subtraction and comparison:
  - We can define subtraction similarly by "inverting" addition.
  - First we define a less-than predicate, with the rule that 0 is less than any non-zero number:
lt(z, s(_)).
  - The recursive rule: a number is less than the successor of another number if it is less than that other number.
lt(s(X), s(Y)) :- lt(X, Y).
  - Pitfall: unification might yield cyclic terms (i.e. trying to deduce a number less than itself may force a term like Z = s(Z)).
To avoid cyclic terms in arithmetic, we could use unify_with_occurs_check/2, at the tradeoff of efficiency.
lt(X, Y) :- unify_with_occurs_check(X, Y), % Ensures X is not "inside" Y % ... further conditions ... true.
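As a quick check of the plus/3 and the simple lt/2 clauses above (using the lowercase atom z for zero), a top-level session might look like:

?- plus(s(s(z)), s(s(z)), R).
R = s(s(s(s(z)))).

?- lt(s(z), s(s(z))).
true.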
1.7.12. Control mechanisms in prolog: proof trees and backtracking
The default control flow in prolog
involves the proof tree and backtracking:
- Prolog's execution can be thought of as exploring a proof tree.
- The nodes in the proof tree are:
- Or-nodes: represents alternative ways (clauses) to prove a goal. These represent a choice between different clauses that might prove a goal. For example, if a predicate is defined by two clauses, the proof tree will branch (an “or” branch) showing both possibilities.
- And-nodes: represent conjunctions where all subgoals must succeed. These represent the situation where multiple subgoals (conjuncts) must all succeed for the parent goal to succeed. In a clause like:
goal :- subgoal1, subgoal2.
, bothsubgoal1
andsubgoal2
must be proven, this is an "and" relationship.
The default strategy in prolog is depth-first, left-to-right search. It explores one branch fully, backtracking only when necessary.
Recall that in HW2, we had to manually define an "acceptor" function to handle multiple solutions. Here, prolog
automatically backtracks.
- When a goal fails, Prolog returns to the previous choice point (an or-node) and tries a different alternative.
- This backtracking continues until a solution is found or no more alternatives remain.
We can visualize a proof tree. Consider a simple prolog
program:
% Facts for colors of fruits. fruit(apple). fruit(banana). % Rule: A fruit is tasty if it is an apple. tasty(Fruit) :- fruit(Fruit), Fruit = apple. % Alternative rule: A fruit is tasty if it is a banana. tasty(Fruit) :- fruit(Fruit), Fruit = banana.
Querying ?- tasty(X).
would essentially have prolog
DFS L-R searching the following simplified diagram
[tasty(X)] │ ┌────────────┴────────────┐ <--- OR-node (two alternative clauses for tasty/1) │ │ Clause 1: tasty(Fruit) :- Clause 2: tasty(Fruit) :- fruit(Fruit), fruit(Fruit), Fruit = apple. Fruit = banana. │ │ (And-node: Both (And-node: Both subgoals must subgoals must succeed) succeed) │ │ ┌──────┴──────┐ ┌─────┴─────┐ │ │ │ │ [fruit(Fruit)] [Fruit = apple] [fruit(Fruit)] [Fruit = banana] │ │ │ │ Succeeds: Binds: Succeeds: Binds: Fruit = apple Fruit = apple Fruit = banana Fruit = banana │ │ │ │ └─────────────┴────────────┘ └─────────────┴─────────────┘ Each clause yields a solution: X = apple OR X = banana
there are two solutions: X = apple
and X = banana
. Consider a predicate defined with multiple subgoals
% Facts: parent(john, mary). parent(mary, susan). % Rule for grandparent: grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
On the query ?- grandparent(john, Who).
, we have the following proof tree:
[grandparent(john, Who)] │ (Rule: grandparent(X,Z) :- parent(X,Y), parent(Y,Z)) │ ┌──────────┴──────────┐ <--- AND-node (both subgoals must succeed) │ │ [parent(john, Y)] [parent(Y, Who)] │ │ Fact: Y = mary Fact: Who = susan │ │ └──────────┬──────────┘ │ Conclusion: grandparent(john, susan) holds
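Running the query at the top level then gives (output formatting varies slightly by implementation):

?- grandparent(john, Who).
Who = susan.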
1.7.13. Control mechanisms in prolog: the cut operator (!)
The cut operator ! is a control primitive that tells the prolog interpreter to "commit" to the choices made up to that point and not consider any alternative solutions for the current predicate. How?
- When
prolog
encounters a cut, it succeeds immediately. - If later backtracking reaches the cut, the interpreter does not try any alternatives that were available before the cut - it "cuts them off."
Vaguely, in a proof tree we might have
(Goal P) / \ ... Alternatives | │ [!] <--- Cut: No alternative exploration here | Continue with rest of P
Suppose we have a predicate that generates candidates using member/2 and then performs an expensive test expensive_test/2 only if the candidate is valid. Without cut, if expensive_test fails, prolog backtracks and may re-try the same candidate multiple times:
% Without cut: Multiple alternatives might be explored. process(X, Z) :- member(X, List), expensive_test(X, Z).
to prevent multiple re-evaluations of expensive_test/2 for the same candidates, you can write a variant of member (commonly known as a member check):
% Member check using cut: member_check(X, [X|_]) :- !. member_check(X, [_|Tail]) :- member_check(X, Tail).
here, the cut !
ensures that once a match is found, prolog
does not try alternative clauses for member_check/2
for that candidate.
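A quick top-level comparison of the two (exact answer formatting varies by implementation):

?- member(a, [a, b, a]).
true ;        % backtracking also finds the second a
true.

?- member_check(a, [a, b, a]).
true.         % the cut commits to the first match; no further answers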
Scope:
- The cut only affects alternatives in the current predicate. It does not remove choice points outside its predicate.
- Placing a cut too early or too late can affect whether some solutions are even considered.
Now we can move onto the once/1 predicate: it serves as a meta-predicate (once(P)) that succeeds if P can be proved at least once, but prevents any further backtracking into P. At a high level,
once(P)
will try to proveP
.- If
P
succeeds,once(P)
succeeds. - If you attempt to backtrack (by i.e. typing a semicolon), no alternative solutions for
P
will be considered.
% Suppose you have a predicate that may generate many solutions: candidate(X) :- member(X, [a,b,a,c]). % To ensure only one solution is considered: process_once :- once(candidate(X)), write(X).
when we call process_once
, it prints only the first solution for candidate(X)
even if further alternatives exist.
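For example, at the top level:

?- once(member(X, [a, b, c])).
X = a.        % typing ; offers no further solutions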
Next we can discuss the backslash-plus operator (\+) in prolog. It is a unary prefix operator that implements negation as failure. Its behavior is essentially that of a two-clause meta-predicate:
- First clause: cut-fail branch
  - If the goal P succeeds, then the definition immediately performs a cut and fails.
  - This branch "rejects" any situation where P can be proved.
- Second clause:
  - If the goal P fails, then \+ P succeeds; if prolog cannot prove P, it concludes that P is not provable.
Informally,
% Negation as failure definition (informal) \+ P :- P, !, fail. \+ _.
- The first clause says: "Try to prove P. If P succeeds, execute the cut (!) to prevent backtracking, then force failure with fail."
- The second clause is reached only if the first clause fails (i.e. P was not provable). In that case, \+ P succeeds.
Consider the following examples. First, consider negating a simple equality. Take the query ?- \+ (X = 9). Step-by-step, we have
- An attempt to prove the inner goal:
prolog
first tries to prove the goal(X = 9)
.- Since
X
is unbound,X = 9
succeeds by bindingX
to9
.
- Since
- First clause activation
  - Because the inner goal succeeded, the first clause of \+ is used.
  - The cut ! is executed, which prevents any backtracking to try alternative proofs for (X = 9).
  - Immediately after the cut, fail is executed.
  - Result: query fails (returns no.).
Importantly, the operator \+ does not "return" a new binding for X. Instead, it tests whether the given goal can be proven. In this case it can, so the negation fails.
Now consider negating a non-provable goal. Take the query ?- \+ (member(9, [1,2,3])).
. Step-by-step we have
- An attempt to prove the goal
member(9, [1,2,3])
- Since
9
is not in the list, the goal fails.
- Since
- Second clause activation
  - Since the inner goal failed, prolog does not invoke the first clause.
  - Instead, it falls through to the second clause, which succeeds unconditionally.
  - Then, this query ?- \+ (member(9, [1,2,3])). succeeds.
Below is a diagram that visualizes the evaluation of \+ P:
Start: Evaluate "\+ P" │ ---------------- | | Try to prove P (P fails?) │ │ [P succeeds] [P fails] │ │ Execute P, then Directly cut (!) and fail succeed │ │ (No alternative) \+ P succeeds │ \+ P fails
1.7.14. Philosophy of prolog
We are concerned with provability vs. truth now.
- Provability:
- In Prolog,
\+ P
succeeds ifP
is not provable from the given facts and rules. This is the essence of negation as failure. - In Prolog, “+ P” means “P is not provable from the given facts and rules.”
- In Prolog,
- Truth under the Closed World Assumption (CWA):
- CWA: assumes that the database contains all true facts.
- Under CWA, if
P
cannot be proven, it is assumed to be false. - However, in the "real world" (or in incomplete databases), failing to prove
P
does not necessarily meanP
is false.
In practice, consider the following example:
- In a course prerequisite database, if the fact "CS131 is a prerequisite for Dance201" is not present, \+ P will succeed.
- Many programming environments (e.g., in imperative languages) operate under a closed world assumption.
There are some limitations of negation as failure. Namely,
- Non-commutativity: Because of the cut and the order of evaluation, the order of conjuncts matters. For example:
- If you ask
\+ (X = 6), X = 5
, Prolog bindsX = 5
first, then tries to proveX = 6
(which fails), so the negation succeeds. - If you reverse the order, you might end up with a different (and unintended) behavior.
- If you ask
- Counterintuitive behavior: The technique works well when the domain is well understood (closed world), but it can be unsound or counterintuitive in cases where alternative bindings exist. For instance, if you try to use it to prove “X is not 9” by writing
\+ (X = 9)
:- Prolog will fail because it can bind X to 9, even though there might be other possible values for X.
- The logic here does not “search for” a value that makes the goal false—it simply checks if the goal is provable.
Incompleteness: Peano's dream (circa 1900) was to have a complete and consistent theory for integer arithmetic. However, Gödel's incompleteness theorems later showed that for sufficiently complex theories, there are true statements that cannot be proven. For prolog, the use of \+ (negation as failure) mirrors this in that it bases "falsehood" on the inability to prove a statement, rather than on an absolute notion of falsehood.
Take for example our query \+ (X = 9).
. Although there exist values for X (like 10) that would make the statement “X is not 9” true, Prolog simply tests if it can prove X = 9
. Because Prolog unifies X with 9 immediately, the negation fails—even though we might think “X could be something else.” There's some more trouble we can run into. Suppose we have two goals:
- Goal A: bind
X
to5
(X = 5
) - Goal B: prove that
X
equals6
(X = 6
).
We then apply the negation operator \+ (negation as failure) to the second goal. There are two orders to consider:
?- (X = 5), \+ (X = 6).
- First goal:
X = 5
; Prolog binds X to 5. - Second goal:
X = 6
. Prolog tests ifX = 6
can be proven. WithX
already bound to 5, the unification5 = 6
fails. SinceX = 6
fails, the negation \+
succeeds. - Result: overall query succeeds,
X
remains bound to 5.
?- \+ (X = 6), (X = 5).
- First goal:
\+ (X = 6)
. Prolog attempts to proveX = 6
. BecauseX
is not yet boundX = 6
succeeds by bindingX
to 6. The \+
operator then immediately executes its cut andfail
(first clause of the+
definition):\+ P :- P, !, fail.
- Because
X = 6
succeeded, the cut is executed and thenfail
forces the whole negation to fail.
- Since the first goal fails,
prolog
does not even proceed to the second goal(X = 5)
. - Result: the overall query fails and no solution is returned.
- First goal:
In prolog
, the order in which you write goals is critical when using negation as failure.
- In order 1, because the binding occurs first (
X = 5
), the subsequent test (X = 6
) fails as expected. - In order 2, because the negation is applied before any binding, the unification
(X = 6)
succeeds (binding X to 6) and triggers the cut-fail sequence.
This behavior demonstrates that when cuts and negation are involved, the conjunction operator ,
is not commutative. This departure from commutativity is one of the reasons why logicians become nervous when cuts are introduced. The order-sensitive behavior can lead to results that are “illogical” or counterintuitive when compared to classical logic, where conjunction is commutative.
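A top-level session illustrating the two orders (assuming a standard Prolog with negation as failure):

?- X = 5, \+ X = 6.
X = 5.

?- \+ X = 6, X = 5.
false.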
1.7.15. Logic
Why should we care about logic? We've seen that using cuts and negation as failure in Prolog can cause non-commutative behavior, e.g. sensitivity to the order of goals. Understanding why these anomalies occur requires some background in logic. Prolog is built upon well-established logical systems that originate in philosophy. Today, we review the simplest system, propositional logic, and then see why it isn't enough to capture everyday reasoning, which leads us to first-order logic and finally to Horn clauses.
Proposition: simple statements about the world. We abbreviate them with simple letters (P, Q, R, …).
Basic Connectives:
- Negation (¬): The negation of P (written as ¬P) is true if P is false and vice versa.
- Conjunction (∧): P ∧ Q is true only if both P and Q are true.
- Disjunction (∨): P ∨ Q is true if at least one of P or Q is true.
- Implication (→): P → Q (read “P implies Q”) is false only when P is true and Q is false.
- Truth table:
- If P is false, P → Q is always true.
- If P is true and Q is true, then P → Q is true.
- If P is true and Q is false, then P → Q is false.
- Truth table:
- Biconditional (↔): P ↔ Q means “P if and only if Q” (they have the same truth value).
There are some limitations of propositional logic.
- Expressiveness - Propositional logic treats propositions as atomic entities. It cannot express internal structure (e.g., “Socrates is a man” and “All men are mortal”).
- Syllogisms: e.g. "All men are mortal", "Socrates is a man", "Therefore Socrates is mortal." In propositional logic, you’d have to treat these as whole propositions (say, P, Q, R), but then the logical form isn’t a tautology. It is merely an implication that might not hold for arbitrary truth assignments.
First-Order Logic (Predicate Calculus) adds
- Predicates with arguments: Instead of atomic propositions, you have predicates such as man(x) and mortal(x).
- Quantifiers: universal quantifier (for all) and existential quantifier (there exists)
A tautology is a formula that is true regardless of the interpretation of its component terms, with only the logical constants having a fixed meaning.
A Horn clause is a clause with at most one positive (unnegated) literal.
- Fact: a Horn clause with one positive literal and no antecedents (man(socrates).)
- Rule: a Horn clause with one positive literal (the head) and a body, a conjunction of literals (mortal(X) :- man(X).)
- Query: also expressed as Horn clauses.
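In Prolog notation, the syllogism from earlier becomes these Horn clauses:

man(socrates).          % fact
mortal(X) :- man(X).    % rule

?- mortal(socrates).    % query
true.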
Prolog works on Horn clauses. There are some reasons:
- Efficiency
- Prolog's inference mechanism (DFS, backward chaining) works efficiently on Horn clauses
- They restrict the form of logical expressions so that the problem of logical inference becomes tractable in practice
- Limitation: less-expressive than full first-order logic
We can convert general first-order logic to Horn clauses. Take for example:
- “If you are in Santa Monica and you have a license, then you are either a dog, a cat, or a pig.”
This clause has a disjunction in the consequent. In practice, we would split this into multiple Horn clauses:
licensed_in_sm(X) -> dog(X). licensed_in_sm(X) -> cat(X). licensed_in_sm(X) -> pig(X).
Prolog then works efficiently because each clause has at most one positive literal.
1.8. Scheme
At a glance, (TODO)
First, why scheme? The idea of Scheme is to try and strip away as much as possible from the core of the language. But, more specifically:
- Simplicity of syntax: Scheme has a very simple and uniform syntax compared to languages like ML or C++. Every expression is written as a list, which makes parsing and manipulation straightforward.
- Programs as data: In Scheme, code is data. Every Scheme program is represented using the same data structures (lists, atoms, pairs) that you use for ordinary data. This allows you to write programs that generate, transform, or evaluate other programs on the fly.
- Continuations: (later) Advanced control structure in Scheme. They provide a powerful way to manipulate control flow and are often considered one of the nicest low-level control primitives available. Like
goto
inC++
or!
inprolog
. - Scheme doesn’t have exception handling or even operations like addition in its core.
- Objects in Scheme are dynamically allocated and never explicitly freed (garbage collection).
- Types are latent, not manifest, meaning that you look at an object at runtime to figure out what type it is and that it’s not obvious what the type of an object is from the code.
- Scheme has static scoping, meaning it has to maintain both a dynamic and static chain. Note that this is unlike Lisp, which uses dynamic scoping. Scheme is call by value and objects in Scheme include the usual (numbers/data) as well as procedures, including continuations.
- Scheme syntax is very simple and includes a straightforward representation of a program as data.
- Finally, tail recursion optimization is required, not optional.
1.8.1. Basic Scheme syntax
- Uniform syntax: a function call in Scheme is written with an opening parenthesis followed by the function and its arguments, then a closing parenthesis, e.g. (+ 2 2).
- Nested calls: Scheme allows you to nest expressions. (* (+ 1 2) 3) evaluates to 9.
- Function as data: Scheme also allows function calls where the result of one expression is treated as a function (like partial application). For example, ((f 3) 4).
Data Structures
The cons cell: the fundamental data structure, constructed using the cons function, which creates a pair from two values, e.g. (cons 10 20) creates a pair.
In Scheme, a list is either the empty list ()
or a cons cell whose second element is a list. For example, to build the list (10, 20, 30)
, we would do (cons 10 (cons 20 (cons 30 '())))
where '()
is the empty list.
As a diagram:
cons cell ┌───────┐ │ 10 │──► (cons cell) └───────┘ ┌───────┐ │ 20 │──► (cons cell) └───────┘ ┌───────┐ │ 30 │──► () └───────┘
An improper list occurs when the final "cdr" (second part) of a cons cell is not the empty list. e.g. (cons 3 (cons 4 5))
, and as a diagram,
cons cell ┌───────┐ │ 3 │──► (cons cell) └───────┘ ┌───────┐ │ 4 │──► 5 (not a list) └───────┘
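A small sketch of taking these structures apart with car and cdr:

(define xs (cons 10 (cons 20 (cons 30 '()))))   ; the proper list (10 20 30)
(car xs)        ; => 10
(cdr xs)        ; => (20 30)

(define p (cons 3 (cons 4 5)))                  ; the improper list (3 4 . 5)
(cdr (cdr p))   ; => 5, which is not a list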
Quoting in Scheme prevents evaluation. When you quote an expression, you tell Scheme to treat it as literal data rather than trying to evaluate it as a function call. Simply do 'F
, which produces the symbol F
without trying to evaluate it. Or '(1 2 3)
returns the list (1 2 3)
as data, rather than trying to call the function 1
on 2
and 3
. It is syntactic sugar for quote
.
Because Scheme code is represented as lists, you can quote an entire program. For example, '(define (square x) (* x x))
is treated as a list rather than being executed. Scheme will print it as (define (square x) (* x x))
.
1.8.2. Scheme vs. other languages
- Simple syntax: Scheme’s syntax is minimalistic; every expression is either an atom or a list. This simplicity allows programs to be easily manipulated and represented as data. (very unlike C++, whose simplest program representation is probably a string)
- Objects are allocated dynamically, and never freed, we use a garbage collector.
- Dynamic typing: Unlike ML or OCaml, Scheme does not enforce static type checking. Instead, type checking occurs at runtime. This flexibility means you can mix different types in a list if you’re careful.
- Interactivity: Scheme’s simple syntax and representation make it very suitable for interactive programming and rapid prototyping. You can easily write, modify, and evaluate code on the fly.
- Static scoping: all the identifiers in the language that are declared are scoped at compile time, in the sense that you can tell where they are being used by checking the program statically at compile time. The compiler knows how to scope the variable.
- Like basically every language we have covered: ML, C++, Java, Python
- I.e., a variable's scope is determined by its position in the source code. The meaning of a variable is fixed by the block in which it is defined.
What about dynamic scoping?
We have dynamic scoping in languages like Lisp and sh
- In Lisp, you have to look at the caller's context to figure out what a nonlocal variable means
(define (f x) (+ x y))    ; 'y' is not locally defined in f
(let ((y 12)) (f 13))     ; here, when f is called, y is bound to 12
(let ((y 19)) (f 3))      ; here, when f is called, y is bound to 19
- Shell environment variables are passed dynamically and use dynamic scoping
The downside of dynamic scoping is that it's hard to optimize at compile time into machine code, and it has more run-time cost. More importantly, it's harder to debug and gives less software reliability.
This begs the question, why use dynamic scoping at all? For example, if the shell is not dynamically scoped and environment variables are not passed dynamically, what changes? In terms of general advantages, we get
- Implicit context sharing: Variables are resolved based on the call stack at runtime, which means that functions can automatically access variables defined in their caller's context. This eliminates the need to pass every variable explicitly as a parameter, simplifying code in certain cases.
- Flexibility: Dynamic scoping allows programs to adapt behavior based on the current runtime context. This is particularly useful in situations where the behavior of a function might need to change according to the environment in which it is called.
Specifically for shell scripts, there are two main advantages
- Implicit inheritance: Functions and subprocesses automatically see any modifications to environment variables made in the caller’s context. This can simplify scripting since you don’t need to explicitly pass variables as parameters between functions or commands.
- Flexibility in configuration: Dynamic scoping allows for a more flexible and adaptive behavior. For instance, a script can set an environment variable and have that setting influence all subsequent operations, without having to re-declare or pass it around. This can be especially handy in interactive shells or scripts where the context may change over time.
If the shell were not dynamically scoped, there are some advantages and disadvantages. Specific to environments, we have
- Explicit passing of variables: Without dynamic scoping, each function or command would operate within a fixed, lexically determined environment. This means that any changes to an environment variable in a caller would not automatically be visible to the callee. Instead, you’d have to pass these variables explicitly. While this can make dependencies clearer and improve predictability, it also introduces more boilerplate code and is against what
sh
is about. - Potential for reduced side effects: Static scoping confines variables to well-defined blocks of code. This isolation could improve reliability by preventing unintended side effects. For example, a change to an environment variable in one part of the script wouldn’t unexpectedly alter the behavior of code in another part. However, the trade-off is a loss of the convenience that comes with the implicit sharing of state.
- Optimizations, debugging: With static scoping, compilers or interpreters might have an easier time optimizing code since the variable bindings are fixed at compile time. Conversely, dynamic scoping requires additional runtime checks and may introduce subtle bugs if variables are unintentionally shadowed or overwritten, complicating debugging efforts.
Returning to the main topic:
- Scheme uses call-by-value: when you call a function, there is a simple calling mechanism that copies the value (like ML, C++, Java, Python)
- This can also be thought of as having all arguments evaluated before a function is called.
- Objects include the usual (full-fledged, unbounded integers, strings, etc), and procedures are considered to be objects
- Procedures (functions) are treated like lists by the compiler just like everything else
- This also includes continuations
- High-level arithmetic - scientific programming should be correct, we get unbounded integers, floats, easy addition, etc.
- TRO: tail recursion optimization. We discuss this further now.
TRO
First, a tail call is a function call made as the last operation in a function. Tail calls allow tail recursion optimization (TRO), meaning that recursive calls do not grow the call stack; the stack shrinks or stays constant.
A naive, typical way to do a recursive factorial could look like
(define (factorial n) (if (zero? n) 1 (* n (factorial (- n 1)))))
Function calls here are done recursively via a stack where you place the more recent function calls on top.
- In our case, the last call would be the
*
function, and when*
returns thenfactorial
returns.
However, for this to be a tail call, the last call made by the function would have to be the call to *: the last function call on the stack (on top of main) must be *.
- Everything that happens in the factorial call must be done by the
*
function.
We want to require TRO because we don't want to overflow the stack with our factorial
calls. In order to improve the
code above, we want the function to call itself at the end so that it pops itself off and then also pushes itself back on,
so the stack will not grow (stay constant).
We want to multiply first and then call factorial
, rather than call factorial
first then multiply. Instead, we want
(define (factorial n) (fact n 1)) (define (fact n a) (if (zero? n) a (fact (- n 1) (* a n))))
Now, following the program, the main function will call factorial
, factorial
will call fact
, fact
will call zero?
, zero
will return
then fact
will call fact
with a smaller argument.
fact
will not blow up the stack, in fact Scheme will turn this code into a for loop automatically!
This type of programming, where we use recursion to essentially do the work of what would typically be a for loop, is so common that Scheme has a simpler way of achieving it: a named 'let'.
A named 'let' is different from an ordinary let
, which looks something like
(let ((x (+ 9 3 -7))) (* x (+ x 3))) ; something like ; { ; int x = 9 + 3 + -7; ; return x * (x + 3); ; }
instead, we have
; named let: (define (factorial n) ; Define an internal/inline function ONLY in the scope of factorial ; n' starts as n, a starts as 1 (let fact ((n' n) (a 1)) (if (zero? n') a (fact (- n' 1) (* n' a))))) ; Kinda like a for loop but we use let ; But generally people will just write (n' n) as (n n) ; This is loops but in a functional sense with no side effects
The named let binds local variables and defines a local recursive function fact without needing a separate define.
1.8.3. Scheme syntax
Identifiers in scheme are almost anything: a-zA-Z0-9+-.?*<>:%_...
etc.
- This means that something like
*
is just an identifier in Scheme. For example,
(let ((* 5)) (- * 3)) ; This is valid code
Identifiers cannot start with the characters 0-9, +, -, or ., with the exceptions of +, -, ..., and -> followed by more characters (i.e. ->xyz is an identifier).
We comment with ;
. Lists look like (l i s t s)
. Booleans have #t
and #f
. Interestingly, #f
is the only false value in Scheme.
So, something like (if 0 5 10) in Scheme will return 5, because 0 counts as true. Vectors look like #(a v e c t o r). Internally, they are represented as arrays under the hood, unlike lists (which are linked lists).
Strings look like "Strings"; characters look like #\c, which is the lowercase character c.
Interestingly, numbers are represented a couple of ways in Scheme. We can write things like 12, -19, or we could do 12e-9 (scientific notation), or also write 2/3, which is an exact representation of 2/3. If you multiply this by 3/2 it will return 1. It's represented internally by a pair of integers, 2 and 3, which represents the fraction exactly.
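A brief illustration of exact rational arithmetic (exact->inexact is the standard conversion procedure; some implementations also call it inexact):

(* 2/3 3/2)            ; => 1 (exact)
(+ 1/3 1/3 1/3)        ; => 1, with no rounding error
(exact->inexact 2/3)   ; => 0.6666666666666666 (approximately)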
1.8.4. Quoting in Scheme
If you put an apostrophe in front of any expression, Scheme handles it differently. If we have (quote E), we take E as a piece of data and do not evaluate the expression. For example, for a list '(a b c) it returns the linked list of a b c and will not evaluate the expression as a call of a with arguments b c.
- We can also write '(a b . c), i.e. (quote (a b . c)), which creates [a | -] --> [b | c], the same structure as (cons 'a (cons 'b 'c)).
  - Essentially, . creates a pair in Scheme (similar to a tuple). So calling (cons 3 4) returns '(3 . 4) (the spaces around the dot are necessary).
Digression on special forms
The quote is considered a special form. A special form is something that looks like a function call, but it's not. For example, some special forms are define
and if
. We're not calling functions, just using a built-in feature. By contrast, not is an ordinary function; the idea is that special forms cannot be defined in terms of other things in the language, whereas not could be defined as
(define (not x) (if x #f #t))
one may think that we could re-write if
like
(define (if-test cond then-part else-part) (if cond then-part else-part))
and then call it as (if-test (zero? n) 1 (/ 1 n)); this will not work, since Scheme does call by value. Scheme would evaluate both 1 and (/ 1 n) before if-test is applied.
If n
is zero, we would actually get a division by zero. If if
were not a special form, you would need to delay the branches. One approach would be
to pass the branches as zero-argument procedures (thunks) so that they are not evaluated until explicitly called. For example,
(define (if-func cond then-thunk else-thunk) (if cond (then-thunk) (else-thunk))) ; call it like (if-func (zero? n) (lambda () 1) (lambda () (/ 1 n)))
1.8.5. Special form: lambda
The most special special form in Scheme is the lambda
expression. We could write something like (lambda (x) (+ x 1))
which just takes the argument and adds one.
We can also nest lambdas in lambdas. For example,
(define f (lambda (x) (lambda (y) (+ x y)))) ; scope of x here extends to the bottom call ; you can curry functions here just like you can in ml ; fun x -> fun y -> x + y (define g (f 3)) (g 7) ; same as let f = fun x -> fun y -> x + y, let g = f 3, g 7
Scheme has a notion of variadic functions, you can write something like (define printf (lambda (format . args) (........)))
, this is a special form for
a function that takes one or more arguments. We could call the function like (printf "%d hello %r" 19 "abc")
, when this function is called,
args
is bound to a list of the caller's trailing arguments. It would look something like [lambda | -]->[[format | args] | -]-> ........
, and we take it apart
with car
and cdr
.
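A minimal sketch of the same variadic mechanism with a made-up function count-args:

(define count-args
  (lambda args          ; no parentheses: args is bound to the whole argument list
    (length args)))

(count-args 'a 'b 'c)   ; => 3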
Lambda is perhaps the lowest level of function definitions in Scheme, it's hardwired into the compiler. You cannot define lambda in terms of anything else. However, there's common syntactic sugar for defining functions.
(define (f x y) (+ x (* y 3)))   ; same as (define f (lambda (x y) (+ x (* y 3))))
(define (list . x) x)            ; same as (define list (lambda x x))
                                 ; this sugar is also built into the compiler
Calling something like (list a b (+ c 5))
will evaluate the three expressions, i.e. returns a 3 item list
with 3 values that return from evaluating a
, b
and (+ c 5)
. If you quote this, it will just return the data structure.
Back to quoting
We have quasiquoting / dequoting, with `
(backtick). It allows you to "unquote" parts of the expression (using a comma ,
) so those parts
are evaluated instead of taken literally. Suppose somewhere in your code you have (define b 37)
. Now b
is bound to the value 37
.
When we dequote the list `(a ,b (+ c 5))
, it will look something like (a 37 (+ c 5))
. In this way,
quasiquoting is like a template: we fill it in with values later. Unlike quoting, we can nest quasiquotes. For example, take
`(a ,(append `(3 x) l) (+ c 5))
. Since a
is not unquoted, it remains a
. The second element is ,(append (3 x) l)
, which
means the expression should be evaluated. So `(3 x)
expands to (list 3 x)
. Then, (append '(3 x) l)
is evaluated. Suppose l = '(y z)
, then our final
expanded list is (a (3 x y z) (+ c 5))
.
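Putting that walk-through into a short runnable sketch:

(define b 37)
(define l '(y z))

`(a ,b (+ c 5))                    ; => (a 37 (+ c 5))
`(a ,(append `(3 x) l) (+ c 5))    ; => (a (3 x y z) (+ c 5))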
1.8.6. Scoping
If in a let you declare several variables, such as the following (assume a and x are defined outside):
(let ((x (+ a 3))
      (y (* x 2)))
  (+ (* x x) (- y x)))   ; body of let
; x and y are visible in the body, but x's binding is not visible in y's initializer
In Scheme's standard let
, all the binding expressions are evaluated simultaneously in the environment outside
the let
. This means that each binding's initializer is computed without knowing about any of the other variables defined in the same let
. More specifically, step-by-step:
- When you enter the
let
, Scheme creates a new environment for the variables that will be bound (in this case,x
andy
). - However, the expressions used to compute the values for
x
andy
are evaluated in the environment outside thelet
. This means that any bindings introduced in this samelet
are not visible during these evaluations.
In our example:
- The expression for
x
is evaluated:(+ a 3)
. At this point,a
is looked up in the outer environment, but the new binding forx
is not available (it hasn’t been established yet). - The expression for
y
is evaluated:(* x 2)
. Here, you might expect that the new binding forx
would be used. However, because all bindings are set up simultaneously, thex
used in(* x 2)
is thex
from the outer environment (if any), not the one being defined in thelet
. - After both expressions are evaluated, the resulting values are then bound to
x
andy
respectively in the new environment. - Now, within the body of the let, when you refer to
x
ory
, Scheme will use these newly established bindings.
If you wanted sequential dependency, just use let*. This is part of Scheme's philosophy that "you cannot define something in terms of itself." (A short sketch of the difference appears after the lambda expansion below.)
Our function could also be written like
((lambda (x y) (+ (* x x) (- y x)))
 (+ a 3)    ; argument for x
 (* x 2))   ; argument for y
; inside the Scheme compiler, the let above is compiled into this lambda call
; notice that since we convert let to lambda we can see how x extends its
; scope to the body of the function (+ (* x x) ...), but does not extend to
; (+ a 3) and (* x 2), because they are not in the lambda's scope
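Returning to the let vs. let* point above, a minimal sketch (assuming a is bound to 1 in the outer environment and there is no outer x):

(define a 1)

;; (let ((x (+ a 3))
;;       (y (* x 2)))     ; error: this x is not the x being bound above
;;   (+ x y))

(let* ((x (+ a 3))        ; x = 4
       (y (* x 2)))       ; y = 8; let* evaluates the bindings sequentially
  (+ x y))                ; => 12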
Some more special functions. We have (and E1 ... En), which evaluates each expression E1 -> En left to right, and if any of them returns false it immediately returns false. However, it differs from && in C++ because if all of them are true, it returns the value of En. Similarly, we have or, which returns the first value it succeeds on. For example, (or (getenv "PATH") "/bin:/usr/bin") gives you simple backtracking.
More on tail recursion.
If we have a function like the following,
(define (f n)
  (if (zero? n)
      37
      (f (- n 1))))          ; tail call: f is the last thing called

(lambda (...) ... (f ...))   ; the last f is in the tail call context,
                             ; because you are committed to returning what f returns

(if E1
    E2    ; tail call context
    E3)   ; tail call context

(and E1 ... En)   ; En is in the tail call context
we have tail call contexts: as long as a call is the last thing the enclosing function does, it is a tail call. Think in terms of what we are committed to returning.
Macros.
You can define your own special forms if you want. You can use the special form define-syntax
, which lets you define your own special forms. Similar to macros in
C/C++
. In fact, it is not uncommon that and and or are not primitives and are instead written using define-syntax. We could write
(define-syntax and (syntax-rules () ((and) #t) ; When there are no arguments, return true. ((and x) x) ; When there's a single argument, simply return x. ((and x y ...) (if x (and y ...) #f))) ; For more than one argument, test x; if true, recursively process the rest. ; Writing (and (< x 1) (< y 3) (p x)) ; Expands to (if (< x 1) (if (< y 3) (p x) #f) #f)
We also can define or
, but care must be taken to avoid evaluating an expression more than once.
(define-syntax or
  (syntax-rules ()
    ((or) #f)                             ; No arguments yields false.
    ((or x) x)                            ; A single argument returns x.
    ((or x y ...) (if x x (or y ...)))))  ; Otherwise, if x is true, return x; else, check the rest.
; Writing (or (getenv "PATH") "/bin:/usr/bin")
; Expands to (if (getenv "PATH") (getenv "PATH") "/bin:/usr/bin")
you can think of it as, what should the compiler substitute in as text? Also, in order to avoid multiple evaluations, we can write
(define-syntax or
  (syntax-rules ()
    ((or) #f)
    ((or x) x)
    ((or x y ...) (let ((xc x)) (if xc xc (or y ...))))))
; Recall that if works like (if test consequent alternative)
to avoid multiple evaluations, the macro uses a let
to bind the value of the first expression to a temporary variable xc
.
What happens when we call the or macro with a predefined variable called xc? In our case
(let ((xc 93))
  (or (getenv "0") xc))
; Naively this would transform into:
(let ((xc 93))
  (let ((xc (getenv "0")))
    (if xc xc xc)))          ; the user's xc would be captured!
; The compiler actually differentiates between the two xc variables
; and labels them differently, resulting in something like:
(let ((xc 93))
  (let ((xc1 (getenv "0")))
    (if xc1 xc1 xc)))        ; the final xc still refers to 93
1.8.7. Bonus Scheme
Tail-recursive Fib
(define (fib n) (define (fib-helper a b n) (if (= n 0) a (fib-helper b (+ a b) (- n 1)))) (fib-helper 0 1 n))
fib-helper
carries accumulators (a
andb
) representing successive Fibonacci numbers.- When
n
reaches 0,a
is the result. - Runs in linear time and uses tail recursion.
We now turn to some list processing code.
Sum of a list:
(define (sum lst) (if (null? lst) 0 (+ (car lst) (sum (cdr lst)))))
- If the list is empty (null? lst), return 0.
car lst
) to the sum of the rest (cdr lst
)
Another useful function is map
:
(define (square x) (* x x)) (define (map-square lst) (map square lst)) (map-square '(1 2 3 4 5)) ; Returns (1 4 9 16 25)
square
simply squares the argument.map
appliessquare
to each element of the list.
Also useful is filter
:
(define (even? x) (= (modulo x 2) 0)) (define (filter-even lst) (filter even? lst)) (filter-even '(1 2 3 4 5 6)) ; Returns (2 4 6)
even?
checks whether a number is evenfilter
returns a list of elements that satisfy the predicateeven?
Also neat are lambdas
. A few examples - first, defining a lambda
(lambda (x) (+ x 1))
Using lambda in higher-order functions
(map (lambda (x) (* x x)) '(1 2 3 4 5)) ; Returns (1 4 9 16 25)
Assigning a lambda to a variable
(define add-one (lambda (x) (+ x 1))) (add-one 5) ; Returns 6
Using lambda in filtering
(filter (lambda (x) (> x 3)) '(1 2 3 4 5)) ; Returns (4 5)
Using lambda in folding (reduction)
(foldl (lambda (x y) (+ x y)) 0 '(1 2 3 4 5)) ; Returns 15
Hygienic macros ensure that variables defined within a macro do not accidentally capture or clash with variables in the surrounding code. They preserve lexical scope and prevent unintended side effects.
- Example: swapping two values
(define-syntax swap! (syntax-rules () ((swap! x y) (let ((temp x)) (set! x y) (set! y temp)))))
swap!
macro takes two variables and swaps their values- The macro is hygienic: teh temporary variable
temp
is local to the macro and cannot interfere with any variable outside
Another example is a when
macro:
(define-syntax when (syntax-rules () ((when condition body ...) (if condition (begin body ...)))))
- The
when
macro executes a sequence of expressions only if the condition is true. - The use of
begin
groups the multiple expressions together.
Another example would be a looping macro (while
):
(define-syntax while (syntax-rules () ((while test body ...) (let loop () (when test body ... (loop))))))
- The
while
macro repeatedly evaluates thebody
as long as thetest
condition holds true. - The internal
loop
function is recursively called; the macro is hygienic, so the identifierloop
does not conflict with any other variable.
Hygienic macros prevent accidental variable capture, making code more maintainable and predictable.
1.8.8. Primitives vs. Library
Primitives are core language constructs built into Scheme that cannot be defined in terms of simpler language constructs. For example, if
and lambda
. The "bare bones" of the language.
On the other hand, library functions/macros can be defined in terms of primitives.
- The Scheme standard clearly categorizes what is considered a primitive and what belongs to the library.
- vs. languages like C++ or Java, in which the distinction between core features and libraries is often blurred.
Scheme maintains a very small set of primitives.
- Required features: Every Scheme implementation must support a set of core features (e.g., integer arithmetic, basic I/O).
- Optional features: Features like multiple-precision floating-point arithmetic (i.e., numbers wider than 64 bits) are optional.
- Extensions: Implementation-specific features that are not standardized.
- Implication for portability:
- Programs that stick to required features are highly portable.
- Relying on optional features or extensions can reduce portability to only a few implementations (e.g. only running on Racket)
1.8.9. Categorization of errors in Scheme
Scheme standards categorize errors and "bugs" to help programmers write portable, robust code.
- Implementation restrictions
- Definition: These occur when a valid Scheme program fails to run on a particular implementation due to resource limits (e.g., available memory).
- Example: Creating extremely long lists might exhaust virtual memory on some implementations.
- Unspecified behavior
- Definition: The Scheme standard does not dictate how certain operations behave, leaving room for variation between implementations.
- Example: equality predicates - Scheme offers several predicates for equality, each with its own purpose (a short comparison sketch follows this list):
  - eq?: typically checks whether two expressions reference the same object (pointer or identity comparison). \(O(1)\) cost but might not compare contents.
  - eqv?: like eq?, but also compares numbers and characters by value.
  - equal?: does a deep, recursive comparison of data structures (lists, trees). Pitfall: with cyclic structures, can lead to infinite loops.
- Implications:
- Programs should be robust regardless of the specific behavior of these operations.
- Relying on a particular behavior (e.g., expecting eq? to always return true for numbers) can harm portability.
- Undefined behavior
- Definition: when a program violates the Scheme standard, the behavior is undefined; the implementation can do anything
- Example: taking
car
of something that is not a pair. - Comparison: similar to undefined behavior in C/C++ (e.g. array out-of-bounds, nullptr dereferencing)
- Error signaling
- Some operations (opening a file) must signal an error if they fail
- Can display error message, throw exception, exit
- Implementation differences: some Scheme systems may signal errors even for "cheap" operations (e.g. taking car of a non-pair)
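The comparison sketch promised above (results marked "unspecified" legitimately differ between implementations):

(define a (list 1 2 3))
(define b (list 1 2 3))

(eq? a a)        ; => #t  (same object)
(eq? a b)        ; => #f  (different objects with equal contents)
(equal? a b)     ; => #t  (deep structural comparison)
(eqv? 2 2)       ; => #t  (same exact number)
(eq? 2.0 2.0)    ; unspecified: may be #t or #f depending on the implementation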
1.8.10. Continuations
Continuations capture the “rest of the computation” and are a distinguishing feature of Scheme. At a high level, a continuation is a representation of “what to do next” in the program. It can be thought of as the program’s “bucket list” of pending operations.
At the hardware, we can think of it as the following:
- We have two essential pointers:
- Instruction pointer (IP):
- Points to the next machine instruction or bytecode to execute.
- On hardware (e.g., x86-64), this is represented by the RIP register.
- Environment pointer (EP):
- Points to the current activation record (stack frame) containing local variables, temporaries, and the return address.
- Often implemented using the base pointer register (e.g., RBP) or sometimes the stack pointer (RSP).
- Instruction pointer (IP):
- Stack frame diagram is the following – imagine a call stack in a C program:
┌─────────────────────────────┐ │ Main Function Frame │ <-- Base (no caller) │ (global variables, etc.) │ └─────────────────────────────┘ ↑ Return Address stored here ┌─────────────────────────────┐ │ Caller’s Frame │ <-- Contains saved RBP, local variables │ (Return address, EP, etc.) │ └─────────────────────────────┘ ↑ Return Address stored here ┌─────────────────────────────┐ │ Current Function’s Frame │ <-- Contains local variables, etc. │ (Local variables, IP, EP) │ └─────────────────────────────┘
- IP points into machine code within the current frame.
- EP points to caller's frame, providing context for when the current function returns.
Continuation as data:
- Continuation representation:
- Abstractly, a continuation can be seen as a pair
(IP, EP)
- This pair represents the plan for what to execute next.
- Abstractly, a continuation can be seen as a pair
- Scheme's uniqueness:
- Unlike many languages, Scheme allows you to capture this continuation (using
call/cc
) and manipulate it as a first-class object.
- Unlike many languages, Scheme allows you to capture this continuation (using
As a diagram:
Current Continuation: ┌─────────────────────────┐ │ (IP, EP) of Current │ │ Function’s Frame │ ├─────────────────────────┤ │ (IP, EP) of Caller’s │ │ Frame │ ├─────────────────────────┤ │ (IP, EP) of Main Frame │ └─────────────────────────┘
- When a function returns, its frame is popped off, and the continuation resumes from the saved
(IP, EP)
of the caller's frame.
Importance of continuations
- Control flow manipulations: Continuations allow the implementation of non-standard control structures such as early exits, coroutines, and backtracking.
- Few languages expose continuations as directly as Scheme. This is often cited as the “essence of Scheme” (and sometimes as its “biggest mistake” because of the complexity it introduces).
Example: using call/cc
(call/cc (lambda (cont) (display "Before continuation") (cont "Jumped out!") ; Immediately returns this value and abandons the rest. (display "This will never be printed")))
call/cc
(call with current continuation) passes the current continuation (the rest of the computation) as an argument (cont
) to the lambda- Calling
cont
immediately exits the lambda and returns the given value, skipping subsequent code.
Intuition and examples A continuation represents "the rest of the computation"—that is, everything your program still has to do at a given point in time. Imagine you’re reading a to-do list; your current continuation is the list of steps you plan to complete.
In programming, when a function is executing, the continuation is the “future work” (like what to do after the function returns). In many languages, this is hidden (it’s managed by the call stack), but Scheme makes it possible to capture that continuation as a first-class object.
When a function is called, the computer saves the place to return to (its return address) along with the local environment (variables, etc.). Conceptually, a continuation is like capturing this entire “stack” or the “rest of the program” so that you can later resume execution from that point.
call/cc
: call/cc stands for “call with current continuation.” It is a Scheme primitive that captures the current continuation and passes it as an argument to a function (usually a lambda). This allows you to control the flow of your program in ways that are impossible in most languages.
- Syntax:
(call/cc (lambda (k) ... ))
. Here,k
is the current continuation – a function representing "waht to do next." - Behavior:
call/cc
calls the lambda, passing the current continuation ask
.- Within the lambda, you can
- Call
k
with a value. If you do the entirecall/cc
expression immediately returns that value, skipping any code after the call. - Not call
k
at all, in which case the lambda’s normal result is returned.
- Call
Examples:
What if you want to exit early from a computation? With call/cc
you can "bail out" from a deeply nested function call.
(define (find-first lst predicate) (call/cc (lambda (exit) (for-each (lambda (x) (when (predicate x) (exit x))) ; call exit with x when predicate is true lst) #f))) ; if no element satisfies the predicate, return false ;; Example usage: (find-first '(1 3 5 8 9) even?) ; Returns 8 because it's the first even number.
- When an element satisfies
predicate
, we bail out (call(exit x)
, which immediately ends the function and returnsx
.)
What about non-local exits (escapes) from a nested computation?
(define (compute-sum lst) (call/cc (lambda (exit) (let loop ((lst lst) (acc 0)) (if (null? lst) acc (if (> acc 100) (exit acc) ; if accumulator > 100, exit early (loop (cdr lst) (+ acc (car lst))))))))) ;; Example usage: (compute-sum '(10 20 30 40 50)) ; May exit early if the sum exceeds 100.
We can also simulate backtracking: try-all
goes through a list trying to find a "successful" element.
(define (try-all lst) (call/cc (lambda (fail) (let loop ((lst lst)) (if (null? lst) (fail 'no-success) ; if list is exhausted, invoke fail continuation (if (successful? (car lst)) (car lst) ; return the first successful element (loop (cdr lst)))))))) ;; Assume 'successful?' is a predicate that checks if an element is “successful.”
Mechanics of call/cc Recall that call with current continuation, definitionally:
- A primitive built into scheme that lets you capture the current continuation – i.e., "what to do next"
- It takes one argument: a procedure (function)
P
. - It creates a continuation (we call
K
) representing the rest of the computation from the point of invocation. - Then,
call/cc
calls the procdureP
, passingK
as its argument. - Finally, it returns whatever
P
returns (unless the continuationK
is later invoked)
Mechanically, how do we create a continuation?
- Internally, the continuation is represented by a small data structure containing two pointers:
- Instruction pointer (IP): points to the next instruction
- Environment pointer (EP): points to the current activation record (or stack frame), containing local variables and the return address
The continuation K
is essentially a "snapshot" of these two pointers. It is very cheap to create (comparable to a cons cell in cost), as it only stores two words.
Even more mechanically, when we use a continuation, we have
K
is a first class object (a function) that you can call with a single argumentV
- When you call
K
with valueV
, the interpreter "jumps" back to the point wherecall/cc
was invoked, and the entirecall/cc
expression then returnsV
- In low-level terms (x86-64 analogy)
- The return value register
rax
is set toV
- The environment pointer
rbp
is restored fromK
's stored EP - The instruction pointer
rip
is set toK
's stored IP, effectively performing a jump.
- The return value register
More examples Imagine you have a long list of integers that you wish to multiply together. However, if any integer is 0, you want to immediately return 0 without performing all the multiplications. Continuations let you do this non-local exit efficiently.
(define (prod lst) ;; Use call/cc to capture the continuation for an early exit. (call/cc (lambda (break) ;; Define a helper using named let for recursion. (let loop ((lst lst) (acc 1)) (cond ((null? lst) acc) ; If list is empty, return accumulator. ((zero? (car lst)) (break 0)) ; If head is 0, immediately exit with 0. (else (loop (cdr lst) (* acc (car lst))))))))) ;; Example usage: (prod '(2 3 4)) ; Returns 24. (prod '(2 3 0 4 5)) ; Returns 0 immediately, without multiplying 4 and 5.
Another function – Imagine a function that searches through a tree. If a certain condition is met, you want to immediately exit and return a result.
(define (find-in-tree tree predicate) (call/cc (lambda (exit) (define (search node) (if (predicate node) (exit node) ; immediately exit if predicate is satisfied (if (pair? node) (or (search (car node)) (search (cdr node))) #f))) (search tree)))) ;; Example usage: (find-in-tree '((1 2) (3 (4 5))) even?) ; Returns first even number encountered.
We can also simulate a simple try-catch
(define (safe-divide a b) (call/cc (lambda (handle-error) (if (zero? b) (handle-error 'division-by-zero) (/ a b))))) ;; Example usage: (safe-divide 10 2) ; Returns 5. (safe-divide 10 0) ; Returns 'division-by-zero.
Example: Green threads We have mainly been discussing using continuations to perform non-local exits (try-catch). Now we extend this idea: continuations in Scheme can do more than just "jump out" of a function – they can also let you jump back into a function.
We will use continuations to implement green threads – lightweight, user-level threads that cooperate on a single CPU.
- They provide concurrency without the complications of true multi-threading (no parallel execution, simpler synchronization)
- Since only one thread runs at a time, many race conditions are avoided
- No need for OS-level threads!
We can build green threads in Scheme in the following way
;; Global variable: a list to hold all green threads (thunks) waiting to run. (define green-thread-list '()) ;; Thread Constructor: gt-cons ;; Purpose: ;; - Create a new green thread from a thunk (a no-argument function) ;; - Append the new thread to the global green-thread-list. (define (gt-cons thunk) ;; Append the new thread (as a singleton list) to the end of green-thread-list. (set! green-thread-list (append green-thread-list (list thunk)))) ;; Scheduler Function: start ;; Purpose: ;; - Remove the first thread from green-thread-list and execute it. ;; - This function acts as the scheduler to transfer control to the next thread. (define (start) (if (null? green-thread-list) (error "No green threads to run!") (let ((next-thread (car green-thread-list))) ;; Remove the thread from the head of the list. (set! green-thread-list (cdr green-thread-list)) ;; Execute the next thread (thunk). (next-thread)))) ;; Yield Function: yield ;; Purpose: ;; - Allow a running thread to suspend its execution, ;; capture its current continuation, and add itself to the end of ;; the green-thread-list. ;; - Then, immediately transfer control to the next thread. ;; ;; How it works: ;; - call/cc is used to capture the current continuation as a function k. ;; - k represents the remainder of the current thread's computation. ;; - We append k to green-thread-list and then call start to schedule the next thread. (define (yield) (call/cc (lambda (k) ;; Append the captured continuation k (as a thunk) to the end of green-thread-list. (set! green-thread-list (append green-thread-list (list k))) ;; Transfer control to the next thread. (start)))) (define (print-h) (let loop () (display "H") ;; Yield control after printing H. (yield) ;; Resume here after yield and loop forever. (loop))) (define (print-i) (let loop () (display "I") (newline) ;; Yield control after printing I and a newline. (yield) ;; Resume here after yield and loop forever. (loop))) ;; Main: Create Threads and Start the Scheduler ;; Clear the global thread list (set! green-thread-list '()) ;; Add our green threads to the thread list. (gt-cons print-h) ;; Add the thread that prints "H". (gt-cons print-i) ;; Add the thread that prints "I" followed by a newline. ;; Start the scheduler to begin green thread execution. (start)
The issue with this code is that the threads run forever: yield only cooperatively hands control to the next thread, the way a true yield would in a real multithreaded application, but nothing ever stops the loops.
Can we do things without using call/cc? Yes:
Continuation Passing Style (CPS) Programming style in which every function takes an extra argument (a continuation) that represents "what to do next" after that function finishes. Instead of returning a value directly, the function passes its result to the continuation function.
- Makes control flow explicit
- Allows you to simulate non-local exits, backtracking, etc. without relying on built-in primitives like call/cc
- Can be implemented in any language with first-class functions (e.g. Scheme, Python, OCaml)
The key idea is that every function is rewritten so that it never "returns" in the usual sense. Instead, when a result is computed, it is passed to the continuation (a function) that was provided as an extra argument.
Imagine you’re following a recipe. Instead of finishing a step and then “returning” to continue, someone (a friend) stands by with a note saying, “When you finish this step, give me the result so I can tell you what to do next.” In CPS, that note is the continuation. For example, the normal function would be
(define (add x y) (+ x y))
When we call (add 2 3), it returns 5. Now the CPS version looks like
(define (add-cps x y cont) (cont (+ x y)))
Instead of returning a value, it passes 5 to the continuation cont. For example, if cont is a function that prints its argument, (add-cps 2 3 display) would print 5.
We can now re-write the prod function using CPS. The traditional function looks like
(define (prod lst)
  (if (null? lst)
      1
      (if (zero? (car lst))
          0
          (* (car lst) (prod (cdr lst))))))
In the CPS version, we add an extra parameter k (the continuation). Instead of returning a value, the function passes its result to k. We have the following changes:
- Base case: If the list is empty, then the product is 1. Instead of returning 1, we call the continuation k with 1.
- Early exit case: If the head of the list is 0, we immediately pass 0 to the continuation.
- Recursive case: Otherwise, we recursively compute the product of the tail of the list. However, we must modify the continuation to multiply the result by the head.
(define (prod-cps lst k)
  (if (null? lst)
      ;; Base case: if list is empty, product is 1. Pass 1 to continuation.
      (k 1)
      (if (zero? (car lst))
          ;; Early exit: if head is 0, product is 0. Immediately pass 0 to k.
          (k 0)
          ;; Recursive case:
          ;; Compute the product of the tail, but update the continuation so that
          ;; once the tail's product is computed, multiply it by (car lst).
          (prod-cps (cdr lst)
                    (lambda (tail-prod)
                      ;; Multiply the head by the product of the tail, and pass the result to k.
                      (k (* (car lst) tail-prod)))))))

;; A helper to call our CPS function with the identity continuation.
(define (prod lst)
  (prod-cps lst (lambda (x) x)))

;; Example usage:
;; (prod '(2 3 4)) returns 24.
;; (prod '(2 3 0 4 5)) returns 0 immediately when 0 is encountered.
(display (prod '(2 3 4)))     ; Should display 24.
(newline)
(display (prod '(2 3 0 4 5))) ; Should display 0.
(newline)
1.8.11. Storage Management (from Albert's Notes)
The storage hierarchy starts with small/fast and ends slow/big: registers -> L1/2/3 cache -> RAM -> flash -> disk. On a higher level, the contents of variables get put into the registers or into RAM: the compiler (or us, potentially) decide which one; the system decides whether to cache something or not. On the other hand, we typically interface with persistent storage through I/O commands, i.e. explicit actions.
We also need to store machine code somewhere (for our program, for library code, and for the code running an interpreter if we’re using an interpreted language). We need a stack, to hold the instruction pointer and return addresses, as well as separate stacks for separate threads. Finally, we need somewhere to store an I/O buffer as well as memory to contain temporary values.
On a higher level, we first need static storage. The simplest such approach is to allocate everything statically in RAM (a la 1950s FORTRAN). This is simple and lets you optimize well, but is less secure and doesn't let you use object-oriented/recursive patterns.
We also need a stack to better support calling/returning from functions. The stack is simple, is fast to allocate/free, but has obvious limitations in terms of what order we can free storage.
An extension to the stack is the idea of an “activation record”, i.e. the storage allocated when you enter a function/freed when you return from it. In C, for instance, the size of this activation record is known per function and thus can be statically allocated. In C99, on the other hand, we can dynamically have different activation record sizes for the same function, making stack allocations more complex. This is more flexible but more complex, leads to security issues, and can lead to stack overflows (which typically are not checked for performance reasons in lower-level languages).
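As a minimal sketch of the difference (not from the original notes): a fixed-size local array gives a statically known activation-record size, while a C99 variable-length array makes the frame size depend on a runtime value.

#+begin_src c
#include <stdio.h>

/* Fixed-size frame: the compiler knows at compile time how many
   bytes this call needs on the stack. */
int sum_fixed(void) {
    int a[100];                /* always 100 ints */
    for (int i = 0; i < 100; i++) a[i] = i;
    return a[99];
}

/* C99 variable-length array: the frame size now depends on n,
   so the stack allocation is computed at run time. */
int sum_vla(int n) {
    int a[n];                  /* size chosen when the function is called */
    for (int i = 0; i < n; i++) a[i] = i;
    return a[n - 1];
}

int main(void) {
    printf("%d %d\n", sum_fixed(), sum_vla(10));
    return 0;
}
#+end_src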
A parallel idea is the idea of nested functions where internal functions can refer to the variables of the function they were declared in. This is difficult to implement. For instance, let’s say a function’s activation record contains one slot per local variable, one slot per temporary, and one slot for the return address. It then becomes difficult to figure out how to access the memory of the nonlocal variable using static analysis techniques: instead of just maintaining a dynamic chain of pointers referring to callers, we also need to maintain a chain of pointers referring to the definers of each function, called the static chain. We then need to tell a function, when it’s called, where its definer’s stack frame is.
This idea of having some machine code be linked to the context in which it is called is known as a closure. This is very important in high level programming languages.
Finally, we need a heap. This removes the restriction of LIFO allocation but is generally more expensive/complicated to manage than the stack. For instance, we need to manage a free list to keep track of whether we still have extra heap space available or not. However, we need to deal with issues like making sure we have big blocks of memory available (fragmentation), coalescing free chunks, and even potentially mechanisms like quick lists for commonly-requested sizes.
Note that if we forget to free/free memory too early, this can lead to significant performance/ reliability issues. The standard fix for these issues is to use a garbage collector, which frees memory that is no longer accessible by the program. A standard garbage collection algorithm is the mark and sweep algorithm: all reachable objects are marked recursively; all unmarked objects are freed. The compiler thus needs to tell the garbage collector where the program variables are, which it can do because it allocated storage for them.
1.9. Storage Management
A. Why Memory Management Matters
- Purpose:
Memory management deals with storing all the data a program uses:
- Variables: Local, global, and temporary.
- Return addresses and environment pointers: Essential for tracking function calls (the call stack).
- Program instructions and constants: Stored in memory (the text segment).
- Challenge: Modern programs have many more variables than available CPU registers. Large objects (like arrays) cannot reside solely in registers and must be stored in RAM.
B. Memory Layout Overview
A typical program’s memory is partitioned into several regions:
- Text Segment (Code): Contains machine instructions and read‑only constants. Advantages: Can be marked read‑only to prevent accidental writes.
- Initialized Data Segment (Data): Contains global variables that are pre‑initialized. Loaded from the executable; its initial values are fixed.
- BSS Segment (Block Started by Symbol): Contains global and static variables that are zero‑initialized. Not stored in the executable; simply zero‑filled at runtime.
- Heap: Used for dynamic memory allocation (via functions like malloc in C). Can grow and shrink during program execution. Flexible: You can allocate objects whose size isn’t known at compile time. Downsides: Slower to allocate/free and may require a memory manager.
- Stack: Used for function call management (activation records). Stores local variables, return addresses, and saved environment pointers. Fast allocation/deallocation: (by adjusting the stack pointer). LIFO order: Limits freeing out‑of‑order objects.
+----------------------+
|     Text Segment     |  ← Program instructions, read‑only data
+----------------------+
|   Initialized Data   |  ← Global variables (pre‑set)
+----------------------+
|         BSS          |  ← Zero‑initialized globals/statics
+----------------------+
|         Heap         |  ← Dynamic memory (grows/shrinks)
+----------------------+
|        Stack         |  ← Activation records (function calls)
+----------------------+
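A small C sketch of where typical objects end up (illustrative only; actual placement is decided by the compiler, linker, and loader):

#+begin_src c
#include <stdlib.h>

int initialized_global = 42;   /* initialized data segment */
int zeroed_global;             /* BSS: zero-filled at program start */
const char *msg = "hello";     /* the pointer is initialized data; the characters
                                  of "hello" sit in read-only storage */

int main(void) {
    int local = 7;                                 /* stack: part of main's activation record */
    int *dynamic = malloc(10 * sizeof *dynamic);   /* heap: allocated at run time */
    (void)local;
    free(dynamic);
    return 0;
}
#+end_src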
C. Static vs. Dynamic Memory Allocation
- Static Memory Allocation (e.g., Fortran 1958)
- Description: All memory is allocated at compile time.
- Advantages:
- Simplicity: No runtime allocation or deallocation.
- No memory exhaustion: The program has a fixed amount of memory; you won’t run into “out‑of‑memory” errors.
- Predictability: The size and location of every variable are known at compile time.
- Disadvantages:
- Inflexibility: If you underestimate the needed size (e.g., array size), you must recompile.
- Limited recursion: Fixed‑size activation records make recursion difficult or impossible.
- Wasteful: May reserve more memory than necessary.
- Dynamic Memory Allocation (e.g., C from 1975 Onward)
- Description: Memory is allocated at runtime.
- Components:
- Stack Allocation:
- Used for function calls (activation records).
- Fast allocation/deallocation via a stack pointer.
- Limitation: LIFO structure; you cannot deallocate out‑of‑order.
- Heap Allocation:
- Used for objects created during runtime (via malloc/free in C).
- More flexible than the stack.
- Slower: Memory management involves more complex operations.
- Advantages:
- Flexibility: You can allocate variable‑sized objects at runtime (e.g., dynamic arrays).
- Recursion is enabled: Stack frames are allocated as functions are called.
- Disadvantages:
- Overhead: Managing the heap is more complex and slower than stack allocation.
- Fragmentation: Dynamic allocation can lead to memory fragmentation.
- Potential runtime errors: Memory exhaustion, if too much is allocated.
- Historical Example: Algol 60 vs. Traditional C
- Algol 60:
- Allowed local arrays to have sizes determined at runtime.
- The activation record (stack frame) size could vary.
- Traditional C:
- Local arrays must have sizes determined by compile‑time constant expressions.
- Activation records are fixed‑size, allowing the compiler to generate efficient code.
- (Note: Modern C (C99) introduced variable‑length arrays, but C++ still requires fixed‑size local arrays.)
1.9.1. Memory management
Beyond activation records (or frames) for the currently executing procedure, a language must also manage other objects such as I/O buffers and dynamically allocated objects (the heap).
- Activation records hold all the necessary information for a function's execution, including local variables, return addresses and saved environment pointers.
- Typically, these records are stored on the stack (LIFO), which is efficient for allocation and deallocation.
- However, when a function returns but its inner function (e.g. a curried function) is still needed, the activation record must survive even after the original function has returned. In such cases, the activation record is promoted to the heap.
In Scheme, recall that functions are not just pointers to code, they are "fat" function objects that consist of two components:
- An instruction pointer (IP), which indicates where the machine code for the function resides.
- An environment pointer (EP), which points to the activation record of the function that defined the current function.
- When you curry a function (i.e. partially apply arguments), the resulting function object carries with it the necessary environment so that it can later refer to variables from the defining context.
- This means that a curried function is effectively a pair of pointers (IP and EP). The EP must remain accessible even after the outer function has completed, so it is allocated on the heap rather than on the stack.
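A hedged C sketch of this "fat function object" idea (not how any particular Scheme implements it, just the shape): a closure is a code pointer (IP) paired with a pointer to the defining environment (EP), and the environment is heap-allocated so it outlives the defining call.

#+begin_src c
#include <stdio.h>
#include <stdlib.h>

/* The captured environment: here just one variable from the definer. */
struct env { int x; };

/* A "fat" function object: code pointer (IP) + environment pointer (EP). */
struct closure {
    int (*code)(struct env *env, int arg);  /* IP */
    struct env *env;                        /* EP */
};

static int add_code(struct env *env, int y) { return env->x + y; }

/* Roughly like (define (adder x) (lambda (y) (+ x y))): the environment is
   heap-allocated so it survives after make_adder's activation record is gone. */
static struct closure make_adder(int x) {
    struct env *e = malloc(sizeof *e);
    e->x = x;
    return (struct closure){ add_code, e };
}

int main(void) {
    struct closure add3 = make_adder(3);
    printf("%d\n", add3.code(add3.env, 4));  /* prints 7 */
    free(add3.env);
    return 0;
}
#+end_src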
1.9.2. Dynamic vs. Static Chain
- The dynamic chain records the sequence of calls that led to the current point of execution (i.e. the call stack). It is useful for generating backtraces and understanding runtime context.
- "Who called us"; the only thing we can do with the dynamic chain is return and let the caller figure out what to do.
- The static chain reflects the lexical structure of the program. When a function needs to access a non-local variable, it follows the static chain to locate the variable's binding in the environment of the function that defined it.
- "Who defined us"; can figure out the static chain at compile-time. Maximum nesting level in the original source code for Scheme is the max length of the static chain.
- When a function follows its static chain, it goes from a function to its definer. It knows statically, at compile time, what the relevant activation record looks like, knows where the local variables are and can go look them up.
Sometimes the two will be the same, i.e. if the function that calls us is also the function that defines us. With currying and higher-order functions, the static chain is often shorter because it represents only the nesting structure of definitions, not the entire runtime call history.
- If some function F returns, but its activation record is still around (i.e. still reachable through an EP), that record need not - and cannot - be stacked. So in Scheme, in general, your activation records are not on the stack (not LIFO). We have to keep F's record around so long as anybody has access to the function F returns. If we're not returning the function, the compiler can optimize it into a stack.
- This makes it possible to design languages without new or malloc(). Instead, the way you allocate an object is similar to the above: you call F, F returns a function, that function ties down the activation record, and that ties down your object.
1.9.3. Heap Management
- The heap is used for dynamic memory allocation. Unlike the stack, objects on the heap can be freed in any order.
- In a managed language like Scheme, the heap manager (or garbage collector) must track which objects are still reachable (the “roots”) and reclaim memory for those that are not.
- The heap manager's goal is to keep track of all the stuff that's in the heap, and make sure that if you can access an object in the heap, we keep it around, and if you cannot access that object, then we can reuse that piece of storage.
Undefined vs. unspecified behavior:
- Unspecified Behavior: The language standard does not specify the exact outcome (e.g., the exact result of pointer comparisons for small integers).
- Undefined Behavior: When the program violates the language rules (such as using a dangling pointer), the behavior is completely unpredictable; the program may crash, produce incorrect results, or seem to work on one run and not another.
Trade-offs:
- Static allocation (as in early Fortran) is simple and efficient but inflexible.
- Dynamic allocation (as in C) offers flexibility and supports recursion and variable‑sized objects but introduces overhead, potential fragmentation, and runtime errors.
- Modern languages like Scheme use garbage collection to automatically manage heap memory, reducing the risk of errors like dangling pointers but at the cost of some performance overhead.
Roots
- Roots are entry points from which the program can access objects in the heap (for example, variables in registers, on the stack, or global variables).
- The garbage collector uses these roots to trace which objects are accessible; any object not reachable from any root is eligible for reclamation.
How can we manage roots?
- We can explicitly inform the heap manager
- The program might call specific routines to push or pop roots when objects are created or destroyed.
- However, this is error prone and unpopular.
- Compiler-recorded roots
- When compiling, the compiler can record the locations of global variables and local variables in activation records as roots, storing this information in read-only tables that the heap manager consults.
- Manual memory management
- In languages like C, you have explicit allocation (malloc, new) and explicit deallocation (via free and delete)
- The heap manager does not automatically know which objects are still needed and instead relies on the programmer.
- This can lead to issues, e.g. dangling pointers if an object is freed while still in use.
- All these options assume we have a managed heap: i.e. you need a garbage collector, which deletes the objects. It deduces, from the roots and object layout, which objects are unreachable from the program (and thus frees them).
Beyond keeping track of roots, how can we keep track of free space (while being fast)?
- Keep a linked list, where each link points to another, and the payload is a pointer to the free area and the size of the free area
+-----------+      +-----------+      +-----------+
| Free Node | ---> | Free Node | ---> | Free Node | ---> NULL
+-----------+      +-----------+      +-----------+
      |                  |                  |
      V                  V                  V
[ Pointer, Size ]  [ Pointer, Size ]  [ Pointer, Size ]
- Each node represents a free memory block (e.g. 100 bytes, 50 bytes, 300 bytes)
Problem: keeping the free list in a separate memory area (a "meta" heap) – but this leads to a recursive problem, as you'd then need a free list for the meta heap and so on.
Instead, store the free list information within the free blocks themselves. Reserve the first few words (e.g. 3 words) of each free block to hold the linked-list pointers and size information. You avoid extra memory management overhead since the metadata uses the free space itself.
Heap Block (Free):
+---------------------+
| Metadata: [ptr, sz] |  <-- words reserved for free list info
+---------------------+
|    Unused (free)    |
+---------------------+
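A hedged C sketch of the same idea (field names are made up for illustration): the first words of each free block are reinterpreted as the list node, so the free list costs no extra storage.

#+begin_src c
#include <stddef.h>

/* Metadata stored at the start of every *free* block.
   The rest of the block is simply unused space. */
struct free_block {
    size_t size;              /* size of this free block in bytes */
    struct free_block *next;  /* next free block in the free list */
};

/* Head of the free list; initially empty. */
static struct free_block *free_list = NULL;

/* Put a block back on the free list by writing the metadata
   into the block itself (no separate "meta heap" required). */
static void add_free_block(void *block, size_t size) {
    struct free_block *fb = block;
    fb->size = size;
    fb->next = free_list;
    free_list = fb;
}
#+end_src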
However, we still have an efficiency problem. We want \(O(1)\) malloc(). In the ideal case, if we request an allocation (30 bytes) and the first free block is large enough (100 bytes), you just "peel off" the requested bytes:
- Subtract the allocated size from the free block
- Adjust the pointer to mark the new start of the free block.
However, if you keep allocating and deallocating, free blocks can become very small. Eventually, even if the total free memory is large, you might not find a single contiguous block big enough for a new allocation request. This is known as external fragmentation. Instead, we could try:
- Best fit (search for free block that most closely matches requested size): \(O(n)\)
- Roving pointer (use a circular free list that remembers the last block used): may keep hitting a large block repeatedly, leaving smaller blocks unused.
What about free? Freeing should also be fast (ideally \(O(1)\)), but when an object is freed, you must:
- Determine if the freed block is adjacent to an existing free block.
- Coalesce (merge) adjacent free blocks to prevent fragmentation.
With a naive approach, checking adjacency might require scanning the free list, which has \(O(n)\) cost.
To speed up the freeing process, use extra words: when allocating, reserve extra space before and/or after the object for metadata (e.g., size and pointers to neighbors). This extra information speeds up the coalescing process because you can directly check if neighboring blocks are free.
Allocated Block Layout:
+-------------------+-------------------+-------------------+
| Metadata (header) |     User Data     | Metadata (footer) |
+-------------------+-------------------+-------------------+
With this structure, when freeing, you can:
- check the footer of the previous block
- check the header of the next block
If either is free, merge the blocks quickly.
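A simplified C sketch of why these boundary tags make the neighbor check \(O(1)\) (the exact layout and field names are assumptions for illustration): the previous block's footer sits just before our header, and the next block's header sits just after our footer.

#+begin_src c
#include <stdbool.h>
#include <stddef.h>

/* One word of metadata placed both before (header) and after (footer)
   the user data of every block. */
struct tag {
    size_t size;   /* total size of the block, including both tags */
    bool   free;   /* is this block currently free?                */
};

/* The previous block's footer is the tag immediately before our header. */
static bool prev_is_free(struct tag *header) {
    struct tag *prev_footer = header - 1;
    return prev_footer->free;
}

/* The next block's header is found by skipping our whole block. */
static bool next_is_free(struct tag *header) {
    struct tag *next_header =
        (struct tag *)((char *)header + header->size);
    return next_header->free;
}
#+end_src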
Fragmentation issues:
- External fragmentation: Occurs when free space is scattered in many small blocks, so a large allocation request might fail even if the total free memory is sufficient. Adjust algorithms to avoid this.
- Internal fragmentation: Happens when the allocated block is larger than requested (due to alignment or metadata overhead), leaving some unusable space. i.e. you request 49 bytes but allocator gives you 50 bytes (or even extra for metadata), the extra byte(s) are wasted. Less of a problem than external fragmentation.
1.9.4. Garbage collection
The basic GC algorithm, used in languages like Java which don't have free, is mark and sweep. It consists of
- Mark phase:
- Start with roots: the garbage collector knows where the “roots” are (global variables, stack variables, etc.) because the compiler or runtime provides this information.
- Recursively traverse: starting from the roots, recursively mark every object that is reachable.
- Sweep phase:
- Scan heap: go through every object in the heap
- Free unmarked objects: if an object is not marked, add it back to the free list
- Clear the mark bits, ready for the next mark-and-sweep cycle.
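A hedged C sketch of the two phases (object layout, the set of roots, and how children are enumerated are assumed to come from the compiler/runtime, as described above):

#+begin_src c
#include <stdbool.h>
#include <stddef.h>

struct object {
    bool marked;
    size_t num_children;
    struct object **children;  /* pointers this object holds            */
    struct object *next;       /* links every allocated object together */
};

/* Mark phase: starting from a root, recursively mark everything reachable. */
static void mark(struct object *obj) {
    if (obj == NULL || obj->marked)
        return;
    obj->marked = true;
    for (size_t i = 0; i < obj->num_children; i++)
        mark(obj->children[i]);
}

/* Sweep phase: walk every object in the heap; free the unmarked ones
   and clear the mark bit on survivors for the next collection. */
static void sweep(struct object **all_objects,
                  void (*release)(struct object *)) {
    struct object **link = all_objects;
    while (*link != NULL) {
        struct object *obj = *link;
        if (!obj->marked) {
            *link = obj->next;    /* unlink the dead object        */
            release(obj);         /* return it to the free list    */
        } else {
            obj->marked = false;  /* reset for the next cycle      */
            link = &obj->next;
        }
    }
}
#+end_src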
Problems:
- The mark phase is proportional to the number of reachable objects. In large applications (with lots of objects), this can be very time-consuming
- Long pauses during GC can freeze applications (e.g. a robot running Java might "freeze" during mark-and-sweep – other computation can't be happening)
We can solve the second issue with real-time garbage collection: to mitigate long pauses, real-time garbage collectors perform a small, bounded amount of work (e.g., a fixed number of instructions) on each allocation. This spreads out the cost of garbage collection. This approach makes allocation a little slower overall compared to traditional mark and sweep, but it avoids the disruptive pauses.
Outside of mark-and-sweep, what about languages that have explicit deallocation with free or delete? Manual memory management can lead to
- Dangling pointers (using freed memory), or
- Memory leaks (forgetting to free memory); with a GC, these get taken care of the next time it marks and sweeps
Conservative garbage collection
- No explicit free: redefine free or simply ignore it, so that memory is never explicitly freed by the programmer
- Automatic GC: the compiler doesn't tell us where the roots are, and we don't know the layout of the objects in memory. Instead, a garbage collector periodically scans the stack, registers, and static/global data to find anything that looks like a pointer into the heap.
- If a bit pattern on the stack or in a register resembles a heap address, the conservative GC assumes it is a valid pointer and marks that object as in use.
Pros:
- Prevents dangling pointers, since nothing is explicitly freed
- Simplifies memory management in a language not designed for GC
Cons:
- May falsely mark non-pointer values as pointers (leading to memory leaks) - but this is rare
- Relies on the assumption that valid pointers have a recognizable format (e.g. they lie within the heap's address range).
Garbage collection in Python
Historically, Python wasn't as performance critical in terms of garbage collection because the interpreter itself was relatively slow. However, Python uses a very different strategy than a full mark-and-sweep collector – we have a reference count field: every object in the heap carries an extra word (or field) that holds a reference count – the number of pointers currently referencing that object.
For reference counting, on assignment (when you assign one variable to another) two things happen:
- The reference count for the old object (if overwritten) is decremented.
- The reference count for the new object is incremented.
This means that every assignment carries additional overhead because of these counter updates. However, when an object's reference count drops to zero, it can be freed immediately. No need for a separate mark-and-sweep phase for most objects.
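A hedged C sketch of what an assignment does under reference counting (the general idea, not CPython's actual internals): increment the new target's count, decrement the old one's, and free immediately when a count reaches zero.

#+begin_src c
#include <stdlib.h>

struct ref_object {
    long refcount;
    /* ... payload ... */
};

static void incref(struct ref_object *o) {
    if (o) o->refcount++;
}

static void decref(struct ref_object *o) {
    if (o && --o->refcount == 0)
        free(o);              /* reclaimed immediately; no sweep phase needed */
}

/* Assignment "*slot = value" with reference counting:
   every assignment pays for one increment and one decrement.
   (Incrementing first also handles self-assignment safely.) */
static void assign(struct ref_object **slot, struct ref_object *value) {
    incref(value);            /* new object gains a reference           */
    decref(*slot);            /* old object loses one (possibly freed)  */
    *slot = value;
}
#+end_src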
Pros & cons:
Advantages:
- Immediate reclamation: objects are freed as soon as they are no longer needed
- Simplicity: conceptually straightforward - each object "knows" how many active references it has
Disadvantages:
- Assignment overhead: every assignment now involves updating two reference counts
- Cyclic references: cycles (e.g. A references B and B references A) will never drop to zero, leading to memory leaks
- To fix this, Python supplements reference counting with a periodic mark-and-sweep to clean up cyclic garbage
Copy behavior:
- A shallow copy creates a new object (with its own reference count starting at one) without altering the original object's reference count
- A deep copy recursively copies objects, again starting fresh with a new reference count.
Garbage collection in Java
In Java, we want to allocate new objects as quickly as creating a new stack frame in C (i.e., a simple pointer arithmetic operation).
Traditional free-list-based allocation can be slow because of fragmentation and searching issues. Instead, we adopt the nursery model.
In the nursery model, the heap is partitioned into several generations. The youngest objects are allocated in the nursery and allocations are performed by simple pointer bumping:
- Heap pointer (HP): points to next free location
- Limit pointer: marks the end of the nursery
The key idea is that in a generation-based collector, when you want to allocate new storage, you always allocate it from the nursery.
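A hedged C sketch of this bump-pointer allocation: an allocation is just a pointer advance plus one comparison against the limit; when the check fails, the runtime would collect or promote the nursery (the trigger_gc stub here is a made-up placeholder, not a real API).

#+begin_src c
#include <stddef.h>

static char  nursery[1 << 20];                      /* 1 MiB nursery          */
static char *heap_ptr = nursery;                    /* HP: next free location */
static char *limit    = nursery + sizeof nursery;   /* end of the nursery     */

/* Placeholder: a real runtime would collect or promote the nursery here. */
static void trigger_gc(void) { /* omitted */ }

/* Allocation is a pointer bump plus one comparison - about as cheap
   as pushing a stack frame. */
void *nursery_alloc(size_t size) {
    size = (size + 7) & ~(size_t)7;    /* keep allocations 8-byte aligned   */
    if (heap_ptr + size > limit) {
        trigger_gc();                  /* nursery full: collect, then retry */
        return NULL;                   /* sketch: just give up here         */
    }
    void *obj = heap_ptr;
    heap_ptr += size;
    return obj;
}
#+end_src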
When the nursery fills, we can either promote its objects to an "adolescent" generation and allocate a new nursery, or we can garbage collect the nursery at that point. This is well-suited for the case where objects always point to older objects, which is true in general and especially true for functional programming. Then, we can garbage collect just the nursery. Because most objects are short-lived, the garbage collector only needs to scan and reclaim the nursery. It uses the fact that most pointers from young objects point to older generations, so scanning can be restricted.
- Instead of marking and sweeping in place, live objects in the nursery are copied into a new space.
- The free space is consolidated, which improves cache locality and reduces fragmentation.
Java-based garbage collectors also use copying collectors. Suppose we are cleaning the nursery. A copying collector will not garbage collect the objects in place. Instead, as we find an object that's in use, we make a copy of the object down into a new area, and update the root to point to the copied object. Once we're done, we have a slab of storage that we can just "free" and give back to the operating system. A copying collector updates all pointers (roots and internal pointers) as objects are relocated. This complicates the use of finalizers (cleanup methods) because the collector never scans the dead objects, so special care (or even separate GC strategies) is needed for objects with finalizers.
Advantage 1: Cache - Since a copying collector copies objects into a contiguous region (with the free space left at the end), organizing the data so that related data lives within the same cache line improves memory access.
Advantage 2: Cost - The cost of copying is proportional only to the number of objects in use; we never look at the free areas. In mark-and-sweep, the cost is proportional to the cost of marking plus the cost of sweeping all of the free areas into the free list. So the cost here is proportional to the number of objects in use, not to the total number of objects. This is good since in Java a lot of objects are created and freed quickly.
Disadvantage: Java has a finalize method, which, when defined, executes just before the object is reclaimed / cleaned up. However, with this collector, we can't easily support this since we never look at all the objects. In order to support this, the Java developers supported two garbage collectors: a regular mark-and-sweep for objects with a finalize method, and the generational GC for all others. Eventually, finalize will go away since it is deprecated.
To deal with multithreading issues, each thread can have its own nursery to allow rapid, lock-free allocation. This avoids contention on the heap pointer shared among threads.
Private free lists in traditional mark-and-sweep
- In some systems (e.g. Lisp cons cells), a dedicated free list is maintained for common, small objects.
- Allocation: check the private free list, if non-empty, pop a cell off it.
- Freeing: instead of returning to the general heap free list, the object is pushed back onto the private free list.
This is good in C/C++ since the private free list avoids the overhead of scanning a heterogeneous free list. In Java, with a copying collector, the garbage collector might copy even the unused objects from the private free list – adding overhead and negating the speed advantage.
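A hedged C sketch of such a private free list for one common object size (the pair type and names are illustrative): allocation pops from the list, freeing pushes back onto it.

#+begin_src c
#include <stdlib.h>

struct pair {
    void *car, *cdr;
    struct pair *next_free;   /* reused as the free-list link when the cell is free */
};

static struct pair *pair_free_list = NULL;

struct pair *alloc_pair(void) {
    if (pair_free_list != NULL) {          /* fast path: pop off the private list */
        struct pair *p = pair_free_list;
        pair_free_list = p->next_free;
        return p;
    }
    return malloc(sizeof(struct pair));    /* slow path: fall back to the heap */
}

void free_pair(struct pair *p) {
    p->next_free = pair_free_list;         /* push back onto the private list */
    pair_free_list = p;
}
#+end_src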
1.9.5. Names, Identifiers, and Substitution
In both natural language and programming, there's a notion that a name should be interchangeable with what it stands for; this is the substitution principle. For example, if you say: "Sir Walter Scott is the author of Waverley." then substituting "the author of Waverley" for "Sir Walter Scott" should not change the truth of the statement. However, due to multiple meanings (or "senses") attached to a name, the substitution sometimes fails. We want this principle to hold in programming. Consider the following
int I = 27;
return I + 3;
Here, the identifier I is bound to the value 27. By the substitution principle, one might expect that replacing I with 27 everywhere would yield the same meaning. In this simple case, it does. If I is declared with additional properties (like type), then a simple substitution might change meaning. For example:
long int I = 27;
return sizeof(I);
On an x86-64 machine, sizeof(long int) might be 8, not 4 (as it would be for a 32-bit int). The identifier I here isn't just the number 27—it also carries its type information, which influences other operations (like sizeof).
An identifier in languages like C or C++ can be associated with several pieces of information:
- Value: actual runtime value (e.g., 27)
- Type: determines operations (e.g., int, long int)
- Address: where it is stored in memory
- Alignment: constraints on its memory address (alignof)
The implication is that when reasoning about a program, you cannot simply replace the identifier with its value. You must consider its type (affecting, for instance, arithmetic and sizeof operations), its address (if the program takes its address), and possibly its alignment.
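A small C illustration of this point (assuming C11's stdalign.h for alignof): several operators consult the identifier's other attributes rather than its value.

#+begin_src c
#include <stdio.h>
#include <stdalign.h>   /* C11: alignof */

int main(void) {
    long int I = 27;
    printf("value:     %ld\n", I);                   /* the runtime value        */
    printf("size:      %zu\n", sizeof I);            /* depends on the type      */
    printf("alignment: %zu\n", alignof(long int));   /* alignment constraint     */
    printf("address:   %p\n", (void *)&I);           /* where it lives in memory */
    return 0;
}
#+end_src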
What we really need is a binding: the association between a name (identifier) and some attribute(s) or "value"—though "value" here is abstract. It might be the runtime value, type, address, alignment, or other properties. For example, the statement int I = 19; establishes a binding for I with
- a value (19),
- a type (int),
- alignment,
- and later (during execution), an address in memory.
A set of bindings is like a Python dictionary or namespace: it just maps a set of names to values.
These are either determined explicitly (e.g. int i = 3) or implicitly (e.g. fun i -> i + 1, where i is inferred to be int).
Binding time - when does the binding occur?
- Runtime bindings: the value of a local variable may change during execution. Its binding time (the moment when the variable is assigned or modified) occurs at runtime.
- Compile-time bindings:
- Type binding: the type of a variable is determined at compile time. E.g., I is known to be an int (or long int) when the program is compiled
- Constants: some identifiers (like INT_MAX) are bound to a value as part of the language or platform definition long before any program execution begins. Not quite compile time, more like platform-definition-time. It was decided by the machine (e.g. x86-64 has 32/64-bit widths, etc.). Always the same, once you decide on the platform.
- Link-Time vs. Load-Time bindings:
- Static variables: a variable declared as static has a single copy that persists for the program's lifetime. Its address is typically decided at link time. However, with techniques like Address Space Layout Randomization (ASLR), the actual address might not be fixed until the executable is loaded into memory.
- Separate compilation is involved when we have too large a program to compile at once, so we compile different modules separately and link them. When that occurs, the compiler does not know all the details of the other modules, so it has to write reasonably generic code, regardless of what those bindings happen to be.
1.9.6. Names and terminology
- Declaration (Lightweight)
- A declaration tells the compiler about the name, type, and (sometimes) certain properties of an entity (such as a function or variable) without giving its full implementation or storage.
- Declarations are used in header files or at the start of a module to inform other modules about the interface.
- Definition (Heavier weight)
- A definition not only declares the name and type but also provides the actual implementation or storage.
- Definitions are found in source files where the full code is provided.
When it comes to namespaces, binding times, etc, the other modules will see the declarations, but won't see the definitions. The compiler uses declarations to check that any definitions later in the module (or in separate modules) are consistent with what was declared. For example, if a function is declared to return a double, its definition must return a double.
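A minimal C sketch of the split (file names are illustrative): the header carries only declarations that other modules see, while the .c file holds the definitions.

#+begin_src c
/* ratio.h - declarations only: this is all other modules see. */
double ratio(double num, double den);
extern int ratio_calls;          /* declares, but does not allocate storage */

/* ratio.c - definitions: storage and implementation live here. */
int ratio_calls = 0;             /* definition: allocates the variable */

double ratio(double num, double den) {   /* definition: provides the body */
    ratio_calls++;
    return num / den;
}
#+end_src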
Filtering
In many programming languages, a definition might include extra details that you do not wish to expose outside a module. The declaration is a “filtered” version containing only the interface—enough information for callers to use the function correctly.
- Abstraction: In software engineering, abstraction means exposing only what is necessary for using a component (e.g., a function or variable) without revealing the internal details.
- Why filter?
- Encapsulation: Hides internal workings so that the user of the function doesn’t depend on its internal implementation.
- Optimization: Certain implementation details (like whether a function is unsequenced) may help the compiler optimize the code, but they are not essential for a caller to know.
- Filtering in practice: The declaration might leave out attributes that appear only in the definition. For example, the keyword unsequenced (indicating that a function has no side effects and can be optimized aggressively) might be part of the definition but omitted in the declaration for clarity.
- e.g. declaration without extra attributes would look like
double diff_time(time_t start, time_t end);
- e.g. definition with additional attributes:
double diff_time(time_t start, time_t end) __attribute__((unsequenced)) { return difftime(end, start); }
- When compiling, the compiler will merge the two—using the extra information from the definition to optimize calls to
diff_time
(for example, allowing common subexpression elimination when it sees multiple calls with the same arguments).
- e.g. declaration without extra attributes would look like
- Practical implications:
- Interface stability: other modules only see the declaration and are thus isolated from changes in the definition, which is useful for modular programming.
- Compiler optimization: if the compiler knows (through a declaration or merged information) that a function is unsequenced (i.e., it has no side effects), it can simplify repeated calls: for example, it may optimize if (diff_time(a, b) == diff_time(a, b)) to 'true' because it knows the function is pure.
- Design consideration: deciding what details to expose in a declaration is an important design decision. Too much information can leak implementation details, while too little might limit the compiler's ability to optimize.
1.9.7. Namespaces
A namespace is a collection (or mapping) of names to their bindings. Think of it like a dictionary where each key is an identifier and each value is a bundle of information (value, type, address, etc.). For example, for a given block of code
int x = 10;
double y = 3.14;
the namespace (or binding set) might look like:
{
  x -> {type: int,    value: 10,   address: ...},
  y -> {type: double, value: 3.14, address: ...}
}
Block-structured namespaces and shadowing
Programming languages often use block structure where inner blocks have their own namespaces that "shadow" (override) outer namespaces. For example,
int x = 5;   // Global or outer scope

void foo() {
    int y = 10;   // Local to foo()
    {
        int x = 20;             // Shadows the outer x
        printf("%d\n", x);      // Refers to the inner x (20)
    }
    printf("%d\n", x);          // Refers to the outer x (5)
}
the nested scope would look like
Outer Namespace:
  x     -> 5
  foo() -> function

Inside foo():
  y -> 10
  Inner Block:
    x -> 20  (shadows outer x)
As an aside, let's define what we mean by scope. The scope of a variable or of a name is the set of "locations" in the program where the name is visible. A name is visible at a location if, when you write the name there, that is what the name will mean. If you know the scope of every name, you know the visibility of every name and vice versa.
Primitive namespaces are the built‐in, irreducible collections of names that a programming language provides by default. They are “primitive” in the sense that they are not constructed by the programmer; instead, they are inherent to the language design. Different kinds of identifiers—such as ordinary variables, function names, struct/union/enum tags, preprocessor macros, and labels—reside in different namespaces. This separation prevents collisions even if the same identifier (e.g., “F”) is used in multiple roles.
In C/C++, certain kinds of names belong to different namespaces:
- Ordinary identifiers include variables, functions, etc
- Struct/union/enum tags are names that follow keywords like struct or enum, which exist in a separate namespace
- Preprocessor macros are managed by the preprocessor and not subject to block scope
- And labels, which are used in goto statements
For example,
int f()   // Here we have an f that maps to this function
{
  f;      // Inside this scope we have another variable, so we have a local variable inside the function
          // What about something like below?

  struct f {
    float f;  // This is also allowed because it's inside the scope of the struct
  } f;
  // Structs have a different namespace because they always come with
  // the keyword struct. This creates another namespace for f the struct.
  // float f is allowed because the only time you can use f is when you do
  // ____.f or ____ -> f, which stands for the value of the struct or the pointer to
  // f respectively, so the compiler will understand the context of the f
  // and will know when to use which name.

  enum f {
    // can't use g or f here because it could get confused:
    // if you do something like return &f;, since both the f inside and the
    // struct variable named f are ordinary identifiers, it wouldn't know what to do,
    // same with why you can't do int i, i;
  } g;  // this can't be named f either,
        // but notice that enum f is ok because it's in a different namespace than struct.

  #include <f>  // this is also ok because it's just a library include
  #define f g   // this is a preprocessor macro name, so it also works

  f: goto f;    // this really means g: goto g;, which is an infinite loop,
                // and since this g is preceding a : or after goto, it does not collide
                // with the enum g
}
The compiler uses context (e.g., after the struct keyword, or after a dot operator) to determine which namespace an identifier belongs to.
Python has a simpler model. All variable names within a function are in one namespace (for locals) and global names are in another. There isn't the added complexity of separate namespaces for types versus variables because Python's model is unified.
1.10. Information hiding (for modularity)
Information hiding is a design principle that helps you build modular software. Each module (or class, or file) should expose only the functionality other parts of the program need and hide the rest of its implementation details. This approach:
- Makes code easier to maintain, because internal details can change without breaking other modules.
- Reduces naming conflicts by limiting visibility of internal identifiers.
- Promotes cleaner, more understandable interfaces (APIs).
For each identifier in your code, decide whether it should be externally visible or remain private.
1.10.1. C-style information hiding
In C, the primary way to control visibility across different compilation units (i.e. .c files) is
- static (at file scope): Declaring a function or global variable as static makes it private to that .c file. Other .c files cannot see or call it.
- No static (implicitly extern): If you omit static and define a function or global variable at file scope, it becomes externally visible to the rest of the program (assuming you provide a matching declaration in a header file).
- extern (in a header file): Typically, you place extern declarations in header files so other modules can reference a variable or function.
By default, C does not enforce strong encapsulation; however, using static for private symbols and extern for public symbols is an established convention to emulate a modular design.
Each .c file compiles separately, and the linker resolves references to non-static (public) symbols across object files.
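A hedged two-file C sketch of the convention (file names are made up): static keeps a helper private to its .c file, while the extern declarations in the header publish the rest.

#+begin_src c
/* counter.h - the public interface other modules include. */
extern int counter_value;        /* public global, defined elsewhere     */
void counter_increment(void);    /* public function (implicitly extern)  */

/* counter.c - the implementation. */
int counter_value = 0;

static int step(void) {          /* private: invisible to other .c files */
    return 1;
}

void counter_increment(void) {
    counter_value += step();
}
#+end_src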
1.10.2. Java-style access modifiers
Java provides more granular access control for classes, methods, and fields:
- public: visible to all classes everywhere.
- protected: visible to the same class, subclasses, and other classes in the same package.
- default (no modifier): visible to classes in the same package only.
- private: visible only within the same class.
Take the following visibility table
| Modifier  | Same Class | Subclass (diff pkg) | Same Pkg | Other Pkgs |
|-----------+------------+---------------------+----------+------------|
| public    | ✓          | ✓                   | ✓        | ✓          |
| protected | ✓          | ✓                   | ✓        | ✗          |
| default   | ✓          | ✗                   | ✓        | ✗          |
| private   | ✓          | ✗                   | ✗        | ✗          |
- In Java, each class can be considered a “module” that decides how much of its internal state and methods to expose.
- Classes in the same package can share “package-private” (default) access, a middle ground between fully public and fully private.
1.10.3. OCaml signatures
In OCaml, a signature is like an interface or “type” of a module, describing what the module exposes (types, values, submodules, etc.) without revealing full implementation details.
- module defines an implementation (like a "namespace" or "package")
- module type defines a signature (the interface of a module)
- sig ... end is the syntax for writing the contents of a signature
- struct ... end is the syntax for writing the contents of a module (implementation)
For example, we have
module Q = struct
  (* Define a queue type with parameterized elements *)
  type 'a queue =
    | Empty
    | Node of int * 'a * 'a queue

  (* Function to enqueue an element (actual implementation should go here) *)
  let enqueue = (* actual implementation *)
end

(* Use signatures to control what gets exported *)
(* This is statically checked at compile time *)
(* More flexible than Java *)
module type QI = sig
  type 'a queue
  val enqueue : 'a * 'a queue -> 'a queue
end
(* This is the only thing that will be exposed to outside code *)
The crucial link to information hiding is that a signature can declare both
- Type abstraction: you expose type t but hide its actual structure in the struct,
- Value declarations: you expose function names and types, but not the internal details of how they work.
This ensures that code using the module only relies on the published signature. Internals can change without breaking clients, which is a core advantage of the OCaml module system.
1.10.4. Other languages (Python, etc.)
Python:
- Python does not have built-in private/public keywords for classes. Instead, it relies on naming conventions (a leading underscore for "internal" objects, a double leading underscore for name mangling) and a culture of "we're all consenting adults here."
- Internally, Python stores attributes of a class in a dictionary. For example, MyClass.__dict__ holds the class members. This is very dynamic—names can be added or removed at runtime.
- Python's "information hiding" is often done by only exposing certain names in __all__ or by prefixing "private" objects with an underscore.
Linking this back to information hiding, why do we do this?
- Modularity: Each part of the code is developed and tested independently.
- Maintainability: Internals can change without affecting other parts of the program that only see the “public” interface.
- Clarity: A smaller, well-defined interface is easier to understand than exposing every function/variable.
Some real world examples:
- C Libraries: .h headers define the API; .c files contain private helpers marked static.
- Java Libraries: Classes often expose only a small set of public methods. Internals remain private.
- Python Packages: By convention, “public” APIs are documented, while internal modules or functions (with _ prefix) are for internal use.
1.11. Errors, faults, and failures
1.11.1. Error checking
Compile-Time Checking (Static Checking)
- Definition: Many errors are detected before running the program—during compilation.
- Type Checking: Ensures that functions are called with the correct argument types, variables are used consistently, etc.
- Name Resolution: Identifiers must be declared before use, preventing undefined symbol errors.
- Examples of Static Checking:
- C/C++: The compiler checks function prototypes, variable types, etc.
- Rust: Enforces strict rules about borrowing, lifetimes, and type constraints at compile time.
- Java: Verifies types, class members, method signatures, etc., during compilation (though it also has dynamic checks for certain things).
- Advantages of Static Checking:
- Catches many mistakes early, reducing run-time failures.
- Can enable optimizations because the compiler knows more about the code structure and types.
- Limitations:
- Cannot catch every possible error (e.g., array out-of-bounds might still happen at run time unless the language enforces bounds checks).
- Requires more up-front specification (e.g., type annotations)
Run-Time Checking (Dynamic Checking)
- Definition: Errors are detected during execution. The language or runtime system checks conditions on-the-fly (e.g., type correctness, array bounds).
- Examples of Dynamic Checking:
- Python: Checks types at run time. If a function is called with the wrong argument type, an exception is raised only at the point of use.
- Java/C#: Even though they do static checks, some checks are deferred to run time (e.g., NullPointerException, array bounds checks).
- Advantages:
- Flexibility: you can write code without specifying strict types in advance (in dynamically typed languages).
- More powerful runtime reflection or meta-programming.
- Limitations:
- Errors can go undetected until a particular code path is executed, possibly in production.
- Performance overhead due to continuous checks at run time.
Binding times and language differences
There are two approaches to binding:
- Earliest binding, where the language/runtime decides an identifier’s type or meaning at compile time (e.g., C variable types).
- Latest binding, where the language defers deciding an identifier’s type/meaning until run time (e.g., Python variables can reference any type object at any moment).
1.11.2. Error-handling
In addition to compile-time vs. run-time checks, languages provide various error-handling mechanisms to deal with faults or failures when they do occur.
One approach is to use preconditions:
- Preconditions are conditions that must be true before a function (or method) runs.
- Eiffel: Known for “Design by Contract,” which allows you to specify preconditions (requires) and postconditions (ensures). If a precondition fails, the system raises an exception or error at run time.
- Rust: Encourages writing safe functions that check for invalid inputs (e.g., using Result<T, E>), or using assert!() macros for invariants. Rust's type system also prevents certain classes of errors at compile time (like data races).
Another approach is "total definitions": make sure the function will do something reasonable no matter what the arguments are, but this
is not always feasible or reasonable in call cases. i.e. sqrt(-1.0) = NaN
. This is done in Rust.
Fatal errors are errors that cause the program to stop immediately (e.g., abort(), an uncaught exception, or a panic! in Rust). For user error handling, the programmer explicitly checks for error conditions and decides what to do (return a special value, throw an exception, etc.).
1.11.3. Exceptions
Exceptions allow you to separate normal flow from error handling by "throwing" an exception when something unexpected happens, and "catching" it at an appropriate handler. Typically, we would have some throw mechanism like throw SomeException("message");, where the runtime "unwinds" the stack until it finds a matching catch.
There are checked and unchecked exceptions:
- For checked exceptions, they must be declared in a method's signature (Java). The compiler enforces that callers handle or declare these exceptions, i.e. every method specifies what it can throw. Java's exception class hierarchy has java.lang.Object as the root of all classes, java.lang.Throwable is the superclass for all "throwable" objects, java.lang.Exception and java.lang.Error are direct subclasses of Throwable, and IOException is one of many specific exception types that extend Exception.
- On the other hand, for unchecked exceptions (Java RuntimeException, C++ exceptions, Python exceptions) the compiler does not force explicit handling (though handling them is typically best practice).
In Java code, we could do something like
try {
    doSomeStuff();
} catch (IOException e) {
    fixup(e);
} catch (NumberFormatException e) {
    fixupMore(e);
} finally {
    // This block always executes, even if an exception occurs
    System.out.println("done");
}
The runtime looks up the call stack for a matching handler. If none is found, the program may terminate (e.g., uncaught exception). Take the following diagram, which illustrates how an exception is thrown in function h(), then propagates up through callers (f(), then main()), searching for a matching catch.
+-------------------+
|      main()       |
|     (caller)      |
+---------+---------+
          |
          | calls f()
          v
+-------------------+
|       f()         |
|     (caller)      |
+---------+---------+
          |
          | calls h()
          v
+-------------------+
|       h()         |
+---------+---------+
          |
          | throw new SomeException("error");
          |
+--------------------------------------+
| Exception thrown in h().             |
| No catch block in h(), so it         |
| propagates up to the caller (f()).   |
+--------------------------------------+
          |
          v
+-------------------+
|       f()         |
+---------+---------+
          |
          | Does f() have a catch block for SomeException?
          |  - If YES, handle it here and continue normal flow.
          |  - If NO, rethrow to the next caller (main()).
          v
+-------------------+
|      main()       |
+---------+---------+
          |
          | Does main() have a matching catch block?
          |  - If YES, handle it and continue.
          |  - If NO, program terminates with an unhandled exception.
          v
   (End of call stack)
We can read this diagram as follows:
- main() calls f(), which in turn calls h()
- h() throws an exception (SomeException)
- If h() does not handle it, the runtime looks in f() for a matching catch.
- If f() does not handle it, it propagates up to main()
- If main() does not handle it, the program typically terminates with an unhandled exception error.
We can also have checked exceptions in statically typed languages (e.g. in Java here).
In Java, the compiler enforces that methods either handle or declare exceptions that might be thrown.
- If a method can throw a checked exception, it must declare this in its signature using the throws keyword
- e.g. public void readFile(String filePath) throws IOException
The caller, in turn, must either handle these exceptions or declare that it throws them as well.
There are some exceptions to this rule, e.g. unchecked exceptions like NullPointerException.
1.12. Parameter Passing, Object-Oriented Programming, Cost Models
1.12.1. Parameter Passing
There are two main issues to be concerned about when discussing parameter passing:
- Semantic issues, or how arguments in a function call correspond to the function's parameters. This includes matching based on position, keywords, or handling a variable number of arguments.
- Efficiency issues, or how the parameter passing mechanism affects performance, particularly in programs with lots of function calls / expensive function calls, etc.
There are a few different calling conventions discussed.
- Positional correspondence is the most traditional way, where each argument is paired with its corresponding parameter in order (left-to-right)
- For example, a function def F(x, y): return x + y when called with result = F(5, 10) will match x = 5, y = 10
- Variadic or "Packaged" arguments allows you to pass multiple arguments that get bundle into a tuple (or similar data structure), not all languages support this.
- For example, a function def F(x, *y): return x, y when called with result = F(5, 10, 15, 20) will match x = 5 and y = (10, 15, 20)
- Keyword correspondence is when arguments are specified with names, allowing them to be passed in any order
- For example, a function def F(x, y): return x * y when called with result = F(y = 10, x = 5) will correctly match the parameters by their name, even though the positions are swapped.
The first calling convention we discuss is call by value. In call by value, the caller evaluates the argument (even if it's an expensive expression) and passes a copy of the resulting value to the function (callee).
- It's simple, since each parameter in the function is a distinct copy. Changes in the callee do not affect the original value in the caller.
- It's efficient for small arguments such as integers, floats, object pointers, since they are fast and easy to copy.
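A small C illustration of call by value (to contrast with the by-reference example below): the callee receives a copy of the argument, so modifying the parameter has no effect on the caller's variable.

#+begin_src c
#include <stdio.h>

/* x is a copy of the caller's argument. */
void bump(int x) {
    x = x + 5;                 /* modifies only the local copy */
}

int main(void) {
    int a = 1000;
    bump(a);
    printf("%d\n", a);         /* still prints 1000 */
    return 0;
}
#+end_src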
We also have call by reference, where instead of copying the value, the caller passes a reference (or address) to the variable where the data is stored. The callee then uses this reference to access or modify the original value.
- It's a lot more efficient for large data, like arrays or tensors, since passing a reference is much faster than copying the entire object.
- It enables in-place modification, since the callee can modify the object directly, which is useful when you want the changes to be visible to the caller.
// Function that accepts two parameters by reference
void F(int *x, int *y) {
    // Dereference x and y to access their values
    *x = *x + 5;   // Modifies the value of the original variable pointed to by x
    *y = *y;       // Here, simply using y (could perform operations similarly)
}

int main() {
    int a = 1000;
    int b = 7;

    // Passing the address of 'a' and of 'b' (for an expression like b + 7,
    // the compiler would pass the address of a temporary instead)
    F(&a, &b);

    // The original variable 'a' is modified by F
    return 0;
}
If the argument is a complex expression (like b + 7), the compiler may create a temporary storage location, pass its address to the callee, and then work with that location. This ensures the original variable isn't accidentally modified.
There are some downsides to call-by-reference. Namely,
- It introduces overhead for small data, since passing an address might be slower when the data is small (e.g., a 4-byte integer might require passing an 8-byte address)
- We may run into aliasing problems: when the same memory location is passed through different parameter names (aliasing), it can lead to confusing code and potential undefined behavior.
For example, in C++, we could have potential aliasing issues with the following code:
#include <cstdio>

// Function that demonstrates potential aliasing issues
void F(int &x, int &y) {
    printf("Initial x: %d\n", x);    // Should print the original value of x
    y = x++;                         // Increment x via one alias and assign to y
    printf("Modified x: %d\n", x);   // The outcome may be unexpected due to aliasing
}

int main() {
    int i = 12;
    // Passing the same variable for both parameters, causing aliasing
    F(i, i);
    // The behavior is undefined: i might not simply become 13.
    return 0;
}
When the same variable is passed to both parameters, both x and y refer to the same memory. This can lead to undefined behavior because the operations interfere with each other. Additionally, aliasing forces the compiler to be conservative in optimizations. For example, it cannot cache a value in a register if it might be changed indirectly through an alias, potentially slowing down execution.
There is a balancing act between efficiency and semantics. On one hand,
- Call by value offers simple semantics and potential for aggressive optimization (e.g., caching in registers) but may be inefficient for large objects.
- whereas call by reference is efficient for large data by avoiding copies but can lead to aliasing issues that complicate both human understanding and compiler optimizations.
Designers of programming languages often try to provide mechanisms that balance clear semantics with execution efficiency. In many modern languages, object handling is abstracted so that you might see call-by-value in source code but under the hood, the language may actually pass references (especially for objects) to maintain efficiency.
1.12.2. Advanced Calling Conventions
Now, we can discuss call-by-result. In call-by-result, the callee starts with an uninitialized parameter, uses it during execution, and then – just before returning – copies its final value back to a location provided by the caller. Step-by-step, we have
- The caller designates a location where the result should be stored
- The callee writes the computed result into its local (but uninitialized) parameter
- Upon return, that result is copied back into the caller's designated storage
In pseudocode, that could look something like
procedure F(result: out int) {
    // 'result' is uninitialized at the start
    result := computeSomething();   // the function computes and assigns a value
}

// Caller
int res;
F(res);   // Now, res contains the result computed by F.
This approach avoids the cost of copying large data on function entry, since the data is only copied back out at the end of the function. It is especially beneficial when returning large objects without incurring extra copying overhead. Another example could be read
. In the Linux API,
the function might be declared like
ssize_t read(int file_descriptor, void *buffer, size_t buff_size);
From the caller's perspective,
- The caller provides a file descriptor, a buffer (an array), and the size of the buffer
- The caller does not need to initialize the buffer's contents because its initial state doesn't matter
From the callee's perspective (inside read
),
- The function treats the provided buffer as uninitialized
- It reads data from the file descriptor into the buffer
- It then returns the number of bytes successfully read
It can be thought of in Ada-like pseudocode as follows:
procedure ReadFile(Result_Buffer  : out Buffer_Type;
                   Num_Bytes_Read : out Integer) is
begin
   -- The buffer is assumed to be uninitialized.
   -- Read data from the file and fill in Result_Buffer.
   -- Set Num_Bytes_Read to the number of bytes read.
end ReadFile;
This is more efficient, because instead of copying a potentially large buffer from the caller to the callee (as in call by value), the callee writes directly into the buffer provided by the caller.
Additionally, this mechanism allows the function to "return" both the updated buffer and the count of the bytes read (multiple return values); specifically, for our read
example, what happens is that
- The return value would be the number of bytes read. Our function returns this as normal.
- The call-by-result parameter, i.e. our buffer parameter was passed in uninitialized. Inside the function, the read operation fills this buffer with data. Then, just before the function returns, the final contents of that buffer are copied back to the caller's buffer.
So, although the function explicitly returns the number of bytes read, the buffer itself is updated by the callee using call-by-result semantics. This lets the function effectively "return" multiple values (both the buffer and the count) without having to package them into a single tuple or structure.
Another similar convention is call-by-value-result. It is similar to call by result but involves an extra copying step:
- The caller passes a copy of the argument (like call by value)
- The callee works on that copy
- Upon return, the callee copies the (possibly modified) value back to the caller
This scheme avoids aliasing issues because the callee operates on a local copy during execution. However, it requires two copies (one when entering the function and one when returning), which might be inefficient for large data.
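Rust has no built-in call-by-value-result, but the copy-in/copy-out discipline can be emulated explicitly; this is only a sketch of the idea (the names step and c are illustrative), with the caller performing the "copy back" by reassigning:

// Call-by-value-result, emulated: the callee works on a private copy,
// and the final value is copied back explicitly when it returns.
fn step(mut counter: u64) -> u64 {   // copy in
    counter += 1;                    // operate on the local copy; no aliasing possible
    counter                          // copy out
}

fn main() {
    let mut c = 41;
    c = step(c);                     // the caller performs the "copy back"
    println!("{}", c);               // 42
}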
Another "convention" is the macro call. Macros are quite different from the parameter-passing methods discussed earlier, and exist in languages like C, C++, and Scheme
. They work as follows
- They are compile-time evaluated: the compiler expands the macro call by copying and substituting code text. This occurs during compilation, not at runtime.
- Essentially, they do text substitution. Rather than evaluating arguments and passing values or addresses, the macro mechanism directly inserts the code provided in the macro definition.
Suppose we have a simple macro in C
, #define SQUARE(x) ((x) * (x))
. When you write SQUARE(5)
in your code, the preprocessor replaces it with ((5) * (5))
during compilation. This is purely a compile-time text substitution. There's no runtime cost of function calling or parameter passing.
Finally, we discuss call-by-name. In call by name, the caller does not evaluate the argument expression immediately. Instead, it passes a “recipe” for computing the value—often implemented as a parameterless function (sometimes called a thunk). As a mechanism,
- The callee receives a function (or a reference to an expression) instead of a direct value
- Each time the callee needs the argument's value, it calls this function, thus reevaluating the expression in the caller's context.
Imagine F
which takes two call-by-name parameters
// Define F which expects thunks (parameterless functions) for its arguments
function F(getX, getY) {
    // When F needs x or y, it calls the functions to evaluate the expressions.
    valueX = getX()   // evaluates the expression for x
    valueY = getY()   // evaluates the expression for y
    // Use valueX and valueY in computations
    return valueX + valueY;
}
In order to call F
with call-by-name semantics, it would look something like
// Caller defines the expressions but does not evaluate them immediately.
N = number_of_items;
sum = compute_sum_of_array();

// Pass the expressions as parameterless functions (thunks)
result = F(
    function() { return sum; },
    function() { return sum / N; }
);
In terms of advantages, we have
- Reliability (in certain cases). For example, if an expression might cause an error (such as division by zero) when evaluated, call by name will only compute it if it's actually needed. More specifically, if the function checks for N = 0 before calling getY(), then the dangerous expression (sum / N) is never evaluated when it would cause a crash. Similarly, we can also avoid infinite loops.
  - Specifically, consider a function where one of the arguments might cause an error (e.g., division by zero or entering an infinite loop). With call by name, if the function's body never uses that argument, the dangerous expression is never evaluated.
- We also can avoid unnecessary computation. If the callee never uses a particular argument, its potentially expensive computation is entirely skipped.
However, it also has a lot of drawbacks. Namely,
- It comes with a huge performance overhead. Every time the callee needs the value, the thunk is called. This repeated function call can be less efficient than computing the value once.
- There is also a bit of implementation complexity. The underlying machine code must support this deferred evaluation, making the process more complicated than simple value copying.
We go through an example that Eggert discussed in lecture. Imagine a function that prints the average of items in an array. With call by value, the expression for the average might be computed before the function is called—leading to a potential crash when dividing by zero. With call by name, the function can check the condition before evaluating the dangerous expression. In call-by-value, we would do something like
// Assume sum and N are computed before calling print_average.
print_average(sum / N, N);   // Division by zero occurs if N is 0, even if not used in the function.
whereas in call-by-name, we can check:
function print_average(getAverage, getCount) {
    if (getCount() == 0) {
        print("No items to average.");
    } else {
        print("Average is " + getAverage());
    }
}

// Caller passes thunks to delay evaluation.
print_average(
    function() { return sum / N; },
    function() { return N; }
);
In Scheme, call-by-name might look something like
(define (F getX getY)
  ;; When F needs the values, it calls the functions:
  (let ((x (getX))
        (y (getY)))
    (+ x y)))   ; For example, here we add the two results.

;; Suppose we have:
(define a 10)
(define b 20)

;; Calling F with thunks:
(F (lambda () a)
   (lambda () (+ b 7)))

;; Or, the example from earlier:
;; Define a function that prints an average only if the count is non-zero.
(define (print-average getAverage getCount)
  (if (= (getCount) 0)
      (display "No items to average.")
      (display (string-append "Average is " (number->string (getAverage))))))

;; Caller wraps the expressions in lambdas (thunks) so they are not evaluated immediately.
(define sum 100)
(define N 0)
(print-average
 (lambda () (/ sum N))   ; dangerous expression (division) that won't be evaluated if N == 0
 (lambda () N))
Oftentimes, you, as the programmer, don’t actually write these thunks explicitly. Instead, the compiler automatically transforms the argument expressions into thunks (or similar constructs), so that the function call effectively delays the evaluation of these expressions.
The term lazy evaluation refers to a strategy where evaluation of an expression is deferred until its value is actually needed; it is sometimes used synonymously with call-by-name. Eager evaluation is the opposite: arguments are evaluated before the function call. All the other methods (call-by-value, by-reference, by-result, by-value-result) are eager.
This brings us to call-by-need, which is an improvement over call-by-name (and is what "lazy evaluation" usually means in practice). Instead of evaluating the argument every time it's used, the result is cached (or "memoized") after the first evaluation. Subsequent accesses use the cached value, thereby avoiding repeated computation.
One example of the problem with pure call by name is that if you are inside a loop and an argument (thunk) is used repeatedly, call by name forces re-evaluation each time. This can be inefficient if the computation is expensive.
On the other hand, in this scenario, under call-by-need, when the thunk is called the first time, its result is stored. Any subsequent use of that argument returns the cached value rather than recalculating it. Take this example in Haskell
that demonstrates lazy evaluation -
-- A lazy list of all prime numbers (conceptual; actual implementation may vary)
primes :: [Integer]
primes = sieve [2..]   -- imagine sieve is a function that lazily computes primes

-- Accessing the 7th prime in the list.
-- Only as many primes as needed will be computed.
seventhPrime :: Integer
seventhPrime = primes !! 6   -- list indexing starts at 0
There are a few interesting things about Haskell to note here:
- Infinite data structure - Haskell can represent infinite lists because it computes only as much of the list as is needed. In the above example, when we ask for the 7th prime, Haskell computes just enough of the prime sequence.
- Goal-oriented computation - The program is “read backwards”: you first see what is needed (the 7th element), then determine which computations are required to supply that element.
- Call-by-need efficiency - Once a value is computed (like an element of the prime list), it’s cached. This avoids recalculating the same value repeatedly, which is especially useful in loops or repeated function calls.
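To make the caching that distinguishes call-by-need from call-by-name concrete, here is a minimal hand-rolled memoizing thunk sketched in Rust (the Thunk type and force method are illustrative names, not a standard API):

// A memoizing "thunk": the closure runs at most once,
// and later calls return the cached value.
struct Thunk<T, F: FnOnce() -> T> {
    compute: Option<F>,
    cached: Option<T>,
}

impl<T: Clone, F: FnOnce() -> T> Thunk<T, F> {
    fn new(f: F) -> Self {
        Thunk { compute: Some(f), cached: None }
    }
    fn force(&mut self) -> T {
        if self.cached.is_none() {
            let f = self.compute.take().expect("thunk already consumed");
            self.cached = Some(f());   // evaluate exactly once
        }
        self.cached.clone().unwrap()
    }
}

fn main() {
    let mut arg = Thunk::new(|| {
        println!("expensive computation runs once");
        21 * 2
    });
    // Under call by name this would recompute on every use;
    // under call by need the second call hits the cache.
    println!("{}", arg.force());
    println!("{}", arg.force());
}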
Finally, we turn to what prolog uses: call-by-unification, which was discussed earlier but is worth revisiting here. Call-by-unification is a method where arguments are not passed as already computed values but rather as expressions or terms that include variables. These variables can be uninstantiated (or partially instantiated) when passed to a predicate. The callee (or predicate) then "unifies" these terms, matching and binding variables to values during the computation.
Unification is the process of making two logical terms equal by finding a substitution for variables that makes them identical, and is central to Prolog's execution model. Importantly, it enables
- Partial information: unlike call-by-value or call-by-result, unification allows variables to remain free (uninstantiated) until the predicate provides more information.
- For example, a predicate may accept an argument that is not fully known until further constraints are applied.
- Bidirectional flow of information: the caller and callee can both contribute to determining the value of a variable.
- The callee might bind a variable to a value.
- Later, the caller might further refine that value based on additional context.
Of the conventions above, call-by-unification is perhaps closest to call-by-value-result, in that both allow the callee to modify data that is then "returned" to the caller. However, call-by-unification is more flexible because variables can remain uninstantiated and be filled in as the predicate executes.
% Define a predicate that adds two numbers and unifies the result with a third.
add(X, Y, Z) :- Z is X + Y.

% Query examples:

% 1. Fully instantiated arguments:
?- add(3, 4, Result).
% This will unify Result with 7.

% 2. Partially instantiated query:
?- add(3, Y, 7).
% Conceptually, unification should determine Y = 4, since 3 + Y must equal 7.
% (Note: plain is/2 needs its right-hand side fully instantiated, so this
%  direction actually requires arithmetic constraints, e.g. library(clpfd).)

% 3. Uninstantiated output variable:
?- add(X, Y, 10).
% This might leave X and Y underconstrained unless additional conditions are provided.
In the above examples, the predicate add/3 is called with various degrees of known information:
- When all arguments are given except the result, the predicate computes the sum and unifies the result.
- When one of the input values is missing, Prolog can use unification (together with arithmetic constraints) to solve for the missing value that satisfies the equation.
This mechanism lets the program “fill in” unknowns during execution. The same predicate can be used to compute sums or to infer missing numbers based on the known result.
Beyond the various argument-passing strategies, many calling conventions rely on extra, hidden parameters (or "transformed parameters") that are not explicitly written in the source code but are crucial for:
- OOP: this or self is an implicit pointer passed to methods that refers to the current object (see the sketch after this list).
  - Also, objects carry extra hidden slots (some overhead), e.g., a type slot or prototype slot, or bookkeeping for the garbage collector (size, etc.).
- Static and dynamic chains:
  - Static chain: used for nested functions, it's a pointer to the defining environment (the definer's frame) so that non-local variables can be resolved.
  - Dynamic chain: a pointer to the caller's frame (for example, rbp on x86-64) that helps maintain context during function calls.
- The return address: the address to which control returns after a function call, often stored as the top word on the stack or in a dedicated register.
- Thread-local storage (TLS): a pointer to data that is local to the current thread, allowing efficient access to thread-specific variables without locking.
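As a small illustration of the hidden this/self parameter mentioned in the first item, Rust happens to write it explicitly: a method call is just sugar for passing the receiver as the first argument. The Counter type and bump method below are purely illustrative:

struct Counter { n: u64 }

impl Counter {
    // `&mut self` is the normally hidden "this" parameter, written explicitly.
    fn bump(&mut self, by: u64) { self.n += by; }
}

fn main() {
    let mut c = Counter { n: 0 };
    c.bump(3);                    // sugar: the receiver becomes the first argument
    Counter::bump(&mut c, 4);     // the same call with the hidden parameter spelled out
    assert_eq!(c.n, 7);
}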
1.13. Cost Models
First, there is the idea of high-level vs. low-level software. Typically, we trade off ease of use vs. speed. In machine learning, for example, there are
- high-level frameworks (PyTorch), which are easy to use and lead to rapid development, but may run much slower because they add overhead and hide the low-level details needed for optimization,
- and low-level systems (CUDA), which are optimized for performance on specific hardware, but are difficult to program in, and the resulting code is highly specialized.
We digress and discuss GPU cost models and the memory hierarchy. The latest NVIDIA chips are the Blackwell series, which (like other GPUs), have the following memory hierarchy:
- Registers
- Located in each “SM” (streaming multiprocessor).
- Very fast, but extremely limited in size (e.g., registers might total only a few hundred bytes per SM).
- L1 cache
- Faster than main memory; closer to the computational unit.
- Capacity might be around tens of megabytes (e.g., 24 MB on the B100 module).
- L2 cache
- Larger but slower than L1; shared among multiple SMs (e.g., 128 MB).
- HBM3e (high-bandwidth memory)
- Acts like very fast RAM; capacities can be very high (e.g., 196 GB).
- Much faster than DDR memory in laptops.
The implications are clear:
- Computations that fit into registers run extremely fast.
- If your data spills into slower memory (like L1 or HBM), performance suffers.
- Cost models for GPU programming are very different from traditional CPU models, where memory was assumed to be random access with uniform speed.
The cost models themselves evolve with the hardware:
- Code optimized for the previous generation (like the A100) may run slower on a Blackwell chip because the sizes and speeds of different memory levels have shifted.
- Reoptimizing low‑level C++/CUDA code may take years as programmers learn the new cost model of the hardware.
This brings us to: what is a cost model?
It's a mental model of the computing resources that you need to understand in order to run your software efficiently.
We need to consider not only big-O cost models but also absolute cost models. If you know your software will run on bounded problem sizes, an asymptotically better algorithm will not always beat another algorithm in practice; here the constant factor / absolute cost matters. In terms of costs, we can consider time and memory, and there are trade-offs between them (i.e., trading time for space).
Quick digression: CPU time is the actual time where the CPU does computation, whereas real time also includes the time when your program is sitting around and waiting for input (i.e. includes I/O overhead). On multi-threaded applications, real-time can be faster than CPU time.
Returning to the list of costs, we also consider power and energy (e.g., you may need to run low-power). There's also I/O access time and space, reliability cost (though we usually assume machines are perfect), and network access time; minimizing accesses on and off the network is important there. We will focus on speed cost models.
1.13.1. Cost models in programming languages
Let's look at the cost model for lists. The basic model is that we have some list of N
items, represented in RAM in e.g. the following way
[ 0 ]  ->  [ 1 ]  ->  [ 2 ]   ->  [ 3 ] -> ...
Addr 0     Addr 300   Addr 1391   ...
although the boxes are in nice numeric order, the addresses are somewhat random, but we don't really care since we have RAM.
Under this cost model, what's the cost of append
?
First, let's consider Lisp
. In Lisp
, a list is a chain of "cons" cells (each cell contains a value and a pointer to the next cell). Under an append
operation,
we create a new list that copies the cells from the first list and then links to the second list. Under this, the cost is proportional to the length of the first list.
List A: [A1] -> [A2] -> [A3] -> NIL
List B: [B1] -> [B2] -> NIL

append(A, B) produces:
[A1] -> [A2] -> [A3] -> [B1] -> [B2] -> NIL
Now, we turn to lists in Python
. In Python
, they are implemented as dynamic arrays, i.e., they contain a header with a pointer to the array of items, plus the current length and allocated capacity. When we do
append
, in the common case, adding one element is \(O(1)\) because the array has extra capacity. Although occasionally a new, larger array must be allocated and copied, the average (amortized) cost per append is still \(O(1)\).
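Python's exact allocator is an implementation detail, but the same amortized-growth behavior shows up in any growable array. A small Rust sketch with Vec (not Python's implementation, just the same idea) makes the occasional capacity jumps visible:

fn main() {
    // A growable array: occasional reallocation, amortized O(1) push.
    let mut v: Vec<u64> = Vec::new();
    let mut last_cap = v.capacity();
    for i in 0..1000 {
        v.push(i);
        if v.capacity() != last_cap {
            // Capacity jumps only occasionally (roughly doubling), so the
            // copying cost is spread over many cheap pushes.
            println!("len = {:4}, capacity grew {} -> {}", v.len(), last_cap, v.capacity());
            last_cap = v.capacity();
        }
    }
}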
Briefly, Prolog's unification cost model is as follows. When comparing two logical terms, the cost of unification is roughly proportional to the size of the smaller term. This must be built into the cost model for logic programs, similar to how list operations are analyzed in imperative languages.
What about low-level array access and optimization in our cost model?
First, array access calls. Assume that we have an array of 64-bit (8-byte) doubles. Then the address of element A[i]
can be computed by base_address + (i * 8)
. How can we calculate this as quickly as possible?
First, we place the address of the array and the index i
in registers. Then, notice that multiplication by 8 can be replaced with a left shift by 3 bits:
// Given: A is the base address of an array of 8-byte doubles, and i is the index.
// (This is the machine-level address computation, not C pointer arithmetic.)
// Instead of:
address = A + (i * 8);
// Do:
address = A + (i << 3);   // left shift i by 3 (i.e., multiply by 8)
This is part of the reason primitive type sizes are powers of two. However, there are some complications with doing this all the time. Some languages (e.g., Java, JavaScript) require subscript checking to ensure indices are in range. The naive implementation performs two comparisons (i < 0 and i >= length) with two branches. We can optimize slightly by using an unsigned comparison to combine these checks into a single conditional branch.
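A sketch of that unsigned-comparison idea, written in Rust for concreteness (the checked_get helper is hypothetical, not a standard API): reinterpreting a signed index as unsigned makes negative values enormous, so one comparison covers both bounds.

// Combining the two range checks into one unsigned comparison:
// a signed index i is in range iff 0 <= i && i < len.
fn checked_get(a: &[f64], i: i64) -> Option<f64> {
    if (i as u64) < a.len() as u64 {
        Some(a[i as usize])
    } else {
        None
    }
}

fn main() {
    let a = [1.0, 2.0, 3.0];
    assert_eq!(checked_get(&a, 1), Some(2.0));
    assert_eq!(checked_get(&a, -1), None); // -1 wraps to a huge unsigned value
    assert_eq!(checked_get(&a, 3), None);
}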
We also care about cache effects on array access. Cache lines (e.g., 64 bytes) determine how data is loaded from memory. Using power-of-2 row sizes can lead to cache conflicts; use row sizes that are not exact powers of 2 (or even prime numbers) so that different array elements map to different cache lines.
For example, when summing a column in a 2D array, if all rows are aligned with a power‑of‑2 size, many elements may hash to the same cache line, causing frequent cache evictions. By choosing a row size that is not a power of 2, the accesses will spread out over multiple cache lines, leading to better performance
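A minimal Rust sketch of the padding idea, assuming a row-major layout; the exact stride (here N + 8) is an arbitrary illustrative choice and would be tuned to the actual cache geometry:

// A 1024x1024 matrix stored row-major, but with the row stride padded so
// that consecutive elements of one column do not all map to the same
// small set of cache sets.
const N: usize = 1024;
const STRIDE: usize = N + 8; // padding breaks the power-of-two alignment

fn column_sum(data: &[f64], col: usize) -> f64 {
    let mut sum = 0.0;
    for row in 0..N {
        sum += data[row * STRIDE + col]; // walk down one column
    }
    sum
}

fn main() {
    let data = vec![1.0_f64; N * STRIDE];
    println!("column 3 sums to {}", column_sum(&data, 3));
}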
1.14. Rust
At a glance
The initial focus of Rust is memory safety. Importantly, compared to other languages like Python, Scheme, and OCaml, which do memory-safety checking at runtime (dynamically), Rust does memory safety statically, at compile time. However, Rust's memory safety does not extend to all memory accesses, and in the same way the dynamic checks are not a 100% guarantee all the time either. Also, OCaml does do some memory-safety checking at compile time.
1.14.1. Digression: Static checking of encryption-based program
Suppose you have a program which wants to compute something, but you want to do it in a way that you can run it on some other machine, say at Amazon, without Amazon knowing what you are doing. Mathematically, we want the program \[ y = f(x) \] and we tell Amazon \[ E(y) = F_E(E(x)) \] What we tell Amazon is something else: we encrypt \(x\), ship it off to Amazon, and Amazon gives us the encrypted version of \(y\), which it computes by some more complicated program \(F_E\). \(F_E\) is an encrypted program that can read its input and compute the encrypted output, but such that if you inspect \(F_E\) while it's running, you can't (effectively) debug it. I.e., it is slower, and "impossible" to debug. We are interested in the question: how can we debug this? Dynamic analysis (debugger, etc.) will not work. This leaves only static checking of the original program and of the compiler (from \(F\) to \(F_E\)).
1.14.2. Rust: Performance, Ownership and Borrowing
The mantra is zero-cost abstraction: we want to operate at a higher level than machine code, but with the same performance as machine code.
Ownership
Every value in Rust has a unique owner. When the owner goes out of scope, the value is automatically cleaned up (this is often referred to as RAII in C++ terms). The compiler statically enforces that there is exactly one owner at any time, preventing issues like double frees or memory leaks.
Borrowing
Instead of transferring ownership, you can borrow references to a value. Rust distinguishes between mutable and immutable references.
- Immutable references allow multiple parts of the program to read a value concurrently.
- Only one mutable reference is allowed at a time, ensuring that no two parts of the program can modify the same data simultaneously (thus avoiding data races).
The Rust compiler uses these rules to check your code at compile time. If you violate ownership or borrowing rules (for example, by trying to have a mutable reference while immutable ones exist), the compiler will produce an error.
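A small sketch of these borrowing rules in action; uncommenting the marked lines is exactly the kind of violation the compiler rejects:

fn main() {
    let mut v = vec![1, 2, 3];

    let r1 = &v;          // first immutable borrow
    let r2 = &v;          // more immutable borrows are fine
    println!("{} {}", r1[0], r2[1]);

    let m = &mut v;       // OK: the immutable borrows above are no longer used
    m.push(4);

    // let bad = &v;      // uncommenting this while `m` is still live
    // m.push(5);         // would be rejected: cannot borrow `v` as immutable
                          // because it is also borrowed as mutable
    println!("{:?}", v);
}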
Because data races are prevented by the ownership/borrowing system, writing safe multi-threaded code in Rust is much easier. You can use multi-core systems effectively without the traditional pitfalls of locking in C or C++.
Rust also sidesteps the need for garbage collection by preventing a class of errors (e.g., dangling pointers, memory leaks) from occurring in the first place.
When a variable drops out of scope, if it owns its object, its drop
method is invoked, and that must clean up. The drop method
will also automatically go
through all of the slots in the object and drop them as well, i.e., pointers, etc. The transitive closure is automatic. The memory management parts are automatic.
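A minimal sketch of this RAII-style cleanup: implementing Drop (here on an illustrative Buffer type) lets you watch when values are cleaned up as their owners go out of scope.

struct Buffer { name: &'static str }

impl Drop for Buffer {
    fn drop(&mut self) {
        // Runs automatically when the owner goes out of scope;
        // the object's fields are then dropped too (the transitive closure is automatic).
        println!("dropping {}", self.name);
    }
}

fn main() {
    let _outer = Buffer { name: "outer" };
    {
        let _inner = Buffer { name: "inner" };
        println!("end of inner scope");
    }                       // "dropping inner" printed here
    println!("end of main");
}                           // "dropping outer" printed here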
Note that immutable in Rust is a bit tricky. First, we compare to C
's const. We can have a pointer declared as pointing to some constant value (const int *p
). Although the pointer p
is declared to point to a constant integer (you cannot
modify the value through p
), it does not guarantee that the object itself will never change. For instance,
int *q = ...;   // q points to the same integer as p, and q can modify it.
int x = *p;
// ... some operations ...
int y = *p;
even though p
is a pointer to const, the underlying integer might change via another pointer like q
, so we cannot guarantee x
and y
are equal.
We have the same notion in Rust: "immutable" does not guarantee that an object will never change. Instead,
immutability in Rust means that, in the current state or context, the value is not allowed to be modified ("shared access"). This allows us
to make some guarantees on data races. In C
, a data race occurs at the intersection of shared data and mutable state.
Functional programming makes guarantees on this by making everything immutable. Rust's approach is different. Rust kills the combination of shared data and mutable state, but not the two individually.
If an object is accessed via an immutable reference, it signals that the object should not be changed while that reference is active. However, the object itself might later become mutable once those immutable references are no longer in scope.
In order to have true memory safety, Rust also has
- Runtime bounds checking: we keep track of an array's bounds, and if an access is out of bounds, we panic (error handling). The practical implication is that an array in Rust is no longer just a pointer to a base address; it's a "fat pointer" that also carries an upper bound (and potentially a lower bound). At runtime, every subscript access incurs a check, which costs at least one comparison and a potential branch. The compiler can do a trick with for loops so it doesn't have to check every single access (i.e., just check the loop bounds), but it is still slower.
- No pointer arithmetic: i.e., you can only point to the start of the array.
- No free operation:
drop
is called for you automatically - No null pointer: if you want your "nullpointer" in Rust, you wrap things in
Option
like how OCaml does it.
All of these points attack the following problems
- Use after free,
- Double free,
- Buffer overrun,
- Data races,
- and null pointer dereferencing
common errors in C
, but not in Rust
. However, all this checking cuts down performance. As an escape hatch, Rust has the keyword unsafe, which brings us back to the C/C++ world and recovers the performance. The goal is to use unsafe as little as possible; typically it appears in the code that runs most often (for the performance benefits). This is OK, since that code will have to be tested and exercised heavily anyway; think of the Linux kernel, which is used all the time and so gets rigorously tested in this way.
1.14.3. Rust failure handling
Form 1: Rust has a "panic" primitive (panic!("ouch");), which kills the program: it prints a backtrace to help you with debugging, unwinds the stack, and finally exits.
You can configure the panic behavior if you want, e.g., whether it unwinds the stack (running cleanup) or exits more abruptly.
Form 2: Result
, declared as
enum Result<T, E> {
    Ok(T),
    Err(E),
}
which can be thought of as a discriminated union, similar to the option
in OCaml. Can be thought of
as simply being a pointer to an object, and the object either says "I'm OK" and have a value of type T
,
or "I'm not OK" and have a value of type E
. It's similar to an exception, but there is no try-catch,
which means you can use it in a function and pass the information (the type) back up to your callers. In some ways,
you can write exception-handling "by hand". This is better for low-level code, where you need to be worried about
what errors can happen all the time, and want low-level control over it.
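A small sketch of this hand-rolled error propagation, using Result together with the standard `?` operator to pass the error back up to the caller (the parse_and_double function is just an illustrative example):

use std::num::ParseIntError;

// Propagating errors by hand with Result instead of exceptions:
// the error type travels in the return type.
fn parse_and_double(s: &str) -> Result<i32, ParseIntError> {
    let n: i32 = s.trim().parse()?;   // on Err, return it to the caller immediately
    Ok(n * 2)
}

fn main() {
    match parse_and_double("21") {
        Ok(v) => println!("ok: {}", v),
        Err(e) => println!("error: {}", e),
    }
    match parse_and_double("twenty-one") {
        Ok(v) => println!("ok: {}", v),
        Err(e) => println!("error: {}", e),   // e.g. "invalid digit found in string"
    }
}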
1.14.4. Cargo build system
Does things like dependency management, builds (make
), and testing, benchmarks, and documentation. Lint checkers, formatter,
all the things you normally expect in an ecosystem.
Eggert sees subscript checking as a problem and would rather have it done at compile time than at run time. Compile-time subscript checking isn't there yet, and we should expect progress here.
1.15. Programming Language Semantics
Semantics are "what does a program mean?" Software developers should care about this (and not just philosophers) because we want tools that run programs and analyze programs, and importantly, we want our tools to be correct.
We have the classic distinction between syntax and semantics. The former is easy, the latter is hard. Semantics is split into two subcategories:
- Dynamic semantics asks the question, "what does the program mean, as you run it?"
- Static semantics asks the question, "how much of the program's meaning can you determine, before you start running the program?"
In general, static semantics are easier for traditional software programs, whereas dynamic semantics are in general harder.
1.15.1. Static semantics
One way to determine static semantics is through attribute grammars. The idea is that you take your parse tree, but you "annotate" it with attributes on each node (nonterminal), e.g., types and scope. For example, suppose we have a plus node whose leaves are an identifier of type float and another identifier of type int. Then for the plus node, we can compute its type attribute from the children's types (e.g., by unifying them).
The way we formalize this is through the grammar itself. For example, we have some grammar rule Expr → Expr + Expr
. Now, we also have an extra rule
with "subscripts", and write a rule like Expr.type = unify(Expr₁.type, Expr₂.type)
. We can do this for every grammar rule that we have.
This approach suffices for synthesized attributes, i.e., attributes whose information "flows up" from the leaves to the root. Unfortunately, not all attributes are that simple. Consider, for example, attributes that deal with scope. For scope, we need a symbol table that maps identifier names to types. It is an inherited attribute that is passed down from a parent node to its children. Each scope has its own symbol table; when you enter a new block/scope, the block's scope is represented by a new symbol table that inherits all entries from the parent and potentially adds new entries. You could also make this information flow upwards, but it's easier to think about it flowing down.
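A minimal Rust sketch of the idea (not from lecture): a bottom-up type computation (synthesized attribute) that consults a symbol table passed down from above (inherited attribute). The Type/Expr names and the int-vs-float "unification" rule are illustrative assumptions.

use std::collections::HashMap;

// The symbol table flows down (inherited attribute);
// the computed type flows up (synthesized attribute).
#[derive(Clone, Copy, Debug)]
enum Type { Int, Float }

enum Expr {
    IntLit(i64),
    FloatLit(f64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
}

fn type_of(e: &Expr, scope: &HashMap<String, Type>) -> Type {
    match e {
        Expr::IntLit(_) => Type::Int,
        Expr::FloatLit(_) => Type::Float,
        Expr::Var(name) => scope[name],           // look up the inherited table (panics if unbound)
        Expr::Add(lhs, rhs) => {
            // "unify" the children's synthesized types: int + float => float
            match (type_of(lhs, scope), type_of(rhs, scope)) {
                (Type::Int, Type::Int) => Type::Int,
                _ => Type::Float,
            }
        }
    }
}

fn main() {
    let mut scope = HashMap::new();
    scope.insert("x".to_string(), Type::Float);
    let e = Expr::Add(Box::new(Expr::Var("x".into())), Box::new(Expr::IntLit(3)));
    println!("{:?}", type_of(&e, &scope)); // Float
}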
Static semantics is very disciplined.
Dynamic semantics
This is the hard part. We discuss 3 forms: operational, axiomatic, and denotational semantics.
1.15.2. Operational semantics
The basic idea is that you want programming language L's semantics. If you don't know how the language works, what you can do is "ask someone else to explain L to you" by giving you a program that interprets L correctly, i.e., you read the source code of an interpreter/compiler for L. Typically, this interpreter is written in some simpler language. Once you learn one language, you can learn what another language means by "seeing what its programs do": you observe a program B, written in L, running under some interpreter I, written in some simpler language K. You can either watch the behavior directly or even run it through a debugger.
Usually, what you'll do is read the source code, "mentally understand it", and then you don't have to run the program – you will understand the language.
Consider the following example, where we write the semantics for ML, in prolog. We want to effectively describe OCaml in Prolog, where
m
is a predicate in Prolog which determines the meaning of an OCaml expression; E
is an ML expressionC
is a context (tell me all the values of the variables I can see, map variable names to values)R
is the result of evaluatingE
inC
constants represent themselves, we want function calls (call/2
), and atoms represent variables. We write the following
% call(F, E)   represents: F E
% fun(I, E)    represents: fun I -> E
% let(I, E, F) represents: let I = E in F
%
% m(E, C, R): ML expression E, in context C, evaluates to result R.

m(E, _, E) :- integer(E).        % constants evaluate to themselves

m(E, C, R) :-                    % variables (atoms) are looked up in the context
    atom(E),
    member(E=R, C),              % look the variable up in the context, a list of Name=Value
    !.                           % cut: with shadowing (the same name bound more than once),
                                 % member finds the innermost binding first; the cut stops
                                 % Prolog from backtracking into outer bindings

m(let(I, E, F), C, R) :-
    m(E, C, V),                  % evaluate E in the current context to get value V
    m(F, [I=V|C], R).            % evaluate F with I bound to V, growing the context

% fun(I, E) evaluates to a function value lambda(I, C, E),
% capturing the defining context C (a closure).
m(fun(I, E), C, lambda(I, C, E)).

% Calling a function value on an argument behaves like a let binding.
m(call(lambda(I, C, E1), E), C, R) :-
    m(let(I, E, E1), C, R).
If we wanted to implement Prolog in OCaml, we would probably want to define variables, atoms, and compound terms
type term =
  | Var of string                    (* a variable, e.g. "X" *)
  | Atom of string                   (* an atom, e.g. "foo" *)
  | Compound of string * term list   (* a compound term, e.g. f(a, X) *)
also clauses, which are either a fact or a rule
type clause =
  | Fact of term               (* e.g., parent(john, mary). *)
  | Rule of term * term list   (* e.g., ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y). *)
and then the most important piece is unification: we would need to write a unify function in OCaml:
(* A substitution maps variable names to terms *)
type substitution = (string * term) list

(* Attempt to unify two terms, returning a substitution if successful *)
val unify : term -> term -> substitution option
we could write a unification in OCaml like so
(* A substitution maps variable names to terms *)
type substitution = (string * term) list

(* Apply a substitution to a term *)
let rec apply_subst (subst : substitution) (t : term) : term =
  match t with
  | Var x ->
      (match List.assoc_opt x subst with
       | Some t' -> apply_subst subst t'   (* recursively apply, in case t' is also a variable *)
       | None -> Var x)
  | Atom _ -> t
  | Compound (f, args) -> Compound (f, List.map (apply_subst subst) args)

(* Extend the substitution by binding x to t *)
let extend (subst : substitution) (x : string) (t : term) : substitution =
  (x, t) :: subst

(* A simple unification function that attempts to unify two terms with an initial substitution *)
let rec unify (t1 : term) (t2 : term) (subst : substitution) : substitution option =
  let t1 = apply_subst subst t1 in
  let t2 = apply_subst subst t2 in
  match (t1, t2) with
  | (Atom a1, Atom a2) -> if a1 = a2 then Some subst else None
  | (Var x, t) | (t, Var x) ->
      (* If they are the same variable, no change is needed *)
      if t = Var x then Some subst
      else
        (* For simplicity, we do not do an occurs-check here *)
        Some (extend subst x t)
  | (Compound (f1, args1), Compound (f2, args2)) ->
      if f1 = f2 && List.length args1 = List.length args2
      then unify_lists args1 args2 subst
      else None
  | _ -> None

(* Helper function to unify two lists of terms *)
and unify_lists (l1 : term list) (l2 : term list) (subst : substitution) : substitution option =
  match (l1, l2) with
  | ([], []) -> Some subst
  | (t1 :: rest1, t2 :: rest2) ->
      (match unify t1 t2 subst with
       | None -> None
       | Some subst' -> unify_lists rest1 rest2 subst')
  | _ -> None

(* A wrapper function that starts with the empty substitution *)
let unify_terms t1 t2 = unify t1 t2 []
We have circular dependencies in operational semantics, e.g., in Lisp. Although the language refers to itself in its operational rules, the semantics are defined inductively (or recursively) so that evaluation eventually bottoms out at base cases (like evaluating constants or primitives). This self-reference is handled by the language's ability to resolve recursive definitions, ensuring that well-formed expressions eventually evaluate in a consistent manner.
1.15.3. Axiomatic semantics
We want to write the logic for how to reason about a program
\[ \{P\}\; S\; \{Q\} \]
In the above, "S" is a statement, "P" is a precondition (a boolean assertion which is true or false), and "Q" is a postcondition (a boolean assertion that should be true after "S" is executed). The meaning of this notation is "If P is true and S runs and finishes, then Q is true afterwards." For example,
we could write {i < 0} i := i + 1; {i ≤ 0}, literally \(\{P\}S\{Q\}\). We want to write a set of axioms for our program that will allow us
to prove whatever we need. In Pascal, we have something like
Prove: {P} if B then T else E {Q}
  (assuming no side effects in B)
  We need to prove 2 things:
    {P ^ B} T {Q}
    {P ^ not B} E {Q}

Prove: {P} while W do S {Q}
  We need:
    {P ^ W} S {P}          (P is a loop invariant, preserved by the body)
    (P ^ not W) implies Q  (when the loop exits, Q holds)
  Catch: this is valid but doesn't check that the loop terminates.
  We will also need to prove separately that the loop terminates.
a huge advantage of axiomatic semantics over operational semantics is that we can prove things about the program before running it.
1.15.4. Denotational semantics
When you want to compute the denotational semantics of a program, it will always be a function that takes byte sequences to byte sequences and an exit status. \[ \text{semantics}(P) = \text{function from input to output} \] The semantics of, e.g., a C program is a function from a byte sequence to a (byte sequence, exit status) pair. To figure out the semantics of a program, you write down this function.
We can loosely map the three kinds of dynamic semantics onto our three styles of programming languages: imperative corresponds to operational, logic to axiomatic, and functional to denotational.
1.16. Notes on Python asyncio
and Future
A coroutine is a special function defined with async def
that can pause its execution using the await
keyword. When you call an async function, it returns a coroutine object rather than running immediately.
This coroutine must be scheduled on an event loop (managed by asyncio
) to be executed. Some basic syntax
async def my_coroutine():
defines a coroutineawait some_async_operation()
pauses the coroutine until the awaited operation (which must be an awaitable) completes
A Future is an object that represents a result that isn't available yet. In asyncio
, many operations return a Future
that will eventually be resolved with a result.
- You can think of a Future as a promise to provide a result later.
- Coroutines often await Futures to pause execution until the result is ready.
import asyncio

async def say_hello():
    print("Hello")
    # Simulate an asynchronous operation by sleeping
    await asyncio.sleep(1)
    print("World")
    return "Done"

# Running the coroutine using asyncio's event loop.
async def main():
    # The coroutine returns a Future-like object.
    result = await say_hello()
    print("Result:", result)

# This will run the main coroutine.
asyncio.run(main())
say_hello()
is a coroutine that prints a message, waits asynchronously for 1 second, then prints another message- the
await asyncio.sleep(1)
statement pausessay_hello
without blocking the whole program asyncio.run(main())
starts the event loop, schedulesmain()
and waits for its completion.
Threads run concurrently but each thread has its own system-level overhead (e.g., context switching, memory consumption). Threads can run in parallel on multiple CPUs, but Python’s Global Interpreter Lock (GIL) often prevents true parallel execution for CPU-bound tasks.
I/O-bound tasks (such as reading from disk or waiting for a network response) frequently block. When using threads, one thread might block waiting for I/O while others can run, but managing many threads can be inefficient and error-prone (e.g., dealing with race conditions, locks, and thread-safety).
In terms of advantages, coroutines are much lighter than threads. They allow thousands of concurrent tasks to run in the same thread without the overhead of OS-level threads. It allows it to pause while waiting for I/O without blocking the entire process. Additionally, thanks to it being in a single-threaded event loop, you typically avoid many of the synchronization issues (deadlocks, race conditions) common in threaded programming.
2. TODO:
- Language "at a glance" for each one
- Practice / get intuitive feeling for writing code in scheme, prolog, know basics in Java, re-view ocaml
- Review first half
- Reread notes and make sure everything makes sense.
- Organize notes + index (Wed)
- Review first half (Wed/Thurs)
Wed:
- Print out notes and understand things well, make sure everything is nice and organized
- Prep notes for printing
- Do all the worksheets
- Review first half material, especially grammars, ambiguity, HW2, OCaml, etc
- Scheme
call/cc
video - Do the LA practice final
- Get intuitive feeling for writing code in all the languages, what's important, etc
- Print out homeworks that are annotated
- Make sure to get all "language at a glance stuff" correct and spend time making your notes good
- Summarize all the concepts you've learned and the "Eggert vocab" in a single page
Thurs:
- Review weak points, especially writing code in the languages
- Review hard concepts, make sure everything is understandable
- Do practice exams, review notes