Res ipsa loquitur — Latin, meaning "the thing speaks for itself"
Largely following the steps noted down in pg's The Roots of Lisp. Blissfully ignoring the admonitions in this axis-of-eval blog post.
>>> (car '(x))
x
>>> (eq 'foo (car '(foo)))
t
>>> ((lambda (x) (cons x '(b))) 'a)
(a b)
>>> (eval '((lambda (x) (cons x '(b))) 'a) '())
(a b)
- Make a REPL work in a web page
- Change the 'eval' evaluator to compare values, not symbols
- Change the 'eval' evaluator to pass function values, not lambda exprs
- Change the 'eval' evaluator to accept varargs
We take things in two big steps. The first, compiling to an intermediate representation. This makes some statically known or inferrable things clear in the code, and breaks it down into smaller steps. It's also straightforward to flatten the intermediate representation to bytecode later.
A variable lookup in code has two parts:
-
The static part, where the variable lookup is resolved to the innermost binder which defines it. The resulting lookup information, if successful, is of the form "M steps out, slot number N".
-
The dynamic part, where a value is looked up using the "M steps out, slot number N" information, together with an environment, a linked list of at least N frames, the Nth of which has at least M slots.
In case the static lookup is successful, we generate (lookup M N)
.
In case it wasn't, we generate (error (concat "Variable " name " not found"))
.
Quotation proceeds through a nested list structure, in preorder.
- Leaf nodes are either symbols or the empty list.
- A symbol generates
(symbol N)
, whereN
is a number (like au64
) indexing into a global symbol registry. If the symbol wasn't already in this registry, we add it and give it a fresh index. - The empty list generates
(empty-list)
.
- A symbol generates
- Internal nodes cons the results of their children together, right-to-left.
- Start by generating
(empty-list)
; call thisL
. - For each child node
e
, right-to-left:- Calculate
(cons e L)
; call this new listL
.
- Calculate
- The end result is
L
.
- Start by generating
A conditional has this form:
(cond c1 e1
c2 e2
...
cN eN)
Equationally, this is equivalent to a simpler if
:
(if c1 e1
(if c2 e2
...
(if cN eN
(error "Fell off cond"))))
Let's consider a single if
:
(if c e-then e-else)
For the intermediate format, we use "label binders":
(fwd-label AFTER-IF
(fwd-label AFTER-THEN
(jump-unless-nil c AFTER-THEN)
e-then
(jump AFTER-IF))
e-else)
This is still a nested format, but it's much easier to generate linear bytecode from it, thanks to the named labels.
A function application looks like this:
(f a1 a2 ... aN)
There are two challenges here:
-
In order to maintain a "flat" intermediate representation, we need to compute all the arguments
a1 a2 ... aN
, and store them in temporary registers. -
In order for the "call function" opcode to have bounded size, the call must only make use of indexed slots, both for the function and all the arguments.
The resulting intermediate code looks something like this:
(set-slot sf f)
(set-slot s1 a1)
(set-slot s2 a2)
...
(set-slot sN aN)
(call sf s1 s2 ... sN)
A lambda generates a function value, but let's call it a closure for greater impact. A closure has two parts:
- The environment, supplied by the runtime. This is so that the function body can access variables declared outside the function.
- The code, consisting of two parts:
- A parameter list, but this is really a nonnegative integer.
- Instructions, the result of recursively compiling the function body.
Importantly, the code part is compiled/prepared once, and can then be re-used with a different environment in each created closure.
Generates as (closure code)
, where code
is an index into a global code
registry. Again, the environment is supplied by the runtime.
An expression (label name expr)
generates the following:
(rec expr)
Where expr
has been compiled in a new scope that binds name
to its single
slot.
| Form | Type |
|============================|===================|
| (lookup M N)
| IrLookup
|
| (error msg)
| IrError
|
| (symbol index)
| IrSymbol
|
| (empty-list)
| IrEmptyList
|
| (cons e L)
| IrCons
|
| (fwd-label lbl IR)
| IrFwdLabel
|
| (jump-unless-nil e lbl)
| IrJumpUnlessNil
|
| (jump lbl)
| IrJump
|
| (set-slot r e)
| IrSetSlot
|
| (call sf s1 s2 ... sN)
| IrCall
|
| (closure index)
| IrClosure
|
| (rec e)
| IrRec
|