Rosetta Lisp is a set of Lisp-1 implementations that share the VM instruction set, built-in functions, and the bootstrapping code.
Currently there are 6 implementations available. Each of them is implemented in different languages.
- ocalisp in OCaml (Reference implementation)
- scalisp in Scala 3
- golisp in Go
- fslisp in F#
- kokalisp in Koka 3
- idrlisp in Idris 1 (Out of date)
- wonderlisp in Rosetta Lisp itself
The design of the Rosetta Lisp implementation is heavily inspired by SECD machine. Every S-expression is macro-expanded and then compiled into a code, a sequence of instructions. Rosetta Lisp implementations share an extremely simple, small instruction set that targeting an SECD-like virtual machine.
Rosetta Lisp VM consists of four states:
[S]tack
- a list of values[E]nvironment
- an abstract representation of a collection of key-value pairs[C]ode
- a list of instructions[D]ump
- a list of pairs of Environment and Code
Push a value
on top of S
.
(define (ldc value)
(lambda (S E C D k)
(k (cons value S) E C D)))
Find a binding for name
from E
and push the value on top of S
.
(define (ldv name)
(lambda (S E C D k)
(k (cons (find-env name E) S) E C D)))
Make a function capturing E
and push it on top of S
.
(define (ldf pattern code)
(lambda (S E C D k)
(k (cons (make-function-closure pattern code E) S) E C D)))
Make a macro capturing E
and push it on top of S
.
(define (ldm pattern code)
(lambda (S E C D k)
(k (cons (make-macro-closure pattern code E) S) E C D)))
Find a built-in function named name
and push it on top of S
.
(define (ldb name)
(lambda (S E C D k)
(k (cons (find-builtin-function name) S) E C D)))
Pop a value from S
, set C
to then-code
if the value is #f
, set C
to
else-code
otherwise. Set E
to a new environment derived from E
. Push the
previous E
and C
on top of D
.
(define (sel then-code else-code)
(lambda ((s . S) E C D k)
(k S
(new-env E)
(if s then-code else-code)
(cons (cons E C) D))))
Pop a pair of Environment and Code from D
and set E
and C
to it.
(define (leave)
(lambda (S E C ((e . c) . D) k)
(k S e c D)))
Pop argc
values as function arguments from S
, pop a function from S
, call
the function with the arguments.
(define (app argc)
(lambda ((argN argN-1 ... arg1 f . S) E C D k)
(apply-function f (list arg1 .. argN) S E C D k)))
apply-function
is defined for built-in functions and function closures
obtained by ldf
. apply-function
on function closures is defined as follows:
(define (apply-function (FUNCTION-CLOSURE pattern code env)
args
S E C D k)
(k S
(bind-args-with-pattern args pattern (new-env env))
code
(cons (cons E C) D)))
Pop a value from S
.
(define (pop)
(lambda ((s . S) E C D k)
(k S E C D)))
Pop a value, create a binding from name
to the value on E
.
(define (def name)
(lambda ((s . S) E C D k)
(k S (env-define name s E) C D)))
Pop a value, update a binding from name
to the value on E
.
(define (set name)
(lambda ((s . S) E C D k)
(k S (env-set name s E) C D)))
All literals and quoted expressions are compiled into ldc
:
; compiling 123
[0 entry]
ldc 123
All unquoted symbols are compiled into ldv
:
; compiling foo
[0 entry]
ldv foo
Lists are compiled into a sequence of element evaluations and an app
.
; compiling (compare a b)
[0 entry]
ldv compare
ldv a
ldv b
app 2
; compiling (+ foo (* bar baz) 111)
[0 entry]
ldv +
ldv foo
ldv *
ldv bar
ldv baz
app 2
ldc 111
app 3
All other instructions are produced by Syntax. Applications of Syntax are
not compiled into an usual app
.
(builtin cons)
produces a ldb
.
[0 entry]
ldb hello
(if a b c)
produces a sel
and two branch codes terminated by a leave
.
[0 entry]
ldv a
sel [1 then] [2 else]
[1 then]
ldv b
leave
[2 else]
ldv c
leave
(begin a b c)
produces a sequence of evaluations and pop
s.
[0 entry]
ldv a
pop
ldv b
pop
ldv c
(fun (x y) y)
produces a ldf
, (macro (x y) x)
produces a ldm
. Body of
functions are terminated by a leave
but macros are not.
[0 entry]
ldf [1 fun (x y)]
[1 fun (x y)]
ldv y
leave
[0 entry]
ldm [1 macro (x y)]
[1 macro (x y)]
ldv x
(def x 123)
produces a def
, (set! x 123)
produces a set
. Both of them
also produce a ldc ()
to adjust stack size.
[0 entry]
ldc 123
def x
ldc ()
[0 entry]
ldc 123
set x
ldc ()
(quote x)
is also one of Syntax that produces a ldc
.
All other functionalities are injected as built-in functions. Every implementation provides the same built-in function set. By doing this, Every implementation can use the same bootstrapping code to get frequently used functions and macros.
To reduce the size of required built-in functions, there are a lot of functions and macros that are defined in the bootstrapping code instead of in host languages. It's not optimal in terms of performance, but it makes easy to add another Rosetta Lisp implementation.
In the following documentation, result<a>
means a result cons cell.
- If
car
is#t
, the resulta
is set tocdr
. - If
car
is#f
, the failure error informationstr
is set tocdr
.
name | overloads |
---|---|
cons |
(_, _) -> cons |
car |
(cons) -> _ |
cdr |
(cons) -> _ |
name | overloads |
---|---|
exit |
(?num) -> ! |
error |
(?str) -> ! |
NOTE: quote
is a syntax, quasiquote
and unquote
are implemented on
boot.lisp
.
name | overloads | note |
---|---|---|
gensym |
() -> sym |
Must not overlap with symbol literals |
name | overloads |
---|---|
apply |
(proc, list) -> _ |
call/cc |
(proc) -> _ |
never |
(proc, ..._) -> ! |
name | overloads | note |
---|---|---|
num? |
(_) -> bool |
|
sym? |
(_) -> bool |
|
str? |
(_) -> bool |
|
cons? |
(_) -> bool |
|
nil? |
(_) -> bool |
|
bool? |
(_) -> bool |
|
proc? |
(_) -> bool |
Built-in functions and user functions |
meta? |
(_) -> bool |
Language syntax and user macros |
vec? |
(_) -> bool |
name | overloads |
---|---|
+ |
(...num) -> num |
- |
(num, ...num) -> num |
* |
(...num) -> num |
/ |
(num, ...num) -> num |
% |
(num, ...num) -> num |
name | overloads |
---|---|
= |
(..._) -> bool |
< |
(...num) -> bool , (...str) -> bool |
> |
(...num) -> bool , (...str) -> bool |
<= |
(...num) -> bool , (...str) -> bool |
>= |
(...num) -> bool , (...str) -> bool |
String encoding is not specified, but is expected to be ASCII compatible through these operators. This requirement is usually satisfied by using a Unicode scalar sequence or byte sequence.
name | overloads | note |
---|---|---|
str |
(...num) -> str | ! |
Each num corresponds to a character. Raises an error on invalid characters |
str-char-at |
(str, num) -> num | nil |
Takes an index. Returns a character |
str-length |
(str) -> num |
Returns character count |
str-concat |
(...str) -> str |
|
substr |
(str, num, num) -> str | ! |
Takes a str, index, and length. Raises an error on out of range |
sym->str |
(sym) -> str |
|
num->str |
(num) -> str |
|
str->num |
(str) -> num | nil |
vec
is a type for minimal support of mutable/fixed-length sequential data.
name | overloads | note |
---|---|---|
vec |
(..._) -> vec |
|
vec-make |
(num, _) -> vec |
Takes a length and initial value |
vec-length |
(vec) -> num |
|
vec-get |
(vec, num) -> _ | nil |
Get by index |
vec-set! |
(vec, num, _) -> nil | ! |
Set by index. Raises an error on out of range |
vec-copy! |
(vec, num, vec, num, num) -> nil | ! |
Takes a dest, dest-start-index, src, src-start-index, and length |
name | overloads | note |
---|---|---|
read-file-text |
(str) -> result<str> |
Takes a file path. Returns contents of the file |
write-file-text |
(str, str) -> result<nil> |
Takes a file path and contents |
read-console-line |
() -> result<str | nil> |
No line feed at end. Returns nil if terminated |
write-console |
(str) -> result<nil> |
name | overloads | note |
---|---|---|
args |
() -> list |
Returns command line arguments as list of str |
eval |
(_) -> result<_> |
Evaluates a S-expression |
macroexpand |
(_) -> result<_> |
Perform macro expansion |
macroexpand-1 |
(_) -> result<_> |
Perform one step macro expansion |
There are test suites for the functionalities of the parser, the compiler, and the VM. Rosetta Lisp's bootstrapping code includes unit tests for each built-in function, as comments.
(def cons (builtin cons))
;! > (cons 1)
;! fail
;! > (cons 1 2)
;! (1 . 2)
;! > (cons 1 2 3)
;! fail
Since it's troublesome to parse these comments, there is a unified, easy-to-parse test file available. This test file is generated by ./gentest.