Skip to content

A set of Lisp-1 implementations that share the VM instruction set, built-in functions, and the bootstrapping code

License

Notifications You must be signed in to change notification settings

yubrot/rosetta-lisp

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

35 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Rosetta Lisp

Rosetta Lisp is a set of Lisp-1 implementations that share the VM instruction set, built-in functions, and the bootstrapping code.

Implementations

Currently there are 6 implementations available. Each of them is implemented in different languages.

VM Design

The design of the Rosetta Lisp implementation is heavily inspired by SECD machine. Every S-expression is macro-expanded and then compiled into a code, a sequence of instructions. Rosetta Lisp implementations share an extremely simple, small instruction set that targeting an SECD-like virtual machine.

VM Instruction Set

Rosetta Lisp VM consists of four states:

  • [S]tack - a list of values
  • [E]nvironment - an abstract representation of a collection of key-value pairs
  • [C]ode - a list of instructions
  • [D]ump - a list of pairs of Environment and Code

ldc value

Push a value on top of S.

(define (ldc value)
  (lambda (S E C D k)
    (k (cons value S) E C D)))

ldv name

Find a binding for name from E and push the value on top of S.

(define (ldv name)
  (lambda (S E C D k)
    (k (cons (find-env name E) S) E C D)))

ldf pattern code

Make a function capturing E and push it on top of S.

(define (ldf pattern code)
  (lambda (S E C D k)
    (k (cons (make-function-closure pattern code E) S) E C D)))

ldm pattern code

Make a macro capturing E and push it on top of S.

(define (ldm pattern code)
  (lambda (S E C D k)
    (k (cons (make-macro-closure pattern code E) S) E C D)))

ldb name

Find a built-in function named name and push it on top of S.

(define (ldb name)
  (lambda (S E C D k)
    (k (cons (find-builtin-function name) S) E C D)))

sel then-code else-code

Pop a value from S, set C to then-code if the value is #f, set C to else-code otherwise. Set E to a new environment derived from E. Push the previous E and C on top of D.

(define (sel then-code else-code)
  (lambda ((s . S) E C D k)
    (k S
       (new-env E)
       (if s then-code else-code)
       (cons (cons E C) D))))

leave

Pop a pair of Environment and Code from D and set E and C to it.

(define (leave)
  (lambda (S E C ((e . c) . D) k)
    (k S e c D)))

app argc

Pop argc values as function arguments from S, pop a function from S, call the function with the arguments.

(define (app argc)
  (lambda ((argN argN-1 ... arg1 f . S) E C D k)
    (apply-function f (list arg1 .. argN) S E C D k)))

apply-function is defined for built-in functions and function closures obtained by ldf. apply-function on function closures is defined as follows:

(define (apply-function (FUNCTION-CLOSURE pattern code env)
                        args
                        S E C D k)
  (k S
     (bind-args-with-pattern args pattern (new-env env))
     code
     (cons (cons E C) D)))

pop

Pop a value from S.

(define (pop)
  (lambda ((s . S) E C D k)
    (k S E C D)))

def name

Pop a value, create a binding from name to the value on E.

(define (def name)
  (lambda ((s . S) E C D k)
    (k S (env-define name s E) C D)))

set name

Pop a value, update a binding from name to the value on E.

(define (set name)
  (lambda ((s . S) E C D k)
    (k S (env-set name s E) C D)))

Compilation to VM instructions

All literals and quoted expressions are compiled into ldc:

; compiling 123
[0 entry]
  ldc 123

All unquoted symbols are compiled into ldv:

; compiling foo
[0 entry]
  ldv foo

Lists are compiled into a sequence of element evaluations and an app.

; compiling (compare a b)
[0 entry]
  ldv compare
  ldv a
  ldv b
  app 2

; compiling (+ foo (* bar baz) 111)
[0 entry]
  ldv +
  ldv foo
  ldv *
  ldv bar
  ldv baz
  app 2
  ldc 111
  app 3

All other instructions are produced by Syntax. Applications of Syntax are not compiled into an usual app.

(builtin cons) produces a ldb.

[0 entry]
  ldb hello

(if a b c) produces a sel and two branch codes terminated by a leave.

[0 entry]
  ldv a
  sel [1 then] [2 else]
[1 then]
  ldv b
  leave
[2 else]
  ldv c
  leave

(begin a b c) produces a sequence of evaluations and pops.

[0 entry]
  ldv a
  pop
  ldv b
  pop
  ldv c

(fun (x y) y) produces a ldf, (macro (x y) x) produces a ldm. Body of functions are terminated by a leave but macros are not.

[0 entry]
  ldf [1 fun (x y)]
[1 fun (x y)]
  ldv y
  leave
[0 entry]
  ldm [1 macro (x y)]
[1 macro (x y)]
  ldv x

(def x 123) produces a def, (set! x 123) produces a set. Both of them also produce a ldc () to adjust stack size.

[0 entry]
  ldc 123
  def x
  ldc ()
[0 entry]
  ldc 123
  set x
  ldc ()

(quote x) is also one of Syntax that produces a ldc.

Built-in functions

All other functionalities are injected as built-in functions. Every implementation provides the same built-in function set. By doing this, Every implementation can use the same bootstrapping code to get frequently used functions and macros.

Built-in functions required by the bootstrapping code

To reduce the size of required built-in functions, there are a lot of functions and macros that are defined in the bootstrapping code instead of in host languages. It's not optimal in terms of performance, but it makes easy to add another Rosetta Lisp implementation.

In the following documentation, result<a> means a result cons cell.

  • If car is #t, the result a is set to cdr.
  • If car is #f, the failure error information str is set to cdr.

Cons cell operators

name overloads
cons (_, _) -> cons
car (cons) -> _
cdr (cons) -> _

Terminators

name overloads
exit (?num) -> !
error (?str) -> !

Macro supports

NOTE: quote is a syntax, quasiquote and unquote are implemented on boot.lisp.

name overloads note
gensym () -> sym Must not overlap with symbol literals

Control operators

name overloads
apply (proc, list) -> _
call/cc (proc) -> _
never (proc, ..._) -> !

Test operators

name overloads note
num? (_) -> bool
sym? (_) -> bool
str? (_) -> bool
cons? (_) -> bool
nil? (_) -> bool
bool? (_) -> bool
proc? (_) -> bool Built-in functions and user functions
meta? (_) -> bool Language syntax and user macros
vec? (_) -> bool

Arithmetic operators

name overloads
+ (...num) -> num
- (num, ...num) -> num
* (...num) -> num
/ (num, ...num) -> num
% (num, ...num) -> num

Relational operators

name overloads
= (..._) -> bool
< (...num) -> bool, (...str) -> bool
> (...num) -> bool, (...str) -> bool
<= (...num) -> bool, (...str) -> bool
>= (...num) -> bool, (...str) -> bool

str operators

String encoding is not specified, but is expected to be ASCII compatible through these operators. This requirement is usually satisfied by using a Unicode scalar sequence or byte sequence.

name overloads note
str (...num) -> str | ! Each num corresponds to a character. Raises an error on invalid characters
str-char-at (str, num) -> num | nil Takes an index. Returns a character
str-length (str) -> num Returns character count
str-concat (...str) -> str
substr (str, num, num) -> str | ! Takes a str, index, and length. Raises an error on out of range
sym->str (sym) -> str
num->str (num) -> str
str->num (str) -> num | nil

vec operators

vec is a type for minimal support of mutable/fixed-length sequential data.

name overloads note
vec (..._) -> vec
vec-make (num, _) -> vec Takes a length and initial value
vec-length (vec) -> num
vec-get (vec, num) -> _ | nil Get by index
vec-set! (vec, num, _) -> nil | ! Set by index. Raises an error on out of range
vec-copy! (vec, num, vec, num, num) -> nil | ! Takes a dest, dest-start-index, src, src-start-index, and length

I/O operators

name overloads note
read-file-text (str) -> result<str> Takes a file path. Returns contents of the file
write-file-text (str, str) -> result<nil> Takes a file path and contents
read-console-line () -> result<str | nil> No line feed at end. Returns nil if terminated
write-console (str) -> result<nil>

Misc

name overloads note
args () -> list Returns command line arguments as list of str
eval (_) -> result<_> Evaluates a S-expression
macroexpand (_) -> result<_> Perform macro expansion
macroexpand-1 (_) -> result<_> Perform one step macro expansion

Tests

There are test suites for the functionalities of the parser, the compiler, and the VM. Rosetta Lisp's bootstrapping code includes unit tests for each built-in function, as comments.

(def cons (builtin cons))
;! > (cons 1)
;! fail
;! > (cons 1 2)
;! (1 . 2)
;! > (cons 1 2 3)
;! fail

Since it's troublesome to parse these comments, there is a unified, easy-to-parse test file available. This test file is generated by ./gentest.

About

A set of Lisp-1 implementations that share the VM instruction set, built-in functions, and the bootstrapping code

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages