Shirp is a tree-walk interpreter of a tiny subset of Scheme programming language. I use the R7RS-small documentation as a reference, and the behavior of the built-in functions mimics that of Gauche, an existing R7RS implementation.
Running make
in the root directory will build the interpreter executable named shirp
. You can start REPL by typing ./shirp
.
Shirp REPL prints >>
to prompt user input.
If the REPL receives incomplete user input(has unclosed parentheses), it prints ..
to prompt further input.
If you want to run a script, pass the file path to the executable shirp
, for example, ./shirp test.scm
.
This command will run the script and start the REPL with the environment generated by the script interpretation.
You can also run a script by calling load
function in the REPL, for example, by calling (load "test.scm")
.
Note that you need to run the interpreter in the directory where prelude.scm
is located.
prelude.scm
is a library file that defines some of the built-in procedures.
You can find many examples of Shirp executing Scheme programs in the test
directory, namely in test/expr.c
, test/kadai1.c
, test/kadai2.c
, test/quote.c
.
You can run these tests by typing make test
in the root directory.
Here are some selected examples of Shirp REPL executions.
>> (define a 42)
>> (define b (+ 10 1))
>> a
42
>> b
11
>> (if (< a b) 10 20)
20
>> (define g (lambda (a b) (+ a b)))
>> (g 3 5)
8
>> (define l (cons 1 (cons 2 (cons 3 (list)))))
>> l
(1 2 3)
>> (cadr l)
2
>> (define l (cons 4 (cons 5 '())))
>> l
(4 5)
>> (define l '(+ (+ 1 2) (- 7 8)))
>> l
(+ (+ 1 2) (- 7 8))
>> (define (fact n) (if (< n 2) 1 (* n (fact (- n 1)))))
>> (fact 5)
120
>> (define (fact-rec n acc) (if (< n 2) acc (fact-rec (- n 1) (* n acc)))
.. )
>> (fact-rec 5 1)
120
>> (cond (else 42))
42
>> (cond (#f 2) (#f))
#<undef>
>> (cond (4) (7) (else 88))
4
>> (cond (else 1 23 4 56 7))
7
>> (= 42 42 42 42 42)
#t
>> (= 42 42 42 -42 42)
#f
>> (< 1 2 3 4 5)
#t
>> (< 2 1 3 5 4)
#f
>> (> 1 2 3 4 5)
#f
>> (> 5 4 3 2 1)
#t
>> (>= 5 4 4 3 3 2 1)
#t
>> (even? 9)
#f
>> (odd? -11)
#t
>> (cadadr (cadadr (cadadr '(1 (2 (4 (5 (6 (7 3)))))))))
3
>> (reverse '(1 2 (3 4) 5 (6 7 (8 9) 10) 11))
(11 (6 7 (8 9) 10) 5 (3 4) 2 1)
>> (define lst1 (list 1 2 3))
>> (define lst2 (list 1 2 3))
>> (eq? lst1 lst2)
#f
>> (equal? lst1 lst2)
#t
>> (define a 100)
>> (define h (lambda (x) (lambda (y) (+ x (* a y)))))
>> (let ((a 0)) ((h 1) (+ a 5)))
501
>> (define f (+ h 7))
error: argument expected to be a number, but got `closure`
(define f (+ h 7))
^
>> unknowon_func
error: undefined variable: unknowon_func
unknowon_func
^~~~~~~~~~~~~
As shown in the examples above, Shirp supports the following fundamental features.
<identifier>
can contain letters, digits, and special characters !$%&*+-./:<=>?@^_~
.
<number>
is <integer>
or <floating>
.
<integer>
is a sequence of digits with an optional prefix character +
or -
.
<floating>
is a sequence of digits with a single .
and an optional prefix character +
or -
.
<string>
is a sequence of characters enclosed in double quotes "
. No escaping sequences are implemented.
<boolean>
is #t
, #true
, #f
or #false
.
All integers are represented as int64_t
, and all floating numbers as double
.
(eq? -.0 .0)
evaluates to #t
.
Shirp is intended to parse the following Extended Backus-Naur Form (EBNF) grammar.
<program> := <command or definition>+
<command or definition> := <command> | <definition>
<command> := <expression>
<definition> := (define <identifier> <expression>) | (define (<identifier> <def formals>) <body>)
<def formals> := <identifier>* | <identifier>* . <identifier>
<expression> := <identifier> | <literal> | <procedure call> | <lambda expression>
| <conditional> | <assignment> | <derived expression>
<literal> := <quotation> | <self-evaluating>
<self-evaluating> := <boolean> | <number> | <string>
<quotation> := '<datum> | (quote <datum>)
<procedure call> := (<operator> <operand>*)
<operator> := <expression>
<operand> := <expression>
<lambda expression> := (lambda <formals> <body>)
<formals> := (<identifier>*) | (<identifier>+ . <identifier>)
<body> := <definition>* <sequence>
<sequence> := <command>* <expression>
<conditional> := (if <test> <consequent> <alternate>)
<test> := <expression>
<consequent> := <expression>
<alternate> := <expression> | <empty>
<assignment> := (set! <identifier> <expression>)
<derived expression> := (cond <cond clause>+) | (cond <cond clause>* (else <sequence>))
| (and <test>*) | (or <test>*) | (let (<binding spec>*) <body>)
<cond clause> := (<test> <sequence>) | (<test>)
<binding spec> := (<identifier> <expression>)
<datum> := <simple datum> | <compound datum>
<simple datum> := <boolean> | <number> | <string> | <symbol>
<symbol> := <identifier>
<compound datum> := <list>
<list> := (<datum>*) | (<datum>+ . <datum>)
However, Shirp cannot evaluate <def formals> -> <identifier>* . <identifier>
and
<formals> -> (<identifier>+ . <identifier>)
correctly as of now.
Shirp represents the Scheme Environment as a stack of Frames. Each Frame is a hash table that maps identifiers to values.
Typically, when a procedure is called, a new Frame is pushed to the stack to save the local variables. Then, when the procedure returns, the Frame is popped from the stack to restore the previous environment.
Shirp supports proper tail recursion, which means that the interpreter does not push a new frame on the environment for procedure calls in tail contexts.
According to the R7RS-small documentation,
The last expression within the body of a lambda expression, shown as below, occurs in a tail context.
(lambda <formals> <definition>* <expression>* <tail expression>)
As is defined in the R7RS-small documentation, Shirp supports the following tail contexts.
(if <expression> <tail expression> <tail expression>)
(if <expression> <tail expression>)
(cond <cond clause>*)
(cond <cond clause>* (else <tail sequence>))
(and <expression>* <tail expression>)
(or <expression>* <tail expression>)
(let <binding spec>* <tail body>)
<cond clause> := (<test> <tail sequence>)
<tail body> := <definition>* <tail sequence>
<tail sequence> := <expression>* <tail expression>
Tail context search is implemented as a depth-first search in the AST(Abstract Syntax Tree).
+
-
*
/
=
<
>
<=
>=
div
remainder
car
cdr
null?
pair?
list?
symbol?
number?
list
cons
and
or
eq?
equal?
sqrt
load
even?
odd?
abs
cadr
caar
cddr
cadadr
caaddr
gcd
reverse
-
/
anddiv
works differently from the Scheme standard for integers. -
eq?
internal does not conform to the Scheme standard.
A very simple mark-and-sweep garbage collector is implemented in Shirp. Currently, it collects objects every time the interpreter finishes evaluating user REPL input. Thus, when the user gives many inputs to the REPL, Shirp takes a long time to respond to every user input. There is much room for improvement.