The Lisp Machine is an extensible Lisp Machine embedded into the Primus Machine. The Lisp machine is used to provide function stubs (summaries), as well as to control the Primus Machine using a dialect of Lisp.
Primus Lisp is a dialect of Lisp, that can be used to interact with a native program. Primus Lisp is close to Common Lisp and to the Emacs Lisp dialect.
Primus Lips is a low-level language that doesn't provide many abstractions, as it tries to be as close to the machine language as possible. In that sense Primus Lisp can be seen as an assembler, except that it can't really assemble binaries, as it operates over already existing and assembled program. Primus Lisp, however is still quite powerfull, as the absence of suitable abstractions is compensated with powerful and versatile meta-programming system.
Primus Lisp is primarily used for the following tasks:
A Primus Lisp program is a file, that can contain the following entities:
The entities may be specified in any order, however the above order constitutes a good programming practice.
Each file provides (implements) a feature, that has the same
name as the name of the file without an extension and
directories. Thus the namespace of features is flat. A feature
is usually a function, macro definition, or any other
definition, or a collection of definition, gathered under the
same theme. For example, the
getopt feature implements C
getopt function, and accompanying definitions. The features
maybe very specific, i.e., providing an implementation for
only one small function, or they can be a collection of other
features. For example, the
posix feature provides an
implementation of all functions specified in the POSIX
standard (not all at the time of writing).
A collection of files is called a library. To use features
provided by another file, the file should be requested with
(require <ident>) form, where
<ident> is the name of
the feature. A file with the requested name is searched in the
library, and loaded, making all its definitions available in
the lexical scope, that follows the
require form. If a
feature is already provided, then nothing
happens. Dependencies should not contain cycles.
Top-level declarations specify attributes that are shared by all definitions in a file. For example, a declaration
(declare (context (arch armv7))
makes all definitions visible only in the context of the ARMv7 architecture.
Constants and substitutions are primitive abstractions that
give names to code fragments. Macros are program
transformations. Functions add parameters to a code, and are
basic building blocks. Scope of all definitions can be limited
with the context declarations. Finally, a function can be
advised with another function using the
A type defines all possible values of an expression. In Primus
Lisp, expression values can be only scalar, i.e., machine
words of different widths. The width is always specified in
the number of bits. A maximum width of a word is equal to the
width of the architecture machine word, thus a family of types
is dependent on the context of evaluation. (Note, current
implementation limits maximum width of the machine word to 64
bits). We denote a type of expression with a decimal number,
(exp : 16) means that an expression ranges over all 16
An expression can have a polymorphic type
t that denotes a
powerset of all types for the given architecture. For example,
for ARMv7, t = 32 \/ 31 \/ .. \/ 1 . Thus a value of any
type, is a also a value of type
Side note -- the type system doesn't include the unit type,
0 type. An expression
() evaluates to the
Functions are named abstractions of code, where a code is a sequence of expressions. Since a value of an expression is a machine word, functions are not first-class values in Primus Lisp. However, functions and types can be manipulated on the meta-programming level.
A function is defined with the
defun form, that has the
(defun <name> (<arg> ...) <exp> ...)
A list of arguments (that can be empty) defines function arity. Functions in Primus Lisp has fixed arity, unlike macros.
A function definition may optionally contain a documentation strings and a declaration section. For example,
(defun strlen (p) "returns a length of the null-terminated string pointed by P" (declare (external "strlen")) (msg "strlen was called with $p") (let ((len 0)) (while (not (points-to-null p)) (incr len p)) len))
A function can be called (applied) using the function application form:
(<name> <exp> ...)
The first element of the function application form is not an expression and must be an identified. The rest arguments are expressions, that are evaluated from left to right. All arguments are passed by value.
The body of a function is a sequence of expressions, that is evaluated in the lexical order (i.e., from left to right). A value of the last expression is the result of the function evaluation. An expression is either a function application or or a special form. Primus Lisp defines only 5 special forms, the rest of the syntax is defined using the macro system.
The (if <test-expr> <then-expr> <else-expr> ...) form is a basic control flow structure. If <test-expr> evaluates to a non-zero word then the result of the <if> form is the result of evaluation of the <then-expr>, otherwise a sequence of <else-expr> ... is evaluated and the result of the form evaluation would be a result of the last expression in a form
(if (< 4 3) (msg "shouldn't happen") (msg "that's right") (- 4 3))
Note that the the <else-expr> sequence maybe empty.
Several derived forms are defined as macros, e.g.,
(when <cond> <expr> ...) (or <expr> ...) (and <expr> ...)
Iterations can be implemented either using recursion or with
while special form. Since the interpreter doesn't
provide the tail-call optimization it is better to use the
latter (although the interpreter itself is using a constant
stack size, as it uses the host language heap memory to
represent the Primus Lisp call stack).
The (while <cond> <expr> ...) form, will evaluate the
<cond> expression first, and if it is a non-zero value, then
the sequence of expressions <expr> ... is evaluated, and
the value of the last expression becomes the value of the
while form. If the value of the <cond> expression is a
false value, then this value becomes the value of the
let form binds values to names in the lexical scope.
(let (<binding> ...) <body-expr> ...) binding ::= (<var> <expr>)
Evaluates each <binding> in order binding the <var> identifier to a result of <expr>. The newly created binding is available in consequent bindings and in the <body-expr>, but is not visible outside of the scope of the let-form.
(let ((x 4) (y (+ x 2))) (+ x 3))
The value of the
let form is the value of the last
expression <sN> .
The (prog <expr> ...) form combines a sequence of
expressions into one expression, and is useful in the contexts
where an expression is required. The expressions are evaluated
from left to right, and the value of the
prog form is the
value of the last expression.
The (msg <fmt> <expr> ...) form constructs logging/debugging messages using an embedded formatting language. The formed message will be sent to the logging facility, that was set up during the Primus Lisp library initialization.
The format language interprets all symbols literally, unless they start with the dollar sign ($).
A pair of characters of the form $<n> , where <n> is a decimal digit, will be substituted with the value of the n'th expression (counting from zero).
(msg "hello, $0 $0 world, (+ 7 8) = $1" "cruel" (+ 7 8))
will be rendered to a message:
"hello, cruel cruel world, (+ 7 8) = 15"
Ordinary Primus Lisp expressions are evaluated at the runtime in the Primus emulator, and are quite limited as they need to be evaluated directly on the CPU model. To mitigate this limitation, Primus Lisp provides a powerful metaprogramming system. The metaprogram is evaluated when the Primus Lisp program is read. A metaprogram generates a program, that will be evaluated by the CPU. The metaprogram itself is Turing complete, thus any transformation can be applied to a program. The Primus Lisp metaprogramming system use term-rewriting as a computational model, with Lisp code fragments as terms. Primus Lisp provides three facilities for metaprogramming:
The syntactic constants is the simplest syntactic
substitution, it just substitutes atoms for atoms. Constants
are introduced with the
defconstant form, that has the
(defconstant <name> <atom>) (defconstant <name> <docstring> <atom>) (defconstant <name> <declarations> <atom>) (defconstant <name> <docstring> <declarations> <atom>)
(defconstant main-address 0xDEAD)
During the program parsing, each occurrence of the <name> term will be rewritten with the <value> term, that should be an atom.
The syntactic substitution is a generalization of syntactic constant, and has quite a similar syntax:
| (defsubst <name> <value> ...) | (defsubst <name> <declarations> <value> ...) | (defsubst <name> :<syntax> <value> ...) | (defsubst <name> <declarations> :<syntax> <value> ...)
During parsing, every occurrence of the term <name> (that should be an atom), will be rewritten with a sequence of values
(defsubst ten-digits 0 1 2 3 4 5 6 7 8 9)
A process of applying of the substitutions is called "expansion". Since the expansion transforms an atom to a list of atoms, it can be applied only inside the macro or function application. For example,
will be expanded to
(+ 0 1 2 3 4 5 6 7 8 9 )
Expansions also provide a support for extensible value
specification syntax, that enables domain-specific data
specification languages. Currently, we support only two
:ascii syntax the values should be atoms, possibly
delimited with double quotes. Each character of each atom will
be expanded to its corresponding ASCII code. Strings can
contain special characters prefixed with a backslash. The
special character can be one of the well-known ASCII special
\r, etc, or it can be a decimal or a
hexadecimal code of a character.
Example, given the following substitution:
(defsubst hello-cruel-world :ascii "hello, cruel world\n\000")
the following application:
(write-block SP hello-cruel-world
will be expanded with
(write-block SP 0x68 0x65 0x6c 0x6c 0x6f 0x2c 0x20 0x63 0x72 0x75 0x65 0x6c 0x20 0x77 0x6f 0x72 0x6c 0x64 0x0a 0x00)
:hex syntax the sequence of atoms is split into
two-characters subsequences each treated as a hex value. This
syntax is useful for encoding memory dumps in a format that is
close to the hexdump (without offsets). E.g., given the
following substitution rule
(defsubt example :hex 68656c 6c 6f2c2063)
(write-block SP example)
will be expanded into
(write-block SP 0x68 0x65 0x6c 0x6c 0x6f 0x2c 0x20 0x63)
The macros provide the most versatile and powerful way to specify arbitrary code transformations. The macro definitions introduce abstractions on the meta-programming level. I.e., it allows a programmer to write a function that operates on code terms, making the code a first class value.
The macro definition has the following syntax:
(defmacro <name> (<param> ...) <value>) (defmacro <name> (<param> ...) <docstring> <value>) (defmacro <name> (<param> ...) <declarations> <value>) (defmacro <name> (<param> ...) <docstring> <declarations> <value>)
A macro definition adds a term rewriting rule, that rewrites
each occurrence of (<name> <arg> ...) , where the number
N is greater or equal then the number of
M, with the <value> in which occurrences of the
ith parameter is substituted with the term
ith argument. If
M, then the last parameter is bound with the
sequence of arguments <argM>...<argN> .
The macro subsystem doesn't provide any specific looping or control-flow facilities, however, the macro-overloading mechanism along with the recursion make it possible to encode arbitrary meta-transformations.
Other than a standard context-based ad-hoc overloading mechanism, the macro application uses the arity-based resolution. As it was described above, if a number of arguments is greater than the number of parameters, then the last parameter is bound to the rest of the arguments. When several macro definitions matches, then a definition that has fewer unmatched arguments is chosen. For example, suppose we have the following definitions:
(defmacro list-length (x) 1) (defmacro list-length (x xs) (+ 1 (list-length xs)))
The the following term
(list-length 1 2 3)
will be normalized (after a series of transformations) with the following:
(+ 1 (+ 1 1))
1: (list-length 1 2 3) => (+ 1 (list-length 2 3)) 2: (+ 1 (list-length 2 3)) => (+ 1 (+ 1 (list-length 3))) 3: (+ 1 (+ 1 (list-length 3))) => (+ 1 (+ 1 1))
In the first step, both definition match. In the first
x is bound to
1 2 3, while in the second
x is bound to
xs is bound to
2 3. Since the last
parameter is bound to fewer arguments, the second definition
is chosen as the most certain. In the second step the second
definition is still more concrete. Finally at the last step,
the second definition doesn't match at all, as it has more
parameters than arguments.
A slightly more complex example, is a fold iterator, that applies a function to a sequence of arguments of arbitrary length, e.g.,
(defmacro fold (f a x) (f a x)) (defmacro fold (f a x xs) (fold f (f a x) xs))
Using this definition we can define a sum function (although
it is not needed as the
+ function defined in the Primus Lisp
standard library, already accepts arbitrary number of
(defmacro sum (xs) (fold + 0 xs))
The (sum 1 2 3) will be rewritten as follows:
1: (sum 1 2 3) => (fold + 0 1 2 3) 2: (fold + 0 1 2 3) => (fold + (+ 0 1) 2 3) 3: (fold + (+ 0 1) 2 3) => (fold + (+ (+ 0 1) 2) 3) 4: (fold + (+ (+ 0 1) 2) 3) => (+ (+ (+ 0 1) 2) 3)
A more real example is the
write-block macro, that takes a
sequence of bytes, and writes them starting from the given
(defmacro write-block (addr bytes) (fold memory-write addr bytes))
The definition uses the
memory-write primitive, that writes
a byte at the given address and returns an address of the next
Primus Lisp provides both kinds of polymorphism: parametric and ad hoc.
Expressions in Primus Lisp have types, that are denoted with natural numbers starting with one. Each type defines a set of values that can be represented with the given number of bits. Values with different widths are different, even if they represent the same number. Expressions can be polymorphic, e.g., function
(defun square (x) ( * x x))
has type `forall n. n -> n -> n`. Thus it can be applied to
values of different types, e.g.,
(square 4:4), that will be
evaluated to the
0:4 value, or
(square 4:8), that will be
16:8, etc. The parametric polymorphism doesn't
require any special annotations or type specifications so we
will not stop on it anymore.
The ad hoc polymorphism provides a facilities for overloading definitions. That is, the same entity may have multiple definitions, and depending on a context, only one definition should be chosen. Not only functions can have multiple definitions, but also macros, constants, and substitutions. Since the latter three entities operate on the syntactic level, the syntax of Primus Lisp itself is context-dependent.
A context (from the perspective of the type system) is a set of type class instances. The context is fixed when a program is parsed. A Primus Lisp program may not change the context; neither in runtime, nor it the parse time, thus a program is parsed and evaluated at the specific context. However, a definition may declare that it makes sense only in some context. If more than one definition make sense under the given context, then the most specific one is chosen. If no definition is more specific than another, then an error occurs.
A type class defines a type as a set of features. The subset
relation induces a subtyping relation over types - a type
is a subtype of a type
t' <= t (i.e., if
t' is a
t). Each feature is a textual tag, called a
The context declaration limits an associated definition to the specified type class(es), and has the following syntax:
(declare (context (<type-class> <feature> ...) ...))
Let's use the following two definitions for a concrete example,
(defmacro get-arg-0 () (declare (context (arch arm gnueabi))) R0) (defmacro get-arg-0 () (declare (context (arch x86 cdecl))) (read-word word-width (+ SP (sizeof word-width))))
We have two definitions of the same macro
are applicable to different contexts. The first definition,
is only applicable in the context of the ARM architecture and
the gnueabi ABI. The second is applicable in the context of
the x86 architecture and the cdecl ABI. More formally, a
definition is considered only if its context is a subtype of
the current type context.
Primus Lisp also provides a mechanism for non-intrusive extending existing function definitions. An existing definition maybe advised with another definition. A piece of advice maybe added to a function that will be called either before or after the evaluation of an advised function, e.g.,
(defun memory-written (a x) (declare (advice :before memory-write)) (msg "write $x to $a"))
This definition not only defines a new function called
memory-written, but also proclaims it as advice function to
memory-write function that should before it is called.
If an advisor is attached before the advised function, then it the advisor will be called with the same arguments as the advised function. The return value of the advisor is ignored. The advisor function will be called as a normal Lisp function, with all expected overloading and name resolving. So it is possible to provide context specific advice. If there are several advice to the same function, then they will be called in the unspecified order.
An advisor that is attached after the advised function will be called with one extra argument - the result of evaluation of the advised function. The value returned by the advisor will override the result of the advised function. If there are several advisors attached after the same function, then they will be called in the unspecified order.
The Primus Observation system is reflected onto the Primus
Lisp Machine Signals. Every time a reflected observation
occurs the Lisp Machine receives a signal that is dispatched
to handlers. A handler can be declared defined with the
defmethod form, e.g.,
(defmethod call (name arg) (when (= name 'malloc) (msg "malloc($0) was called" arg)))
defmethod form follows the general definiton template,
i.e., it can contain a docstring and declaration section, and
selection and resolution rules are applicable to
methods. Methods of the same signal are invoked in an
Each entity is an s-expression with the grammar, specified
below.We use BNF-like syntax with the following conventions.
Metavariables are denoted like
stands of any number of
<this> (possibly zero). Ordinary
parentheses do not bear any notation, and should be read
literally. Note, since the grammar is not context free, and is
extensible, the following is an approximation of the language
grammar. Grammar extension points are defined with the
'?extensible?'comment in a production definition.
module ::= <entity> ... entity ::= | <feature-request> | <declarations> | <constant-definition> | <substitution-definition> | <parameter-definition> | <macro-definition> | <function-definition> | <method-definition> feature-request ::= (require <ident>) declarations ::= (declare <attribute> ...) constant-definition ::= | (defconstant <ident> <atom>) | (defconstant <ident> <atom> <docstring>) | (defconstant <ident> <atom> <declarations>) | (defconstant <ident> <atom> <declarations> <docstring>) parameter-definition ::= | (defparameter <ident> <atom>) | (defparameter <ident> <atom> <docstring>) | (defparameter <ident> <atom> <declarations> <docstring>) substitution-definition ::= | (defsubst <ident> <atom> ...) | (defsubst <ident> <declarations> <atom> ...) | (defsubst <ident> :<syntax> <atom> ...) | (defsubst <ident> <declarations> :<syntax> <atom> ...) macro-definition ::= | (defmacro <ident> (<ident> ...) <exp>) | (defmacro <ident> (<ident> ...) <docstring> <exp>) | (defmacro <ident> (<ident> ...) <declarations> <exp>) | (defmacro <ident> (<ident> ...) <docstring> <declarations> <exp>) function-definition ::= | (defun <ident> (<var> ...) <exp> ...) | (defun <ident> (<var> ...) <docstring> <exp> ...) | (defun <ident> (<var> ...) <declarations> <exp> ...) | (defun <ident> (<var> ...) <docstring> <declarations> <exp> ...) method-definition ::= | (defmethod <ident> (<var> ...) <exp> ...) | (defmethod <ident> (<var> ...) <docstring> <exp> ...) | (defmethod <ident> (<var> ...) <declarations> <exp> ...) | (defmethod <ident> (<var> ...) <docstring> <declarations> <exp> ...) exp ::= | () | <var> | <word> | <sym> | (if <exp> <exp> <exp> ...) | (let (<binding> ...) <exp> ...) | (set <var> <exp>) | (while <exp> <exp> <exp> ...) | (prog <exp> ...) | (msg <format> <exp> ...) | (<ident> <exp> ...) binding ::= (<var> <exp>) var ::= <ident> | <ident>:<size> attribute ::= | (external <ident> ...) | (context (<ident> <ident> ...) ...) | (<ident> ?ident-specific-format?) docstring ::= <text> syntax ::= :hex | :ascii | ... atom ::= <word> | <text> word ::= ?ascii-char? | <int> | <int>:<size> sym ::= '<atom> int ::= ?decimal-octal-hex-or-bin format? size ::= ?decimal? ident ::= ?any atom that is not recognized as a <word>?
an abstract type representing a lisp program
an abstract type that represents messages send with the
Primus Lisp program loader
Lisp Type System.
Lisp Machine Message interface.
module type Closure =
Machine independent closure.
a closure packed as an OCaml value
module type Primitives =
a list of primitives.
val message :
message observation occurs every time a message is sent
from the Primus Machine.
Make(Machine) creates a Lisp machine embedded into the