There has been a lot of talking about the problems of one-man open source projects and the bus, but little is said about the story from the other side. Like when one has too many ideas and too many goals and knows how to achieve them but fails on finding the 25th hour of the day or the two additional hands and another half a brain to code what has to be coded.
That said, right now the highest priority in ECL is to get the library component stable and going, and to solve problems like stability against interrupts or against stack overflows, or to optimize key functions and fix a bunch of bugs that are left around. That means this all thing with the compiler gets slowed down significantly.
On the other hand when looking at optimizations one sees that most of the library is quite fast, but the current lisp compiler is failing to identify simple types, simple inline forms, unless a lot of declarations are spread around the code. ECL does for instance pretty bad at the cl-bench benchmarks. We know what the problem is, we know the solution: move forward with the new compiler.
So some more work has been done. I have fixed the “Simple Language” or “Simple Lisp” if you want to call it that way, which is a set of s-expressions that will make our intermediate language for the compiler. It is simple enough and consists only on assignments to variables and three kinds of jumps that will be probably reduced to two. You can find the specification at the beginning of ssa-translate.lisp which is a rather new addition to the code.
Some of the names are pretty obvious, including JMP, JMP-IF-NOT, SET and CALL. The first two are jumps, with or without a condition, and the second two are assignments to a variable, either direct or from the output of a function. The creation of variables is of three types:
- We just assign it with SET, which means that the variable is really local to the function.
- We bind it to the lexical environment in two steps, first growing the environment with GROW-ENV and the assigning a value. Here UNBIND-LEXICAL is used, but it will be little more than a mark for when a variable stops being used.
- We bind it as a special variable with BIND-GLOBAL and then after some statements undo the binding with UNBIND-GLOBAL.
Note that the lexical environment is just treated as a set of variables, which keep an object that is grown or shrinked for adding new variables that can be closed over by children functions. These environment variables are already in SSA forms which means once we know how many variables are really used in closures we will be able to do a simplification pass and eliminate most of them.
The final ingredient is SETJMP/LONGJMP, which implement exactly what they name: unconditional jumps across function boundaries. They have these names because they are implemented using C’s setjmp() and longjmp(), but do a little more: they allow to pass a number of variables and the destination of the jumps is discriminated using some unique ids. We use these special operators to implement RETURN, nonlocal GOs, CATCH, UNWIND-PROTECT, etc. There is no innovation here, for this is indeed the mechanism used by ECL right now.
With all these ingredients we are currently able to translate almost any Common Lisp form to this intermediate language. Whatever is not provided by these forms can be implemented using macros and function calls. This is for instance how we encode the parsing of arguments. Take for instance
;;;
;;; Compiling:
;;; (LET ((X 1))
;;; #'(LAMBDA (Y) (+ X Y)))
;;; [...]
;;; Function name: ANONYMOUS-2
;;; Required arguments: 1
CALL <TMP-2> [GLOBAL/GET-REQUIRED] 0
GROW-ENV <ENV-3> <Y> NIL
SET <Y> <TMP-2>
SET <TMP-3> <X>
SET <TMP-4> <Y>
CALL <MV-635> [GLOBAL/+] <TMP-3> <TMP-4>
UNBIND-LEXICAL Y
At some later point we can inline the call to GLOBAL/GET-REQUIRED, but for all purposes we need not add special forms to this simple language and complicate the compiler.
Right now the translator is pretty functional and allows for almost everything that we contemplate in the current compiler. To see a few examples, just load “ssa.lisp” into your favourite lisp implementation and let it run.
The following steps to take are, in some random order, identifying function closures, simplification of GROW-ENV statements, splitting of code into connected blocks forming possibly cyclic graphs, eliminating dead code, eliminating unused variables, introducing unique names for the variables (i.e. SSA form), introducing phi forms, finding out variable lifetimes, first code emitter…