This document was motivated by a request from Paolo Amoroso for notes or other documentation on my work on SBCL. It's intended for developers who are familiar with the guts of CMU CL, as an overview of the changes made to CMU CL in order to produce SBCL. It was written for the initial release (sbcl-0.5.0) and has not been updated since then. There are two sections in this report: I. non-fundamental changes II. fundamental changes In this context, fundamental changes are changes which were directly driven by the goal of making the system bootstrap itself. Section I: non-fundamental changes Before I describe the fundamental changes I had to make in order to get the system to bootstrap itself, let me emphasize that there are many non-fundamental changes as well. I won't try to summarize them all, but I'll mention some to give some idea. (Some more information about why I made some of these changes is in the PRINCIPLES file in the distribution.) Many, many extensions have been removed. Packages have all been renamed; in the final system, the system packages have names which begin with "SB-". Mostly these correspond closely to CMU CL packages, e.g. the "C" package of CMU CL has become the "SB-C" package, and the "EXTENSIONS" package of CMU CL has become the "SB-EXT" package. Some other definitions and declarations have been centralized, too. E.g. the build order is defined in one place, and all the COMMON-LISP special variables are declared in one place. I've made various reformatting changes in the comments, and added a number of comments. INFO is now implemented as a function instead of a macro, using keywords as its first and second arguments, and is no longer in the extensions package, but is considered a private implementation detail. The expected Lisp function arguments and command line arguments for SAVE-LISP (now called SAVE-LISP-AND-DIE) and loading the core back into a new Lisp have changed completely. The SB-UNIX package no longer attempts to be a complete user interface to Unix. Instead, it's considered a private part of the implementation of SBCL, and tries to implement only what's needed by the current implementation of SBCL. Lots of stale conditional code was deleted, e.g. code to support portability to archaic systems in the LOOP and PCL packages. (The SB-PCL and SB-LOOP packages no longer aspire to portability.) Various internal symbols, and even some externally-visible extensions, have been given less-ambiguous or more-modern names, with more to follow. (E.g. SAVE-LISP becoming SAVE-LISP-AND-DIE, both to avoid surprising the user and to reserve the name SAVE-LISP in case we ever manage to implement a SAVE-LISP which doesn't cause the system to die afterwards. And GIVE-UP and ABORT-TRANSFORM have been renamed to GIVE-UP-IR1-TRANSFORM and ABORT-IR1-TRANSFORM. And so on.) Various internal names "NEW-FOO" have been changed to FOO, generally after deleting the obsolete old version of FOO. This has happened both with names at the Lisp level (e.g. "NEW-ASSEM") and at the Unix filesystem level (e.g. "new-hash.lisp" and "new-assem.lisp"). A cultural change, rather than a technical one: The system no longer tries to be binary compatible between releases. Per-file credits for programs should move into a single centralized CREDITS file Real Soon Now. A lot of spelling errors have been corrected.:-) Section II. fundamental changes There were a number of things which I changed in order to get the system to boot itself. The source files have been extensively reordered to fix broken forward references. In many cases, this required breaking one CMU CL source file into more than one SBCL source file, and scattering the multiple SBCL source files into multiple places in the build order. (Some of the breakups were motivated by reasons which no longer exist, and could be undone now, e.g. "class.lisp" could probably go back into "classes.lisp". But I think most of the reasons still apply.) The assembler and genesis were rewritten for portability, using vectors for scratch space instead of using SAPs. We define new readmacro syntax #!+ and #!- which acts like the standard #+ and #- syntax, except that it switches on the target feature list instead of the host feature list. We also introduce temporary new features like :XC-HOST ("in the cross-compilation host") and :XC ("in the cross-compiler") which will be used to control some of the behavior below. A new package SB-XC ("cross-compiler") was introduced to hold affecting-the-target versions of various things like DEFMACRO, DEFTYPE, FIND-CLASS, CONSTANTP, CLASS, etc. So e.g. when you're building the cross-compiler in the cross-compilation host Lisp, SB-XC:DEFMACRO defines a macro in the target Lisp; SB-XC:CONSTANTP tells you whether something is known to be constant in the target Lisp; and SB-XC:CLASS is the class of an object which represents a class in the target Lisp. In order to make everything work out later when running the cross-compiler to produce code for the target Lisp, SB-XC turns into a sort of nickname for the COMMON-LISP package. Except it's a little more complicated than that.. It doesn't quite work to make SB-XC into a nickname for COMMON-LISP while building code for the target, because then much of the code in EVAL-WHEN (:COMPILE-TOPLEVEL :EXECUTE) forms would break. Instead, we read in code using the ordinary SB-XC package, and then when we process code in any situation other than :COMPILE-TOPLEVEL, we run it through the function UNCROSS to translate any SB-XC symbols into the corresponding CL symbols. (This doesn't seem like a very elegant solution, but it does seem to work.:-) Even after we've implemented the UNCROSS hack, a lot of the code inside EVAL-WHEN forms is still broken, because it does things like CL:DEFMACRO to define macros which are intended to show up in the target, and under the new system we really need it to do SB-XC:DEFMACRO instead in order to achieve the desired effect. So we have to go through all the EVAL-WHEN forms and convert various CL:FOO operations to the corresponding SB-XC:FOO operations. Or sometimes instead we convert code a la (EVAL-WHEN (COMPILE EVAL) (DEFMACRO FOO ..)) (code-using-foo) into code a la (MACROLET ((FOO ..)) (code-using-foo)) Or sometimes we even give up and write (DEFMACRO FOO ..) (code-using-foo) instead, figuring it's not *that* important to try to save a few bytes in the target Lisp by keeping FOO from being defined. And in a few shameful instances we even did things like #+XC (DEFMACRO FOO ..) #-XC (DEFMACRO FOO .. or #+XC (code-using-foo) #-XC (other-code-using-foo) even though we know that we will burn in hell for it. (The really horribly unmaintainable stuff along those lines is three compiler-building macros which I hope to fix before anyone else notices them.:-) In order to avoid trashing the host Common Lisp when cross-compiling under another instance of ourself (and in order to avoid coming to depend on its internals in various weird ways, like some systems we could mention but won't:-) we make the system use different package names at cold init time than afterwards. The internal packages are named "SB!FOO" while we're building the system, and "SB-FOO" afterwards. In order to make the system work even when we're renaming its packages out from underneath it, we need to seek out and destroy any nasty hacks which refer to particular package names, like the one in %PRIMITIVE which wants to reintern the symbols in its arguments into the "C"/"SB-C"/"SB!C" package. Incidentally, because of the #! readmacros and the "SB!FOO" package names, the system sources are unreadable to the running system. (The undefined readmacros and package names cause READ-ERRORs.) I'd like to make a little hack to fix this for use when experimenting with and maintaining the system, but I haven't gotten around to it, despite several false starts. Real Soon Now.. In order to keep track of layouts and other type and structure information set up under the cross-compiler, we use a system built around the DEF!STRUCT macro. (The #\! character is used to name a lot of cold-boot-related stuff.) When building the cross-compiler, the DEF!STRUCT macro is a wrapper around portable DEFSTRUCT which builds its own portable information about the structures being created, and arranges for host Lisp instances of the structures to be dumpable as target Lisp instances as necessary. (This system uses MAKE-LOAD-FORM heavily and is the reason that I say that bootstrapping under CLISP is not likely to happen until CLISP supports MAKE-LOAD-FORM.) When running the cross-compiler, DEF!STRUCT basically reduces to the DEFSTRUCT macro. In order to be able to make this system handle target Lisp code, we need to be able to test whether a host Lisp value matches a target Lisp type specifier. With the information available from DEF!STRUCT, and various hackery, we can do that, implementing things like SB-XC:TYPEP. Now that we know how to represent target Lisp objects in the cross-compiler running under vanilla ANSI Common Lisp, we need to make the dump code portable. This is not too hard given that the cases which would be hard tend not to be used in the implementation of SBCL itself, so the cross-compiler doesn't need to be able to handle them anyway. Specialized arrays are an exception, and currently we dodge the issue by making the compiler use not-as-specialized-as-possible array values. Probably this is fixable by bootstrapping in two passes, one pass under vanilla ANSI Common Lisp and then another under the SBCL created by the first pass. That way, the problem goes away in the second pass pass, since we know that all types represented by the target SBCL can be represented in the cross-compilation host SBCL.