A proposal by Aaron W. Hsu
I believe that I am settling on some concepts more definitely, so I would appreciate comments on this proposal.
One of the biggest issues I think the Working Groups face is how we deal with modules. There are a number of issues to discuss, and this document presents a fairly comprehensive proposal that incorporates modules and also touches on issues that may relate to or be affected by modules.
Modules at their core are a means of providing abstraction over bindings. They provide the ability to abstract over more than run time bindings, and allow one to abstract over syntax bindings as well.
However, modules serve another purpose as compilation units. They represent libraries of code, and make it possible to compile pieces of code separately and link them together later.
The following proposal hopes to address the following issues:
Java's packaging format, Scheme48's module system, and R6RS' library format are arguably forms of static, out-of-band systems. R6RS, however, suffers from requiring evaluation of the source code. PLT has units, and Chez Scheme has its module format, the former being an example of run-time modules (a la ML's functors), and the latter being an example of syntactic modules.
There are three interesting module systems I would like to examine. R6RS and Scheme48 modules are relatively static, while Chez Scheme modules present a fairly different bent. R6RS libraries are a half-in, half-out sort of part of the language. They are not language forms in the sense that any macro may generate them. However, there is no built-in standard means of separating a code's implementation from its interface. Thus, in R6RS libraries, you must include the code along with the interface to the code in a single top-level form that has a special meaning to implementations. Scheme48's module system is a little more outside of the system in that it has a separate means of defining an interface and the implementation of an interface, and is almost a mini-language or packaging system that happens to reference or potentially embed some code. Chez Scheme's module system is a syntactic form that is used by the expander to control the visibility of bindings. It supports separate compilation, but does not support separate interfaces and implementations by default. However, because Chez Scheme's modules do not suffer from the top-level-only status of R6RS libraries, it is possible to extend Chez Scheme's modules to support the separation of interface from implementation. In fact, it is possible to use modules in any sort of macro expansion in many interesting ways.
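As a rough illustration of the Chez Scheme style described above, the following sketch uses the Chez Scheme module and import forms to hide one binding and expose two others. The names counter, count, next! and reset! are made up for this example and do not come from any existing library.

```scheme
;; A named module: only next! and reset! are exported.
;; count stays private to the module body.
(module counter (next! reset!)
  (define count 0)
  (define (next!)
    (set! count (+ count 1))
    count)
  (define (reset!)
    (set! count 0)))

;; Importing the module makes its exports visible in this scope.
(import counter)

(next!)   ; increments the hidden counter and returns the new value
(reset!)  ; sets the hidden counter back to 0

;; Because module is an ordinary syntactic form, it can also appear
;; in a local body, where its exports are scoped to that body.
(define (local-demo)
  (module m (x)
    (define x 42))
  (import m)
  x)
```

Referring to count outside the module is an error, since the expander never makes that binding visible there. That is the sense in which the form controls the visibility of bindings.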
Another interesting point of R6RS libraries is their use of implicit exports. Because macro bindings in the export form of an R6RS library are not marked as syntax explicitly, it is almost impossible for an implementation to statically analyze what additional bindings must be visible to the outside world in the case that a given exported macro expands into references to them. This makes it very difficult for implementations to identify certain code in a library as unused and eliminate it upon compilation, an optimization that could result in much smaller compiled files. Implicit exports do not see very widespread support, and indeed, R6RS libraries are the only place where I am aware that they see popular use.
Given the above module systems, and given that R6RS has already defined a library system, the goal of this proposal is to address issues in the R6RS library system and hopefully simplify other semantic ambiguities of the language at the same time. To that end, I propose two conceptually different systems be standardized: a library syntax and a module syntax.
The library syntax takes up where R6RS left off, and serves the same purposes as the R6RS library syntax. Specifically, it is meant to manage compilation namespaces and provide a means of grouping together self-contained units of code. This proposal suggests keeping the existing R6RS library form and its semantics with the following changes:
Implicit exports will be replaced by an indirect-export form. The form (indirect-export export indirect-export ...) indicates that, should export be exported, the indirect-exports will be implicitly exported as well. This makes explicit the dependencies that syntax has.
Co-exports allow you to group a series of exports so that they can be exported all at once. This is used, for example, in macros that may define a number of variables to be exported. The form (co-export primary-export secondary-exports ...) indicates that if primary-export is exported, then all of the secondary-exports are likewise exported.
Local imports integrate with modules and permit better control over the visibility of identifiers.
Implicit phasing simplifies the syntax of imports and has other benefits documented elsewhere.
I propose a module system that is equivalent to the Chez Scheme module system, but with the following changes:
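To make the indirect-export and co-export forms above more concrete, here is a hypothetical sketch of a library written under the proposed changes. It is not valid R6RS as it stands. The library name (example shapes) and all of its bindings are invented for illustration, and placing the two forms inside the library body is an assumption, since the proposal only gives their shape.

```scheme
(library (example shapes)
  ;; Only the "primary" names are listed explicitly.
  (export define-counter point)
  (import (rnrs))

  ;; Proposed form: if define-counter is exported, make-counter must
  ;; also be reachable, because the macro expands into a call to it.
  (indirect-export define-counter make-counter)

  ;; Proposed form: exporting point also exports the bindings that
  ;; define-record-type creates alongside it.
  (co-export point make-point point? point-x point-y)

  (define (make-counter)
    (let ([n 0])
      (lambda ()
        (set! n (+ n 1))
        n)))

  (define-syntax define-counter
    (syntax-rules ()
      [(_ name) (define name (make-counter))]))

  (define-record-type point
    (fields x y)))
```

With the dependencies written out this way, an implementation could tell statically that a helper not named in any export, indirect-export, or co-export form is invisible outside the library, and so is a candidate for elimination.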
module serves the purpose of providing a lexically context sensitive module form that can be generated by syntax and is simple to use. It allows for controlling syntactic abstraction in powerful ways, and also enables recursive module definitions, and a number of separate module syntaxes to be built on top of it.
The semantics of module are also well defined and easier to specify than those of other means of local syntactic abstraction such as let-syntax, enabling us to remove these ambiguous and hard-to-define structures.
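A small sketch of how a local module might stand in for let-syntax, written with the existing Chez Scheme anonymous module form, on the assumption that the proposed form would behave the same way. The names sum-of-squares, square and check-number are illustrative only.

```scheme
(define (sum-of-squares a b)
  ;; Anonymous module: its single export, the macro square, becomes
  ;; visible in the rest of this body. check-number stays hidden.
  (module (square)
    (define (check-number who v)
      (unless (number? v)
        (error who "not a number" v))
      v)
    (define-syntax square
      (syntax-rules ()
        [(_ e) (let ([t (check-number 'square e)])
                 (* t t))])))
  (+ (square a) (square b)))

(sum-of-squares 3 4)  ; => 25
```

The macro and its helper are both scoped to the body of sum-of-squares, and the export list states which of the two the rest of the body may use. This is the kind of local syntactic abstraction that would otherwise call for let-syntax.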
A number of things relate directly to this proposal, so I include them here.
Oscar Waddell and R. Kent Dybvig, Extending the scope of syntactic abstraction. Conference Record of POPL'99: The 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. January 1999.
The benefits of module systems and lexically scoped syntactic abstraction (macro) facilities are well-established in the literature. This paper presents a system that seamlessly integrates modules and lexically scoped macros. The system is fully static, permits mutually recursive modules, and supports separate compilation. We show that more dynamic module facilities are easily implemented at the source level in the extended language supported by the system.
Abdulaziz Ghuloum, Implicit phasing for library dependencies. Ph.D. thesis, Indiana University, 2008. 161 pages; 3344623.
The main objective of this thesis is a system that integrates a powerful library system and a powerful syntactic abstraction facility without compromising the essential properties of these systems. In this system, a library serves as a building block for language extensions, and its exported definitions can be used anywhere in the importing library without restrictions. The run-time variable definitions and the compile-time syntax definitions provided by a library are unified and can be used in the importing library's run-time and compile-time code alike. As a unified and integrated framework, all run-time facilities are extended to compile time, and all compile-time facilities are similarly available at run time.
The dependencies between libraries are specified in two parts. For each library, the user specifies only the import dependencies. The compile-time, expand-time, and run-time dependencies between libraries are derived automatically by the system based on the program source code, which is used as an implicit specification for phase dependencies.
Compared to traditional systems that require that the user explicitly specify phasing dependencies, the implicit phasing system (1) is easier to use, (2) derives more precise dependency information, (3) is efficient, and (4) is straightforward to add on top of existing hygienic syntactic abstraction systems.