Do we support any means of creating disjoint user-defined types, such as in SRFI-9, SRFI-99 or the R6RS record system?
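For reference, a minimal sketch of what a disjoint user-defined type looks like in the SRFI-9 style. It assumes the implementation already provides define-record-type; the names point, make-point and the accessors are made up for illustration.

```scheme
;; Assumes SRFI-9 style define-record-type is available.
;; All names below (point, make-point, ...) are illustrative only.
(define-record-type point
  (make-point x y)          ; constructor
  point?                    ; type predicate
  (x point-x)               ; read-only field
  (y point-y set-point-y!)) ; mutable field

(define p (make-point 3 4))
(set-point-y! p 10)

(point? p)              ; true
(point? (vector 3 10))  ; false: a plain vector is not a point
(point-x p)             ; 3
(point-y p)             ; 10
```

SRFI-99 and the R6RS record system add inheritance and procedural layers on top of this basic shape.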
Several SRFIs, R6RS, and most Scheme implementations support some sort of uniform packed integer vectors. In particular, these are necessary for efficient binary I/O, and for memory mapping, so WG2 will certainly want them.
Do we provide a syntax and basic API for these in WG1?
R5RS provides a simple mechanism for easy cases of lazy evaluation. It does not support generalized lazy evaluation, because all built-in procedures are eager whether they 'need' to be or not. The relevant identifiers are delay and force; they are not present in IEEE Scheme. SRFI 45 argues that this pair is insufficient for expressing common lazy algorithms in a way that preserves tail recursion, and adds lazy (equivalent to (delay (force ...)), but acting atomically) and eager. The semantics of delay and force remain downward compatible.
Vote srfi-45 to add just the bindings lazy and eager in addition to delay and force, not all of the srfi-45 utilities. Vote none to remove delay and force from the standard.
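As a reminder of the R5RS baseline being voted on, here is a small sketch using only delay and force. The counter variable is an illustrative assumption, used to show that the delayed expression is evaluated at most once.

```scheme
;; R5RS delay/force only; evaluations is an illustrative counter.
(define evaluations 0)

(define answer
  (delay (begin
           (set! evaluations (+ evaluations 1))
           (* 6 7))))

(force answer)   ; 42, runs the delayed expression
(force answer)   ; 42 again, the cached value is reused
evaluations      ; 1
```

The SRFI 45 lazy form is not shown here, since it is only available where that SRFI (or an equivalent) is provided.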
Random numbers are useful for a wide variety of applications, including cryptography, statistics, games, simulations and genetic programming. Do we want to provide an interface to random number generation in WG1 Scheme?
Currently, there is no standard way to communicate with the context from which a Scheme program was started. The interface itself has become pretty standardized over time: a list of strings ("command-line arguments") and a map from strings to strings ("environment variables") on input, and a small integer or string on output ("exit value"). Scheme should recognize these realities.
We have command-line and exit from ModulesShinn, so the question remains whether we should add SRFI-98 environment accessors.
In R5RS, many procedures and syntax forms return an "undefined value". In R6RS, the corresponding convention is to return "undefined values", meaning an undefined number (including zero) of undefined values. How shall R7RS go?
Vote r5rs for a single undefined value, r6rs for zero or more undefined values, or zero for exactly zero values. Anything other than r5rs would break R5RS (and IEEE) compatibility.
"exactly zero" undefined values is just plain arbitrary and therefore sucks. An undefined number allows for future expansion, compatibly.
Assuming a single "undefined" value (dependent on the result of #68), users sometimes want to test for this value. If we enforce a unique undefined value, one approach is to generate this value explicitly to test against, such as (void) provided by some implementations. Alternately we could provide a test such as undefined?. Either approach breaks compatibility with existing extensions, and may be prohibitively difficult to implement for compilers lacking a separate undefined value type. Some programmers also consider testing for this value a sign of a broken design.
Vote generate for a (void) procedure, test for undefined?, and both for both.
Undefined values are room for future expansion, not specific null placeholders.
I've used Chicken's (void) as a substitute for returning no values (and then relied on it being equal to itself in unit tests that force me to check the return value of procedures called only for side effect...); I'd rather put (values) at the end of side-effect-only procedures and have a test macro for this case, that doesn't compare return values!
list?, length, equal? and other fundamental primitives may diverge when given cyclic data. In the former two cases, avoiding this is simple and not inefficient, and the equivalents are already provided in SRFI-1. In the latter case a proposal was made and rejected on the R6RS list. For list? and length, R6RS seems to require that list? return #f and length raise an error.
Do we want to specify the behavior when these primitives encounter cyclic data?
Options are equal? to specify that equal? must terminate on cyclic input, r6rs to specify R6RS behavior for list? and length, srfi-1 to specify the SRFI-1 semantics (where length returns #f), and equal?+r6rs or equal?+srfi-1 to specify both.
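To illustrate why terminating on cyclic lists is "simple and not inefficient" for length, here is a sketch of a length that returns #f on a cycle, using two pointers that advance at different speeds. The name length-or-false is hypothetical and is not part of any proposal.

```scheme
;; Hypothetical helper: like length, but returns #f for a circular list.
;; fast advances two cells per iteration, slow advances one; if they ever
;; meet on the same pair, the list is circular.
(define (length-or-false lst)
  (let loop ((fast lst) (slow lst) (n 0))
    (if (pair? fast)
        (let ((fast (cdr fast)) (n (+ n 1)))
          (if (pair? fast)
              (let ((fast (cdr fast)) (slow (cdr slow)) (n (+ n 1)))
                (if (eq? fast slow)
                    #f
                    (loop fast slow n)))
              n))
        n)))

(define circular (list 1 2 3))
(set-cdr! (cddr circular) circular)   ; tie the last pair back to the first

(length-or-false (list 1 2 3))  ; 3
(length-or-false '())           ; 0
(length-or-false circular)      ; #f
```

The extra cost over a plain length is one eq? test and one extra cdr per two elements.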
Old-fashioned Lisps used dynamic extent of variables. Although Scheme has switched to lexical scope, the concept of a dynamic environment can be useful in special cases.
Instead of special variables, SRFI-39 provides first-class "parameter" objects with dynamic bindings. Do we want to provide something similar?
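A short sketch of the SRFI-39 style, assuming make-parameter and parameterize are available. The parameter name indent and the helper current-indent are illustrative.

```scheme
;; Assumes SRFI-39 make-parameter / parameterize.
;; indent and current-indent are illustrative names.
(define indent (make-parameter 0))

(define (current-indent)
  (indent))              ; calling a parameter object reads its value

(define inside
  (parameterize ((indent 4))
    (current-indent)))   ; sees the dynamic binding, not the lexical one

inside            ; 4
(current-indent)  ; 0, the binding is undone on exit
```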
Short of a full time and date library, a single procedure
(current-seconds)
returning the epoch time in seconds, possibly as a real number, would be useful and is sufficient to implement a full library (though access to the host system's timezone would be desirable in that case).
Since some systems may not have access to a clock, we could make this an optional procedure. Alternately, it could be defined as a simple counter in such cases, providing an accurate notion of time ordering but no sense of duration. Finally, it could return #f in the absence of a clock.
Should we have functions allowing a program to compute elapsed time, as distinct from calendar time?
TimeCowan contains a proposal.
Should we provide case-lambda as in SRFI 16 and R6RS? It provides simple overloading of procedures based on the number of their arguments, and does not require that optional arguments appear only after mandatory ones.
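For example, assuming case-lambda as in SRFI 16, a procedure can take an optional argument before a mandatory one. The name interval is made up for illustration.

```scheme
;; Assumes SRFI 16 case-lambda; interval is an illustrative name.
;; With one argument, the start defaults to 0; the optional argument
;; comes first, which plain optional-argument syntax cannot express.
(define interval
  (case-lambda
    ((end) (interval 0 end))
    ((start end) (list start end))))

(interval 5)    ; (0 5)
(interval 2 5)  ; (2 5)
```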
It's not clear whether R5RS requires a PORT? procedure or not. It's listed in Section 3.3.2 under Disjointness of Types, but not under section 6.6.1 under Ports. R6RS requires it. Racket, Gauche, MIT Scheme, Gambit, Chicken, Guile, SISC support it; Scheme48/scsh, Kawa, and Chibi currently do not.
Shall we require it?
Currently there's no way to determine whether a port is open or closed, short of trying to read/write to it and catching an error. Do we want to add an interface to this?
In R5RS and R6RS, call-with-values takes two arguments, both procedures. The first is a producer of multiple values; the second is a consumer, to which the multiple values returned by producer are passed as arguments.
A possible extension is to allow multiple producer arguments, flattening all the produced values together, analogous to Common Lisp's multiple-value-call.
Do we add this extension?
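To make the extension concrete, here is the standard two-argument form followed by a sketch of the flattening behavior written as an ordinary procedure. The name call-with-values* and its argument order (consumer first, as in Common Lisp) are assumptions for illustration, not the proposed interface.

```scheme
;; Standard R5RS usage: one producer, one consumer.
(call-with-values (lambda () (values 1 2)) +)   ; 3

;; Hypothetical helper: the values of all producers are collected
;; into one argument list and passed to the consumer together.
;; Note that map does not guarantee the order in which producers run.
(define (call-with-values* consumer . producers)
  (apply consumer
         (apply append
                (map (lambda (producer)
                       (call-with-values producer list))
                     producers))))

(call-with-values* list
                   (lambda () (values 1 2))
                   (lambda () (values))
                   (lambda () 3))               ; (1 2 3)
```

Since this is definable in a few lines on top of the existing primitive, the question is mainly whether it belongs in the core.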
This smacks of bloat to me
SRFI-87 extends case with => clauses, analogous to the use of => in cond clauses, which allows you to pass the item actually matched to a procedure.
Do we add this extension?
It makes sense to be consistent with cond.
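A sketch of what this would look like, which only runs on implementations that already support the SRFI-87 syntax. The procedure name classify is illustrative.

```scheme
;; Requires SRFI-87 style => clauses in case; classify is illustrative.
(define (classify n)
  (case n
    ((2 3 5 7) 'prime)
    ((1 4 6 8 9) => (lambda (x) (list 'composite x)))
    (else => (lambda (x) (list 'other x)))))

(classify 3)   ; prime
(classify 6)   ; (composite 6)
(classify 12)  ; (other 12)
```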
SRFI-61 extends => clauses in cond with an optional guard form, such that after the value is generated and found to be true, it's further checked against the guard. If the guard returns #f the clause fails and processing proceeds to the next clause, otherwise the clause is accepted as normal.
Do we add this extension?
This is kind of nice, but strikes me as a super duper optional extension rather than part of a jewel-like core.
Currently, => clauses in cond accept a single value from the generator (left-hand side) and pass it to the receiver (right-hand side). Shall we allow the generator to return multiple values and pass them to the receiver? If both this ticket and #89 pass, multiple values would also be allowed for generator/guard cond clauses.
Multiple values shouldn't be second-class citizens. It's ugly when you can't use the usual niceties of Scheme just because you've broken out into multiple values.
Should we allow (include "filename") at the REPL? This is distinct from import in that it just loads code without altering the module structure.
The default reader in R7RS will default to case-sensitive, but users may wish to override this in some situations. R6RS allows at the top-level #!case-fold and #!no-case-fold read syntax to control the case-sensitivity of the current file. Many existing R5RS implementations, on the other hand, use #ci and #cs, with the difference that they refer to the next datum only.
Note PortsCowan provides a separate means of controlling case-sensitivity per-port.
Vote per-datum for the next-datum-only #ci/#cs syntax.
The standard currently says nothing about the character encoding system of source files. Do we require this to be a fixed encoding such as UTF-8, use an out-of-band specification like the Emacs (and Python) -*- coding: foo -*- convention, or just leave it unspecified?
I say "utf-8", but implicitly I assume that this is also bound by the implementation's restriction on available character sets, so as not to require "full Unicode", so ASCII-only is Just Fine.
Allowing specification of encoding names, like XML/emacs/Python, then requires standardising what encodings are allowed - and if implementations are allowed to support other encodings as well, then interoperability suffers.
R6RS relegated string-set! to a module, and many modern languages tend towards making strings immutable. Removing entirely, however, breaks IEEE Scheme compatibility and should only be considered if you believe mutable strings are fundamentally broken.
Do we remove string-set!? Vote yes to remove, module to relegate to a module as in R6RS, or no to keep as is.
Mutating strings gets us into a whole characters-versus-graphemes-versus-codepoints Unicode mess. Plus, immutable strings open the doors for some useful efficiency gains.
In R6RS auxiliary keywords (such as else in cond and case forms) are explicitly exported from the (rnrs base (6)) library. Do we want to bind and export these from the core library?
If else is bound in the default module, then it must be imported at the call site whenever using it in cond or it won't match hygienically.
If else is not bound in the default module, then it must not be bound or imported at the call site whenever using it in cond or it won't match hygienically.
Another option is to specify for cond and case that they match the else identifier literally, ignoring any hygiene. This breaks compatibility with R5RS and R6RS.
I prefer the unhygienic option since it helps avoid confusing errors due to accidental failure to manage else properly. Likewise, I prefer unbound to bound as it reduces the window for accidental failure.
In R5RS eqv?/equal? are in some sense the broadest tests for equality, comparing structural equality, but they also test for the same exactness, so that
(equal? 0 0.0) => #f
whereas
(= 0 0.0) => #t
Some users consider this confusing, others sometimes want an equal? that behaves like = for numbers.
Do we want to change equal? and eqv? in this way, or add a separate exactness-agnostic procedure? Vote yes to change, equal=? or inexact-equal? for separate procedures of those names (plus the analogous eqv=? or inexact-eqv?), or no to leave as is. Alternately, write in a separate name.
A bikeshed color issue: we need to choose the actual names for the module syntax for the winner of #2.
import, export and include are fairly universal and no alternate is suggested unless someone wants to write in a proposal.
The enclosing syntax can be called either library as in R6RS, module or some other proposal.
Pure stylistic preference. A module is a modular unit of code; a library is (to me) a module meant for sharing between programs. Modules may be used to provide structure within programs, or to make libraries.
Similar to #102, we need to choose a name for the form to include Scheme code directly in a module form. This can be body as in the proposal, begin or some other name.
Adding new symbols seems redundant to me.
The include module form includes files literally with the default case-sensitivity. An include-ci form could include files case-insensitively without resorting to the reader hacks proposed in #92, allowing existing R5RS libraries to be used without modification.
Users invariably want some way to conditionally select code depending on the implementation and/or feature set available. CondExpandCowan allows conditional expansion in the style of SRFI-0 within the module language. SRFI-0 provides cond-expand, SRFI-103 provides a library naming extension, and numerous other personal hacks exist.
Do we want to include something along these lines in WG1 Scheme?
R5RS specifies literal data in source code as immutable, but otherwise provides no way to generate or introspect immutable data.
One proposal is given in ImmutableData, providing mutable?, make-immutable and immutable->mutable.
Racket, for which all pairs are immutable in the default language, needs some way to generate shared and cyclic data structures at runtime, and provides the shared syntax for this. It also has an immutable? utility as the complement to mutable? above.
Currently equal? is strictly broader than eqv? except in the pathological case of comparing the same circular list with itself, for which eqv? returns true and equal? may loop infinitely. We could explicitly require equal? to check and return #t in this case, which most implementations do as a performance hack anyway.
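The "performance hack" mentioned above can be sketched as a wrapper that tries eq? first. The name equal-with-shortcut? is hypothetical, and the wrapper only covers the case where both arguments are the very same object; it does not make equal? terminate on two distinct circular structures.

```scheme
;; Hypothetical wrapper: identical objects are trivially equal, so the
;; structural walk (which may not terminate on cycles) is skipped.
(define (equal-with-shortcut? a b)
  (or (eq? a b)
      (equal? a b)))

(define ring (list 1 2 3))
(set-cdr! (cddr ring) ring)           ; make the list circular

(equal-with-shortcut? ring ring)                  ; #t, via eq?
(equal-with-shortcut? (list 1 2) (list 1 2))      ; #t, via equal?
(equal-with-shortcut? (list 1 2) (list 1 3))      ; #f
```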
R6RS provided a detailed exception system with support for raising and catching exceptions, using a hierarchy of exception types.
Do we use this, or parts of it, or a new exception system? The r6rs option is just for the core exception handling.
R5RS defines many things as "is an error" without any specification of what happens in that situation. R6RS goes to the opposite extreme and specifies as much as possible what exceptions are raised when.
Taking into account the system provided by ticket #18, we need to come up with guidelines for when exceptions should be raised, and clarify which R5RS "error" situations should raise an exception or be left unspecified.
R5RS specifies only 3 situations where an error is required to be signalled, leaving most situations unspecified as described in ErrorSituations.
I think it's important for portable code to be able to know how to handle various kinds of errors. Most of the errors in ErrorSituations are likely to be programming errors, but even they need to be catchable in systems that host "foreign" code - from sandboxes to "application servers".
Do we provide any binary input or output ports, and if so how do we construct them and operate on them? Can binary and textual operations be mixed on the different port types?
PortsCowan provides binary port operations along with other extensions.
R6RS provides an entirely new I/O system, as well as a separate R5RS-compatible I/O system.
The withdrawn SRFI-91 provides yet another I/O system supporting binary ports.
Note this item as well as #29 and #31 specify semi-orthogonal aspects of I/O systems which are typically specified together by individual proposals. If the same proposal doesn't win for all three, the aspects will be merged as needed.
Do we support encoding and decoding text from ports with different character encoding systems? Different end-of-line conventions? Different normalizations? How are encoding errors handled?
Do we provide a mechanism for custom ports, on which for instance string ports could be created?
R6RS as well as a number of Scheme implementations provide custom ports with various APIs.
R6RS and SRFI-69 both provide hash-table interfaces. Do we provide either of these, or try to provide some primitives on which efficient hash-tables can be implemented?
We've decided to add file-exists? and delete-file, essential for a large class of scripts, but still have no way to get a list of files in a directory. Do we want to provide an interface to this?
let-syntax and letrec-syntax have known ambiguities in their behavior. We have the option of altering the semantics to correct this behavior, defining which behavior we intend, or removing let-syntax entirely. We could also leave this ambiguity unspecified.
The question of whether or not to introduce a new lexical scope (i.e. whether internal defines are visible outside the let-syntax) is straightforward.
If we don't introduce a new lexical scope, the question arises whether or not internal define-syntax forms are allowed and whether they apply to the body of the let-syntax, forms following the let-syntax, or both.
If internal define-syntax applies to the body, we may also wish to specify what happens when said define-syntax redefines an identifier bound by the enclosing let-syntax. This varies by implementation and may be difficult for macro expanders to change, so is left unspecified in the proposals below.
The identifier ... and, with the result of #6, also _ have special meaning in syntax-rules patterns, so they are not treated as pattern variables by default.
However their behavior when used in the literals list of syntax-rules is ambiguous, and simply breaks in most implementations.
Rather than breaking, it makes sense to go ahead and treat them as normal literals, overriding their special meanings.
In particular, there are many existing R5RS macros which make use of _ in the literals and are thus broken outright by #6. Allowing them as literals fixes these macros.
We need a naming convention for the core modules and standard libraries of the new module system.
In R5RS everything is effectively in a single module. R6RS provides a much more fine-grained breakdown of modules which could be retro-fitted to the bindings we include in our standard.
John Cowan has proposed a number of module factorings in items #71, #72, #73, #74, #75, #76, #77, as well as an I/O module breakdown in PortsCowan.
Since the naming and breakdown must be internally consistent I'm grouping these into a single ballot item. Members desiring to put forth a new proposal should specify where all bindings belong, or specify a subset of the bindings and default the rest to some other proposal.
Note some ballots specify explicitly whether or not the bindings in question are intended to be in a module or the core language. In these cases we still need to decide to which module they belong. Where specific votes contradict general factoring proposals, the specific vote wins.
I think we need to actually gather a list of what's going in modules, and only then make a decision.
Often a rational-only exponentiation function is useful; that is, a rational number raised to an integer power. Should we add this procedure to the core so that exponentiation is available even if inexact rationals are not provided or not imported?
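A sketch of such a procedure using repeated squaring and only exact arithmetic. The name exact-rational-expt is hypothetical, and the example assumes the implementation supports exact rationals such as 2/3.

```scheme
;; Hypothetical name. base is an exact rational, exponent an exact integer.
;; Only *, / and integer operations are used, so the result stays exact.
(define (exact-rational-expt base exponent)
  (define (pow b n)                      ; n is a non-negative integer
    (cond ((= n 0) 1)
          ((even? n) (pow (* b b) (quotient n 2)))
          (else (* b (pow b (- n 1))))))
  (if (negative? exponent)
      (/ 1 (pow base (- exponent)))
      (pow base exponent)))

(exact-rational-expt 2 10)     ; 1024
(exact-rational-expt 2/3 3)    ; 8/27
(exact-rational-expt 2/3 -2)   ; 9/4
```

As with division, a zero base with a negative exponent is an error in this sketch.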
NumericTower lists a plausible set of ten from fixnums only to the full monty. Which ones should we allow an implementation to provide? R5RS requires only fixnums large enough to handle string and vector indexes, while R6RS requires the full numeric tower.
Vote on the minimum level of support you want to require (implementations may of course still provide more than this). I've included only the most likely options below; write in other options if needed.
Note quaternions are a fairly rare numeric type, known to be provided only by extensions to scm and chicken, and thus may be difficult for other implementations to support if required.
R5RS provides quotient, modulo and remainder for integral division. R6RS extended this with div/mod and div0/mod0. A thorough analysis of possible division operations is provided in DivisionRiastradh, which includes a proposal for five separate division operator pairs. We need to choose which API we'll provide.
I am drawn, moth-like, to the awesome rigor of Riastradh's proposal, even though it makes my "not jewel-like" glands itch at the same time.
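To show why more than one operator pair exists at all, here is how the three R5RS procedures differ on a negative dividend, followed by a sketch of a floor-style quotient built from them. The name floor-div is hypothetical and is not taken from any of the proposals.

```scheme
;; R5RS behavior on a negative dividend.
(quotient -7 2)    ; -3  (truncates toward zero)
(remainder -7 2)   ; -1  (sign follows the dividend)
(modulo -7 2)      ; 1   (sign follows the divisor)

;; Hypothetical helper: the quotient that pairs with modulo,
;; i.e. rounding toward negative infinity.
(define (floor-div n d)
  (quotient (- n (modulo n d)) d))

(floor-div -7 2)   ; -4, and (+ (* 2 -4) 1) is -7
(floor-div 7 2)    ; 3
```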
In R5RS, symbols are parsed as any sequence of valid symbol characters that does not begin with a character that can begin a number. The three exceptions +, - and ... are also provided. This allows parsing with only constant lookahead to determine type.
R6RS added additional exceptions for symbols beginning with ->, a common idiom, still allowing parsers to determine type with a constant lookahead.
John Cowan proposes allowing anything that cannot be parsed as a number to be a valid symbol. This removes the special case exceptions, but may require arbitrary lookahead.
Alex Shinn proposes symbols are any sequence of valid symbol characters that does not have a prefix which is a valid number. This removes the special case exceptions, allows constant lookahead, and allows extensions to number syntax.
The WG has voted to have a list of character names.
The list in R5RS and the longer list in R6RS are only informative. I suggest adopting the R6RS list and making it normative.
Similar to #84, we need to choose a specific list of mnemonic escapes like \n and \t to be recognized in strings.