This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home.

Ticket 304: symbol literal syntax wastes characters

2012-10-05 12:25:17
WG1 - Core
2011-10-20 21:24:32

[Based on feedback from Marc Feeley.]

Currently symbols can either be delimited with pipes |...| with optional hex escapes inside, or include hex escapes directly without the pipes. This wastes two characters that were reserved in R5RS, the pipe and the backslash, when either one by itself would be sufficient to represent all symbols. This is especially unfortunate because both characters are used by extensions in various Schemes - the pipe is an ordinary symbol character in scsh (to represent shell-style pipes and C-style operators) and the backslash is used in Gambit's infix syntax. We should reconsider whether we really need to take up both of these characters.

If we are to drop one syntax, it must be the hex escape syntax, because it is incomplete: it can't represent the value of (string->symbol ""). However, we will still need hex escapes inside pipes, or the value of (string->symbol "|") cannot be represented (nor any other symbol containing a pipe); furthermore, symbols containing control characters would be problematic for humans to read or write.
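The argument above can be made concrete with a small writer. This is only a sketch in Python, not part of the proposal: the function name write_symbol is hypothetical, and the \x<hex>; escape form (with a terminating semicolon, as in R6RS) is an assumption about the exact spelling.

```python
def write_symbol(name: str) -> str:
    # Render a symbol name in |...| notation (illustrative sketch only).
    out = []
    for ch in name:
        # Escape the delimiter, the escape character, and control characters.
        if ch in "|\\" or ord(ch) < 0x20 or ord(ch) == 0x7F:
            out.append("\\x{:x};".format(ord(ch)))
        else:
            out.append(ch)
    return "|" + "".join(out) + "|"


print(write_symbol(""))      # the empty symbol needs delimiters
print(write_symbol("|"))     # a pipe inside pipes needs an escape
print(write_symbol("a\tb"))  # a control character becomes readable
```

The three calls print ||, |\x7c;| and |a\x9;b|. Without the delimiters there is nothing to write for the empty name, and without the escape there is no way to write a pipe between pipes.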

In simple cases the pipe syntax is much clearer than hex escapes, and it is widely supported, in particular by Racket, Gauche, MIT, Gambit, Chicken, Bigloo, SISC, Chez, Ikarus, STklos, KSi, and Oaklisp. Scsh, SCM, Scheme 9, Scheme 7, Elk, UMB, and VX treat | as an ordinary symbol character, though only scsh seems to have a specific use for it; Scheme48, Larceny, Ypsilon, Mosh, and SigScheme throw errors; IronScheme got confused (or perhaps I did).

Support for the hex escape syntax is more limited. The R6RS Schemes Chez, Ikarus, Larceny, Ypsilon, Mosh, and IronScheme support it as expected; for whatever reason, the default Racket REPL does not. I tested the remaining Schemes listed above with the form 'a\x46;b, which should evaluate to aFb under the R6RS rules: that works only on Kawa, Scheme48, and KSi. All other Schemes listed above treat ;b as a comment and do random things with the a\x46 part.
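To show what the test form is expected to do, here is a rough Python sketch of R6RS-style decoding of a bare identifier, next to the naive reading in which the semicolon starts a comment. The helper name decode_r6rs_hex_escapes is made up for illustration, and no implementation named above is claimed to work this way internally.

```python
import re


def decode_r6rs_hex_escapes(token: str) -> str:
    # Replace each backslash-x, hex digits, semicolon escape with its character.
    return re.sub(
        r"\\x([0-9A-Fa-f]+);",
        lambda m: chr(int(m.group(1), 16)),
        token,
    )


form = r"a\x46;b"

# R6RS reading: U+0046 is "F", so the whole token names one symbol.
print(decode_r6rs_hex_escapes(form))

# Naive reading: everything from the first ";" onward is a comment.
print(form.split(";", 1)[0])
```

The first line printed is aFb; the second is a\x46, which is the leftover that the non-supporting implementations then have to make sense of.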

Technically we could allow \0 or similar for the empty symbol, or use \xNN for a single escape and \|...| for the equivalent of the current pipe escape.

\0 is yucky.

Let's stick with pipes and allow hex syntax only inside them. This is mildly incompatible with R6RS, I admit, but I don't think it matters that much. Few people will use these syntaxes in anger anyway. Escaped pipes sound good in principle, but nobody supports them.

The WG voted to adopt this proposal: hex escapes are allowed only within identifiers delimited by vertical bars. Outside such identifiers, \ is not special.
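The adopted rule can be sketched as a tiny token decoder. This is an illustrative Python sketch under stated assumptions, not the R7RS grammar: the name read_symbol is hypothetical, the escape is assumed to be spelled \x<hex>; with a terminating semicolon, and any other escapes or case folding are ignored.

```python
import re

HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]+);")


def read_symbol(token: str) -> str:
    # Return the symbol name a token denotes (sketch of the adopted rule).
    if len(token) >= 2 and token.startswith("|") and token.endswith("|"):
        body = token[1:-1]
        # Inside vertical bars, hex escapes are decoded.
        return HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), body)
    # Outside vertical bars, the backslash is not special: leave it alone.
    return token


print(read_symbol("||") == "")        # the empty symbol
print(read_symbol(r"|a\x46;b|"))      # escape decoded inside bars
print(read_symbol(r"|\x7c;|"))        # a symbol whose name is a pipe
print(read_symbol(r"a\x46"))          # bare token: backslash kept as-is
```

What an implementation does with a bare backslash beyond not treating it as an escape is left open here; the sketch simply passes it through.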