This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home.
Source for ticket #25
cc
changetime
2010-11-18 13:55:26
component
WG1 - Strings and Chars
description
R6RS provides comparator routines for characters and strings using locale-independent Unicode algorithms. How do we order our textual data?
id
25
keywords
milestone
owner
alexshinn
priority
major
reporter
alexshinn
resolution
fixed
severity
status
closed
summary
char and string ordering
time
2010-02-23 16:55:35
type
defect
Changes
Change at time 2010-11-18 13:55:26
author
alexshinn
field
comment
newvalue
This has been subsumed by #23.
oldvalue
2
raw-time
1290059726000000
ticket
25
time
2010-11-18 13:55:26
Change at time 2010-11-18 13:55:26
author
alexshinn
field
milestone
newvalue
oldvalue
raw-time
1290059726000000
ticket
25
time
2010-11-18 13:55:26
Change at time 2010-11-18 13:55:26
author
alexshinn
field
resolution
newvalue
fixed
oldvalue
raw-time
1290059726000000
ticket
25
time
2010-11-18 13:55:26
Change at time 2010-11-18 13:55:26
author
alexshinn
field
status
newvalue
closed
oldvalue
new
raw-time
1290059726000000
ticket
25
time
2010-11-18 13:55:26
Change at time 2010-03-02 01:34:21
author
cowan
field
comment
newvalue
R5RS ordering is very undemanding: it requires only that the ASCII uppercase letters sort correctly, that the ASCII lowercase letters sort correctly, that the digits sort correctly, and that digits and letters are not interleaved. On the other hand, it also requires that string sorting be lexicographic with respect to character sorting: that is, if two (sub)strings differ only in their last character, they sort in the same order as those last characters.
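As an illustration only (not part of the original comment), the R5RS constraints described above can be sketched in Python. The function names `satisfies_r5rs_char_order`, `lexicographic_less`, and `ebcdic_key` are hypothetical, and the `cp037` codec is assumed here as a representative EBCDIC code page. A character ordering is modeled as a key function from a character to an integer.

```python
import string

def satisfies_r5rs_char_order(key):
    """Check the minimal R5RS constraints for the ordering induced by key(ch)."""
    def increasing(chars):
        ks = [key(c) for c in chars]
        return all(a < b for a, b in zip(ks, ks[1:]))

    def not_interleaved(digits, letters):
        # Either every digit precedes every letter, or vice versa.
        dk = [key(c) for c in digits]
        lk = [key(c) for c in letters]
        return max(dk) < min(lk) or max(lk) < min(dk)

    upper = string.ascii_uppercase
    lower = string.ascii_lowercase
    digits = string.digits
    return (increasing(upper) and increasing(lower) and increasing(digits)
            and not_interleaved(digits, upper)
            and not_interleaved(digits, lower))

def lexicographic_less(a, b, key):
    """string<? derived from the character ordering: the first differing
    character decides; otherwise the shorter string sorts first."""
    for x, y in zip(a, b):
        if key(x) != key(y):
            return key(x) < key(y)
    return len(a) < len(b)

def ebcdic_key(ch):
    # Assumption: cp037 stands in for "EBCDIC"; valid for ASCII letters and digits.
    return ch.encode("cp037")[0]

print(satisfies_r5rs_char_order(ord))         # Unicode / ASCII code point order
print(satisfies_r5rs_char_order(ebcdic_key))  # EBCDIC (cp037) order
```

Both orderings pass the check even though they disagree with each other: under `ord`, "A" sorts before "a", while under `ebcdic_key` it sorts after. That is how little R5RS pins down.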
I'd like to apply R6RS rules to characters: they sort in Unicode order and char->integer and integer->char (which are arbitrary mappings in R5RS) map characters to Unicode code points. But Thing One implementations shouldn't be required to represent all of Unicode, nor to use UTF-8 or UTF-32 for strings internally (the native sort of UTF-16 is ''not'' codepoint order).
Instead, I believe we should break the lexicographic rule and allow string ordering to be done however an implementation pleases, as long as the R5RS rules are kept (all modern encodings, including EBCDIC, do keep them), and as long as the regular rules of comparison functions are preserved (consistency, trichotomy, etc.).
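As an illustration only (not part of the original comment), the remark that the native sort of UTF-16 is not code point order can be reproduced with a short Python sketch. The helper names `utf16_units`, `utf16_less`, and `trichotomy_holds` are hypothetical. Python compares `str` values by code point, so it serves as the reference ordering here.

```python
def utf16_units(s):
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

def utf16_less(a, b):
    """Compare strings by their UTF-16 code units (the 'native' UTF-16 sort)."""
    return utf16_units(a) < utf16_units(b)

def trichotomy_holds(less, x, y):
    """Exactly one of x < y, x = y, x > y must be true."""
    return [less(x, y), x == y, less(y, x)].count(True) == 1

a = "\uFF5E"      # U+FF5E, a single code unit 0xFF5E
b = "\U00010000"  # U+10000, encoded as the surrogate pair 0xD800 0xDC00

print(a < b)             # code point order: U+FF5E sorts before U+10000
print(utf16_less(a, b))  # code unit order: 0xFF5E sorts after 0xD800
print(trichotomy_holds(utf16_less, a, b))
```

The two orderings disagree on this pair, yet the code unit comparison still behaves as a well-formed comparison function on the sample, which is the kind of implementation-chosen string ordering the proposal would permit.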
oldvalue
1
raw-time
1267464861000000
ticket
25
time
2010-03-02 01:34:21