273: Decide whether the presence of non-ASCII characters in symbols requires escaping or using vertical bars when output by `write'.

In the section <Output>, in the description of `write', we say:

Symbols that contain non-ASCII characters are escaped either with inline hex escapes or with vertical bars.

Is this right? Shouldn't implementations that support Unicode be allowed to write symbols that contain non-ASCII characters without treating them specially?

The invariant we want to establish is that `read' can read what `write' has written.

The question for this ticket is whether the standard should require this invariant to hold only within a single implementation, or across implementations.

If we stick with the existing language, we get interoperability across implementations.

If we change the language to allow any non-whitespace character supported by the implementation to be written unescaped and without vertical bars, we lose interoperability across implementations, but gain the ability of the implementation to display symbols containing Unicode characters in a natural manner.
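
To make the trade-off concrete, here is a minimal Python sketch of the three output policies under discussion. The function name write_symbol and the policy labels "hex", "bars", and "raw" are hypothetical, chosen only for illustration. The sketch assumes the inline hex escape has the form \x<hex digits>; and deliberately ignores names that would need quoting for other reasons (whitespace, or a vertical bar or backslash in the name). It is not the behaviour of any particular implementation.

```python
def write_symbol(name: str, policy: str = "hex") -> str:
    # Hypothetical helper: returns the text that `write' might emit for a
    # symbol whose name is `name`, under one of three policies:
    #   "hex"  - each non-ASCII character becomes an inline hex escape
    #   "bars" - a name containing non-ASCII characters is wrapped in |...|
    #   "raw"  - non-ASCII characters are written unchanged
    if policy not in ("hex", "bars", "raw"):
        raise ValueError("unknown policy: " + policy)

    has_non_ascii = any(ord(ch) > 127 for ch in name)
    if not has_non_ascii or policy == "raw":
        return name
    if policy == "bars":
        return "|" + name + "|"

    # policy == "hex": replace only the non-ASCII characters.
    parts = []
    for ch in name:
        if ord(ch) > 127:
            parts.append("\\x{:x};".format(ord(ch)))
        else:
            parts.append(ch)
    return "".join(parts)


if __name__ == "__main__":
    sym = "a\u03bbb"  # the letters a, Greek small lambda, b
    print(write_symbol(sym, "hex"))   # a\x3bb;b
    print(write_symbol(sym, "bars"))  # |aλb|
    print(write_symbol(sym, "raw"))   # aλb
```

Only the "hex" policy yields pure ASCII output. The "bars" and "raw" policies both leave the non-ASCII characters in the output, so they are readable only by an implementation whose reader accepts those characters.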

aag

2011-08-30 03:57:38

Alex points out:

If some other implementation doesn't support the Greek subset of Unicode, then it wouldn't support those characters even if they were escaped.

cowan

2011-08-30 04:46:21

Alex is mistaken as things stand. See #274.

cowan

2011-08-30 06:09:14

UnicodeCowan 15 doesn't fly; it winds up assigning the pname "a\\xbb0;b" to the symbol a\xbb0;b as well as a\x5c;xbb0\x3b;b, which is intolerable.

So I withdraw my objections.

cowan

2011-09-23 23:45:35

I'm closing this ticket because we no longer have to decide this.

cowan

2011-09-23 23:45:53

resolution set to wontfix

status changed from new to closed
