This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home.
Source for wiki CharacterConversionCowan version 1
author
cowan
comment
ipnr
127.11.51.1
name
CharacterConversionCowan
readonly
0
text
== Character conversion ==
These routines are loosely based on the Gauche [http://practical-scheme.net/gauche/man/gauche-refe_78.html gauche.charconv] package, which in turn is based on [https://www.gnu.org/savannah-checkouts/gnu/libiconv GNU libiconv].
== Character encoding schemes ==
A ''character encoding scheme'' or ''CES'' is an algorithm for translating between a sequence of bytes and a sequence of characters and vice versa. Converting bytes to characters is ''decoding''; converting characters to bytes is ''encoding''. Typical CESes include ASCII, UTF-8, ISO-8859-1, and TIS-620; each one is named by a case-independent string.
`(ces-valid? `''ces''`)`
Returns `#t` if it is possible to convert between ''ces'' and the implementation-dependent internal CES of Scheme strings and `#f` otherwise.
`(ces-conversion-supported? `''ces,,1,, ces,,2,,''`)`
Returns `#t` if it is possible to convert between ''ces,,1,,'' and ''ces,,2,,'' and `#f` otherwise.
`(ces-guess `''bytevector'' [ ''hint'' ]`)`
Returns a CES which might be the correct encoding of the bytes in ''bytevector''. If ''hint'' is provided, it is a possible CES which gives a possibly helpful indication of the correct CES. For example, given a hint of "UTF-8", this procedure might return "UTF-8" if ''bytevector'' contains well-formed UTF-8, or "ISO-8859-1" if it does not.
== String and bytevector conversion ==
`(string->bytevector `''ces string''`)`
Uses ''ces'' to encode''string'' into a newly allocated bytevector, which is returned.
`(bytevector->string `''ces bytevector''`)`
Uses ''ces'' to decode ''bytevector'' into a newly allocated string, which is returned.
`(transcode-bytevector `''ces,,1,, ces,,2,, bytevector''`)`
Uses ''ces,,1,,'' to decode ''bytevector'' into a string, which is then encoded into a newly allocated bytevector using ''ces,,2,,''. However, the string need not actually be created.
== Conversion ports ==
All of these procedures accept an optional ''size'' argument, specifying the size of an internal conversion buffer in characters (not bytes). It is an error if ''size'' is not a non-negative exact integer. If the size is 0, there is no internal buffer.
`(make-decoded-input-port `''binary-port ces'' [ ''size'' ]`)`
Returns a textual input port which, when read from, reads bytes from ''binary-port'', decodes them as characters using ''ces'', and provides the characters to the reader.
`(make-encoded-output-port `''binary-port ces'' [ ''size'' ]`)`
Returns a textual output port which, when characters are written to it, encodes them as characters using ''ces'', and writes the bytes to ''binary-port''.
`(make-encoded-binary-input-port `''textual-port ces'' [ ''size'' ]`)`
Returns a binary input port which, when read from, reads characters from ''textual-port'', encodes them as characters using ''ces'', and provides the bytes to the reader.
`(make-decoded-binary-output-port `''textual-port ces'' [ ''size'' ]`)`
Returns a binary output port which, when bytes are written to it, decodes them as characters using ''ces'', and writes the characters to ''textual-port''.
`(make-transcoded-binary-input-port `''binary-port ces,,1,, ces,,2,,'' [ ''size'' ]`)`
Returns a binary input port which, when read from, reads bytes from ''binary-port'', decodes them as characters using ''ces,,1,,'', encodes the resulting characters as bytes using ''ces,,2,,'', and provides the bytes to the reader.
`(make-transcoded-binary-output-port `''binary-port ces,,1,, ces,,2,,'' [ ''size'' ]`)`
Returns a binary output port which, when bytes are written on it, decodes them as characters using ''ces,,1,,'', encodes the resulting characters as bytes using ''ces,,2,,'', and writes the bytes on ''binary-port''.
time
2014-09-21 10:13:53
version
1