This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home.
Source for wiki AdvancedUcdCowan version 2
author
cowan
comment
ipnr
66.108.19.185
name
AdvancedUcdCowan
readonly
0
text
See UcdCowan for basic UCD procedures.
It is an error to mutate any objects returned by these procedures.
== Blocks ==
''Blocks'' are disjoint objects that represent the allocation blocks into which the Unicode code point space is divided for administrative purposes. Typically most of a block is allocated at once and contains characters from a single script, but there is often more than one block per script, some blocks contain characters from multiple scripts, and some characters in a block are allocated much later than the rest. The list of blocks provided is implementation-dependent. Since it is not possible to create new ones, `eqv?` may be used to compare them.
`(blocks)`
Returns a list of all blocks known to the implementation.
`(block-name `''block''`)`
Returns a string naming ''block''.
`(block-first `''block''`)`
Returns an exact integer representing the first (smallest) code point in the block.
`(block-last `''block''`)`
Returns an exact integer representing the last (largest) code point in the block.
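As a minimal sketch of how these accessors might be used together, the following finds the block that contains a given code point. No implementation of this proposal is assumed: the `<block>` record, the three-entry sample table, and the definition of `blocks` are stand-ins so the code can run, and `code-point->block` and `show-block` are hypothetical helpers that are not part of the proposal.

```scheme
(import (scheme base) (scheme write))

;; Stand-in for the proposed block type. A real implementation would
;; supply block-name, block-first and block-last itself.
(define-record-type <block>
  (make-block name first last)
  block?
  (name block-name)
  (first block-first)
  (last block-last))

;; A tiny sample table of three Unicode blocks (assumption: a real
;; implementation would return its full, implementation-dependent list).
(define sample-blocks
  (list (make-block "Basic Latin" #x0000 #x007F)
        (make-block "Latin-1 Supplement" #x0080 #x00FF)
        (make-block "Greek and Coptic" #x0370 #x03FF)))

;; Stand-in for the proposed (blocks) procedure.
(define (blocks) sample-blocks)

;; Hypothetical helper: the block containing code-point, or #f if none.
(define (code-point->block code-point)
  (let loop ((rest (blocks)))
    (cond ((null? rest) #f)
          ((<= (block-first (car rest)) code-point (block-last (car rest)))
           (car rest))
          (else (loop (cdr rest))))))

;; Hypothetical helper: print the name of the containing block.
(define (show-block code-point)
  (let ((block (code-point->block code-point)))
    (display (if block (block-name block) "no block"))
    (newline)))

(show-block (char->integer #\A))   ; prints Basic Latin
(show-block #x03BB)                ; prints Greek and Coptic
(show-block #x0100)                ; prints no block (not in the sample table)
```

Because blocks cannot be created by the user, the result of `code-point->block` can be compared against an element of `(blocks)` with `eqv?`.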
== Named Sequences ==
''Named sequences'' are disjoint objects, each of which represents a sequence of Unicode code points that has a name specified by the Unicode Standard. Named sequences may be provisional in one version of the UCD and then non-provisional in later versions. The list of named sequences provided is implementation-dependent. Since it is not possible to create new ones, `eqv?` may be used to compare them.
`(named-sequences)`
Returns a list of all named sequence objects known to the implementation.
`(named-sequence-name `''named-sequence''`)`
Returns a string naming ''named-sequence''.
`(named-sequence-code-points `''named-sequence''`)`
Returns a list of exact integers representing the code points of the ''named-sequence''.
`(named-sequence-provisional? `''named-sequence''`)`
Returns `#t` if the ''named-sequence'' is provisional, or `#f` if not.
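The following sketch shows one way the named-sequence accessors could be combined: looking up a sequence by name and listing only the non-provisional names. The record type and `named-sequences` definition are stand-ins for an implementation of this proposal; `named-sequence-lookup` and `approved-names` are hypothetical helpers. The first table entry is believed to match the UCD file NamedSequences.txt; the second is deliberately made up so that the provisional flag has something to filter.

```scheme
(import (scheme base) (scheme write))

;; Stand-in for the proposed named-sequence type.
(define-record-type <named-sequence>
  (make-named-sequence name code-points provisional?)
  named-sequence?
  (name named-sequence-name)
  (code-points named-sequence-code-points)
  (provisional? named-sequence-provisional?))

(define sample-named-sequences
  (list
   ;; Believed to be a real entry (assumption: check NamedSequences.txt).
   (make-named-sequence "LATIN SMALL LETTER I WITH MACRON AND GRAVE"
                        '(#x012B #x0300) #f)
   ;; NOT a real named sequence: a made-up placeholder entry.
   (make-named-sequence "EXAMPLE PROVISIONAL SEQUENCE"
                        '(#x0041 #x0301) #t)))

;; Stand-in for the proposed (named-sequences) procedure.
(define (named-sequences) sample-named-sequences)

;; Hypothetical helper: the code points of the sequence called name, or #f.
(define (named-sequence-lookup name)
  (let loop ((rest (named-sequences)))
    (cond ((null? rest) #f)
          ((string=? name (named-sequence-name (car rest)))
           (named-sequence-code-points (car rest)))
          (else (loop (cdr rest))))))

;; Hypothetical helper: names of all non-provisional sequences, in order.
(define (approved-names)
  (let loop ((rest (named-sequences)) (acc '()))
    (cond ((null? rest) (reverse acc))
          ((named-sequence-provisional? (car rest))
           (loop (cdr rest) acc))
          (else
           (loop (cdr rest)
                 (cons (named-sequence-name (car rest)) acc))))))

(write (named-sequence-lookup "LATIN SMALL LETTER I WITH MACRON AND GRAVE"))
(newline)
(write (approved-names))
(newline)
```

The code-point lists are quoted constants, which is consistent with the rule that it is an error to mutate objects returned by these procedures.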
== Normalization corrections ==
''Normalization-corrections'' are disjoint objects that represent official corrections to the UCD normalization tables. The list of normalization-corrections provided is implementation-dependent. Since it is not possible to create new ones, `eqv?` may be used to compare them.
`(normalization-corrections)`
Returns a list of all normalization-corrections known to the implementation.
`(normalization-correction-description `''normalization-correction''`)`
Returns a string describing ''normalization-correction''. Note that normalization-corrections don't have names.
`(normalization-correction-codepoint `''normalization-correction''`)`
Returns an exact integer specifying the code point of the character whose normalization is being corrected.
`(normalization-correction-old `''normalization-correction''`)`
Returns a list of exact integers specifying the normalization of `(normalization-correction-codepoint `''normalization-correction''`)` before this normalization correction is applied.
`(normalization-correction-new `''normalization-correction''`)`
Returns a list of exact integers specifying the normalization of `(normalization-correction-codepoint `''normalization-correction''`)` after this normalization correction is applied.
`(normalization-correction-version `''normalization-correction''`)`
Returns a list of three exact integers specifying the version of the UCD (in the format of `ucd-version`) in which this normalization-correction was applied.
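To illustrate how the old, new, and version fields relate, here is a sketch that picks the normalization in effect for a character under a given UCD version. The record type and the `normalization-corrections` definition are stand-ins for an implementation of this proposal, and `version<=?` and `normalization-in-version` are hypothetical helpers. The two sample rows are recalled from the UCD file NormalizationCorrections.txt and should be checked against it before being relied on.

```scheme
(import (scheme base) (scheme write))

;; Stand-in for the proposed normalization-correction type.
(define-record-type <normalization-correction>
  (make-normalization-correction description codepoint old new version)
  normalization-correction?
  (description normalization-correction-description)
  (codepoint normalization-correction-codepoint)
  (old normalization-correction-old)
  (new normalization-correction-new)
  (version normalization-correction-version))

;; Sample rows (assumption: values recalled from NormalizationCorrections.txt).
(define sample-corrections
  (list (make-normalization-correction
         "Corrigendum 3" #xF951 '(#x96FB) '(#x964B) '(3 2 0))
        (make-normalization-correction
         "Corrigendum 4" #x2F868 '(#x2136A) '(#x36FC) '(4 0 0))))

;; Stand-in for the proposed (normalization-corrections) procedure.
(define (normalization-corrections) sample-corrections)

;; Hypothetical helper: #t if version a is the same as or earlier than b.
;; Both arguments are lists of three exact integers, as for ucd-version.
(define (version<=? a b)
  (cond ((null? a) #t)
        ((< (car a) (car b)) #t)
        ((> (car a) (car b)) #f)
        (else (version<=? (cdr a) (cdr b)))))

;; Hypothetical helper: the normalization of code-point as of version,
;; or #f if no correction mentions that code point.
(define (normalization-in-version code-point version)
  (let loop ((rest (normalization-corrections)))
    (cond ((null? rest) #f)
          ((= code-point (normalization-correction-codepoint (car rest)))
           (if (version<=? (normalization-correction-version (car rest))
                           version)
               (normalization-correction-new (car rest))
               (normalization-correction-old (car rest))))
          (else (loop (cdr rest))))))

(write (normalization-in-version #xF951 '(3 1 0)))  ; old value, before 3.2.0
(newline)
(write (normalization-in-version #xF951 '(6 1 0)))  ; new value, after 3.2.0
(newline)
```

Returning `#f` for an uncorrected code point is a choice made for this sketch only; a caller would then fall back to the ordinary normalization data.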
== Standardized variants ==
''Standardized-variants'' are disjoint objects that represent standardized variants of base characters. The list of standardized-variants provided is implementation-dependent. Since it is not possible to create new ones, `eqv?` may be used to compare them.
`(standardized-variants)`
Returns a list of all standardized-variants known to the implementation.
`(standardized-variants-description `''standardized-variant''`)`
Returns a string describing ''standardized-variant''. Note that standardized-variants don't have names.
`(standardized-variants-when `''standardized-variant''`)`
Returns a string specifying the shaping environment under which ''standardized-variant'' is applied.
`(standardized-variant-base-codepoint `''standardized-variant''`)`
Returns an exact integer specifying the code point of the base character of the standardized variant.
`(standardized-variant-variant-codepoint `''standardized-variant''`)`
Returns an exact integer specifying the code point of the variation selector of the standardized variant.
'''Issue: this name is regrettable.'''
== Undigested stuff from UAX #42 ==
=== CJK radicals ===
The `cjk-radicals` child of the `ucd` element describes the CJK radicals. It has one `cjk-radical` child element per radical. The attributes of each `cjk-radical` element capture the radical number, the corresponding CJK radical character, and the corresponding CJK unified ideograph.
[cjk radicals, 50] =
  ucd.content &=
    element cjk-radicals {
      element cjk-radical {
        attribute number { xsd:string {pattern="[0-9]{1,3}'?"}},
        attribute radical { single-code-point },
        attribute ideograph { single-code-point }} + }?
=== Emoji sources ===
The emoji-sources child of the ucd describes the emoji sources.
[datatype for code points, 51] =
  jis-code-point = xsd:string { pattern = "[0-9A-F]{4}" }
[emoji sources, 52] =
  ucd.content &=
    element emoji-sources {
      element emoji-source {
        attribute unicode { one-or-more-code-points },
        attribute docomo { jis-code-point? },
        attribute kddi { jis-code-point? },
        attribute softbank { jis-code-point? } } + }?
time
2012-04-11 02:46:10
version
2