This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home.
Source for wiki AdvancedUcdCowan version 1
author
cowan
comment
ipnr
198.185.18.207
name
AdvancedUcdCowan
readonly
0
text
Undigested stuff from UAX #42:
5 Blocks
The blocks child of the ucd describes the blocks. It has one child block element per block, with attributes to describe the extent and name of the block.
[blocks, 46] =
  ucd.content &=
    element blocks {
      element block {
        attribute first-cp { single-code-point },
        attribute last-cp { single-code-point },
        attribute name { text }} + }?
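As a rough illustration of how the blocks data might be consumed, the sketch below parses a small fragment and finds the block containing a given code point. The sample fragment is hand-written for this example and omits the XML namespace that a real UCD XML file would use; the helper names `load_blocks` and `block_of` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; a real UCD XML file places these elements in a namespace.
SAMPLE = """
<ucd>
  <blocks>
    <block first-cp="0000" last-cp="007F" name="Basic Latin"/>
    <block first-cp="0080" last-cp="00FF" name="Latin-1 Supplement"/>
  </blocks>
</ucd>
"""

def load_blocks(xml_text):
    """Return a list of (first, last, name) tuples with code points as ints."""
    root = ET.fromstring(xml_text)
    return [
        (int(b.get("first-cp"), 16), int(b.get("last-cp"), 16), b.get("name"))
        for b in root.iterfind("./blocks/block")
    ]

def block_of(code_point, blocks):
    """Return the name of the block containing code_point, or None."""
    for first, last, name in blocks:
        if first <= code_point <= last:
            return name
    return None

blocks = load_blocks(SAMPLE)
print(block_of(ord("A"), blocks))  # Basic Latin
print(block_of(0x00E9, blocks))    # Latin-1 Supplement
print(block_of(0x0100, blocks))    # None (not covered by the sample)
```

A linear scan is enough for a sketch; with the full block list, a sorted list and binary search would probably be preferable.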
6 Named Sequences
The named-sequences child of the ucd describes the named sequences. It has one child named-sequence element per named sequence, with attributes to describe the name and sequence.
Similarly, the provisional-named-sequences child of the ucd describes the provisional named sequences.
[named sequences, 47] =
  ucd.content &=
    element named-sequences {
      element named-sequence {
        attribute cps { one-or-more-code-points },
        attribute name { text }} + }?
  ucd.content &=
    element provisional-named-sequences {
      element named-sequence {
        attribute cps { one-or-more-code-points },
        attribute name { text }} + }?
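The following sketch shows one way to turn both containers into name-to-string tables. It assumes that a `cps` value is a space-separated list of hexadecimal code points, and it again leaves out the XML namespace. The provisional entry is invented purely for illustration, and `decode_cps` and `load_named_sequences` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; the provisional entry is made up for this example.
SAMPLE = """
<ucd>
  <named-sequences>
    <named-sequence cps="012B 0300"
                    name="LATIN SMALL LETTER I WITH MACRON AND GRAVE"/>
  </named-sequences>
  <provisional-named-sequences>
    <named-sequence cps="0041 0042" name="HYPOTHETICAL PROVISIONAL SEQUENCE"/>
  </provisional-named-sequences>
</ucd>
"""

def decode_cps(cps):
    """Convert space-separated hex code points into a Python string."""
    return "".join(chr(int(cp, 16)) for cp in cps.split())

def load_named_sequences(xml_text, container="named-sequences"):
    """Map each sequence name to its string, reading from the given container."""
    root = ET.fromstring(xml_text)
    return {
        ns.get("name"): decode_cps(ns.get("cps"))
        for ns in root.iterfind(f"./{container}/named-sequence")
    }

approved = load_named_sequences(SAMPLE)
provisional = load_named_sequences(SAMPLE, "provisional-named-sequences")
```

Because both containers use the same `named-sequence` child, a single loader parameterized by container name covers them.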
7 Normalization Corrections
The normalization-corrections child of the ucd describes the normalization corrections. It has one child normalization-correction element per correction, with attributes to describe the code point affected, its old normalization, its new normalization and the version of Unicode in which the correction was made.
[normalization corrections, 48] =
  ucd.content &=
    element normalization-corrections {
      element normalization-correction {
        attribute cp { single-code-point },
        attribute old { one-or-more-code-points },
        attribute new { one-or-more-code-points },
        attribute version { text }} + }?
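A possible use of these four attributes is to pick the mapping that applied in a particular Unicode version. The sketch below assumes that `version` is a dotted numeric string and that the old mapping was in effect for every version before it; both are assumptions about the data rather than something the schema states. The sample entry is illustrative, and `Correction`, `load_corrections`, and `mapping_for` are hypothetical names.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

# Illustrative sample entry, namespace omitted.
SAMPLE = """
<ucd>
  <normalization-corrections>
    <normalization-correction cp="F951" old="96FB" new="964B" version="3.2.0"/>
  </normalization-corrections>
</ucd>
"""

@dataclass(frozen=True)
class Correction:
    cp: int          # code point affected
    old: tuple       # mapping before the correction
    new: tuple       # mapping after the correction
    version: tuple   # version of the correction, e.g. (3, 2, 0)

def parse_version(text):
    """Turn '3.2.0' into (3, 2, 0) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))

def parse_cps(text):
    return tuple(int(cp, 16) for cp in text.split())

def load_corrections(xml_text):
    root = ET.fromstring(xml_text)
    path = "./normalization-corrections/normalization-correction"
    return [
        Correction(int(e.get("cp"), 16), parse_cps(e.get("old")),
                   parse_cps(e.get("new")), parse_version(e.get("version")))
        for e in root.iterfind(path)
    ]

def mapping_for(correction, unicode_version):
    """Assumed rule: versions before the correction use the old mapping."""
    if parse_version(unicode_version) < correction.version:
        return correction.old
    return correction.new

corrections = load_corrections(SAMPLE)
```

Comparing versions as integer tuples rather than strings avoids ordering "10.0.0" before "3.2.0".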
8 Standardized Variants
The standardized-variants child of the ucd describes the standardized variants. It has one child element standardized-variant per variant. The attributes on that last element capture the variation sequence, the description of the desired appearance, and the shaping environment under which the appearance is different.
[standardized variants, 49] =
  ucd.content &=
    element standardized-variants {
      element standardized-variant {
        attribute cps { two-code-points },
        attribute desc { text },
        attribute when { text }} + }?
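One way to work with this structure is a lookup table keyed by the (base character, variation selector) pair. The sketch assumes that `cps` holds exactly two space-separated hexadecimal code points and that `when` is a space-separated list of shaping contexts, empty when there is no restriction. The sample entries are illustrative and `load_variants` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Illustrative sample entries, namespace omitted.
SAMPLE = """
<ucd>
  <standardized-variants>
    <standardized-variant cps="2229 FE00" desc="with serifs" when=""/>
    <standardized-variant cps="1820 180B" desc="second form"
                          when="isolate medial final"/>
  </standardized-variants>
</ucd>
"""

def load_variants(xml_text):
    """Map (base, selector) to (description, set of shaping contexts)."""
    root = ET.fromstring(xml_text)
    table = {}
    for v in root.iterfind("./standardized-variants/standardized-variant"):
        # cps is expected to hold exactly two code points.
        base, selector = (int(cp, 16) for cp in v.get("cps").split())
        contexts = frozenset(v.get("when", "").split())
        table[(base, selector)] = (v.get("desc"), contexts)
    return table

variants = load_variants(SAMPLE)
desc, contexts = variants[(0x2229, 0xFE00)]
print(desc, sorted(contexts))
```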
9 CJK Radicals
The cjk-radicals child of the ucd describes the CJK radicals. It has one child element cjk-radical per radical. The attributes on that last element capture the radical number, the corresponding CJK radical character, and the corresponding CJK unified ideograph.
[cjk radicals, 50] =
  ucd.content &=
    element cjk-radicals {
      element cjk-radical {
        attribute number { xsd:string {pattern="[0-9]{1,3}'?"}},
        attribute radical { single-code-point },
        attribute ideograph { single-code-point }} + }?
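The pattern on the number attribute allows one to three digits followed by an optional apostrophe. A minimal sketch of checking and splitting such a value is given below; `parse_radical_number` is a hypothetical helper, and the reading of the apostrophe as a marker for a variant form of the radical is an assumption, not something the schema itself says.

```python
import re

# Same pattern as in the schema. XSD patterns match the whole value,
# so fullmatch is used instead of search or match.
RADICAL_NUMBER = re.compile(r"[0-9]{1,3}'?")

def parse_radical_number(text):
    """Return (number, primed) for a valid radical number string.

    primed is True when the value ends with an apostrophe (assumed here
    to mark a variant form of the radical).
    """
    if RADICAL_NUMBER.fullmatch(text) is None:
        raise ValueError(f"not a radical number: {text!r}")
    primed = text.endswith("'")
    return int(text.rstrip("'")), primed

print(parse_radical_number("1"))
print(parse_radical_number("90'"))
```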
10 Emoji Sources
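The emoji-sources child of the ucd describes the emoji sources. It has one child emoji-source element per mapping, with attributes giving the Unicode code point sequence and the corresponding DoCoMo, KDDI, and SoftBank codes, each of which may be empty.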
[datatype for code points, 51] =
  jis-code-point = xsd:string { pattern = "[0-9A-F]{4}" }
[emoji sources, 52] =
  ucd.content &=
    element emoji-sources {
      element emoji-source {
        attribute unicode { one-or-more-code-points },
        attribute docomo { jis-code-point? },
        attribute kddi { jis-code-point? },
        attribute softbank { jis-code-point? } } + }?
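Since each carrier attribute is declared as `jis-code-point?`, its value may be empty. The sketch below reads a fragment and represents an empty value as `None`. The sample values are placeholders that have not been checked against EmojiSources.txt, the namespace is omitted, and `carrier_code` and `load_emoji_sources` are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET

# Same pattern as the jis-code-point datatype: exactly four uppercase hex digits.
JIS_CODE_POINT = re.compile(r"[0-9A-F]{4}")

# Placeholder sample values, for illustration only.
SAMPLE = """
<ucd>
  <emoji-sources>
    <emoji-source unicode="0023 20E3" docomo="F985" kddi="F489" softbank="F7B0"/>
    <emoji-source unicode="1F004" docomo="" kddi="F34B" softbank=""/>
  </emoji-sources>
</ucd>
"""

def carrier_code(value):
    """Return the carrier code as an int, or None when the value is empty."""
    if value == "":
        return None
    if JIS_CODE_POINT.fullmatch(value) is None:
        raise ValueError(f"not a jis-code-point: {value!r}")
    return int(value, 16)

def load_emoji_sources(xml_text):
    """Map each Unicode sequence (tuple of ints) to its per-carrier codes."""
    root = ET.fromstring(xml_text)
    table = {}
    for e in root.iterfind("./emoji-sources/emoji-source"):
        key = tuple(int(cp, 16) for cp in e.get("unicode").split())
        table[key] = {
            carrier: carrier_code(e.get(carrier))
            for carrier in ("docomo", "kddi", "softbank")
        }
    return table

sources = load_emoji_sources(SAMPLE)
```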
time
2010-10-29 02:00:16
version
1