This is a proposal for a WG2 bytevector API. The conceit is that everything is a separate procedure with minimal arguments; this makes for a lot of procedures, but each one can be easily inlined by even a very dumb compiler, providing high efficiency.
- (bytevector-<type><endian>-ref bytevector n)
- Returns a Scheme number corresponding to the binary value encoded according to type beginning at offset n in bytevector.
- (bytevector-<type><endian>-set! bytevector n v)
- Converts v to a binary value encoded according to type and places it into bytevector beginning at offset n.
The types are:
- unsigned 8-bit integer
- signed 8-bit integer
- unsigned 16-bit integer
- signed 16-bit integer
- unsigned 32-bit integer
- signed 32-bit integer
- unsigned 64-bit integer
- signed 64-bit integer
- unsigned 128-bit integer
- signed 128-bit integer
- 32-bit IEEE float
- 32-bit native float (may not be IEEE)
- 64-bit IEEE float
- 64-bit native float (may not be IEEE)
- 64-bit complex number (two 32-bit IEEE floats)
- 64-bit complex number (two 32-bit native floats, may not be IEEE)
- 128-bit complex number (two 64-bit IEEE floats)
- 128-bit complex number (two 64-bit native floats, may not be IEEE)
The endianism values are:
- Native endianism (system-dependent)
Endianism is not applicable to the following types: s8 u8 fn32 fn64 cn64 cn128
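As a rough illustration of how small each accessor can be, here is a sketch of one ref/set! pair for an unsigned 16-bit integer, plus a signed variant, built only on the R7RS-small procedures bytevector-u8-ref and bytevector-u8-set!. The names bytevector-u16be-ref, bytevector-u16be-set! and bytevector-s16be-ref are assumptions: the "u16" and "be" spellings are guessed for the sake of the example and are not fixed by this proposal. The setter does no range checking on v.

```scheme
(import (scheme base))

;; Hypothetical instance of bytevector-<type><endian>-ref: unsigned 16-bit
;; integer, most significant byte first (the "u16be" spelling is assumed).
(define (bytevector-u16be-ref bv n)
  (+ (* 256 (bytevector-u8-ref bv n))
     (bytevector-u8-ref bv (+ n 1))))

;; Matching setter: stores the high byte at n, then the low byte at n+1.
(define (bytevector-u16be-set! bv n v)
  (bytevector-u8-set! bv n (quotient v 256))
  (bytevector-u8-set! bv (+ n 1) (remainder v 256)))

;; Signed variant: reinterpret the unsigned value as two's complement.
(define (bytevector-s16be-ref bv n)
  (let ((u (bytevector-u16be-ref bv n)))
    (if (>= u 32768) (- u 65536) u)))

(define bv (make-bytevector 4 0))
(bytevector-u16be-set! bv 1 #xFFFE)
(bytevector-u16be-ref bv 1)   ; => 65534
(bytevector-s16be-ref bv 1)   ; => -2
```

Each body is a couple of primitive calls with no dispatch on type or endianism, which is what makes inlining trivial.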
- (bytevector-<encoding>-ref bytevector n l)
- Returns a Scheme string corresponding to the binary value encoded according to encoding beginning at offset n in bytevector and continuing for l bytes.
- (bytevector-<encoding>-set! bytevector n v)
- Converts the string v to bytes encoded according to encoding and places them into bytevector beginning at offset n. Returns the number of bytes encoded.
The encodings are:
- UTF-8 encoding
- UTF-16 encoding (respects BOM if present, defaults to native encoding otherwise)
- UTF-16BE encoding (treats BOM as a normal character)
- UTF-16LE encoding (treats BOM as a normal character)
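For the UTF-8 case, the string procedures can be sketched on top of the R7RS-small procedures utf8->string, string->utf8 and bytevector-copy!. The names bytevector-utf8-ref and bytevector-utf8-set! are assumed spellings of the <encoding> pattern, not names fixed by this proposal, and the sketch leaves bounds errors to the underlying procedures.

```scheme
(import (scheme base))

;; Hypothetical instance of bytevector-<encoding>-ref for UTF-8:
;; decode l bytes starting at byte offset n.
(define (bytevector-utf8-ref bv n l)
  (utf8->string bv n (+ n l)))

;; Hypothetical instance of bytevector-<encoding>-set! for UTF-8:
;; encode v, copy the bytes in at offset n, return the encoded byte count.
(define (bytevector-utf8-set! bv n v)
  (let ((encoded (string->utf8 v)))
    (bytevector-copy! bv n encoded)
    (bytevector-length encoded)))

(define bv (make-bytevector 8 0))
;; "\x3bb;x" is a lambda followed by x: two bytes plus one byte in UTF-8.
(bytevector-utf8-set! bv 2 "\x3bb;x")  ; => 3
(bytevector-utf8-ref bv 2 3)           ; => the same two-character string
```

Returning the byte count from set! lets the caller compute the offset of whatever follows the string without encoding it a second time.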
Map, for-each, fold, unfold
- Offsets are in bytes and can be arbitrary
- Offsets are in bytes but must be naturally aligned (divisible by n for an n-byte value)
- Offsets are in n-byte units (forces natural alignment, SRFI-4 style)
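To make the three offset conventions above concrete, here is a sketch of each one applied to a 16-bit big-endian read. All procedure names are hypothetical and exist only for comparison; the helper u16be-at is an assumed building block written with R7RS-small bytevector-u8-ref.

```scheme
(import (scheme base))

;; Hypothetical helper: unsigned big-endian 16-bit value at byte offset k.
(define (u16be-at bv k)
  (+ (* 256 (bytevector-u8-ref bv k))
     (bytevector-u8-ref bv (+ k 1))))

;; Option 1: offsets are in bytes and can be arbitrary.
(define (u16be-ref/arbitrary bv n)
  (u16be-at bv n))

;; Option 2: offsets are in bytes but must be divisible by the value size.
(define (u16be-ref/aligned bv n)
  (if (zero? (remainder n 2))
      (u16be-at bv n)
      (error "offset is not naturally aligned" n)))

;; Option 3: offsets count 2-byte units, so alignment holds by construction.
(define (u16be-ref/units bv n)
  (u16be-at bv (* n 2)))

(define bv (bytevector 1 2 3 4))
(u16be-ref/arbitrary bv 1)  ; => 515  (bytes 2 and 3)
(u16be-ref/aligned bv 2)    ; => 772  (bytes 3 and 4)
(u16be-ref/units bv 1)      ; => 772  (second 16-bit unit)
```

The second option costs a check on every access, the third costs a multiplication and cannot address unaligned data at all, and the first pushes any alignment handling down to the implementation.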
Should bytevector=? be provided?
I've trashed the UTF-32 conversions because nobody uses UTF-32. They can come back if somebody needs them.