This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home.

Source for wiki StringSlicesCowan version 8

author

cowan

name

StringSlicesCowan

text

== String span library ==

This is a library for manipulating textual content based on ''string spans'', which are conceptually references to parts of one or more Scheme strings, and ''string cursors'', which are pointers into strings.  It is not defined whether the string span type is disjoint from strings, or whether string cursors are a disjoint type at all.  String spans are immutable, and it is an error to mutate the underlying string.

In addition, string cursors may or may not be the same as character-based indexes into strings.  For example, in an implementation whose internal representation of strings is UTF-8, string cursors may be indexes of individual bytes in the string.  However, the operations provided here (with the exception of those in the Compatibility section) are entirely independent of the character repertoire supported by the implementation.
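
To make the difference between cursors and character indexes concrete, here is a minimal sketch, not part of the proposal, of a hypothetical UTF-8 implementation in which a cursor is a byte offset.  The names `sketch-cursor-next` and `sketch-index->cursor` are invented for illustration, and the example assumes an implementation whose strings can hold the character U+03BB.

```scheme
(import (scheme base))

;; Hypothetical model: a "cursor" is a byte offset into the UTF-8 encoding.
;; Continuation bytes have the bit pattern 10xxxxxx, i.e. values 128..191.
(define (continuation-byte? b)
  (= (quotient b 64) 2))

;; Advance a byte cursor to the first byte of the next character.
(define (sketch-cursor-next bytes cursor)
  (let loop ((i (+ cursor 1)))
    (if (and (< i (bytevector-length bytes))
             (continuation-byte? (bytevector-u8-ref bytes i)))
        (loop (+ i 1))
        i)))

;; Convert a character index to a byte cursor by stepping forward k times.
(define (sketch-index->cursor bytes k)
  (let loop ((cursor 0) (k k))
    (if (= k 0)
        cursor
        (loop (sketch-cursor-next bytes cursor) (- k 1)))))

;; "a", U+03BB, "b" encodes as the four bytes 61 CE BB 62.
(define bytes (string->utf8 "a\x3bb;b"))

(sketch-index->cursor bytes 2)   ; => 3, the byte offset of "b"
```

In this model character index 2 and cursor 3 name the same position, which is why the two cannot be assumed interchangeable.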

This proposal also contains a useful subset of SRFI 13, which manipulates strings directly with some allowances for shared substrings (which are provided only by Guile).  The string operations of this proposal are defined in terms of the string span operations.  Unlike SRFI 13, this proposal does not provide ''start'' and ''end'' arguments, as their functionality is subsumed by spans.  In addition, the low-level procedures are not provided, nor are any mutation operations.  Procedures with the same names and functions as SRFI 13 procedures are marked [SRFI 13]; note that they do not necessarily support all of the arguments of the SRFI 13 versions.

Procedures marked [R7RS-small] are available in the small language, and are not exported by implementations of this proposal.  They are included only for completeness.

All predicates passed to procedures defined in this proposal may be called in any order and any number of times, except as otherwise noted.

== Issues ==

1. Allow negative indices in constructors?

2. Titlecase doesn't really fit; keep it?

3. Keep functional update?

4. Keep string trees?

5. Keep compatibility routines, possibly in a different package?

== Specification ==

With the exception of the constructors, all the procedures in this proposal exist in pairs: one that accepts and produces string spans and one that accepts and produces strings.  Only the string span version is documented in full; the string version should be understood as accepting the same non-span arguments, performing the same operations, and providing the same non-span results.

== String constructors ==

`(make-string ` ''k'' [ ''char'' ]`)` [R7RS-small]

Returns a string containing ''k'' characters, all of which are ''char''.  If ''char'' is omitted, the contents of the string are implementation-dependent.

`(string `''char'' ...`)` [R7RS-small]

`(string-unfold `''stop? mapper successor'' [ ''seed'' ]`)`

`(string-unfold-right `''stop? mapper successor'' [ ''seed'' ]`)`

`(span->string `''span''`)`

Returns a string which contains the characters of ''span'' in order.

== String span constructors ==

`(make-span `''string start end''`)`

Returns a string span which contains the characters of ''string'' in order from ''start'' (inclusive) to ''end'' (exclusive).

`(span `''char'' ...`)`

`(subspan `''span start end''`)`

`(string->span `''string''`)`

Returns a string span which contains the characters of ''string'' in order.
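
One possible way to picture these constructors is sketched below.  It assumes a span is a record holding the underlying string and a pair of character indexes; the proposal deliberately leaves the representation open, so the record type and every `sketch-` name are hypothetical.  The behaviour of `sketch-subspan` (offsets relative to the span) is likewise an assumption, since `subspan` is not described above.

```scheme
(import (scheme base))

;; Hypothetical representation: underlying string plus [start, end) indexes.
(define-record-type <sketch-span>
  (raw-span str start end)
  sketch-span?
  (str sketch-span-str)
  (start sketch-span-start)
  (end sketch-span-end))

;; Characters of str from start (inclusive) to end (exclusive); no copying.
(define (sketch-make-span str start end)
  (raw-span str start end))

;; A span covering the whole string.
(define (sketch-string->span str)
  (raw-span str 0 (string-length str)))

;; Copy the characters of the span out into a fresh string.
(define (sketch-span->string sp)
  (substring (sketch-span-str sp)
             (sketch-span-start sp)
             (sketch-span-end sp)))

;; Assumed semantics: start and end are relative to the span itself,
;; so they are shifted by the span's own start.
(define (sketch-subspan sp start end)
  (let ((base (sketch-span-start sp)))
    (raw-span (sketch-span-str sp) (+ base start) (+ base end))))

(define world (sketch-make-span "hello world" 6 11))
(sketch-span->string world)                       ; => "world"
(sketch-span->string (sketch-subspan world 1 3))  ; => "or"
```

Because the record shares the underlying string rather than copying it, mutating that string would silently change every span over it, which is consistent with the rule that such mutation is an error.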

`(span/cursors `''string start-cursor end-cursor''`)`

`(string-subspan/cursors `''string start-cursor end-cursor''`)`

`(span-transform `''proc span obj'' ...`)`

''Proc'' is a procedure which accepts a string as its first argument and returns a string.  It is invoked on a string which contains the characters of ''span'' in order plus the ''obj'' arguments, if any.  The resulting string is returned as a string span by `span-transform`.  This procedure allows string-based procedures to be easily used in an environment that provides and expects spans.
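
The description of `span-transform` can be sketched as follows, again assuming a hypothetical record representation of spans; all `sketch-` names are invented for illustration and are not the proposal's implementation.

```scheme
(import (scheme base))

;; Hypothetical span representation: underlying string plus [start, end).
(define-record-type <sketch-span>
  (raw-span str start end)
  sketch-span?
  (str sketch-span-str)
  (start sketch-span-start)
  (end sketch-span-end))

(define (sketch-string->span s)
  (raw-span s 0 (string-length s)))

(define (sketch-span->string sp)
  (substring (sketch-span-str sp)
             (sketch-span-start sp)
             (sketch-span-end sp)))

;; Convert the span to a string, call proc on it together with the extra
;; arguments, and wrap the resulting string as a span again.
(define (sketch-span-transform proc sp . objs)
  (sketch-string->span (apply proc (sketch-span->string sp) objs)))

;; string-append is an ordinary string-based procedure.
(define shouted
  (sketch-span-transform string-append (raw-span "hello world" 0 5) "!" "!"))

(sketch-span->string shouted)   ; => "hello!!"
```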

== Predicates ==

`(span? `''obj''`)`

`(string? `''obj''`)`  [R7RS-small]

Returns `#t` if ''obj'' is a string span, and `#f` otherwise.

`(span-null? `''span''`)`

`(string-null? `''string''`)`  [SRFI 13]

Returns `#t` if ''span'' contains zero characters, and `#f` otherwise.

`(span-every `''pred span''`)`

`(string-every `''pred string''`)`  [SRFI 13]

Returns `#t` if ''pred'' returns true for every character in ''span'', and `#f` otherwise.

`(span-any `''pred span''`)`

`(string-any `''pred string''`)`  [SRFI 13]

Returns `#t` if ''pred'' returns true for any character in ''span'', and `#f` otherwise.

`(is-char? `''char''`)`

Returns a predicate which accepts one argument, and returns `#t` if the argument is the same as ''char'' (in the sense of `char=?`) and `#f` otherwise.

`(in-char-set? `''char-set''`)`

Returns a predicate which accepts one argument, and returns `#t` if the argument is an element of ''char-set'', a SRFI 14 character set, and `#f` otherwise.
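
As an illustration of how a predicate constructor combines with the quantifying predicates, here is a minimal sketch over plain strings using only R7RS-small procedures.  The `sketch-` names are hypothetical stand-ins and are not the proposal's definitions.

```scheme
(import (scheme base))

;; Returns a one-argument predicate that is true only for characters
;; equal to c in the sense of char=?.
(define (sketch-is-char? c)
  (lambda (x) (and (char? x) (char=? x c))))

;; #t if pred is true for every character of s (vacuously so for "").
(define (sketch-string-every pred s)
  (let loop ((i 0))
    (cond ((= i (string-length s)) #t)
          ((pred (string-ref s i)) (loop (+ i 1)))
          (else #f))))

;; #t if pred is true for at least one character of s.
(define (sketch-string-any pred s)
  (let loop ((i 0))
    (cond ((= i (string-length s)) #f)
          ((pred (string-ref s i)) #t)
          (else (loop (+ i 1))))))

(sketch-string-every (sketch-is-char? #\a) "aaa")   ; => #t
(sketch-string-any (sketch-is-char? #\z) "aaa")     ; => #f
(sketch-string-every (sketch-is-char? #\a) "")      ; => #t
```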

== Selection ==

`(span-ref `''span k''`)`

`(string-ref `''string k''`)` [R7RS-small]

Returns the ''k''th character of ''span'', starting with 0.  It is an error if ''k'' is not a non-negative exact integer less than the length of ''span''.

`(span-take `''span n''`)`

`(string-take `''string n''`)` [SRFI 13]

Returns a string span which contains the first ''n'' characters of ''span''.

`(span-take-right `''span n''`)`

`(string-take-right `''string n''`)` [SRFI 13]

Returns a string span which contains the last ''n'' characters of ''span''.

`(span-drop `''span n''`)`

`(string-drop `''string n''`)` [SRFI 13]

Returns a string span which contains all but the first ''n'' characters of ''span''.

`(span-drop-right `''span n''`)`

`(string-drop-right `''string n''`)` [SRFI 13]

Returns a string span which contains all but the last ''n'' characters of ''span''.

`(span-split-at `''span n''`)`

`(string-split-at `''string n''`)` [SRFI 13]

Returns two values, a string span containing the first ''n'' characters of ''span'', and another string span containing the remaining characters of ''span''.
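
The string versions of the selection procedures above can be sketched in terms of the R7RS-small `substring` procedure.  This is an illustrative sketch with hypothetical `sketch-` names; it does no error checking on ''n''.

```scheme
(import (scheme base))

;; First n characters.
(define (sketch-string-take s n)
  (substring s 0 n))

;; Last n characters.
(define (sketch-string-take-right s n)
  (let ((len (string-length s)))
    (substring s (- len n) len)))

;; All but the first n characters.
(define (sketch-string-drop s n)
  (substring s n (string-length s)))

;; All but the last n characters.
(define (sketch-string-drop-right s n)
  (substring s 0 (- (string-length s) n)))

;; Two values: the first n characters and the remainder.
(define (sketch-string-split-at s n)
  (values (sketch-string-take s n) (sketch-string-drop s n)))

(sketch-string-take "hello" 2)         ; => "he"
(sketch-string-take-right "hello" 2)   ; => "lo"
(sketch-string-drop-right "hello" 2)   ; => "hel"
(call-with-values
  (lambda () (sketch-string-split-at "hello" 2))
  list)                                ; => ("he" "llo")
```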

`(span-replicate `''span from to''`)`

`(string-replicate `''string from to''`)`

== Padding, trimming, and compressing ==

`(span-pad `''span len'' [ ''char'' ]`)`

`(string-pad `''string len'' [ ''char'' ]`)` [SRFI 13]

`(span-trim `''span len'' [ ''char'' ]`)`

`(string-trim `''string len'' [ ''char'' ]`)` [SRFI 13]

`(span-trim-right `''span len'' [ ''char'' ]`)`

`(string-trim-right `''string len'' [ ''char'' ]`)` [SRFI 13]

`(span-trim-both `''span len'' [ ''char'' ]`)`

`(string-trim-both `''string len'' [ ''char'' ]`)` [SRFI 13]

`(span-compress `''span'' [ ''char'' ]`)`

`(string-compress `''string'' [ ''char'' ]`)`

== Prefixes and suffixes ==

`(span-prefix `''span,,1,,'' ''span,,2,,''`)`

`(string-prefix `''string,,1,,'' ''string,,2,,''`)`

`(span-suffix `''span,,1,,'' ''span,,2,,''`)`

`(string-suffix `''string,,1,,'' ''string,,2,,''`)`

`(span-prefix-length `''span,,1,,'' ''span,,2,,''`)`

`(string-prefix-length `''string,,1,,'' ''string,,2,,''`)` [SRFI 13]

`(span-suffix-length `''span,,1,,'' ''span,,2,,''`)`

`(string-suffix-length `''string,,1,,'' ''string,,2,,''`)` [SRFI 13]

`(span-prefix? `''span,,1,,'' ''span,,2,,''`)`

`(string-prefix? `''string,,1,,'' ''string,,2,,''`)` [SRFI 13]

`(span-suffix? `''span,,1,,'' ''span,,2,,''`)`

`(string-suffix? `''string,,1,,'' ''string,,2,,''`)` [SRFI 13]

== Searching ==

`(span-count `''proc span''`)`

`(string-count `''proc string''`)` [SRFI 13]

`(span-take-while `''proc span''`)`

`(string-take-while `''proc string''`)` [SRFI 13]

`(span-drop-while `''proc span''`)`

`(string-drop-while `''proc string''`)` [SRFI 13]

`(span-break `''proc span''`)`

`(string-break `''proc string''`)` [SRFI 13]

`(span-drop `''proc span''`)`

`(string-drop `''proc string''`)` [SRFI 13]

`(span-contains `''span,,1,,'' ''span,,2,,''`)`

`(string-contains `''string,,1,,'' ''string,,2,,''`)` [SRFI 13]

== The whole string span or string ==

`(span-length `''span''`)`

`(string-length `''string''`)` [R7RS-small]

`(span-copy `''span''`)`

`(string-copy `''string'' [ ''start'' [ ''end'' ] ]`)` [R7RS-small]

`(span-reverse `''span''`)`

`(string-reverse `''string''`)` [SRFI 13]

`(span-append `''span'' ...`)`

`(string-append `''string'' ...`)` [R7RS-small]

`(span-concatenate `''list-of-spans''`)`

`(string-concatenate `''list-of-strings''`)` [SRFI 13]

`(span-concatenate-reverse `''list-of-spans''`)`

`(string-concatenate-reverse `''list-of-strings''`)` [SRFI 13]

== Folding and mapping ==

`(span-map `''proc span'' ...`)`

`(string-map `''proc string'' ...`)` [R7RS-small]

`(span-for-each `''proc span'' ...`)`

`(string-for-each `''proc string'' ...`)` [R7RS-small]

`(span-fold `''proc nil span''`)`

`(string-fold `''proc nil string''`)` [SRFI 13]

`(span-fold-right `''proc nil span''`)`

`(string-fold-right `''proc nil string''`)` [SRFI 13]

== Parsing ==

`(span-split `''span'' [ ''sep'' [ ''limit'' ] ]`)`

`(string-split `''string'' [ ''sep'' [ ''limit'' ] ]`)`

Returns a list of the words contained in ''span''.  If ''sep'' (which is also a string span) is omitted, then the words are separated by arbitrary strings of whitespace characters (those on which `char-whitespace?` returns `#t`). If ''sep'' is supplied, it specifies a string to be used as the word separator. The returned list will then have one more item than the number of non-overlapping occurrences of the separator in the string.  If ''sep'' is empty, then the returned list contains a list of the characters in ''span''.

If ''limit'' is provided, at most that many splits occur, and the remainder of ''span'' is returned as the final element of the list (thus, the result will have at most ''limit'' + 1 elements). If ''limit'' is not specified, then all possible splits are made.  It is an error if ''limit'' is not a positive exact integer.
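
The separator and limit rules can be illustrated with the following sketch over plain strings.  It is a simplification under stated assumptions: ''sep'' must be non-empty, the whitespace default is not modelled, and ''limit'' is passed explicitly as either a positive integer or `#f` for "no limit" rather than as an optional argument.  The names `sep-at?` and `sketch-string-split` are hypothetical.

```scheme
(import (scheme base))

;; Does sep occur in s starting at index i?
(define (sep-at? s sep i)
  (let ((n (string-length sep)))
    (and (<= (+ i n) (string-length s))
         (string=? (substring s i (+ i n)) sep))))

;; Split s at non-overlapping occurrences of the non-empty string sep.
;; limit is a positive exact integer, or #f for no limit (sketch convention).
(define (sketch-string-split s sep limit)
  (let ((len (string-length s))
        (n (string-length sep)))
    (let loop ((start 0) (i 0) (splits 0) (acc '()))
      (cond ((or (> (+ i n) len)
                 (and limit (= splits limit)))
             ;; No room for another separator, or the limit was reached:
             ;; the remainder becomes the final element.
             (reverse (cons (substring s start len) acc)))
            ((sep-at? s sep i)
             ;; Emit the word before the separator and skip past it.
             (loop (+ i n) (+ i n) (+ splits 1)
                   (cons (substring s start i) acc)))
            (else
             (loop start (+ i 1) splits acc))))))

(sketch-string-split "a,b,c" "," #f)   ; => ("a" "b" "c")
(sketch-string-split "a,b,c" "," 1)    ; => ("a" "b,c")
(sketch-string-split "a," "," #f)      ; => ("a" "")
```

The last call shows the "one more item than the number of separators" rule: a trailing separator yields a trailing empty string.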

== Filtering and partitioning ==

`(span-filter `''proc span''`)`

`(string-filter `''proc string''`)` [SRFI 13]

`(span-remove `''proc span''`)`

`(string-remove `''proc string''`)`

`(span-partition `''proc span''`)`

`(string-partition `''proc string''`)`

== Conversion ==

`(span->list `''span''`)`

`(span->vector `''span''`)`

`(string->list `''string'' [ ''start'' [ ''end'' ] ]`)` [R7RS-small]

`(list->string `''list''`)` [R7RS-small]

`(string->vector `''string'' [ ''start'' [ ''end'' ] ]`)` [R7RS-small]

`(vector->string `''vector'' [ ''start'' [ ''end'' ] ]`)` [R7RS-small]

== String cursors ==

{{{
string-cursor-start
string-cursor-end

string-cursor-ref

string-cursor-next
string-cursor-previous

string-cursor-forward
string-cursor-backward

string-cursor-forward-until
string-cursor-backward-until

string-cursor=?
string-cursor<?
string-cursor>?
string-cursor<=?
string-cursor>=?

string-cursor->index
string-index->cursor

string-cursors->string

string-cursor-difference
}}}

== Functional update ==

{{{
span-replace
string-replace

span-insert
string-insert

span-delete
string-delete
}}}

== String trees ==

{{{
string-tree->string

write-string-tree
}}}


== Compatibility ==

{{{
span-upcase
span-downcase
span-foldcase

span=?
span<?
span>?
span<=?
span>=?
span-ci=?
span-ci<?
span-ci>?
span-ci<=?
span-ci>=?

span-titlecase
string-titlecase [SRFI 13]
}}}

time

2014-12-16 01:04:55

version

8