This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home.

Source for wiki StringSlicesCowan version 7

author

cowan

comment


    

ipnr

127.11.51.1

name

StringSlicesCowan

readonly

0

text

== String slice library ==

This is a library for manipulating textual content based on ''string slices'', which are references to part of a Scheme string, and ''string cursors'', which are pointers into strings.  It is not defined whether the string slice type is disjoint from strings, or whether string cursors are a disjoint type at all.  String slices are immutable, and it is an error to mutate the underlying string.

In addition, string cursors may or may not be the same as character-based indexes into strings.  For example, in an implementation whose internal representation of strings is UTF-8, string cursors may be indexes of individual bytes in the string.  However, the operations provided here (with the exception of those in the Compatibility section) are entirely independent of the character repertoire supported by the implementation.

This proposal also contains a useful subset of SRFI 13, which manipulates strings directly with some allowances for shared substrings (which are both then and now provided only on Guile).  The string operations of this proposal are defined in terms of the string slice operations.  Unlike SRFI 13, it does not provide ''start'' and ''end'' arguments, as their functionality is subsumed by slices.  In addition, the low-level procedures are not provided, nor are any mutation operations.  Procedures with the same names and functions as SRFI-13 procedures are marked [SRFI 13]; note that they don't necessarily support all of the arguments of the SRFI 13 versions.

Procedures marked [R7RS-small] are available in the small language, and are not exported by implementations of this proposal.  They are included only for completeness.

All predicates passed to procedures defined in this proposal may be called in any order and any number of times, except as otherwise noted.

== Issues ==

1. Allow negative indices in constructors?

2. Titlecase doesn't really fit; keep it?

3. Keep functional update?

4. Keep string trees?

5. Keep compatibility routines, possibly in a different package?

== String constructors ==

`(make-string ` ''k'' [ ''char'' ]`)` [R7RS-small]

Returns a string containing ''k'' characters, all of which are ''char''.  If ''char'' is omitted, the contents of the string are implementation-dependent.

`(string `''char'' ...`)` [R7RS-small]

`(string-unfold `''stop? mapper successor'' [ ''seed'' ]`)`

`(string-unfold-right `''stop? mapper successor'' [ ''seed'' ]`)`

`(string-slice->string `''string-slice''`)`

Returns a string which contains the characters of ''string-slice'' in order.

== String slice constructors ==

`(make-string-slice `''string start end''`)`

Returns a string-slice which contains the characters of ''string'' in order from ''start'' (inclusive) to ''end'' (exclusive).

`(string-slice `''char'' ...`)`

`(string-slice-subslice `''slice start end''`)`

`(string->string-slice `''string''`)`

Returns a string slice which contains the characters of ''string'' in order.

`(string-slice/cursors `''string start-cursor end-cursor''`)`

`(string-subslice/cursors `''string start-cursor end-cursor''`)`

`(string-slice-transform `''proc slice obj'' ...`)`

''Proc'' is a procedure which accepts a string as its first argument and returns a string.  It is invoked on a string which contains the characters of ''slice'' in order plus the ''obj'' arguments, if any.  The resulting string is returned as a string slice by `string-slice-transform`.  This procedure allows string-based procedures to be easily used in an environment that provides and expects string slices.

== Predicates ==

`(string-slice? `''obj''`)`

`(string? `''obj''`)`  [R7RS-small]

Returns `#t` if ''obj'' is a string slice, and `#f` otherwise.

`(string-slice-null? `''slice''`)`

`(string-null? `''string''`)`  [SRFI 13]

Returns `#t` if ''slice'' contains zero characters, and `#f` otherwise.

`(string-slice-every `''pred slice''`)`

`(string-every `''pred string''`)`  [SRFI 13]

Returns `#t` if ''pred returns true for every character in ''slice'', and `#f` otherwise.

`(string-slice-any `''pred slice''`)`

`(string-any `''pred string''`)`  [SRFI 13]

Returns `#t` if ''pred returns true for any character in ''slice'', and `#f` otherwise.

`(is-char? `''char''`)`

Returns a predicate which accepts one argument, and returns `#t` if the argument is the same as ''char'' (in the sense of `char=?`) and `#f` otherwise.

`(in-char-set? `''char-set''`)`

Returns a predicate which accepts one argument, and returns `#t` if the argument is an element of ''char-set'', a SRFI 14 character set, and `#f` otherwise.

== Selection ==

`(string-slice-ref `''slice k''`)`

`(string-ref `''string k''`)` [R7RS-small]

Returns the 'k'th character of ''slice'', starting with 0.  It is an error if ''k'' is not a non-negative exact integer less than the length of ''slice''.

`(string-slice-take `''slice n''`)`

`(string-take `''string n''`)` [SRFI 13]

Returns a string slice which contains the first ''n'' characters of ''slice''.

`(string-slice-take-right `''slice n''`)`

`(string-take-right `''string n''`)` [SRFI 13]

Returns a string slice which contains the last ''n'' characters of ''slice''.

`(string-slice-drop `''slice n''`)`

`(string-drop `''string  n''`)` [SRFI 13]

Returns a string slice which contains all but the first ''n'' characters of ''slice''.

`(string-slice-drop-right `''slice n''`)`

`(string-drop-right `''string n''`)` [SRFI 13]

Returns a string slice which contains all but the last ''n'' characters of ''slice''.

`(string-slice-split-at `''slice n''`)`

`(string-split-at `''string  n''`)` [SRFI 13]

Returns two values, a string slice containing the first ''n'' characters of ''slice'', and another string slice containing the remaining characters of ''slice''.

`(string-slice-replicate `''slice from to''`)`

`(string-replicate `''string from to''`)`

== Padding, trimming, and compressing ==

`(string-slice-pad `''slice len'' [ ''char'' ]`)`

`(string-pad `''string  len'' [ ''char'' ]`)` [SRFI 13]

`(string-slice-trim `''slice len'' [ ''char'' ]`)`

`(string-trim `''string len'' [ ''char'' ]`)` [SRFI 13]

`(string-slice-trim-right `''slice len'' [ ''char'' ]`)`

`(string-trim-right `''string len'' [ ''char'' ]`)` [SRFI 13]

`(string-slice-trim-both `''slice len'' [ ''char'' ]`)`

`(string-trim-both `''string len'' [ ''char'' ]`)` [SRFI 13]

`(string-slice-compress `''slice'' [ ''char'' ]`)`

`(string-compress `''string'' [ ''char'' ]`)`

== Prefixes and suffixes ==

`(string-slice-prefix `''slice,,1,,'' ''slice,,2,,''`)`

`(string-prefix `''string,,1,,'' ''string,,2,,''`)`

`(string-slice-suffix `''slice,,1,,'' ''slice,,2,,''`)`

`(string-suffix `''string,,1,,'' ''string,,2,,''`)` [SRFI 13]

`(string-slice-prefix-length `''slice,,1,,'' ''slice,,2,,''`)`

`(string-prefix-length `''string,,1,,'' ''string,,2,,''`)`

`(string-slice-suffix-length `''slice,,1,,'' ''slice,,2,,''`)` [SRFI 13]

`(string-suffix-length `''string,,1,,'' ''string,,2,,''`)`

`(string-slice-prefix? `''slice,,1,,'' ''slice,,2,,''`)`

`(string-prefix? `''string,,1,,'' ''string,,2,,''`)` [SRFI 13]

`(string-slice-suffix? `''slice,,1,, slice,,2,,''`)`

`(string-suffix? `''string,,1,,'' ''string,,2,,''`)` [SRFI 13]

== Searching ==

`(string-slice-count `''proc slice''`)`

`(string-count `''proc string''`)` [SRFI 13]

`(string-slice-take-while `''proc slice''`)`

`(string-take-while  `''proc string''`)` [SRFI 13]

`(string-slice-drop-while `''proc slice''`)`

`(string-drop-while  `''proc string''`)` [SRFI 13]

`(string-slice-break `''proc slice''`)`

`(string-break `''proc string''`)` [SRFI 13]

`(string-slice-drop `''proc slice''`)`

`(string-drop `''proc string''`)` [SRFI 13]

`(string-slice-contains `''slice,,1,, slice,,2,,''`)`

`(string-contains `''string,,1,, string,,2,,''`)` [SRFI 13]

== The whole string slice or string ==

`(string-slice-length `''slice''`)`

`(string-length `''string''`)` [R7RS-small]

`(string-slice-copy `''slice''`)`

`(string-copy `''string'' [ ''start'' [ ''end'' ] ]`)` [R7RS-small]

`(string-slice-reverse `''slice''`)`

`(string-reverse `''slice''`)` [SRFI 13]

`(string-slice-append `''slice'' ...`)`

`(string-append `''string'' ...`)` [R7RS-small]

`(string-slice-concatenate `''list-of-slices''`)`

`(string-concatenate `''list-of-strings''`)` [SRFI 13]

`(string-slice-concatenate-reverse `''list-of-slices''`)`

`(string-concatenate-reverse `''list-of-strings''`)` [SRFI 13]

== Folding and mapping ==

`(string-slice-map `''proc slice'' ...`)`

`(string-map `''proc string'' ...`)` [R7RS-small]

`(string-slice-for-each `''proc slice'' ...`)`

`(string-for-each `''proc string'' ...`)` [R7RS-small]

`(string-slice-fold `''proc nil slice''`)`

`(string-fold `''proc nil string''`)` [R7RS-small]

`(string-slice-fold-right `''proc nil slice''`)`

`(string-fold-right `''proc nil string''`)` [R7RS-small]

== Parsing ==

`(string-slice-split `''slice [''sep'' [ ''limit'' ] ]`)`

`(string-slice-split `''slice [''sep'' [ ''limit'' ] ]`)`

Returns a list of the words contained in ''slice''.  If ''sep'' (which is also a string slice) is omitted, then the words are separated by arbitrary strings of whitespace characters (those on which `char-whitespace?` returns `#t`). If ''sep'' is supplied, it specifies a string to be used as the word separator. The returned list will then have one more item than the number of non-overlapping occurrences of the separator in the string.  If ''sep'' is empty, then the returned list contains a list of the characters in ''slice''.

If ''limit'' is provided, at most that many splits occur, and the remainder of ''slice'' is returned as the final element of the list (thus, the result will have at most ''limit'' + 1 elements). If ''limit'' is not specified, then all possible splits are made.  It is an error if ''limit'' is not a positive exact integer.

== Filtering and partitioning ==

`(string-slice-filter `''proc slice''`)` [SRFI 13]

`(string-filter `''proc string''`)`

`(string-slice-remove `''proc slice''`)`

`(string-remove `''proc string''`)`

`(string-slice-partition `''proc slice''`)`

`(string-partition `''proc string''`)`

== Conversion ==

`(string-slice->list `''slice''`)`

`(string-slice->vector `''slice''`)`

`(string->list `''string'' [ ''start'' [ ''end'' ] ]`)` [R7RS-small]

`(list->string `''list''[ ''start'' [ ''end'' ] ]`)` [R7RS-small]

`(string->vector `''string''[ ''start'' [ ''end'' ] ]`)` [R7RS-small]

`(vector->string `''vector''[ ''start'' [ ''end'' ] ]`)` [R7RS-small]

== String cursors ==

{{{
string-cursor-start
string-cursor-end

string-cursor-ref

string-cursor-next
string-cursor-previous

string-cursor-forward
string-cursor-backward

string-cursor-forward-until
string-cursor-backward-until

string-cursor=?
string-cursor<?
string-cursor>?
string-cursor<=?
string-cursor>=?

string-cursor->index
string-index->cursor

string-cursors->string

string-cursor-difference
}}}

== Functional update ==

{{{
string-slice-replace
string-replace

string-slice-insert
string-insert

string-slice-delete
string-delete
}}}

== String trees ==

{{{
string-tree->string

write-string-tree
}}}


== Compatibility ==

{{{
string-slice-upcase
string-slice-downcase
string-slice-foldcase

string-slice=?
string-slice<?
string-slice>?
string-slice<=?
string-slice>=?
string-slice-ci=?
string-slice-ci<?
string-slice-ci>?
string-slice-ci<=?
string-slice-ci>=?

string-slice-titlecase
string-titlecase [SRFI 13]
}}}

time

2014-12-03 07:42:20

version

7