This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home. For a version of this page that may be more recent, see ImmutableDataStructuresWortman in WG2's repo for R7RS-large.

Immutable­Data­Structures­Wortman

cowan
2015-05-27 01:50:45
19history
source

Introduction

This proposal defines immutable data structures for queues, sets, and maps. A structure is immutable when all its operations leave the structure unchanged. Note that none of the procedures specified here ends with a !.

Immutable structures are sometimes called persistent and are closely related to pure-functional (a.k.a. pure) structures. The availability of immutable data structures facilitates writing efficient programs in the pure-functional style.

Efficiency bounds

We specify required time efficiency upper bounds using big-O notation. We note when, in some cases, there is "slack" between the required bound and the theoretically optimal bound for an operation. Implementations may use data structures with amortized time bounds, but should document which bounds hold in only an amortized sense. The use of randomized data structures with expected time bounds is discouraged.

Disjoint types

Deques, sets, and maps are disjoint from all other Scheme types, and deques are disjoint from sets and maps. It is unspecified whether sets and maps are disjoint.

Deques

A deque (conventionally pronounced "deck") is a queue data structure that supports efficient manipulation of either of its ends. It may be used in place of a (LIFO) stack or (FIFO) queue.

The unlabeled finger tree data structure can meet all these requirements rather conveniently. A pair of lists is also a suitable implementation.

Construction

(ideque [ element ...])

Returns a deque containing the elements. The leftmost element (if any) will be at the front of the deque and the rightmost element (if any) will be at the back. Takes O(n) time, where n is the number of elements.

(ideque-tabulate n proc)

Invokes proc successively on the integers 0 through n - 1 inclusive, and returns a deque containing the results in the given order. Takes O(n) time.

(ideque-unfold stop? mapper successor seed)

Invokes the predicate stop? on seed. If it returns false, generate the next result by applying mapper to seed, generate the next seed by applying successor to seed, and repeat this algorithm with the new seed. If stop? returns true, return a deque containing the results in order of accumulation. Takes O(n) time.

(ideque-unfold-right stop? mapper successor seed)

Invokes the predicate stop? on seed. If it returns false, generate the next result by applying mapper to seed, generate the next seed by applying successor to seed, and repeat the algorithm with the new seed. If stop? returns true, return a deque containing the results in reverse order of accumulation. Takes O(n) time.

Predicates

(ideque? x)

Returns #t if x is an ideque, and #f otherwise. Takes O(1) time.

(ideque-empty? deque)

Returns #t if deque contains zero elements, and #f otherwise. Takes O(1) time.

Queue operations

(ideque-front deque)

(ideque-back deque)

Returns the front/back element of deque. It is an error for deque to be empty. Each takes O(1) time.

(ideque-remove-front deque)

(ideque-remove-back deque)

Returns a deque with the front/back element of deque removed. It is an error for deque to be empty. Each takes O(1) time.

(ideque-add-front deque obj)

(ideque-add-back deque obj)

Returns a deque with obj pushed to the front/back of deque. Each takes O(1) time.

Other accessors

(ideque-take deque n)

(ideque-take-right deque n)

Returns a deque containing the first/last n elements of deque. Takes O(n) time.

(ideque-drop deque n)

(ideque-drop-right deque n)

Returns a deque containing all but the first/last n elements of deque. Takes O(n) time.

(ideque-split-at deque n)

Returns two values, the results of (ideque-take deque n) and (ideque-drop deque n) respectively. Takes O(n) time.

The whole deque

(ideque-length deque)

Returns the length of deque as an exact integer. May take O(n) time, though O(1) is optimal.

(ideque-append deque ...)

Returns a deque with the contents of the first argument deque followed by the others. Takes O(kn) time, where k is the number of deques and n is the number of elements involved, though O(k log n) is possible.

(ideque-concatenate list-of-deques)

Returns a deque with the contents of the first deque in list-of-deques followed by the others. This is provided for Schemes in which the number of arguments which can be passed to apply is limited. Takes O(kn) time, where k is the number of deques and n is the number of elements involved, though O(k log n) is possible.

(ideque-reverse deque)

Returns a deque containing the elements of deque in reverse order. Takes O(n) time.

(ideque-count pred deque)

Returns the number of elements of deque which satisfy pred as an exact integer. Takes O(n) time.

(ideque-delete pred deque)

Returns a deque containing the elements of deque, except for those which satisfy pred. Takes O(n) time.

(ideque-zip deque ...)

Returns a deque of lists (not deques) each of which contains the corresponding element of the argument deques in the order specified. Processing stops when all the elements of any deque have been seen. Takes O(kn) time, where k is the number of deques and n is the number of elements involved.

Mapping

(ideque-map proc deque)

Applies proc to each element of deque and returns a deque containing the results in order. Takes O(n) time.

(ideque-for-each proc deque)

Applies proc to each element of deque in order and returns an unspecified result. Takes O(n) time.

(ideque-fold proc nil deque)

(ideque-fold-right proc nil deque)

Invokes proc on each member of deque in order/reverse order, passing the result of the previous invocation as a second argument. For the first invocation, nil is used as the second argument. Returns the result of the last invocation, or nil if there was no invocation. Takes O(n) time.

(ideque-append-map proc deque)

Applies proc to each element of deque. It is an error if the result is not a list. Returns a deque containing the elements of the lists in order. Takes O(n) time.

Filtering

(ideque-filter pred deque)

(ideque-remove pred deque)

Returns a deque which contains the elements of deque that do/do not satisfy pred. Takes O(n) time.

(ideque-partition proc deque)

Searching

(ideque-find pred deque failure)

(ideque-find-right pred deque failure)

Returns the first/last element of deque that satisfies pred. If there is no such element, invokes the thunk failure and returns what it returns. Takes O(n) time.

(ideque-take-while pred deque)

(ideque-take-while-right pred deque)

Returns a deque containing the longest initial/final prefix of elements in deque all of which satisfy pred. Takes O(n) time.

(ideque-drop-while pred deque)

(ideque-drop-while-right proc deque)

Returns a deque which omits the longest initial/final prefix of elements in deque all of which satisfy pred, but includes all other elements of deque. Takes O(n) time.

(ideque-span pred deque)

(ideque-break pred deque)

Returns two values, the initial prefix of the elements of deque which do/do not satisfy pred, and the remaining elements. Takes O(n) time.

(ideque-any? pred deque)

(ideque-every? pred deque)

Invokes pred on the elements of deque in order until one of them returns a true/false value, which is then returned. If there are no such elements, returns #f/#t. Takes O(n) time.

Conversion

(list->ideque list)

(ideque->list deque)

Conversion between deque and list structures. FIFO order is preserved, so the front of a list corresponds to the front of a deque. Each operation takes O(n) time.

Comparator

ideque-comparator

A comparator suitable for comparing ideques. It does not provide comparison procedures, as there is no ordering between ideques.

Sets

A sorted set data structure stores a finite collection of unique elements with a defined comparator.

These requirements can be satisfied by many flavors of self-balancing binary trees. Red-black trees, 1-2 brother trees, and labeled 2-3 finger trees are particularly convenient in an immutable context.

Construction

If two elements are to be inserted into a set that are equal in the sense of the set's comparator but are not eqv?, the first to be specified or generated prevails.

(iset comparator [ element ...])

Returns a set containing elements element..., where comparator provides the criterion of identity and ordering. Takes O(n log n) time.

(iset-tabulate n proc)

Invokes proc successively on the integers 0 through n - 1 inclusive, and returns a set containing the results. Takes O(n) time.

(iset-unfold stop? mapper successor seed)

Invokes the predicate stop? on seed. If it returns false, generate the next result by applying mapper to seed, generate the next seed by applying successor to seed, and repeat this algorithm with the new seed. If stop? returns true, return a set containing the results. Takes O(n) time.

Predicates

(iset? obj)

Returns #t if obj is a set, and #f otherwise. Takes O(1) time.

(iset-empty? set)

Returns #t if set contains zero elements, and #f otherwise. Takes O(1) time.

(iset-member? set obj)

Returns #t if set contains obj, and #f otherwise. Takes O(log n) time.

Accessors

(iset-min set)

(iset-max set)

Returns the least/greatest element of set. It is an error forset to be empty. Takes O(log n) time; O(1) is optimal.

(iset-comparator set)

Returns the comparator of set. Takes O(1) time.

Element operations

(iset-adjoin set obj)

Returns a set which contains the elements of set and obj as well. If there is already an element of set that is equal (in the sense of the comparator) to obj, the existing element of set prevails. Takes O(log n) time.

(iset-adjoin-all set list)

Returns a set which contains the elements of set and the elements of list as well. It is an error if the elements of list are not increasing in the sense of the comparator of set. If there is already an element of set which is equal (in the sense of the comparator) to an element of list, the element of set prevails. Takes O(k log n) time, where k is the length of list.

(iset-replace set obj)

Returns a set which contains the elements of set and obj as well. If there is already an element of set that is equal (in the sense of the comparator) to obj, obj prevails. Takes O(log n) time.

(iset-delete set obj)

Returns a set which contains the elements of set with the exception of obj, if present. If there is already an element of set that is equal (in the sense of the comparator) to obj, the element of set prevails. Takes O(log n) time.

(iset-delete-all set list)

Returns a set which contains the elements of set, excluding the elements of list. It is an error if the elements of list are not increasing in the sense of the comparator of set. Takes O(k log n) time, where k is the length of list.

(iset-predecessor set obj failure)

(iset-successor set obj failure)

Returns the element that immediately precedes/succeeds obj in set's ordering. If no such element exists because obj is the minimum/maximum element, or because set is empty, returns the result of invoking the thunk failure. Takes O(log n) time.

(iset-search set obj failure success)

A continuation-based universal update procedure. Attempts to find an element in set equal (in the sense of the comparator) to obj. When such an element is found, iset-search calls (success match update remove). The success procedure either tail-calls (update new-element ret) to replace the matched element with the new element, or else tail-calls (remove ret) to remove the matched element from set.

When no such match is found, iset-search calls (failure insert ignore), which either tail-calls (insert ret) to insert obj into set, or else tail-calls (ignore ret) .`

In all cases, iset-search returns two values, a set reflecting the indicated modification (if any) and the value ret produced by one of the continuations. It runs in O(log n) time.

(This procedure is based on an analogous procedure for hash tables suggested by Alexey Radul and attributed to Taylor Campbell.)

The whole set

(iset-size set)

Returns the size of set as an exact integer. May take O(n) time, though O(1) is optimal.

(iset-find set obj failure)

Returns the element equal (in the sense of the comparator of set) to obj in set, or the result of invoking the thunk failure if no such element exists. Takes O(log n) time.

(iset-count pred set)

Returns the number of elements in set which satisfy pred as an exact integer. Takes O(n) time.

(iset-any? pred set)

(iset-every? pred set)

Invokes pred on the elements of set until one of them returns a true/false value, which is then returned. If there are no such elements, returns #f/#t. Takes O(n) time.

Filtering

(iset-range= set obj)

(iset-range< set obj)

(iset-range> set obj)

(iset-range<= set obj)

(iset-range>= set obj)

Returns a set containing only the elements of set that are equal to / less than / greater than / less than or equal to / greater than or equal to obj. Takes O(k log k + log n) time, where n is the number of elements in the set and k is the number of elements returned; O(k + log n) is optimal.

Note that since set elements are unique, iset-range= returns a set of at most one element.

(iset-between set least include-least most include-most)

Returns a set containing the elements of set in the interval between leastand most. If include-least/include-most is true then the result includes an element equal to least/most respectively; otherwise those elements are not included. Takes O(k log k + log n) time, where n is the number of elements in the set and k is the number of elements returned; O(k + log n) is optimal.

(iset-filter pred set)

(iset-remove pred set)

Returns a set containing only those elements x for which (predicate? x) returns true/false. Takes O(n log n) time; O(n) is optimal.

(iset-partitionpred set)

Returns two values, (iset-filter pred set) and (iset-remove pred set) respectively.

Folding and mapping

(iset-fold proc nil set)

The fundamental set iterator. Equivalent to, but may be more efficient than, (fold proc base (iset->ordered-list set)). Takes O(n) time.

(iset-map/monotone proc set [ comparator ])

Returns a set containing the result of invoking proc on every element in set. It is an error unless proc is a monotone unary procedure that preserves the order of set elements. Observe that mapping a set of unique elements with a monotone function yields a set of unique elements, so element uniqueness follows from the monotonicity assumption. If comparator is given, it is the comparator of the result; otherwise the result uses the same comparator as set. Takes O(n) time.

(iset-mapproc set [ comparator [ merger ] ])

Like iset-map/monotone, except that proc is not required to be monotone. The merger procedure is used to select among any duplicate elements (in the sense of the comparator of set) that might be created; it returns the value to be used; if absent, the element chosen is implementation-specific. Takes O(n log n) time.

(iset-for-each proc set)

Invokes '''proc'' on every ''element'' in ''set''. The result is unspecified. Takes O(n) time.

Subsets

Note: The following three predicates do not obey the trichotomy law and therefore do not constitute a total order on sets.

(iset=? set1 set2 ...)

Returns #t if each set contains the same elements, and #f otherwise.

(iset<? ''set1 set2'' ...)`

Returns #t if each set other than the last is a proper subset of the following set, and #f otherwise.

(iset>? ''set1 set2'' ...)`

Returns #t if each set other than the last is a proper superset of the following set, and #f otherwise.

(iset<=? ''set1 set2'' ...)`

Returns #t if each set other than the last is a subset of the following set, and #f otherwise.

(iset>=? ''set1 set2'' ...)`

Returns #t if each set other than the last is a superset of the following set, and #f otherwise.

Conversion

(iset->list set)

Returns a list containing the elements of set in increasing order. Takes O(n) time.

(ordered-list->iset comparator list)

Returns a set containing the elements of list and using comparator. It is an error for list to be anything other than a proper list of elements in increasing order. Takes O(n log n) time; O(n) is optimal.

(list->iset comparator list [ merger'' ])

Returns a set containing the elements of list and using comparator. It is an error if list is not a proper list, but it may contain duplicates and need not be in order. The merger procedure is used to select among any duplicate elements (in the sense of the comparator of set) that might be created; it accepts the existing and new elements and returns the value to be used. Takes O(n log n) time.

Set-theoretic operations

(iset-union set ... )

(iset-intersection set ... )

(iset-difference set ... )

(iset-xor set1 set2)

Returns a set containing the union/intersection/difference/symmetric difference of the arguments. All the arguments must be sets sharing an equivalent comparator. The set operator is applied in a left-associative order. If an element is found in more than one set, the first set in the argument list prevails. May take O(kn log n) time, where k is the number of sets and n is the number of elements involved, though O(kn) time is optimal.

Maps

A map data structure stores key-value associations from a map of keys with a comparator to arbitrary value objects. It is an alternative to an association list or hash table, which also store key-value associations, but with different key constraints and efficiency guarantees.

In the same way that a list of key-value dotted pairs can implement an association list, a set of key-value dotted pairs can implement a map. Implementations may use this approach, or may implement a distinct data structure specific to maps.

Construction

If two associations are to be inserted into a map that are equal in the sense of the map's comparator but are not eqv?, the first to be specified or generated prevails.

(imap comparator ( key value ...])

Returns a map usingcomparator. For each pair of arguments, an association is added to the map with key as its key and value as its value. Takes O(n log n) time.

(imap-tabulate n proc)

Invokes proc successively on the integers 0 through n - 1 inclusive, and returns a map containing the results. Takes O(n) time.

(imap-unfold stop? mapper successor seed)

Invokes the predicate stop? on seed. If it returns false, generate the next result by applying mapper to seed, generate the next seed by applying successor to seed, and repeat this algorithm with the new seed. If stop? returns true, return a map containing the results. Takes O(n) time.

Predicates

(imap? obj)

Returns #t if obj is a map, and #f otherwise. Takes O(1) time.

(imap-empty? map)

Returns #t if map contains zero associations, and #f otherwise. Takes O(1) time.

(imap-contains? map obj)

Returns #t if map contains obj as a key, and #f otherwise. Takes O(log n) time.

Accessors

(imap-min map)

(imap-max map)

Returns the least/greatest key of map. It is an error formap to be empty. Takes O(log n) time; O(1) is optimal.

(imap-comparator map)

Returns the comparator of map. Takes O(1) time.

Element operations

(imap-add map key value)

Returns a map which contains the associations of map and an association with key and value as well. If there is already an association of map whose key is equal (in the sense of the comparator) to key, the existing key prevails. Takes O(log n) time.

(imap-add-all map key-list value-list)

Returns a map which contains the associations of map and associations constructed from the corresponding associations of key-list and value-list as well. It is an error if the associations of key-list are not increasing in the sense of the comparator of map. If there is already an association of map which is equal (in the sense of the comparator) to an association of list, the key of map prevails. Takes O(k log n) time, where k is the length of list.

(imap-replace map key value)

Returns a map which contains the associations of map and an association with key and value as well. If there is already an association of map whose key is equal (in the sense of the comparator) to key, key prevails. Takes O(log n) time.

(imap-delete map key)

Returns a map which contains the associations of map with the exception of any association whose key is key. If there is already an association of map whose key is equal (in the sense of the comparator) to key, the existing key prevails. Takes O(log n) time.

(imap-delete-all map key-list)

Returns a map which contains the associations of map, excluding any associations whose keys appear in key-list. It is an error if the associations of list are not increasing in the sense of the comparator of map. Takes O(k log n) time, where k is the length of key-list.

(imap-predecessor map obj failure)

(imap-successor map obj failure)

Returns the key that immediately precedes/succeeds obj in map's ordering. If no such association exists because obj is the minimum/maximum key, or because map is empty, returns the result of invoking the thunk failure. Takes O(log n) time.

(imap-search map obj failure success)

A continuation-based universal update procedure. Attempts to find an association in map whose key is equal (in the sense of the comparator) to obj. When such an association is found, imap-search calls (success key update remove). The success procedure either tail-calls (update new-value ret) to return a map that associates the matched key with new-value, or else tail-calls (remove ret) to remove the matched association from map.

When no such association is found, imap-search calls (failure insert ignore), which either tail-calls (insert new-value ret) to insert an association whose key is obj and whose value is new-value into map, or else tail-calls (ignore ret) .`

In all cases, imap-search returns two values, a map reflecting the indicated modification (if any) and the value ret produced by one of the continuations. It runs in O(log n) time.

(This procedure is based on an analogous procedure for hash tables suggested by Alexey Radul and attributed to Taylor Campbell.)

The whole map

(imap-size map)

Returns the size of map as an exact integer. May take O(n) time, though O(1) is optimal.

(imap-find map pred failure)

For each association of map, invoke proc on its key and value. If proc returns true on a value, then return that value. If all the calls to proc return #f, return the result of invoking the thunk failure. Takes O(log n) time.

(imap-count pred map)

Returns the number of associations in map which satisfy pred as an exact integer. Takes O(n) time.

(imap-any pred map)

(imap-every pred map)

Invokes pred on the associations of map in order until one of them returns a true/false value, which is then returned. If there are no such associations, returns #f/#t. Takes O(n) time.

Filtering

(imap-range= map obj)

(imap-range< map obj)

(imap-range> map obj)

(imap-range<= map obj)

(imap-range>= map obj)

Returns a map containing only the associations of map that are equal to / less than / greater than / less than or equal to / greater than or equal to obj. Takes O(k log k + log n) time, where n is the number of associations in the map and k is the number of associations returned; O(k + log n) is optimal.

Note that since map associations are unique, imap-range= returns a map of at most one association.

(imap-between map least include-least most include-most)

Returns a map containing the associations of map in the interval between leastand most. If include-least/include-most is true then the result includes an association equal to least/most respectively; otherwise those associations are not included. Takes O(k log k + log n) time, where n is the number of associations in the map and k is the number of associations returned; O(k + log n) is optimal.

(imap-filter pred map)

(imap-remove pred map)

Returns a map containing only those associations x for which (predicate? x) returns true/false. Takes O(n log n) time; O(n) is optimal.

(imap-partitionpred map)

Returns two values, (imap-filter pred map) and (imap-remove pred map) respectively.

Folding and mapping

(imap-fold proc nil map)

The fundamental map iterator. Equivalent to, but may be more efficient than, (fold proc base (imap->ordered-list map)). Takes O(n) time.

(imap-map/monotone proc map [ comparator ])

Returns a map containing the result of invoking proc on every association in map. It is an error unless proc is a monotone unary procedure that preserves the order of map associations. Observe that mapping a map of unique associations with a monotone function yields a map of unique associations, so association uniqueness follows from the monotonicity assumption. If comparator is given, it is the comparator of the result; otherwise the result uses the same comparator as map. Takes O(n) time.

(imap-mapproc map [ comparator [ merger ] ])

Like imap-map/monotone, except that proc is not required to be monotone. The merger procedure is used to select among any duplicate associations (in the sense of the comparator of map) that might be created; it returns the value to be used; if absent, the association chosen is implementation-specific. Takes O(n log n) time.

(imap-for-each proc map)

Invokes '''proc'' on every ''association'' in ''map''. The result is unspecified. Takes O(n) time.

Subsets

Note: The following three predicates do not obey the trichotomy law and therefore do not constitute a total order on sets.

(imap=? set1 set2 ...)

Returns #t if each map contains the same associations, and #f otherwise.

(imap<? ''set1 set2'' ...)`

Returns #t if each map other than the last is a proper subset of the following map, and #f otherwise.

(imap>? ''set1 set2'' ...)`

Returns #t if each map other than the last is a proper superset of the following map, and #f otherwise.

(imap<=? ''set1 set2'' ...)`

Returns #t if each map other than the last is a subset of the following map, and #f otherwise.

(imap>=? ''set1 set2'' ...)`

Returns #t if each map other than the last is a superset of the following map, and #f otherwise.

Conversion

(imap-keys imap)

(imap-values imap)

Returns a list of the keys/values of imap in increasing order.

(imap-entries imap)

Returns two values, lists of the keys and values of imap in increasing order.

(imap->alist map)

Returns an association list containing the associations of map in increasing order. Takes O(n) time.

(ordered-alist->imap comparator list)

Returns a map containing the associations of list and using comparator. It is an error for alist to be anything other than an alist in increasing order. Takes O(n log n) time; O(n) is optimal.

(alist->imap comparator list [ merger'' ])

Returns a map containing the associations of alist and using comparator.It is an error unless alist is a proper association list, but it may contain duplicates and need not be in order. The merger procedure is used to select among any duplicate keys (in the sense of the comparator of map) that might be created; it accepts the existing and new keys and returns the key to be used. Takes O(n log n) time.

Set-theoretic operations

(imap-union map ... )

(imap-intersection map ... )

(imap-difference map ... )

(imap-xor map1 map2)

Returns a map containing the union/intersection/difference/symmetric difference of the arguments. All the arguments must be sets sharing an equivalent comparator. The map operator is applied in a left-associative order. If an association is found in more than one set, the first map in the argument list prevails. May take O(kn log n) time, where k is the number of sets and n is the number of associations involved, though O(kn) time is optimal.