This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home.

Source for wiki HashTablesCowan version 17

author

cowan

comment


    

ipnr

108.182.78.175

name

HashTablesCowan

readonly

0

text

'''Work in progress, do not use yet'''

== Hash tables ==

This WG2 proposal defines an interface to hash tables, which are widely recognized as a fundamental data structure for a wide variety of applications.  A hash table is a data structure that:

   * Is disjoint from all other types.
   * Provides a mapping from objects known as ''keys'' to corresponding objects known as ''values''.
     * Keys may be any Scheme objects in some kinds of hash tables, but are restricted in other kinds.
     * Values may be any Scheme objects.
   * Has no intrinsic order for the key-value ''associations'' it contains, though an order may be defined as an extension.
   * Provides an ''equivalence function'' which defines when a proposed key is the same as an existing key.  No table may contain more than one value for a given key.
   * Supports mutation as the primary means of setting the contents of a table.
   * Assumes that keys are immutable; mutating a key leads to unspecified behavior.
   * Provides key lookup and destructive update in amortised constant time.
   * Does not guarantee that whole-table operations work in the presence of concurrent mutation.

== Procedures ==

=== Constructors ===

`(make-hash-table `''equivalence hash'' `.` ''args''`)`

Returns a newly allocated hash table whose equivalence procedure is ''equivalence''and hash procedure is ''hash'' (defaults to a suitable function, and args mean whatever (number is initial capacity, 'immutable, 'weakkeys, 'weak-values, all implementation-dependent).  The "safe" equivalence procedures are eq?, eqv?, equal?, string=?, string-ci=?.  (SRFI-69 `make-hash-table`; R6RS `make-eq-hashtable`, `make-eqv-hashtable`, and `make-hashtable`)

`(hash-table `''equivalence hash'' ( ''key value'' ) ...`)`

Ditto, but with explicitly specified key/value pairs

=== Predicates ===

`(hash-table? `''obj''`)`

Returns `#t` if ''obj'' is a hash table.  (SRFI-69 `hash-table?`; R6RS `hashtable?`)

`(hash-table-contains? `''hash-table key''`)`

Returns `#t` if there is any association to ''key'' in ''hash-table'', and `#f` otherwise.  Must execute in amortized O(1) time.  (SRFI-69 `hash-table-exists?`; R6RS `hashtable-contains?`)

`(hash-table-value-exists? `''hash-table value''`)`

Returns #t if an association whose value is ''value'' exists, #f if not.

`(hash-table=? `''key= value= hash-table,,1,, hash-table,,2,,''`)`

''Not yet defined''

=== Accessors ===

The following procedures, given a key, return the corresponding value.

`(hash-table-ref `''hash-table key'' [ ''failure'' [ ''success'' ] ]`)`

Extracts the value associated to ''key'' in ''hash-table'', invokes the procedure ''success'' on it, and returns its result.  Otherwise, invokes the procedure ''failure'' on no arguments and returns its result.  If ''success'' is not provided and is required, the value itself is returned.  If ''failure'' is not provided but is required, an error is signaled.  Must execute in amortized O(1) time, not counting the time to call the procedures.  (SRFI-69 `hash-table-ref`)

`(hash-table-ref/default `''hash-table key default''`)`

Semantically equivalent to, but may be more efficient than, the following code:
  `(hash-table-ref `''hash-table key'' `(lambda () `''default''`))`

(SRFI-69 `hash-table-ref/default`, R6RS `hashtable-ref`)

=== Mutators ===

The following procedures alter the associations in a hash table either unconditionally, or conditionally on the presence or absence of a specified key.

`(hash-table-set! `''hash-table key value''`)`

Creates a new association in ''hash-table'' that associates ''key'' with ''value''.  If there is a previous association for ''key'', it is deleted.  It is an error if ''hash-table'' is not a valid argument to the equality predicate of ''hash-table''.  Returns an unspecified value.  Must execute in amortized O(1) time.  (SRFI-69 `hash-table-set!`; R6RS `hashtable-set!`)

`(hash-table-set*! `''hash-table'' ( ''key value'' ) ...`)`

Repeatedly mutates ''hash-table'', setting each ''key'' to the ''value'' following it.

`(hash-table-set-entries! `''hash-table keys values''`)`

Repeatedly mutates ''hash-table'', setting each element of ''keys'' to the corresponding element of ''values''.

`(hash-table-set-alist! `''hash-table alist''`)`

Repeatedly mutates ''hash-table'' using the associations of ''alist''.

`(hash-table-delete! `''hash-table key''`)`

Deletes any association to ''key'' in ''hash-table'' and returns an unspecified value.  It is not an error if no association for that ''key'' exists.  Must execute in amortized O(1) time.    (SRFI-69 `hash-table-delete!`; R6RS `hashtable-delete!`)

`(hash-table-delete-keys! `''hash-table keylist''`)`

Semantically equivalent to, but may be implemented more efficiently than, the following code:

  `(for-each (lambda (key) (hash-table-delete `''hash-table'' key)) `''keylist''`)`

`(hash-table-ref! `''hash-table key'' [ ''failure'' [ ''success'' ] ]`)`

Invokes `hash-table-ref` with the given arguments and returns what it returns.  If ''key'' was found in ''hash-table'', its value is set to the result being returned.

`(hash-table-ref!/default `''hash-table key default''`)`

Invokes `hash-table-ref/default` with the given arguments and returns what it returns.  If ''key'' was found in ''hash-table'', its association is set to the result being returned.

`(hash-table-replace! `''hash-table key updater'' [ ''failure'' [ ''success ] ]`)`

Invokes `hash-table-ref` with the given arguments and returns what it returns.  If ''key'' was not found in ''hash-table'', its value is set to the result being returned.

`(hash-table-replace!/default `''hash-table key updater default''`)

Invokes `hash-table-ref/default` with the given arguments and returns what it returns.  If ''key'' was not found in ''hash-table'', its value is set to the result being returned.

`(hash-table-update! `''hash-table key updater'' [ ''failure'' [ ''success ] ]`)`

Semantically equivalent to, but may be implemented more efficiently than, the following code:

  `(hash-table-set! `''hash-table''` `''key''` (`''updater'' `(hash-table-ref `''hash-table''` `''key failure success''`)))`

Must execute in amortized O(1) time.  Returns an unspecified value.  (SRFI-69 `hash-table-update!/default`; R6RS `hashtable-update!`)

`(hash-table-update!/default `''hash-table key updater default''`)`

Semantically equivalent to, but may be implemented more efficiently than, the following code:

  `(hash-table-set! `''hash-table''` `''key''` (`''updater'' `(hash-table-ref/default `''hash-table''` `''key default''`)))`

Must execute in amortized O(1) time.  Returns an unspecified value.  (SRFI-69 `hash-table-update!`)

`(hash-table-push! `''hash-table key value''`)`

Conses value onto the value of key, empty list if no such key.

`(hash-table-pop! `''hash-table key'' [ ''success'' [ ''failure'' ] ]`)`

Returns car of value corresponding to key, mutates key to cdr.  Failure if the key does not exist or the value is not a pair.  Issue: signal an error for non-pair instead?

`(hash-table-pop!/default `''hash-table key default''`)`

The same, but with default if key does not exist or value is not a pair.  Issue: signal an error for non-pair instead?

== Functional update ==

`(hash-table-set `''hash-table key value''`)`

`(hash-table-set* `''hash-table'' ( ''key value'' ) ...`)`

`(hash-table-set-entries `''hash-table keys values''`)`

`(hash-table-set-alist `''hash-table alist''`)`

`(hash-table-delete `''hash-table key''`)`

`(hash-table-delete-keys `''hash-table keylist''`)`

`(hash-table-update `''hash-table key updater'' [ ''failure'' [ ''success ] ]`)`

`(hash-table-update/default `''hash-table key updater default''`)`

`(hash-table-replace/default hash-table `''key value default''`)`

`(hash-table-replace `''hash-table key updater'' [ ''failure'' [ ''success ] ]`)`

These procedures are equivalent to the mutators with corresponding names, except that they return a new hash table which differs from the original in the specified ways.  No guarantees are provided about their amortized execution time, as they may copy the original hash table.

=== Procedures on the whole hash table ===

`(hash-table-clear! `''hash-table''`)`

Remove all the associations from ''hash-table''.  Should execute in amortized O(1) time.  (R6RS `hashtable-clear!`)

`(hash-table-length `''hash-table''`)`  Must execute in amortized O(1) time.  (SRFI 69 `hash-table-size`, R6RS `hashtable-size`)

Return number of keys in hash-table.

`(hash-table-keys `''hash-table''`)`

Return list of all keys in an unspecified order. (SRFI 69 `hash-table-keys`; R6RS `hashtable-keys` returns a vector)

`(hash-table-values `''hash-table''`)`

Return list of all values in an unspecified order, not necessarily the same order as hash-table-keys. (SRFI 69 `hash-table-values`)

`(hash-table-entries `''hash-table''`)`

Returns two values, the list of keys in an unspecified order and the list of values in the corresponding order.  (R6RS `hash-table-entries` returns vectors)

`(hash-table-find `''hash-table proc''`)`

For each association of ''hash-table'', invoke ''proc'' on its key and value in an unspecified order.   If ''proc'' returns true, then return the value.  If ''proc'' always returns `#f`, return `#f`.

`(hash-table-count `''hash-table proc''`)`

For each association of ''hash-table'', invoke ''proc'' on its key and value in an unspecified order.  Return the number of calls to ''proc'' which returned true.

`(hash-table-remove! `''hash-table proc''`)`

For each association of ''hash-table'', invoke ''proc'' on its key and value in an unspecified order.  If ''proc'' returns true, delete the association.

=== Mapping and folding ===

`(hash-table-map `''equivalence hash proc merger hash-table''`)`

For each key invoke proc with the key and the corresponding value, and expect it to return two values, which are used to construct a new hash table using ''equivalence'' and ''hash''.  When a key already exists in the hash table, the procedure ''merger'' is called with arguments ''oldkey, oldvalue, newkey, newvalue'' and returns two values, the proper key and the proper value.

`(hash-table-map! `''proc merger hash-table''`)`

The same, but the values are used to mutate the hash table instead of creating a new one.

`(hash-table-for-each `''proc hash-table''`)`

The same, but discards the results.  (SRFI 69 `hash-table-walk` has the ''hash-table'' as the first argument)

`(hash-table-map->list `''proc hash-table''`)`

Pass each key and value as separate arguments to proc, collect results in list.  

`(hash-table-fold `''proc''` `''init''` `''hash-table''`)`

Calls ''proc'' for every association in ''hash-table'' with three arguments: the key of the association, the value of the association, and an accumulated value ''val''.  ''Val'' is ''init'' for the first invocation of ''procedure'', and for subsequent invocations of ''procedure'', the returned value of the previous invocation.  The value returned by `hash-table-fold` is the return value of the last invocation of ''proc''. The order in which ''procedure'' is called for different associations is unspecified.  (SRFI-69 `hash-table-fold` has the ''hash-table'' as the first argument)

`(hash-table-unfold `''equivalence hash continue? mapper successor seed''`)`

Create a new hash table using ''equivalence'' and ''hash''.  If the result of applying the predicate ''continue?'' to ''seed'' is `#f`, return the hash table.  Otherwise, apply the procedure ''mapper'' to ''seed''.  ''Mapper'' returns two values, which are inserted into the hash table as the key and the value respectively.  Then get a new seed by applying the procedure ''successor'' to ''seed'', and repeat this algorithm.

=== Copying and conversion ===

`(hash-table-copy `''hash-table''`)`

Returns a newly allocated hash table with the same equivalence and hash predicates and the same associations as ''hash-table''. (SRFI-69 `hash-table-copy`; R6RS `hashtable-copy`, passing `#t` as the second argument)

`(hash-table->alist `''hash-table''`)`

Returns an alist with the same associations as ''hash-table'' in an unspecified order.

`(alist->hash-table `''alist equivalence hash'' `.` ''args''`)`

Returns a newly allocated hash-table whose equivalence and hash procedures are ''equivalence'' and ''hash'' respectively, initializing it from the associations of ''alist''.  (SRFI 69 `alist->hash-table`)

=== Hash tables as functions ===

`(hash-table-accessor `''hash-table'' [ ''success'' [ ''failure'' ] ]`)`

`(hash-table-accessor/default `''hash-table default''`)`

`(hash-table-mutator `''hash-table''`)`

`(hash-table-deleter `''hash-table''`)`

`(hash-table-updater `''hash-table updater'' [ ''failure'' [ ''success'' ] ]`)`

`(hash-table-updater/default `''hash-table updater default''`)`

`(hash-table-replacer `''hash-table updater'' [ ''failure'' [ ''success'' ] ]`)`

`(hash-table-replacer/default `''hash-table updater default''`)`

These procedures allow hash tables to be used as functions with mutable behavior.  They return procedures that are curried versions of `hash-table-ref`, `hash-table-ref/default`, `hash-table-set!`, `hash-table-update!`, `hash-table-update!/default`, `hash-table-replace!`, and `hash-table-replace!/default` respectively on the given arguments.

=== Hash tables as sets ===

`(hash-table-union `''hash-table,,1,, hash-table,,2,,'' [ ''merger'' ]`)`

Adds the associations of hash-table,,2,, to a copy of hash-table,,1,, and returns it.  The values are merged using the merger procedure, which defaults to `(lambda (value,,1,, value,,2,,) value,,2)`.  (SRFI 69 `hash-table-merge!`)

`(hash-table-union! `''hash-table,,1,, hash-table,,2,,'' [ ''merger'' ]`)`

Adds the associations of hash-table,,2,, to a copy of hash-table,,1,, and returns it.  The values are merged using the merger procedure, which defaults to `(lambda (value,,1,, value,,2,,) value,,2)`. 

`(hash-table-difference `''hash-table,,1,, hash-table,,2,,''`)`

Removes the keys of hash-table,,2,, from copy of hash-table,,1,, and returns it.

`(hash-table-difference! `''hash-table,,1,, hash-table,,2,,''`)`

Removes the keys of hash-table,,2,, from hash-table,,1,, and returns it.

== An idea ==

Have a procedure that accepts an equivalence and a hash procedure, and returns a function that behaves the same as the equivalence function when called with two arguments, but when called with zero arguments, returns the hash procedure.  This is used to create hash tables, saving an extra argument each time.  Note that this allows the author of an equivalence predicate to provide hashing without cluttering up the interface, and allows the predicate to be provided as transparently as built-in predicates.

== Everything past this point is obsolete ==

Only kept around as a source of useful text to cut and paste.

== Excluded features ==

The following features are '''not''' part of this proposal for various reasons.

=== Incompatible return values ===

`Hash-table-keys` is present in both SRFI 69 and R6RS, but returns a list of keys in the first, a vector of keys in the second.

`Hash-table-values` returns a list of values in SRFI 69; `hashtable-entries` returns two values, a vector of keys and a vector of values in R6RS.

`Hash-table-hash-function` always returns the hash function of a hash table in SRFI 69, but returns `#f` in the case of `eq?` (and `eqv?`) hash tables in R6RS.

`Hash-table-equivalence-function` returns the equivalence function, but in R6/R7RS you can't reliably check functions for identity, so you don't know what you've got.

=== In R6RS but not in SRFI 69 ===

`Hashtable-clear!`, which removes all associations from a hash table.

Specifying the initial capacity of a hash table.

Immutable hash tables.

Hash tables based on `eqv?`.

=== In SRFI 69 but not in R6RS ===

`Hash-table-walk`, which is the analogue of `for-each` for hash tables.

Conversion from a-lists to hash tables and back.

`Hash-table-merge`, which destructively adds all associations in a source hash table to a destination hash table.

Default values specified in the form of a thunk to call.

=== In Common Lisp but not Scheme ===

The current capacity (as opposed to size) of a hash table.

Rehash size and threshold in constructor, plus accessors and mutators for them.

Hash tables based on `equalp` (which is not in Scheme).

`With-hash-table-iterator`, a hash table external iterator implemented as a local macro.

`Sxhash`, a stable hash function.

I have added equivalences to SRFI 69 and R6RS facilities.  Systems supporting either should be able to support these hash tables trivially; it will be necessary to write `hash-table-fold` for R6RS systems.  I have adopted SRFI 69's term `hash-table` rather than R6RS's `hashtable` because of our decision in #40.  Besides, the word ''hashtable'' obviously means something that can be ... hashted.

The main annoyances of this proposal for SRFI 69 programmers will be remembering to supply a third argument to `hash-table-ref`, and for R6RS programmers will be remembering to insert a hyphen in `hashtable`.  Both will have to get used to the new constructors.

time

2013-05-18 21:20:33

version

17