This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home.

Source for wiki HashTablesCowan version 121

author

cowan

comment

Flushing hash-table-search!

ipnr

127.11.51.1

name

HashTablesCowan

readonly

0

text

== Hash tables ==

This WG2 proposal defines an interface to hash tables, which are widely recognized as a fundamental data structure for a wide variety of applications.  A hash table is a data structure that:

   * Is disjoint from all other types.
   * Provides a mapping from objects known as ''keys'' to corresponding objects known as ''values''.
     * Keys may be any Scheme objects in some kinds of hash tables, but are restricted in other kinds.
     * Values may be any Scheme objects.
   * Has no intrinsic order for the key-value ''associations'' it contains.
   * Provides an ''equality predicate'' which defines when a proposed key is the same as an existing key.  No table may contain more than one value for a given key.
   * Provides a ''hash function'' which maps a candidate key into a non-negative exact integer.
   * Supports mutation as the primary means of setting the contents of a table.
   * Assumes that keys are immutable; mutating a key leads to unspecified behavior.
   * Provides key lookup and destructive update in (expected) amortized constant time, provided a satisfactory hash function is available.
   * Does not guarantee that whole-table operations work in the presence of concurrent mutation of the whole hash table (values may be mutated).

In addition, procedures for use on bimaps (bidirectional hash tables) are defined.

== Issues ==

   * In the specification of `make-hash-table`, the discussion of SRFI 69 hash functions could give readers the (presumably incorrect) idea that hash functions must accept an optional second argument, which would make the hash tables of this proposal significantly more difficult to use correctly.  The reference implementation of this proposal never passes more than one argument to a hash function, so the extra effort needed to write hash functions that accept a second argument would be wasted in implementations derived from the reference implementation.  Furthermore, programmers who labor under the impression that hash functions must accept an optional second argument will write unnecessarily complex hash functions that will never be tested with two arguments outside of unit tests written by those programmers.
   * Why is a bimap created by `make-bimap` required to share structure with its first argument?  Why are `bimap-forward-hash-table` and `bimap-reverse-hash-table` forbidden to return copies?  Copying hash tables would be cleaner, and would enable several procedures to run faster because they wouldn't have to guard against the possibility that one of the encapsulated hash tables had been mutated despite the contract.
     * It shouldn't be necessary to guard against it, because it is an error to mutate them anyway.  Which procedures are affected?
       * A previous version of this API required `hash-table-ref` and `hash-table-ref/default` to raise a specific kind of exception when the key is not found, which implied a guard (in case the key is not found because the hash function raises an exception), which implies the overhead of that guard in all procedures whose specification requires them to behave as though they had called `hash-table-ref` or `hash-table-ref/default`, even in bimap procedures where the invariant would ordinarily ensure the key would be found were it not for exposure of the component hash tables to mutation.  That performance issue disappeared when `hash-table-key-not-found?` was flushed, but there remains the issue of whether implementations of this API should be allowed (without being required) to protect against the class of errors caused by the error of mutating a component hash table.
   * Should this SRFI provide two libraries, one for hash tables and one for bimaps?

== Rationale ==

Hash tables themselves don't really need defending: almost all dynamically typed languages, from awk to !JavaScript to Lua to Perl to Python to Common Lisp, and including many Scheme implementations, provide them in some form as a fundamental data structure.  Therefore, what needs to be defended is not the data structure but the procedures.  The present proposal is at an intermediate level.  It supports a great many convenience procedures on top of the basic hash table interfaces provided by [http://srfi.schemers.org/srfi-69/srfi-69.html SRFI 69] and [http://www.r6rs.org/final/html/r6rs-lib/r6rs-lib-Z-H-14.html R6RS].  Nothing in it adds power to what those interfaces provide, but it does add convenience in the form of pre-debugged routines to do various common things, and even some things not so commonly done but useful.

There is no mandated support for thread safety, immutability, or weakness, though there are semi-portable hooks for specifying these features.

This specification accepts separate equality predicates and hash functions for backward compatibility, but strongly encourages the use of [http://srfi.schemers.org/srfi-114/srfi-114.html SRFI 114] comparators, which package a type test, an equality predicate, and a hash function into a single bundle.

Bimaps are just a convenience structure based on a pair of hash tables, one that maps keys to unique values, the other that maps the values back to their keys.  By providing mutation procedures, it becomes trivial to keep the two hash tables consistent.

=== SRFI 69 compatibility ===

==== Names ====

The names used in the present proposal are mostly derived from SRFI 69, with the following changes:

* The `hash-table-exists?` procedure seems to ask if its argument is an existing hash table.  It has been renamed `hash-table-contains?`, as in R6RS.

* The `hash-table-walk` procedure has been replaced by the similar `hash-table-for-each` procedure.

* The `hash-table-merge!` procedure has been renamed `hash-table-union!`, since it causes the first hash table to become the union of the two hash table arguments.

==== Calling conventions ====

* The `make-hash-table` and `alist->hash-table` procedures now accept a [http://srfi.schemers.org/srfi-114/srfi-114.html SRFI 114] comparator in place of an equality predicate and optional hash function.  The old calling convention is retained for backward compatibility. 

* The SRFI 69 `hash-table-fold` procedure places the hash table argument first.  For the same reasons as the reordering of `hash-table-walk` arguments, in this proposal it appears last.

* Unlike `hash-table-walk`, the `hash-table-for-each` procedure has the same argument order as R7RS `for-each`, `string-for-each`, and `vector-for-each`.  However, unlike those procedures it can only accept one hash table: coordinated access to multiple hash tables is meaningless, given that hash tables are unordered.

==== Reflection and hash-function procedures ====

SRFI 69 provides reflective procedures that, given a hash table, returns its equality predicate and hash function, as well as procedures that expose the implementation's hash functions suitable for the equality predicates `eq?`, `equal?`, `string=?`, and `string-ci=?`.  The second of these can also be used for `eqv?`.  However, if the the value of `eq?` hash function is not idempotent but depends on the memory address of the key, and the garbage collector moves such a key, it must also rehash every hash table containing that key.  In such implementations, the `hash-by-identity` procedure is unsafe to use outside the context of implementation-provided hash tables.

R6RS eliminates this issue by providing separate constructors for `eq?` and `eqv?` hash tables, and refusing to expose the hash functions for them.  However, the present proposal takes the radical option of exposing neither reflection nor implementation-based hash functions.  Instead, implementations are permitted to ignore user-provided hash functions in certain circumstances if they have address-based hash functions available.  They can of course be exposed by implementations as extensions, with suitable warnings against inappropriate uses.

=== R6RS compatibility ===

The relatively few hash table procedures in R6RS are all available in the present proposal under somewhat different names.  The only substantive difference is that R6RS `hashtable-values` and `hashtable-entries` return vectors, whereas in the present proposal `hash-table-values` and `hash-table-entries` return lists.  The present proposal adopts SRFI 69's term `hash-table` rather than R6RS's `hashtable`, because of the universal use of "hash table" rather than "hashtable" in other languages and in technical prose generally.  Besides, the English word ''hashtable'' obviously means something that can be ... hashted.

In addition, the `hashtable-ref` and `hashtable-update!` of R6RS correspond to the `hash-table-ref/default` and `hash-table-update!/default` of both SRFI 69 and the present proposal.

It would be trivial to provide the R6RS names (or for that matter the SRFI 69 names) on top of the present proposal.

=== Common Lisp compatibility ===

As usual, the Common Lisp names are completely different from the Scheme names. Common Lisp provides the following capabilities that are not in the present proposal:

* The constructor allows specifying the current capacity (as opposed to size), rehash size, and rehash threshold of the new hash table.  There are also accessors and mutators for these.

* There are hash tables based on `equalp` (which does not exist in Scheme).

* `With-hash-table-iterator` is a hash table external iterator implemented as a local macro.

* `Sxhash` is a stable hash function.

=== Sources ===

The procedures in the present proposal are drawn primarily from SRFI 69 and R6RS.  In addition, the following sources are acknowledged:

* The `hash-table-mutable?` and the second argument of `hash-table-copy` (which allows the creation of immutable hash tables) are from R6RS, renamed in the style of the present proposal (although it does not require implementations to provide immutable hash tables).

* The `hash-table-extend!`, `hash-table-extend/default`, `hash-table-collect`, `hash-table->alist`, and `alist->hash-table` procedures are from [http://docs.racket-lang.org/reference/hashtables.html Racket], renamed in the style of this proposal.

* The `hash-table-push!` and `hash-table-pop!` procedures are from [http://practical-scheme.net/gauche/man/gauche-refe_53.html Gauche].

* The `hash-table-find` procedure is a modified version of `table-search` in [http://www.iro.umontreal.ca/~gambit/doc/gambit-c.html#Tables Gambit].

* The procedure `hash-table-set-entries!` was suggested by the [http://clhs.lisp.se/Body/f_pairli.htm Common Lisp] function `pairlis`.

* The procedures `hash-table-unfold` and `hash-table-count` were suggested by [http://srfi.schemers.org/srfi-1/srfi-1.html SRFI 1].  

* The procedures `hash-table-accessor` and `hash-table-accessor/default` were loosely inspired by the finite functions of [https://code.google.com/p/owl-lisp/wiki/OwlManual#Finite_Functions Owl Lisp].

* The procedures `hash-table=?`, `hash-table-map`, `hash-table-map-values`, and `hash-table-filter!` were suggested by [http://hackage.haskell.org/packages/archive/containers/0.5.2.1/doc/html/Data-Map-Strict.html Haskell's Data.Map.Strict module].

* The bimap procedures are based on Gauche's bimaps.

The predicate `hash-table-empty?`, as well as the mutation procedures `hash-table-set-entries!`, `hash-table-replace!`, `hash-table-replace!/default`, `hash-table-map!`, `hash-table-intersection!`, and `hash-table-difference!`, were added for completeness. 

The native hash tables of [http://web.mit.edu/scheme_v9.0.1/doc/mit-scheme-ref/Hash-Tables.html MIT],  [http://www.gnu.org/software/guile/manual/html_node/Hash-Table-Reference.html Guile], [http://sisc-scheme.org/manual/html/ch09.html#Hashtables SISC], [http://www-sop.inria.fr/indes/fp/Bigloo/doc/bigloo-7.html#Hash-Tables Bigloo], [http://s48.org/0.57/manual/s48manual_44.html Scheme48], [http://www.cs.indiana.edu/scheme-repository/SCM/slib_2.html#SEC13 SLIB], [http://www.rscheme.org/rs/b/0.7.3.4/5/html/c2143.html RScheme], [https://ccrma.stanford.edu/software/snd/snd/s7.html#hashtables Scheme 7], [https://github.com/barak/scheme9/blob/master/lib/hash-table.scm Scheme 9], [http://www.fifi.org/cgi-bin/info2www?(librep)Hash+Tables Rep], and [https://code.google.com/p/femtolisp/wiki/APIReference FemtoLisp] were also investigated, but no additional procedures were incorporated.

=== Pronunciation ===

The slash in the names of some procedures can be pronounced "with".

=== Acknowledgements ===

Some of the language of the present proposal is copied from SRFI 69 with thanks to its author, Panu Kalliokoski.  However, he is not responsible for what I have done with it.

== Specification ==

The procedures in the present proposal are in the `(srfi xxx)` library (or `(srfi :xxx)` on R6RS).

All references to "executing in expected amortized constant time" presuppose that a satisfactory hash function is available.  Arbitrary or non-idempotent hash functions can make a hash of any implementation.

Hash tables are allowed to cache the results of calling the equality predicate and hash function, so programs cannot rely on the hash function being called exactly once for every primitive hash table operation: it may be called zero, one, or more times.

It is an error if the procedure argument of `hash-table-find`, `hash-table-count`, `hash-table-map`, `hash-table-map-values`, `hash-table-for-each`, `hash-table-map!`, `hash-table-collect`, or `hash-table-fold` mutates the hash table being walked.

It is an error to operate on two hash tables that have different comparators or equality predicates.

It is an error to mutate a key during or after its insertion into a hash table.

=== Index ===

* [#Constructors Constructors]: `make-hash-table`, `hash-table`, `hash-table-tabulate`, `hash-table-unfold`, `alist->hash-table`

* [#Predicates Predicates]: `hash-table?`, `hash-table-contains?`, `hash-table-empty?`, `hash-table=?`, `hash-table-immutable?`

* [#Accessors Accessors]: `hash-table-ref`, `hash-table-ref/default`

* [#Mutators Mutators]: `hash-table-set!`, `hash-table-set-entries!`, `hash-table-delete!`, `hash-table-delete-keys!`, `hash-table-extend!`, `hash-table-extend!/default`, `hash-table-replace!`, `hash-table-replace!/default`, `hash-table-update!`, `hash-table-update!/default`, `hash-table-push!`, `hash-table-pop!`, `hash-table-clear!`

* [#Thewholehashtable The whole hash table]: `hash-table-size`, `hash-table-keys`, `hash-table-values`, `hash-table-entries`, `hash-table-find`, `hash-table-count`, `hash-table-any`, `hash-table-every`

* [#Mappingandfolding Mapping and folding]: `hash-table-map`, `hash-table-map-values`, `hash-table-for-each`, `hash-table-map!`, `hash-table-collect`, `hash-table-fold`, `hash-table-filter!`, `hash-table-remove!`

* [#Copyingandconversion Copying and conversion]: `hash-table-copy`, `hash-table->alist`

* [#Hashtablesasfunctions Hash tables as functions]: `hash-table-accessor`, `hash-table-accessor/default`

* [#Hashtablesassets Hash tables as sets]: `hash-table-union!`, `hash-table-intersection!`, `hash-table-difference!`, `hash-table-xor!`

* [#Bimaps Bimaps]: `make-bimap`, `bimap?`, `bimap-forward-hash-table`, `bimap-reverse-hash-table`, `bimap-contains?`, `bimap-contains-value?`, `bimap=?`, `bimap-ref`, `bimap-value-ref`, `bimap-ref/default`, `bimap-value-ref/default`, `bimap-copy`, `bimap-set!`, `bimap-set-entries!`, `bimap-delete!`, `bimap-delete-keys!`, `bimap-extend!`, `bimap-extend/default!`, `bimap-replace!`, `bimap-replace/default!`, `bimap-update!`, `bimap-update/default!`, `bimap-clear!`, `bimap-filter!`, `bimap-remove!`

=== Constructors ===

`(make-hash-table `''comparator'' [ ''arg'' ... ]`)`

`(make-hash-table `''equality-predicate'' [ ''hash-function'' ] [ ''arg'' ... ]`)`

Returns a newly allocated hash table whose equality predicate and hash function are extracted from ''comparator''.  Alternatively, for backward compatibility with SRFI 69 the equality predicate and hash function can be passed as separate arguments; this usage is deprecated.  Note that unlike the hash functions packaged as part of SRFI 114 comparators, SRFI 69 hash functions accept two arguments, the object to be hashed and a non-negative integer that bounds the hash code.

If an equality predicate rather than a comparator is provided, the ability to omit the ''hash-function'' argument is severely limited.  The implementation must provide hash functions appropriate for use with the predicates `eq?`, `eqv?`, `equal?`, `string=?`, and `string-ci=?`, and may extend this list.  But if any unknown equality predicate is provided without a hash function, an error should be signaled.

It is an error if the equality predicate does not accept two arguments and return a truth value.  It is also an error if the hash function does not accept one argument in the domain of the equality predicate and return a non-negative exact integer.  It is the programmer's responsibility to ensure that if two objects are the same in the sense of the equality predicate, then they  return the same value when passed to the hash function.  However, the converse is not required.

If the equality predicate, whether passed as part of a comparator or explicitly, is more fine-grained (in the sense of R7RS-small section 6.1) than `equal?`, the implementation is free -- indeed, is encouraged -- to ignore the programmer-specified hash function and use something implementation-dependent.  This allows the use of addresses as hashes, in which case the keys must be rehashed if they are moved by the garbage collector.

The meaning of any further arguments is implementation-dependent.  However, implementations which support the ability to specify the initial capacity of a hash table should interpret a non-negative exact integer as the specification of that capacity.  In addition, if the symbols `thread-safe`, `weak-keys` or  `weak-values` are present, implementations should create thread-safe hash tables, hash tables with weak keys, and hash tables with weak values respectively.  In an implementation which does not support these features, an error should be signaled if they are requested.  To avoid collision with the ''hash-function'' argument, none of these arguments can be procedures.

(SRFI 69 `make-hash-table`; R6RS `make-eq-hashtable`, `make-eqv-hashtable`, and `make-hashtable`; Common Lisp `make-hash-table`)

`(hash-table `''comparator'' ( ''key value'' ) ...`)`

Returns a newly allocated hash table, created as if by `make-hash-table` using ''comparator''.  For each pair of arguments, an association is added to the new hash table with ''key'' as its key and ''value'' as its value.  If the implementation supports immutable hash tables, this procedure returns an immutable hash table.

`(hash-table-tabulate `''comparator n proc''`)`

Returns a newly allocated hash table, created as if by `make-hash-table` using ''comparator''.  For each integer from 0 to ''n'' - 1, ''proc'' is invoked on it, returning two values.  The values are used as the key and value of an association added to the new hash table.  If the implementation supports immutable hash tables, this procedure returns an immutable hash table.

`(hash-table-unfold `''stop? mapper successor seed comparator arg'' ... ]`)`

Create a new hash table as if by `make-hash-table` using ''comparator'' and the ''args''.  If the result of applying the predicate ''stop?'' to ''seed'' is true, return the hash table.  Otherwise, apply the procedure ''mapper'' to ''seed''.  ''Mapper'' returns two values, which are inserted into the hash table as the key and the value respectively.  Then get a new seed by applying the procedure ''successor'' to ''seed'', and repeat this algorithm.

`(alist->hash-table `''alist comparator arg'' ...`)`

`(alist->hash-table `''alist equality-predicate'' [ ''hash-function'' ] ''arg'' ...`)`

Returns a newly allocated hash-table as if by `make-hash-table` using ''comparator'' and the ''args''.  It is then initialized from the associations of ''alist''.  Associations earlier in the list take precedence over those that come later.  The second form is for compatibility with SRFI 69, and is deprecated.  (SRFI 69 `alist->hash-table`)

=== Predicates ===

`(hash-table? `''obj''`)`

Returns `#t` if ''obj'' is a hash table, and `#f` otherwise.  (SRFI 69 `hash-table?`; R6RS `hashtable?`; Common Lisp `hash-table-p`)

`(hash-table-contains? `''hash-table key''`)`

Returns `#t` if there is any association to ''key'' in ''hash-table'', and `#f` otherwise.  Must execute in expected amortized constant time.  (SRFI 69 `hash-table-exists?`; R6RS `hashtable-contains?`)

`(hash-table-empty? `''hash-table''`)`

Returns `#t` if ''hash-table'' contains no associations, and `#f` otherwise.

`(hash-table=? `''value-comparator hash-table,,1,, hash-table,,2,,''`)`

Returns `#t` if ''hash-table,,1,,'' and ''hash-table,,2,,'' have the same keys (in the sense of their common equality predicate) and each key has the same value (in the sense of ''value-comparator''), and `#f` otherwise.

`(hash-table-mutable? `''hash-table''`)`

Returns `#t` if the hash table is mutable.  Implementations may or may not support immutable hash tables.  (R6RS `hashtable-mutable?`)

=== Accessors ===

The following procedures, given a key, return the corresponding value.

`(hash-table-ref `''hash-table key'' [ ''failure'' [ ''success'' ] ]`)`

Extracts the value associated to ''key'' in ''hash-table'', invokes the procedure ''success'' on it, and returns its result; if ''success'' is not provided, then the value itself is returned.  If ''key'' is not contained in ''hash-table'' and ''failure'' is supplied, then ''failure'' is invoked on no arguments and its result is returned.  Otherwise, it is an error.  Must execute in expected amortized constant time, not counting the time to call the procedures.  (SRFI 69 `hash-table-ref` does not support the ''success'' procedure)

`(hash-table-ref/default `''hash-table key default''`)`

Semantically equivalent to, but may be more efficient than, the following code:
  `(hash-table-ref `''hash-table key'' `(lambda () `''default''`))`

(SRFI 69 `hash-table-ref/default`, R6RS `hashtable-ref`; Common Lisp `gethash`)

=== Mutators ===

The following procedures alter the associations in a hash table either unconditionally, or conditionally on the presence or absence of a specified key.  It is an error to add an association to a hash table whose key does not satisfy the type test predicate of the comparator used to create the hash table.

`(hash-table-set! `''hash-table'' ( ''key value'' ) ...`)`

Repeatedly mutates ''hash-table'', creating new associations in ''hash-table'' that associates each ''key'' with the ''value'' that follows it.  If there is a previous association for ''key'', it is deleted.  It is an error if the type check procedure of the comparator of ''hash-table'', when invoked on ''key'', does not return `#t`. Likewise, it is an error if ''key'' is not a valid argument to the equality predicate of ''hash-table''.  Returns an unspecified value.  Must execute in expected amortized constant time per key.  (SRFI 69 `hash-table-set!`, R6RS `hashtable-set!`, and Common Lisp `(setf gethash)` do not handle multiple associations)

`(hash-table-set-entries! `''hash-table keys-list values-list''`)`

Repeatedly mutates ''hash-table'', setting each element of ''keys-list'' to the corresponding element of ''values-list'' in the order in which they are specified.  Excess keys or values are ignored.

`(hash-table-delete! `''hash-table key'' ...`)`

Deletes any association to each ''key'' in ''hash-table'' and returns the number of keys that had associations.  Must execute in expected amortized constant time per key.  (SRFI 69 `hash-table-delete!`, R6RS `hashtable-delete!`, and Common Lisp `remhash` do not handle multiple associations)

`(hash-table-delete-keys! `''hash-table keys-list''`)`

Semantically equivalent to, but may be more efficient than, the following code:

  `(for-each (lambda (key) (hash-table-delete! `''hash-table''` key)) `''keys-list''`)`

`(hash-table-extend! `''hash-table key'' [ ''failure'' [ ''success'' ] ]`)`

Effectively invokes `hash-table-ref` with the given arguments and returns what it returns.  If ''key'' was not found in ''hash-table'', its value is set to the result being returned.  Must execute in expected amortized constant time.

`(hash-table-extend!/default `''hash-table key default''`)`

Effectively invokes `hash-table-ref/default` with the given arguments and returns what it returns.  If ''key'' was not found in ''hash-table'', its association is set to the result being returned.  Must execute in expected amortized constant time.

`(hash-table-replace! `''hash-table key'' [ ''failure'' [ ''success'' ] ]`)`

Effectively invokes `hash-table-ref` with the given arguments and returns what it returns.  If ''key'' was found in ''hash-table'', its value is set to the result being returned.  Must execute in expected amortized constant time.

`(hash-table-replace!/default `''hash-table key'' ''default''`)`

Effectively invokes `hash-table-ref/default` with the given arguments and returns what it returns.  If ''key'' was found in ''hash-table'', its value is set to the result being returned.  Must execute in expected amortized constant time.

`(hash-table-update! `''hash-table key updater'' [ ''failure'' [ ''success ] ]`)`

Semantically equivalent to, but may be more efficient than, the following code:

  `(hash-table-set! `''hash-table key''` (`''updater'' `(hash-table-ref `''hash-table key failure success''`)))`

Must execute in expected amortized constant time.  Returns an unspecified value.  (SRFI 69 `hash-table-update!/default` and R6RS `hashtable-update!` do not support the ''success'' procedure)

`(hash-table-update!/default `''hash-table key updater default''`)`

Semantically equivalent to, but may be more efficient than, the following code:

  `(hash-table-set! `''hash-table key''` (`''updater'' `(hash-table-ref/default `''hash-table key default''`)))`

Must execute in expected amortized constant time.  Returns an unspecified value.  (SRFI 69 `hash-table-update!`)

`(hash-table-push! `''hash-table key value failure''`)`

If an association with ''key'' is found in ''hash-table'', then update the value associated with ''key'' to the result of invoking `cons` on ''value'' and  the original value.  If the key is not found, a new association between ''key'' and the result of invoking the thunk ''failure'' is added to ''hash-table''.  The return value of `hash-table-push!` is unspecified.  It is an error if an existing value is not a pair.

`(hash-table-pop! `''hash-table key failure''`)`

If an association with ''key'' is found in ''hash-table'', then return the car of the value, and update the value to its own cdr.  If the value is not found, then the result of invoking the thunk ''failure'' is returned.  It is an error if the existing value is not a pair.

`(hash-table-clear! `''hash-table''`)`

Delete all the associations from ''hash-table''.  (R6RS `hashtable-clear!`; Common Lisp `clrhash`)

=== The whole hash table ===

These procedures process the associations of the hash table in an unspecified order.

`(hash-table-size `''hash-table''`)`

Returns the number of associations in ''hash-table'' as an exact integer.  Should execute in constant time.  (SRFI 69 `hash-table-size`, R6RS `hashtable-size`; Common Lisp `hash-table-count`.)

`(hash-table-find `''hash-table proc failure''`)`

For each association of ''hash-table'', invoke ''proc'' on its key and value.   If ''proc'' returns true, then `hash-table-find` returns what ''proc'' returns.  If all the calls to ''proc'' return `#f`, return the result of invoking the thunk ''failure''.

`(hash-table-count `''hash-table pred''`)`

For each association of ''hash-table'', invoke ''pred'' on its key and value.  Return the number of calls to ''pred'' which returned true.

`(hash-table-any `''proc hash-table''`)`

Calls ''proc'' for as many associations in ''hash-table'' as necessary with two arguments, the key and the value of the association.  If any invocation of ''proc'' returns true, `hash-table-any` immediately returns whatever that invocation returns; otherwise it returns `#f`.

`(hash-table-every `''proc hash-table''`)`

Calls ''proc'' for as many associations in ''hash-table'' as necessary with two arguments, the key and the value of the association.  If any invocation of ''proc'' returns false, `hash-table-every` immediately returns `#f`; otherwise it returns `#t`.

=== Mapping and folding ===

These procedures process the associations of the hash table in an unspecified order.

`(hash-table-map `''proc comparator merger hash-table''`)`

Returns a newly allocated hash table as if by `make-hash-table`.  Calls ''proc'' for every association in ''hash-table'' with two arguments: the key of the association and the value of the association.  The two values returned by ''proc'' are inserted into the new hash table as a key and value.

When the key being added is equal (in the sense of ''comparator'') to a key already inserted in the new hash table, the procedure ''merger'' is called with arguments ''oldkey oldvalue newkey newvalue'' and returns the key to be associated with the new value.

`(hash-table-map-values `''proc comparator hash-table''`)`

Returns a newly allocated hash table as if by `make-hash-table`.  Calls ''proc'' for every association in ''hash-table'' with the value of the association.  The key of the association and the result of invoking ''proc'' are entered into the new hash table.

`(hash-table-for-each `''proc hash-table''`)`

Calls ''proc'' for every association in ''hash-table'' with two arguments: the key of the association and the value of the association.  The value returned by ''proc'' is discarded.  Returns an unspecified value.  (SRFI 69 `hash-table-walk` with the arguments reversed; Common Lisp `maphash`)

`(hash-table-map! `''proc hash-table''`)`

Calls ''proc'' for every association in ''hash-table'' with two arguments: the key of the association and the value of the association.  The value returned by ''proc'' is used to update the value of the association.   Returns an unspecified value.

`(hash-table-collect `''proc hash-table''`)`

Calls ''proc'' for every association in ''hash-table'' with two arguments: the key of the association and the value of the association.  The values returned by the invocations of ''proc'' are accumulated into a list, which is returned.  

`(hash-table-fold `''proc init hash-table''`)`

Calls ''proc'' for every association in ''hash-table'' with three arguments: the key of the association, the value of the association, and an accumulated value ''val''.  ''Val'' is ''init'' for the first invocation of ''procedure'', and for subsequent invocations of ''procedure'', the returned value of the previous invocation.  The value returned by `hash-table-fold` is the return value of the last invocation of ''proc''.  (SRFI 69 `hash-table-fold` has the ''hash-table'' as the first argument)

`(hash-table-filter! `''proc hash-table''`)`

Calls ''proc'' for every association in ''hash-table'' with two arguments, the key and the value of the association, and removes all associations for which ''proc'' returns false from ''hash-table'', which is returned.

`(hash-table-remove! `''proc hash-table''`)`

Calls ''proc'' for every association in ''hash-table'' with two arguments, the key and the value of the association, and removes all associations for which ''proc'' returns true from ''hash-table'', which is returned.

=== Copying and conversion ===

`(hash-table-copy `''hash-table'' [ ''mutable?'' ]`)`

Returns a newly allocated hash table with the same properties and associations as ''hash-table''. If the second argument is present and is true, the new hash table is mutable.  Otherwise it is immutable provided that the implementation supports immutable hash tables.  (SRFI 69 `hash-table-copy` does not support a second argument; R6RS `hashtable-copy`)

`(hash-table-keys `''hash-table''`)`

Returns a newly allocated list of all the keys in ''hash-table''. (SRFI 69 `hash-table-keys`; R6RS `hashtable-keys` returns a vector)

`(hash-table-values `''hash-table''`)`

Returns a newly allocated list of all the keys in ''hash-table''. (SRFI 69 `hash-table-values`)

`(hash-table-entries `''hash-table''`)`

Returns two values, a newly allocated list of all the keys in ''hash-table''and a newly allocated list of all the values in ''hash-table'' in the corresponding order.  (R6RS `hash-table-entries` returns vectors)

`(hash-table->alist `''hash-table''`)`

Returns an alist with the same associations as ''hash-table'' in an unspecified order.  (SRFI 69)

=== Hash tables as functions ===

The following procedures provide functions with mutable behavior based on hash tables.  In this way, for example, lists can be processed by `map` using the procedure returned from a hash table by `hash-table-accessor`.

`(hash-table-accessor `''hash-table'' [ ''failure'' [ ''success'' ] ]`)`

Curried version of `hash-table-ref`.  Returns a procedure of one argument, a key, which returns what `hash-table-ref` returns when invoked with the the passed arguments.

`(hash-table-accessor/default `''hash-table default''`)`

Curried version of `hash-table-ref/default`.  Returns a procedure of one argument, a key, which returns what `hash-table-ref/default` returns when invoked with the passed arguments.

=== Hash tables as sets ===

`(hash-table-union! `''hash-table,,1,, hash-table,,2,,''`)`

Adds the associations of ''hash-table,,2,,'' to ''hash-table,,1,,'' and returns ''hash-table,,1,,''.  If a key appears in both hash tables, its value is set to the value appearing in ''hash-table,,1,,''.  Returns 'hash-table,,1,,''.  (SRFI 69 `hash-table-merge!`)

`(hash-table-intersection! `''hash-table,,1,, hash-table,,2,,''`)`

Deletes the associations from ''hash-table,,1,,'' which don't also appear in ''hash-table,,2,,'' and returns ''hash-table,,1,,''.

`(hash-table-difference! `''hash-table,,1,, hash-table,,2,,''`)`

Deletes the associations of ''hash-table,,1,,'' whose keys are also present in ''hash-table,,2,,'' and returns ''hash-table,,1,,''.

`(hash-table-xor! `''hash-table,,1,, hash-table,,2,,''`)`

Deletes the associations of ''hash-table,,1,,'' whose keys are also present in ''hash-table,,2,,'', and then adds the associations of ''hash-table,,2,,'' whose keys are not present in 'hash-table,,1,,'' to 'hash-table,,1,,''.  Returns ''hash-table,,1,,''.

== Bimaps ==

A bimap is built by starting with a hash table which represents the forward mapping from keys to values; it is an error if the values are not unique.  A second hash table is constructed and populated with the reverse mapping.  It is possible to retrieve either underlying hash table for read-only operations, but it is an error to mutate them, so bimap mutation procedures are provided.

`(make-bimap `''hash-table comparator arg'' ...`)`

Returns a newly allocated bimap whose forward hash table is ''hash-table'' and whose reverse hash table is newly allocated using ''comparator'' and any ''args''.  It is an error if any value in ''hash-table'' is not unique in the sense of ''comparator''.  It is an error to mutate ''hash-table'' after this procedure returns, as it shares structure with the bimap.

`(bimap? `''obj''`)`

Returns `#t` if ''obj'' is a bimap and `#f` otherwise.

`(bimap-forward-hash-table `''bimap''`)`

Returns the original hash table passed to the bimap.  It is an error to mutate this hash table.

`(bimap-reverse-hash-table `''bimap''`)`

Returns the reverse hash table created by `make-bimap`.  It is an error to mutate this hash table.

`(bimap-contains? `''bimap key''`)`

Returns `#t` if ''key'' is contained in one of the associations of the bimap, and `#f` otherwise.

`(bimap-contains-value? `''bimap value''`)`

Returns `#t` if ''value'' is contained in one of the associations of the bimap, and `#f` otherwise.

`(bimap=? `''bimap,,1,, bimap,,2,,''`)`

Returns `#t` if ''bimap,,1,,'' and ''bimap,,2,,'' contain the same key-value associations.  Two associations ''key,,1,,/value,,1,,'' and ''key,,2,,/value,,2,,'' are the same if and only if the reverse hash tables of both ''bimap,,1,,'' and ''bimap,,2,,'' map ''value,,1,,' and ''value,,2,,'' to keys that are the same in the sense of the equivalence procedures of both tables.  Returns `#f` otherwise.

`(bimap-ref `''bimap key'' [ ''failure'' [ ''success'' ] ]`)`

Returns what `(hash-table-ref (bimap-forward-hash-table `''bimap''`) `''key failure success''`)` returns.

`(bimap-ref/default `''bimap key default''`)`

Returns what `(hash-table-ref (bimap-forward-hash-table `''bimap''`) `''key default''`)` returns.

`(bimap-value-ref `''bimap value'' [ ''failure'' [ ''success'' ] ]`)`

Returns what `(hash-table-ref (bimap-reverse-hash-table `''bimap''`) `''value failure success''`)` returns.

`(bimap-value-ref/default `''bimap value default''`)`

Returns what `(hash-table-ref (bimap-reverse-hash-table `''bimap''`) `''value default''`)` returns.

`(bimap-copy `''bimap'' [ ''immutable?'' ]`)`

Returns a newly allocated bimap with the same properties and associations as ''bimap''. If the second argument is present and is true, the new bimap is mutable.  Otherwise it is made from immutable hash tables and itself immutable, provided the implementation supports immutable hash tables.

The mutation procedures `bimap-set!`, `bimap-set-entries!`, `bimap-delete!`, `bimap-delete-keys!`, `bimap-extend!`, `bimap-extend/default!`, `bimap-replace!`, `bimap-replace/default!`, `bimap-update!`, `bimap-update/default!`, `bimap-clear!`, `bimap-filter!`, and `bimap-remove!` have the same behavior as their hash table analogues, mutating both hash tables appropriately.

== Implementation ==

The [http://snow-fort.org/pkg sample implementation] is easily layered over any hash table implementation that supports either SRFI 69 or R6RS.  The sample implementation is built directly on top of `(r6rs hashtables)` (available at http://snow-fort.org/pkg), whose portable implementation uses `(rnrs hashtables)` if that library is available, or SRFI 69 if that library is available, or a slightly improved version of the SRFI 69 reference implementation if no better alternative is available.  There is also an [https://github.com/larcenists/larceny/blob/master/lib/SRFI/srfi-69.sch implementation of SRFI 69] on top of the `(rnrs hashtables)` library of R6RS, which shows how to construct an implementation of SRFI 69 that would be fully compatible with the sample implementation of this proposal.

The hash tables specified here are fully compatible with hash tables as specified by the R6RS, but there may be conflicts when particular implementations of the two libraries (this proposal and R6RS) are imported into a single library or program.  There will be no conflicts when the sample implementation of this proposal is imported along with the reference implementation of `(r6rs hashtables)` used by that sample implementation.

The hash tables specified here are mostly compatible with hash tables as specified by SRFI 69, but there will almost certainly be conflicts when a library or program imports both the library specified here and an implementation of SRFI 69.  It should be possible to construct an implementation of SRFI 69 that does not conflict with the sample implementation of this proposal or with `(r6rs hashtables)`, but all non-conflicting implementations of SRFI 69 must

   * extend the `make-hash-table` and `alist->hash-table` procedures to accommodate their more general arguments as specified by this proposal
   * extend the `hash-table-fold` procedure to accept arguments in the order specified here as well as in the order specified by SRFI 69
   * take care never to call a hash function with more than one argument

As of this writing, there are no fully compatible implementations of SRFI 69.

The sample implementation was originally intended to support all the native hash table systems mentioned in [#Sources Sources] above as well.  However, this turned out not to be practical for the following reasons:

* Gauche does not support arbitrary equality predicates, only `eq?`, `eqv?`, `equal?`, and `string=?`.

* S7 does not support arbitrary equality predicates: the implementation chooses a predicate based on the nature of the keys.

* SISC, Scheme48/scsh, RScheme, Scheme 9, and Rep do not document any procedure that copies a hash table, nor any way of inspecting it to determine its equality predicates and hash functions so that it can be re-created.

* SLIB hash tables are vectors, not disjoint objects.

* !FemtoLisp supports only `equal?` as the equality predicate.

Native Guile hash tables are a special case.  The equivalents of `hash-table-ref/default`, `hash-table-set!`, and `hash-table-delete` require the equality predicate and hash function to be passed to them explicitly (although there are utility functions for `eq?`, `eqv?`, and `equal?` hash tables).  Consequently, hash tables corresponding to the present proposal would have to be records containing a Guile hash table, an equality predicate, and a hash function, which means they could not interoperate directly with native Guile hash tables.

time

2015-06-21 10:50:07

version

121