This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home. For a version of this page that may be more recent, see HashTablesCowan in WG2's repo for R7RS-large.

Hash­Tables­Cowan

cowan
2013-05-18 21:20:33
17history
source

Work in progress, do not use yet

Hash tables

This WG2 proposal defines an interface to hash tables, which are widely recognized as a fundamental data structure for a wide variety of applications. A hash table is a data structure that:

Procedures

Constructors

(make-hash-table equivalence hash . args)

Returns a newly allocated hash table whose equivalence procedure is equivalenceand hash procedure is hash (defaults to a suitable function, and args mean whatever (number is initial capacity, 'immutable, 'weakkeys, 'weak-values, all implementation-dependent). The "safe" equivalence procedures are eq?, eqv?, equal?, string=?, string-ci=?. (SRFI-69 make-hash-table; R6RS make-eq-hashtable, make-eqv-hashtable, and make-hashtable)

(hash-table equivalence hash ( key value ) ...)

Ditto, but with explicitly specified key/value pairs

Predicates

(hash-table? obj)

Returns #t if obj is a hash table. (SRFI-69 hash-table?; R6RS hashtable?)

(hash-table-contains? hash-table key)

Returns #t if there is any association to key in hash-table, and #f otherwise. Must execute in amortized O(1) time. (SRFI-69 hash-table-exists?; R6RS hashtable-contains?)

(hash-table-value-exists? hash-table value)

Returns #t if an association whose value is value exists, #f if not.

(hash-table=? key= value= hash-table1 hash-table2)

Not yet defined

Accessors

The following procedures, given a key, return the corresponding value.

(hash-table-ref hash-table key [ failure [ success ] ])

Extracts the value associated to key in hash-table, invokes the procedure success on it, and returns its result. Otherwise, invokes the procedure failure on no arguments and returns its result. If success is not provided and is required, the value itself is returned. If failure is not provided but is required, an error is signaled. Must execute in amortized O(1) time, not counting the time to call the procedures. (SRFI-69 hash-table-ref)

(hash-table-ref/default hash-table key default)

Semantically equivalent to, but may be more efficient than, the following code: (hash-table-ref hash-table key (lambda () default))

(SRFI-69 hash-table-ref/default, R6RS hashtable-ref)

Mutators

The following procedures alter the associations in a hash table either unconditionally, or conditionally on the presence or absence of a specified key.

(hash-table-set! hash-table key value)

Creates a new association in hash-table that associates key with value. If there is a previous association for key, it is deleted. It is an error if hash-table is not a valid argument to the equality predicate of hash-table. Returns an unspecified value. Must execute in amortized O(1) time. (SRFI-69 hash-table-set!; R6RS hashtable-set!)

(hash-table-set*! hash-table ( key value ) ...)

Repeatedly mutates hash-table, setting each key to the value following it.

(hash-table-set-entries! hash-table keys values)

Repeatedly mutates hash-table, setting each element of keys to the corresponding element of values.

(hash-table-set-alist! hash-table alist)

Repeatedly mutates hash-table using the associations of alist.

(hash-table-delete! hash-table key)

Deletes any association to key in hash-table and returns an unspecified value. It is not an error if no association for that key exists. Must execute in amortized O(1) time. (SRFI-69 hash-table-delete!; R6RS hashtable-delete!)

(hash-table-delete-keys! hash-table keylist)

Semantically equivalent to, but may be implemented more efficiently than, the following code:

(for-each (lambda (key) (hash-table-delete hash-table key)) ''keylist'')`

(hash-table-ref! hash-table key [ failure [ success ] ])

Invokes hash-table-ref with the given arguments and returns what it returns. If key was found in hash-table, its value is set to the result being returned.

(hash-table-ref!/default hash-table key default)

Invokes hash-table-ref/default with the given arguments and returns what it returns. If key was found in hash-table, its association is set to the result being returned.

(hash-table-replace! hash-table key updater [ failure [ ''success ] ])

Invokes hash-table-ref with the given arguments and returns what it returns. If key was not found in hash-table, its value is set to the result being returned.

(hash-table-replace!/default hash-table key updater default`)

Invokes hash-table-ref/default with the given arguments and returns what it returns. If key was not found in hash-table, its value is set to the result being returned.

(hash-table-update! hash-table key updater [ failure [ ''success ] ])

Semantically equivalent to, but may be implemented more efficiently than, the following code:

(hash-table-set! hash-table key (updater (hash-table-ref hash-table key failure success)))

Must execute in amortized O(1) time. Returns an unspecified value. (SRFI-69 hash-table-update!/default; R6RS hashtable-update!)

(hash-table-update!/default hash-table key updater default)

Semantically equivalent to, but may be implemented more efficiently than, the following code:

(hash-table-set! hash-table key (updater (hash-table-ref/default hash-table key default)))

Must execute in amortized O(1) time. Returns an unspecified value. (SRFI-69 hash-table-update!)

(hash-table-push! hash-table key value)

Conses value onto the value of key, empty list if no such key.

(hash-table-pop! hash-table key [ success [ failure ] ])

Returns car of value corresponding to key, mutates key to cdr. Failure if the key does not exist or the value is not a pair. Issue: signal an error for non-pair instead?

(hash-table-pop!/default hash-table key default)

The same, but with default if key does not exist or value is not a pair. Issue: signal an error for non-pair instead?

Functional update

(hash-table-set hash-table key value)

(hash-table-set* hash-table ( key value ) ...)

(hash-table-set-entries hash-table keys values)

(hash-table-set-alist hash-table alist)

(hash-table-delete hash-table key)

(hash-table-delete-keys hash-table keylist)

(hash-table-update hash-table key updater [ failure [ ''success ] ])

(hash-table-update/default hash-table key updater default)

(hash-table-replace/default hash-table key value default)

(hash-table-replace hash-table key updater [ failure [ ''success ] ])

These procedures are equivalent to the mutators with corresponding names, except that they return a new hash table which differs from the original in the specified ways. No guarantees are provided about their amortized execution time, as they may copy the original hash table.

Procedures on the whole hash table

(hash-table-clear! hash-table)

Remove all the associations from hash-table. Should execute in amortized O(1) time. (R6RS hashtable-clear!)

(hash-table-length hash-table) Must execute in amortized O(1) time. (SRFI 69 hash-table-size, R6RS hashtable-size)

Return number of keys in hash-table.

(hash-table-keys hash-table)

Return list of all keys in an unspecified order. (SRFI 69 hash-table-keys; R6RS hashtable-keys returns a vector)

(hash-table-values hash-table)

Return list of all values in an unspecified order, not necessarily the same order as hash-table-keys. (SRFI 69 hash-table-values)

(hash-table-entries hash-table)

Returns two values, the list of keys in an unspecified order and the list of values in the corresponding order. (R6RS hash-table-entries returns vectors)

(hash-table-find hash-table proc)

For each association of hash-table, invoke proc on its key and value in an unspecified order. If proc returns true, then return the value. If proc always returns #f, return #f.

(hash-table-count hash-table proc)

For each association of hash-table, invoke proc on its key and value in an unspecified order. Return the number of calls to proc which returned true.

(hash-table-remove! hash-table proc)

For each association of hash-table, invoke proc on its key and value in an unspecified order. If proc returns true, delete the association.

Mapping and folding

(hash-table-map equivalence hash proc merger hash-table)

For each key invoke proc with the key and the corresponding value, and expect it to return two values, which are used to construct a new hash table using equivalence and hash. When a key already exists in the hash table, the procedure merger is called with arguments oldkey, oldvalue, newkey, newvalue and returns two values, the proper key and the proper value.

(hash-table-map! proc merger hash-table)

The same, but the values are used to mutate the hash table instead of creating a new one.

(hash-table-for-each proc hash-table)

The same, but discards the results. (SRFI 69 hash-table-walk has the hash-table as the first argument)

(hash-table-map->list proc hash-table)

Pass each key and value as separate arguments to proc, collect results in list.

(hash-table-fold proc init hash-table)

Calls proc for every association in hash-table with three arguments: the key of the association, the value of the association, and an accumulated value val. Val is init for the first invocation of procedure, and for subsequent invocations of procedure, the returned value of the previous invocation. The value returned by hash-table-fold is the return value of the last invocation of proc. The order in which procedure is called for different associations is unspecified. (SRFI-69 hash-table-fold has the hash-table as the first argument)

(hash-table-unfold equivalence hash continue? mapper successor seed)

Create a new hash table using equivalence and hash. If the result of applying the predicate continue? to seed is #f, return the hash table. Otherwise, apply the procedure mapper to seed. Mapper returns two values, which are inserted into the hash table as the key and the value respectively. Then get a new seed by applying the procedure successor to seed, and repeat this algorithm.

Copying and conversion

(hash-table-copy hash-table)

Returns a newly allocated hash table with the same equivalence and hash predicates and the same associations as hash-table. (SRFI-69 hash-table-copy; R6RS hashtable-copy, passing #t as the second argument)

(hash-table->alist hash-table)

Returns an alist with the same associations as hash-table in an unspecified order.

(alist->hash-table alist equivalence hash . args)

Returns a newly allocated hash-table whose equivalence and hash procedures are equivalence and hash respectively, initializing it from the associations of alist. (SRFI 69 alist->hash-table)

Hash tables as functions

(hash-table-accessor hash-table [ success [ failure ] ])

(hash-table-accessor/default hash-table default)

(hash-table-mutator hash-table)

(hash-table-deleter hash-table)

(hash-table-updater hash-table updater [ failure [ success ] ])

(hash-table-updater/default hash-table updater default)

(hash-table-replacer hash-table updater [ failure [ success ] ])

(hash-table-replacer/default hash-table updater default)

These procedures allow hash tables to be used as functions with mutable behavior. They return procedures that are curried versions of hash-table-ref, hash-table-ref/default, hash-table-set!, hash-table-update!, hash-table-update!/default, hash-table-replace!, and hash-table-replace!/default respectively on the given arguments.

Hash tables as sets

(hash-table-union hash-table1 hash-table2 [ merger ])

Adds the associations of hash-table2 to a copy of hash-table1 and returns it. The values are merged using the merger procedure, which defaults to (lambda (value,,1,, value,,2,,) value,,2). (SRFI 69 hash-table-merge!)

(hash-table-union! hash-table1 hash-table2 [ merger ])

Adds the associations of hash-table2 to a copy of hash-table1 and returns it. The values are merged using the merger procedure, which defaults to (lambda (value,,1,, value,,2,,) value,,2).

(hash-table-difference hash-table1 hash-table2)

Removes the keys of hash-table2 from copy of hash-table1 and returns it.

(hash-table-difference! hash-table1 hash-table2)

Removes the keys of hash-table2 from hash-table1 and returns it.

An idea

Have a procedure that accepts an equivalence and a hash procedure, and returns a function that behaves the same as the equivalence function when called with two arguments, but when called with zero arguments, returns the hash procedure. This is used to create hash tables, saving an extra argument each time. Note that this allows the author of an equivalence predicate to provide hashing without cluttering up the interface, and allows the predicate to be provided as transparently as built-in predicates.

Everything past this point is obsolete

Only kept around as a source of useful text to cut and paste.

Excluded features

The following features are not part of this proposal for various reasons.

Incompatible return values

Hash-table-keys is present in both SRFI 69 and R6RS, but returns a list of keys in the first, a vector of keys in the second.

Hash-table-values returns a list of values in SRFI 69; hashtable-entries returns two values, a vector of keys and a vector of values in R6RS.

Hash-table-hash-function always returns the hash function of a hash table in SRFI 69, but returns #f in the case of eq? (and eqv?) hash tables in R6RS.

Hash-table-equivalence-function returns the equivalence function, but in R6/R7RS you can't reliably check functions for identity, so you don't know what you've got.

In R6RS but not in SRFI 69

Hashtable-clear!, which removes all associations from a hash table.

Specifying the initial capacity of a hash table.

Immutable hash tables.

Hash tables based on eqv?.

In SRFI 69 but not in R6RS

Hash-table-walk, which is the analogue of for-each for hash tables.

Conversion from a-lists to hash tables and back.

Hash-table-merge, which destructively adds all associations in a source hash table to a destination hash table.

Default values specified in the form of a thunk to call.

In Common Lisp but not Scheme

The current capacity (as opposed to size) of a hash table.

Rehash size and threshold in constructor, plus accessors and mutators for them.

Hash tables based on equalp (which is not in Scheme).

With-hash-table-iterator, a hash table external iterator implemented as a local macro.

Sxhash, a stable hash function.

I have added equivalences to SRFI 69 and R6RS facilities. Systems supporting either should be able to support these hash tables trivially; it will be necessary to write hash-table-fold for R6RS systems. I have adopted SRFI 69's term hash-table rather than R6RS's hashtable because of our decision in #40. Besides, the word hashtable obviously means something that can be ... hashted.

The main annoyances of this proposal for SRFI 69 programmers will be remembering to supply a third argument to hash-table-ref, and for R6RS programmers will be remembering to insert a hyphen in hashtable. Both will have to get used to the new constructors.