This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home. For a version of this page that may be more recent, see GeneratorsGauche in WG2's repo for R7RS-large.

Generators­Gauche

cowan
2014-11-05 05:21:15
14history
source

Generators

A generator is simply a procedure with no arguments that works as a source of a series of values. Every time it is called, it yields a value. Returning an end-of-file object indicates the generator is exhausted. (Note that some generators are infinite in length, and never return an end-of-file object.) For example, read-char is a generator that generates characters from the current input port.

It is common practice to abstract the sources of values in such a way, so it is useful to define utility procedures that work on generators. This proposal provides them.

A generator is very lightweight, and useful for implementing simple on-demand calculations. However, it's important to keep in mind that it is side-effecting construct; you can't safely backtrack, for example.

These procedures are mostly from the Gauche core and the Gauche modules gauche.generator and data.random, with some renaming to make them more systematic.

Generator constructors

A generator isn't a special datatype but just an ordinary procedure, so you can construct a generator with lambda. This proposal provides some common generator constructors for convenience.

If you want to use your procedure as a generator, note that a generator can be invoked many times even after it returns an end-of-file object once Once it returns an end-of-file object, it keeps returning an end-of-file object for all subsequent calls.

The result of generator constructors is just a procedure, and printing it doesn't show much. In the examples in this section we use generator->list to convert the generator to the list.

These procedures have names prefixed with make- and suffixed with -generator.

make-generator arg …

The simplest generator. Returns the given arguments followed by an end-of-file object when called. When no arguments are provided, it returns a null generator that generates no values.

make-circular-generator arg …

Returns an infinite generator that repeats the given arguments forever.

 
(generator->list (circular-generator 1 2 3) 10)
  ⇒ (1 2 3 1 2 3 1 2 3 1)

Note that the above example limits the length of the converted list to 10; otherwise generator->list won't return.

make-iota-generator [ count [ start [ step ] ] ]

Creates a generator of a series of count numbers (by default, an infinite number), starting from start (0 by default) and increased by step (1 by default).

 
(generator->list (make-iota-generator 10 3 2))
  ⇒ (3 5 7 9 11 13 15 17 19 21)

If both start and step are exact, the generator yields exact numbers; otherwise it yields inexact numbers. The exactness of count does not affect the exactness of the results.

 
(generator->list (giota +inf.0 1/2 1/3) 6)
  ⇒ (1/2 5/6 7/6 3/2 11/6 13/6)
(generator->list (giota +inf.0 1.0 2.0) 5)
  ⇒ (1.0 3.0 5.0 7.0 9.0)
make-range-generator start end [ step ]

Creates a generator of a series of numbers. The series begins with start, increases by step (default 1), and continues while the number is less than end.

 
(generator->list (make-range-generator 3 8))
  ⇒ (3 4 5 6 7)
make-tabulation-generator [ k ] proc

Creates a generator of a series of k numbers (by default, an infinite number), namely (proc 0), (proc 1), (proc 2), ....

make-coroutine-generator proc

Creates a generator from a coroutine.

The proc argument is a procedure that takes one argument, yield. When called, generate immediately returns a generator g. When g is called, proc runs until it calls yield. Calling yield causes the execution of proc to be suspended, and g returns the value passed to yield.

Once proc returns, it is the end of the series — g returns an end-of-file object from then on. The return value of proc is ignored.

The following code creates a generator that produces a series 0, 1, and 2 (effectively the same as (make-iota-generator 3) and binds it to g.

 
(define g
  (generate
   (lambda (yield) (let loop ((i 0))
               (when (< i 3) (yield i) (loop (+ i 1)))))))

(generator->list g) ⇒ (0 1 2)
make-list-generator lis
make-vector-generator vec [ start [ end ] ]
make-reverse-vector-generator vec [ start [ end ] ]
make-string-generator str [ start [ end ] ]

These procedures return generators that yields each item in the given argument. A generator returned from make-vector-reverse-generator procedures runs in reverse order.

 
(generator->list (make-list-generator '(1 2 3 4 5)))
  ⇒ (1 2 3 4 5)
(generator->list (make-vector-generator '#(1 2 3 4 5)))
  ⇒ (1 2 3 4 5)
(generator->list (make-reverse-vector-generator '#(1 2 3 4 5)))
  ⇒ (5 4 3 2 1)
(generator->list (make-string-generator "abcde"))
  ⇒ (#\a #\b #\c #\d #\e)

By default the generator is exhausted once all items are retrieved; the optional start and end arguments can limit the range the generator walks across.

For make-reverse-vector-generator, the first value is the item right before to the end element, and the last value is the start element. For all the others, the first value the generator yields is the start-th element, and it ends right before the end-th element.

 
(generator->list (make-vector-generator '#(a b c d e) 2))
  ⇒ (c d e)
(generator->list (make-vector-generator '#(a b c d e) 2 4))
  ⇒ (c d)
(generator->list (make-reverse-vector-generator '#(a b c d e) 2))
  ⇒ (e d c b)
(generator->list (make-reverse-vector-generator '#(a b c d e) 2 4))
  ⇒ (d c)
(generator->list (make-reverse-vector-generator '#(a b c d e) #f 2))
  ⇒ (b a)
make-bits-generator n

This procedure takes an exact integer considered as a twos-complement value and treats it as a sequence of boolean values (0 for false and 1 for true), returning bits from the least significant bit first. Note that the infinite sequence of high-order 0 or 1 bits are not returned, so this is a finite generator. In particular, (make-bits-generator 0) and (make-bits-generator -1) are both null generators.

 
(generator->list (bits->generator #b10110))
 ⇒ (#f #t #t #f #t)
make-port-sexp-generator input-port
make-port-line-generator input-port
make-port-char-generator input-port
make-port-byte-generator input-port

Returns generators that read the appropriate type of object from the given port, respectively.

make-for-each-generator for-each obj

A generic generator that converts any collection obj to a generator that walks across the obj using the for-each procedure appropriate for obj.

make-unfold-generator stop? mapper successor seed

A generator constructor similar to unfold.

The stop? predicate takes a seed value and determines whether to stop. The mapper procedure calculates a value to be returned by the generator from a seed value. The successor procedure calculates the next seed value from the current seed value.

For each call of the resulting generator, stop? is called with the current seed value. If it returns true, then the generator returns an end-of-file object. Otherwise, it applies mapper to the current seed value to get the value to return, and uses g to update the seed value.

 
(generator->list (make-unfold-generator
                      (lambda s (> s 5))
                      (lambda s (* s 2))
                      (lambda s (+ s 1))
                      0))
  ⇒ '(0 2 4 6 8 10)

Generator operations

The following procedures accept generators and return a generator.

The names of these procedures are prefixed with g.

gcons* item … gen

Returns a generator that adds items in front of gen. Once the items have been consumed, the generator is guaranteed to tail-call gen.

 
(generator->list (gcons* 'a 'b (giota 2)))
 ⇒ (a b 0 1)
gappend gen …

Returns a generator that yields the items from the first given generator, and once it is exhausted, use the second generator, and so on. Calls to the returned generator are guaranteed to tail-call one of the generator arguments until they are exhausted.

 
(generator->list (gappend (giota 3) (giota 2)))
 ⇒ (0 1 2 0 1)

(generator->list (gappend))
 ⇒ ()
gconcatenate gen

The gen argument is a generator of generators. Returns a generator that yields the results from the first generator, then the second one, then the third, etc.

It is similar to (apply gappend (generator->list gen)), except that gconcatenate can work even if gen generates an infinite number of generators. Calls to the returned generator are guaranteed to tail-call one of the generators provided by gen until they are exhausted.

 
(generator->list
  (gconcatenate
    (make-element-generator (giota 3) (giota 2))))
 ⇒ (0 1 2 0 1)
gmerge comparator gen1 gen2
gunion comparator gen1 gen2
gintersection comparator gen1 gen2

Creates a generator that yields elements out of input generators, with the order determined by a SRFI 114 comparator. Gmerge returns all the elements; gunion returns all the elements that are distinct in the sense of the comparator; gintersection returns all the elements that appear in all the generators.

Each input generator must yield appropriately ordered elements by itself; otherwise the result won't be ordered.

If only one generator is given, it is just returned. In that case, comparator is ignored.

 
(generator->list (gmerge < '(1 3 8) '(5) '(2 4)))
  ⇒ '(1 2 3 4 5 8)
gmap proc gen gen2

Returns a generator that yields a value returned by the application of proc to the values returned by the generator arguments. The returned generator terminates when any of the generator arguments are exhausted.

Note: This differs from generator-collect, which consumes all values at once and returns the results as a list, while gmap returns a generator immediately without consuming input.

gfold proc seed gen gen2

A generator analogue of fold for mapping with state. It yields a sequence of sub-folds over proc.

The proc argument is a procedure that takes as many arguments as the input generators plus one. It is called as (proc v1 v2 … seed), where v1, v2, … are the values yielded from the input generators, and seed is the current seed value. It must return two values, the yielding value and the next seed.

Note: This differs from generator-fold, which consumes all values while folding over them, while gfold returns a generator immediately without consuming input.

gfilter pred gen
gremove pred gen

Return generators that yield the items from the source generator, except those on which pred answers false or true respectively.

gfilter-map proc gen gen2

Works the same as (gfilter values (gmap proc gen gen2 …)), only slightly more efficiently.

gstate-filter proc seed gen

This allows stateful filtering of a series. The proc argument must be a procedure that takes a value v from the source generator and a seed value. It should return two values, a boolean flag and the next seed value. If the boolean is true, the generator yields v. Otherwise, the generator keeps calling proc, updating the seed value, until the flag is true or the source generator is exhausted.

The following example takes a generator of oscillating values and yields only values that are greater than their previous value.

 
(generator->list
 (gstate-filter (lambda (v s) (values (< s v) v)) 0
                (make-element-generator 1 2 3 2 1 0 1 2 3 2 1 0 1 2 3)))
 ⇒ (1 2 3 1 2 3 1 2 3)
gbuffer-filter proc seed gen

This procedure allows n-to-m mapping between elements in input and output — that is, it can use several input elements to generate one or more output elements.

The procedure proc receives the next input element and an accumulated seed value. It returns two values: A list of output values, and the next seed value. If you need to look at more input to generate output, you can return an empty list as the first value.

If the input reaches the end, the output ends when the output of the last call to proc is exhausted (the final seed value is discarded).

gtake gen k [ padding ]
gdrop gen k

The generator analogues of take and drop. Gtake returns a generator that yields (at most) the first k items of the source generator, while gdrop returns a generator that skips the first k items of the source generator.

These won't complain if the source generator is exhausted before generating k items. By default, the generator returned by gtake terminates when the source generator does, but if you provide the padding argument, then the returned generator will yield k items, using the padding value to provide sufficient additional values.

gtake-while pred gen
gdrop-while pred gen

The generator analogues of take-while and drop-while. The generator returned from gtake-while yields items from the source generator as long as pred returns true for each. The generator returned from gdrop-while first reads and discards values from the source generator while pred returns true for them, then starts yielding items returned by the source.

gpairs car-gen cdr-gen

Returns a generator that yields pairs whose car is the result of calling car-gen and whose cdr is the result of calling cdr-gen.

 
(define g
  (gpairs (make-generator 113 101 12 68 -55)
          (make-generator #t #f #t #f #f))

(generator->list g 10)
 ⇒ ((113 . #t) (101 . #f) (12 . #t) (68 . #f) (-55 . #f))
gtuple gen …

Returns a generator that yields lists whose i-th element is generated from the i-th argument.

 
(define g
  (gtuples (make-generator -43 53 -114)
           (make-generator #f #f #f)
           (make-generator #\8 #\1 #\i))

(generator->list g 3)
 ⇒ ((-43 #f #\8) (53 #f #\1) (-114 #f #\i))

This procedure is analogous to zip, but gzip would be a misleading name.

glists sizer item-gen [ padding ]
gvectors sizer item-gen [ padding ]
gstrings sizer item-gen [ padding ]

Creates a generator that generates lists, vectors or strings of values from item-gen, respectively. The size of each datum is determined by sizer, which can be either a non-negative integer or a generator of non-negative integers. The value of the sizer determines the length of the result data.

The padding argument controls how the end of input is handled, just like gtake. When padding is not provided, the last item from the output generator may not have k items if the input is too short to fill them, as shown in the above example. If padding is present and the input is too short to complete k items, padding is used to fill the rest.

 
(generator->list (gslices (giota 7) 3))
  ⇒ ((0 1 2) (3 4 5) (6))
 
(generator->list (gslices (giota 6) 3 'x))
  ⇒ ((0 1 2) (3 4 5))
(generator->list (gslices (giota 7) 3 'x))
  ⇒ ((0 1 2) (3 4 5) (6 x x))
gdelete item gen [ = ]

Creates a generator that returns whatever gen returns, except for any items that are the same as item in the sense of =, which defaults to equal?.

gdelete-neighbor-dups gen [ pred ]

Creates a generator that returns whatever gen returns, except for any items that are equal to the preceding item in the sense of pred, which defaults to equal?.

Consuming generated values

Unless otherwise noted, these procedures consume all the values of the generator passed to them. They have names prefixed with generator-.

generator->list generator [ k ]

Reads items from generator and returns a list of them. By default, it reads until the generator is exhausted. If an optional argument k is given, it must be a non-negative integer, and the list ends when either k items are consumed, or the generator is exhausted.

Be careful not to pass an infinite generator to this procedure without specifying k, or this procedure will not return.

generator->reverse-list generator [ k ]

Reads items from generator and returns a list of them in reverse order. By default, this reads until the generator is exhausted. If an optional argument k is given, it must be a non-negative integer, and the list ends when either k items are read, or the generator is exhausted.

Be careful not to pass an infinite generator to this procedure without specifying k, or this procedure will not return.

generator-fold proc seed gen1 gen2

Works like fold on the values generated by the generator arguments.

When one generator is given, for each value v generated by gen, proc is called as (proc v r), where r is the current accumulated result; the initial value of the accumulated result is seed, and the return value from proc becomes the next accumulated result. When gen returns an end-of-file object, the accumulated result at that time is returned from generator-fold.

When more than one generator is given, proc is invoked on the values returned by all the generator arguments followed by the current accumulated result. The iteration terminates when any of the generators returns an end-of-file object.

 
(with-input-from-string "a b c d e"
  (lambda () (generator-fold cons 'z read)))
  ⇒ (e d c b a . z)
generator-for-each proc gen gen2

A generator analogue of for-each that consumes generated values using side effects. Repeatedly applies proc on the values yielded by gen, gen2 … until any one of the generators yields an end-of-file object. The values returned from proc are discarded.

generator-collect proc gen gen2

A generator analogue of map. Repeatedly applies proc on the values yielded by gen, gen2 … until any one of the generators yields an end-of-file object. The values returned from proc are collected into a list and returned.

 
(with-input-from-string "a b c d e"
  ((lambda () (generator-map symbol->string read)))
  ⇒ ("a" "b" "c" "d" "e")

The same effects can be achieved by combining generator->list and gmap.

 
(generator->list (gmap proc gen gen2 …))
generator-last gen

Returns the last item from the generator.

generator-find pred gen

Returns the first item from the generator gen that satisfies the predicate pred.

generator-length gen

Returns the number of items available from the generator gen.

generator-count pred gen

Returns the number of items available from the generator gen that satisfy the predicate pred.

generator-any pred gen

Returns #t if any of the values available from the generator gen satisfy the predicate pred. After such a value is found, the procedure returns without consuming the rest of the generator.

generator-every pred gen

Returns #t if any of the values available from the generator gen do not satisfy the predicate pred. After such a value is found, the procedure returns without consuming the rest of the generator.

generator=? pred gen ...

Returns #f if the corresponding values available from the generators gen are not equal in the sense of the predicate pred. After such a value is found, the procedure returns without consuming the rest of the generators. If all the values of all the generators have been consumed without finding any values that are not equal, generator=? returns #t.

generator-unfold gen unfold arg ...

Equivalent to (unfold eof-object? (lambda (x) x) gen (gen) args ...). The values of gen are unfolded into the collection that unfold creates. The combination of make-for-each-generator and generator-unfold makes it possible to convert any collection that has a for-each procedure into any collection that has an unfold constructor.

Syntax

generator-let* (binding …) body body2 …

This captures a monadic pattern that frequently appears in generator-related code. It is similar in spirit to and-let* from SRFI 2, but returns as soon as an evaluated expression returns an end-of-file object.

The binding part can be either (var expr) or ( expr ). The actual definition will explain this syntax clearly.

 
(define-syntax generator-let*
  (syntax-rules ()
    ((_ () body body2 ...) (begin body body2 ...))
    ((_ ((var gen-expr) more-bindings ...) . body)
     (let ((var gen-expr))
       (if (eof-object? var)
         var
         (generator-let* (more-bindings ...) . body))))
    ((_ (( gen-expr ) more-bindings ...) . body)
     (let ((var gen-expr))
       (if (eof-object? var)
         var
         (generator-let* (more-bindings ...) . body))))))
do-generator (var gexpr) body …

Gexpr is an expression that yields a generator. It is evaluated once. The resulting generator is called repeatedly until it returns an end-of-file object. Every time the generator is called, the body expressions are evaluated in a scope where var is bound to the value returned from the generator.

This macro exists for performing side-effects. You can do the same thing with generator-for-each, but sometimes this macro makes the imperative code more readable:

 
(do-generator (line read-line)
  ;; do some side-effecting stuff with line
  )