This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home. For a version of this page that may be more recent, see GeneratorsGauche in WG2's repo for R7RS-large.

Generators­Gauche

cowan
2014-09-27 16:01:54
8history
source

Generators

A generator is merely a procedure with no arguments that works as a source of a series of values. Every time it is called, it yields a value. Returning an end-of-file object indicates the generator is exhausted. For example, read-char can be seen as a generator that generates characters from the current input port.

It is common practice to abstract the sources of values in such a way, so it is useful to define utility procedures that work on generators. This module provides them.

A generator is very lightweight, and useful for implementing simple on-demand calculations. However, keep in mind that it is side-effecting construct; you can't safely backtrack, for example.

These procedures are from the Gauche core and the Gauche modules gauche.generator and data.random, with some renaming to make them more systematic.

Generator constructors

A generator isn't a special datatype but just an ordinary procedure, so you can make a generator with lambdas. This module provides some common generator constructors for convenience.

If you want to use your procedure as a generator, note that a generator can be invoked many times even after it returns an end-of-file object once. You have to code it so that once it returns an end-of-file object, it keeps returning an end-of-file object for all subsequent calls.

The result of generator constructors is merely a procedure, and printing it doesn't show much. In the examples in this section we use generator->list to convert the generator to the list.

These procedures have names beginning with make- and ending with -generator.

make-generator arg …

The simplest generator. Returns the given arguments followed by an end-of-file object when called.

make-circular-generator arg …

Returns an infinite generator that repeats the given arguments forever.

 
(generator->list (circular-generator 1 2 3) 10)
  ⇒ (1 2 3 1 2 3 1 2 3 1)

Note that the above example limits the length of the converted list by 10; otherwise generator->list won't return.

make-iota-generator [ count [ start [ step ] ] ]

Creates a generator of a series of count numbers (by default, an infinite number), starting from start (0 by default) and increased by step (1 by default).

 
(generator->list (make-iota-generator 10 3 2))
  ⇒ (3 5 7 9 11 13 15 17 19 21)

If both start and step are exact, the generator yields exact numbers; otherwise it yields inexact numbers.

 
(generator->list (giota +inf.0 1/2 1/3) 6)
  ⇒ (1/2 5/6 7/6 3/2 11/6 13/6)
(generator->list (giota +inf.0 1.0 2.0) 5)
  ⇒ (1.0 3.0 5.0 7.0 9.0)
make-range-generator start end [ step ]

Creates a generator of a series of numbers. The series begins with start, increases by step (default 1), and continues while the number is below end.

 
(generator->list (make-range-generator 3 8))
  ⇒ (3 4 5 6 7)
make-tabulation-generator [k] proc

Creates a generator of a series of k numbers (by default, an infinite number), namely (proc 0), (proc 1), (proc 2), ....

make-coroutine-generator proc

Creates a generator from a coroutine.

The proc argument is a procedure that takes one argument, yield. When called, generate immediately returns a generator g. When g is called, proc runs until it calls yield. Calling yield causes the execution of proc to be suspended, and g returns the value passed to yield.

Once proc returns, it is the end of the series — g returns an end-of-file object from then on. The return value of proc is ignored.

The following code creates a generator that produces a series 0, 1, and 2 (effectively the same as (make-iota-generator 3) and binds it to g.

 
(define g
  (generate
   (lambda (yield) (let loop ((i 0))
               (when (< i 3) (yield i) (loop (+ i 1)))))))

(generator->list g) ⇒ (0 1 2)
make-list-generator lis [ start [ end ] ]
make-vector-generator vec [ start [ end ] ]
make-reverse-vector-generator vec [ start [ end ] ]
make-string-generator str [ start [ end ] ]

These procedures return generators that yields each item in the given argument. A generator returned from make-vector-reverse-generator procedures runs in reverse order.

 
(make-generator-list (list->generator '(1 2 3 4 5)))
  ⇒ (1 2 3 4 5)
(generator->list (make-vector-generator '#(1 2 3 4 5)))
  ⇒ (1 2 3 4 5)
(generator->list (make-reverse-vector-generator '#(1 2 3 4 5)))
  ⇒ (5 4 3 2 1)
(generator->list (make-string-generator "abcde"))
  ⇒ (#\a #\b #\c #\d #\e)

By default the generator is exhausted once all items are retrieved; the optional start and end arguments can limit the range the generator walks across.

For forward generators, the first value the generator yields is start-th element, and it ends right before end-th element. For reverse generators, the first value is the item right before to the end-th element, and the last value is the start-th element.

 
(generator->list (vector->generator '#(a b c d e) 2))
  ⇒ (c d e)
(generator->list (make-vector-generator '#(a b c d e) 2 4))
  ⇒ (c d)
(generator->list (make-reverse-vector-generator '#(a b c d e) 2))
  ⇒ (e d c b)
(generator->list (make-reverse-vector-generator '#(a b c d e) 2 4))
  ⇒ (d c)
(generator->list (make-reverse-vector-generator '#(a b c d e) #f 2))
  ⇒ (b a)
make-bits-generator n

This procedure takes an exact integer and treats it as a sequence of boolean values (0 for false and 1 for true), taking bits from the least significant bit.

 
(generator->list (bits->generator #b10110))
 ⇒ (#f #t #t #f #t)
make-port-sexp-generator input-port
make-port-line-generator input-port
make-port-char-generator input-port
make-port-byte-generator input-port

Returns generators that read characters or bytes from the given port, respectively.

make-for-each-generator obj for-each

A generic version to convert any collection obj to a generator that walks across the obj using the for-each procedure appropriate for obj.

make-gunfold-generator p f g seed

A generator constructor similar to unfold.

P is a predicate that takes a seed value and determines where to stop. F is a procedure that calculates a value from a seed value. G is a procedure that calculates the next seed value from the current seed value.

For each call of the resulting generator, p is called with the current seed value. If it returns true, then we see we've done, and we return an end-of-file object. Otherwise, we apply f on the current seed value to get the value to return, and use g to update the seed value.

 
(generator->list (gunfold (^s (> s 5)) (^s (* s 2)) (^s (+ s 1)) 0))
  ⇒ '(0 2 4 6 8 10)

Generator operations

The following procedures take generators (noted as gen and gen2) and return a generator. For the convenience, they also accept any collection to gen and gen2 parameters; if a collection is passed where a generator is expected, it is implicitly coerced into a generator.

The names of these procedures begin with g.

gcons* item … gen

Returns a generator that adds items in front of gen.

 
(generator->list (gcons* 'a 'b (giota 2)))
 ⇒ (a b 0 1)
gappend gen …

Returns a generator that yields the items from the first given generator, and once it is exhausted, use the second generator, and so on.

 
(generator->list (gappend (giota 3) (giota 2)))
 ⇒ (0 1 2 0 1)

(generator->list (gappend))
 ⇒ ()
gconcatenate gen

The gen argument should generate generators and/or sequences. Returns a generator that yields elements from the first generator/sequence, then the second one, then the third, etc.

It is similar to (apply gappend (generator->list gen)), except that gconcatenate can work even gen generates infinite number of generators.

 
($ generator->list $ gconcatenate
   $ list->generator `(,(giota 3) ,(giota 2)))
 ⇒ (0 1 2 0 1)
gmerge comparator gen gen2 …
gunion comparator gen gen2 …
gintersection comparator gen gen2 …

Creates a generator that yields elements out of input generators, with the order determined by a SRFI 114 comparator. Gmerge returns all the elements; gunion returns all the elements that are distinct in the sense of the comparator; gintersection returns all the elements that appear in all the generators.

Each input generator must yield appropriately ordered elements by itself; otherwise the result won't be ordered.

If only one generator is given, it is just returned. In that case, constructor is ignored.

 
(generator->list (gmerge < '(1 3 8) '(5) '(2 4)))
  ⇒ '(1 2 3 4 5 8)
gmap proc gen gen2 …

Returns a generator that yields a value returned by proc applied on the values from given generators. The returned generator terminates when any of the given generator is exhausted.

Note: This differs from generator-collect, which consumes all values at once and returns the results as a list, while gmap returns a generator immediately without consuming input.

gmap-accum proc seed gen gen2 …

A generator version of map-accum, mapping with states.

The proc argument is a procedure that takes as many arguments as the input generators plus one. It is called as (proc v v2 … seed) where v, v2, … are the values yielded from the input generators, and seed is the current seed value. It must return two values, the yielding value and the next seed.

gfilter pred gen
gremove pred gen

Returns a generator that yield the items from the source generator, except those on which pred answers false or true respectively.

gfilter-map proc gen gen2 …

Works the same as (gfilter values (gmap proc gen gen2 …)), only slightly more efficiently.

gstate-filter proc seed gen

This allows stateful filtering of a series. The proc argument must be a procedure that takes a value V from the source generator and a seed value. It should return two values, a boolean flag and the next seed value. If it returns true for the boolean flag, the generator yields V. Otherwise, the generator keeps calling proc, with updating the seed value, until it sees the true flag value or the source generator is exhausted.

The following example takes a generator of oscillating values and yields only values that are greater than their previous value.

 
(generator->list
 (gstate-filter (^[v s] (values (< s v) v)) 0
                (list->generator '(1 2 3 2 1 0 1 2 3 2 1 0 1 2 3))))
 ⇒ (1 2 3 1 2 3 1 2 3)
gbuffer-filter proc seed gen

This procedure allows n-to-m mapping between elements in input and output — that is, you can take a look at serveral input elements to generate one or more output elements.

The procedure proc receives the next input element and accumulated seed value. It returns two values: A list of output values, and the next seed value. If you need to look at more input to generate output, you can return an empty list as the first value.

If the input reaches the end, the output ends when the output of the last call to proc is exhausted (the last seed value is discarded).

Suppose you have a text file. Each line contains a command, but if the line ends with backslash, next line is treated as a continuation of the current line. The following code creates a generator that returns logical lines, that is, the lines after such line continuations are taken care of.

 
(gbuffer-filter (^[v s]
                  (if-let1 m (#/\\$/ v)
                    (values '() (cons (m 'before) s))
                    (values `(,(string-concatenate-reverse (cons v s))) '())))
                '()
                (file->line-generator "input-file.txt")
                (^[s] `(,(string-concatenate-reverse s))))
gtake gen k [ padding ]
gdrop gen k

The generator version of take and drop. Gtake returns a generator that yields (at most) the first k items of the source generator, while gdrop returns a generator that skips the first k items of the source generator.

These won't complain if the source generator is exhausted before generating k items. By default, the generator returned by gtake terminates as the source ends, but if you provide the padding argument, then the returned generator will yield k items, using the padding value to fill the rest.

gtake-while pred gen
gdrop-while pred gen

The generator version of take-while and drop-while. The generator returned from gtake-while yields items from the source generator as long as pred returns true for each. The generator returned from gdrop-while first reads items from the source generator while pred returns true for them, then starts yielding items.

gslices gen k [ padding ]

This returns a generator, that yields a list of k items from the input generator gen at a time.

 
(generator->list (gslices (giota 7) 3))
  ⇒ ((0 1 2) (3 4 5) (6))

The padding argument controls how the end of input is handled, just like gtake. When padding is not provided, the last item from the output generator may not have k items if the input is too short to fill them, as shown in the above example. If padding is present and the input is too short to complete k items, padding is used to fill the rest.

 
(generator->list (gslices (giota 6) 3 #t 'x))
  ⇒ ((0 1 2) (3 4 5))
(generator->list (gslices (giota 7) 3 #t 'x))
  ⇒ ((0 1 2) (3 4 5) (6 x x))
gpairs car-gen cdr-gen

Returns a generator that yields pairs, whose car is generated from car-gen and whose cdr is generated from cdr-gen.

 
(define g
  (gpairs (make-generator 113 101 12 68 -55)
          (make-generator #t #f #t #f #f))

(generator->list g 10)
 ⇒ ((113 . #t) (101 . #f) (12 . #t) (68 . #f) (-55 . #f))
gtuple gen …

Returns a generator that yields lists whose i-th element is generated from the i-th argument.

 
(define g
  (gtuples (make-generator -43 53 -114)
           (make-generator #f #f #f)
           (make-generator #\8 #\1 #\i))

(generator->list g 3)
 ⇒ ((-43 #f #\8) (53 #f #\1) (-114 #f #\i))
glists sizer item-gen
gvectors sizer item-gen
gstrings sizer item-gen

Creates a generator that generates lists, vectors or strings of values from item-gen, respectively. The size of each datum is determined by sizerwhich can be either a nonnegative integer or a generator of nonnegative integers. The value of the sizer determines the length of the result data.

gdelete item gen [ = ]
gdelete-neighbor-dups = item-gen

Creates a generator that returns whatever gen returns, except for any items that are the same as item in the sense of =, which defaults to equal?.

gdelete-neighbor-dupsgen [ = ]

Creates a generator that returns whatever gen returns, except for any items that are equal to the preceding item in the sense of =, which defaults to equal?.

Folding generated values

These procedures consume all the values of the generator passed to them. They have names beginning with generator-.

generator->list generator [ k ]

Reads items from generator and returns a list of them. By default, this reads until the generator is exhausted. If an optional argument k is given, it must be a nonnegative integer, and the list ends either k items are read, or the generator is exhausted.

Be careful not to pass an infinite generator to this without specifying k, or this procedure will not return.

generator->reverse-list generator [ k ]

Reads items from generator and returns a list of them in reverse order. By default, this reads until the generator is exhausted. If an optional argument k is given, it must be a nonnegative integer, and the list ends either k items are read, or the generator is exhausted.

Be careful not to pass an infinite generator to this without specifying k, or this procedure will not return.

generator-fold proc seed gen gen2 …

Works like fold on the generated values by generator procedures gen gen2 ….

When one generator is given, for each value v generated by gen, proc is called as (proc v r), where r is the current accumulated result; the initial value of the accumulated result is seed, and the return value from proc becomes the next accumulated result. When gen returns an end-of-file object, the accumulated result at that time is returned from generator-fold.

When more than one generator is given, proc is called as (proc v1 v2r), where v1, v2 … are the values yielded from gen, gen2, …, respectively, and r is the current accumulated result. The iteration terminates when any one of the generators returns an end-of-file object.

 
(with-input-from-string "a b c d e"
  (cut generator-fold cons 'z read))
  ⇒ (e d c b a . z)
generator-for-each proc gen gen2 …

A generator version of for-each. Repeatedly applies proc on the values yielded by gen, gen2 … until any one of the generators yields an end-of-file object. The values returned from proc are discarded.

This procedure consumes generated values using side effects.

generator-collect proc gen gen2 …

A generator analogue of map. Repeatedly applies proc on the values yielded by gen, gen2 … until any one of the generators yields an end-of-file object. The values returned from proc are collected into a list and returned.

 
(with-input-from-string "a b c d e"
  (cut generator-map symbol->string read))
  ⇒ ("a" "b" "c" "d" "e")

The same effects can be achieved by combining generator->list and gmap.

 
(generator->list (gmap proc gen gen2 …))
generator-last gen

Returns the last item from the generator.

generator-find pred gen

Returns the first item from the generator gen that satisfies the predicate pred.

generator-length gen

Returns the number of items available from the generator gen.

generator-count pred gen

Returns the number of items available from the generator gen that satisfy the predicate pred.

generator-any pred gen

Returns #t if any of the values available the generator gen satisfy the predicate pred. For consistency, this procedure consumes any remaining values from the generator.

generator-every pred gen

Returns #t if all of the values available the generator gen satisfy the predicate pred.

Syntax

glet* (binding …) body body2 …

This captures a monadic pattern frequently appears in the generator code. It is in a similar spirit to and-let*, but returns as soon as the evaluated expression returns an end-of-file object, instead of #f as and-let* does.

The binding part can be either (var expr) or ( expr ). The actual definition will explain this syntax clearly.

 
(define-syntax glet*
  (syntax-rules ()
    [(_ () body body2 ...) (begin body body2 ...)]
    [(_ ([var gen-expr] more-bindings ...) . body)
     (let1 var gen-expr
       (if (eof-object? var)
         var
         (glet* (more-bindings ...) . body)))]
    [(_ ([ gen-expr ] more-bindings ...) . body)
     (let1 var gen-expr
       (if (eof-object? var)
         var
         (glet* (more-bindings ...) . body)))]))
do-generator (var gexpr) body …

Gexpr is an expression that yields a generator. It is evaluated once. The resulting generator is called repeatedly until it returns an end-of-file object. Every time the generator is called, body … are evaluated in a scope where var is bound to the value yielded from the generator.

This macro exists for side-effects. You can write the same thing with for-each families, but sometimes this macro makes the imperative code more readable:

 
(do-generator [line (file->line-generator "filename")]
  ;; do some side-effecting stuff with line
  )