Abstract
Rationale
Procedure index
Specification
Sample Implementation
Acknowledgements
References & links
Copyright

Abstract

Lazy sequences (or lseqs) are a generalization of lists. In particular, an lseq is either a proper list or a dotted list whose last cdr is a SRFI 121 generator. A generator is a procedure that can be invoked with no arguments in order to lazily supply additional elements of the lseq. When a generator has no more elements to return, it returns an end-of-file object. Consequently, lazy sequences cannot contain end-of-file objects.

This proposal provides a set of procedures suitable for operating on lazy sequences based on SRFI 1.

Rationale

Lazy sequences are more heavyweight than generators, on which they are based, but they are more lightweight than SRFI 41 streams. However, streams are even, as explained in the SRFI 41 rationale; that is, the initial state of a stream does not have any elements that have already been realized. By contrast, lazy sequences are odd, meaning that at least one element is realized at all times unless the lseq is empty. Therefore, when you construct an lseq in an iterative lazy algorithm, only the cdr side of the lazy pair is lazily evaluated; the car side is evaluated immediately, even if you don’t use it.

In most cases this doesn't matter, because calculating one additional item is a negligible overhead. However, when you create a self-referential lazy structure, in which the earlier elements of a sequence are used to caculate the latter elements of itself, a bit of caution is needed; code that is valid for circular streams may not terminate if it is mechanically converted to use lazy sequences. This eagerness is also visible when side effects are involved; for example, a lazy character sequence reading from a port may read one character ahead.

This proposal is less comprehensive than SRFI 1, because it omits most procedures that process every element of the list (at least, when used in the absence of call/cc). For example, there are left folds but no right folds, since it's just as efficient to use lazy-realize and then ordinary SRFI 1 fold-right. The linear-update procedures of SRFI 1 are also left out, as lazy lists are not intended to be mutated.

Procedure Index

Here is a short list of the procedures provided by this SRFI.

Constructors

generator->lseq

Predicates

lseq?         lseq=

Selectors

lazy-car     lazy-cdr
lazy-first   lazy-second  lazy-third  lazy-fourth lazy-fifth
lazy-sixth   lazy-seventh lazy-eighth lazy-ninth  lazy-tenth
lazy-rest    lazy-ref
lazy-take    lazy-drop lazy-split-at

The whole lazy sequence

lazy-realize lseq->generator
lazy-length
lazy-append  lazy-concatenate 
lazy-zip     lazy-unzip1       lazy-unzip2
lazy-unzip3  lazy-unzip4       lazy-unzip5

Folding and mapping

lazy-map
lazy-fold       lazy-reduce 
lazy-for-each   lazy-pair-for-each

Searching

lazy-member       lazy-memq     lazy-memv
lazy-find         lazy-find-rest 
lazy-any          lazy-every
lazy-index
lazy-take-while   lazy-drop-while
lazy-span         lazy-break

Lazy association lists

lazy-assoc        lazy-assq       lazy-assv

Comparators

lazy-comparator   make-lazy-comparator

Specification

Except as noted, if any of these procedures accepts multiple lseq arguments, then at least one of them must be finite: that is, it must either be a proper list, or contain a generator that eventually returns an end-of-file object.

The templates given below obey the following conventions for procedure formals:

`lseq`	A lazy sequence
`x`, `y`, `a`, `b`	Any value
`object`, `value`	Any value
`n`, `i`	A natural number (an integer >= 0)
`proc`	A procedure
`pred`	A procedure whose return value is treated as a boolean
`generator`	A procedure with no arguments that returns a sequence of values
`=`	A boolean procedure taking two arguments

To interpret the examples, pretend that they are executed on a Scheme that prints lazy sequences with the syntax of lists.

Constructors

Every list constructor procedure is also an lseq constructor procedure. The procedure generator->lseq constructs an lseq based on the values of a generator. In order to prepend a realized value to a generator, simply use cons; to prepend more than one value, use SRFI 1's cons*.

generator->lseq generator -> lseq

Returns an lseq whose elements are the values generated by generator. The exact behavior is as follows:

Generator is invoked with no arguments to produce an object obj.
If obj is an end-of-file object, the empty list is returned.
Otherwise, a newly allocated pair whose car is obj and whose cdr is generator is returned.

(generator->lseq (make-circular-generator 'c)) => (c c c ...)

Predicates

lseq? x -> boolean

Returns #t iff x is an lseq, otherwise #f. This procedure may return #t if x is an improper list whose last car is a procedure that requires arguments, since there is no portable way to examine a procedure to determine how many arguments it requires.

lseq= elt= lseq₁ ... -> boolean

Determines lseq equality, given an element-equality procedure. The lseq A equals the lseq B if they are of the same length, and their corresponding elements are equal, as determined by elt=. If the element-comparison procedure's first argument is from lseq_i, then its second argument is from lseq_i+1, i.e. it is always called as (elt= a b) for a an element of lseq A, and b an element of lseq B.

In the n-ary case, every lseq_i is compared to lseq_i+1 (as opposed, for example, to comparing lseq₁ to lseq_i, for i>1). If there are no lseq arguments at all, lseq= simply returns true.

The dynamic order in which the elt= procedure is applied to pairs of elements is not specified. For example, if lseq= is applied to three lseqs, A, B, and C, it may first completely compare A to B, then compare B to C, or it may compare the first elements of A and B, then the first elements of B and C, then the second elements of A and B, and so forth.

The equality procedure must be consistent with eq?. That is, it must be the case that

(eq? x y) => (elt= x y).

Note that this implies that two lseqs which are eq? are always lseq=, as well; implementations may exploit this fact to "short-cut" the element-by-element comparisons.

(lseq= eq?) => #t       ; Trivial cases
(lseq= eq? '(a)) => #t

Selectors

lazy-car lseq -> object
lazy-first lseq -> object
lazy-cdr lseq -> object
lazy-rest lseq -> object
lazy-second lseq -> object
lazy-third lseq -> object
lazy-fourth lseq -> object
lazy-fifth lseq -> object
lazy-sixth lseq -> object
lazy-seventh lseq -> object
lazy-eighth lseq -> object
lazy-ninth lseq -> object
lazy-tenth lseq -> object
: Returns the second, third, ..., or tenth element of lseq.

(lazy-third '(a b c d e)) => c
lazy-ref lseq i -> value: Returns the ith element of lseq. (This is the same as (lazy-first (lazy-drop lseq i)).) It is an error if i >= n, where n is the length of lseq.

(lazy-ref '(a b c d) 2) => c
lazy-take lseq i -> list
lazy-drop lseq i -> lseq
: lazy-take returns the first i elements of lseq as a proper list.
lazy-drop returns all but the first i elements of lseq.

(lazy-take '(a b c d e) 2) => (a b) (lazy-drop '(a b c d e) 2) => (c d e)

lazy-drop is exactly equivalent to performing i lazy-rest operations on lseq.
lazy-split-at lseq i -> [lseq object]: Splits the lseq lseq at index i, returning two values, a list of the first i elements, and an lseq of the remaining elements. It is equivalent to

(values (lazy-take lseq i) (lazy-drop lseq i))

The whole lazy sequence

lazy-realize lseq -> list: Repeatedly applies lazy-cdr to lseq until its generator (if there is one) has been exhausted, and returns lseq, which is now guaranteed to be a proper list. This procedure can be called on an arbitrary lseq before passing it to a procedure which only accepts lists. However, if the generator never returns an end-of-file object, lazy-realize will never return.

lseq->generator lseq -> generator: Returns a generator which when invoked will return all the elements of lseq, including any that have not yet been realized.

lazy-length lseq -> integer

Returns the length of its argument, which is the non-negative integer n such that lazy-rest applied n times to the lseq produces an empty lseq.

gappend gen …

Returns a generator that yields the items from the first given generator, and once it is exhausted, use the second generator, and so on.

(generator->list (gappend (make-iota-generator 3) (make-iota-generator 2)))
 ⇒ (0 1 2 0 1)

(generator->list (gappend))
 ⇒ ()

lazy-concatenate lseq

The lseq argument is an lseq of lseqs. Returns an lseq that whose elements are all the elements of the first lseq, then all the elements of the second one, then the third, etc.

It is similar to (apply lazy-append (lazy-realize gen)), except that lazy-concatenate can work even if lseq contains an infinite number of lseqs.

(lazy-concatenate
  (list (generator->lseq (make-iota-generator 3))
        (generator->lseq (make-iota-generator 2))))
 ⇒ (0 1 2 0 1)

lazy-zip lseq₁ lseq₂ ... -> lseq

If lazy-zip is passed n lseqs, it returns an lseq as long as the shortest of these lseqs, each element of which is an n-element list comprised of the corresponding elements from the arguments.

(lazy-zip '(one two three) 
     (make-iota-generator +inf.0 1 1)
     '(odd even odd even odd even odd even))
    => ((one 1 odd) (two 2 even) (three 3 odd))

(lazy-zip '(1 2 3)) => ((1) (2) (3))

lazy-unzip1 list-of-lseqs -> lseq

lazy-unzip2 list-of-lseqs -> [lseq lseq]

lazy-unzip3 list-of-lseqs -> [lseq lseq lseq]

lazy-unzip4 list-of-lseqs -> [lseq lseq lseq lseq]

lazy-unzip5 lslist-of-lseqs -> [lseq lseq lseq lseq lseq]

lazy-unzip1 takes a list of lseqs, where every lseq must contain at least one element, and returns an lseq containing the initial element of each such lseq. That is, it returns (lazy-map lazy-first lseqs). lazy-unzip2 takes a list of lseqs, where every lseq must contain at least two elements, and returns two values: an lseq of the first elements, and an lseq of the second elements. lazy-unzip3 does the same for the first three elements of the lseqs, and so forth.

(lazy-unzip2 (list '(1 one) '(2 two) '(3 three))) =>
    (1 2 3) 
    (one two three)

Folding and mapping

lazy-fold kons knil lseq₁ lseq₂ ... -> value

The fundamental lseq iterator.

First, consider the single lazy-parameter case. If lseq₁ = (e₁ e₂ ... e_n), then this procedure returns

(kons e_n ... (kons e₂ (kons e₁ knil)) ... )

That is, it obeys the (tail) recursion

(lazy-fold kons knil lis) = (lazy-fold kons (kons (lazy-first lis) knil) (lazy-last lis))
(lazy-fold kons knil '()) = knil

Examples:

(lazy-fold + 0 lseq)			; Add up the elements of lseq.

(lazy-fold cons '() lseq)		; Reverse lseq.

;; How many symbols in lseq?
(lazy-fold (lambda (x count) (if (symbol? x) (+ count 1) count))
      0
      lseq)

;; Length of the longest string in lseq:
(lazy-fold (lambda (s max-len) (max max-len (string-length s)))
      0
      lseq)

If n lseq arguments are provided, then the kons function must take n+1 parameters: one element from each lseq, and the "seed" or fold state, which is initially knil. The fold operation terminates when the shortest lseq runs out of values:

(lazy-fold lseq* '() '(a b c) '(1 2 3 4 5)) => (c 3 b 2 a 1)

lazy-reduce f ridentity lseq -> value

lazy-reduce is a variant of lazy-fold.

ridentity should be a "right identity" of the procedure f — that is, for any value x acceptable to f,

(f x ridentity) = x

lazy-reduce has the following definition:

If lseq = (), return ridentity;
Otherwise, return (lazy-fold f (lazy-first lseq) (lazy-last lseq)).

...in other words, we compute (lazy-fold f ridentity lseq).

Note that ridentity is used only in an empty-lseq case. You typically use lazy-reduce when applying f is expensive and you'd like to avoid the extra application incurred when lazy-fold applies f to the head of lseq and the identity value, redundantly producing the same value passed in to f. For example, if f involves searching a file directory or performing a database query, this can be significant. In general, however, lazy-fold is useful in many contexts where lazy-reduce is not (consider the examples given in the lazy-fold definition — only one of the folds uses a function with a right identity. The other four may not be performed with lazy-reduce).

;; take the max of an lseq of non-negative integers.
(lazy-reduce max 0 nums)

lazy-map proc lseq₁ lseq₂ ... -> lseq

proc is a procedure taking as many arguments as there are lseq arguments and returning a single value. lazy-map applies proc element-wise to the elements of the lseqs and returns an lseq of the results, in order. The dynamic order in which proc is applied to the elements of the lseqs is unspecified.

(lazy-map lazy-second '((a b) (d e) (g h))) =>  (b e h)

(lazy-map (lambda (n) (expt n n))
     (make-iota-generator +inf.0 1 1)
    =>  (1 4 27 256 3125 ...)

(lazy-map + '(1 2 3) '(4 5 6)) =>  (5 7 9)

(let ((count 0))
  (lazy-map (lambda (ignored)
         (set! count (+ count 1))
         count)
       '(a b))) =>  (1 2) or (2 1)

lazy-for-each proc lseq₁ lseq₂ ... -> unspecified

lazy-pair-for-each proc lseq₁ lseq₂ ... -> unspecified

The arguments to lazy-for-each are like the arguments to lazy-map, but lazy-for-each calls proc for its side effects rather than for its values. Unlike lazy-map, lazy-for-each is guaranteed to call proc on the elements of the lseqs in order from the first element(s) to the last, and the value returned by lazy-for-each is unspecified.

The procedure lazy-pair-for-each is the same as lazy-for-each, except that it calls proc on each pair rather than each element.

(let ((v (make-vector 5)))
  (lazy-for-each (lambda (lazy-)
              (vector-set! v i (* i i)))
            '(0 1 2 3 4))
  v)  =>  #(0 1 4 9 16)

Searching

The following procedures all search lseqs for the leftmost element satisfying some criteria.

lazy-find pred lseq -> value

Return the first element of lseq that satisfies predicate pred; false if no element does.

(lazy-find even? '(3 1 4 1 5 9)) => 4

Note that lazy-find has an ambiguity in its lookup semantics — if lazy-find returns #f, you cannot tell (in general) if it found a #f element that satisfied pred, or if it did not find any element at all. In many situations, this ambiguity cannot arise — either the lseq being searched is known not to contain any #f elements, or the lseq is guaranteed to have an element satisfying pred. However, in cases where this ambiguity can arise, you should use lazy-find-tail instead of lazy-find — lazy-find-tail has no such ambiguity:

(cond ((lazy-find-tail pred lseq) => (lambda (lseq) ...))
      (else ...)) ; Search failed.

lazy-find-tail pred lseq -> lseq or false

Return the longest tail of lseq whose first element satisfies pred. If no element does, return false.

lazy-find-tail can be viewed as a general-predicate variant of the lazy-member function.

Examples:

(lazy-find-tail even? '(3 1 37 -8 -5 0 0)) => (-8 -5 0 0)
(lazy-find-tail even? '(3 1 37 -5)) => #f

;; imember x lseq:
(lazy-find-tail (lambda (elt) (equal? x elt)) lseq)

lazy-find-tail is essentially lazy-drop-while, where the sense of the predicate is inverted: lazy-find-tail searches until it finds an element satisfying the predicate; lazy-drop-while searches until it finds an element that doesn't satisfy the predicate.

lazy-take-while pred lseq -> lseq

Returns the longest initial prefix of lseq whose elements all satisfy the predicate pred.

(lazy-take-while even? '(2 18 3 10 22 9)) => (2 18)

lazy-drop-while pred lseq -> lseq

Drops the longest initial prefix of lseq whose elements all satisfy the predicate pred, and returns the rest of the lseq.

(lazy-drop-while even? '(2 18 3 10 22 9)) => (3 10 22 9)

lazy-span pred lseq -> [lseq lseq]

lazy-break pred lseq -> [lseq lseq]

lazy-span splits the lseq into the longest initial prefix whose elements all satisfy pred, and the remaining tail. lazy-break inverts the sense of the predicate: the tail commences with the first element of the input lseq that satisfies the predicate.

In other words: lazy-span finds the initial span of elements satisfying pred, and lazy-break breaks the lseq at the first element satisfying pred.

lazy-span is equivalent to

(values (lazy-take-while pred lseq) 
        (lazy-drop-while pred lseq))

(lazy-span even? '(2 18 3 10 22 9)) =>
  (2 18)
  (3 10 22 9)

(lazy-break even? '(3 1 4 1 5 9)) =>
  (3 1)
  (4 1 5 9)

lazy-any pred lseq₁ lseq₂ ... -> value

Applies the predicate across the lseqs, returning true if the predicate returns true on any application.

If there are n lseq arguments lseq₁ ... lseq_n, then pred must be a procedure taking n arguments and returning a boolean result.

lazy-any applies pred to the first elements of the lseq_i parameters. If this application returns a true value, lazy-any immediately returns that value. Otherwise, it iterates, applying pred to the second elements of the lseq_i parameters, then the third, and so forth. The iteration stops when a true value is produced or one of the lseqs runs out of values; in the latter case, lazy-any returns #f. The application of pred to the last element of the lseqs is a tail call.

Note the difference between lazy-find and lazy-any — lazy-find returns the element that satisfied the predicate; lazy-any returns the true value that the predicate produced.

Like lazy-every, lazy-any's name does not end with a question mark — this is to indicate that it does not return a simple boolean (#t or #f), but a general value.

(lazy-any integer? '(a 3 b 2.7))   => #t
(lazy-any integer? '(a 3.1 b 2.7)) => #f
(lazy-any < '(3 1 4 1 5)
       '(2 7 1 8 2)) => #t

lazy-every pred lseq₁ lseq₂ ... -> value

Applies the predicate across the lseqs, returning true if the predicate returns true on every application.

If there are n lseq arguments lseq₁ ... lseq_n, then pred must be a procedure taking n arguments and returning a boolean result.

lazy-every applies pred to the first elements of the lseq_i parameters. If this application returns false, lazy-every immediately returns false. Otherwise, it iterates, applying pred to the second elements of the lseq_i parameters, then the third, and so forth. The iteration stops when a false value is produced or one of the lseqs runs out of values. In the latter case, lazy-every returns the true value produced by its final application of pred. The application of pred to the last element of the lseqs is a tail call.

If one of the lseq_i has no elements, lazy-every simply returns #t.

Like lazy-any, lazy-every's name does not end with a question mark — this is to indicate that it does not return a simple boolean (#t or #f), but a general value.

lazy-index pred lseq₁ lseq₂ ... -> integer or false

Return the index of the leftmost element that satisfies pred.

If there are n lseq arguments lseq₁ ... lseq_n, then pred must be a function taking n arguments and returning a boolean result.

lazy-index applies pred to the first elements of the lseq_i parameters. If this application returns true, lazy-index immediately returns zero. Otherwise, it iterates, applying pred to the second elements of the lseq_i parameters, then the third, and so forth. When it finds a tuple of lseq elements that cause pred to return true, it stops and returns the zero-based index of that position in the lseqs.

The iteration stops when one of the lseqs runs out of values; in this case, lazy-index returns #f.

(lazy-index even? '(3 1 4 1 5 9)) => 2
(lazy-index < '(3 1 4 1 5 9 2 5 6) '(2 7 1 8 2)) => 1
(lazy-index = '(3 1 4 1 5 9 2 5 6) '(2 7 1 8 2)) => #f

lazy-member x lseq [=] -> lseq

lazy-memq x lseq -> lseq

lazy-memv x lseq -> lseq

These procedures return the longest tail of lseq whose first element is x, where the tails of lseq are the non-empty lseqs returned by (lazy-drop lseq i) for i less than the length of lseq. If x does not occur in lseq, then #f is returned. lazy-memq uses eq? to compare x with the elements of lseq, while lazy-memv uses eqv?, and lazy-member uses =, which defaults to equal?.

    (lazy-memq 'a '(a b c))           =>  (a b c)
    (lazy-memq 'b '(a b c))           =>  (b c)
    (lazy-memq 'a '(b c d))           =>  #f
    (lazy-memq (lseq 'a) '(b (a) c)) =>  #f
    (lazy-member (lseq 'a)
            '(b (a) c))           =>  ((a) c)
    (lazy-memq 101 '(100 101 102))    =>  *unspecified*
    (lazy-memv 101 '(100 101 102))    =>  (101 102)

The comparison procedure is used to compare the elements e_i of lseq to the key x in this way:


(= x e_i)		; lseq is (E1 ... En)

That is, the first argument is always x, and the second argument is one of the lseq elements. Thus one can reliably find the first element of lseq that is greater than five with (lazy-member 5 lseq <)

Note that fully general lseq searching may be performed with the lazy-find-tail and lazy-find procedures, e.g.

(lazy-find-tail even? lseq) ; Find the first elt with an even key.

Lazy association lists

An "lazy association list" (or "lazy alist") is an lseq of pairs. The car of each pair contains a key value, and the cdr contains the associated data value. They can be used to construct simple look-up tables in Scheme. Note that lazy alists are probably inappropriate for performance-critical use on large data; in these cases, immutable maps or some other alternative should be employed.

lazy-assoc key lazy-alist [=] -> lseq or #f

lazy-assq key lazy-alist -> lseq or #f

lazy-assv key lazy-alist -> lseq or #f

These procedures find the first pair in lazy-alist whose car field is key, and returns that pair. If no pair in lazy-alist has key as its car, then #f is returned. lazy-assq uses eq? to compare key with the car fields of the ipairs in lazy-alist, while lazy-assv uses eqv? and lazy-assoc uses =, which defaults to equal?.

(define e '((a 1) (b 2) (c 3)))
(lazy-assq 'a e)                              =>  (a 1)
(lazy-assq 'b e)                              =>  (b 2)
(lazy-assq 'd e)                              =>  #f
(lazy-assq (lseq 'a) '(((a)) ((b)) ((c))))  =>  #f
(lazy-assoc (lseq 'a) '(((a)) ((b)) ((c)))) =>  ((a))
(lazy-assq 5 '((2 3) (5 7) (11 13)))	      =>  *unspecified*
(lazy-assv 5 '((2 3) (5 7) (11 13)))	      =>  (5 7)

The comparison procedure is used to compare the elements e_i of lseq to the key parameter in this way:


(= key (lazy-first e_i))	; lseq is (E1 ... En)

That is, the first argument is always key, and the second argument is one of the lseq elements. Thus one can reliably find the first entry of lazy-alist whose key is greater than five with (lazy-assoc 5 lazy-alist <)

Note that fully general lazy alist searching may be performed with the lazy-find-tail and lazy-find procedures, e.g.

;; Look up the first association in lazy alist with an even key:
(lazy-find (lambda (a) (even? (lazy-first a))) lazy alist)

Comparators

lazy-pair-comparator: The lazy-comparator object is a SRFI-114 comparator suitable for comparing lseqs. Note that it is not a procedure. It compares lseqs using default-comparator on their first elements. If they are not equal, that value is returned. If they are equal, lazy-comparator is used on the lazy-rest of the lseqs and that value is returned.
make-lazy-comparator comparator -> comparator: The make-lazy-comparator procedure returns a comparator suitable for comparing lseqs using element-comparator to compare the elements.

Sample Implementation

The files in the implementation are as follows:

FIXME

Acknowledgements

Without the work of Olin Shivers on SRFI 1, this SRFI would not exist. Everyone acknowledged there is transitively acknowledged here. This is not to imply that either Olin or anyone else necessarily endorses the final results, of course.

LazySequencesCowan

Table of contents