Abstract

Lazy sequences (or lseqs, pronounced "ell-seeks") are a generalization of lists. In particular, an lseq is either a proper list or a dotted list whose last cdr is a SRFI 121 generator. A generator is a procedure that can be invoked with no arguments in order to lazily supply additional elements of the lseq. When a generator has no more elements to return, it returns an end-of-file object. Consequently, lazy sequences cannot reliably contain end-of-file objects.

This proposal provides a set of procedures suitable for operating on lazy sequences based on SRFI 1.

Rationale

Lazy sequences are more heavyweight than generators, on which they are based, but they are more lightweight than SRFI 41 streams. However, streams are even, as explained in the SRFI 41 rationale; that is, the initial state of a stream does not have any elements that have already been realized. By contrast, lazy sequences are odd, meaning that at least one element is realized at all times unless the lseq is empty. Therefore, when constructing an lseq in an iterative lazy algorithm, only the cdr side of the lazy pair is lazily evaluated; the car side is evaluated immediately, even if it is never used.

In most cases this doesn't matter, because calculating one additional item is a negligible overhead. However, when you create a self-referential lazy structure, in which the earlier elements of a sequence are used to calculate the latter elements of itself, a bit of caution is needed; code that is valid for circular streams may not terminate if it is mechanically converted to use lazy sequences. This eagerness is also visible when side effects are involved; for example, a lazy character sequence reading from a port may read one character ahead.

This proposal is less comprehensive than SRFI 1, because it omits many procedures that process every element of their list arguments (at least, when used in the absence of call/cc). Lseqs are meant to be used with ordinary Scheme functions, which are strict, so neither left nor right folds are provided: it is just as time-efficient to use lseq-realize and then SRFI 1 fold or fold-right, and just as space-efficient to use lseq->generator and SRFI 121's generator-fold.

The linear-update procedures of SRFI 1 are also left out, as lazy sequences are not intended to be mutated.

 

Table of contents


Procedure index
Specification
  
  Constructors
  
Predicates
  
Selectors
  
The whole lazy sequence
  
Mapping and filtering
  
Searching
  
Lazy association lists
  
Comparators
  
Sample Implementation
Acknowledgements
References & links
Copyright



Procedure Index

Here is a short list of the procedures provided by this SRFI.



 Constructors

generator->lseq 


 Predicates

lseq?         lseq=


 Selectors

lseq-car     lseq-cdr
lseq-first   lseq-second  lseq-third  lseq-fourth lseq-fifth
lseq-sixth   lseq-seventh lseq-eighth lseq-ninth  lseq-tenth
lseq-rest    lseq-ref
lseq-take    lseq-drop lseq-split-at   


 The whole lazy sequence

lseq-realize lseq->generator
lseq-length
lseq-append  lseq-concatenate 
lseq-zip     lseq-unzip1       lseq-unzip2
lseq-unzip3  lseq-unzip4       lseq-unzip5


 Mapping and filtering

lseq-map        lseq-pair-map
lseq-for-each   lseq-pair-for-each
lseq-filter     lseq-remove


 Searching
lseq-member       lseq-memq     lseq-memv
lseq-find         lseq-find-rest 
lseq-any          lseq-every
lseq-index
lseq-take-while   lseq-drop-while
lseq-span         lseq-break


 Lazy association lists

lseq-assoc        lseq-assq       lseq-assv


 Comparators

lseq-comparator   make-lseq-comparator




Specification
Except as noted, if any of these procedures accepts multiple lseq arguments, then at
least one of them must be finite: that is, it must either be a proper list,
or contain a generator that eventually returns an end-of-file object.


The templates given below obey the following conventions for procedure formals:

 lseq	
     A lazy sequence

     x, y, a, b
     Any value
 object, value
     Any value
 n, i
     A natural number (an integer >= 0)
 proc
     A procedure
 pred
     A procedure whose return value is treated as a boolean
 generator
     A procedure with no arguments that returns a sequence of values
 =
     A boolean procedure taking two arguments


To interpret the examples, pretend that they are executed on a Scheme that prints lazy sequences with the syntax of lists,
truncating them when they get too long.


Constructors

Every list constructor procedure is also an lseq constructor procedure.
The procedure generator->lseq constructs an lseq based on the
values of a generator.  In order to prepend a realized value to a generator,
simply use cons; to prepend more than one value, use SRFI 1's
cons*.



 generator->lseq generator -> lseq

    Returns an lseq
    whose elements are the values generated by generator.  The exact behavior is as follows:

Generator is invoked with no arguments to produce an object obj.
If obj is an end-of-file object, the empty list is returned.
Otherwise, a newly allocated pair whose car is obj and whose
cdr is generator is returned.

(generator->lseq (make-repeating-generator 'c)) => (c c c ...)



Predicates



lseq? x -> boolean


    Returns #t iff x is an lseq, otherwise #f.
       This procedure may return #t if x is an improper list
       whose last cdr is a procedure that requires arguments, since there is no
       portable way to examine a procedure to determine how many arguments it requires.
    





lseq= elt= lseq₁ ... -> boolean

    Determines lseq equality, given an element-equality procedure.
    The lseq A equals the lseq B 
    if they are of the same length,
    and their corresponding elements are equal, 
    as determined by elt=. 
    If the element-comparison procedure's first argument is
    from lseq_i, 
    then its second argument is from lseq_i+1, 
    i.e. it is always called as
        (elt= a b)
    for a an element of lseq A, 
    and b an element of lseq B.

    In the n-ary case, 
    every lseq_i is compared to 
    lseq_i+1 
    (as opposed, for example, to comparing 
    lseq₁ to    lseq_i, 
    for i>1). 
    If there are no lseq arguments at all, 
    lseq= simply returns true.

    The dynamic order in which the elt= procedure is
    applied to pairs of elements is not specified. 
    For example, if lseq= is applied
    to three lseqs, A, B, and C, 
    it may first completely compare A to B,
    then compare B to C, 
    or it may compare the first elements of A and B,
    then the first elements of B and C, 
    then the second elements of A and B, and so forth.

    The equality procedure must be consistent with eq?. 
    That is, it must be the case that

        (eq? x y) => (elt= x y).

    Note that this implies that two lseqs which are eq? 
    are always lseq=, as well; implementations may exploit this
    fact to "short-cut" the element-by-element comparisons.
(lseq= eq?) => #t       ; Trivial cases
(lseq= eq? '(a)) => #t






Selectors





lseq-car   lseq -> object 


lseq-first  lseq -> object 
   
    These procedures are synonymous.
    They return the first element of lseq.  They
    are included for completeness, as they are the same as car.
    It is an error to apply them to an empty lseq.




lseq-cdr   lseq -> lseq


lseq-rest  lseq -> lseq
    
    These procedures are synonymous.
       They return an lseq with the contents of lseq except for the
    first element.  The exact behavior is as follows:

If lseq is a pair whose cdr is a procedure, then the procedure
is invoked with no arguments to produce an object obj.
If obj is an end-of-file object, then the cdr of lseq is
set to the empty list, which is returned.
If obj is any other object, then a new pair is allocated whose car
is obj and whose cdr is the cdr of lseq (i.e. the procedure).
The cdr of lseq is set to the newly allocated pair, which is returned.
If lseq is a pair whose cdr is not a procedure, then the cdr is returned.
If lseq is not a pair, it is an error.


Implementations that inline cdr are advised to inline lseq-cdr if
possible.



lseq-second  lseq -> object 


lseq-third   lseq -> object 


lseq-fourth  lseq -> object 


lseq-fifth   lseq -> object 


lseq-sixth   lseq -> object 


lseq-seventh lseq -> object 


lseq-eighth  lseq -> object 


lseq-ninth   lseq -> object 


lseq-tenth   lseq -> object  


    Returns the second, third, ..., or tenth element of lseq. 

(lseq-third '(a b c d e)) => c





lseq-ref lseq i -> value

    
    Returns the ith element of lseq.  
    (This is the same as 
        (lseq-first (lseq-drop lseq i)).)
    It is an error if i >= n, 
    where n is the length of lseq.
    
(lseq-ref '(a b c d) 2) => c
    




lseq-take lseq i -> lseq


lseq-drop lseq i -> lseq


    lseq-take returns the first i elements of lseq.

    lseq-drop returns all but the first i elements of lseq.

(lseq-take '(a b c d e)  2) => (a b)
(lseq-drop '(a b c d e)  2) => (c d e)

    lseq-drop is exactly equivalent to performing i lseq-rest operations on lseq.





lseq-split-at  lseq i -> [list lseq]

    Splits lseq 
    at index i, returning two values, a list of the 
    first i elements, and an lseq of the remaining elements. It is equivalent
    to
(values (lseq-take lseq i) (lseq-drop lseq i))




The whole lazy sequence




lseq-realize lseq -> list

     
     Repeatedly applies lseq-cdr to lseq
        until its generator (if there is one) has been exhausted,
        and returns lseq, which is now
        guaranteed to be a proper list.  This
        procedure can be called on an arbitrary lseq before passing
        it to a procedure which only accepts lists.  However, if the
        generator never returns an end-of-file
        object, lseq-realize will never return.

 



lseq->generator lseq -> generator

     
     Returns a generator which when invoked will return all the elements
        of lseq, including any that have not yet been realized.

 



lseq-length  lseq -> integer

    Returns the length of its argument, which is the non-negative integer n such that lseq-rest 
    applied n times to the lseq produces an empty lseq.  lseq must be finite, or this procedure
    will not return.




lseq-append lseq …
Returns an lseq that lazily contains all the elements of all the lseqs
in order.





lseq-concatenate lseq
The lseq argument is an lseq of lseqs.
Returns an lseq whose elements are all the elements of the first lseq,
then all the elements of the second one, then the third, etc.

It is similar to (apply lseq-append (lseq-realize lseq)), except
that lseq-concatenate can work even if lseq contains an infinite
number of lseqs.





lseq-zip lseq₁ lseq₂ ... -> lseq

    If lseq-zip is passed n lseqs, it returns an lseq as long as the shortest
    of these lseqs, each element of which is an n-element list comprised
    of the corresponding elements from the arguments.

(lseq-zip '(one two three) 
     (generator->lseq (make-iota-generator +inf.0 1 1))
     (generator->lseq (make-repeating-generator) '(odd even))))
    => ((one 1 odd) (two 2 even) (three 3 odd))

(lseq-zip '(1 2 3)) => ((1) (2) (3))

    

Mapping and filtering





lseq-map proc lseq₁ lseq₂ ... -> unspecified


lseq-pair-map proc lseq₁ lseq₂ ... -> unspecified

     

     proc is a procedure taking as many arguments 
     as there are lseq arguments and returning a single value.  
     lseq-map lazily applies proc element-wise to the elements
     of the lseqs and returns an lseq of the results, 
     in order.  
     The dynamic order in which proc 
     is applied to the elements of the lseqs is unspecified.
     The procedure lseq-pair-map is the same as
     lseq-map, except that it
     calls proc on each pair rather than
     each element.
    
(lseq-map lseq-second '((a b) (d e) (g h))) =>  (b e h)

(lseq-map (lambda (n) (expt n n))
     (make-iota-generator +inf.0 1 1)
    =>  (1 4 27 256 3125 ...)

(lseq-map + '(1 2 3) '(4 5 6)) =>  (5 7 9)

(let ((count 0))
  (lseq-map (lambda (ignored)
         (set! count (+ count 1))
         count)
       '(a b))) =>  (1 2) or (2 1)


 
    



lseq-for-each proc lseq₁ lseq₂ ... -> unspecified


lseq-pair-for-each proc lseq₁ lseq₂ ... -> unspecified

     

     The arguments to lseq-for-each are like the arguments to 
     lseq-map, but
     lseq-for-each calls proc for its side effects rather
     than for its values.  
     Unlike lseq-map, lseq-for-each is guaranteed to call 
     proc on the elements of the lseqs in order from the first
     element(s) to the last, 
     and the value returned by lseq-for-each is unspecified.
     The procedure lseq-pair-for-each is the same as
     lseq-for-each, except that it
     calls proc on each pair rather than
     each element.

(let ((v (make-vector 5)))
  (lseq-for-each (lambda (lseq-)
              (vector-set! v i (* i i)))
            '(0 1 2 3 4))
  v)  =>  #(0 1 4 9 16)

  
 



lseq-filter pred lseq -> lseq


lseq-remove pred lseq -> lseq

     

     The procedure lseq-filter returns an lseq that contains
     only the elements of lseq that satisfy
     pred.
     The procedure lseq-remove is the same as
     lseq-filter, except that it
     returns elements that do not satisfy pred.

(let ((v (make-vector 5)))
  (lseq-for-each (lambda (lseq-)
              (vector-set! v i (* i i)))
            '(0 1 2 3 4))
  v)  =>  #(0 1 4 9 16)

  
 
  



Searching


The following procedures all search lseqs for the leftmost element satisfying
some criteria. 




lseq-find pred lseq -> value

    Return the first element of lseq that satisfies predicate pred;
    #f if no element does.

(lseq-find even? '(3 1 4 1 5 9)) => 4


    Note that lseq-find has an ambiguity in its lookup semantics — if lseq-find
    returns #f, you cannot tell (in general) if it found a #f element
    that satisfied pred, or if it did not find any element at all. In
    many situations, this ambiguity cannot arise — either the lseq being
    searched is known not to contain any #f elements, or the lseq is
    guaranteed to have an element satisfying pred. However, in cases
    where this ambiguity can arise, you should use lseq-find-tail instead of
    lseq-find — lseq-find-tail has no such ambiguity:
(cond ((lseq-find-tail pred lseq) => (lambda (lseq) ...))
      (else ...)) ; Search failed.





lseq-find-tail pred lseq -> lseq or #f

    Return the longest tail of lseq whose first element satisfies pred. If no element does,
    return #f.

    lseq-find-tail can be viewed as a general-predicate variant of the lseq-member
    function.

    Examples: 
(lseq-find-tail even? '(3 1 37 -8 -5 0 0)) => (-8 -5 0 0)
(lseq-find-tail even? '(3 1 37 -5)) => #f

;; imember x lseq:
(lseq-find-tail (lambda (elt) (equal? x elt)) lseq)



    lseq-find-tail is essentially lseq-drop-while, 
    where the sense of the predicate is inverted: 
    lseq-find-tail searches until it finds an element satisfying
    the predicate; lseq-drop-while searches until it finds an 
    element that doesn't satisfy the predicate.




lseq-take-while  pred lseq -> lseq


Returns the longest initial prefix of lseq whose elements all
satisfy the predicate pred.

(lseq-take-while even? '(2 18 3 10 22 9)) => (2 18)





lseq-drop-while pred lseq -> lseq

Drops the longest initial prefix of lseq whose elements all
satisfy the predicate pred, and returns the rest of the lseq.

(lseq-drop-while even? '(2 18 3 10 22 9)) => (3 10 22 9)





lseq-span   pred lseq -> [lseq lseq]


lseq-break  pred lseq -> [lseq lseq]


lseq-span splits the lseq into the longest initial prefix whose
elements all satisfy pred, and the remaining tail. 
lseq-break inverts the sense of the predicate: 
the tail commences with the first element of the input lseq
that satisfies the predicate.


In other words: 
lseq-span finds the initial span of elements 
satisfying pred, 
and lseq-break breaks the lseq at the first element
satisfying pred.


lseq-span is equivalent to 
(values (lseq-take-while pred lseq) 
        (lseq-drop-while pred lseq))



(lseq-span even? '(2 18 3 10 22 9)) =>
  (2 18)
  (3 10 22 9)

(lseq-break even? '(3 1 4 1 5 9)) =>
  (3 1)
  (4 1 5 9)






lseq-any pred lseq₁ lseq₂ ... -> value

    Applies the predicate across the lseqs, returning true if the predicate
    returns true on any application.

    If there are n lseq arguments lseq₁ ... lseq_n, then pred must be a
    procedure taking n arguments and returning a boolean result.

    lseq-any applies pred to the first elements of the lseq_i parameters.
    If this application returns a true value, lseq-any immediately returns
    that value. Otherwise, it iterates, applying pred to the second
    elements of the lseq_i parameters, then the third, and so forth.
    The iteration stops when a true value is produced or one of the lseqs runs
    out of values; in
    the latter case, lseq-any returns #f. 
    The application of pred to the last element of the
    lseqs is a tail call.

    Note the difference between lseq-find and lseq-any — lseq-find returns the element
    that satisfied the predicate; lseq-any returns the true value that the
    predicate produced.

    Like lseq-every, lseq-any's name does not end with a question mark — this is to
    indicate that it does not return a simple boolean (#t or #f), but a
    general value.

(lseq-any integer? '(a 3 b 2.7))   => #t
(lseq-any integer? '(a 3.1 b 2.7)) => #f
(lseq-any < '(3 1 4 1 5)
       '(2 7 1 8 2)) => #t





lseq-every pred lseq₁ lseq₂ ... -> value

    Applies the predicate across the lseqs, returning true if the predicate
    returns true on every application.

    If there are n lseq arguments lseq₁ ... lseq_n, then pred must be a
    procedure taking n arguments and returning a boolean result.

    lseq-every applies pred to the first elements of the lseq_i parameters.
    If this application returns false, lseq-every immediately returns false.
    Otherwise, it iterates, applying pred to the second elements of the
    lseq_i parameters, then the third, and so forth. The iteration stops
    when a false value is produced or one of the lseqs runs out of values.
    In the latter case, lseq-every returns
    the true value produced by its final application of pred. 
    The application of pred to the last element of the lseqs 
    is a tail call.

    If one of the lseq_i has no elements, lseq-every simply returns #t.

    Like lseq-any, lseq-every's name does not end with a question mark — this is to
    indicate that it does not return a simple boolean (#t or #f), but a
    general value.




lseq-index pred lseq₁ lseq₂ ... -> integer or #f

    Return the index of the leftmost element that satisfies pred.

    If there are n lseq arguments lseq₁ ... lseq_n, then pred must be a
    function taking n arguments and returning a boolean result.

    lseq-index applies pred to the first elements of the lseq_i parameters.
    If this application returns true, lseq-index immediately returns zero.
    Otherwise, it iterates, applying pred to the second elements of the
    lseq_i parameters, then the third, and so forth. When it finds a tuple of
    lseq elements that cause pred to return true, it stops and returns the
    zero-based index of that position in the lseqs.

    The iteration stops when one of the lseqs runs out of values; in this
    case, lseq-index returns #f.

(lseq-index even? '(3 1 4 1 5 9)) => 2
(lseq-index < '(3 1 4 1 5 9 2 5 6) '(2 7 1 8 2)) => 1
(lseq-index = '(3 1 4 1 5 9 2 5 6) '(2 7 1 8 2)) => #f





lseq-member x lseq [=] -> lseq


lseq-memq x lseq -> lseq


lseq-memv x lseq -> lseq

     

    These procedures return the longest tail of lseq whose first element is
    x, where the tails of lseq are the 
    non-empty lseqs returned by 
        (lseq-drop lseq i)
    for i less than the length of lseq.  
    If x does
    not occur in lseq, then #f is returned.  
    lseq-memq uses eq? to compare x
    with the elements of lseq, 
    while lseq-memv uses eqv?, and
    lseq-member uses =, which defaults to equal?.

    (lseq-memq 'a '(a b c))           =>  (a b c)
    (lseq-memq 'b '(a b c))           =>  (b c)
    (lseq-memq 'a '(b c d))           =>  #f
    (lseq-memq (lseq 'a) '(b (a) c)) =>  #f
    (lseq-member (lseq 'a)
            '(b (a) c))           =>  ((a) c)
    (lseq-memq 101 '(100 101 102))    =>  *unspecified*
    (lseq-memv 101 '(100 101 102))    =>  (101 102)


    

    The comparison procedure is used to compare the elements e_i of lseq
    to the key x in this way:

(= x e_i)		; lseq is (E1 ... En)

    That is, the first argument is always x, and the second argument is
    one of the lseq elements. Thus one can reliably find the first element
    of lseq that is greater than five with
	(lseq-member 5 lseq <)


    Note that fully general lseq searching may be performed with
    the lseq-find-tail and lseq-find procedures, e.g.
(lseq-find-tail even? lseq) ; Find the first elt with an even key.





Lazy association lists

An "lazy association list" (or "lazy alist") is an lseq of pairs. The car of each pair
contains a key value, and the cdr contains the associated data value. They can
be used to construct simple look-up tables in Scheme.
Note that lazy alists are probably inappropriate for performance-critical use on large data;
in these cases, immutable maps or some other alternative should be employed.





lseq-assoc key lseq-alist [=] -> lseq or #f


lseq-assq key lseq-alist -> lseq or #f


lseq-assv key lseq-alist -> lseq or #f


  
    These procedures
    find the first pair in lseq-alist whose car field is key, 
    and returns that pair.  
    If no pair in lseq-alist has key as its car, 
    then #f is returned.  
    lseq-assq uses eq? to compare key 
    with the car fields of the ipairs in lseq-alist, 
    while lseq-assv uses eqv? 
    and lseq-assoc uses =, which defaults to equal?.

(define e '((a 1) (b 2) (c 3)))
(lseq-assq 'a e)                              =>  (a 1)
(lseq-assq 'b e)                              =>  (b 2)
(lseq-assq 'd e)                              =>  #f
(lseq-assq (lseq 'a) '(((a)) ((b)) ((c))))  =>  #f
(lseq-assoc (lseq 'a) '(((a)) ((b)) ((c)))) =>  ((a))
(lseq-assq 5 '((2 3) (5 7) (11 13)))	      =>  *unspecified*
(lseq-assv 5 '((2 3) (5 7) (11 13)))	      =>  (5 7)


    The comparison procedure is used to compare the elements e_i of lseq
    to the key parameter in this way:

(= key (lseq-first e_i))	; lseq is (E1 ... En)

    That is, the first argument is always key, 
    and the second argument is one of the lseq elements. 
    Thus one can reliably find the first entry
    of lseq-alist whose key is greater than five with
	(lseq-assoc 5 lseq-alist <)
     

    Note that fully general lazy alist searching may be performed with
    the lseq-find-tail and lseq-find procedures, e.g.
;; Look up the first association in lazy alist with an even key:
(lseq-find (lambda (a) (even? (lseq-first a))) lazy alist)




Comparators




lseq-pair-comparator

     
     The lseq-comparator object is a SRFI-114 comparator suitable for comparing lseqs.
     Note that it is not a procedure.
     It compares lseqs using default-comparator on their first elements.  If they are not equal, that value is returned.  If they are equal, lseq-comparator is used on the lseq-rest of the lseqs and that value is returned.

 


make-lseq-comparator comparator -> comparator

     
     The make-lseq-comparator procedure returns a comparator suitable for comparing lseqs
     using element-comparator to compare the elements.

 

 

Sample Implementation
The files in the implementation are as follows:

FIXME




Acknowledgements

Without the work of Olin Shivers on SRFI 1,
this SRFI would not exist. Everyone acknowledged there is transitively acknowledged here.
This is not to imply that either Olin or anyone else necessarily endorses the final
results, of course.

Special thanks to Shiro Kawai, whose Gauche implementation of lazy sequences inspired this one.

`lseq`	A lazy sequence
`x`, `y`, `a`, `b`	Any value
`object`, `value`	Any value
`n`, `i`	A natural number (an integer >= 0)
`proc`	A procedure
`pred`	A procedure whose return value is treated as a boolean
`generator`	A procedure with no arguments that returns a sequence of values
`=`	A boolean procedure taking two arguments

Lazy­Sequences­Cowan

Table of contents

LazySequencesCowan