This site is a static rendering of the Trac instance that was used by R7RS-WG1 for its work on R7RS-small (PDF), which was ratified in 2013. For more information, see Home.

Source for ticket #438

cc

changetime

2012-10-12 04:04:21

component

WG1 - Core

description

Submitter's name: Marc Feeley

Submitter's email: feeley at iro.umontreal.ca

Relevant draft: r7rs draft 6

Type: defect

Priority: major

Relevant section of draft: 6.7. Strings, 6.8. Vectors, 6.9. Bytevectors

Summary: Inconsistency of sequence copying procedures

R7RS has three vector-like data types: strings, vectors and bytevectors.  The inconsistencies in their properties and sequence copying procedures (names and API) make them harder than necessary for the programmer to remember.

== Self-evaluation inconsistencies ==

Vectors and bytevectors have a similar external representation, yet bytevectors are self-evaluating (page 46) and vectors are not.  I do not care very much whether they are self-evaluating or not, but it should be the same for vectors and bytevectors.

== Sequence copying procedures inconsistencies ==

Subsequences of strings can be extracted using the procedure `substring` which takes 3 required parameters, i.e.

{{{
  (substring string start end)
}}}

There is also a `string-copy` procedure which takes a single required parameter and returns a copy of the string.  These procedures are related like so:

{{{
  (string-copy string) = (substring string 0 (string-length string))
}}}
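
The relation can be checked directly. This is a minimal sketch using only `(scheme base)` procedures; the variable names `s`, `whole` and `copy` are chosen for illustration only:

{{{
(import (scheme base))

;; Build a fresh, mutable string rather than using a literal.
(define s (string #\a #\b #\c))

;; Full-range substring and string-copy yield equal contents.
(define whole (substring s 0 (string-length s)))
(define copy  (string-copy s))

(string=? copy whole)   ; => #t
(eq? copy s)            ; => #f, the copy is a newly allocated string
}}}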

Subsequences of vectors can be extracted using the procedure `vector-copy` only, which takes one required parameter and 3 optional parameters, i.e.

{{{
  (vector-copy vector [start [end [fill]]])
}}}

With a single parameter a copy of the whole vector is returned, otherwise a subsequence is returned.

Subsequences of bytevectors can be extracted using the procedure `bytevector-copy-partial`, which takes 3 required parameters and behaves exactly like `substring` except for the fact that bytevectors are being processed and returned, i.e.

{{{
  (bytevector-copy-partial bv start end)
}}}

There is also a `bytevector-copy` procedure which takes a single required parameter and returns a copy of the bytevector.  These procedures are related like so:

{{{
  (bytevector-copy bv) = (bytevector-copy-partial bv 0 (bytevector-length bv))
}}}

There are also 2 procedures to copy the content of a bytevector to another bytevector imperatively: `bytevector-copy!` and `bytevector-copy-partial!`.

I do not see a good reason for having different APIs (mix of required and optional parameters) and naming conventions for similar operations.

The naming convention could be based on the one which has been in place for strings for a long time, i.e. `substring`, `subvector`, and `subbytevector` for extracting subsequences.  The same API should be used consistently for all the procedures, in other words:

{{{
  (substring     string     [start [end [fill]]])
  (subvector     vector     [start [end [fill]]])
  (subbytevector bytevector [start [end [fill]]])
}}}

Note that it reads even better if bytevector operations are named using the SRFI-4 naming convention:

{{{
  (substring   string   [start [end [fill]]])
  (subvector   vector   [start [end [fill]]])
  (subu8vector u8vector [start [end [fill]]])
}}}

The functional copy procedures would remain for consistency:

{{{
  (string-copy   string)   = (substring   string)
  (vector-copy   vector)   = (subvector   vector)
  (u8vector-copy u8vector) = (subu8vector u8vector)
}}}

The imperative partial copy procedure defined for bytevectors

{{{
  (bytevector-copy-partial! from start end to at)
}}}

should exist for other sequences too.  Better consistency would be achieved by exchanging the order of the destination and source, in order to benefit from the same pattern of optional parameters as the other procedures:

{{{
  (substring-move!   to at from [start [end [fill]]])
  (subvector-move!   to at from [start [end [fill]]])
  (subu8vector-move! to at from [start [end [fill]]])
}}}
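
To illustrate the destination-first parameter order, here is a minimal sketch for vectors. The name `subvector-move-sketch!` is hypothetical and is not part of any draft. The sketch leaves out the ''fill'' parameter and assumes that the source and destination ranges do not overlap:

{{{
(import (scheme base))

;; Hypothetical sketch: copy elements [start, end) of FROM into TO,
;; beginning at index AT.  START defaults to 0 and END defaults to
;; the length of FROM.  Overlapping ranges are not handled.
(define (subvector-move-sketch! to at from . opts)
  (let* ((start (if (pair? opts) (car opts) 0))
         (end   (if (and (pair? opts) (pair? (cdr opts)))
                    (cadr opts)
                    (vector-length from))))
    (do ((i start (+ i 1))
         (j at    (+ j 1)))
        ((>= i end))
      (vector-set! to j (vector-ref from i)))))

(define dst (make-vector 5 0))
(subvector-move-sketch! dst 1 (vector 'a 'b 'c) 1 3)
dst   ; => #(0 b c 0 0)
}}}

With the optional parameters omitted and 0 passed for ''at'', as in `(subvector-move-sketch! dst 0 src)`, the call copies the whole of `src` to the start of `dst`.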

I don't think the imperative copy operation performed by `bytevector-copy!` is sufficiently common to be included in R7RS (and applied to the other sequence types).  In any case the same operation could be obtained by using a `...-move!` procedure with an additional constant 0 used for the ''at'' parameter.

Finally, I think the handling of the fill parameter is questionable. It is a bad idea for the fill parameter to have a default.  When fill is absent, it should be an error when start and end are not within the bounds of the sequence.  Otherwise, some index calculation errors (off-by-one on ''end'') may go unnoticed.  Moreover, when it is supplied, the fill should also be used when start is less than 0, for consistency with the case where end is greater than the length of the sequence.
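
The fill behaviour described above could look roughly like the following sketch for vectors. The name `subvector-sketch` is hypothetical, and ''start'' and ''end'' are kept as required parameters to keep the example short:

{{{
(import (scheme base))

;; Hypothetical sketch: return the elements of VEC in [start, end).
;; Without a fill argument, an out-of-bounds range is an error.
;; With one, every index outside VEC (below 0, or at or beyond its
;; length) yields the fill value.
(define (subvector-sketch vec start end . maybe-fill)
  (let ((len (vector-length vec))
        (has-fill? (pair? maybe-fill)))
    (if (> start end)
        (error "subvector-sketch: start exceeds end" start end))
    (if (and (not has-fill?)
             (or (< start 0) (> end len)))
        (error "subvector-sketch: range out of bounds and no fill given"
               start end))
    (let ((result (make-vector (- end start))))
      (do ((i start (+ i 1)))
          ((= i end) result)
        (vector-set! result
                     (- i start)
                     (if (and (>= i 0) (< i len))
                         (vector-ref vec i)
                         (car maybe-fill)))))))

(subvector-sketch (vector 'a 'b 'c) 1 3)       ; => #(b c)
(subvector-sketch (vector 'a 'b 'c) -1 4 'x)   ; => #(x a b c x)
;; (subvector-sketch (vector 'a 'b 'c) 0 4) signals an error: no fill given
}}}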

id

keywords

milestone

owner

cowan

priority

major

reporter

cowan

resolution

fixed

severity

status

closed

summary

Formal Comment: Inconsistency of sequence copying procedures

time

2012-07-02 04:09:01

type

defect

Changes

Change at time 2012-10-12 04:04:21

author

cowan

field

comment

newvalue

Accepted in principle.

oldvalue

raw-time

1349989461897951

ticket

time

2012-10-12 04:04:21

Change at time 2012-10-12 04:04:21

author

cowan

field

resolution

newvalue

fixed

oldvalue

raw-time

1349989461897951

ticket

time

2012-10-12 04:04:21

Change at time 2012-10-12 04:04:21

author

cowan

field

status

newvalue

closed

oldvalue

accepted

raw-time

1349989461897951

ticket

time

2012-10-12 04:04:21

Change at time 2012-07-02 10:09:44

author

cowan

field

comment

newvalue

oldvalue

raw-time

1341198584125486

ticket

time

2012-07-02 10:09:44

Change at time 2012-07-02 10:09:44

author

cowan

field

owner

newvalue

cowan

oldvalue

alexshinn

raw-time

1341198584125486

ticket

time

2012-07-02 10:09:44

Change at time 2012-07-02 10:09:44

author

cowan

field

status

newvalue

accepted

oldvalue

new

raw-time

1341198584125486

ticket

time

2012-07-02 10:09:44