[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

RFC: MD5sums in svn_txdelta interface

From: Greg Hudson <ghudson_at_MIT.EDU>
Date: 2004-04-23 06:17:30 CEST

When Karl and company added MD5 support to Subversion, one of the
things they did was stick MD5 support into svn_txdelta():

  /** Return the @a md5 digest for the complete fulltext deltified by
   * @a stream, or @c NULL if @a stream has not yet returned its final
   * @c NULL window. The digest is allocated in the same memory as @a
   * STREAM.
   */
  const unsigned char *svn_txdelta_md5_digest (svn_txdelta_stream_t *stream);

It's my belief (a couple of years too late) that (1) this doc string
is pretty confusing (what is actually checksummed is the target data
read during the delta operation; I don't know how that translates into
"the complete fulltext deltified by STREAM"), and more importantly (2)
that MD5 checksums have nothing to do with delta computation and don't
belong in the same interface.

So I'm thinking that in the svn 2.0 API, we should rip out this
interface and instead introduce an "MD5 metastream" which calculates
MD5 digests as data is read from or written to an underlying stream.

I mention this now because I'm about to commit code to add a new
interface:

  /** Return a writable stream which, when fed target data, will send
   * delta windows to @a handler/@a handler_baton which transform the
   * data in @a source to the target data. As usual, the window handler
   * will receive a NULL window to signify the end of the window stream.
   * The stream handler functions will read data from @a source as
   * necessary.
   */
  svn_stream_t *svn_txdelta_target_push (svn_txdelta_window_handler_t handler,
                                         void *handler_baton,
                                         svn_stream_t *source,
                                         apr_pool_t *pool);

If people agree with me about MD5 digests not belonging in the
text-delta interface, then this signature should be fine, and it will
be the caller's responsibility to compute checksums on the data before
writing it to the target-push stream, because that's the direction
we're eventually heading in anyway. On the other hand, if people
disagree with me, then for symmetry, svn_txdelta_target_push() should
accept an optional argument for where an MD5 digest will be stored
upon closure of the target-push stream. (Or a callback, or whatever
is de rigeur these days.)

Of course, it's also possible that people disagree with me on the
whole concept of svn_txdelta_target_push, in which case my commit can
be backed out and the issue will be rendered temporarily moot. I want
this new function so that the FSFS implementation of
svn_fs_apply_text() can generate a delta on the fly as contents data
are pushed by the caller; I think that's a reasonable thing to want.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Fri Apr 23 06:17:48 2004

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.