[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Make client diff output streamy

From: Julian Foad <julian.foad_at_wandisco.com>
Date: Wed, 14 Sep 2011 10:21:05 +0100

On Tue, 2011-09-13, Hyrum K Wright wrote:
> In looking over the client diff APIs, I noticed the output parameters
> are file handles, not streams. This feels...odd to me, so I hacked
> together the attached patch which updates the APIs to use output
> streams. It isn't ready to commit just yet, but before doing the
> backward compat dance (rev'ing an API before it's even released, ugh),
> I thought I'd post it here for comments or concerns first.

Cool. +1 (concept). Haven't read the patch.

> The one gotcha is that running an external diff command still requires
> files, so this patch creates temporary ones and then copies them to
> the stream. Right now, this is a limitation of our own APIs, but
> fundamentally it's because APR wants an apr_file_t to plug into the
> stdout and stderr of processes it launches.

Side note: This (stdout -> apr_file_t -> svn_stream_t) is in contrast
with how to achieve streamy *input* to an external diff. Most external
diff programs take their input from files on disk (not through stdin or
an open file descriptor), and thus we require not just an apr_file_t
object but an actual path on disk. See idea below.

> Ideally, we'd find some
> way of mimicking an apr_file_t with a stream, but that feels like
> Future Work.

Right.

> Incidentally, one motivation for having streams instead of files is
> compressed pristines. At some point, I'd like to use streams as
> *input* to things like diff and merge, which is a prerequisite for
> storing stuff in compressed form and uncompressing it via a stream.
> (If that doesn't make much sense, it's 'cause I'm still working out
> the details in my head.)

For converting efficiently from a readable stream to a readable file,
for passing to external diff programs such as "kdiff3", I envisage a
stream API that either creates a temp file and copies the stream content
into it, or, if it's just a simple stream reading from a file, returns
the path of the existing file, using some private knowledge inside the
streams layer to detect that case. Pseudo-code like this:

  svn_stream_t stream1 = pristine_read(checksum)
    # Read pristine text (from a plain file, a compressed file,
    # or, for a higher level API, perhaps from an RA connection?)

  const char *in_path1 = svn_stream_get_as_readable_file(stream1)
    # If stream1 is reading directly from an existing plain file,
    # get that file's path, else copy the stream to a temp file
    # and return its path.

  system("kdiff3", in_path1, in_path2, "-o", out_path)
    # Run the external command that requires its input as a file

  svn_stream_close(stream1)
    # Delete the file iff it was a temp file

> Index: subversion/include/svn_client.h
> ===================================================================
> --- subversion/include/svn_client.h (revision 1170202)
> +++ subversion/include/svn_client.h (working copy)
> @@ -2831,8 +2831,8 @@ svn_client_diff5(const apr_array_header_t
> *diff_op
> svn_boolean_t ignore_content_type,
> svn_boolean_t use_git_diff_format,
> const char *header_encoding,
> - apr_file_t *outfile,
> - apr_file_t *errfile,
> + svn_stream_t *outstream,
> + svn_stream_t *errstream,
> const apr_array_header_t *changelists,
> svn_client_ctx_t *ctx,
> apr_pool_t *pool);

Makes me wonder why we have errfile/errstream. This is a diff API, not
a general-purpose external program runner.

- Julian
Received on 2011-09-14 11:21:46 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.