[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: [PATCH] Make client diff output streamy

From: Hyrum K Wright <hyrum.wright_at_wandisco.com>
Date: Wed, 14 Sep 2011 09:06:51 -0500

On Wed, Sep 14, 2011 at 4:21 AM, Julian Foad <julian.foad_at_wandisco.com> wrote:
> On Tue, 2011-09-13, Hyrum K Wright wrote:
>> In looking over the client diff APIs, I noticed the output parameters
>> are file handles, not streams.  This feels...odd to me, so I hacked
>> together the attached patch which updates the APIs to use output
>> streams.  It isn't ready to commit just yet, but before doing the
>> backward compat dance (rev'ing an API before it's even released, ugh),
>> I thought I'd post it here for comments or concerns first.
>
> Cool.  +1 (concept).  Haven't read the patch.
>
>
>> The one gotcha is that running an external diff command still requires
>> files, so this patch creates temporary ones and then copies them to
>> the stream.  Right now, this is a limitation of our own APIs, but
>> fundamentally it's because APR wants an apr_file_t to plug into the
>> stdout and stderr of processes it launches.
>
> Side note:  This (stdout -> apr_file_t -> svn_stream_t) is in contrast
> with how to achieve streamy *input* to an external diff.  Most external
> diff programs take their input from files on disk (not through stdin or
> an open file descriptor), and thus we require not just an apr_file_t
> object but an actual path on disk.  See idea below.
>
>>   Ideally, we'd find some
>> way of mimicking an apr_file_t with a stream, but that feels like
>> Future Work.
>
> Right.
>
>> Incidentally, one motivation for having streams instead of files is
>> compressed pristines.  At some point, I'd like to use streams as
>> *input* to things like diff and merge, which is a prerequisite for
>> storing stuff in compressed form and uncompressing it via a stream.
>> (If that doesn't make much sense, it's 'cause I'm still working out
>> the details in my head.)
>
> For converting efficiently from a readable stream to a readable file,
> for passing to external diff programs such as "kdiff3", I envisage a
> stream API that either creates a temp file and copies the stream content
> into it, or, if it's just a simple stream reading from a file, returns
> the path of the existing file, using some private knowledge inside the
> streams layer to detect that case.  Pseudo-code like this:
>
>  svn_stream_t stream1 = pristine_read(checksum)
>    # Read pristine text (from a plain file, a compressed file,
>    # or, for a higher level API, perhaps from an RA connection?)
>
>  const char *in_path1 = svn_stream_get_as_readable_file(stream1)
>    # If stream1 is reading directly from an existing plain file,
>    # get that file's path, else copy the stream to a temp file
>    # and return its path.
>
>  system("kdiff3", in_path1, in_path2, "-o", out_path)
>    # Run the external command that requires its input as a file
>
>  svn_stream_close(stream1)
>    # Delete the file iff it was a temp file

An intriguing idea, but you'd have to know more about the underlying
stream than our current stream API currently exposes.

>> Index: subversion/include/svn_client.h
>> ===================================================================
>> --- subversion/include/svn_client.h     (revision 1170202)
>> +++ subversion/include/svn_client.h     (working copy)
>> @@ -2831,8 +2831,8 @@ svn_client_diff5(const apr_array_header_t
>> *diff_op
>>                   svn_boolean_t ignore_content_type,
>>                   svn_boolean_t use_git_diff_format,
>>                   const char *header_encoding,
>> -                 apr_file_t *outfile,
>> -                 apr_file_t *errfile,
>> +                 svn_stream_t *outstream,
>> +                 svn_stream_t *errstream,
>>                   const apr_array_header_t *changelists,
>>                   svn_client_ctx_t *ctx,
>>                   apr_pool_t *pool);
>
> Makes me wonder why we have errfile/errstream.  This is a diff API, not
> a general-purpose external program runner.

Yeah, I was wondering about that, too. Something like
svn_io_run_diff2() can return an error stream/file, but a client-level
API should somehow wrap that in an svn_error_t, me thinks. Though,
I've no experience nor knowledge of the typical use case for external
diff tools. Perhaps they do provide meaningful feedback through
stderr?

-Hyrum

-- 
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/
Received on 2011-09-14 16:07:27 CEST

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.