On Wed, Sep 14, 2011 at 4:21 AM, Julian Foad <julian.foad_at_wandisco.com> wrote:
> On Tue, 2011-09-13, Hyrum K Wright wrote:
>> In looking over the client diff APIs, I noticed the output parameters
>> are file handles, not streams. This feels...odd to me, so I hacked
>> together the attached patch which updates the APIs to use output
>> streams. It isn't ready to commit just yet, but before doing the
>> backward compat dance (rev'ing an API before it's even released, ugh),
>> I thought I'd post it here for comments or concerns first.
>
> Cool. +1 (concept). Haven't read the patch.
>
>
>> The one gotcha is that running an external diff command still requires
>> files, so this patch creates temporary ones and then copies them to
>> the stream. Right now, this is a limitation of our own APIs, but
>> fundamentally it's because APR wants an apr_file_t to plug into the
>> stdout and stderr of processes it launches.
>
> Side note: This (stdout -> apr_file_t -> svn_stream_t) is in contrast
> with how to achieve streamy *input* to an external diff. Most external
> diff programs take their input from files on disk (not through stdin or
> an open file descriptor), and thus we require not just an apr_file_t
> object but an actual path on disk. See idea below.
>
>> Ideally, we'd find some
>> way of mimicking an apr_file_t with a stream, but that feels like
>> Future Work.
>
> Right.
>
>> Incidentally, one motivation for having streams instead of files is
>> compressed pristines. At some point, I'd like to use streams as
>> *input* to things like diff and merge, which is a prerequisite for
>> storing stuff in compressed form and uncompressing it via a stream.
>> (If that doesn't make much sense, it's 'cause I'm still working out
>> the details in my head.)
>
> For converting efficiently from a readable stream to a readable file,
> for passing to external diff programs such as "kdiff3", I envisage a
> stream API that either creates a temp file and copies the stream content
> into it, or, if it's just a simple stream reading from a file, returns
> the path of the existing file, using some private knowledge inside the
> streams layer to detect that case. Pseudo-code like this:
>
> svn_stream_t stream1 = pristine_read(checksum)
> # Read pristine text (from a plain file, a compressed file,
> # or, for a higher level API, perhaps from an RA connection?)
>
> const char *in_path1 = svn_stream_get_as_readable_file(stream1)
> # If stream1 is reading directly from an existing plain file,
> # get that file's path, else copy the stream to a temp file
> # and return its path.
>
> system("kdiff3", in_path1, in_path2, "-o", out_path)
> # Run the external command that requires its input as a file
>
> svn_stream_close(stream1)
> # Delete the file iff it was a temp file
An intriguing idea, but you'd have to know more about the underlying
stream than our current stream API currently exposes.
>> Index: subversion/include/svn_client.h
>> ===================================================================
>> --- subversion/include/svn_client.h (revision 1170202)
>> +++ subversion/include/svn_client.h (working copy)
>> @@ -2831,8 +2831,8 @@ svn_client_diff5(const apr_array_header_t
>> *diff_op
>> svn_boolean_t ignore_content_type,
>> svn_boolean_t use_git_diff_format,
>> const char *header_encoding,
>> - apr_file_t *outfile,
>> - apr_file_t *errfile,
>> + svn_stream_t *outstream,
>> + svn_stream_t *errstream,
>> const apr_array_header_t *changelists,
>> svn_client_ctx_t *ctx,
>> apr_pool_t *pool);
>
> Makes me wonder why we have errfile/errstream. This is a diff API, not
> a general-purpose external program runner.
Yeah, I was wondering about that, too. Something like
svn_io_run_diff2() can return an error stream/file, but a client-level
API should somehow wrap that in an svn_error_t, me thinks. Though,
I've no experience nor knowledge of the typical use case for external
diff tools. Perhaps they do provide meaningful feedback through
stderr?
-Hyrum
--
uberSVN: Apache Subversion Made Easy
http://www.uberSVN.com/
Received on 2011-09-14 16:07:27 CEST