On 3/31/07, Nicolás Lichtmaier <nick@reloco.com.ar> wrote:
> Hi, when I learned about SoC I started looking at the patch format task.
> But I didn't post anything, and that task has already been asked for. Anyway,
> near one of the deadlines, I submitted a proposal. I don't want to
> step on Charles Acknin's toes; it would be great if we could cooperate on
> both implementing this and learning about Subversion coding. Or I can
> do some other Subversion task, if that's undesirable. Or even nothing
> =). Anyway, I'll post here some of the thoughts I had about the issue.
>
> This is what I have written:
>
> == Introduction ==
>
> The diff/patch tools have been great tools. The format they use (called
> "unified diff") has become a standard, and has been integrated into many
> development processes, often with tools that provide means to attach
> patches to issues. In designing a new format, we should learn from what
> made them successful:
>
> * A format which serves both for automatic patching and reviewing.
> * A format so simple that you can even edit it with a text editor.
> * A terse format. Metadata gets out of the way, so it has become the
> preferred format to review changes.
>
> An interesting use case of a patch format is bugzilla. Bugzilla was
> designed around CVS, and it helps a community to coordinate the work by
> attaching patches to issues. Those patches are reviewed, approved and
> committed.
>
> Just as Subversion took the best of CVS concepts and created a better
> version control system, we should take this format and enhance it so
> that it can serve Subversion needs. Those new needs are three:
>
> * The ability to describe tree modifications (renames, deletions, etc.).
> * The ability to describe modifications to properties.
> * The ability to handle modifications to binary files.
>
> == Requirements ==
>
> * The format must be easily readable, and the metadata should stay out of
> the way. It shouldn't be verbose. E.g.: I wouldn't put information in the
> style of RFC-822 headers. The current format's terse metadata is an
> example of this.
>
> * The patch should convey all the meaning implied in a 'svn merge'
> operation that is considered reasonable. A patch should be like splitting
> the merge action into two steps, i.e.: 'diff + patch = merge'.
>
> * As the patch format should be designed so it can be used by other
> tools, all Subversion-specific references should be tagged as such. This
> doesn't necessarily mean complicating the format or overengineering it.
> It's just leaving some room for expansion (e.g. ignoring unknown
> merge-tracking info).
>
> * The old format must keep working. This new feature should be
> implemented with a new switch. Or... currently there's no support for
> tree modifications, perhaps if there are no tree modifications the
> format would be compatible...
>
> * Support for binary diffs. Binary diffs would just be diffs created
> with the binary diff algorithm already present in Subversion, and then
> encoded in base64 to get a text representation. The algorithm name
> should be stated so that it can change in the future.
>
> * The new format should handle properties. IMO they should be diffed as
> if each property value were a file (making it easy to see which lines
> have been added to svn:ignore).
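
The last two requirements can be sketched roughly as follows. This is only an illustration under stated assumptions: the `binary-delta <algorithm>` header line, the function names, and the 76-column wrapping are hypothetical, and the delta bytes are treated as an opaque payload rather than produced by Subversion's real binary diff code. The property diff simply reuses Python's standard `difflib`.

```python
import base64
import difflib
from typing import List, Tuple


def encode_binary_delta(algorithm: str, delta: bytes, width: int = 76) -> List[str]:
    """Render an opaque binary delta as text lines, tagged with its algorithm name."""
    payload = base64.b64encode(delta).decode("ascii")
    # Hypothetical header: the proposal only says the algorithm name should be stated.
    header = f"binary-delta {algorithm}"
    return [header] + [payload[i:i + width] for i in range(0, len(payload), width)]


def decode_binary_delta(lines: List[str]) -> Tuple[str, bytes]:
    """Inverse of encode_binary_delta: return (algorithm name, raw delta bytes)."""
    keyword, algorithm = lines[0].split(" ", 1)
    if keyword != "binary-delta":
        raise ValueError("not a binary delta block")
    return algorithm, base64.b64decode("".join(lines[1:]))


def diff_property(name: str, old: str, new: str) -> List[str]:
    """Diff a property value line by line, as if it were a small file."""
    return list(difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile=f"{name} (old)", tofile=f"{name} (new)", lineterm=""))
```

For example, `diff_property("svn:ignore", "*.o\n", "*.o\n*.pyc\n")` yields a small hunk whose only added line is `+*.pyc`, which is the kind of readability the property requirement asks for.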
>
> == Relationship with merge-tracking ==
>
> As patching from an improved patch should work like a normal merge, there
> are implications related to the merge-tracking functionality. Of course,
> this applies only when patching within the same repository.
>
> At first look, it seems that just including the merge info property
> would suffice. It would need to be special-cased though, to resolve the
> "elision", i.e. to include the merge info when that info is inherited
> from a parent directory.
>
> The most common case will be that patches and WCs are all from the same
> repository, as patches are exchanged in a given coding community working
> around a common repository. The cross-repository diff+patch case should be
> studied carefully. I would just ignore the merge info in cross-repo
> patches (perhaps a "force" switch could be used). This means that the
> GUID of the repository should be included in the patch.
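
A minimal sketch of the cross-repository rule described above, assuming the patch carries a repository identifier and an optional merge info string. The `PatchHeader` type, its field names, and the `force` parameter are hypothetical names chosen for illustration, not part of any existing Subversion API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatchHeader:
    repository_id: str            # identifier of the repository the patch was made from
    mergeinfo: Optional[str]      # raw merge info text carried by the patch, if any


def mergeinfo_to_record(header: PatchHeader, wc_repository_id: str,
                        force: bool = False) -> Optional[str]:
    """Return the merge info to record in the working copy, or None to ignore it."""
    if header.mergeinfo is None:
        return None
    # Same repository: the patch behaves like a normal merge, so keep the info.
    if header.repository_id == wc_repository_id:
        return header.mergeinfo
    # Cross-repository patch: ignore merge info unless the user forces it.
    return header.mergeinfo if force else None
```

The design choice here is that a mismatch silently drops the merge info rather than failing the whole patch; whether that is the right default is exactly the part that would need careful study.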
>
> == Design notes ==
>
> I don't think the new format should be 'unified diff' compatible. I see
> much more value in showing tree modifications more clearly. E.g.: A
> file rename should be marked as a file rename, not as the whole file
> disappearing (with '-' lines) and reappearing (with '+' lines).
>
> Implementation idea for the above: Each diff part could have a
> 'copy-from' field and a 'copy-to' field. This would allow for a clear
> display of files that have been modified after being copied. The pure
> rename case would only be a copy-from, copy-to part.
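
One way to picture the copy-from/copy-to idea is the following sketch. The `DiffPart` class, its `path` and `hunks` fields, and the wording of the summaries are assumptions made for illustration; only the two copy fields come from the proposal.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DiffPart:
    """One entry of a patch; the copy fields mirror 'copy-from' and 'copy-to'."""
    copy_from: Optional[str] = None   # source path, if the file was copied or renamed
    copy_to: Optional[str] = None     # destination path of the copy or rename
    path: Optional[str] = None        # path of a file modified in place
    hunks: List[str] = field(default_factory=list)  # textual changes, if any

    def is_pure_rename(self) -> bool:
        # Per the proposal: a pure rename is only a copy-from, copy-to part.
        return bool(self.copy_from and self.copy_to and not self.hunks)

    def summary(self) -> str:
        if self.is_pure_rename():
            return f"renamed {self.copy_from} -> {self.copy_to}"
        if self.copy_from and self.copy_to:
            return f"copied {self.copy_from} -> {self.copy_to}, then modified"
        return f"modified {self.path}"
```

A reviewer would then see `renamed a.c -> b.c` instead of a full block of '-' lines followed by the same content as '+' lines.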
>
> To handle this new format, the idea would be to have a standalone
> patching program. This standalone code could, hopefully, later be adopted
> by GNU patch or other tools (I wouldn't use APR for this).
>
> == Optional future steps ==
>
> It could be possible to have a standalone tool to convert back and forth
> between the standard "unified diff" format and the new one. This tool would
> fail when converting to unified diff a patch which has tree modifications.
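
The failure mode of such a converter could look roughly like this. It is a sketch only: parts are represented as plain dictionaries, and the `path`, `hunks`, `copy_from`, and `copy_to` keys as well as the `TreeModificationError` name are hypothetical.

```python
from typing import Dict, List


class TreeModificationError(ValueError):
    """Raised when a part cannot be expressed in plain unified diff."""


def to_unified_diff(parts: List[Dict]) -> str:
    """Convert new-format parts to unified diff text, refusing tree modifications."""
    lines: List[str] = []
    for part in parts:
        # Hypothetical keys: a copy/rename marker means this is a tree modification.
        if part.get("copy_from") or part.get("copy_to"):
            raise TreeModificationError(
                f"cannot express tree modification for {part.get('path')!r}")
        lines.append(f"--- {part['path']}")
        lines.append(f"+++ {part['path']}")
        lines.extend(part["hunks"])
    return "\n".join(lines) + "\n"
```

Failing loudly, rather than degrading a rename into a delete plus an add, keeps the conversion from silently losing the information the new format exists to carry.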
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: dev-help@subversion.tigris.org
>
>
(Replying to all this time ;-)
Just wanted to throw out the idea that allowing comments in the patch
file might be useful for enabling both justification for individual
code changes and suggestions for improvement if a patch was being
passed back and forth. If comments were supported in the format then
there would be the potential for integration with merge GUIs etc.
Chris
Received on Sun Apr 1 00:11:27 2007