Could we divorce the questions of storage and algorithms, please?
We all seem to be in agreement that we need to store the file contents
on the shelf's base and the file contents as modified in the shelved
patch. There are many possible ways to do so: as two full files, or as
unidiffs, or even as deltas. All these representations are
interchangeable: any of them can be derived from any other. Which one
to use is one question.
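
To make the interchangeability concrete, here is a rough Python sketch (not Subversion code). It assumes line-based text and uses a made-up copy/insert delta encoding; the names make_delta, apply_delta and as_unidiff are hypothetical. Given the base, the delta alone is enough to get back the modified fulltext or a unidiff.

```python
import difflib

def make_delta(base_lines, modified_lines):
    """Encode modified_lines as copy/insert instructions against base_lines."""
    ops = []
    matcher = difflib.SequenceMatcher(a=base_lines, b=modified_lines,
                                      autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Reuse base_lines[i1:i2] instead of storing it again.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # Only genuinely new lines are stored in the delta.
            ops.append(("insert", modified_lines[j1:j2]))
        # "delete": nothing is emitted; the base range is simply skipped.
    return ops

def apply_delta(base_lines, ops):
    """Rebuild the modified fulltext from the base plus the delta."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(base_lines[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out

def as_unidiff(base_lines, ops, name="file"):
    """Derive a unidiff from the base plus the delta."""
    modified = apply_delta(base_lines, ops)
    return list(difflib.unified_diff(base_lines, modified, name, name))
```

Going the other way (from two fulltexts to a delta) is just make_delta, so the choice of on-disk form does not constrain what we can export later.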
Then, there is the problem of how to rebase a shelf, i.e., how to apply
a shelved patch to a different tree than it was composed against. This
is another question; it is independent of the first one. (This is the
question to which "apply a unidiff" and "use diff3" were suggested as
solutions.)

More below.

Johan Corveleyn wrote on Mon, Aug 28, 2017 at 22:41:15 +0200:
> There is one big disadvantage of storing the complete modified files,
> and that's storage. If I'm making a small edit to a 100 MB file,
> instead of storing a patch of 500 bytes, I have to store 100 MB per
> shelved change.
>
> I'm not an expert, but do you really need the modified file itself, if
> you have the patch and a reference to the base file (pristine)? Why
> store both F (pristine) and F' (modified file), if I can reconstruct
> F' out of F + P (patch). So I suggest:
>
> * Store the patch
> * Keep the pristine on which the patch was based (keep a reference to
> it in the pristine store, like in Brane's suggestion)
>
Rather than store a patch which is meant to apply to a particular base,
how about cutting out the middleman and storing a delta (and the
path_at_revision of the delta base, if there is one)? As I said above,
even if the storage takes the form of a delta, we can implement
'svn shelf --export-as-unidiff' and 'svn shelf --rebase-using-3-way-merge'
in terms of it.

However, I think the right place to implement this is not as special
codepaths for shelves, but inside the pristine store. Therefore, I
envision the MVP simply adding two fulltext files to the pristine store,
as Brane originally suggested. A future enhancement will be to teach
the pristine store not to store two .svn/pristine/*.svn-base files, but
to store one .svn-base file and one delta. That future enhancement
_will_ require a format bump, but it is entirely orthogonal to the
shelves work: neither of them is a prerequisite for the other.
I think we can borrow more ideas from problems we have already solved;
for example: a patch that makes tree changes could be stored as a
serialized editor drive, or as NODES table rows describing the tree
resulting from that drive. In either representation, rebasing such a
patch is the same problem as 'svn update' of a wc that has local mods,
with the shelf's base and modified trees substituted for BASE and ACTUAL
respectively.

Really, the shelves work is simply about having more than one ACTUAL
tree, isn't it? And it's not stored in the host OS but serialised under
.svn/.
> We can still perform the 3-way merge by first reconstructing F and F'
> out of F and P.
>
Agreed. In terms of the divorce I tried to chart above, "reconstructing
F and F'" is the first question, and "performing diff3" is (one proposed
answer to) the second question.
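
As an illustration of the second question only, here is a minimal line-based 3-way merge sketch in Python. It is not Subversion's diff3 implementation; the names changes, merge3 and MergeConflict are hypothetical, and it is deliberately conservative: overlapping or adjacent changes are reported as conflicts unless both sides made the identical change. Its inputs are three fulltexts, however they were reconstructed from storage.

```python
import difflib

class MergeConflict(Exception):
    """Raised when both sides change the same (or adjacent) base lines."""

def changes(base, other):
    """List (base_start, base_end, new_lines) for each non-equal region."""
    sm = difflib.SequenceMatcher(a=base, b=other, autojunk=False)
    return [(i1, i2, other[j1:j2])
            for tag, i1, i2, j1, j2 in sm.get_opcodes() if tag != "equal"]

def merge3(base, mine, theirs):
    """Apply both sides' changes relative to base, or raise MergeConflict."""
    ours, other = changes(base, mine), changes(base, theirs)
    merged, pos = [], 0
    while ours or other:
        if ours and other and ours[0] == other[0]:
            # Both sides made the identical change: take it once.
            change = ours.pop(0)
            other.pop(0)
        elif not other or (ours and ours[0][1] < other[0][0]):
            change = ours.pop(0)     # our change ends before theirs starts
        elif not ours or other[0][1] < ours[0][0]:
            change = other.pop(0)    # their change ends before ours starts
        else:
            raise MergeConflict("overlapping changes near base line %d"
                                % (min(ours[0][0], other[0][0]) + 1))
        start, end, new_lines = change
        merged.extend(base[pos:start])   # untouched base lines
        merged.extend(new_lines)         # the changed region
        pos = end
    merged.extend(base[pos:])
    return merged
```

For rebasing a shelf, base would be F (the shelf's base), theirs would be F' (the shelved text), and mine would be the file in the tree the shelf is being applied to.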
Cheers,
Daniel
P.S. Once the pristine store knows how to store text-base foo as a delta
against text-base bar, it could just as easily store text-base foo as a
self-delta, and presto, we have compressed pristines.
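
A toy Python illustration of that idea, with zlib standing in for the real delta format (an assumption made purely for illustration; this is not how the pristine store works today, and store/load are hypothetical names). Passing a base as a zlib preset dictionary plays the role of "delta against bar"; passing no base plays the role of a self-delta, i.e. plain compression. Note that zlib only looks at roughly the last 32 KB of a preset dictionary, so this stand-in is only meaningful for small files.

```python
import zlib

def store(text, base=None):
    """Return the stored form of text: a 'delta' against base, or a
    'self-delta' (plain compression) when there is no base."""
    if base is None:
        comp = zlib.compressobj()
    else:
        comp = zlib.compressobj(zdict=base)
    return comp.compress(text) + comp.flush()

def load(blob, base=None):
    """Reconstruct the fulltext; base must match what store() was given."""
    if base is None:
        decomp = zlib.decompressobj()
    else:
        decomp = zlib.decompressobj(zdict=base)
    return decomp.decompress(blob) + decomp.flush()
```

The same two entry points serve both cases, which is the point of the P.S.: once the delta codepath exists, compressed pristines come almost for free.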
> And even if the pristine gets lost (either because the patch was
> transferred to another user; or because the user executed the
> not-yet-existing command 'svn cleanup --vacuum-shelve-pristines' to
> reclaim diskspace) the patch will still be usable, although without
> the 3-way merging.
>
> --
> Johan
Received on 2017-08-29 01:27:06 CEST