BUG REPORT:
SUMMARY:
svndumpfilter (version 1.8.8) rearranges the order of the header
lines of Node records in its output. As a result, the Node-path:
line, which as I read the specification MUST come first, is often
no longer the first line of its record. This can corrupt data when
the output of svndumpfilter is read by "svnadmin load" or by other
applications that rely on the specified order of these header lines.
HOW TO REPRODUCE:
Execute this shell script (e.g. on Ubuntu Linux 14.04). Try it
several times, as the output is not deterministic (probably due
to some hash randomization):
#!/bin/bash
repo=1
rm -rf $repo 2 3
url=file://`pwd`/$repo
set -x
svnadmin create $repo
svn mkdir -m '' $url/dir1
svnadmin dump $repo >2
svndumpfilter --drop-empty-revs include dir1 <2 >3
svndumpfilter --version
diff -u 2 3
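Because the reordering does not show up on every run, repeating the
script by hand gets tedious. The following is only a sketch of a helper
that reruns a command until it first exits with a non-zero status. The
name repeat_until_failure is hypothetical and not part of Subversion.
It assumes the script above is saved as ./svndumpfilter-bug, so that
its exit status is that of the final "diff -u 2 3" (non-zero when the
two dumps differ).

```sh
# Hypothetical helper (not part of Subversion): run a command up to
# $1 times and stop at the first run that exits with non-zero status.
repeat_until_failure() {
    max=$1
    shift
    i=1
    while [ "$i" -le "$max" ]; do
        if ! "$@"; then
            echo "run $i: command failed"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $max runs"
}
```

Usage, under the assumptions above:
repeat_until_failure 20 ./svndumpfilter-bug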
ACTUAL OUTCOME:
$ ./svndumpfilter-bug
+ svnadmin create 1
+ svn mkdir -m '' file:///home/mgk25/1/dir1
Committed revision 1.
+ svnadmin dump 1
* Dumped revision 0.
* Dumped revision 1.
+ svndumpfilter --drop-empty-revs include dir1
Including (and dropping empty revisions for) prefixes:
'/dir1'
Revision 0 committed as 0.
Revision 1 committed as 1.
+ svndumpfilter --version
svndumpfilter, version 1.8.8 (r1568071)
compiled Aug 13 2014, 17:12:39 on x86_64-pc-linux-gnu
Copyright (C) 2013 The Apache Software Foundation.
This software consists of contributions made by many people;
see the NOTICE file for more information.
Subversion is open source software, see http://subversion.apache.org/
+ diff -u 2 3
--- 2 2015-02-02 15:28:18.164314000 +0000
+++ 3 2015-02-02 15:28:18.173314000 +0000
@@ -16,10 +16,6 @@
Prop-content-length: 99
Content-length: 99
-K 10
-svn:author
-V 5
-mgk25
K 8
svn:date
V 27
@@ -28,10 +24,14 @@
svn:log
V 0
+K 10
+svn:author
+V 5
+mgk25
PROPS-END
-Node-path: dir1
Node-kind: dir
+Node-path: dir1
Node-action: add
Prop-content-length: 10
Content-length: 10
EXPECTED OUTCOME:
The "diff" output should ideally be empty, or at least
the Node-path: record should remain the first record
and not move after other records such as Node-kind:.
Every Node record in the output should start with a Node-path: line.
REFERENCES:
The specification of the svnadmin dump output format at
https://svn.apache.org/repos/asf/subversion/trunk/notes/dump-load-format.txt
states:
==== Node records ====
Each Revision record is followed by one or more Node records.
Node records have the following sequence of header lines:
-------------------------------------------------------------------
Node-path: <path/to/node/in/filesystem>
[Node-kind: {file | dir}]
Node-action: {change | add | delete | replace}
[Node-copyfrom-rev: <rev>]
{...}
-------------------------------------------------------------------
Bracketing in [] indicates optional lines; { | } is an alternation group.
Dump decoders should be prepared for the optional lines after
Node-action to be in any order,
I read this specification to mean that each Node record
MUST start with a Node-path: line.
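This reading of the ordering rule can be checked mechanically. The
following is a rough sketch, not a real dump parser: it treats every
run of non-blank lines as one block and reports any Node-path: line
that is not the first line of its block. The function name
check_node_order is hypothetical. The script ignores Content-length,
so property values or file contents that happen to contain a line
starting with "Node-path: " would be reported as false positives.

```sh
# Heuristic check (an assumption-laden sketch, not a dump parser):
# report every "Node-path: " line that is not first in its block.
check_node_order() {
    awk '
        $0 == "" { pos = 0; next }   # a blank line ends the current block
        { pos++ }                    # position of this line within its block
        /^Node-path: / && pos > 1 {
            printf "line %d: Node-path: is line %d of its block\n", NR, pos
            bad = 1
        }
        END { exit (bad ? 1 : 0) }   # non-zero status if anything was reported
    '
}
```

Applied to the filtered dump from the script above, this would be
check_node_order <3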
SEVERITY:
This is actually causing svndumpfilter output to be rejected
by "svnadmin load". For example, I dumped a repository
and fed it through svndumpfilter. The result is rejected by
"svnadmin load" because, after svndumpfilter has moved
Node-copyfrom-rev: and Text-copy-source-md5: in front of the
associated Node-path: line, the "svnadmin load" command gets
confused about the MD5 value to check:
Node-copyfrom-rev: 1509
Node-copyfrom-path: trunk/proj1/file.tex
Text-copy-source-md5: b19b7a6aadc43644e7c6ae02c584e74d
Node-path: trunk/proj1/doc/file.tex
Node-action: add
Text-copy-source-sha1: d3bcc5e1d73e4ef72fabaea3ef98426789d2c5d2
Node-kind: file
Content-length: 0
This gets rejected with
svnadmin: Copy source checksum mismatch on copy from 'trunk/proj1/file.tex'@62
to 'trunk/teaching/proj1/doc/file.tex' in rev based on r1535:
expected: b19b7a6aadc43644e7c6ae02c584e74d
actual: 2a9a6d26189d9f35a436a9db6de202a1
Markus
--
Markus Kuhn, Computer Laboratory, University of Cambridge
http://www.cl.cam.ac.uk/~mgk25/ || CB3 0FD, Great Britain
Received on 2015-02-02 17:14:52 CET