
Re: Script or process to fix old invalid data in repositories or dumps?

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Sat, 5 Apr 2014 11:12:02 +0000

Mark Phippard wrote on Fri, Apr 04, 2014 at 13:30:48 -0400:
> So I realize I can use that option to force the file to load, but that is
> just punting the problem to the future. Has anyone ever written any
> scripts that can run through an entire repository and fix these sort of
> problems? In this case, maybe a script that goes through a repos and
> retrieves and then sets each revprop using the current command line?

That's easy enough:

[[[
#!/usr/bin/env zsh
#
# Renormalize svn:* revprops in repository $1 (local path).
#
set -e
[ $# -eq 1 ] && [ -d "$1" ] || { echo "Usage: $0 REPOS" >&2; exit 1 }
REPOS=$1
ABS_REPOS=`cd -- "$REPOS" && pwd`
for revnum in {0..`svnlook youngest -- "$REPOS"`};
  for prop in $(svnlook proplist -r$revnum --revprop -- "$REPOS" | cut -c3- | grep '^svn:');
    svnadmin setrevprop -r$revnum -- "$REPOS" "$prop" =(svn propget --strict --revprop -r$revnum -- "$prop" "file://$ABS_REPOS")
]]]

(It should be easy enough to convert this to plain sh --- all the {} and
=() parts are just syntactic sugar.)
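
For reference, a plain-sh rendering of the same loop might look like the sketch below. It is untested against a real repository. The function name renormalize_revprops and the use of mktemp(1) (widely available but not POSIX) are assumptions of this sketch; the svnlook, svn and svnadmin invocations are the same ones the zsh script uses, with a temporary file standing in for =(cmd).

```shell
#!/bin/sh
# Sketch: plain-sh port of the zsh script above.
# renormalize_revprops REPOS  -- rewrite every svn:* revprop of REPOS in place.
# (The function name is hypothetical; mktemp is assumed to be installed.)
renormalize_revprops() {
    [ $# -eq 1 ] && [ -d "$1" ] || { echo "Usage: renormalize_revprops REPOS" >&2; return 1; }
    repos=$1
    abs_repos=$(cd -- "$repos" && pwd) || return 1
    youngest=$(svnlook youngest -- "$repos") || return 1
    tmp=$(mktemp) || return 1

    revnum=0
    while [ "$revnum" -le "$youngest" ]; do
        # Same pipeline as the zsh version: list revprops, strip the
        # two-column indent, keep only svn:* names.
        for prop in $(svnlook proplist -r"$revnum" --revprop -- "$repos" | cut -c3- | grep '^svn:'); do
            # The temporary file replaces zsh's =(cmd).
            svn propget --strict --revprop -r"$revnum" -- "$prop" "file://$abs_repos" > "$tmp" &&
                svnadmin setrevprop -r"$revnum" -- "$repos" "$prop" "$tmp" ||
                { rm -f "$tmp"; return 1; }
        done
        revnum=$((revnum + 1))
    done
    rm -f "$tmp"
}
```

Call it as `renormalize_revprops /path/to/repos`, preferably on a copy of the repository first.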

That doesn't fix node properties; those cannot be fixed in place on an
existing repository. svnsync does fix nodeprops.

In my testing, the 'svnlook proplist' part errors out when a property
has mixed EOLs (both CRLF and LF in a single property value). I don't
have a recent 1.8/1.9-dev build to test with to see if that issue persists.

> Another problem I've seen is when the data is not UTF8. I know you can use
> svnsync to fix this problem by using the --source-prop-encoding ARG option.
> Are there any scripts to do this without doing an svnsync?

I imagine you can just take the above script and change:

    =(svn propget --strict ...)

to

    =(svn propget --strict ... | iconv -f iso-8859-1 -t utf-8)

I'm not sure whether config:miscellany:log-encoding needs to be unset,
since I expect --strict mode to ignore it.
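
To see what the iconv step does to a value, here is a small standalone sketch. The helper name to_utf8 is hypothetical, and iso-8859-1 is only the example source encoding from the snippet above; substitute whatever encoding the old data actually uses.

```shell
#!/bin/sh
# Hypothetical helper: re-encode a Latin-1 property value read from
# stdin as UTF-8 on stdout. This is the filter that would sit after
# 'svn propget --strict ...' in the modified script.
to_utf8() {
    iconv -f iso-8859-1 -t utf-8
}

# Demonstration on a literal value: 'caf' followed by the Latin-1 byte
# 0xE9 (e-acute, written here as octal \351). After conversion that
# single byte becomes the two UTF-8 bytes 0xC3 0xA9.
printf 'caf\351' | to_utf8 | od -An -tx1
```

Note that iconv cannot tell whether its input is already UTF-8, so running the conversion over a value that is already valid UTF-8 would double-encode it.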

> If not, does svnsync at least auto-fix the line-ending problem?

Yep:

[[[
% svnsync init file://$PWD/{r2,r}
Copied properties for revision 0.
NOTE: Normalized svn:* properties to LF line endings (1 rev-props, 0 node-props).
]]]

This behaviour is in 1.7, probably earlier too.
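
If going the svnsync route, the whole mirror setup could be sketched roughly as follows. The function name mirror_normalized and its argument order are made up for this sketch, and the always-succeeding pre-revprop-change hook is the simplest thing that lets svnsync write revprops to the mirror; a real setup may want a stricter hook. The optional third argument is passed to --source-prop-encoding, the option mentioned in the question.

```shell
#!/bin/sh
# Sketch: mirror SRC_URL into a fresh repository at DEST_DIR and let
# svnsync normalize svn:* properties on the way.
# Usage (hypothetical): mirror_normalized SRC_URL DEST_DIR [ENCODING]
mirror_normalized() {
    src_url=$1
    dest=$2
    enc=${3:-}

    svnadmin create "$dest" || return 1

    # svnsync stores its bookkeeping in revprops, so the mirror must
    # allow revprop changes.
    printf '#!/bin/sh\nexit 0\n' > "$dest/hooks/pre-revprop-change" || return 1
    chmod +x "$dest/hooks/pre-revprop-change" || return 1

    dest_url="file://$(cd -- "$dest" && pwd)" || return 1

    if [ -n "$enc" ]; then
        # Source properties are assumed to be in encoding $enc.
        svnsync init --source-prop-encoding "$enc" "$dest_url" "$src_url" || return 1
        svnsync sync --source-prop-encoding "$enc" "$dest_url"
    else
        svnsync init "$dest_url" "$src_url" || return 1
        svnsync sync "$dest_url"
    fi
}
```

For example, `mirror_normalized "file://$PWD/r" r2 iso-8859-1` would mirror r into a new repository r2.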

Daniel

P.S. =(cmd) is zsh shorthand for 'a temporary plain file whose contents are the output of cmd'.
Received on 2014-04-05 13:12:51 CEST

This is an archived mail posted to the Subversion Dev mailing list.
