Daniel Shahaf <d.s <at> daniel.shahaf.name> writes:
>
> LiuYan 刘研 wrote on Thu, Nov 18, 2010 at 02:53:37 +0000:
> > Daniel Shahaf <d.s <at> daniel.shahaf.name> writes:
> >
> > >
> > > Stefan Sperling wrote on Wed, Nov 17, 2010 at 18:13:44 +0100:
> > > > On Wed, Nov 17, 2010 at 03:06:19PM +0000, LiuYan 刘研 wrote:
> > > > > I mean, if the revprops files are not in UTF-8 encoding, don't return REPORT
> > >
> > > Small correction: it's meaningless to talk about the encoding of the
> > > revprop files; it's only meaningful to talk about the encoding of the
> > > value of a given property.
> > >
> > > (At the revprop files level, the values are binary, and the rest of the
> > > data in those files is always ASCII.)
> > >
> > >
> >
> > You're right Daniel, but in such situation, these revprop files can be
> > treated as readable text files:
>
> This is simply not true: if you apply 'iconv -f latin1 -t utf-8' to
> a revprop file, you will CORRUPT that revprop file.
>
>
You're right, Daniel: simply applying 'iconv' to a revprop file will
CORRUPT it, because the value's length field has to be changed too.
So I wrote a small script to do the conversion, as I mentioned at
http://article.gmane.org/gmane.comp.version-control.subversion.user/101383
The script does the following:
1. find the affected revprop files
2. change the svn:log value length from "V 85" to "V 98"
3. convert the svn:log value to a UTF-8 encoded string
Here's the small script. Be aware that the script file itself is in GBK encoding.
Administrator_at_CMTEL-SVR-HR-DB /cygdrive/d/SVNRepositories/repos/cmcc/db
$ cat fix-cvs2svn.sh
IFS=$'\n'
# Step 1: list the revprop files that contain the cvs2svn log message
grep -i -r -n "Standard project directories initialized by cvs2svn" revprops/* \
  | cut -d ":" -f 1 > affected_files.txt
#grep result sample
#0/1:8:Standard project directories initialized by cvs2svn.由 cvs2svn
#0/133:8:Standard project directories initialized by cvs2svn.由 cvs2svn
for file in $(cat affected_files.txt)
do
  echo "$file"
  #${file:9}: strip out 'revprops/'
  dest_file="fix/${file:9}"
  cp --force --preserve --verbose "$file" "fix-backup/${file:9}"
  # Steps 2 and 3: line 7 is the svn:log length, lines 8-9 are the value.
  # The Chinese text means "Standard project folders initialized by cvs2svn".
  # This script is saved as GBK, so iconv converts the whole output to UTF-8.
  gawk 'FNR==7 {print "V 98"}
        FNR==8 {print "Standard project directories initialized by cvs2svn."}
        FNR==9 {print "由 cvs2svn 初始化的标准项目文件夹"}
        FNR<7 || FNR==10 {print $0}' "$file" | iconv -f GBK -t UTF-8 > "$dest_file"
done
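The "V 98" above is hard-coded because every affected file carries the same
log message. For messages that differ, the new length could be computed
rather than typed in. A minimal sketch, assuming the message text is already
in a shell variable; value_byte_len is a hypothetical helper name, not part
of the script above:

```shell
#!/bin/sh
# value_byte_len VALUE FROM TO
# Print the number of bytes VALUE occupies once converted from encoding
# FROM to encoding TO. Hypothetical helper for illustration only.
value_byte_len() {
    # printf '%s' avoids adding a trailing newline that would be counted;
    # tr strips the padding some wc implementations put around the number.
    printf '%s' "$1" | iconv -f "$2" -t "$3" | wc -c | tr -d '[:space:]'
}

# Example use (assumes $msg holds the GBK-encoded log message):
#   new_len=$(value_byte_len "$msg" GBK UTF-8)
#   echo "V $new_len"
```

The count has to be in bytes, not characters: each of the Chinese characters
in this message takes 2 bytes in GBK but 3 bytes in UTF-8, which is why the
length grows after conversion.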
Received on 2010-11-18 06:09:02 CET