Re: Generating a dump file using a powershell script

From: Daniel Shahaf <d.s_at_daniel.shahaf.name>
Date: Tue, 22 Jun 2010 20:12:05 +0300 (Jerusalem Daylight Time)

Geoff Worboys wrote on Tue, 22 Jun 2010 at 17:36 -0000:
> powershell .\Import-from-Source D:\SourceFolder D:\Temp\DumpFile.dat
> It takes the entire contents of D:\SourceFolder and creates
> a subversion dump file in D:\Temp\DumpFile.dat. It replicates
> the structure inside D:\SourceFolder so if you want a "trunk"
> folder etc you have to have created them first.
>
> Objects (the full tree) from D:\SourceFolder are first sorted
> by their last-write-time property and I then create a revision
> entry for each date that appears (the revision resolution is
> adjustable in the script). This makes it so that each file
> ends up appearing to have been committed on the same date that
> it had on the original source file, so checking out the files
> with the use-commit-times option gives them the same date as the
> original file (if not, necessarily, exactly the same time).

I.e., you import the files in order of their timestamps, so that the
svn:date values remain globally sorted?
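
A minimal sketch of that ordering, assuming one revision per calendar
day (UTC) as the resolution. The names plan_revisions and scan_tree are
hypothetical and are not taken from Geoff's script; the sketch only
plans which files go into which revision and writes no dump file.

```python
import os
from datetime import datetime, timezone
from itertools import groupby


def plan_revisions(files):
    """Group (path, mtime_seconds) pairs into revisions, oldest first.

    Returns a list of (date, [paths]) tuples, one per distinct UTC date,
    so the dates (and therefore svn:date) increase with the revision number.
    """
    def day_of(entry):
        return datetime.fromtimestamp(entry[1], tz=timezone.utc).date()

    # Sort by timestamp first; groupby then sees each day as one run.
    ordered = sorted(files, key=lambda entry: entry[1])
    return [(day, [path for path, _ in group])
            for day, group in groupby(ordered, key=day_of)]


def scan_tree(root):
    """Yield (relative_path, mtime_seconds) for every file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            yield os.path.relpath(full, root), os.stat(full).st_mtime
```

Usage would be plan_revisions(scan_tree(r"D:\SourceFolder")). The sketch
covers files only: a directory has to be added no later than the first
file inside it, and that bookkeeping is left out here.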


> Q1: If, in the dump file, I sometimes give a file a property
> svn:eol-style = native, but the file itself has been copied
> directly into the dump file (ie. contains CRLF end-of-lines)
> is that going to matter to svnadmin load?
> [Will the load process take care of things for me or do I
> need to parse such files and make them all LF - which is what
> svn says it uses internally for "native" files? ]
> My experiments seemed to show that svnadmin dump also produced
> the CRLF end-of-lines, but it all gets quite confusing so I
> thought I would ask here.

I.e., 'svnadmin dump' produces CRLF for svn:eol-style=native files?
That surprises me; I'd expect such files to be written with LF in dump
files. (My testing agrees with my expectation.) Can you double-check?

In any case, it probably *should* use LF, since dumpfiles are supposed
to be a portable binary format.
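
If the script does need to normalize, a minimal sketch could look like
the following. It assumes that LF is the form expected in the dump file
for svn:eol-style=native content (which is the expectation stated above,
not something verified here), and the function name is hypothetical.
Because the byte count changes, the Text-content-length and
Text-content-md5 headers would have to be computed from the normalized
bytes rather than from the file on disk.

```python
import hashlib


def normalize_native_text(data: bytes):
    """Convert CRLF line endings to LF and return what the headers need.

    Returns (normalized_bytes, length, md5_hex). Lone CR bytes are left
    untouched; this sketch only handles Windows-style CRLF input.
    """
    normalized = data.replace(b"\r\n", b"\n")
    return normalized, len(normalized), hashlib.md5(normalized).hexdigest()
```

Files without svn:eol-style (binary files in particular) should be
copied into the dump file byte for byte and not passed through this.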

> Since I mostly work under Windows it's probably not a big deal
> for me ... but I'd rather the script was correct in case it
> gets used by others that may have other requirements.
>
> Q2: When writing the code to try and identify text versus
> binary files I decided to look at what subversion did ... but
> now I am confused. In libsvn_subr\io.c function
> svn_io_detect_mimetype2 a comment says:
>
>     going to examine the first block of data, and make sure that 85%
>     of the bytes are such that their value is in the ranges 0x07-0x0D
>     or 0x20-0x7F, and that 100% of those bytes is not 0x00.
>
> but my reading of this code
>
>     if (((binary_count * 1000) / amt_read) > 850)
>       {
>         *mimetype = generic_binary;
>         return SVN_NO_ERROR;
>       }
>
> suggests that it is actually setting the type to binary only
> if it finds more than 85% are binary bytes (in earlier code a
> file is forced to binary if any null byte is found).
> Can anyone explain this? A bug or am I missing something?

What's the question? Are you saying the code/comment disagree?
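
For reference, here is the check as described in the quoted text,
transcribed to Python. This is a sketch of the quoted snippet and of
Geoff's summary of the earlier null-byte test, not a copy of io.c; the
function name and the handling of an empty block are assumptions.

```python
def looks_binary(block: bytes) -> bool:
    """Apply the quoted heuristic to the first block of a file."""
    if not block:
        return False  # assumption: nothing read, so nothing to flag
    if 0 in block:
        return True   # any null byte forces "binary"
    # Count bytes outside the ranges 0x07-0x0D and 0x20-0x7F.
    binary_count = sum(
        1 for b in block
        if not (0x07 <= b <= 0x0D or 0x20 <= b <= 0x7F)
    )
    # Same integer arithmetic as the quoted C condition.
    return (binary_count * 1000) // len(block) > 850
```

With that threshold, a block in which half the bytes fall outside the
text ranges is still treated as text; only more than 85% non-text bytes
(or a null byte) trips the check. The comment, read literally, asks for
85% text bytes, which would be a much stricter test.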

> Q5: I found a description of the dump file in the source but
> that description says "Properties are stored in the same
> human-readable hashdump format used by working copy property
> files," Any pointers to a description for that?

You're quoting <http://svn.apache.org/repos/asf/subversion/trunk/notes/dump-load-format.txt>.

Internally the function it uses is svn_hash_write2(), and there's
a small documentation comment at the top of hash.c. But, as you say,

> (Obviously I've gotten by just by visually checking dump files
> produced by svnadmin, but it would be good to know what I was
> doing. ;-)

the format isn't hard to reverse-engineer, right?
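
A minimal sketch of a writer for the property block as it appears in
'svnadmin dump' output: each key and value is preceded by a "K <length>"
or "V <length>" line giving its size in bytes, and the block ends with
"PROPS-END". The function name is hypothetical, and the details should
be checked against dump-load-format.txt and real dump output.

```python
def write_props(props: dict) -> bytes:
    """Serialize a {name: value} mapping as a dump-file property block."""
    out = bytearray()
    for key, value in props.items():
        k = key.encode("utf-8")
        v = value if isinstance(value, bytes) else value.encode("utf-8")
        # Lengths count bytes, not characters, and exclude the newline.
        out += b"K %d\n%s\n" % (len(k), k)
        out += b"V %d\n%s\n" % (len(v), v)
    out += b"PROPS-END\n"
    return bytes(out)
```

The length of the returned block, including the PROPS-END line, is the
value that goes into the Prop-content-length header of the node or
revision record.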

> Hmmm... big post for my first post. Hope that's okay.

Yeah. For next time, you could consider adding a one-paragraph summary
at the top, and/or making it clear what kind of responses you're looking
for (e.g., "Hey, I'm looking for people to try my script", or "Hey, I'm
looking for answers to questions I ran into developing a script", or ...)
Received on 2010-06-22 19:11:51 CEST

This is an archived mail posted to the Subversion Users mailing list.
