
Q: identical files "shared" in repository?

From: Ph. Marek <philipp.marek_at_bmlv.gv.at>
Date: 2005-04-11 07:39:31 CEST

Hello everybody!

I've got a question which could evolve into a feature request :-)

Say, I've got a repository with trunk T and branch A.
If changes to T get merged into A (and committed to A), it's very likely
that the files changed by the merge have not been touched on the branch,
i.e. they were only modified in the trunk.

When that merge is committed, the difference on A is stored in the
repository again, right? Or are only pointers to the trunk's file contents
stored?


I don't think I've made myself clear. Sorry, I'll try again:
If there are many files with identical data, are they stored in the repository
multiple times, or do they all reference a common block?

If we rely on MD5 being good enough (or SHA-512, or whatever apr may export
in the future), we could "share" identical contents in the repository.

The feature request would be this:

Generate a checksum table (or a file, for fsfs) and store (checksum, internal
pointer to contents) or (checksum, revision, filename) in it. This is
cheap, as everything is known at commit time; the only thing to do is to
update an index.
Later commits then check their contents against this table, so identical
files can be stored at minimal cost.
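
To make the proposal concrete, here is a rough sketch in Python. It is only
an illustration of the checksum-table idea, not how Subversion stores data:
the class SharedContentStore, its methods, and the example paths are all
made up, and it assumes (as above) that a matching checksum means matching
contents.

```python
import hashlib


class SharedContentStore:
    """Toy repository that stores each distinct file content only once.

    Hypothetical illustration of the checksum-table idea; not Subversion's
    actual storage layout.
    """

    def __init__(self):
        self.blobs = {}  # checksum -> file contents (the shared block)
        self.paths = {}  # (revision, path) -> checksum (pointer to contents)

    def commit(self, revision, path, data):
        """Record a file; return True only if new contents had to be stored."""
        # Everything needed for the checksum is known at commit time.
        checksum = hashlib.sha512(data).hexdigest()
        is_new = checksum not in self.blobs
        if is_new:
            self.blobs[checksum] = data
        # Identical contents only cost one index entry.
        self.paths[(revision, path)] = checksum
        return is_new

    def read(self, revision, path):
        """Follow the pointer from (revision, path) to the stored contents."""
        return self.blobs[self.paths[(revision, path)]]


store = SharedContentStore()
contents = b"int main(void) { return 0; }\n"
stored_on_trunk = store.commit(1, "trunk/foo.c", contents)
stored_on_branch = store.commit(2, "branches/A/foo.c", contents)
```

In this sketch the second commit adds only an index entry, because the
contents merged into the branch are already present under the same checksum.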

That would make branching really zero-cost!

Opinions, ideas, other comments?



Received on Mon Apr 11 07:40:47 2005

This is an archived mail posted to the Subversion Dev mailing list.