
RE: Pros and cons of significantly large repositories

From: Andrew R Feller <afelle1_at_lsu.edu>
Date: 2007-04-09 15:51:07 CEST

Thanks for the feedback Erik and James; I appreciate it!


I believe project / product documentation should live within the same
repository as the project / product. Especially when multiple people
are working on a project, there is no excuse not to have it in the
repository, where it can be referred to and updated as necessary. So under
the main BTT structure of a repository, we have a "documentation"
directory along with basic configuration and third-party compiled code
(JARs, DLLs, etc.) in a library/compiled directory.


Currently, we are sticking to the commonly used BTT layout from the root
level, but we have also been trying to justify the use of alternative
repository layouts such as the multiple project layout (example below).
The argument for alternative layouts has mainly been fueled by project
repositories that hold work for several related products. What other
popular repository layouts are common? When should repositories REALLY
deviate from the standard BTT root?


/ (repository root)












As for archiving binary data, we are looking to archive Lotus Notes
database templates and COBOL source in Subversion. These aren't meant
to be updated by users but rather kept as tracked history and reference.
The concern, especially with Lotus Notes database templates, is size, as
a basic .ntf runs half a MB. In case a repository grows too large, a
coworker recommended using svndumpfilter to extract a subset of the
repository and its history, and archiving the rest on Tivoli. What are
the recommended ways to handle repositories that have grown unwieldy?
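
For reference, the svndumpfilter idea can be sketched roughly as follows.
This is only an illustration in Python: the repository path, the
"notes-templates" prefix, and the dump file name are hypothetical, and the
helper names are made up for this example. It assumes svnadmin and
svndumpfilter are on the PATH; the filtered dump would then be loaded into
a fresh repository with svnadmin load.

```python
import shlex
import subprocess


def build_subset_commands(repo_path, keep_prefixes):
    """Return the two commands whose pipeline extracts part of a repository."""
    # svnadmin dump writes the full history of the repository to stdout.
    dump_cmd = ["svnadmin", "dump", repo_path]
    # svndumpfilter include keeps only the listed repository path prefixes.
    filter_cmd = ["svndumpfilter", "include", *keep_prefixes]
    return dump_cmd, filter_cmd


def describe_pipeline(repo_path, keep_prefixes, dump_file):
    """Render the equivalent shell pipeline as text, for review before running."""
    dump_cmd, filter_cmd = build_subset_commands(repo_path, keep_prefixes)
    return (f"{shlex.join(dump_cmd)} | {shlex.join(filter_cmd)}"
            f" > {shlex.quote(dump_file)}")


def write_subset_dump(repo_path, keep_prefixes, dump_file):
    """Run the pipeline and write the filtered dump stream to dump_file."""
    dump_cmd, filter_cmd = build_subset_commands(repo_path, keep_prefixes)
    with open(dump_file, "wb") as out:
        dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
        filt = subprocess.Popen(filter_cmd, stdin=dump.stdout, stdout=out)
        dump.stdout.close()  # let svnadmin see a closed pipe if the filter exits
        if filt.wait() != 0 or dump.wait() != 0:
            raise RuntimeError("dump or filter step failed")


if __name__ == "__main__":
    # Hypothetical paths, shown only to illustrate the shape of the command.
    print(describe_pipeline("/srv/svn/projects", ["notes-templates"],
                            "notes-templates.dump"))
```

Printing the pipeline first makes it easy to check which paths will be kept
before anything is actually dumped.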


Thanks once again for the change fellas,



        Using multiple repositories for code that you may want to merge
together is a bad use of multiple repositories. I do not recommend it.
Separating your documentation from your source code would be a fine
separation but the documents aren't likely to take up a lot of space and
the separation might make it harder to find.


        I agree with James above. I like to keep everything that is
related to a project or product in one repository. That means that by
checking out a particular revision of the repository, I can get
everything related to the project as it stood at that point in time. If
one project messes up its repo, then other projects are unaffected.



        The more repos you have, the more complicated your setup of
permissions, hook scripts, and whatnot will be. So there's another


        The consultant that got us started on the Subversion road
recommended storing the binary builds in the same repository. I
personally believe


        This depends on your build. One of our tools embeds a date code
into the binary, and our user cares about it. So archiving the "golden
build" is necessary in this case. If you trust your tools enough to
faithfully reproduce a build, you don't have to store build products.
Another kind of "binary" we archive is installers. We can always re-create an
installation program, but we find it very convenient to avoid doing
that. We consume a lot of disk space, but it has been manageable. The
benefits of having everything controlled have outweighed the costs of a
big repository.


         that if your binaries are quite large it would be a better idea
to just back them up somewhere else like some other repository or just
on disk (if

         they go poof you should be able to rebuild them right?).
There are pros and cons to each side of the argument, but my advice
would be to pay


        One nice thing about storing binary builds is that SVN makes it
easy to demonstrate that a file is bit-for-bit correct and makes it easy
to track changes. That helps me to show that a particular "golden file"
really has not changed since it was created.


         attention to how big the binary is and realize that the binary
will increase the size of the repository by the size of the binary if
you are popping


         them into separate labeled release folders. Unless you are
loading them all into the same folder/location, you will not get the
benefits of the binary diff.
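
To put rough numbers on the point about release folders, here is a minimal
Python sketch. The 512 KB build size, the 20 releases, and the 10% delta
figure are assumptions made up for illustration, not measured Subversion
behaviour; real savings depend on how similar successive builds are.

```python
def growth_separate_folders(sizes_kb):
    """Each build is added as a brand-new file, so it costs its full size."""
    return sum(sizes_kb)


def growth_same_path(sizes_kb, delta_percent):
    """Estimate growth when every build is committed over the same path.

    The first build costs its full size; each later commit is assumed to
    cost only delta_percent of its size (a hypothetical figure).
    """
    if not sizes_kb:
        return 0
    first, rest = sizes_kb[0], sizes_kb[1:]
    return first + sum(size * delta_percent // 100 for size in rest)


if __name__ == "__main__":
    builds = [512] * 20                        # twenty 512 KB builds (assumed)
    print(growth_separate_folders(builds))     # 10240 KB
    print(growth_same_path(builds, 10))        # 1481 KB under the 10% assumption
```

Under these assumptions the separate-folder layout grows by the full 10 MB,
while committing over one path grows by about 1.4 MB.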


        There's my two cents, hope it helps.



        Good points, James. Here's another 40% of a nickel. Erik



        From: Andrew R Feller [mailto:afelle1@lsu.edu]
        Sent: Thursday, April 05, 2007 12:52 PM
        To: users@subversion.tigris.org
        Subject: Pros and cons of significantly large repositories




        My company is currently trying to use Subversion not only to
store code for new projects but also to hold binary builds and
internal documentation (processes, meetings, etc.). The question most
often asked is "Are you going to use a single repository or multiple
repositories?" I know a Subversion repository can hold any amount of
data, but what I want to know is:


        What are the pros and cons for having really large repositories
versus multiple, smaller repositories?

        What experiences have people had with repository administration
and general usage that have made a particular choice good or bad?


        I appreciate your feedback and insight!




Received on Mon Apr 9 15:51:48 2007

This is an archived mail posted to the Subversion Users mailing list.