
Re: unversioned properties: size limitations?

From: Alexey Neyman <stilor_at_att.net>
Date: Mon, 11 Aug 2014 23:17:01 -0700

On Tuesday, August 12, 2014 07:44:03 AM Branko Čibej wrote:
> On 12.08.2014 07:16, Alexey Neyman wrote:
> > On Tuesday, August 12, 2014 06:59:20 AM Branko Čibej wrote:
> > > On 12.08.2014 03:31, Alexey Neyman wrote:
> > > > Hi SVN users/developers,
> > > >
> > > > Is there a limitation in size on the property value that can be set?
> > > > Any scalability traps to be aware of (i.e. non-linear increase in time
> > > > due to increase in size of the property value)? I tried a 4Mb
> > > > property, seems to work fine...
> > >
> > > One thing to be aware of is that properties were never designed to be
> > > large. Property values are always transmitted in full text between
> > > client and server (i.e., they're not compressed); they're stored in full
> > > text in the repository (not deltified the way file contents are). So the
> > > more large properties you have, and the more often you modify them, the
> > > less efficient your repository will be, in terms of storage requirements
> > > and network bandwidth.
> > >
> > > So while you should be able to store a 2 gig property value, I really,
> > > really recommend not to do that.
> >
> > I thought of having pre- and post-commit hooks communicate using a
> > *revision* property: pre-commit hook would set a revision property
> > with the list of files and actions to be performed on them, and the
> > post-commit hook will perform these actions by committing a new
> > revision (instead of modifying a transaction by pre-commit hook).
> >
> > Thus a more specific question - when are arbitrary *revision*
> > properties sent from the client to the server? Obviously, svn:*
> > properties are used by various SVN commands; but am I right to assume
> > that non-standard revision properties are sent only for the 'svn pg
> > --revprop' command?
> >
> > That said, I expect the property value to be much less than 2Gb. So
> > far, the largest commit we've had was ~20000 files - with ~150
> > characters per path, that would be about 3Mb for the property value.
>
> Sure, you'll only transmit revprops with propget --revprop and propset
> --revprop. I'm not sure what the implications are of storing large
> values in revprops, these are handled a bit differently than versioned
> properties on the server.
>
> And of course, revision properties are not versioned.
>
> I'm still not sure what you're trying to achieve, though. "Communication
> between pre- and post-commit hooks" doesn't describe the problem, it
> describes a solution, and there are of course other ways for hooks to
> communicate that do not involve the repository.

I've mentioned this in the other thread where you also responded. There are two problems
that are currently (we're using 1.6) solved by modifying the transaction in the pre-commit
hook:

1. We have a <version.h> header that needs to reflect the last modification date of *any*
file in the project. Currently, the pre-commit script modifies a property in each commit that
touches any file in /project/trunk.

2. We have a few checks in pre-commit that are performed on text but not on binary files,
and (unless the type is set explicitly) text is distinguished from binaries using simple
heuristics. To avoid running these heuristics over and over, the result is saved into a property
on that file.
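
For illustration only, a text-vs-binary check of this kind might look like the sketch below. It is not our actual hook code: the function name, the sample size and the threshold are all assumptions made up for the example.

```python
def looks_like_text(data: bytes,
                    sample_size: int = 8192,
                    max_control_ratio: float = 0.30) -> bool:
    """Guess whether `data` is text by inspecting its first bytes.

    Hypothetical heuristic: a NUL byte means binary; otherwise the
    content is treated as text unless too many bytes are control
    characters. Bytes >= 0x80 are accepted so that UTF-8 and Latin-1
    text are not rejected.
    """
    sample = data[:sample_size]
    if not sample:
        return True                      # treat an empty file as text
    if b"\x00" in sample:
        return False                     # NUL byte: almost certainly binary
    # Printable ASCII, common whitespace, and all high-bit bytes.
    allowed = bytes(range(32, 127)) + b"\n\r\t\f\b" + bytes(range(128, 256))
    control = sample.translate(None, allowed)   # delete every allowed byte
    return len(control) / len(sample) <= max_control_ratio
```

The hook would run this once per file and store the answer in a property, so later commits can read the property instead of rescanning the content.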

So, to avoid modifying the transaction in pre-commit (which no longer works reliably in 1.7
and 1.8), I am changing the pre-commit hook to list the 'tasks' to be performed by the
post-commit hook, which will check in a new revision. I don't want to involve out-of-repository
storage for that list of tasks unless absolutely necessary; hence, a revision property looks
like the perfect place to store the "follow-up tasks" for a particular revision.
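
To make the size estimate concrete, here is a rough sketch of how such a task list could be encoded into a single property value and decoded again. The JSON layout and the action names are hypothetical, and the sketch deliberately leaves out how the value is attached to the transaction or read back by the post-commit hook.

```python
import json

def encode_tasks(tasks):
    """Serialize (path, action) pairs into one property value (bytes)."""
    payload = [{"path": path, "action": action} for path, action in tasks]
    # Compact separators keep the per-entry overhead small.
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")

def decode_tasks(value):
    """Inverse of encode_tasks: bytes -> list of (path, action) pairs."""
    return [(t["path"], t["action"]) for t in json.loads(value.decode("utf-8"))]

def synthetic_tasks(count=20000, path_len=150, action="mark-text"):
    """Build a worst-case list: `count` paths of `path_len` characters each.

    The action name "mark-text" is made up for this example.
    """
    return [(("/project/trunk/" + format(i, "06d")).ljust(path_len, "x"), action)
            for i in range(count)]

if __name__ == "__main__":
    value = encode_tasks(synthetic_tasks())
    # The paths alone account for 20000 * 150 = 3,000,000 bytes;
    # the encoding adds a fixed overhead per entry on top of that.
    print(len(value), "bytes")
```

The point is only that the value stays in the low megabytes even for our largest commit so far, which is the same order of magnitude as the 4Mb property I tried.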

> Also I find your approach less than robust:
>
> * There's no guarantee that the post-commit hook will ever run, so
> it's a bad idea to rely on it for anything that's critical to your
> workflow.

Can you please elaborate on this? I thought that if a transaction was promoted to a
revision, the post-commit hook is always run. I understand that post-commit may fail and
this failure will not roll back a revision. But when is it not run at all?

PS. I know that there's an interface, svn_fs_commit_txn, that can bypass both the pre- and
post-commit hooks. But it is not used in regular commits from the command line, is it?

> * There's no guarantee that other commits won't happen before your
> post-commit hook is run; so whatever you do with the repository in
> post-commit may have to deal with conflicts, which is not fun to
> automate.

I understand that, but I don't expect conflicts: the actions by the post-commit will only
touch certain properties that are not set manually. After all, I can reject an attempt to set
those properties in the pre-commit.

Regards,
Alexey.
Received on 2014-08-12 08:31:14 CEST

This is an archived mail posted to the Subversion Users mailing list.