[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: svn commit: r18701 - trunk/contrib/client-side

From: David James <djames_at_collab.net>
Date: 2006-03-06 21:09:58 CET

> Is bisect distributed with Python 2.0? I'm without net access as I
> write this, so can't check, but I see my Python 2.3 has it.

Yes, see http://svn.python.org/projects/python/branches/release20-maint/Lib/bisect.py

NOTE: Only the "bisect" and "insort" functions are supported in Python
2.0. bisect_left and bisect_right are only supported in Python 2.1 and
later, so I was careful to avoid usage of these functions.

> > + def __init__(self, url, begin, end, find_propchanges=False):
> > + """
> > + Create a new RevisionLog object, which stores, in
> > self.revs, a list
> > + of the revisions which affected the specified URL between
> > begin and
> > + end. If find_propchanges is True, self.propchange_revs
> > will contain a
> > + list of the URLs which changed properties directly on the
> > specified
>
> List of the *revisions* which changed properties.

Ah, good catch. Will fix.

>
> > + # Read the log to look for revision numbers and merge-
> > tracking info
> > + self.revs = []
> > + self.propchange_revs = []
> > + for line in launchsvn("log %s" % log_opts):
> > + m = revision_re.match(line)
> > + if m:
> > + rev = int(m.groups()[0])
> > + self.revs.append(rev)
> > + elif srcdir_change_re.match(line):
> > + self.propchange_revs.append(rev)
>
> I guess we're counting upon a line to match ^r before finding a line
> that matches the M line, otherwise we'll append a None to
> propchange_revs? Isn't it better Python style to set rev to None at
> the top of the loop?

If rev is unset, I believe Python will report a NameError exception.
All log messages should specify a revision number before showing the
changes to individual files.

I don't care whether we put "rev = None" at the top of the loop or
not, but Giovanni recommended earlier that I remove the "rev = None"
line, in http://svn.haxx.se/dev/archive-2006-03/0143.shtml

> > +class VersionedProperty:
> > + """A read-only, cached view of a versioned property"""
> > +
> > + def __init__(self, url, name):
> > + """View the history of a versioned property at URL with
> > name"""
>
> Maybe rename 'name' to 'propname', to be a little clearer.

Sure, I'll do this.

> Could you also add an explanation of the method you're using here
> (such as bisect) which will explain why revs is set to [0] and values
> to [None] initially.

I've implemented a cache for property values over ranges of revisions.
To do this, I store in "revs" the first revision where a particular
new propvalue is initialized, and I store in value the value of the
prop at that revision. Initially, we set revs to [0] and values to
[None] to indicate that, as of revision 0, we know nothing about the
value of the property (thus the value is None).

Later, if you run the "load" member function with a Subversion log, we
can cache the entire range of the log by noting each revision in which
the property was changed. At the end of the range of the log, we
invalidate our cache by adding the value "None" to our cache for any
revisions which fall outside the range of our log.

An example: We know that the 'svnmerge' property was added in revision
10, and changed in revisions 21. We gathered log information up until
revision 40.

revs = [0, 10, 21, 40]
values = [None, "val1", "val2", None]

What these values say:
- From revision 0 to revision 9, we know nothing about the property.
- In revision 10, the property was set to "val1". This property stayed
the same until revision 21, when it was changed to "val2".
- We don't know what happened after revision 40.

I'll add comments to the code to explain this. Let me know if there's
any other details that are needed.

> > + def get(self, rev=None):
> > + """
> > + Get the property at revision 'rev'. If rev is not
> > specified, get
> > + the property at 'head'.
> > + """
> > + if rev is not None:
> > + i = bisect(self.revs, rev) - 1
> > + if self.values[i] is not None:
> > + return self.values[i]
> > + return self.raw_get(rev)
>
> Why not insert the values returned by raw_get using the result from
> bisect() so you don't have to do another call raw_get() again?
> You've already got the i value you need, so mine as well use it.

Sure, we could cache this preperty value.

Keep in mind: Because we have not run a verbose log, we do not know
the revision range for which this property will be valid. As a result,
we need to be conservative, and assume that it's possible that this
property could have changed in the very next revision. So this caching
would only apply to a single revision.

I'm planning to later extend our caching system to include additional
functions, such as "cache_range" which caches a particular property
for a supplied range of values. When I implement this change, I'll
also add caching here.

> > +
> > + def keys(self):
> > + """
> > + Get a list of the revisions in which this property changed.
> > + """
> > + return self.revs[1:-1]
>
> When I see a method names keys(), I don't normally expect an ordered
> list to be returned. So later down in the code, when I see it being
> used, I wonder if it needs to be sorted.
>
> Can we come up with a better fitting name, such as changed_revs,
> which matches the documentation string.

Yes, this is better. I'll change this.

> > @@ -671,8 +729,12 @@
> > if long(begin) > long(end):
> > return RevisionSet(""), RevisionSet(""), RevisionSet("")
> >
> > - revs, merges = find_changes(url, begin, end, find_reflected)
> > - revs = RevisionSet(",".join(map(str,revs)))
> > + logs[url] = RevisionLog(url, begin, end, find_reflected)
> > + mergeprops[url] = VersionedProperty(url, opts["prop"])
> > + mergeprops[url].load(logs[url])
> > + blockprops[url] = VersionedProperty(url, opts["block_prop"])
> > + blockprops[url].load(logs[url])
> > + revs = RevisionSet(",".join(map(str,logs[url].revs)))
>
> Are you iterating over the block_prop property, which wasn't done
> before, to prevent a property conflict on it?

I was planning to implement code to prevent property conflicts on
blockprops, but I decided against it because it wasn't clear whether
we wanted to merge blockprops from one branch to another. Currently,
blockprops are sometimes merged from one branch to another -- and
sometimes not -- depending on whether the merge conflicts. Which do
you think is better? Should blockprops be merged, or not?

> I don't see any code currently using that, so can we remove it to
> help improve performance?

Yes. I'll do this.

> BTW, now that logs, mergeprops and blockprops are global, we could
> pickle them into file in ~/.subversion/svnmerge.py dictionary keyed
> by the repository's UUID, so we can save lookups in the future.

Good idea!

Cheers,

David

--
David James -- http://www.cs.toronto.edu/~james
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Mon Mar 6 21:11:54 2006

This is an archived mail posted to the Subversion Dev mailing list.

This site is subject to the Apache Privacy Policy and the Apache Public Forum Archive Policy.