Eugene Kuleshov wrote:
> Amiri Arash OSF SD wrote:
>> we have some assumptions in place. meaning that, we expect projects to
>> have a certain structure. the "root" folder of the project must be right
>> above the trunk folder. So, we assume a structure like:
>> this way, we find one of those three folders and take the parent folder
>> from there. We then checkout everything from the repository from there
>> (or, if we already checked out something and created the cache- and
>> indexfile, we update the files, by comparing the revision numbers)
> Can you please elaborate how this will handle repositories like apache
> For example Maven 2.x code is located at
My colleague has been a bit short on information, so I will try to elaborate.
Our program finds the complete history of single files (not directories),
including all (parallel) branches and tags. The goal is to implement an SVN
history view that has all the benefits of the CVS history view.
There are programs that draw nice graphs, but all the ones I tried were far
too slow. They analyze all revisions of the whole repository (at least
TortoiseSVN does that). We have over 100,000 revisions in our repository and
Tortoise needs 20 minutes to draw one graph, so it is completely useless for us.
Our approach only works if the following assumptions are true (so it's
useless for repositories that are organized in another way):
1. Everything that belongs to a project is contained in some root directory.
There are no copy operations that originate from outside of this directory.
2. Nothing else resides inside this project root directory (this would make
the log cache bigger and slow down everything).
3. The project root directory has never been renamed or moved. (This should
be fixed, because it is a severe limitation)
This is what we are doing:
We request the complete log of the project root directory (including changed
paths) - in your example: https://svn.apache.org/repos/asf/maven/components
This log contains the complete history of every object that belongs to
maven/components, including all branches and tags.
This is a time-consuming operation for big projects, so we store the
complete log in a local cache file (client side). There is a separate cache
for every project root directory. Every time the history function is
invoked, only new revisions are requested and appended to the cache.
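
To make the caching step concrete, here is a minimal sketch of the
incremental update. It is not our actual code: the one-JSON-object-per-line
file format, the entry layout (a dict with a "rev" key) and the fetch_log
callable are all assumptions made up for the illustration. fetch_log stands
in for whatever asks the server for the log from a given revision onwards.

```python
import json
import os


def load_cache(path):
    """Return the cached log entries (oldest first); empty if no cache yet."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def update_cache(path, fetch_log):
    """Request only revisions newer than the newest cached one and append them.

    fetch_log(start_rev) is a hypothetical callable that returns the log
    entries with rev >= start_rev, oldest first.
    """
    entries = load_cache(path)
    last_rev = entries[-1]["rev"] if entries else 0
    # Guard against a server that returns revisions we already have.
    new_entries = [e for e in fetch_log(last_rev + 1) if e["rev"] > last_rev]
    with open(path, "a", encoding="utf-8") as f:
        for entry in new_entries:
            f.write(json.dumps(entry) + "\n")
    return entries + new_entries
```

The first call fetches everything from revision 1; every later call only
asks for revisions after the last one stored in the file.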
Using this cache, we find the first ancestor of the requested file by
starting at BASE and going backwards through the cache until we find
an "add" without a copy-source. Then we go forward again and follow all
copy commands. So we are able to calculate a set of paths of all copies of
the file which we are searching for (the set grows with each copy and
shrinks with each delete). Every revision that touches a path contained in
the set is added to the result list. In reality the story is a bit
more complicated, but this is the basic algorithm.
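
The basic algorithm can be sketched roughly as follows. This is a simplified
illustration, not our real implementation: the log layout (a list of dicts
with "rev" and "changes", each change having "action", "path" and an
optional "copyfrom" path) is an assumption, the copy-source revision is
ignored, and the sketch chooses to record a revision whenever it modifies,
creates or removes a member of the path set.

```python
def is_under(path, root):
    """True if path is root itself or lies somewhere below it."""
    return path == root or path.startswith(root + "/")


def remap(path, src, dst):
    """Relocate path from below src to below dst; None if not under src."""
    return dst + path[len(src):] if is_under(path, src) else None


def find_origin(log, path):
    """Walk backwards until an add without a copy source is found."""
    for entry in reversed(log):
        for ch in entry["changes"]:
            if ch["action"] not in ("A", "R"):
                continue
            src = ch.get("copyfrom")
            if src is None:
                if ch["path"] == path:
                    return entry["rev"], path
            else:
                # Created by this copy, directly or via a parent directory?
                older = remap(path, ch["path"], src)
                if older is not None:
                    path = older
                    break
    return None


def file_history(log, path):
    """Return the revisions that belong to the history of the given file."""
    origin = find_origin(log, path)
    if origin is None:
        return []
    origin_rev, origin_path = origin
    live = {origin_path}          # all currently existing copies of the file
    result = [origin_rev]
    for entry in log:
        if entry["rev"] <= origin_rev:
            continue
        before = set(live)        # copies read the state before this revision
        touched = False
        for ch in entry["changes"]:
            p, action = ch["path"], ch["action"]
            if action in ("D", "R"):
                gone = {q for q in live if is_under(q, p)}
                live -= gone      # the set shrinks with each delete
                touched = touched or bool(gone)
            if action in ("A", "R") and ch.get("copyfrom"):
                copies = {remap(q, ch["copyfrom"], p) for q in before} - {None}
                live |= copies    # the set grows with each copy
                touched = touched or bool(copies)
            if action == "M" and p in live:
                touched = True
        if touched:
            result.append(entry["rev"])
    return result
```

Because the backward pass always ends at the same first ancestor, asking for
the file on trunk or for one of its copies on a branch gives the same list.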
Additionally we calculate the last modified revision of each tag and fill in
the "tags" column of the history view using this information.
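
One possible way to fill the "tags" column is sketched below. It is only an
illustration of the idea under assumptions of its own: file_revs is taken to
be the sorted list of revisions that changed the file on the tagged line,
and tags is assumed to map each tag name to the revision the tag was copied
from. Both names are invented for this example.

```python
from bisect import bisect_right


def tags_column(file_revs, tags):
    """Map each file revision to the names of the tags that contain it.

    file_revs: sorted revisions that changed the file (hypothetical input).
    tags: dict of tag name -> revision the tag was copied from (hypothetical).
    """
    column = {}
    for name, copied_from in sorted(tags.items()):
        # Index of the first file revision that is newer than the tag source.
        i = bisect_right(file_revs, copied_from)
        if i:  # the file already existed when the tag was made
            column.setdefault(file_revs[i - 1], []).append(name)
    return column
```

A tag is attached to the last revision of the file that is not newer than
the revision the tag was copied from; tags made before the file existed are
left out.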
Received on 2008-04-02 00:15:28 CEST