
RE: Search through repository

From: Rob Hubbard <Rob.Hubbard_at_celoxica.com>
Date: 2006-10-18 14:34:17 CEST


I think that there is much more to this feature request than is initially apparent. Here are some of my thoughts on the subject...

There are very many different things that you might wish to search within, including:
    * paths (added, deleted or modified) [as with "svn log -qv" or "svn diff --summarize"]
    * the change "mode" itself, i.e. added/deleted/modified/replaced and with/without history
    * copied path sources and destinations, and their revisions
    * lines within files [as with "svn cat"]
    * lines added, deleted and/or modified [as with "svn diff"]
    * the line change "mode" itself, i.e. added/deleted/unchanged
    * commit comments [as with "svn log"]
    * properties and property changes [as with "svn proplist", "svn propget", ...]
    * usernames
    * dates(?)

There are different ways to search:
    * find all matches in a range of revisions
    * search backwards (or forwards) through a range of revisions until the first match is found
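
A minimal sketch of these two search modes, assuming the revisions have already been fetched into simple (revision, log message) pairs. The names `find_all` and `find_first` and the sample data are hypothetical, chosen for illustration only; nothing here is part of SVN.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

# Hypothetical record: (revision number, log message).
Entry = Tuple[int, str]


def find_all(entries: Iterable[Entry],
             pred: Callable[[Entry], bool]) -> Iterator[Entry]:
    """Lazily yield every entry in the range that matches the predicate."""
    return (entry for entry in entries if pred(entry))


def find_first(entries: Iterable[Entry],
               pred: Callable[[Entry], bool],
               backwards: bool = True) -> Optional[Entry]:
    """Walk the range backwards (or forwards) and stop at the first match."""
    ordered = sorted(entries, key=lambda entry: entry[0], reverse=backwards)
    return next(find_all(ordered, pred), None)


# Made-up sample data, not taken from any real repository.
SAMPLE_ENTRIES = [
    (1, "initial import"),
    (2, "fix bug 12"),
    (3, "update docs"),
    (4, "fix bug 12 again"),
]


def mentions_bug_12(entry: Entry) -> bool:
    return "bug 12" in entry[1]
```

Because `find_all` is lazy, `find_first` stops examining entries as soon as one matches, which matters when each entry is expensive to fetch from a repository.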

There are different ways to generate a reply:
    * show only the particular lines that match
    * show entire objects (log messages, file lists, ...) containing a line that matches
    * show some entirely different information when a match is found
    * restrict the reported paths to the given path, or not (e.g. "svn log -qv" can show paths outside the given path)
    * report non-matches rather than matches

Searches are made more complicated by "renames" and copies. "Pegging" (@) needs to be supported or otherwise dealt with.

When "data mining", it is often desirable to relate revisions to tags. The "tag" revision should be the copied revision of the copied path, rather than the creation revision of the created path. Care must be taken with how complex tags are dealt with.
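
As a rough sketch of that distinction, the following reads the XML produced by "svn log --xml -v" and records, for each path created by a copy under a tags directory, both the revision it was copied from and the revision in which it was created. The `/tags/` layout, the function name `tag_sources` and the sample log are assumptions made for illustration; complex tags (assembled from several copies, or modified after creation) are not handled.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape of "svn log --xml -v" output.
SAMPLE_TAG_LOG = """<log>
<logentry revision="120">
<author>alice</author>
<date>2006-10-01T12:00:00.000000Z</date>
<paths>
<path action="A" copyfrom-path="/trunk" copyfrom-rev="117">/tags/release-1.0</path>
</paths>
<msg>Tag release 1.0</msg>
</logentry>
<logentry revision="121">
<author>bob</author>
<date>2006-10-02T09:30:00.000000Z</date>
<paths>
<path action="M">/trunk/main.c</path>
</paths>
<msg>Fix bug 42</msg>
</logentry>
</log>"""


def tag_sources(xml_text, tags_root="/tags/"):
    """Map each copied path under tags_root to its source and creation revisions."""
    tags = {}
    for entry in ET.fromstring(xml_text).iter("logentry"):
        created_rev = int(entry.get("revision"))
        for path in entry.iter("path"):
            copied_rev = path.get("copyfrom-rev")
            # Only paths that were created by a copy carry copyfrom attributes.
            if copied_rev is not None and (path.text or "").startswith(tags_root):
                tags[path.text] = {
                    "copied_from": path.get("copyfrom-path"),
                    "copied_rev": int(copied_rev),   # the "tag" revision
                    "created_rev": created_rev,      # when the tag path appeared
                }
    return tags
```

In the sample, the tag was created in revision 120 but describes the state of /trunk at revision 117, so a change committed in revision 118 or 119 is not part of that release.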

Some example "queries" might be:
    - to search for a revision comment mentioning a bug fix
    - to search for the revision where a particular file was deleted
    - to search for all copies from (or to) a particular path
    - to determine the releases or builds (i.e. tags) between which a given bug was claimed fixed
    - to produce a log of all changes made by a particular developer
    - to find a revision where a given property on a given file was modified
    - to show the whole log message for any revision where some line in the message contains a given word
    - to show just the changed path list for revisions where the message contains a given word
    - to search the full change history of a given line, or range of lines, of a file (beyond just the most recent changes as shown by "svn blame")
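
A few of these queries can be sketched as a small script over the XML from "svn log --xml -v". This is only an illustration under stated assumptions: the function names and the sample log are hypothetical, and the script reads a string rather than invoking svn.

```python
import xml.etree.ElementTree as ET

# Made-up sample in the shape of "svn log --xml -v" output.
SAMPLE_LOG = """<log>
<logentry revision="10">
<author>alice</author>
<paths><path action="A">/trunk/parser.c</path></paths>
<msg>Add parser</msg>
</logentry>
<logentry revision="11">
<author>bob</author>
<paths><path action="M">/trunk/parser.c</path></paths>
<msg>Fix bug 42 in parser</msg>
</logentry>
<logentry revision="12">
<author>alice</author>
<paths><path action="D">/trunk/parser.c</path></paths>
<msg>Remove obsolete parser</msg>
</logentry>
</log>"""


def parse_log(xml_text):
    """Turn each <logentry> into a plain dict."""
    for entry in ET.fromstring(xml_text).iter("logentry"):
        yield {
            "revision": int(entry.get("revision")),
            "author": entry.findtext("author", default=""),
            "message": entry.findtext("msg", default=""),
            "paths": [(p.get("action"), p.text) for p in entry.iter("path")],
        }


def revisions_deleting(entries, path):
    """Revisions in which the given path was deleted."""
    return [e["revision"] for e in entries if ("D", path) in e["paths"]]


def changes_by(entries, author):
    """Revisions committed by the given developer."""
    return [e["revision"] for e in entries if e["author"] == author]


def messages_mentioning(entries, word):
    """Whole log message for revisions where some line contains the word."""
    return [(e["revision"], e["message"]) for e in entries
            if any(word in line for line in e["message"].splitlines())]


def paths_where_message_mentions(entries, word):
    """Changed path list only, for revisions whose message contains the word."""
    return {e["revision"]: e["paths"] for e in entries if word in e["message"]}
```

The last two functions use the same match but report different parts of the entry, which is the kind of separation between "what to match" and "what to show" that a plain text filter cannot provide.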

There is no doubt that something more sophisticated than just piping "svn diff" or "svn log" through grep is required. This is because grep will not be able to make use of the structure in the SVN output.

There is so much variation, that I really feel that tailor-made scripts are necessary. Or, perhaps there is scope for a "subversion query language"!

On the other hand, there might be a call to add some features to SVN to make such scripting easier, or faster. Some other more basic additions might help with parsing output from svn in scripts. For example:
    * add "--xml" option to "svn diff"
    * add option to "svn diff" to output all lines, or some fixed number of context lines (this is currently possible using an external diff tool, but the --xml option above would not be available)
    * allow SVN diff to produce a report of all the individual deltas across a range of revisions, rather than just the overall delta
    * improve the output for changes to properties (e.g. show them in the same way that file changes are shown)
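
Without structured diff output, a script has to recover the structure from the unified diff text itself. A minimal sketch, assuming the usual "Index:" file headers and "@@" hunk headers of "svn diff" output; the function name `classify_diff_lines` and the sample diff are made up for illustration, and property-change blocks are skipped rather than parsed.

```python
def classify_diff_lines(diff_text):
    """Yield (path, mode, text) for every line inside a hunk of a unified diff."""
    modes = {" ": "unchanged", "+": "added", "-": "deleted"}
    path = None
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("Index: "):
            # A new file section starts; its ---/+++ headers are not hunk lines.
            path, in_hunk = line[len("Index: "):], False
        elif line.startswith("Property changes on: "):
            # Property changes use a different layout; ignore them here.
            in_hunk = False
        elif line.startswith("@@"):
            in_hunk = True
        elif in_hunk and line[:1] in modes:
            yield path, modes[line[0]], line[1:]


# Made-up sample in the shape of "svn diff" output.
SAMPLE_DIFF = "\n".join([
    "Index: trunk/foo.c",
    "===================================================================",
    "--- trunk/foo.c\t(revision 10)",
    "+++ trunk/foo.c\t(revision 11)",
    "@@ -1,3 +1,3 @@",
    " int main() {",
    "-  return 1;",
    "+  return 0;",
    " }",
])
```

A search for a word among added lines only then becomes a filter on the mode field, for example `[t for p, m, t in classify_diff_lines(SAMPLE_DIFF) if m == "added" and "return" in t]`.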

Also, an SVN "batch mode" has been suggested before now, which seems like a very good idea to me.

(Sorry if any of these things already have been implemented.)


> -----Original Message-----
> From: Tomasz Pajak [mailto:spidertp@o2.pl]
> Sent: 12 October 2006 06:54
> To: users@subversion.tigris.org
> Subject: Search through repository
> Hello everybody, I've been asking in Tortoise users group
> about function
> "Search through repository" and they said, that I should ask
> about it here.
> Because they're using SVN libraries to connect to the
> repository, they
> don't have possibility to make such a function.
> So, is this possible to add it to your libraries?
> Keep up the good work!
> Best regards,
> Tomasz Pajak
> --
> Tomasz Pająk
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
> For additional commands, e-mail: users-help@subversion.tigris.org


Received on Wed Oct 18 14:35:40 2006

This is an archived mail posted to the Subversion Users mailing list.
