Re: Statsdlg - first patch: data gathering upgrade

From: Stefan Küng <tortoisesvn_at_gmail.com>
Date: 2007-10-07 19:35:03 CEST

Andreas Nicolai wrote:

> while I'm hacking away on the stats dialog, I created (+attached) the
> first patch that includes the reworking of the stats data gathering
> algorithm.
>
> The patch only affects the files: StatGraphDlg.h and StatGraphDlg.cpp
> and is created against revision 10908.
>
> Here's a brief review of the code changes:
>
[snip]

Thanks a lot for your patch!
I've committed the patch in revision 10914.

> Just one thing I noted: because the interval is aligned to the beginning
> and end of the week, a revision range that starts and ends in the middle
> of a week may be reported as one week longer than the time span actually
> is. However, if I don't align the interval with the start of the week, a
> weekly interval may start on a Wednesday and last until the following
> Tuesday. For a different revision range (maybe including the previous
> 200 revs) the interval may run from Friday to the following Thursday.
> This results in different min/max commit and file change counts. So I
> guess I can't get around the aligning part, and for the improved data
> gathering algorithm I need the m_minDate.
>
>
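
The alignment described above can be sketched roughly as follows. This is only an illustration, not the code from the patch: the helper names are hypothetical, times are assumed to be seconds since 1970-01-01 UTC, and weeks are assumed to start on Monday (the real dialog may use the locale's first day of the week).

```cpp
#include <cstdint>

// Hypothetical helpers, not taken from StatGraphDlg.cpp.
// Assumption: times are seconds since 1970-01-01 00:00 UTC, weeks start on Monday.
const std::int64_t kSecondsPerDay  = 86400;
const std::int64_t kSecondsPerWeek = 7 * kSecondsPerDay;

// Floor division, so that times before the epoch round downwards too.
inline std::int64_t FloorDiv(std::int64_t a, std::int64_t b)
{
    std::int64_t q = a / b;
    if ((a % b != 0) && ((a < 0) != (b < 0)))
        --q;
    return q;
}

// Returns midnight (UTC) of the Monday on or before t.
inline std::int64_t AlignToWeekStart(std::int64_t t)
{
    const std::int64_t days = FloorDiv(t, kSecondsPerDay);
    // 1970-01-01 was a Thursday, which is weekday 3 when Monday is 0.
    const std::int64_t weekday = ((days + 3) % 7 + 7) % 7;
    return (days - weekday) * kSecondsPerDay;
}

// Week index of t, counted from the aligned week that contains minDate.
inline std::int64_t WeekIndex(std::int64_t t, std::int64_t minDate)
{
    return FloorDiv(AlignToWeekStart(t) - AlignToWeekStart(minDate),
                    kSecondsPerWeek);
}
```

With this kind of alignment, two revision ranges whose first commits fall on different days of the same week get identical week boundaries, so per-week min/max counts stay comparable. The cost is the one mentioned above: the first and last weeks may be only partly covered by the range.
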
> Design questions:
> 1. The data structures created in the ShowStats() dialog need to be used
> in the other statistics functions as well. Re-gathering the data would
> be a waste of time, so I would propose making these variables member
> variables of the dialog that get populated when the dialog is first
> shown. All other statistics views can then use the information and
> obtain/calculate specific other data. Would that make sense having these
> mappings and lists as member variables?

Sure. Whatever can be used in other class methods should be made a
member variable.
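
A minimal sketch of that idea, assuming a simplified dialog class: the data is gathered once, kept in member variables, and reused by every statistics view. The class name, method names and placeholder data are all hypothetical, and std::string / long stand in for stdstring / LONG.

```cpp
#include <map>
#include <string>

// Hypothetical sketch; the real statistics dialog has different members.
class StatsDialogSketch
{
public:
    StatsDialogSketch() : m_gathered(false), m_gatherCount(0) {}

    // Any statistics view can call this; the expensive pass runs only once.
    long CommitsInWeek(int week, const std::string& author)
    {
        EnsureDataGathered();
        // Use find() so that a lookup never inserts empty entries.
        auto weekIt = m_commitsPerAuthorAndWeek.find(week);
        if (weekIt == m_commitsPerAuthorAndWeek.end())
            return 0;
        auto authorIt = weekIt->second.find(author);
        return authorIt == weekIt->second.end() ? 0 : authorIt->second;
    }

    // How often the data was gathered (for illustration only).
    int GatherCount() const { return m_gatherCount; }

private:
    void EnsureDataGathered()
    {
        if (m_gathered)
            return;
        GatherData();
        m_gathered = true;
    }

    // Stand-in for the real pass over the log entries (placeholder data).
    void GatherData()
    {
        ++m_gatherCount;
        m_commitsPerAuthorAndWeek[0]["alice"] = 3;
    }

    bool m_gathered;
    int  m_gatherCount;
    std::map<int, std::map<std::string, long> > m_commitsPerAuthorAndWeek;
};
```
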

> 2. The maps for the commit and file change data are currently of type
> map<int, map<stdstring, LONG> >, so that data can be accessed by:
>
> LONG commits = commitsPerAuthorAndWeek[week_nr][author_name];
>
> However, the memory needed for storing the data could be reduced if the
> authors were identified by a number instead of a string, with the
> name/number connection made via yet another mapping. So, the
> statement above would look like:
>
> LONG commits = commitsPerAuthorAndWeek[week_nr][authorNumber[author_name]];
>
> Since the memory footprint of the statistics dialog is rather low
> compared to the log dialog, I would probably postpone this upgrade until
> later. Also, it would hurt readability of the code, so I'd prefer the
> way data is stored now. What are your thoughts on this?

Author names are usually short, so the memory we would save by converting
the authors to an ID number first wouldn't be that much.
If you want to do this, just go ahead. But as you said, that doesn't
have a high priority.
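
For reference, the ID-based variant could look roughly like the sketch below. It is an illustration under assumptions, not proposed code: AuthorTable and CountCommit are hypothetical names, and std::string / long stand in for stdstring / LONG.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper that hands out one small integer ID per author name.
class AuthorTable
{
public:
    // Returns the ID for name, assigning the next free ID on first use.
    int GetId(const std::string& name)
    {
        auto it = m_ids.find(name);
        if (it != m_ids.end())
            return it->second;
        const int id = static_cast<int>(m_names.size());
        m_ids.insert(std::make_pair(name, id));
        m_names.push_back(name);
        return id;
    }

    // Reverse lookup for display; id must have been returned by GetId().
    const std::string& GetName(int id) const { return m_names.at(id); }

    std::size_t Size() const { return m_names.size(); }

private:
    std::map<std::string, int> m_ids;    // name -> ID
    std::vector<std::string>   m_names;  // ID -> name
};

// week number -> (author ID -> commit count)
typedef std::map<int, std::map<int, long> > CommitsPerWeek;

inline void CountCommit(CommitsPerWeek& commits, AuthorTable& authors,
                        int week, const std::string& author)
{
    ++commits[week][authors.GetId(author)];
}
```

One detail to watch with a plain authorNumber[author_name] lookup: std::map::operator[] inserts a value-initialized entry (0) for an unknown key, which would collide with the first author's ID if IDs start at 0. The explicit GetId() above avoids that.
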

Stefan

-- 
        ___
   oo  // \\      "De Chelonian Mobile"
  (_,\/ \_/ \     TortoiseSVN
    \ \_/_\_/>    The coolest Interface to (Sub)Version Control
    /_/   \_\     http://tortoisesvn.net
Received on Sun Oct 7 19:35:18 2007

This is an archived mail posted to the TortoiseSVN Dev mailing list.