
Re: [tortoisesvn] r23238 committed - Do not copy CFileTextLines before converting...

From: Oto BREZINA <otik_at_printflow.eu>
Date: Thu, 30 Aug 2012 20:58:37 +0200

On 2012-08-29 20:47, Stefan Küng wrote:
> On 28.08.2012 22:53, Oto BREZINA wrote:
>
>>> Also keep in mind that the line array is used by all views and even in
>>> multiple threads. And your custom array isn't thread safe.
>> It is not (multithread safe) - it was intended to be simple and fast. In case
>> multithreading is needed, keeping vector makes sense, but I did not find
>> any reference to FileTextLines lines other than Load, Patch (methods are
>> not called?), ... all seem to be single-thread related.
>> After load(load, convert,diff) lines from FileTextLines are not
>> referenced at all. Params are.
>> Save uses FileTextLines in its scope as well as Patch methods.
>>
>> See the proof-of-concept patch (separated Lines and Params) - build it (that's
>> the proof), but don't try to run it (it does nothing).
> you're right, it's not used by multiple threads.
> But if we ever change that, we would have to rewrite a whole lot of code
> since your class isn't thread safe.
> That's another reason why I really prefer to use existing classes.
If you are talking about the proof patch: separating file line data from
file params seems to me a good idea anyway.
>
> Do you have any data on speed comparing a vector (do we need a vector or
> could a deque work as well?) and your class?
>
> What's the load/save time for files in different sizes, e.g. 100kb, 1M,
> 10M, 20M, 50M, maybe even 100M?
I used only one file pair; for really small files around 100kb, load time
does not really matter ...

My test files:
a.txt
562,614 B
UTF-8
464 lines with around 4096 chars/line
- stresses line string and draw-line operations

b.txt
73,673,387 B
UTF-16BE
139,939 mostly empty lines
- stresses UTF-16BE conversion and mainly line addition/removal

Note: DoTwoWayDiff and Save were expected to take about the same time; the
biggest difference is in adding elements. Save is fast enough ...
Every test was run 3x for every configuration; the best time is used.
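The "run 3x, keep the best time" approach could be sketched roughly as below. This is only an illustration: BestOfMicroseconds is a hypothetical helper name, not something from the TortoiseMerge sources, and it measures wall-clock time with std::chrono, whereas the numbers in this mail are user-mode CPU times.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <limits>

// Hypothetical helper (not part of TortoiseSVN): runs `work` `runs` times
// and returns the best (smallest) elapsed time in microseconds.
// Assumption: wall-clock time is measured here; the figures in the mail
// are user-mode CPU times, which need a platform-specific API.
inline std::int64_t BestOfMicroseconds(const std::function<void()>& work, int runs = 3)
{
    assert(runs > 0);
    std::int64_t best = std::numeric_limits<std::int64_t>::max();
    for (int i = 0; i < runs; ++i)
    {
        const auto start = std::chrono::steady_clock::now();
        work();
        const auto stop = std::chrono::steady_clock::now();
        const auto us =
            std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
        // Keep the smallest time seen so far.
        best = std::min<std::int64_t>(best, us);
    }
    return best;
}
```

Taking the minimum rather than the average reduces the influence of unrelated system activity on a single run.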

* Vector - original solution
1x CDiffData::Load
UserMode[µs] 5,428,834
2x CFileTextLines::Save
UserMode[µs] 93,600 46,800 46,800 46,800
1x CDiffData::DoTwoWayDiff
UserMode[µs] 2,121,613

* Vector + reserve
1x CDiffData::Load
UserMode[µs] 4,477,228
2x CFileTextLines::Save
UserMode[µs] 124,800 62,400 46,800 78,000
1x CDiffData::DoTwoWayDiff
UserMode[µs] 2,106,013

* OwnClass
1x CDiffData::Load
UserMode[µs] 4,258,827
2x CFileTextLines::Save
UserMode[µs] 109,200 54,600 46,800 62,400
1x CDiffData::DoTwoWayDiff
UserMode[µs] 2,168,413

* OwnClass + reserve
1x CDiffData::Load
UserMode[µs] 4,196,426
2x CFileTextLines::Save
UserMode[µs] 124,800 62,400 62,400 62,400
1x CDiffData::DoTwoWayDiff
UserMode[µs] 2,090,413

* Deque - new best
1x CDiffData::Load
UserMode[µs] 4,087,226
2x CFileTextLines::Save
UserMode[µs] 93,600 46,800 31,200 62,400
1x CDiffData::DoTwoWayDiff
UserMode[µs] 2,090,413

Conclusion: from the test results it seems that Vector depends on reserve
too much; possibly, if its expansion algorithm is overridden, we can
expect sufficient speed (though not faster than the own class).

On the other hand, Deque beats all other tested solutions on the tested
data. Since it does not need to know the size in advance, it seems to be
the best solution for this problem.
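For illustration, the three standard-container variants compared above differ only in how the line storage grows while lines are appended. A minimal sketch, assuming a simplified Line struct and hypothetical function names (FillLines, FillVectorReserved) that do not come from CFileTextLines:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <vector>

// Simplified stand-in for one loaded line; the real class holds more.
struct Line
{
    std::wstring text;
    int ending;
};

// Appends `count` empty lines without knowing the size up front.
// With std::vector this reallocates and moves elements as it grows;
// with std::deque new blocks are added and existing elements stay put.
template <typename Container>
Container FillLines(std::size_t count)
{
    Container lines;
    for (std::size_t i = 0; i < count; ++i)
        lines.push_back(Line{std::wstring(), 0});
    return lines;
}

// "Vector + reserve": one allocation up front, but the line count
// (or a good estimate of it) has to be known before loading.
inline std::vector<Line> FillVectorReserved(std::size_t count)
{
    std::vector<Line> lines;
    lines.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        lines.push_back(Line{std::wstring(), 0});
    return lines;
}
```

FillLines<std::vector<Line>> corresponds to the "Vector" case, FillVectorReserved to "Vector + reserve", and FillLines<std::deque<Line>> to "Deque". The sketch only shows the growth pattern and does not reproduce the timings above.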

>
> Optimizing is good, but most of the time using existing
> APIs/containers/classes is still the best option.
>
>
>
> Stefan
>

-- 
Oto ot(ik) BREZINA - 오토
------------------------------------------------------
http://tortoisesvn.tigris.org/ds/viewMessage.do?dsForumId=757&dsMessageId=3002265
Received on 2012-08-30 20:58:49 CEST

This is an archived mail posted to the TortoiseSVN Dev mailing list.
