Harvey, Edward wrote:
>> I agree wholeheartedly with what John wrote in his reply. Scanning a
>> disk can be done a lot faster than we're currently doing; centralized
>> metadata helps, but there are obviously other tricks, too -- just as an
>> example, look at how quickly Google Picasa can scan your whole disk for
>> images!
>>
>
> I challenge the idea that scanning the whole tree could possibly ever be "fast."

Ignoring caching of filesystem metadata, for an initial estimate,
performance is bound by the number of seeks and non-sequential accesses.
Our current WC scanning code excels in both categories, because of the
explosion of small files that have to be read, and the out-of-order
directory scanning.
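
To illustrate the seek-order point, here is a minimal Python sketch. It is not Subversion code, and the name scan_tree_inode_order is made up for this example. It stats the entries of each directory in inode order instead of listing order, on the assumption that inode order roughly tracks on-disk layout. That holds on some filesystems and not on others, so treat it as a heuristic.

```python
import os


def scan_tree_inode_order(root):
    """Return {path: size} for the regular files under root.

    Hypothetical helper: within each directory, entries are stat'ed in
    inode order rather than in the order the directory listing returns
    them. Whether that saves seeks depends on the filesystem.
    """
    sizes = {}
    pending = [root]
    while pending:
        directory = pending.pop()
        # Read the whole directory first, then sort by inode number.
        with os.scandir(directory) as it:
            entries = sorted(it, key=lambda e: e.inode())
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                pending.append(entry.path)
            elif entry.is_file(follow_symlinks=False):
                sizes[entry.path] = entry.stat(follow_symlinks=False).st_size
    return sizes
```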

There is vast room for improvement here, but it implies a radical change
in WC design; not only centralizing metadata, but also turning the whole
scanning concept upside down. We'll have to give control of the scan to
the code that knows about the underlying organization (i.e., stop doing
it in editor drives), and likely have to stop trying to be streamy
(i.e., trade memory footprint for performance).
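
As a rough sketch of what "not streamy" could look like (Python for brevity; snapshot and compare are hypothetical names, not anything in the current WC code): walk the tree once, hold all the file metadata in memory, and only then compare it against a single centralized table of recorded metadata. Using size and mtime as the change test is an assumption made for the example.

```python
import os


def snapshot(root):
    """Walk root once and return {relative path: (size, mtime_ns)}.

    The whole result is held in memory; nothing is streamed to a consumer.
    """
    snap = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            snap[os.path.relpath(full, root)] = (st.st_size, st.st_mtime_ns)
    return snap


def compare(recorded, current):
    """Diff two snapshots; return sorted (added, modified, deleted) lists.

    `recorded` stands in for the centralized metadata store.
    """
    added = sorted(current.keys() - recorded.keys())
    deleted = sorted(recorded.keys() - current.keys())
    modified = sorted(
        p for p in current.keys() & recorded.keys() if current[p] != recorded[p]
    )
    return added, modified, deleted
```

The comparison never touches the disk; all I/O happens in the single walk inside snapshot, and the price is keeping the whole table in memory.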

Just recently I came across an interesting example of how naively
written code can seriously degrade performance of a disk-based
application; I happened to connect a large NTFS USB disk full of videos
to a Mac. Both Windows and Mac OS are too smart for their own good and
aggressively scan new volumes. For that particular drive, it took
Windows about a minute or so before it magnanimously allowed me to start
using the drive. I gave up on the Mac after 10 minutes. ...

-- Brane
Received on 2008-04-15 17:00:47 CEST