[svn.haxx.se] · SVN Dev · SVN Users · SVN Org · TSVN Dev · TSVN Users · Subclipse Dev · Subclipse Users · this month's index

Re: Total failure in dav layer when >1024 files already open

From: Jonathan Gilbert <o2w9gs702_at_sneakemail.com>
Date: 2006-06-05 21:23:05 CEST

At 01:37 PM 05/06/2006 -0400, Michael Sweet wrote:
[snip]
>For select(), you can just loop through your state data and use the
>FD_ISSET macro to test the corresponding fd_set. That is O(n).
>
>The situation is somewhat similar for fd_set vs. pollfd array
>management. Adding and/or removing a file descriptor in a dynamic
>pollfd array requires O(n log n) for every addition or removal and
>O(log n) for update. Doing so for fd_set is O(1) (constant time)
>for all of the UNIX fd_set implementations I have worked with.

You say 'UNIX' here explicitly, so I suspect you maybe already know what
I'm about to say. :-)

I just wanted to point out that on Windows, FD_SET, FD_CLR and FD_ISSET are
O(n), not O(1). FD_ISSET even involves a function call. This is the natural
trade-off between the efficiency of a bitset and the ability to use
arbitrarily-large file descriptors. UNIX systems chose efficiency, while
Microsoft chose to limit only the total number of file descriptors and not
their actual values.

UNIX systems were so interested in efficiency, in fact, that they didn't
even bother to check for errors. Thus, if you get a file descriptor that is
too large and pass it into FD_SET or FD_CLR, what you get is stack or heap
corruption. With Microsoft's version, when you try to add too many file
descriptors, the ones which are over the limit are silently ignored. Not a
perfect solution either, but it is at least detectable. In any event, the
fd_set interface is too simplified, too generalized, for code to figure out
what will work and what won't without assuming certain internal
implementation details.

On Windows, FD_SETSIZE is precisely the number of fds you can place into an
fd_set, while on UNIX systems, it is typically one greater than the maximum
file descriptor value that can be set. Some UNIX systems pay attention if
you set FD_SETSIZE before including the requisite headers, while others
(like an ancient linux box of mine, for instance -- so crazed for
efficiency that it implements the various FD_* macros in inline assembler!)
define everything about fd_sets with two underscores first in "bits"
headers, and then typedef/#define them over. Clearly, changing the value of
FD_SETSIZE won't work in that situation (though changing __FD_SETSIZE
might). Incidentally, Windows is counted among those operating systems
which *do* allow the value to be changed by the user :-)

At least there's one thing that Windows' version does more efficiently than
the traditional UNIX bitset approach: FD_ZERO is O(1), because all it has
to do is reset the count of items in the list to 0. :-)

Jonathan Gilbert

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
Received on Mon Jun 5 21:25:21 2006

This is an archived mail posted to the Subversion Dev mailing list.