
Re: Total failure in dav layer when >1024 files already open

From: Michael Sweet <mike_at_easysw.com>
Date: 2006-06-05 16:29:49 CEST

Greg Hudson wrote:
> ...
>> Every OS other than Linux allows select() to work with an arbitrary
>> number of file descriptors; even the Linux kernel allows it, just not
>> glibc.
>
> I believe you're simply mistaken. Solaris has a default limit of 1024
> (or 65536 for 64-bit code). NetBSD has a default limit of 256. It's
> extremely unlikely that you can find even one OS which tries to do a
> dynamically-sized fd_set.

Yes, each OS has a default limit, but the point is that you can
allocate your own fd_set to get a larger number of FDs. CUPS has
been doing this for years, and it has only become a problem very
recently, now that glibc has disabled that particular feature.
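
To make the "allocate your own fd_set" idea concrete, here is a minimal sketch. It assumes the common UNIX layout in which fd_set is a plain bit array of longs, which POSIX does not guarantee; big_fdset_alloc() and demo_big_fdset() are hypothetical names, not CUPS code. The demo only touches a low-numbered descriptor, since FD_SET() on descriptors at or above FD_SETSIZE is exactly what may be refused on some C libraries.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: allocate a zeroed descriptor bitmap with room for
 * descriptors 0..max_fds-1, never smaller than one standard fd_set.
 * ASSUMPTION: fd_set is a bit array of longs (typical, not guaranteed). */
fd_set *big_fdset_alloc(int max_fds, size_t *bytes_out)
{
    assert(max_fds > 0);
    size_t bits_per_word = 8 * sizeof(long);
    size_t words = ((size_t)max_fds + bits_per_word - 1) / bits_per_word;
    size_t bytes = words * sizeof(long);
    if (bytes < sizeof(fd_set))
        bytes = sizeof(fd_set);
    if (bytes_out != NULL)
        *bytes_out = bytes;
    return calloc(1, bytes);
}

/* Demo: wait on the read end of a pipe using the heap-allocated set.
 * Returns 1 if the pipe was reported readable, 0 if not, -1 on error. */
int demo_big_fdset(void)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;
    if (write(p[1], "x", 1) != 1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    fd_set *rd = big_fdset_alloc(4096, NULL);
    if (rd == NULL) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    FD_SET(p[0], rd);                  /* low descriptor, safe everywhere */
    struct timeval tv = { 1, 0 };      /* one-second timeout */
    int n = select(p[0] + 1, rd, NULL, NULL, &tv);
    int ready = (n == 1) && FD_ISSET(p[0], rd);

    free(rd);
    close(p[0]);
    close(p[1]);
    return ready;
}
```

Since select() only examines the first nfds bits of each set, the larger buffer costs nothing extra for low descriptors. Whether descriptors beyond FD_SETSIZE actually work depends on the kernel and C library in use.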

FWIW, Windows implements fd_set quite differently from the UNIX
world, essentially providing a dynamically-sized poll array.

> The Linux kernel can allow an arbitrary number of file descriptors
> because it isn't responsible for the particular semantics of FD_SET and
> friends.

It is an artificial limit. The "fd_set" type may be fixed-size, but
there is no reason to limit select() when other operating systems
don't and

>> Try managing an array of thousands of poll entries sometime, and
>> then compare the efficiency of a bit test vs. scanning an array
>> after the fact...
>
> Both select() and poll() require you to scan the fd set structure after
> the call to find out what I/O events actually happened. No difference
> there. This is not normally an issue, as performing a few thousand
> memory accesses is generally cheap compared to a single I/O operation,
> but there have been some stabs at solving the theoretical scaling
> problem, such as Linux's epoll(), Solaris's /dev/poll, and the like.

Sure, with select() you need to scan *one* array - your active
connections (or whatever it is that you are using select() for),
but with poll() you need to scan the poll array *and* then do a
lookup in your active connections array (or vice versa).

The select() method is O(n). The poll() method is O(n log n) unless
you manage a large array that maps from FD to the corresponding
state data for that FD, which can eat up a LOT of memory...
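
For illustration, the FD-to-state array mentioned above might look like the following sketch. The conn_t type, MAX_FDS, conn_by_fd, and both function names are hypothetical, chosen only for this example. The table costs one pointer per possible descriptor, whether or not that descriptor is in use.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical per-connection state. */
typedef struct {
    int fd;
    int events_seen;
} conn_t;

#define MAX_FDS 4096                   /* assumed upper bound on descriptors */
static conn_t *conn_by_fd[MAX_FDS];    /* FD -> state map, one slot per FD */

/* Scan the poll array once; each ready entry finds its state with a single
 * array index instead of a search through the connection list.
 * Returns the number of connections that had events. */
int service_ready(struct pollfd *pfds, nfds_t n)
{
    assert(pfds != NULL);
    int handled = 0;
    for (nfds_t i = 0; i < n; i++) {
        if (pfds[i].revents == 0)
            continue;                  /* nothing happened on this entry */
        int fd = pfds[i].fd;
        if (fd < 0 || fd >= MAX_FDS)
            continue;                  /* outside the table, ignore */
        conn_t *c = conn_by_fd[fd];
        if (c != NULL) {
            c->events_seen++;
            handled++;
        }
    }
    return handled;
}

/* Demo: one readable pipe, one table entry.
 * Returns 1 on success, 0 on mismatch, -1 on setup error. */
int demo_poll_lookup(void)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;
    if (p[0] >= MAX_FDS || write(p[1], "x", 1) != 1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    conn_t c = { p[0], 0 };
    conn_by_fd[p[0]] = &c;

    struct pollfd pfd = { .fd = p[0], .events = POLLIN };
    int n = poll(&pfd, 1, 1000);       /* one-second timeout */
    int handled = (n == 1) ? service_ready(&pfd, 1) : 0;

    conn_by_fd[p[0]] = NULL;           /* do not leave a dangling pointer */
    close(p[0]);
    close(p[1]);
    return handled == 1 && c.events_seen == 1;
}
```

With 8-byte pointers, a table sized for 4096 descriptors takes 32 KB, and it grows linearly with the descriptor limit.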

-- 
______________________________________________________________________
Michael Sweet, Easy Software Products           mike at easysw dot com
Internet Printing and Document Software          http://www.easysw.com
Received on Mon Jun 5 16:30:14 2006

This is an archived mail posted to the Subversion Dev mailing list.