Martin v. Löwis wrote:

*> Michael Sweet wrote:
*

*>> The select() method is O(n). The poll() method is O(n log n) unless
*

*>> you manage a large array that maps from FD to the corresponding
*

*>> state data for that FD, which can eat up a LOT of memory...
*

*>
*

*> You are assuming an implementation strategy here.
*

Not of select() or poll(), but certainly for the application.

For a single file descriptor, there is no difference for an

application - use fd_set, use pollfd, both are simple and

constant time.

However, for any non-trivial application managing multiple file

descriptors, poll() adds overhead in the form of pollfd array

management, no matter how else you implement things in your

code.

* > poll() is
*

*> O(number-of-polled-fds) on Linux. Yes, there is an array of all
*

*> files, per process, but that does not consume a LOT of memory.
*

*> Instead, it consumes one pointer per file number, up to some
*

*> dynamically-determined maximum.
*

*>
*

*> OTOH, select is O(maximum-fd-value). Assuming you always pass
*

*> only a few file descriptors to select/poll, poll is more
*

*> efficient if the process has many open files.
*

Again, I'm not talking about the implementation complexity of

either interface, but the application code complexity to use

those interfaces. Regardless, O(maximum-fd-value) is equivalent

to O(number-of-polled-fds) in terms of algorithmic complexity

[O(n) == O(n)], and very likely in practice (clock time) as well.

In the case of CUPS, the number of file descriptors is often the

same as the number of files we are doing a select() on, so for

us poll() is a performance loser.

On the application side, with poll() you need to either loop

through the pollfd array to determine which file descriptors are

active, or loop through the (separately maintained) state array

which references those file descriptors. That is O(n). Then you

have the lookup the state data for that file descriptor; for a

sparse array (index != fd), the best you can do is O(log N),

making the total (application) code complexity O(n log n).

Similarly, if you loop through your state data [O(n)] and lookup

the file descriptors in a sorted pollfd array [O(log n)], you

still have O(n log n). The fastest implementation would be to

use an array that maps file descriptors (or pollfd array indices)

to your state data, which reduces the total complexity to O(n).

For select(), you can just loop through your state data and use the

FD_ISSET macro to test the corresponding fd_set. That is O(n).

The situation is somewhat similar for fd_set vs. pollfd array

management. Adding and/or removing a file descriptor in a dynamic

pollfd array requires O(n log n) for every addition or removal and

O(log n) for update. Doing so for fd_set is O(1) (constant time)

for all of the UNIX fd_set implementations I have worked with.

--
______________________________________________________________________
Michael Sweet, Easy Software Products mike at easysw dot com
Internet Printing and Document Software http://www.easysw.com
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Received on Mon Jun 5 19:37:51 2006