Why is poll not replaced with epoll?

Level-triggered epoll is very similar to poll. Why isn't poll just a wrapper for epoll on systems supporting the latter?

EDIT: I mean, are there any technical barriers to such a decision? Implementing poll on top of epoll would dramatically boost the performance of many network applications. There must be some technical issue that I am failing to notice.


Solution 1:

poll is much simpler for easy cases, and it is probably just as efficient for small numbers of file descriptors. The caller doesn't need to worry about maintaining an epoll FD and adding/removing FDs; they can just pass all the ones they want on each call to poll.

My feeling is that they are complementary: although poll COULD be implemented as a wrapper for epoll, it probably shouldn't be.

epoll could (almost) be implemented as a wrapper for poll, but that would defeat its efficiency arguments.
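
To make the "poll as a wrapper for epoll" idea concrete, here is a rough sketch using Python's `select` module on Linux. The helper name `poll_via_epoll` is made up for illustration and is not how any libc or kernel implements poll. It only shows the extra work such a wrapper would do on every call: create an epoll instance, register every descriptor, wait, then tear everything down.

```python
import os
import select


def poll_via_epoll(fd_events, timeout_s):
    """Hypothetical poll()-style call layered on epoll (illustration only).

    fd_events is a list of (fd, eventmask) pairs, passed in full on every
    call, the way a poll() caller passes its whole array each time.
    """
    ep = select.epoll()              # one extra descriptor per call
    try:
        for fd, mask in fd_events:
            ep.register(fd, mask)    # one epoll_ctl() per watched fd
        return ep.poll(timeout_s)    # list of (fd, events) pairs
    finally:
        ep.close()                   # discard all registrations again


# Usage: one pipe with pending data, one without.
r1, w1 = os.pipe()
r2, w2 = os.pipe()
os.write(w1, b"x")
ready = poll_via_epoll([(r1, select.EPOLLIN), (r2, select.EPOLLIN)], 0)
print(ready)
```

Only `r1` should show up in `ready`. Note that the per-call setup is itself linear in the number of descriptors, so a wrapper like this cannot be cheaper than a plain poll() call.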

Solution 2:

Okay, 7 years later I have a more convincing answer based on this article by Evan Klitzke.

Firstly, the reason I asked the question in the first place is the often-mentioned performance advantage of epoll over poll/select. The claim is that epoll is asymptotically more efficient (O(1)) than poll (O(N)).

What is not as widely known is that only edge-triggered epoll is truly O(1), while level-triggered epoll has the same O(N) asymptotics. Indeed, the level-triggered flavor has to go over the list of watched fds every time it is called to find the ones that potentially still have data pending. The edge-triggered variety can rely on signals in response to new bytes appearing in an fd.

It would be interesting to find out how exactly a resumed thread learns which fd woke it up, but it's certainly possible that this datum is passed along during the epoll-triggered wake-up.

Obviously, poll/select cannot use edge-triggered epoll, as the semantics are different. As we saw, implementing them with level-triggered epoll wouldn't bring asymptotic performance benefits. It could even hurt performance if the constant factors or constant terms are high (as they seem to be, based on a coarse benchmark that I did and quoted in another comment).
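
The difference in semantics between the two epoll modes can be observed directly. The following is a small sketch using Python's `select.epoll` on Linux; the helper name `wait_twice` is invented for this example. It writes to a pipe once, never reads, and asks epoll twice whether the pipe is readable.

```python
import os
import select


def wait_twice(flags):
    """Register a pipe's read end with `flags`, write once, then call
    epoll_wait twice without reading. Returns the number of events
    reported by each of the two waits."""
    r, w = os.pipe()
    ep = select.epoll()
    try:
        ep.register(r, flags)
        os.write(w, b"hello")
        first = ep.poll(0)    # timeout 0: return immediately
        second = ep.poll(0)   # nothing was read in between
        return len(first), len(second)
    finally:
        ep.close()
        os.close(r)
        os.close(w)


# Level-triggered (the default): unread data is reported on every wait.
print(wait_twice(select.EPOLLIN))                    # (1, 1)
# Edge-triggered: only the arrival of the data is reported, once.
print(wait_twice(select.EPOLLIN | select.EPOLLET))   # (1, 0)
```

A poll()-style caller expects the first behavior, which is why the edge-triggered mode is not a drop-in replacement.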

For more information, please read Blocking I/O, Nonblocking I/O, And Epoll.

Solution 3:

The semantics of poll() and epoll are different. If poll() informs you that a descriptor is readable, and you read some but not all of the available bytes and then pass that descriptor to poll() again, it will wake up immediately. AFAIK the same is not true of epoll, at least in edge-triggered mode.
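
The poll() half of this claim is easy to check. Below is a minimal sketch with Python's `select.poll`; the function name `poll_after_partial_read` is made up for the example. It reads only part of the pending data and then polls the same descriptor again.

```python
import os
import select


def poll_after_partial_read():
    """Return (readable_before, readable_after) for a pipe from which
    only part of the pending data has been read."""
    r, w = os.pipe()
    try:
        os.write(w, b"0123456789")        # 10 bytes pending
        p = select.poll()
        p.register(r, select.POLLIN)
        before = bool(p.poll(0))          # timeout 0 ms: do not block
        os.read(r, 4)                     # consume 4 bytes, 6 remain
        after = bool(p.poll(0))           # reported again right away
        return before, after
    finally:
        os.close(r)
        os.close(w)


print(poll_after_partial_read())  # (True, True)
```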

Also note that epoll descriptors are a limited resource. The manpage talks about epoll_create() failure conditions which AFAIK do not occur with poll().
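
One way to see why an epoll instance counts as a resource: it is itself an open file descriptor, so creating one is subject to the process's descriptor limit. A small sketch with Python's `select.epoll` on Linux (the function name is invented for illustration):

```python
import os
import select


def epoll_uses_a_descriptor():
    """Create an epoll instance and return the descriptor number it holds."""
    ep = select.epoll()       # raises OSError if no descriptor is available
    try:
        fd = ep.fileno()
        os.fstat(fd)          # succeeds only for an open descriptor
        return fd
    finally:
        ep.close()            # releases the descriptor again


print(epoll_uses_a_descriptor())
```

The printed number depends on which descriptors the process already has open.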

While I am not sure of all the implementation details, from this we can say that it doesn't make sense to make poll() a wrapper for epoll. The programmer must be aware of these points, and existing code written with the assumptions poll() allows would break.