Why is poll not replaced with epoll?
Level-triggered epoll is very similar to poll. Why isn't poll just a wrapper for epoll on systems supporting the latter?
EDIT: I mean, are there any technical barriers to such a decision? Implementing poll as epoll would dramatically boost the performance of many network applications. There must be some technical issue that I fail to notice.
Solution 1:
poll is much simpler for easy cases; it is probably just as efficient for small numbers of file descriptors. The caller doesn't need to worry about maintaining poll FDs and adding/removing FDs; they can just pass all the ones they want on each call to poll.
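To illustrate the "just pass everything on each call" style, here is a minimal sketch. The helper name `first_readable` and the `MAX_FDS` cap are made up for this example; only `poll()` itself is the real API.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

#define MAX_FDS 16  /* arbitrary cap for this sketch */

/* Hypothetical helper: returns the index of the first descriptor in
 * fds[0..nfds) that poll() reports as readable, or -1 on timeout or
 * error. The pollfd array is rebuilt from scratch on every call, so
 * no registration state has to be kept between calls. */
int first_readable(const int *fds, int nfds, int timeout_ms)
{
    struct pollfd pfds[MAX_FDS];

    assert(nfds > 0 && nfds <= MAX_FDS);

    for (int i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }

    if (poll(pfds, (nfds_t)nfds, timeout_ms) <= 0)
        return -1;                      /* timeout (0) or error (-1) */

    for (int i = 0; i < nfds; i++)
        if (pfds[i].revents & POLLIN)
            return i;

    return -1;                          /* only error/hangup bits were set */
}
```

There is nothing to set up or tear down: the caller's list of descriptors is the whole interface.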
My feeling is that they are complementary; although poll COULD be implemented as a wrapper for epoll, it probably shouldn't be.
epoll could (almost) be implemented as a wrapper for poll, but that would defeat its efficiency arguments.
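For what it's worth, a rough sketch of what "poll as a wrapper for epoll" might look like in user space is below. `poll_via_epoll` is a hypothetical name, it only handles POLLIN/POLLOUT, and it assumes `nfds > 0`; it is not how any libc or kernel actually implements poll.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: emulates a single poll() call with a throwaway
 * level-triggered epoll instance. Returns the number of ready
 * descriptors, 0 on timeout, -1 on error. */
int poll_via_epoll(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    assert(fds != NULL && nfds > 0);

    struct epoll_event *evs = calloc(nfds, sizeof *evs);
    int epfd = epoll_create1(0);        /* costs one extra descriptor */
    int n = -1;

    if (evs == NULL || epfd < 0)
        goto done;

    for (nfds_t i = 0; i < nfds; i++) {
        struct epoll_event ev = { 0 };

        fds[i].revents = 0;
        if (fds[i].fd < 0)
            continue;                   /* poll() ignores negative fds */
        if (fds[i].events & POLLIN)
            ev.events |= EPOLLIN;
        if (fds[i].events & POLLOUT)
            ev.events |= EPOLLOUT;
        ev.data.u64 = i;                /* remember the array index */

        /* Fails with EPERM for regular files and with EEXIST when the
         * same fd is listed twice; poll() itself accepts both. */
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i].fd, &ev) < 0)
            goto done;
    }

    n = epoll_wait(epfd, evs, (int)nfds, timeout_ms);

    /* Translate epoll results back into revents (loop is skipped if n <= 0). */
    for (int k = 0; k < n; k++) {
        struct pollfd *p = &fds[evs[k].data.u64];

        if (evs[k].events & EPOLLIN)
            p->revents |= POLLIN;
        if (evs[k].events & EPOLLOUT)
            p->revents |= POLLOUT;
        if (evs[k].events & EPOLLERR)
            p->revents |= POLLERR;
        if (evs[k].events & EPOLLHUP)
            p->revents |= POLLHUP;
    }

done:
    {
        int saved = errno;              /* keep the first error visible */
        free(evs);
        if (epfd >= 0)
            close(epfd);
        errno = saved;
    }
    return n;
}
```

Note that every call still walks the whole array to register the descriptors, and adds an `epoll_create1()`, one `epoll_ctl()` per descriptor, and a `close()` on top, so a wrapper like this does strictly more work per call than plain poll.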
Solution 2:
Okay, 7 years later I have a more convincing answer based on this article by Evan Klitzke.
Firstly, the reason I asked the question in the first place is the often-mentioned performance advantage of epoll compared to poll/select. The word goes that epoll is asymptotically more efficient (O(1)) than poll (O(N)).
What is not as widely known is that only edge-triggered epoll is truly O(1), while level-triggered epoll has the same asymptotics of O(N). Indeed, the level-triggered flavor has to go over the list of watched fds every time it is called to find the ones that potentially still have more data pending. The edge-triggered variety can rely on signals in response to new bytes appearing in an fd.
It would be interesting to find out how exactly a resumed thread learns which fd woke it up, but it's certainly possible that this datum is passed along during the epoll-triggered wake-up.
Obviously, poll/select cannot use edge-triggered epoll, as the semantics are different. As we saw, implementing them with level-triggered epoll wouldn't bring asymptotic performance benefits. It might even hurt performance if the constant factors or constant terms are high (as they seem to be, based on a coarse benchmark that I did and quoted in another comment).
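To make the "semantics are different" point concrete: with edge-triggered notification the consumer is expected to put the descriptor in non-blocking mode and read until `EAGAIN` after each event, a contract that existing poll/select callers never signed up for. A minimal sketch of that drain loop, with the hypothetical helper names `set_nonblocking` and `drain_fd`:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switches fd to non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Hypothetical helper: reads from a non-blocking fd until the kernel
 * reports EAGAIN (nothing left for now) or EOF. Returns the number of
 * bytes consumed, or -1 on a real error. The data is discarded here;
 * a real handler would process buf instead. */
long drain_fd(int fd)
{
    char buf[4096];
    long total = 0;

    assert(fd >= 0);

    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);

        if (n > 0) {
            total += n;
            continue;
        }
        if (n == 0)
            return total;               /* EOF */
        if (errno == EINTR)
            continue;                   /* interrupted, retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            return total;               /* fully drained for now */
        return -1;
    }
}
```

A typical poll-based program reads once per readiness report and relies on being told again; under edge-triggered notification that same program could stall with data still sitting in the buffer.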
For more information, please read Blocking I/O, Nonblocking I/O, And Epoll.
Solution 3:
The semantics of poll() and epoll are different. If poll() informs you that a descriptor is readable, then you do some reading but do not read all the bytes available, and then pass that descriptor into poll() again, it will wake up immediately. AFAIK the same is not true of edge-triggered epoll.
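The partial-read case can be checked with a small experiment. The sketch below (the function name `events_after_partial_read` is made up) writes two bytes to a pipe, waits once, reads only one byte, and then asks epoll again without blocking. As described in epoll(7), the default level-triggered mode reports the descriptor again, just like poll() would, while `EPOLLET` does not.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical experiment: returns how many events a second,
 * non-blocking epoll_wait() reports after a partial read, or -1 if
 * the setup fails. Pass 0 for level-triggered or EPOLLET for
 * edge-triggered behaviour. */
int events_after_partial_read(uint32_t extra_flags)
{
    int p[2];
    int result = -1;
    char byte;
    struct epoll_event ev = { 0 };
    struct epoll_event out;

    /* Only the trigger mode is configurable in this sketch. */
    assert((extra_flags & ~(uint32_t)EPOLLET) == 0);

    if (pipe(p) < 0)
        return -1;

    int epfd = epoll_create1(0);
    if (epfd < 0)
        goto out_pipe;

    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = p[0];

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "ab", 2) == 2 &&            /* two bytes arrive */
        epoll_wait(epfd, &out, 1, 0) == 1 &&    /* reported in both modes */
        read(p[0], &byte, 1) == 1)              /* leave one byte unread */
        result = epoll_wait(epfd, &out, 1, 0);  /* ask again, no blocking */

    close(epfd);
out_pipe:
    close(p[0]);
    close(p[1]);
    return result;
}
```

Calling it with `0` shows the poll-like behaviour (the leftover byte is reported again); calling it with `EPOLLET` shows the second wait coming back empty even though a byte is still unread.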
Also note that epoll descriptors are a limited resource. The manpage talks about epoll_create() failure conditions which AFAIK do not occur with poll().
While I am not sure of all the implementation details, from this we can say that it doesn't make sense to make poll() a wrapper for epoll. The programmer must be aware of these points, and existing code written with the assumptions poll() allows would break.