Handling of asynchronous events—reference


http://www.win.tue.nl/~aeb/linux/lk/lk-12.html

12. Handling of asynchronous events

One wants to be notified of various events, like data that has become available, files that have changed, and signals that have been raised. FreeBSD has the nice kqueue API. Let us discuss the Unix/Linux situation.

It is easy to wait for a single event. Usually one does a (blocking) read(), and that is it.

Many mechanisms exist to wait for any of a set of events, or just to test whether anything interesting happened.

12.1 O_NONBLOCK

If the open() call that opened a file includes the O_NONBLOCK flag, the file is opened in non-blocking mode. Neither the open() nor any subsequent operations on the returned file descriptor will cause the calling process to wait.

A nonblocking open is useful (i) in order to obtain a file descriptor for subsequent use when no I/O is planned, e.g. for ioctl() calls to get or set properties of a device; especially on device files, an ordinary open might have unwanted side effects, such as a tape rewind etc. (ii) when reading from a pipe: the read will return immediately when no data is available; when writing to a pipe: the write will return immediately (without writing anything) when there are no readers.
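The pipe case in (ii) can be observed with a few lines of C. The following is a minimal sketch, not taken from the original text: the helper name read_empty_nonblocking_pipe is hypothetical, and the non-blocking flag is set with fcntl() on an already open descriptor rather than passed to open(). On Linux the read is expected to fail at once with EAGAIN.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: reads from an empty pipe whose read end is in
   non-blocking mode. Returns the errno value of the failed read, 0 if
   the read unexpectedly succeeded, or -1 on a setup failure. */
int read_empty_nonblocking_pipe(void)
{
    int fds[2];
    char buf[16];

    if (pipe(fds) == -1)
        return -1;

    /* Set O_NONBLOCK on the read end after the fact. */
    int flags = fcntl(fds[0], F_GETFL);
    if (flags == -1 || fcntl(fds[0], F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    /* The pipe is empty but still has a writer, so a blocking read
       would wait here; the non-blocking read returns immediately. */
    ssize_t n = read(fds[0], buf, sizeof buf);
    int err = (n == -1) ? errno : 0;
    assert(n <= (ssize_t)sizeof buf);

    close(fds[0]);
    close(fds[1]);
    return err;
}
```

A caller would typically treat EAGAIN as "nothing to do right now" and go on to other work, or wait with one of the mechanisms discussed in this chapter.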

O_NOACCESS

An obscure Linux feature is that one can open a file with the O_NOACCESS flag (defined as 3, where O_RDONLY is 0, O_WRONLY is 1 and O_RDWR is 2). In order to open a file with this mode, one needs both read and write permission. This had the same purpose: announce that no reading or writing was going to be done, and only a file descriptor for ioctl use was needed. (Used in LILO, fdformat, and a few similar utilities.)

People would love to have this facility also for directories, so that one could do a fd = open(".", O_NOACCESS), go elsewhere, and return by fchdir(fd). But an O_NOACCESS open fails on directories.

12.2 select

The select() mechanism was introduced in 4.2BSD. The prototype of this system call is

int select(int nfds, fd_set *restrict readfds, fd_set *restrict writefds, fd_set *restrict errorfds, struct timeval *restrict timeout);

It allows one to specify three sets of file descriptors (as bit masks) and a timeout. The call returns when the timeout expires or when one of the file descriptors in readfds has data available for reading, one of those in writefds has buffer space available for writing, or an error occurred for one of those in errorfds. Upon return, the file descriptor sets and the timeout are rewritten to indicate which file descriptor has the stated condition, and how much time from the timeout is left. (Note that other Unix-type systems do not rewrite the timeout.)

There are two select system calls. The old one uses a parameter block, the new one uses five parameters. Otherwise they are equivalent.
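As an illustration of the call described above, here is a small sketch that waits for a single descriptor to become readable. The names wait_readable and select_demo are hypothetical and not part of any API. Because select() rewrites both the descriptor set and (on Linux) the timeout, the sketch rebuilds them before every call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: waits up to timeout_ms milliseconds for fd to
   become readable. Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;

    assert(fd >= 0 && fd < FD_SETSIZE);

    /* Both arguments are overwritten by select(), so build them afresh. */
    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* nfds is the highest descriptor number in any set, plus one. */
    int ready = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (ready == -1)
        return -1;
    return FD_ISSET(fd, &readfds) ? 1 : 0;
}

/* Hypothetical demo: an empty pipe times out; after one byte has been
   written its read end is reported readable. Returns 0 if both
   observations hold, 1 if not, -1 on a setup failure. */
int select_demo(void)
{
    int fds[2];
    int result = -1;

    if (pipe(fds) == -1)
        return -1;

    int before = wait_readable(fds[0], 50);
    if (write(fds[1], "x", 1) == 1) {
        int after = wait_readable(fds[0], 50);
        result = (before == 0 && after == 1) ? 0 : 1;
    }

    close(fds[0]);
    close(fds[1]);
    return result;
}
```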

12.3 pselect

The pselect system call was added in Linux 2.6.16 (and was present earlier elsewhere). With only select() it is difficult, almost impossible, to handle signals correctly. A signal handler itself cannot do very much: the main program is in some unknown state when the signal is delivered. The usual solution is to only raise a flag in the signal handler, and test that flag in the main program.

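#include <signal.h>

/* volatile sig_atomic_t is the type that may safely be written in a handler */
volatile sig_atomic_t gotsignal = 0;

void sighand(int x) {
    gotsignal = 1;
}

int main() {
    ...
    signal(SIGINT, sighand);
    while (1) {
        if (gotsignal) ...
        select();
        ...
    }
}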

Now if one wants to wait for either a signal or some event on a file descriptor, then testing the flag and, if it is not set, calling select() has a race: maybe the signal arrived just after the flag was tested and just before select was called, and the program may hang in select() without reacting to the signal.

The call pselect() is designed to solve this problem. This function is just like select() but has prototype

int pselect(int nfds, fd_set *restrict readfds, fd_set *restrict writefds, fd_set *restrict errorfds, const struct timespec *restrict timeout, const sigset_t *restrict sigmask);

with a sixth parameter sigmask, and it does the equivalent of

    sigset_t origmask;
    sigprocmask(SIG_SETMASK, &sigmask, &origmask);
    ready = select(nfds, &readfds, &writefds, &exceptfds, timeout);
    sigprocmask(SIG_SETMASK, &origmask, NULL);

as an atomic action. Now one can block the signals of interest until the call of pselect() and have a sigmask that unblocks them. If a signal occurs, the call will return with errno set to EINTR.

This function uses a struct timespec (with nanoseconds) instead of a struct timeval (with microseconds), and does not update its value on return.
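The pattern of blocking a signal up to the call and unblocking it only inside pselect() might look like the following sketch. The names on_signal and pselect_signal_demo are hypothetical, and the use of SIGUSR1 with raise() is only a way to get a signal pending at a known moment; a real program would receive the signal from outside. Since the signal is already pending when the mask is swapped, the call is expected to fail with EINTR right away instead of sleeping for the full timeout.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <time.h>

static volatile sig_atomic_t got_signal = 0;

static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;               /* only raise a flag */
}

/* Hypothetical demo. Returns 1 if pselect() was interrupted and the
   handler ran, 0 if it timed out, -1 on any other outcome. */
int pselect_signal_demo(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, orig, during;
    struct timespec timeout = { 2, 0 };
    int result = -1;

    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, &old_sa) == -1)
        return -1;

    /* Block SIGUSR1 so that it cannot slip in between the flag test
       and the wait. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, &orig) == -1) {
        sigaction(SIGUSR1, &old_sa, NULL);
        return -1;
    }

    got_signal = 0;
    raise(SIGUSR1);               /* stays pending while blocked */
    assert(got_signal == 0);

    /* Mask used only for the duration of pselect(): SIGUSR1 unblocked. */
    during = orig;
    sigdelset(&during, SIGUSR1);

    int ready = pselect(0, NULL, NULL, NULL, &timeout, &during);
    if (ready == -1 && errno == EINTR && got_signal)
        result = 1;
    else if (ready == 0)
        result = 0;

    sigprocmask(SIG_SETMASK, &orig, NULL);
    sigaction(SIGUSR1, &old_sa, NULL);
    return result;
}
```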

The self-pipe trick

Before the introduction of pselect() people resorted to obscure tricks to obtain the same effect. Famous is Daniel Bernstein’s: create a non-blocking pipe, and add a file descriptor for reading from this pipe to the readfds argument of select(). In the signal handler, write a byte to the pipe. This works.
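A compact sketch of the trick follows. The names self_pipe, on_signal, set_nonblocking and self_pipe_demo are hypothetical, and SIGUSR1 with raise() merely stands in for a signal arriving from outside. The handler writes the signal number as the byte, which is one possible convention and not something the trick requires.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/select.h>
#include <unistd.h>

static int self_pipe[2] = { -1, -1 };

static void on_signal(int signo)
{
    int saved = errno;            /* write() may clobber errno */
    unsigned char b = (unsigned char)signo;

    /* write() is async-signal-safe. If the non-blocking pipe is full,
       the byte is dropped, which is harmless: a wakeup is already queued. */
    if (write(self_pipe[1], &b, 1) == -1) {
        /* nothing to do */
    }
    errno = saved;
}

static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    return (flags == -1) ? -1 : fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical demo. Returns the signal number read from the pipe,
   0 on timeout, or -1 on error. */
int self_pipe_demo(void)
{
    struct sigaction sa, old_sa;
    struct timeval tv = { 2, 0 };
    fd_set readfds;
    unsigned char b = 0;
    int ready, result = -1;

    if (pipe(self_pipe) == -1)
        return -1;
    if (set_nonblocking(self_pipe[0]) == -1 ||
        set_nonblocking(self_pipe[1]) == -1)
        goto out;

    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGUSR1, &sa, &old_sa) == -1)
        goto out;

    raise(SIGUSR1);               /* handler runs and writes one byte */

    /* The signal is now an ordinary "readable" event for select(). */
    assert(self_pipe[0] < FD_SETSIZE);
    FD_ZERO(&readfds);
    FD_SET(self_pipe[0], &readfds);
    ready = select(self_pipe[0] + 1, &readfds, NULL, NULL, &tv);
    if (ready == 1 && read(self_pipe[0], &b, 1) == 1)
        result = b;
    else if (ready == 0)
        result = 0;

    sigaction(SIGUSR1, &old_sa, NULL);
out:
    close(self_pipe[0]);
    close(self_pipe[1]);
    return result;
}
```

In a real event loop the signal may also arrive while select() is sleeping, in which case select() fails with EINTR and the loop simply calls it again; the byte is then waiting in the pipe.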

The system call

The pselect system call has a 7-parameter prototype (the 7th parameter being the size of the 6th sigmask parameter), but most architectures cannot handle 7-parameter system calls, so there is also a 6-parameter version where the 6th parameter is a pointer to a struct that has the last two parameters. Unlike the POSIX library routine, the system call does return the leftover part of the timeout.

This system call starts by changing the signal mask, and ends by restoring it. However, if it was interrupted by a signal, this signal should be delivered, while the restored signal mask might block it. This is solved by the recent TIF_RESTORE_SIGMASK mechanism in the kernel. When the pselect system call returns after being interrupted by a signal, it does not immediately restore the original signal mask, but first runs the user’s signal handler, and only upon return from that is the original signal mask restored.

12.4 poll

The poll() system call is rather similar to select(). The prototype is

struct pollfd {
    int   fd;         /* file descriptor */
    short events;     /* requested events */
    short revents;    /* returned events */
};

int poll(struct pollfd *fds, nfds_t nfds, int timeout);

where the fields events and revents are bitmasks indicating for what events fd should be watched, and what conditions actually occurred. The timeout is in milliseconds; a negative number means an infinite timeout.
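The roles of events and revents can be seen in a short sketch that polls the read end of a pipe. The helper name poll_pipe and its write_first parameter are hypothetical and serve only this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: creates a pipe, optionally writes one byte to
   it, and polls the read end for up to 50 ms. Returns the revents
   bitmask reported by poll(), or -1 on error. */
int poll_pipe(int write_first)
{
    int fds[2];
    struct pollfd pfd;
    int result;

    if (pipe(fds) == -1)
        return -1;
    if (write_first && write(fds[1], "x", 1) != 1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    pfd.fd = fds[0];
    pfd.events = POLLIN;          /* what we ask for */
    pfd.revents = 0;              /* what the kernel reports back */

    /* One entry in the array, timeout in milliseconds. */
    int ready = poll(&pfd, 1, 50);
    assert(ready <= 1);
    result = (ready == -1) ? -1 : pfd.revents;

    close(fds[0]);
    close(fds[1]);
    return result;
}
```

Unlike select(), the request in events is left untouched by the call, so the same array can be passed again without being rebuilt.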

ppoll

Just like pselect is a version of select that allows safe handling of signals, ppoll is such a version of poll. The prototype is

int ppoll(struct pollfd *fds, nfds_t nfds, const struct timespec *timeout, const sigset_t *sigmask);

12.5 epoll

When the number of file descriptors becomes very large, the select() and poll() mechanisms become inefficient. With N descriptors, O(N) information must be copied from user space to kernel and vice versa, and loops of length O(N) are needed to test the conditions.

Solaris introduced the /dev/poll mechanism (see poll(7d) on Solaris), where the idea is that one does the copy from user space to kernel only once (by writing an array of struct pollfd’s to /dev/poll) and gets only interesting information back (via an ioctl on this device that copies the interesting struct pollfd back to userspace).

Linux tries something similar using the three system calls epoll_create, epoll_ctl, epoll_wait (added in 2.5.44, see epoll(7)). Benchmarks seem to indicate that the performance is comparable to that of select and poll until one has thousands of descriptors, only a small fraction of which is ready. (And then epoll is clearly better.) In most tests, the FreeBSD kqueue wins.

For a discussion of these and several other mechanisms, especially for the context of web servers, see.
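How the three calls fit together can be sketched as follows (Linux only). The function name epoll_demo is hypothetical. The descriptor of interest is registered once with epoll_ctl, after which each epoll_wait returns only the descriptors that are ready.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo: registers the read end of a pipe once, then waits
   before and after a byte is written. Returns 0 if the first wait
   times out and the second reports the read end as readable, 1 if the
   observations differ, -1 on a setup failure. */
int epoll_demo(void)
{
    int fds[2], epfd, before, after, result = -1;
    struct epoll_event ev = { 0 };
    struct epoll_event out[4];

    if (pipe(fds) == -1)
        return -1;

    /* The size argument is only a hint but must be positive. */
    epfd = epoll_create(1);
    if (epfd == -1)
        goto close_pipe;

    /* Registration is done once; the kernel keeps the interest list. */
    ev.events = EPOLLIN;
    ev.data.fd = fds[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) == -1)
        goto close_all;

    before = epoll_wait(epfd, out, 4, 50);    /* nothing ready yet */
    if (write(fds[1], "x", 1) != 1)
        goto close_all;
    after = epoll_wait(epfd, out, 4, 50);     /* one ready entry */
    assert(after <= 4);

    result = (before == 0 && after == 1 &&
              out[0].data.fd == fds[0] &&
              (out[0].events & EPOLLIN)) ? 0 : 1;

close_all:
    close(epfd);
close_pipe:
    close(fds[0]);
    close(fds[1]);
    return result;
}
```

The data field is returned unchanged with each event, so a program with many descriptors usually stores a pointer to its own per-connection state there instead of the bare descriptor number.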

epoll_pwait

Author: dawei
