Linux Asynchronous I/O

The four I/O models available in Linux -- blocking, nonblocking, multiplexing, and asynchronous -- and why the terminology is often misused.

Disclaimer: This post is a compilation of my study notes, and I am not an expert on this subject (yet). I have linked the sources for a deeper understanding.

When we talk about asynchronous tasks, what almost always comes to mind is a task running in a separate thread. But, as I will make clear below, such a task is usually blocking, and not async at all.

Another common misunderstanding concerns nonblocking and asynchronous (or blocking and synchronous): people tend to think that the words in each pair are always interchangeable. We will discuss below why this is wrong.

The Models

Here we will discuss the four I/O models available under Linux: blocking, non-blocking, multiplexing, and asynchronous.

Blocking I/O

The most common way to get information from a file descriptor is synchronous, blocking I/O. The application sends a request and waits until the data has been copied from the kernel to user space.

An easy way to implement multitasking is to create several threads, each one issuing blocking requests.
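The thread-per-request idea can be sketched in a few lines. This is only an illustration under stated assumptions: two pipes stand in for slow devices, and the names blocking_reader and run_blocking_demo are made up for the example. Each thread sits inside a blocking read until its own pipe has data.

```python
import os
import threading
import time


def blocking_reader(fd, results, name):
    # os.read() blocks the calling thread until data is available on fd.
    results[name] = os.read(fd, 1024)


def run_blocking_demo():
    # Two pipes stand in for two slow devices (an assumption for this demo).
    pipes = {"a": os.pipe(), "b": os.pipe()}
    results = {}
    threads = []
    for name, (read_fd, _write_fd) in pipes.items():
        t = threading.Thread(target=blocking_reader, args=(read_fd, results, name))
        t.start()
        threads.append(t)

    # Give the threads a moment to reach read(), then write in reverse order.
    time.sleep(0.05)
    os.write(pipes["b"][1], b"from b")
    os.write(pipes["a"][1], b"from a")

    for t in threads:
        t.join()
    for read_fd, write_fd in pipes.values():
        os.close(read_fd)
        os.close(write_fd)
    return results


if __name__ == "__main__":
    print(run_blocking_demo())
```

Each read is still blocking; the concurrency comes only from having one thread per descriptor.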

Non-blocking I/O

This is a synchronous and non-blocking way to get data.

When a device is opened with the O_NONBLOCK flag, any read that cannot be satisfied immediately fails with the error EWOULDBLOCK or EAGAIN. The application then retries the read until the file descriptor is ready for reading.

Retrying in a loop like this wastes CPU time, which is probably why the method is rarely used.
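A minimal sketch of that retry loop, assuming a pipe as the device and using the hypothetical helpers read_nonblocking and run_nonblocking_demo. In Python, EAGAIN and EWOULDBLOCK surface as BlockingIOError, and os.set_blocking(fd, False) sets O_NONBLOCK on the descriptor.

```python
import errno
import os
import time


def read_nonblocking(fd, max_attempts=50, delay=0.01):
    """Poll fd until data arrives; return (data, number_of_failed_attempts)."""
    failed = 0
    for _ in range(max_attempts):
        try:
            return os.read(fd, 1024), failed
        except BlockingIOError as exc:
            # The kernel answered EAGAIN / EWOULDBLOCK: nothing to read yet.
            assert exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)
            failed += 1
            time.sleep(delay)
    return None, failed


def run_nonblocking_demo():
    read_fd, write_fd = os.pipe()
    os.set_blocking(read_fd, False)  # sets O_NONBLOCK on the read end

    # Nothing has been written yet, so every attempt fails immediately.
    empty, failed_empty = read_nonblocking(read_fd, max_attempts=3)

    os.write(write_fd, b"ready")
    data, failed_after = read_nonblocking(read_fd)

    os.close(read_fd)
    os.close(write_fd)
    return empty, failed_empty, data, failed_after


if __name__ == "__main__":
    print(run_nonblocking_demo())
```

The sleep between attempts only softens the cost; the application still has to keep asking.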

I/O Multiplexing or select/poll

This method is an asynchronous and blocking way to get data.

In POSIX, multiplexing is done with the functions select() and poll(), which register one or more file descriptors to be watched and then block the thread. As soon as one of the file descriptors becomes ready, the call returns and it is possible to copy the data from the kernel to user space.

I/O multiplexing is almost the same as blocking I/O, with the difference that it is possible to wait for multiple file descriptors at the same time.
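As a rough illustration, the sketch below waits on two pipes with select() and reads only from the one that is ready. The pipes and the helper names wait_for_readable and run_select_demo are assumptions made for the example.

```python
import os
import select


def wait_for_readable(fds, timeout=1.0):
    # select() blocks until at least one fd is readable or the timeout expires.
    readable, _, _ = select.select(fds, [], [], timeout)
    return readable


def run_select_demo():
    r1, w1 = os.pipe()
    r2, w2 = os.pipe()

    # Only the second pipe gets data, so only r2 should be reported ready.
    os.write(w2, b"second pipe")
    ready = wait_for_readable([r1, r2])

    # The read itself is a separate step after select() returns.
    data = {fd: os.read(fd, 1024) for fd in ready}
    result = (ready == [r2], data[r2])

    for fd in (r1, w1, r2, w2):
        os.close(fd)
    return result


if __name__ == "__main__":
    print(run_select_demo())
```

A single thread watches both descriptors here, which is the practical difference from the one-thread-per-descriptor approach.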

Asynchronous I/O

This is the model that is both asynchronous and non-blocking.

The asynchronous functions ask the kernel to do all the work and to report back only when the entire message is ready in the kernel for the user to collect (or has already been copied to user space).

There are two asynchronous models:

One of the main differences between these two is that the first copies the data from the kernel to user space, while the second lets the user do that.

The Confusion

POSIX states that

Asynchronous I/O Operation is an I/O operation that does not of itself cause the thread requesting the I/O to be blocked from further use of the processor. This implies that the process and the I/O operation may be running concurrently.

So the third and fourth models really are asynchronous. The third is blocking, since after registering the file descriptors it waits for them to become ready.

I believe there is almost always room for discussion when it comes to the use of any term. But the mere fact that people disagree about whether blocking I/O and synchronous I/O are the same thing shows that we have to be cautious when we use these terms.

To finish, a picture is worth a thousand words, and a table even more, so let us look at this:

Model | Synchronous? | Blocking?
Blocking I/O | Yes | Yes
Non-blocking I/O | Yes | No
I/O Multiplexing | No | Yes
Asynchronous I/O | No | No

Further Words

When dealing with asynchronous file descriptors, it is important to take into account how many of them the application can keep open at a time. The system-wide limit is easily checked with

cat /proc/sys/fs/file-max

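Be sure to run the check as the right user. Note that file-max is the system-wide limit; the per-process limit for the current user is reported by ulimit -n.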
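The same check can be made from inside a program. This is a minimal sketch, assuming Linux with /proc mounted: it reads the system-wide value from the file above and asks for the per-process limits (RLIMIT_NOFILE) through the standard resource module. The function names read_file_max and open_file_limits are hypothetical.

```python
import resource


def read_file_max(path="/proc/sys/fs/file-max"):
    # System-wide maximum number of open file handles.
    with open(path) as f:
        return int(f.read().strip())


def open_file_limits():
    # Soft and hard per-process limits on open file descriptors.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return {"system_wide": read_file_max(), "soft": soft, "hard": hard}


if __name__ == "__main__":
    print(open_file_limits())
```

The soft limit is the one a process actually runs into first, so it is usually the number to compare against the count of descriptors the application expects to hold.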

Other References