CFQ IO Scheduler
Recall that the CFQ scheduler was written in response to some potential problems with the deadline scheduler. In particular, in an interview with Jens Axboe, the developer of CFQ, deadline, and several other IO schedulers, Jens stated, “While deadline worked great from a latency and hard drive perspective, it had no concept of individual process fairness.” In other words, a single process doing a great deal of IO could starve the IO of other applications.
The CFQ scheduler introduced the concept of a queue per process; these queues are created as needed for each process. Also, while perhaps not a new idea, the scheduler divided IO into two classes: synchronous IO and asynchronous IO (AIO). Synchronous IO is important because the application stops running until the IO request is finished. In other words, synchronous IO “blocks” the execution of the application until it is done. This is fairly common for read operations because an application may need to read some input data before continuing execution.
On the other hand, AIO allows an application to continue running because the IO call returns immediately. Rather than waiting for confirmation that the IO request has completed, as with synchronous IO, the application gets control back right away even if the operation has not yet finished. This is particularly useful for write operations and allows the application to overlap IO with computation, which can improve efficiency and reduce run time.
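To make the distinction concrete, here is a minimal Python sketch contrasting the two. It approximates AIO with a worker thread rather than the kernel's native AIO interface, and the file paths, data size, and compute() function are purely illustrative.

```python
import os
from concurrent.futures import ThreadPoolExecutor

DATA = b"x" * (64 * 1024 * 1024)   # 64 MB of dummy data (illustrative)

def compute():
    # Stand-in for useful work the application could be doing.
    return sum(range(1_000_000))

def blocking_write(path):
    # Synchronous IO: the call does not return until the write finishes,
    # so no computation happens while the data is being written.
    with open(path, "wb") as f:
        f.write(DATA)
        f.flush()
        os.fsync(f.fileno())        # wait until the data reaches the disk

def overlapped_write(path):
    # Asynchronous-style IO: hand the write to a worker thread and
    # keep computing while the IO is in flight.
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(blocking_write, path)
        result = compute()          # overlaps with the write above
        future.result()             # wait for the IO only when needed
    return result

if __name__ == "__main__":
    blocking_write("/tmp/sync_demo.bin")
    overlapped_write("/tmp/async_demo.bin")
```

In the overlapped version the application's run time approaches the longer of the IO time and the compute time, rather than their sum.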
CFQ was designed to separate synchronous and asynchronous IO operations, favoring synchronous operations (naturally). It also favors read operations for a couple of reasons: (1) reads tend to block execution because the application needs the data to continue, and (2) with the elevator approach a scheduler can “starve” a read operation that is far out on the disk geometry (near the outside of the disk). Favoring read operations improves read responsiveness and greatly reduces the possibility of starving such far-out reads.
CFQ goes even further by keeping the concept of deadlines from the deadline IO scheduler to prevent IO operations from being starved. Jens wrote the deadline scheduler and realized that, for good performance with some applications, the scheduler needed the concept of an IO operation “timing out.” That is, an IO operation may be put into a queue for execution, but subsequent IO operations may be placed ahead of it. As a result, the request at the end of the queue may never get executed, or its execution may be seriously delayed. The deadline IO scheduler therefore assigns each request a “time-out” period; if the request has not been serviced by the end of that period, it is dispatched immediately. This keeps IO operations from starving in the queue. A simple simulation of this idea appears below.
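The following Python sketch is a toy simulation of the deadline idea, not the kernel implementation: requests are normally dispatched in sector order (the elevator), but a FIFO list with per-request deadlines forces an expired request to the front. The class, the deadline value, and the example sectors are all illustrative.

```python
import heapq
from collections import deque

DEADLINE = 0.5  # seconds a request may wait before it must be serviced (illustrative)

class ToyDeadlineScheduler:
    def __init__(self):
        self.elevator = []   # requests ordered by sector (min-heap)
        self.fifo = deque()  # requests in arrival order, each with an expiry time

    def add_request(self, sector, now):
        req = (sector, now + DEADLINE)
        heapq.heappush(self.elevator, req)
        self.fifo.append(req)

    def dispatch(self, now):
        # If the oldest request has waited past its deadline,
        # service it immediately instead of following sector order.
        if self.fifo and self.fifo[0][1] <= now:
            req = self.fifo.popleft()
            self.elevator.remove(req)
            heapq.heapify(self.elevator)
            return req
        # Otherwise follow the elevator: pick the lowest sector.
        if self.elevator:
            req = heapq.heappop(self.elevator)
            self.fifo.remove(req)
            return req
        return None

# A far-out request (sector 900000) keeps losing to nearby requests
# until its deadline expires, at which point it is dispatched anyway.
sched = ToyDeadlineScheduler()
sched.add_request(900000, now=0.0)
for t in (0.1, 0.2, 0.3):
    sched.add_request(100, now=t)
print(sched.dispatch(now=0.1))  # sector 100 (elevator order)
print(sched.dispatch(now=0.6))  # sector 900000 (deadline expired)
```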
Jens combined all of these ideas with the per-process queue concept to create CFQ. Exactly how these concepts interact can be rather complicated, and a detailed explanation is beyond the scope of this article, but understanding the concepts that go into CFQ is very important, particularly if we are going to try tuning the scheduler for performance.
Tunable Parameters in CFQ
Open source is a great way to get access to the code so you can adapt it to your requirements, but in many cases the developers also let you “tune” the software for your situation without having to hack the code base yourself. The IO schedulers in the Linux kernel are no exception. In particular, the CFQ scheduler has 9 parameters for tuning performance. While discussing the parameters can get long, it is worthwhile to take a look at each one in a bit more depth (a short sketch after the list shows how to read and adjust them through sysfs):
- back_seek_max
This parameter, given in KBytes, sets the maximum “distance” for backward seeking. By default, it is set to 16 MBytes. The distance is measured from the current head position to sectors that lie behind it. The idea comes from the Anticipatory Scheduler (AS), which anticipated the location of the next request. This parameter allows the scheduler to anticipate requests in the “backward” (opposite) direction and consider them as “next” if they are within this distance of the current head position.
- back_seek_penalty
This parameter is used to compute the cost of backward seeking. If the backward distance of a request is just 1/back_seek_penalty of the distance to a “front” request, then the seek costs of the two requests are considered equivalent and the scheduler will not bias toward one or the other (otherwise it biases the selection toward “front” direction requests). Recall that CFQ uses the elevator concept, so it tries to keep seeking in the current direction as much as possible to avoid the latency associated with a seek. This parameter defaults to 2, so if the backward distance is no more than 1/2 of the forward distance, CFQ considers the backward request close enough to the current head position to be treated as if it were a forward request.
- fifo_expire_async
This parameter sets the timeout of asynchronous requests. Recall that CFQ maintains a fifo (first-in, first-out) list to manage requests that have timed out. In addition, CFQ does not check the fifo for more expired requests after one expired request has been dispatched (i.e., there is a delay in processing the remaining expired requests). The default value for this parameter is 250 ms. A smaller value means a timed-out request is serviced more quickly than with a larger value.
- fifo_expire_sync
This parameter is the same as fifo_expire_async but for synchronous requests. The default value is 125 ms. If you want to favor synchronous requests over asynchronous requests, decrease this value relative to fifo_expire_async.
- slice_sync
Remember that when a queue is selected for execution, the queue’s IO requests are only executed for a certain amount of time (the time slice) before CFQ switches to another queue. This parameter is used to calculate the time slice of the synchronous queue. The default value is 100 ms, but this is not the true time slice. Rather, the time slice is computed as: time_slice = slice_sync + (slice_sync / 5 * (4 - io_priority)). If you want the time slice for the synchronous queue to be longer (perhaps you have more synchronous operations), increase the value of slice_sync.
- slice_async
This parameter is the same as slice_sync but for the asynchronous queue. The default is 40 ms. Notice that synchronous operations are favored over asynchronous operations.
- slice_async_rq
This parameter limits the number of asynchronous requests that are dispatched to the device request queue during a queue’s time slice. The maximum number of requests that may be dispatched also depends on the IO priority. The equation for the maximum number of requests is: max_nr_requests = 2 * (slice_async_rq + slice_async_rq * (7 - io_priority)). The default for slice_async_rq is 2.
- slice_idle
This parameter specifies the idle time for the synchronous queue only. During a queue’s time slice (the amount of time in which its operations can be dispatched), if there are no requests left in the synchronous queue, CFQ will not immediately switch to another queue but will sit idle waiting for the process to submit more requests. If no new requests are submitted within the idle time, the queue expires. The default value for this parameter is 8 ms. This parameter thus controls how long the scheduler waits for further synchronous requests, which matters because synchronous requests tend to block the process until the operation completes. Consequently, within this idle window the IO scheduler watches for additional synchronous requests, such as those that might come from a streaming video application or anything else that relies on synchronous operations.
- quantum
This parameter controls the number of requests dispatched to the device request queue (i.e., the number of requests that are executed, or at least sent for execution). During a queue’s time slice, a request will not be dispatched if the number of requests already in the device request queue exceeds this parameter. For the asynchronous queue, dispatching is additionally restricted by slice_async_rq. The default for this parameter is 4.
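As a concrete illustration, the sketch below reads the nine CFQ tunables from sysfs and evaluates the slice_sync and slice_async_rq formulas given above. It assumes CFQ is the active scheduler for the device; the device name (sda) and the helper function names are only examples, and writing new values requires root privileges.

```python
from pathlib import Path

# Path to the CFQ tunables for a given block device (sda is just an example).
IOSCHED = Path("/sys/block/sda/queue/iosched")

def read_tunable(name):
    """Read a single CFQ tunable (values are reported as integers)."""
    return int((IOSCHED / name).read_text().strip())

def write_tunable(name, value):
    """Write a new value to a CFQ tunable (requires root)."""
    (IOSCHED / name).write_text(str(value))

def sync_time_slice(slice_sync_ms, io_priority):
    """Effective synchronous time slice as described above:
    time_slice = slice_sync + (slice_sync / 5 * (4 - io_priority))."""
    return slice_sync_ms + (slice_sync_ms // 5) * (4 - io_priority)

def max_async_requests(slice_async_rq, io_priority):
    """Maximum async requests per slice as described above:
    2 * (slice_async_rq + slice_async_rq * (7 - io_priority))."""
    return 2 * (slice_async_rq + slice_async_rq * (7 - io_priority))

if __name__ == "__main__":
    for name in ("back_seek_max", "back_seek_penalty", "fifo_expire_async",
                 "fifo_expire_sync", "slice_sync", "slice_async",
                 "slice_async_rq", "slice_idle", "quantum"):
        print(f"{name:18s} = {read_tunable(name)}")

    # With the default slice_sync of 100 ms and a typical best-effort IO
    # priority of 4, the effective slice is 100 + 20 * (4 - 4) = 100 ms.
    print("sync slice (prio 4):", sync_time_slice(100, 4), "ms")
    print("max async requests (prio 4):", max_async_requests(2, 4))

    # Example of lengthening the synchronous slice (uncomment as root):
    # write_tunable("slice_sync", 150)
```

Because these are ordinary sysfs files, a change takes effect immediately, so you can experiment with values such as slice_sync or slice_idle while a workload is running.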
You can see that the CFQ scheduler prefers synchronous IO requests. The reason is fairly simple: synchronous IO operations block execution, so until the IO operation completes the application cannot continue to run. Such applications include streaming video and streaming audio (who wants their movie or music to be interrupted?), but many more applications perform synchronous IO.
On the other hand, asynchronous IO (AIO) can be very useful because execution returns to the application immediately, without waiting for confirmation that the operation has completed. This allows the application to “overlap” computation and IO, which can be very useful for many workloads depending upon the goals and requirements. There is quite a good article that discusses synchronous versus asynchronous and blocking versus non-blocking IO requests.
This article is from http://www.linux-mag.com/id/7572/