Chapter 6. Performance Considerations

Table of Contents

6.1. I/O Caching
6.1.1. Write-back Caching
6.1.2. Write-through Caching
6.1.3. Caching: Summary
6.2. Virtual Device
6.3. I/O Schedulers
6.3.1. Scheduler Options
6.3.2. CFQ
6.3.3. Deadline
6.3.4. NOOP
6.3.5. Backing Storage
6.3.6. Additional Resources
6.4. cgroups options

Managing disk images doesn't stop at file manipulation and storage pool monitoring. After you create a disk image, something else is going to use it. That's where performance tuning comes into play. This section straddles the line between the system administrator and application developer roles; applying some of the techniques here may require knowledge outside of your domain as a system administrator. To help bridge that gap I'll include notes on how to identify what you're looking for when tuning the system.

Many performance tuning decisions come down to one question: in the event of catastrophic system failure, how expensive is it to replace the data? If that cost is low, you can reach higher levels of performance at the price of a higher risk of data loss. If that cost is high, you can reach greater levels of data integrity at the price of performance.

In this section we'll cover the following topics:

  • I/O Caching

  • Virtual Device

  • I/O Schedulers

  • cgroups options

You may also be interested in reading over Chapter 5, Disk Formats.

6.1. I/O Caching

I/O caching requirements differ from host to host. The term refers to the mode (or write policy) by which the kernel writes modified data to both the cache and the cache's backing store. There are two general modes to consider, write-back and write-through. Let's review them now:

Write-back

Writes are done lazily. That is, writes initially happen in cache, and then are propagated to the appropriate backing storage device. Also known as write-behind.

Write-through

Writes are done synchronously to cache and the backing store (main system memory/disk drive).

Selecting the correct cache mode can increase or decrease your overall system performance. The right choice depends on several factors, including:

  • Cost of data loss

  • System latency vs. throughput requirements

  • Operating System support

  • Hypervisor feature support

  • Virtualization deployment strategy

In addition to the write-back and write-through modes there is a third pseudo-mode called none. This mode is still considered a write-back mode; the onus is on the guest operating system to handle its disk write cache correctly in order to avoid data corruption if the host crashes[45]. On a supported system where latency and throughput are valued over data integrity, you should consider choosing the none mode[44].

In the none mode, guest I/O bypasses the host's page cache entirely (QEMU opens the image with O_DIRECT), so writes travel from the guest straight to the backing storage device. This avoids caching the same data twice, once in the guest and once on the host, and it can help keep the host's page cache from becoming a point of contention when many guests share it.

As a rule of thumb, a guest handles its disk write cache correctly when its kernel, filesystem, and storage drivers issue flush (barrier) requests at the appropriate times. Modern Linux guests using the virtio drivers do this by default; older guests, such as the RHEL releases prior to 5.6 noted in the next section, do not.
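
To make this concrete, here is a minimal sketch of selecting a cache mode on the QEMU command line. The image path, memory size, and guest configuration below are hypothetical placeholders; substitute values from your own environment.

    # Start a guest with an explicit disk cache mode. cache= accepts
    # writeback, writethrough, and none (among others).
    qemu-system-x86_64 \
        -m 2048 \
        -drive file=/var/lib/libvirt/images/guest0.qcow2,format=qcow2,cache=none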

Next we'll review the two cache mode options in greater detail. At the end of this section we'll summarize the use cases for each mode.

6.1.1. Write-back Caching

Write-back caching means that as I/O from the virtual guest happens, it is reported as complete as soon as the data is in the virtual host's page cache[45]. This is a shortcut around the normal I/O process: the data is written into the system's cache and only subsequently written into the backing storage volume, whether that volume is volatile system memory (such as RAM) or a non-volatile device (such as a disk drive). In write-back caching, the new data is not written to the backing store until some later time.

I remember the phrase write-back by thinking of it like this: as soon as a write happens on the guest, a response is sent back to indicate that the operation has completed.
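
If you manage guests through libvirt rather than by invoking QEMU directly, the cache mode is set per disk with the cache attribute on the <driver> element of the domain XML. Here's a minimal sketch; the image path and target device are assumptions.

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='writeback'/>
      <source file='/var/lib/libvirt/images/guest0.qcow2'/>
      <target dev='vda' bus='virtio'/>
    </disk>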

Using write-back caching will have several side effects:

PRO: Increased performance

Both the guest and host will experience increased I/O performance due to the lazy nature of cache-writes.

CON: Increased risk of corruption

Until the data is flushed to the backing store there is an increased risk of data corruption or loss due to the volatile nature of the system cache.

CON: Doesn't support guest migrations

You cannot use the hypervisor's guest migration feature if you are using write-back cache mode (see the example following this list).

CON: Not supported by all Operating Systems

Not all operating systems handle write-back caching correctly; RHEL releases prior to 5.6 are one example[46].
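
Before attempting a migration you can check which cache mode a libvirt-managed guest's disks are using by inspecting its domain XML. The domain name guest0 below is hypothetical.

    # Show the <driver> elements (including their cache= attributes)
    # for each of the guest's disks:
    virsh dumpxml guest0 | grep 'driver name'

    # To change the mode, edit the domain and adjust the cache=
    # attribute on the relevant <driver> element:
    virsh edit guest0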

Though the CONs outnumber the PROs, in reality write-back is not as dangerous as it may appear to be. The QEMU User Documentation[45] says the following:

By default, the cache=writeback mode is used. It will report data writes as completed as soon as the data is present in the host page cache. This is safe as long as your guest OS makes sure to correctly flush disk caches where needed. If your guest OS does not handle volatile disk write caches correctly and your host crashes or loses power, then the guest may experience data corruption.

If your guest is ineligible for the none mode because it doesn't manage its disk write cache well, then write-back mode is a great secondary option.
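
If you want a quick sanity check from inside the guest that data is actually reaching stable storage, something like the following works; the commands are generic, but treat this as a sketch rather than a proof of correct flush behavior.

    # Inspect how the root filesystem is mounted; on modern ext4,
    # write barriers/flushes are enabled by default:
    mount | grep ' on / '

    # Force any dirty data in the guest's own cache out to the
    # virtual disk, which in turn triggers a flush on the host:
    sync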

6.1.2. Write-through Caching

Write-through caching means that modified data is written synchronously to both the system cache and the backing store (main system memory/disk drive). Because every write also has to reach the backing store, write-through caching introduces a performance hit.

Because write-through caching puts a larger load on the host, it is best used in moderation. Avoid enabling it on a host with many guests, as that configuration is prone to scaling issues. Consider write-through caching only where data integrity is paramount above all else, or where write-back caching is not available on the guest.
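
One rough way to see the write-through penalty is to time the same synchronous write inside a guest under each cache mode. This is a sketch, not a rigorous benchmark; the path and size are arbitrary.

    # Write 256 MiB while bypassing the guest's own page cache.
    # Expect noticeably lower throughput under cache=writethrough
    # than under cache=writeback or cache=none:
    dd if=/dev/zero of=/var/tmp/cachetest bs=1M count=256 oflag=direct
    rm -f /var/tmp/cachetest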

6.1.3. Caching: Summary

The table below summarizes the modes covered in this section.

Mode           Write completes when...            Performance   Data loss risk
-------------  ---------------------------------  ------------  --------------------------
write-back     Data reaches the host page cache;  High          Higher until data is
               flushed to the backing store                     flushed; guest must flush
               lazily                                           its disk caches correctly
write-through  Data reaches both the cache and    Lower         Lowest
               the backing store
none           Data goes straight to the backing  High          Depends on the guest
               store, bypassing the host page                   handling its own disk
               cache                                            write cache correctly

When in doubt: choose none on supported systems where latency and throughput matter most; fall back to write-back, the QEMU default, when none is not an option; and choose write-through where data integrity is paramount above all else or where write-back is unavailable.