
Virtualization Considerations for Storage Managers

This article was first published on

It’s common for new technology to require changes in all areas of an organization’s overall infrastructure. Virtualization is no exception. While many administrators often focus on CPU and memory constraints, storage-related performance is also a very common bottleneck. In some ways, virtual machines can be managed like physical ones. After all, each VM runs its own operating systems, applications, and services. But there are also numerous additional considerations that must be taken into account when designing a storage infrastructure. By understanding the unique needs of virtual machines, storage managers can build a reliable and scalable data center infrastructure to support their VMs.

Analyzing Disk Performance Requirements

For many types of applications, the storage infrastructure is designed primarily around I/O operations per second (IOPS). IOPS refers to the number of read and write operations performed each second, but it does not always capture the whole picture. Additional considerations include the type of activity. For example, since virtual disks that are stored on network-based storage arrays must support guest OS disk activity, the average I/O request size tends to be small. Additionally, I/O requests are frequent and often random in nature. Paging can also create a lot of traffic on memory-constrained host servers. There are also other considerations that will be workload-specific. For example, it’s good to measure the percentage of read vs. write operations when designing the infrastructure.
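As a rough illustration of the measurements described above, the following Python sketch turns raw counters into IOPS, read percentage, and average request size. The function name, the counter names, and the sample numbers are all made up for this example; in practice the values would come from whatever performance monitoring tool you use.

```python
def io_profile(reads, writes, bytes_transferred, seconds):
    """Summarize a sampling interval (all inputs are hypothetical counters)."""
    total_ops = reads + writes
    return {
        # Operations per second over the sampling interval
        "iops": total_ops / seconds,
        # Share of operations that were reads, as a percentage
        "read_pct": 100.0 * reads / total_ops,
        # Average size of a single I/O request, in KB
        "avg_request_kb": bytes_transferred / total_ops / 1024.0,
    }


# Illustrative one-minute sample: 60,000 operations of 8 KB each
profile = io_profile(
    reads=42_000,
    writes=18_000,
    bytes_transferred=60_000 * 8 * 1024,
    seconds=60,
)
print(profile)
```

A profile like this (many small requests, mostly reads) is the kind of input that can guide decisions about array configuration and VM placement.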

Now, multiply all of these statistics by the number of VMs that are being supported on a single storage device, and you are faced with the very real potential for large traffic jams. The solution? Optimize the storage solution for supporting many small, non-sequential I/O operations. And, most importantly, distribute VMs based on their levels and types of disk utilization. Performance monitoring can help generate the information you need.
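One simple way to think about distributing VMs by disk utilization is a greedy placement: take the busiest VMs first and assign each one to the volume that currently carries the least load. The sketch below is only an illustration of that idea. The VM names, volume names, and IOPS figures are invented, and a real plan would also weigh capacity, read/write mix, and redundancy.

```python
from dataclasses import dataclass, field


@dataclass
class Volume:
    name: str
    vms: list = field(default_factory=list)
    total_iops: float = 0.0


def distribute_vms(vm_iops, volume_names):
    """Assign each VM to the least-loaded volume, heaviest VMs first."""
    volumes = [Volume(name) for name in volume_names]
    by_load = sorted(vm_iops.items(), key=lambda item: item[1], reverse=True)
    for vm, iops in by_load:
        # Pick the volume with the lowest accumulated IOPS so far
        target = min(volumes, key=lambda v: v.total_iops)
        target.vms.append(vm)
        target.total_iops += iops
    return volumes


# Hypothetical measured averages from performance monitoring
measured = {"web01": 300, "db01": 1200, "mail01": 800, "file01": 500, "test01": 100}
placement = distribute_vms(measured, ["vol-a", "vol-b"])
for vol in placement:
    print(vol.name, vol.vms, vol.total_iops)
```

The heuristic does not guarantee an optimal split, but it keeps the two busiest VMs apart, which is usually the first goal.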

Considering Network-Based Storage Approaches

Many environments already use a combination of NAS, SAN, and iSCSI-based storage to support their physical servers. These methods can still be used for hosting virtual machines, as most virtualization platforms provide support for them. For example, SAN- or iSCSI-based volumes that are attached to a physical host server can be used to store virtual machine configuration files, virtual hard disks, and related data. It is important to note that, by default, the storage is attached to the host and not to the guest VM. Storage managers should keep track of which VMs reside on which physical volumes for backup and management purposes.

In addition to providing storage at the host-level, guest operating systems (depending on their capabilities) can take advantage of NAS and iSCSI-based storage. With this approach, VMs can directly connect to network-based storage. A potential drawback, however, is that guest operating systems can be very sensitive to latency, and even relatively small delays can lead to guest OS crashes or file system corruption.

Evaluating Useful Storage Features

As organizations place multiple mission-critical workloads on the same servers through the use of virtualization, they can use various storage features to improve reliability, availability and performance. Implementing RAID-based striping across arrays of many disks can help significantly improve performance. The array’s block size should be matched to the most common size of I/O operations. However, more disks means more chances for failures. So, features such as multiple parity drives and hot standby drives are a must.
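To see why larger arrays call for parity and hot standby drives, it can help to estimate the chance that at least one disk in the array fails. The sketch below assumes that disks fail independently and uses a purely illustrative per-disk failure probability of 3% over some period; neither assumption comes from vendor data.

```python
def prob_any_disk_failure(disk_count, per_disk_failure_prob):
    """Probability that at least one of disk_count disks fails,
    assuming independent failures with the same probability."""
    return 1.0 - (1.0 - per_disk_failure_prob) ** disk_count


# Illustrative only: 3% per-disk failure probability over the period
for disks in (2, 4, 8, 16):
    print(disks, round(prob_any_disk_failure(disks, 0.03), 3))
```

The probability grows steadily with the number of disks, which is the trade-off that striping across many spindles introduces.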

Fault tolerance can be implemented through the use of multi-pathing for storage connections. For NAS and iSCSI solutions, storage managers should look into having multiple physical network connections and implementing fail-over and load-balancing features by using network adapter teaming. Finally, it’s a good idea for host servers to have dedicated network connections to their storage arrays. While you can often get by with shared connections in low-utilization scenarios, the load placed by virtual machines can be significant and can increase latency.

Planning for Backups

Storage administrators will need to back up many of their virtual machines. Apart from allocating the necessary storage space, they must develop a method for dealing with exclusively locked virtual disk files. There are two main approaches:

  • Guest-Level Backups: In this approach, VMs are treated like physical machines. Generally, you would install backup agents within VMs, define backup sources and destinations, and then let them go to work. The benefit of this approach is that only important data is backed up (thereby reducing required storage space). However, your backup solution must be able to support all potential guest OS’s and versions. And, the complete recovery process can involve many steps, including reinstalling and reconfiguring the guest OS.
  • Host-Level Backups: Virtual machines are conveniently packaged into a few important files. Generally, this includes the VM configuration file and virtual disks. You can simply copy these files to another location. The most compatible approach involves stopping or pausing the VM, copying the necessary files, and then restarting the VM. The issue, however, is that this can require downtime. Numerous first- and third-party solutions are able to back up VMs while they’re “hot”, thereby eliminating service interruptions. Regardless of the method used, replacing a failed or lost VM is easy – simply restore the necessary files to the same or another host server and you should be ready to go. The biggest drawback of host-level backups is in the area of storage requirements. You’re going to be allocating a ton of space for the guest OSes, applications, and data you’ll be storing.
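The stop, copy, and restart sequence for a host-level backup can be sketched as follows. This is a minimal outline, not a product-specific script: `stop_vm` and `start_vm` are hypothetical callables that you would supply for your virtualization platform, and the file list is whatever configuration and virtual disk files make up the VM.

```python
import shutil
from pathlib import Path


def cold_backup(vm_files, backup_dir, stop_vm, start_vm):
    """Copy a VM's files while it is stopped, then restart it.

    stop_vm and start_vm are caller-supplied (hypothetical) functions
    that stop or pause the VM and bring it back online.
    """
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    stop_vm()
    try:
        for src in vm_files:
            dest = backup_dir / Path(src).name
            shutil.copy2(src, dest)  # copies contents and file metadata
            copied.append(dest)
    finally:
        # Restart the VM even if a copy fails, to limit downtime
        start_vm()
    return copied
```

Wrapping the copy in `try`/`finally` reflects the main concern with this approach: the VM is offline for the duration, so it should come back up no matter how the copy ends.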

Storage solution options, such as the ability to perform snapshot-based backups, can be useful. However, storage administrators should thoroughly test the solution and should look for explicitly-stated virtualization support from their vendors. Remember, backups must be consistent to a point in time, and non-virtualization-aware solutions might neglect to flush information stored in the guest OS’s cache.


By understanding and planning for the storage-related needs of virtual machines, storage administrators can help their virtual environments scale and keep pace with demand. While some of the requirements are somewhat new, many involve utilizing the same storage best practices that are used for physical machines. Overall, it’s important to measure performance statistics and to consider storage space and performance when designing a storage infrastructure for VMs.

Optimizing Microsoft Virtual Server, Part 3: Designing Virtual Hard Disk Storage

This article was first published on

Much of the power and flexibility of virtualization solutions comes from the features available for virtual hard disks. Unfortunately, with the many different configuration types that are available, you can end up reducing overall performance if you’re not careful. A key concept is virtual hard disk file placement. Let’s look at some scenarios and recommendations that can have a significant impact on performance.

Note: For an introduction to working with Virtual Server’s disk architecture, see Understanding Virtual Hard Disk Options.

VHD File Placement

Most production-class servers will have multiple physical hard disks installed, often to improve performance and to provide redundancy. When planning for allocating VHDs on the host’s file system, the rule is simple: Reduce disk contention. The best approach requires an understanding of how VHD files are used.

If each of your VMs has only one VHD, then you can simply spread them across the available physical spindles based on their expected workload. A common configuration is to use one VHD for the OS and to attach another for data storage. If both VHDs will be busy, placing them on different physical volumes can avoid competition for resources. Other configurations can be significantly more complicated, but the general rule still applies: try to spread disk activity across physical spindles whenever possible.

Managing Undo and Differencing Disks

If you are using undo disks or differencing disks, you’ll want to arrange them such that concurrent I/O is limited. Figure 1 shows an example in which differencing disks are spread across physical disks. In this configuration, the majority of disk read activity is occurring on the parent VHD file, whereas the differencing disk will experience the majority of write activity. Of course, these are only generalizations as the size of the VHDs and the actual patterns of read and write activity can make a huge difference.


Figure 1: Arranging parent and child VHD files for performance.

In some cases, using undo disks can improve performance (for example, when the undo disks and base VHDs are on separate physical spindles). In other cases, such as when you have a long chain of differencing disks, you can generate a tremendous amount of disk-related overhead. For some read and write operations, Virtual Server might need to access multiple files to find the “latest” version of the data. And, this problem will get worse over time. Committing undo disks and merging differencing disks with their parent VHDs are important operations that can help restore overall performance.
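To make the overhead of long chains more concrete, here is a toy model of how a read might be resolved through a chain of differencing disks. It is a simplified illustration and not the actual VHD format or Virtual Server’s implementation: each disk is just a dictionary mapping block numbers to data, and all names are made up.

```python
def read_block(chain, block_number):
    """Look up a block, newest differencing disk first, base VHD last.

    Returns (data, files_checked) so the lookup cost is visible.
    """
    for files_checked, disk in enumerate(chain, start=1):
        if block_number in disk:
            return disk[block_number], files_checked
    return None, len(chain)


def merge_chain(chain):
    """Collapse the chain into one disk; newer blocks override older ones."""
    merged = {}
    for disk in reversed(chain):  # apply base first, newest last
        merged.update(disk)
    return merged


base = {0: "boot", 1: "os-v1", 2: "data-v1"}
diff1 = {1: "os-v2"}    # first differencing disk
diff2 = {2: "data-v3"}  # newest differencing disk
chain = [diff2, diff1, base]

print(read_block(chain, 0))                 # unchanged block: found only in the base
print(read_block(chain, 2))                 # recently written block: found right away
print(read_block([merge_chain(chain)], 0))  # after merging: a single lookup
```

In this model, a block that was never changed costs one lookup per file in the chain, which is why merging restores performance.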

Fixed-Size vs. Dynamically-Expanding VHDs

The base type for VHDs you create can have a large effect on overall performance. While dynamically-expanding VHDs can make more efficient use of physical disk space on the host, they tend to get fragmented as they grow. Fixed-size VHDs are more efficient since physical disk space is allocated and reserved when they’re created. The general rule is, if you can spare the disk space, go with fixed-size hard disks. Also, keep in mind that you can always convert between fixed-size and dynamically-expanding VHDs, if your needs change.

Host Storage Configuration

The ultimate disk-related performance limits for your VMs will be determined by your choice of host storage hardware. One important decision (especially for lower-end servers) is the type of local storage connection. IDE-based hard disks will offer the poorest performance, whereas SATA, SCSI, and Serial-Attached SCSI (SAS) will offer many improvements. The key to the faster technologies is that they can efficiently carry out multiple concurrent I/O operations (a common scenario when multiple VMs are cranking away on the same server).

When evaluating local storage solutions, there are a couple of key parameters to keep in mind. The first is overall disk throughput (which reflects the total amount of data that can be passed over the connection in a given amount of time). The other important metric is the number of I/O operations per second that can be processed. VM usage patterns often result in a large number of small I/O operations. Just as important is the number of physical hard disks that are available. The more physical disk spindles that are available, the better your overall performance will be.
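The two metrics are related through the request size: throughput is roughly IOPS multiplied by the average size of each request. The short sketch below uses invented workload figures to show why a VM-style workload can be busy in terms of IOPS while moving relatively little data.

```python
def throughput_mb_per_s(iops, request_size_kb):
    """Approximate throughput as IOPS times average request size."""
    return iops * request_size_kb / 1024.0


# Illustrative workloads (not measurements)
small_random = throughput_mb_per_s(iops=5000, request_size_kb=4)
large_sequential = throughput_mb_per_s(iops=200, request_size_kb=512)

print(f"Many small requests: {small_random:.2f} MB/s")
print(f"Few large requests:  {large_sequential:.2f} MB/s")
```

Here the small-request workload issues 25 times as many operations but moves less than a fifth of the data, so a throughput figure alone would understate how hard the disks are working.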

Using RAID

Various implementations of RAID technology can also make the job of placing VHD files easier. Figure 2 provides a high-level overview of commonly-used RAID levels, and their pros and cons. By utilizing multiple physical spindles in each array, performance can be significantly improved. Since multiple disks are working together at the disk level, the importance of manually moving VHD files to independent disks is reduced. And, of course, you’ll have the added benefit of fault-tolerance.


Figure 2: Comparing various RAID levels

Virtual IDE vs. SCSI Controllers

Virtual Server gives you two different methods for connecting virtual hard disks to your VMs: IDE and SCSI. Note that these options are independent of the storage technology you’re using on the host server. The main benefit of IDE is compatibility: Pretty much every x86-compatible operating system supports the IDE standard. You can have up to four IDE connections per VM, and each can have a virtual hard disk or virtual CD/DVD-ROM device attached.

While IDE-based connections work well for many simpler VMs, SCSI connections offer numerous benefits. First, VHDs attached to an IDE channel are limited to 127GB, whereas SCSI-attached VHDs can be up to 2 terabytes in size. Additionally, the virtual SCSI controller can support up to a total of 28 attached VHDs (four SCSI adapters times seven available channels on each)! Figure 3 provides an overview of the number of possible disk configurations.


Figure 3: Hard disk connection interface options for VHDs
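The limits described above can be captured in a small planning check. The sketch below is only an illustration: the dictionary layout and function name are invented, 2 terabytes is treated as 2,048 GB, and the IDE count assumes all four connections are used for hard disks rather than CD/DVD-ROM devices.

```python
# Limits taken from the text above; the structure itself is hypothetical
LIMITS = {
    "ide": {"max_devices": 4, "max_vhd_gb": 127},
    "scsi": {"max_devices": 4 * 7, "max_vhd_gb": 2 * 1024},
}


def check_disk_plan(controller, vhd_sizes_gb):
    """Return a list of problems with a planned set of VHDs for one VM."""
    limits = LIMITS[controller]
    problems = []
    if len(vhd_sizes_gb) > limits["max_devices"]:
        problems.append(
            f"{len(vhd_sizes_gb)} VHDs exceeds the {controller.upper()} "
            f"limit of {limits['max_devices']}"
        )
    for size in vhd_sizes_gb:
        if size > limits["max_vhd_gb"]:
            problems.append(
                f"{size} GB VHD exceeds the {controller.upper()} "
                f"limit of {limits['max_vhd_gb']} GB"
            )
    return problems


print(check_disk_plan("ide", [40, 200]))   # the 200 GB disk is too large for IDE
print(check_disk_plan("scsi", [40, 200]))  # fits within the SCSI limits
```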

If that isn’t enough, there’s one more advantage: SCSI-attached VHDs often perform better than IDE-attached VHDs, especially when the VM is generating a lot of concurrent I/O operations. Figure 4 shows a SCSI-attached VHD being configured for a VM.


Figure 4: Configuring a SCSI-attached VHD for a VM.

One helpful feature is that, in general, the same VHD file can be attached to either IDE or SCSI controllers without making changes. A major exception to the rule is generally the boot hard disk, as BIOS and driver changes will likely be required to make that work. Still, the rule for performance is pretty simple: Use SCSI-attached VHDs whenever you can and use IDE-attached VHDs whenever you must.


When you’re trying to set up a new Virtual Server installation for success, designing and managing VHD storage options is a great first step. Disk I/O bottlenecks are a common cause of real-world performance limitations, but there are several ways to reduce them. In the next article, I’ll talk about maintaining VHDs to preserve performance over time.