This article was first published on SearchServerVirtualization.TechTarget.com.
It’s common for new technology to require changes in all areas of an organization’s overall infrastructure. Virtualization is no exception. While many administrators focus on CPU and memory constraints, storage-related performance is also a very common bottleneck. In some ways, virtual machines can be managed like physical ones. After all, each VM runs its own operating system, applications, and services. But there are also numerous additional considerations that must be taken into account when designing a storage infrastructure. By understanding the unique needs of virtual machines, storage managers can build a reliable and scalable data center infrastructure to support their VMs.
Analyzing Disk Performance Requirements
For many types of applications, the storage infrastructure is designed primarily around I/O operations per second (IOPS). IOPS measures the number of read and write operations performed each second, but it does not always capture the whole picture. The type of activity also matters. For example, since virtual disks that are stored on network-based storage arrays must support guest OS disk activity, the average I/O request size tends to be small. Additionally, I/O requests are frequent and often random in nature. Paging can also create a lot of traffic on memory-constrained host servers. Other considerations are workload-specific. For example, it’s useful to measure the percentage of read vs. write operations when designing the infrastructure.
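As a rough illustration of the measurements described above, the following sketch turns raw I/O counters into IOPS, read percentage, and average request size. The `IoSample` record and its fields are hypothetical; real counters would come from whatever performance monitoring tool the host or array provides.

```python
from dataclasses import dataclass


@dataclass
class IoSample:
    """Counters for one monitoring interval (hypothetical format)."""
    seconds: float
    reads: int
    writes: int
    bytes_read: int
    bytes_written: int


def summarize(samples):
    """Return overall IOPS, read percentage, and average request size in KiB."""
    total_seconds = sum(s.seconds for s in samples)
    reads = sum(s.reads for s in samples)
    writes = sum(s.writes for s in samples)
    total_ops = reads + writes
    total_bytes = sum(s.bytes_read + s.bytes_written for s in samples)
    if total_seconds <= 0 or total_ops == 0:
        raise ValueError("need at least one interval with I/O activity")
    return {
        "iops": total_ops / total_seconds,
        "read_pct": 100.0 * reads / total_ops,
        "avg_request_kib": total_bytes / total_ops / 1024,
    }


# Two made-up one-minute intervals: 4 KiB reads and 8 KiB writes.
samples = [
    IoSample(60, 9000, 3000, 9000 * 4096, 3000 * 8192),
    IoSample(60, 3000, 3000, 3000 * 4096, 3000 * 8192),
]
profile = summarize(samples)
print(profile)
```

A profile like this does not show whether the requests are random or sequential; that still has to come from the monitoring tool itself.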
Now, multiply all of these statistics by the number of VMs that are being supported on a single storage device, and you are faced with the very real potential for large traffic jams. The solution? Optimize the storage solution for many small, non-sequential I/O operations. And, most importantly, distribute VMs based on their levels and types of disk utilization. Performance monitoring can help generate the information you need.
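One simple way to spread VMs by utilization level is a greedy placement: take the busiest VM first and put it on the volume that currently carries the least load. The sketch below assumes a measured average IOPS figure per VM; the VM and volume names are made up, and it balances load level only, not the type of activity.

```python
import heapq


def distribute_vms(vm_iops, volume_names):
    """Greedy placement: heaviest VM first, onto the least-loaded volume."""
    # Min-heap of (current load, volume name) so the lightest volume pops first.
    heap = [(0.0, name) for name in volume_names]
    heapq.heapify(heap)
    placement = {name: [] for name in volume_names}
    for vm, iops in sorted(vm_iops.items(), key=lambda kv: kv[1], reverse=True):
        load, name = heapq.heappop(heap)
        placement[name].append(vm)
        heapq.heappush(heap, (load + iops, name))
    loads = {name: load for load, name in heap}
    return placement, loads


# Hypothetical measurements gathered through performance monitoring.
vm_iops = {"db01": 900, "mail01": 600, "web01": 300, "web02": 300, "file01": 500}
placement, loads = distribute_vms(vm_iops, ["vol-a", "vol-b"])
print(placement)
print(loads)
```

A greedy pass like this is not guaranteed to find the best possible balance, but it is usually a reasonable starting point that can be adjusted by hand.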
Considering Network-Based Storage Approaches
Many environments already use a combination of NAS, SAN, and iSCSI-based storage to support their physical servers. These methods can still be used for hosting virtual machines, as most virtualization platforms provide support for them. For example, SAN- or iSCSI-based volumes that are attached to a physical host server can be used to store virtual machine configuration files, virtual hard disks, and related data. It is important to note that, by default, the storage is attached to the host and not to the guest VM. Storage managers should keep track of which VMs reside on which physical volumes for backup and management purposes.
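Tracking which VMs live on which volumes can be as simple as inverting an inventory list. The sketch below assumes the inventory is available as (VM, volume) pairs; the names are invented, and in practice the pairs would be exported from the virtualization platform's management tools.

```python
from collections import defaultdict


def vms_by_volume(inventory):
    """Group VM names by the volume that holds their files."""
    index = defaultdict(set)
    for vm, volume in inventory:
        index[volume].add(vm)
    # Sorted lists make the report stable from run to run.
    return {volume: sorted(vms) for volume, vms in index.items()}


# Hypothetical inventory of (VM, volume) pairs.
inventory = [("web01", "lun-01"), ("db01", "lun-01"), ("mail01", "lun-02")]
volume_index = vms_by_volume(inventory)
print(volume_index)
```

With an index like this, it is easy to see which VMs are affected when a particular volume is scheduled for backup or maintenance.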
In addition to providing storage at the host-level, guest operating systems (depending on their capabilities) can take advantage of NAS and iSCSI-based storage. With this approach, VMs can directly connect to network-based storage. A potential drawback, however, is that guest operating systems can be very sensitive to latency, and even relatively small delays can lead to guest OS crashes or file system corruption.
Evaluating Useful Storage Features
As organizations place multiple mission-critical workloads on the same servers through the use of virtualization, they can use various storage features to improve reliability, availability, and performance. Implementing RAID-based striping across arrays of many disks can help significantly improve performance. The array’s block size should be matched to the most common size of I/O operations. However, more disks mean more chances for failure, so features such as multiple parity drives and hot standby drives are a must.
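The effect of disk count on failure risk can be estimated with basic probability. Assuming each disk fails independently with the same probability p over some period (a simplification, since real failures are often correlated), the chance that at least one of n disks fails is 1 - (1 - p)^n. The 3% figure below is purely illustrative.

```python
def prob_any_disk_fails(disk_count, per_disk_failure_prob):
    """Probability that at least one disk fails, assuming independent failures."""
    if disk_count < 0:
        raise ValueError("disk_count must be non-negative")
    if not 0.0 <= per_disk_failure_prob <= 1.0:
        raise ValueError("per_disk_failure_prob must be between 0 and 1")
    return 1.0 - (1.0 - per_disk_failure_prob) ** disk_count


# Illustrative per-disk failure probability for the period of interest.
p = 0.03
for n in (1, 4, 8, 16):
    print(f"{n:2d} disks: {prob_any_disk_fails(n, p):.1%} chance of at least one failure")
```

The risk grows quickly with array size, which is why parity and hot standby drives matter more as arrays get wider.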
Fault tolerance can be implemented through the use of multi-pathing for storage connections. For NAS and iSCSI solutions, storage managers should look into having multiple physical network connections and implementing fail-over and load-balancing features by using network adapter teaming. Finally, it’s a good idea for host servers to have dedicated network connections to their storage arrays. While you can often get by with shared connections in low-utilization scenarios, the load placed by virtual machines can be significant and can increase latency.
Planning for Backups
Storage administrators will need to back up many of their virtual machines. Apart from allocating the necessary storage space, they must develop a method for dealing with exclusively-locked virtual disk files. There are two main approaches:
- Guest-Level Backups: In this approach, VMs are treated like physical machines. Generally, you would install backup agents within VMs, define backup sources and destinations, and then let them go to work. The benefit of this approach is that only important data is backed up (thereby reducing required storage space). However, your backup solution must be able to support all potential guest OSes and versions. Also, the complete recovery process can involve many steps, including reinstalling and reconfiguring the guest OS.
- Host-Level Backups: Virtual machines are conveniently packaged into a few important files. Generally, this includes the VM configuration file and virtual disks. You can simply copy these files to another location. The most compatible approach involves stopping or pausing the VM, copying the necessary files, and then restarting the VM. The issue, however, is that this can require downtime. Numerous first- and third-party solutions are able to back up VMs while they’re “hot”, thereby eliminating service interruptions. Regardless of the method used, replacing a failed or lost VM is easy: simply restore the necessary files to the same or another host server and you should be ready to go. The biggest drawback of host-level backups is in the area of storage requirements. You’re going to be allocating a ton of space for the guest OSes, applications, and data you’ll be storing.
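The pause, copy, and restart sequence for a host-level backup can be sketched as follows. This is a minimal outline, not a complete backup tool: `pause_vm` and `resume_vm` are hypothetical callables standing in for whatever commands the virtualization platform provides, and the file list is assumed to be known in advance.

```python
import shutil
from pathlib import Path


def cold_backup(vm_files, dest_dir, pause_vm, resume_vm):
    """Pause the VM, copy its files, and always resume it afterwards."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    pause_vm()  # hypothetical hook: stop or pause the VM so its files are unlocked
    try:
        for src in map(Path, vm_files):
            target = dest / src.name
            shutil.copy2(src, target)  # copy2 also preserves file timestamps
            copied.append(target)
    finally:
        resume_vm()  # hypothetical hook: runs even if a copy fails
    return copied
```

Wrapping the copy in `try`/`finally` keeps a failed copy from leaving the VM paused, which limits the downtime to the length of the copy itself.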
Storage solution options such as the ability to perform snapshot-based backups can be useful. However, storage administrators should thoroughly test the solution and should look for explicitly-stated virtualization support from their vendors. Remember, backups must be consistent to a point in time, and non-virtualization-aware solutions might neglect to flush information stored in the guest OS’s cache.
Summary
By understanding and planning for the storage-related needs of virtual machines, storage administrators can help their virtual environments scale and keep pace with demand. While some of the requirements are somewhat new, many involve utilizing the same storage best practices that are used for physical machines. Overall, it’s important to measure performance statistics and to consider storage space and performance when designing a storage infrastructure for VMs.