This article was first published on SearchServerVirtualization.TechTarget.com.
Providing and managing storage resources in any IT environment can quickly grow out of control. When you’re using local storage, you often run into limitations based on the number of hard disks that can physically be attached to a single computer. Multiply these requirements by dozens or hundreds of servers, and it quickly becomes unmanageable. Fortunately, there’s a potential solution in centralized, network-based storage. In this article, we’ll look at how you can use network-based storage options to improve the performance and manageability of virtual machines running on Microsoft Virtual Server.
Effects of Network-Based Storage
Using network-based storage can have several effects on overall performance: some are good, some are (potentially) bad. Let’s start with the positive: Disk and network caching, which is common on many storage solutions, can help increase overall performance. When using centralized storage, even relatively small solutions might have multiple gigabytes of high-speed memory cache. Any time you can avoid physical disk access, it’s a win from a performance standpoint. Additionally, when using centralized storage, you can take advantage of advanced backup and recovery features such as snapshots and split mirrors (the terminology and technology vary by vendor).
There are some downsides to network-based storage. First and foremost is latency: performing round trips across a network can be time-consuming, and long delays could lead to VM crashes. Also, the added burden on the network when multiple VMs are trying to use resources can require infrastructure upgrades. Overall, the benefits can outweigh the risks and difficulties (as long as you plan and test properly). With this in mind, let’s look at some technical approaches.
Sharing Virtual Hard Disks (VHDs)
The fact that VHDs are actually files comes with an unexpected benefit: Multiple VMs can access the same VHD files concurrently, as long as the VHD files are read-only. This is a great option if you’re already planning to use undo disks and/or differencing disks, since the base or parent VHDs will be read-only anyway. While sharing files among many VMs might increase contention and generate “hot spots” on the host file system, these effects can be offset by caching. Only performance testing can provide the real numbers, but if sharing meets your needs, you’ll have the added benefit of minimizing physical disk space usage.
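To put rough numbers on the disk space savings, here is a minimal Python sketch. The function name and the sizes (an 8 GB base image, 2 GB of changes per VM, 10 VMs) are hypothetical values chosen for illustration, not measurements from a real deployment.

```python
def disk_space_gb(base_gb, diff_gb_per_vm, vm_count, shared):
    """Estimate total disk usage in GB for a group of similar VMs.

    All inputs are illustrative assumptions supplied by the caller.
    """
    if shared:
        # One read-only parent VHD plus one differencing disk per VM.
        return base_gb + vm_count * diff_gb_per_vm
    # Each VM keeps its own full copy of the base image plus its changes.
    return vm_count * (base_gb + diff_gb_per_vm)


separate = disk_space_gb(8, 2, 10, shared=False)
shared = disk_space_gb(8, 2, 10, shared=True)
print(separate, shared, separate - shared)  # prints: 100 28 72
```

With these assumed sizes, sharing the parent VHD would cut usage from 100 GB to 28 GB. The real savings depend on how quickly each VM’s differencing disk grows.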
Using Network-Attached Storage (NAS)
NAS devices provide access to files over a network connection. Standard Windows file shares are the most common example. While NAS devices can support several different protocols, in the Windows world, the CIFS standard is most common. Microsoft’s implementation (SMB) is the protocol that allows Windows users to access file shares. A simple approach involves configuring one or more virtual machines to access a virtual hard disk over the network using a UNC path instead of a local path. Figure 1 provides an example.
Figure 1: Accessing a VHD over the network
In order to implement this configuration, the Virtual Server service account must have access to the remote network location, and proper permissions must be set. Whenever a guest OS makes a disk I/O request, Virtual Server sends the request over the network to the VHD file located on the file share.
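Because every guest disk request becomes a network round trip in this configuration, it can help to compare read times for a local file and a file on the share before moving a VHD. The following Python sketch is one rough way to do that. The helper names are hypothetical, it is not part of Virtual Server, and it only approximates what a guest OS would experience (operating system and storage caches will affect the results).

```python
import os
import random
import time
from pathlib import PureWindowsPath


def is_unc_path(path):
    """Return True if the path starts with a UNC server and share name."""
    # PureWindowsPath parses Windows-style paths on any platform.
    drive = PureWindowsPath(path).drive
    return drive.startswith("\\\\")


def time_random_reads(path, reads=100, block_size=64 * 1024):
    """Return the average number of seconds per random block read."""
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("file is empty")
    # Keep each read inside the file; small files are read from offset 0.
    max_offset = max(size - block_size, 0)
    with open(path, "rb", buffering=0) as f:
        start = time.perf_counter()
        for _ in range(reads):
            f.seek(random.randint(0, max_offset))
            f.read(block_size)
        elapsed = time.perf_counter() - start
    return elapsed / reads
```

For example, calling time_random_reads on a copy of a test file stored locally and again on the same file under a UNC path (such as a hypothetical \\fileserver\vhds share) gives two averages to compare. The is_unc_path helper simply confirms that a configured path points at a network share rather than a local drive.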
Using a Storage Area Network (SAN)
SAN technology is based on low-latency, high-performance Fibre Channel networks. The idea is to centralize storage while providing the highest levels of disk compatibility and performance. The major difference between SAN and NAS devices is that SANs use block-level I/O. This means that, to the host operating system, SAN-based storage is indistinguishable from local storage. You can perform operations such as formatting and defragmenting a SAN-attached volume. In contrast, with NAS-based access, you’re limited to file-level operations.
The major drawbacks related to SANs are cost (Fibre Channel host bus adapters and switch ports can be very expensive) and management. Generally, a pool of storage must be carved into smaller slices, each of which is dedicated to a server. This can often lead to wasted disk space (although many vendors have introduced methods for more dynamically managing allocation). Figure 2 shows a high-level logical view of a typical SAN implementation.
Figure 2: A basic Storage Area Network (SAN) environment
In order to improve management and reduce costs, configurations that combine SAN and NAS technologies are common in many environments. The Virtual Server computers can access VHD files using the NAS devices and the NAS devices, in turn, will connect to the SAN. This method can help reduce costs (by limiting the number of Fibre Channel ports and connections required) and simplify administration. Figure 3 provides an example of this type of configuration.
Figure 3: Combining NAS and SAN devices to store VHD files
Using iSCSI
The iSCSI standard was designed to provide the storage characteristics of SCSI connections over an Ethernet network. iSCSI clients and servers (called initiators and targets, respectively) are readily available from many different vendors. As with SAN technology, iSCSI provides for block-level disk access. The major benefit of iSCSI is that it can work over an organization’s existing investment in copper-based Ethernet (which is dramatically cheaper than Fibre Channel solutions). Some benchmarks have shown that iSCSI can offer performance similar to Fibre Channel solutions. On the initiator side, iSCSI can be implemented as a software-based solution, or can take advantage of dedicated accelerator cards.
Comparing Network Storage Options
The bottom line for organizations that are trying to manage storage-hungry VMs is that there are several options available for centralizing storage. One major caveat is that you should verify support policies with vendors. Unsupported configurations may work, but you’ll be running without a safety net. And I can’t overstate the importance of testing network-based storage configurations. Issues such as latency and protocol implementation nuances can lead to downtime and data loss. Overall, however, storing VHDs on network-based storage makes a lot of sense and can help reduce some major virtualization headaches.