At VMworld in August 2013, VMware announced VMware Virtual SAN (VSAN). It was in public beta until early March 2014 and went GA (General Availability) on March 10th. VSAN is VMware's native version of Software Defined Storage (SDS). It is simple to set up and is managed by user-defined policies that are applied to VMs as needed. This policy-based control is what makes VSAN so powerful.
This white paper will look at VSAN, including what it is, basic requirements, how it works, and how various types of failures are handled. Some common uses will also be discussed.
Vendor lock-in on the storage side is a big problem in many environments today. Storage area network (SAN) and Network Attached Storage (NAS) arrays are expensive to acquire (often hundreds of thousands to millions of dollars per array), not counting the expertise required to operate and tune each array and to perform the necessary provisioning, monitoring, and optimization tasks. In addition, with the cost per gigabyte dropping rapidly, purchasing storage well in advance of when it is actually needed is expensive (it would be cheaper if purchased just before it was needed), yet buying incrementally can lead to a "death by a thousand cuts" syndrome of constantly going back to management to ask for another few disks (relatively inexpensive), a new shelf (somewhat more expensive), or an entire new array (a very expensive proposition). Until now, these costs were deemed unavoidable and worth paying to ensure high availability, shared access across hosts, low latencies, and so on. Those features will still be required in large, complex companies (and for core data center functions) for years to come, but in many other cases they may not be (see the Use Cases section for some ideas on where this technology may make sense).
VSAN is implemented at the kernel level, and thus doesn't suffer from the performance disadvantages of the Virtual Storage Appliance (VSA), which was (and is) implemented as a virtual appliance (VA). While the VSA was designed as a small-to-medium business (SMB) or remote office / branch office (ROBO) solution where a SAN or NAS array was too expensive, VSAN is designed for use in the enterprise in addition to the VSA use cases. Both the VSA and VSAN have the same basic purpose: take local storage located in individual servers and turn it into shared storage that can be used by HA, vMotion, DRS, etc.
VSAN is implemented at the cluster level, similarly to HA and DRS today; in fact, it is just another property of a cluster. It can be enabled in just two clicks, though there are many advanced options that can be set, along with storage policies to provide the needed availability to VMs at the best cost and performance (see the Storage Policies section on page 4 for more detail on this). The nice thing about this product is that you can scale up by adding additional storage within an ESXi host (up to 42 disks); you scale out by simply adding another ESXi host into the cluster (up to the vSphere maximum of 32 nodes).
VSAN has the following requirements:
- vSphere 5.5 Update 1 or later, with VSAN enabled at the cluster level in vCenter
- A minimum of three ESXi hosts in the cluster (up to the vSphere maximum of 32)
- At least one SSD and one magnetic disk in each host that contributes storage
- A disk controller that can present the disks individually (pass-through or RAID 0 mode)
- A VMkernel port for VSAN traffic on at least a 1 GbE NIC (10 GbE recommended)
- A VSAN license
A few quick notes on disks and disk groups before we move on. First, SSD space is used only for caching (both read and write), so any discussion of usable space ignores all of the SSD space in every host. Second, VMware's best practice is that 10% of the space in each disk group be SSD to ensure there is enough space for caching (VSAN will work with less, but performance may suffer). Third, each host can have zero to five disk groups. A host with zero disk groups can run VMs like any other host, but its storage requests will be served by the other nodes. Note: just because a VM runs on a given host, there is no guarantee that the storage backing that VM is local to the same host.
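The 10% rule of thumb above can be sketched as a quick sizing check. This is a minimal illustration, not a VMware tool; the disk sizes in the example are assumptions for demonstration only.

```python
def ssd_cache_ok(ssd_gb, magnetic_gb, ratio=0.10):
    """Check the rule of thumb that SSD capacity in a disk group
    should be at least 10% of its magnetic-disk capacity."""
    return ssd_gb >= ratio * magnetic_gb

# A hypothetical disk group: 7 x 1 TB magnetic disks and one 800 GB SSD.
# 10% of 7000 GB is 700 GB, so an 800 GB SSD is sufficient.
print(ssd_cache_ok(800, 7000))   # True
print(ssd_cache_ok(500, 7000))   # False: under-cached group
```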
Some may question the performance and scalability of the VSAN solution, as well as its CPU cost. VMware has tested and shown nearly two million input/output operations per second (IOPS) in a single cluster (read only; roughly half that in a mixed read/write scenario), and at that level only a 10% hit to CPU performance. While 10% may sound like a lot, most ESXi servers today run closer to 50% CPU utilization, so the extra load is unlikely to affect VM performance. Each cluster supports up to 4.4 petabytes of space as well, allowing for large amounts of data per cluster. Note that this space is given directly to VSAN to use; if any RAID is used (and usually it is not; just the raw disks are given to VSAN to use as it sees fit), only RAID 0 is supported. In many ways, VSAN acts like Storage Spaces in Windows Server 2012 in this regard.
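The 4.4 PB figure follows from the cluster maximums already mentioned: 32 hosts, five disk groups per host, and seven capacity disks per group. The 4 TB disk size below is an assumption (the largest common magnetic disk at the time), used only to show how the arithmetic works out.

```python
hosts = 32            # vSphere cluster maximum
groups_per_host = 5   # disk groups per host
disks_per_group = 7   # magnetic (capacity) disks per group; SSDs are cache only
disk_tb = 4           # assumed magnetic disk size, for illustration

raw_tb = hosts * groups_per_host * disks_per_group * disk_tb
print(raw_tb, "TB raw, i.e., about", raw_tb / 1000, "PB")  # 4480 TB ~ 4.4 PB
```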
VSAN is designed to perform well and to maximize available space. As mentioned previously, it is implemented at the vmkernel level to maximize performance on the compute side. It requires SSDs for caching to maximize performance on the disk side, and at least 1 GbE (with 10 GbE recommended) on the network side. To maximize space, all vmdk files are thin provisioned and no parity or mirroring RAID is employed on the hosts. Additional capacity can be added simply by adding disks to a host and giving VSAN access to them.
Enabling VSAN is a very simple process. Once VSAN has been properly licensed, simply go to the desired cluster and check the box for VSAN, like you would for DRS or HA. Note that if HA is already enabled, you will need to temporarily disable it so that VSAN can be configured, then you can re-enable it.
Once you check the box, the only major question is how VSAN should get the disks it needs to work: automatically or manually? If you choose automatically, VSAN will automatically use all the SSD and magnetic disks that it can find that are not used elsewhere in the system (for example to boot the host) and will create disk groups automatically. If you want more control over which disks are to be used and/or which disks belong in which disk groups, choose manual and configure the disks as desired.
The power of VSAN is not so much that it turns local storage into shared storage, though that is very impressive, but rather that policies can be set up and applied to VMs and the system will automatically enforce them. The following policies can be set in VSAN:
- Number of Failures to Tolerate: how many host, disk, or network failures a VM's objects can survive; VSAN keeps one additional copy of the data per failure tolerated
- Number of Disk Stripes per Object: how many magnetic disks each copy is striped across, which can improve performance
- Flash Read Cache Reservation: the percentage of the object's size reserved as read cache on SSD
- Object Space Reservation: the percentage of the object's size that is thick provisioned up front
- Force Provisioning: provision the VM even if the policy cannot currently be satisfied
By default, a single policy is created and used by everything that uses VSAN, and that policy is not visible in the Web Client. It is simply configured to tolerate the loss of a host, disk, or disk group by setting Failures to Tolerate to one.
Note: Don't confuse Storage Policies (used by VSAN only) with Storage Profiles (usable with any datastore type). Storage policies determine the performance and availability of a VM located on a VSAN datastore and are fully automatic once assigned, while Storage Profiles define a preferred storage type (typically based on the speed of the underlying disks) for a VM. The profiles are manually created and manually assigned and do not automatically move VMs to other disks if the profile-configured type is not the actual location of the VM (Storage vMotion would typically be manually invoked to fix the issue).
VSAN will look at the storage policy assigned to each VM and then automatically apply it, placing each .vmdk file (and the other VM files, collectively known as "VM Home") on the disks it chooses. In this white paper, the term vmdk is used throughout for brevity, but it applies equally to the VM Home folder.
For example, if you asked to tolerate one failure, VSAN would create two copies of the disk, each on a separate host. While you can determine where individual pieces of a VM are stored via the Web Client, the beauty of VSAN (and SDS in general) is that it really doesn't matter—the policy is what matters and the system will automatically enforce the policy (assuming enough hosts and capacity is available).
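The relationship between Failures to Tolerate and what VSAN actually creates can be sketched as follows. This is a simplified model for illustration: tolerating n failures means n + 1 full replicas, and quorum logic requires components on at least 2n + 1 hosts (replicas plus witnesses).

```python
def vsan_footprint(vmdk_gb, failures_to_tolerate):
    """Rough model of the footprint of one VSAN object for a given
    Failures to Tolerate (FTT) setting."""
    copies = failures_to_tolerate + 1          # full replicas of the data
    min_hosts = 2 * failures_to_tolerate + 1   # replicas plus witnesses
    raw_gb = vmdk_gb * copies                  # raw space consumed cluster-wide
    return copies, min_hosts, raw_gb

# A 100 GB vmdk tolerating one failure: two copies on three hosts, 200 GB raw.
print(vsan_footprint(100, 1))  # (2, 3, 200)
```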
You may be wondering how failures are handled. VSAN is very fault tolerant and can continue to operate in the event of a disk, network, or server failure. The ability to handle failures depends on the storage policies previously described. No data loss will occur in any case as writes are not acknowledged to the VM until all copies on all hosts have acknowledged the write as complete.
Let's look at each of the scenarios and review how VSAN responds.
In traditional environments, RAID solves the problem, and thus the loss of a disk is transparent to vSphere; simply replace the disk and the RAID controller rebuilds the LUN. The VSA works this way as well, but due to this need for redundancy at both the LUN and server levels, if RAID 10 is used (and often it is, for performance reasons), only 25% of the space purchased is actually usable (half lost to RAID 10 and half to RAID 1). VSAN cuts that overhead in half through its use of individual disks (or at worst RAID 0, which is equivalent from both a redundancy and a disk overhead perspective).
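The 25% versus 50% comparison above is simple arithmetic, sketched here for clarity. The assumption is local RAID 10 (half the raw space) plus a second copy across servers for the VSA, versus raw disks plus a second copy (Failures to Tolerate of one) for VSAN.

```python
def usable_fraction(local_raid_factor, copies):
    """local_raid_factor: fraction of raw space surviving local RAID
    (0.5 for RAID 10, 1.0 for raw disks or RAID 0).
    copies: replicas kept across hosts."""
    return local_raid_factor / copies

print(usable_fraction(0.5, 2))  # VSA: RAID 10 locally, mirrored -> 0.25
print(usable_fraction(1.0, 2))  # VSAN: raw disks, FTT=1      -> 0.5
```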
With VSAN, your data is replicated by policy to multiple locations (unless you choose to have no redundancy for a non-critical VM), and thus when a disk is lost, VSAN will see that it is not in compliance with the defined storage policy and immediately begin copying the data to another disk to come back into compliance automatically. When speaking of disk loss, it is important to note that the loss of an SSD will cause the entire disk group to go offline and copying of its data to a new location to begin automatically. No administrator intervention is required in this process at all (except for physically replacing the failed disk, of course). This is probably the most common failure scenario (the planned, temporary loss of a server for maintenance, patching, etc. will probably be the most common scenario overall in most environments).
In the event of a complete network failure, only VMs whose storage is entirely local to the host on which they run will continue to run (unless a storage policy that stripes a single vmdk across multiple nodes was defined). HA can attempt to restart the other VMs on nodes where their disk files are located, assuming capacity exists there and HA still has a valid network path. This is why redundant network paths (ideally to redundant switches) are always recommended.
A few notes are important when using VSAN in an HA-enabled cluster. First, to handle network partition scenarios, a witness is assigned on an additional host in the cluster to make an odd number of nodes, so that a majority (a quorum) can be online if a partition occurs. In the event of a network partition, VSAN will restart any affected VMs in the partition that has quorum. Second, the normal heartbeat communication that takes place across the management network moves to the VSAN network instead (except for checking for host isolation, which still uses the default gateway of the management network by default). Third, datastore heartbeats are disabled if the cluster has only VSAN datastores, as they would provide no additional availability.
A switch failure is a rare event, however, and thus the foregoing is unlikely to occur; the loss of a single network port, cable, or NIC is far more probable. In those cases, HA will simply restart any affected VMs on nodes that still have access to the storage (locally or across the network), and the VMs will be back online quickly. Even so, such failures are fairly rare in most environments.
The final failure scenario is the failure of a server. When a server fails, HA restarts its VMs elsewhere in the cluster as normal. Unlike in the disk failure scenario, however, VSAN does not start rebuilding the lost data on another server right away. The reason is that the server may simply be rebooting (after a patch, for example) and will be back online soon. To prevent large amounts of space from being consumed unnecessarily in such situations, a 60-minute timer starts when the server goes down; if the server is back online within that window, VSAN simply resynchronizes the copies of the data. If the timer runs out, VSAN automatically creates a new copy of the missing data, much as in the disk failure scenario described previously.
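The decision logic of that 60-minute grace period can be sketched as follows; the function name and structure are illustrative only, not VSAN internals.

```python
REBUILD_DELAY_MIN = 60  # grace period before VSAN rebuilds an absent host's data

def host_down_action(minutes_down):
    """Illustrative model of VSAN's response to a host outage."""
    if minutes_down < REBUILD_DELAY_MIN:
        return "wait"     # host may just be rebooting; resync copies on return
    return "rebuild"      # recreate the missing copies on other hosts

print(host_down_action(15))  # wait
print(host_down_action(90))  # rebuild
```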
This brings up an interesting question: what happens if the host is placed in maintenance mode? As with most things in IT, the answer is: "It depends." When a host is placed in maintenance mode, there are three things VSAN can do with the data on that host:
- Ensure accessibility (the default): migrate just enough data to keep all objects accessible, i.e., any components whose only copy resides on this host
- Full data migration: evacuate all of the data on the host to the other nodes before entering maintenance mode
- No data migration: leave the data in place; any objects whose only copies reside on this host will be inaccessible while it is down
Common use cases include:
- Test, development, and staging environments
- Virtual desktop infrastructure (VDI)
- Remote office / branch office (ROBO) deployments
- Disaster recovery (DR) targets
These use cases will prove the capability and resilience of VSAN, and as it becomes well tested and proven, it will probably move into the mainstream for storage needs in vSphere.
VSAN will fundamentally reshape the role of storage in many organizations over the coming years. In much the same way that virtualization was looked at somewhat skeptically in the early years and is now considered standard practice for most workloads in all environments (including mission critical production)—so too storage and network virtualization are in the early phases of adoption, but will soon become mainstream for many use cases. While VMware's stated goal is to coexist with existing SAN and NAS environments, it will likely replace many of them in the coming years, being much less costly, much simpler, and far easier to manage via policies.
John Hales (A+, Network+, CTT+, MCSE, MCDBA, MOUS, MCT, VCA-DCV, VCA-Cloud, VCA-Workforce Mobility, VCP, VCP-DT, VCAP-DCA, VCI, EMCSA) is a VMware instructor at Global Knowledge, teaching all of the vSphere and View classes that Global Knowledge offers. John has written a book called Administering vSphere 5: Planning, Implementing, and Troubleshooting published by Cengage, as well as other technical books—from exam-preparation books to quick-reference guides, as well as custom courseware for individual customers. John lives with his wife and children in Sunrise, Florida.