In this post I will talk about best practices when designing a vSAN cluster. I will start off by discussing why this is important, then look at disk groups and caching within these disk groups. You can also see videos where I explain these concepts in more depth on VMware Learning Zone. To see these videos you can register here.

Why is design important?

When we talk about vSAN, some of the most important aspects are how vSAN has been set up and whether we have planned accordingly. Quite often we hear of customers setting up their vSAN cluster incorrectly and hence missing out on potential performance. Just like building a car, where we need to think about wheel size, the correct tyres and so many other components so the car performs appropriately, we need to do the same from a Virtual SAN point of view. So let's get started.

Do disk groups matter and why?

Think of a disk group like you would a Storage Group on a Storage Array. With vSAN you can have hybrid or all-flash configurations. With vSAN hybrid configurations you can have 1 SSD (cache tier) and up to 7 magnetic disks (capacity tier) per disk group. For all-flash configurations you can have 1 SSD in the cache tier (this should be your faster SSD type) and up to 7 SSDs in the capacity tier (this would be your slower SSD type). With up to 5 disk groups per host, in total you can have 35 disks in the capacity tier and 5 disks in the cache tier per host.
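To make the arithmetic concrete, here is a minimal Python sketch that checks a host layout against the limits quoted above (5 disk groups per host, 1 cache device and up to 7 capacity devices per group). The DiskGroup class and both function names are made up for this illustration and are not part of any vSAN tool or API.

```python
from dataclasses import dataclass

# Limits quoted in this post (per host).
MAX_DISK_GROUPS_PER_HOST = 5
CACHE_DEVICES_PER_GROUP = 1
MAX_CAPACITY_DEVICES_PER_GROUP = 7


@dataclass
class DiskGroup:
    """Hypothetical description of one disk group (illustration only)."""
    cache_devices: int
    capacity_devices: int


def validate_host(disk_groups):
    """Return a list of problems found in a host's disk group layout."""
    problems = []
    if len(disk_groups) > MAX_DISK_GROUPS_PER_HOST:
        problems.append(f"too many disk groups: {len(disk_groups)}")
    for i, dg in enumerate(disk_groups):
        if dg.cache_devices != CACHE_DEVICES_PER_GROUP:
            problems.append(f"group {i}: needs exactly 1 cache device")
        if not 1 <= dg.capacity_devices <= MAX_CAPACITY_DEVICES_PER_GROUP:
            problems.append(f"group {i}: needs 1 to 7 capacity devices")
    return problems


def host_totals(disk_groups):
    """Return (cache devices, capacity devices) for the whole host."""
    return (sum(dg.cache_devices for dg in disk_groups),
            sum(dg.capacity_devices for dg in disk_groups))


# The maximum layout: 5 disk groups, each with 1 cache + 7 capacity devices.
max_layout = [DiskGroup(1, 7) for _ in range(5)]
print(validate_host(max_layout))  # []
print(host_totals(max_layout))    # (5, 35)
```

A fully populated host comes out at 5 cache devices and 35 capacity devices, which matches the totals above.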

So just like with traditional storage where we might have a Storage Processor acting as our cache buffer, vSAN will use the SSDs at the cache layer to improve performance.

Now if you want to improve performance then you may benefit from using multiple disk groups. As I said, you can have 5 disk groups per host; however, it's imperative to note that there are also memory requirements when it comes to using multiple disk groups. If you wish to have 5 disk groups with 7 disks each in the capacity tier then you will need to have at least 32GB of memory in the host. Doing this should improve the performance of your vSAN cluster, and since you are using multiple disk groups you will also gain better redundancy, because the failure of one cache device only affects its own disk group. This may not be so cost effective in the grander scheme of things, so you will need to keep this in mind when making your design decisions.

[Image: vsan-hybrid-all-flash]

I have unboxed my SSDs, now what?

Let's say you want to build an all-flash configuration and your disks arrive. What happens if you have unboxed all of your SSDs and you don't know which SSD should go in the cache tier?

Well, you can go to http://www.vmware.com/resources/compatibility/search.php?deviceCategory=vsanio and plug in the disk part number. As you see in the screenshot, you are then told which tier the SSD should sit in.

[Screenshot: hcl]

How do I know the right amount of cache required for optimal performance?

This is where the research will pay off once complete. You need to review the number of VMs that you want to run in the vSAN environment and then look at the workload of those virtual machines. The reason for this is that you need to ensure that the most frequently used blocks are kept in the cache tier so that read misses are kept to a minimum. The biggest issue with working all of this out is that workloads can vary at different times. Just like a kids' toy shop is going to be much busier around Christmas and may be less busy in June, the same will go for VMs in our datacenters. This is why we recommend flash cache of at least 10% of the anticipated consumed capacity for any vSAN configuration.
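As a rough worked example of the 10% rule, the sketch below estimates a minimum flash cache size. It assumes the 10% is taken of the anticipated consumed capacity of the VMs and that cache is spread evenly across hosts; the function names and the example numbers (100 VMs, 50GB consumed each, 4 hosts) are made up purely for illustration.

```python
def min_flash_cache_gb(vm_count, consumed_gb_per_vm, cache_percent=10):
    """Minimum cluster-wide flash cache, as a percentage of the
    anticipated consumed capacity (assumption: average GB per VM)."""
    anticipated_consumed_gb = vm_count * consumed_gb_per_vm
    return anticipated_consumed_gb * cache_percent / 100


def min_flash_cache_per_host_gb(total_cache_gb, host_count):
    """Spread the cluster-wide cache evenly across hosts (assumption)."""
    return total_cache_gb / host_count


# Hypothetical cluster: 100 VMs, each expected to consume 50GB, on 4 hosts.
total = min_flash_cache_gb(vm_count=100, consumed_gb_per_vm=50)
per_host = min_flash_cache_per_host_gb(total, host_count=4)
print(total)     # 500.0
print(per_host)  # 125.0
```

Because workloads vary over time, treat the result as a floor rather than a target and size for your busiest period.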

In vSAN 6.2 we introduced client cache, which uses DRAM local to the VM to improve read performance. Since a local cache is now being used by the VM, performance is improved because there is no need to reach across the network to fetch data that is already in that cache.

Let’s talk about read cache

Read cache is only utilized in hybrid configurations. It keeps recently read disk blocks in cache. This improves performance when we have a read hit (the data is found in cache the first time vSAN looks for it), and latency is reduced because of this. So let's say you have a VM and there is a block of data that you need to access: vSAN will read it from a replica. If you are utilizing Failures to Tolerate then there are multiple replicas, so vSAN divides the caching evenly between the replicas. If you have a 4 node vSAN cluster and you are searching for a block of data on ESXiA and it is not there, the directory service is used to find where the block of data is stored on another replica. If it is found in cache then the data is taken from there; however, if it is not in cache then a read needs to be performed on the magnetic disk. In this case we get a read miss.
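The difference between a read hit and a read miss can be shown with a small toy model. This is only a sketch: it assumes a simple least-recently-used eviction policy, which is my assumption for the example and not a description of how vSAN actually manages its read cache. The ReadCache class and the fetch function are hypothetical.

```python
from collections import OrderedDict


class ReadCache:
    """Toy read cache with LRU eviction (an assumption, not vSAN's algorithm)."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # block_id -> data, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, block_id, read_from_capacity_tier):
        if block_id in self.blocks:
            # Read hit: served from cache, no trip to the magnetic disk.
            self.hits += 1
            self.blocks.move_to_end(block_id)
            return self.blocks[block_id]
        # Read miss: fetch from the capacity tier, then keep it in cache.
        self.misses += 1
        data = read_from_capacity_tier(block_id)
        self.blocks[block_id] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
        return data


def fetch(block_id):
    """Stand-in for a slow read from a magnetic disk."""
    return f"data-{block_id}"


cache = ReadCache(capacity_blocks=2)
for block in [1, 2, 1, 3, 2]:
    cache.read(block, fetch)
print(cache.hits, cache.misses)  # 1 4
```

With room for only two blocks, most reads in this sequence miss. Give the cache room for three blocks and the last read becomes a hit, which is the effect you are after when sizing the cache tier for your working set.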

And now write cache

Write cache is used in both hybrid and all-flash configurations. Its purpose is to improve performance and, in all-flash configurations, to extend the lifetime of the SSDs that sit in the capacity tier. When a write is sent to a vSAN object that is using the default availability policy (FTT=1), that write will go to 2 write caches. Therefore, in the event of a host failure the I/O would still be sitting in cache on another host and we do not encounter data loss.
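Here is a toy model of why writing to two caches protects the data. It assumes mirroring, where the number of copies is FTT + 1, and it simply picks the first hosts in a list as targets. The Host class, the function names and the host names are hypothetical; this is not how vSAN really chooses component placement.

```python
class Host:
    """Hypothetical host holding a write cache (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.write_cache = {}
        self.alive = True


def replica_count(ftt):
    """With mirroring, tolerating ftt failures needs ftt + 1 copies."""
    return ftt + 1


def write_block(hosts, block_id, data, ftt=1):
    """Place the write in the write cache of ftt + 1 hosts.
    Target selection here is simplistic: the first hosts in the list."""
    targets = hosts[:replica_count(ftt)]
    for host in targets:
        host.write_cache[block_id] = data
    return [host.name for host in targets]


def read_surviving_copy(hosts, block_id):
    """Return the data from any host that is alive and has the block."""
    for host in hosts:
        if host.alive and block_id in host.write_cache:
            return host.write_cache[block_id]
    return None


hosts = [Host("ESXiA"), Host("ESXiB"), Host("ESXiC"), Host("ESXiD")]
print(write_block(hosts, block_id=42, data="payload"))  # ['ESXiA', 'ESXiB']

hosts[0].alive = False                  # simulate a host failure
print(read_surviving_copy(hosts, 42))   # payload
```

After ESXiA fails, the write is still available from the cache on ESXiB, so nothing is lost.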

[Image: write-io]

Which type of SSD is right for me?

Usually when speaking about disks we seem to refer to either SSDs or MDs (magnetic disks). But what about PCIe? One of the obvious considerations when choosing your disk type is cost. Unfortunately this is an evil reality most folk will encounter. The next priorities can be capacity and performance. SATA SSDs are still tied to SATA's 6Gb/s standard. In comparison, PCIe, or Peripheral Component Interconnect Express, is a physical interconnect for motherboard expansion. It can provide up to 16 lanes for data transfer, at ~1GB/s per lane in each direction for PCIe 3.x devices. So in English this means a total bandwidth of ~32GB/s (both directions combined) for PCIe devices that can use all 16 lanes. You can read more about PCIe here.

http://www.bit-tech.net/hardware/2010/11/27/pci-express-3-0-explained/1
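If you want to check the SATA versus PCIe numbers yourself, the arithmetic is short. This sketch uses the published line rates and encodings (6Gb/s with 8b/10b for SATA III, 8GT/s per lane with 128b/130b for PCIe 3.0) and ignores protocol overhead, so real devices will deliver somewhat less. The function name is made up for the example.

```python
def usable_bytes_per_sec(line_rate_bits, payload_bits, encoded_bits, lanes=1):
    """Usable bandwidth in bytes/s per direction after line encoding."""
    return line_rate_bits * payload_bits / encoded_bits / 8 * lanes


# SATA III: 6 Gb/s line rate, 8b/10b encoding.
sata3 = usable_bytes_per_sec(6e9, 8, 10)

# PCIe 3.0: 8 GT/s per lane, 128b/130b encoding.
pcie3_lane = usable_bytes_per_sec(8e9, 128, 130)
pcie3_x16 = usable_bytes_per_sec(8e9, 128, 130, lanes=16)

print(f"SATA III:              {sata3 / 1e6:.0f} MB/s")           # 600 MB/s
print(f"PCIe 3.0 x1:           {pcie3_lane / 1e6:.0f} MB/s")      # 985 MB/s
print(f"PCIe 3.0 x16:          {pcie3_x16 / 1e9:.2f} GB/s")       # 15.75 GB/s
print(f"PCIe 3.0 x16 both way: {2 * pcie3_x16 / 1e9:.2f} GB/s")   # 31.51 GB/s
```

So a single PCIe 3.0 lane already outruns a SATA III link, and a x16 device has roughly 26 times the bandwidth of SATA in each direction.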

I hope all of this information is useful and I will follow up with more of these types of performance blogs in the near future.

Thanks

Francis