NetApp included some very powerful troubleshooting commands in the 8.3 update that I'd like to bring to your attention: qos statistics and its subcommands. Prior to 8.3, we used the dashboard command to view statistics at the cluster node level. The problem with dashboard is that it reports cluster-level statistics, which makes it difficult to isolate problems caused by a single object. The advantage of the qos statistics command is that we can now target specific objects in a very granular fashion.
Let me give you a real-world example of an excellent use case: we were observing high latency at the cluster level. Since we were dealing with a virtualized workload, we wanted to identify the volume hosting the datastore of the particular guest system causing the latency.
In this example, the output of qos statistics volume performance show makes the volume hosting the datastore of the problem guest easily discernible from the rest of the active volumes.
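If you capture that output to a text file, a few lines of Python can rank the volumes for you. This is only a rough sketch: it assumes the output uses five whitespace-separated columns (Workload, ID, IOPS, Throughput, Latency) with latency suffixed by us, ms or s, which may differ on your ONTAP release. The sample text, the volume names and the function names are all made up for illustration.

```python
import re

# Hypothetical sample, shaped like the assumed column layout
# (Workload, ID, IOPS, Throughput, Latency). All values are invented.
SAMPLE = """\
Workload            ID     IOPS       Throughput    Latency
--------------- ------ -------- ---------------- ----------
-total-              -     5230       81.20MB/s    30.31ms
vol_ds01-wid102    102     4100       64.00MB/s    38.50ms
vol_ds02-wid103    103      820       12.90MB/s   640.00us
vol_home-wid104    104      310        4.30MB/s   410.00us
"""

# Multipliers that convert each assumed latency unit to milliseconds.
_UNIT_TO_MS = {"us": 0.001, "ms": 1.0, "s": 1000.0}


def latency_ms(text):
    """Convert a latency cell such as '640.00us' or '38.50ms' to milliseconds."""
    match = re.fullmatch(r"([\d.]+)(us|ms|s)", text)
    if match is None:
        raise ValueError(f"unrecognized latency value: {text!r}")
    return float(match.group(1)) * _UNIT_TO_MS[match.group(2)]


def rank_by_latency(output):
    """Return (workload, iops, latency_ms) tuples, highest latency first."""
    rows = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the header, the dashed separator and the -total- row.
        if len(fields) != 5:
            continue
        if fields[0] in ("Workload", "-total-") or fields[0].startswith("---"):
            continue
        workload, _wid, iops, _throughput, latency = fields
        rows.append((workload, int(iops), latency_ms(latency)))
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for workload, iops, latency in rank_by_latency(SAMPLE):
        print(f"{workload:<20} {iops:>6} IOPS {latency:>8.2f} ms")
```

The first row printed is the workload with the highest latency, which is the volume you would look at first.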
I know what you are thinking: why not grab this data out of vCenter? I agree; however, in this instance the customer had turned off statistics collection. It turned out the latency in question was caused by corruption from ESX. A different discussion for another day… As engineers, we always need a backup plan because every environment is different, and the challenge is that the systems you usually rely on are sometimes down.
For more information on QOS statistics please visit: https://library.netapp.com/ecm/ecm_download_file/ECMP1610202