What could make SQL FCI query latency 3-4 times higher for the same database?

We have a SQL Server Basic Availability Group (BAG) hosting a SQL database on a local Intel SSD drive. I was asked to move the database to a SQL Server Failover Cluster Instance (FCI) to improve SQL Server performance by running the database on top of a highly available virtual disk backed by software-defined storage. In my experience, hyperconverged VSAN roughly doubles read throughput, so I expected SQL I/O latency for reads from the database to be cut in half.

Two scenarios were benchmarked: SQL BAG and SQL FCI. In both cases, SQL Server's maximum server memory was set to 512 GB, with the intent of excluding caching effects so that reads from the database table could be compared fairly.

SQL Server Management Studio and SQLQueryStress were used for testing. The SQL statement is SELECT TOP (500000) ... FROM [SQL].[dbo].[table], which reads the first 500K rows.

The SQL BAG query results are as follows:

SQL Server Management Studio: Query time = 15sec

SQLQueryStress:

Number of Threads = 1: Query time = 2sec
Number of Threads = 2: Query time = 2sec
Number of Threads = 4: Query time = 2sec
Number of Threads = 8: Query time = 2sec
Number of Threads = 10: Query time = 3sec
Number of Threads = 12: Query time = 4sec

The SQL FCI scenario was built on a Windows failover cluster of two identical hardware nodes running Windows Server 2016. The storage was configured using software-defined storage (hyperconverged VSAN) on top of Intel SSD drives, and the resulting virtual disk was presented to the failover cluster as a Cluster Disk. I used diskspd to test the Cluster Disk.

The diskspd results are as follows:

4k random read – 76K IOPS (SSD), 153K IOPS (hyperconverged VSAN - Cluster Disk)
8k random read – 45K IOPS (SSD), 89K IOPS (hyperconverged VSAN - Cluster Disk)
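
As a quick sanity check on these figures, the ratio between the Cluster Disk and the local SSD can be computed directly from the numbers above. This is only an illustrative sketch in Python; the variable names are my own and nothing here comes from diskspd itself. Both ratios come out at roughly 2x.

```python
# IOPS figures copied from the diskspd results above (in thousands of IOPS).
results = {
    "4k random read": {"ssd": 76, "vsan": 153},
    "8k random read": {"ssd": 45, "vsan": 89},
}

# Ratio of Cluster Disk (VSAN) IOPS to local SSD IOPS for each block size.
speedup = {name: r["vsan"] / r["ssd"] for name, r in results.items()}

for name, ratio in speedup.items():
    print(f"{name}: {ratio:.2f}x")
```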

As I expected, hyperconverged VSAN doubled the storage performance. Next, SQL FCI was configured to store the database files on that Cluster Disk. Another copy of the database was uploaded to the server and the same tests were performed.

The SQL FCI query results are as follows:

SQL Server Management Studio: Query time = 15sec

SQLQueryStress:

Number of Threads = 1: Query time = 9sec
Number of Threads = 2: Query time = 8sec
Number of Threads = 4: Query time = 9sec
Number of Threads = 8: Query time = 8sec
Number of Threads = 10: Query time = 10sec
Number of Threads = 12: Query time = 12sec
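
To put a number on the difference, the per-thread slowdown can be worked out from the two SQLQueryStress result sets above. This is a small sketch in Python with names I made up for illustration; it only does arithmetic on the measured times. The ratio ranges from 3x to 4.5x depending on the thread count.

```python
# Query times in seconds per SQLQueryStress thread count, copied from the results above.
bag_seconds = {1: 2, 2: 2, 4: 2, 8: 2, 10: 3, 12: 4}
fci_seconds = {1: 9, 2: 8, 4: 9, 8: 8, 10: 10, 12: 12}

# How many times slower FCI is than BAG at each thread count.
slowdown = {t: fci_seconds[t] / bag_seconds[t] for t in bag_seconds}

for threads, ratio in slowdown.items():
    print(f"{threads:>2} threads: {ratio:.1f}x slower on FCI")
```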

The questions are:

  1. Why is the latency the same for SQL BAG and SQL FCI when both are benchmarked via Management Studio?

  2. What could make SQL FCI query latency 3-4 times higher for the same database?


Solution 1:

FCI is history; stick with AlwaysOn Availability Groups (AGs) for your newer deployments. They are much faster (because SQL Server knows what to replicate), and at least Basic AGs are included in the Standard edition of SQL Server.

https://blogs.technet.microsoft.com/msftpietervanhove/2017/03/14/top-5-questions-about-basic-availability-groups/