Why does active-active configuration degrade performance compared to failover?
We are setting up new storage for an HPC compute cluster that we manage for applied statistics, bioinformatics, and genomics.
Configuration
We have the main enclosure with a Dell EMC ME4084 (84x12TB 7200rpm) and an additional enclosure with a Dell EMC ME484 (28x12TB). The EMC ME4084 provides ADAPT distributed RAID (similar to RAID6) and dual hardware controllers.
The file server is running CentOS 7. The storage is connected to the file server using two SAS cables. Each LUN corresponds to a 14-disk group with ADAPT, and both SAS connections appear as the devices sdb and sdj. The examples below are given for LUN ID 0.
We configured multipath as follows for the active-active configuration:
$ cat /etc/multipath.conf
defaults {
path_grouping_policy multibus
path_selector "service-time 0"
}
$ multipath -ll
mpatha (3600c0ff000519d6edd54e25e01000000) dm-6 DellEMC ,ME4
size=103T features='0' hwhandler='0' wp=rw
`-+- policy='service-time 0' prio=1 status=active
|- 1:0:0:0 sdb 8:16 active ready running
`- 1:0:1:0 sdj 8:144 active ready running
The failover configuration:
$ cat /etc/multipath.conf
defaults {
path_grouping_policy failover
path_selector "service-time 0"
}
$ multipath -ll
mpatha (3600c0ff000519d6edd54e25e01000000) dm-6 DellEMC ,ME4
size=103T features='0' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=1 status=active
| `- 1:0:0:0 sdb 8:16 active ready running
`-+- policy='service-time 0' prio=1 status=enabled
`- 1:0:1:0 sdj 8:144 active ready running
We verified that writing to mpatha results in writes to both sdb and sdj in the active-active configuration, and only to sdb in the failover configuration. We then striped mpatha and a second multipath device, mpathb, into a logical volume and formatted it using XFS.
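For completeness, the path check and volume layout were along these lines; the VG/LV names and the 64KiB stripe size are illustrative placeholders rather than our exact commands:

# Write to the multipath device (done before the LUN held any data) while
# watching the per-path devices: in active-active both sdb and sdj show
# traffic, in failover only sdb does.
$ dd if=/dev/zero of=/dev/mapper/mpatha bs=1M count=4096 oflag=direct &
$ iostat -x 1 sdb sdj

# Stripe the two multipath devices into one logical volume and format it with XFS.
$ pvcreate /dev/mapper/mpatha /dev/mapper/mpathb
$ vgcreate vg_hpc /dev/mapper/mpatha /dev/mapper/mpathb
$ lvcreate --stripes 2 --stripesize 64k --extents 100%FREE --name lv_scratch vg_hpc
$ mkfs.xfs /dev/vg_hpc/lv_scratch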
Test Setup
We benchmarked I/O performance using fio under the following workloads (an example invocation is sketched after the list):
- Single 1MiB random read/write process
- Single 4KiB random read/write process
- 16 parallel 32KiB sequential read/write processes
- 16 parallel 64KiB random read/write processes
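For illustration, the 16-process 64KiB random read/write case can be reproduced with an invocation along these lines; the target directory, file size, and runtime are placeholders, not our exact job parameters:

$ fio --name=16-64kb-randrw --directory=/mnt/scratch \
      --rw=randrw --rwmixread=50 --bs=64k --numjobs=16 \
      --size=4g --ioengine=libaio --direct=1 --iodepth=1 \
      --runtime=120 --time_based --group_reporting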
Test Results
                 Failover                Active-Active
                 ----------------------  ----------------------
Workload         Read        Write       Read        Write
---------------  ----------  ----------  ----------  ----------
1-1mb-randrw     52.3MB/s    52.3MB/s    51.2MB/s    50.0MB/s
1-4kb-randrw     335kB/s     333kB/s     331kB/s     330kB/s
16-32kb-seqrw    3181MB/s    3181MB/s    2613MB/s    2612MB/s
16-64kb-randrw   98.7MB/s    98.7MB/s    95.1MB/s    95.2MB/s
I am reporting only one set of tests, but the results are consistent across replicates (n=3) and across choices of path_selector.
Is there any reason active-active cannot at the very least match the performance of failover? I don't know whether the issue lies with the workloads or with the multipath configuration. The difference was even more staggering (20%) when we used a linear logical volume instead of a striped one. I'm really curious to see whether I overlooked something obvious.
Many thanks,
Nicolas
As you are using HDDs, a single controller is already plenty fast for your backend disks. Adding another controller in active/active mode gives no additional IOPS (the HDDs are the limit) but adds overhead at the multipath level, hence the reduced performance.
In other words: you will saturate the HDDs well before the CPU of the first controller, so leave them in active/passive mode. Moreover, I would try a single 28-disk group and benchmark it to see whether it gives more or less performance than the current 2x 14-disk setup.
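A minimal failover-oriented multipath.conf could look like the sketch below; the failback and no_path_retry values are illustrative, so cross-check them against Dell's recommended ME4 settings:

defaults {
    path_grouping_policy    failover
    path_selector           "service-time 0"
    failback                immediate
    no_path_retry           18
}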