Benchmarking Azure's Premium Storage P30 Disks
We're running performance tests on two new Standard DS13 (8 cores, 56 GB) VMs, both using the latest default Windows Server 2012 R2 image and backed by Premium Storage, and have hit a wall at step 1: testing the local SSD performance.
We understand 25% of the 400GB local SSD for these VMs is made available as temporary storage and the other 75% is used for Premium Storage caching: http://azure.microsoft.com/blog/2014/12/11/new-premium-storage-backed-virtual-machines/
On the remaining 25%, we expect to see performance along these lines: http://www.brentozar.com/archive/2014/09/azure-really-60-faster/ http://azure.microsoft.com/blog/2014/10/06/d-series-performance-expectations/
... but CrystalDiskMark shows it crawling along:
Sequential Read : 4.097 MB/s
Sequential Write : 4.096 MB/s
Random Read 512KB : 4.112 MB/s
Random Write 512KB : 4.112 MB/s
Random Read 4KB (QD=1) : 2.057 MB/s [ 502.3 IOPS]
Random Write 4KB (QD=1) : 2.057 MB/s [ 502.2 IOPS]
Random Read 4KB (QD=32) : 2.048 MB/s [ 500.0 IOPS]
Random Write 4KB (QD=32) : 2.047 MB/s [ 499.7 IOPS]
Test : 50 MB [D: 7.2% (8.1/112.0 GB)] (x5)
Date : 2015/02/14 15:35:41
OS : Windows Server 2012 R2 Datacenter (Full installation) [6.3 Build 9600] (x64)
The performance of the OS disk is better but nowhere close to the 150 MB/s you'd expect for a P20 disk (assuming that's what's allocated for the default 127GB OS disk).
Expecting:
http://azure.microsoft.com/en-us/documentation/articles/storage-premium-storage-preview-portal/
Seeing:
Sequential Read : 66.031 MB/s
Sequential Write : 63.034 MB/s
Random Read 512KB : 65.861 MB/s
Random Write 512KB : 63.580 MB/s
Random Read 4KB (QD=1) : 2.097 MB/s [ 511.9 IOPS]
Random Write 4KB (QD=1) : 2.047 MB/s [ 499.7 IOPS]
Random Read 4KB (QD=32) : 2.086 MB/s [ 509.3 IOPS]
Random Write 4KB (QD=32) : 2.078 MB/s [ 507.4 IOPS]
Test : 50 MB [C: 12.9% (16.4/127.0 GB)] (x5)
Date : 2015/02/14 15:46:35
OS : Windows Server 2012 R2 Datacenter (Full installation) [6.3 Build 9600] (x64)
And the performance of the P30 disk (with ReadOnly cache) isn't much better:
Sequential Read : 204.567 MB/s
Sequential Write : 39.677 MB/s
Random Read 512KB : 204.549 MB/s
Random Write 512KB : 34.865 MB/s
Random Read 4KB (QD=1) : 20.951 MB/s [ 5114.9 IOPS]
Random Write 4KB (QD=1) : 1.666 MB/s [ 406.7 IOPS]
Random Read 4KB (QD=32) : 20.893 MB/s [ 5100.9 IOPS]
Random Write 4KB (QD=32) : 20.944 MB/s [ 5113.4 IOPS]
Test : 50 MB [E: 0.0% (0.2/1023.0 GB)] (x5)
Date : 2015/02/14 15:22:59
OS : Windows Server 2012 R2 Datacenter (Full installation) [6.3 Build 9600] (x64)
Compare this to our current CloudDrive with host caching deployed on D13s (note the performance of 4KB random reads):
Sequential Read : 136.711 MB/s
Sequential Write : 10.210 MB/s
Random Read 512KB : 190.744 MB/s
Random Write 512KB : 9.063 MB/s
Random Read 4KB (QD=1) : 10.813 MB/s [ 2639.8 IOPS]
Random Write 4KB (QD=1) : 0.508 MB/s [ 107.5 IOPS]
Random Read 4KB (QD=32) : 106.533 MB/s [ 26009.1 IOPS]
Random Write 4KB (QD=32) : 9.363 MB/s [ 2286.0 IOPS]
Test : 50 MB [F: 4.1% (24.9/600.0 GB)] (x5)
Date : 2015/02/14 20:25:01
OS : Windows Server 2012 Datacenter (Full installation) [6.2 Build 9200] (x64)
And this is what SQLIO reports for the local SSD:
C:\Program Files (x86)\SQLIO>sqlio -dD
sqlio v1.5.SG
1 thread reading for 30 secs from file D:testfile.dat
using 2KB IOs over 128KB stripes with 64 IOs per run
size of file D:testfile.dat needs to be: 8388608 bytes
current file size: 0 bytes
need to expand by: 8388608 bytes
expanding D:testfile.dat ... done.
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 499.38
MBs/sec: 0.97
And for the P30:
C:\Program Files (x86)\SQLIO>sqlio -dE
sqlio v1.5.SG
1 thread reading for 30 secs from file E:testfile.dat
using 2KB IOs over 128KB stripes with 64 IOs per run
size of file E:testfile.dat needs to be: 8388608 bytes
current file size: 0 bytes
need to expand by: 8388608 bytes
expanding E:testfile.dat ... done.
initialization done
CUMULATIVE DATA:
throughput metrics:
IOs/sec: 5103.03
MBs/sec: 9.96
The 5000 IOPS advertised for the P30 is holding up, but what about the 200 MB/s throughput per disk?
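As a sanity check on the SQLIO numbers above, throughput is simply IOPS multiplied by IO size. The short Python sketch below recomputes the reported MB/s figures from the reported IOPS and the 2KB IO size. The helper name is ours (not part of SQLIO), and it assumes SQLIO counts 1 MB as 1024 KB. If that holds, the 2KB runs are bound by the IOPS limit and cannot get anywhere near 200 MB/s, whatever the disk is capable of.

```python
def throughput_mb_per_s(iops: float, io_size_kb: float) -> float:
    """Throughput implied by an IOPS figure at a fixed IO size.

    Assumes 1 MB = 1024 KB, which is our guess at SQLIO's convention.
    """
    return iops * io_size_kb / 1024.0


# IOPS values copied from the SQLIO output above; both runs used 2KB IOs.
local_ssd_mb_s = throughput_mb_per_s(499.38, 2)   # SQLIO reported 0.97 MB/s
p30_mb_s = throughput_mb_per_s(5103.03, 2)        # SQLIO reported 9.96 MB/s

print(f"local SSD: {local_ssd_mb_s:.3f} MB/s")
print(f"P30:       {p30_mb_s:.3f} MB/s")
```

Both computed values land within 0.01 MB/s of what SQLIO printed, so the low MB/s here looks like a consequence of the small IO size rather than a separate throughput cap.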
NOTE: Attempts to create the P30 data disk with ReadWrite cache policy result in:
Update-AzureVm : BadRequest: The disk cache setting ReadWrite is not supported for DataVirtualHardDisk.
Any guidance would be appreciated:
- Why is the local SSD storage throttled at 500 IOPS and 1-4 MB/s throughput?
- How do we achieve 200 MB/s on writes, as we see with reads on P30s? What test should we run?
- MS: can you publish I/O benchmarks that we can run to validate max limits?
Solution 1:
To answer your questions:
- Local storage is throttled to 500 IOPS @8KB. Those limits were a mistake and will be raised substantially soon.
- To hit 200 MB/sec on writes you need to (a) use a block size of at least 40KB (otherwise you run into the 5,000 IOPS limit first), and (b) use a queue depth of at least 25 (for a 40KB block; as the block size goes up, you can use a smaller queue depth).
- We agree, it would be nice if we published benchmarks that you can use to validate the limits. If we do, it probably won't be until we move out of preview.
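The arithmetic behind the first two points can be sketched as follows. This is a back-of-the-envelope calculation in Python, not an official sizing tool. The helper names are ours, 1 MB is taken as 1000 KB, and the 5 ms per-IO latency is an assumption inferred from the quoted queue depth of 25 via Little's Law (outstanding IOs = IOPS x latency), not a published figure.

```python
def min_block_size_kb(target_mb_per_s: float, iops_limit: float) -> float:
    """Smallest block size (KB) that reaches the target throughput
    without exceeding the IOPS limit. Assumes 1 MB = 1000 KB."""
    return target_mb_per_s * 1000.0 / iops_limit


def required_queue_depth(target_mb_per_s: float,
                         block_size_kb: float,
                         latency_s: float) -> float:
    """Little's Law: outstanding IOs = IOPS needed * per-IO latency."""
    iops_needed = target_mb_per_s * 1000.0 / block_size_kb
    return iops_needed * latency_s


P30_IOPS_LIMIT = 5000       # from the post
P30_TARGET_MB_S = 200       # from the post
ASSUMED_LATENCY_S = 0.005   # ASSUMPTION: back-solved from "queue depth 25 at 40KB"

block_kb = min_block_size_kb(P30_TARGET_MB_S, P30_IOPS_LIMIT)
qd_at_40kb = required_queue_depth(P30_TARGET_MB_S, 40, ASSUMED_LATENCY_S)
qd_at_256kb = required_queue_depth(P30_TARGET_MB_S, 256, ASSUMED_LATENCY_S)

# Local SSD: 500 IOPS at 8KB (8192 bytes), expressed in decimal MB/s.
local_ssd_cap_mb_s = 500 * 8 * 1024 / 1_000_000

print(f"minimum block size for 200 MB/s: {block_kb:.0f} KB")
print(f"queue depth at 40KB:  {qd_at_40kb:.1f}")
print(f"queue depth at 256KB: {qd_at_256kb:.1f}")
print(f"local SSD cap:        {local_ssd_cap_mb_s:.3f} MB/s")
```

Under these assumptions, larger blocks need far fewer outstanding IOs, which is why a smaller queue depth works as the block size goes up. The computed local SSD cap also sits very close to the 4.097 MB/s sequential read measured in the question, which is consistent with the 500 IOPS @8KB throttle.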
David Berg - Microsoft Azure Performance Team