NVMe over RoCE Benchmark

I ran a benchmark of NVMe over RoCE, so here I compare it with the iSCSI and NVMe over TCP benchmarks I measured earlier.

The RoCE run uses the same hardware configuration as the earlier iSCSI and NVMe over TCP runs, but the OS differs slightly because the Mellanox OFED driver does not yet support Linux 5.x kernels. (Fedora 29, kernel 4.18.16-300.fc29.x86_64)

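The results are summarized in a table. NVMe over RoCE should outperform NVMe over TCP, yet in this measurement TCP came out ahead of RoCE by about 9.2% on reads and 9.6% on writes, going by the table values. This needs further investigation.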

Protocol          Read          Write
NVMe over RoCE    95.2K IOPS    40.8K IOPS
NVMe over TCP     104K IOPS     44.7K IOPS
iSCSI             25.3K IOPS    10.7K IOPS
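
The relative differences between the protocols can be recomputed from the rounded table values. The following is a minimal sketch, not part of the original measurement: the dictionary layout and the helper name `percent_faster` are assumptions made for illustration, and the IOPS figures are copied from the table with K taken as 1,000.

```python
# IOPS figures copied from the table (K = 1,000); rounded as published.
results = {
    "NVMe over RoCE": {"read": 95_200, "write": 40_800},
    "NVMe over TCP": {"read": 104_000, "write": 44_700},
    "iSCSI": {"read": 25_300, "write": 10_700},
}


def percent_faster(a, b):
    """Return how much larger a is than b, as a percentage of b."""
    return (a - b) / b * 100.0


# Use the RoCE run as the baseline and report every protocol against it.
baseline = results["NVMe over RoCE"]
for protocol, iops in results.items():
    read_gain = percent_faster(iops["read"], baseline["read"])
    write_gain = percent_faster(iops["write"], baseline["write"])
    print(f"{protocol}: read {read_gain:+.1f}%, write {write_gain:+.1f}%")
```

Because the table values are rounded to three significant digits, the percentages derived this way are only accurate to a few tenths of a percent.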

The detailed fio output for the RoCE run, a 4 KB random I/O workload with a 70% read / 30% write mix, is shown below.

[root@rdma21 ~]# fio --name=rdma --filename=/dev/nvme0n1 --rw=randrw --rwmixread=70 --direct=1 --invalidate=1 --ioengine=libaio --bs=4k --numjobs=256 --time_based --runtime=10 --group_reporting --iodepth=1
rdma: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
...
fio-3.7
Starting 256 processes
Jobs: 256 (f=256): [m(256)][100.0%][r=354MiB/s,w=152MiB/s][r=90.6k,w=38.9k IOPS][eta 00m:00s]
rdma: (groupid=0, jobs=256): err= 0: pid=3491: Fri Dec  6 15:35:17 2019
   read: IOPS=95.2k, BW=372MiB/s (390MB/s)(3742MiB/10060msec)
    slat (nsec): min=1919, max=88351k, avg=15074.63, stdev=491888.72
    clat (nsec): min=274, max=419529k, avg=1846839.26, stdev=10413184.20
     lat (usec): min=21, max=419539, avg=1862.19, stdev=10456.89
    clat percentiles (usec):
     |  1.00th=[    35],  5.00th=[    50], 10.00th=[    69], 20.00th=[   123],
     | 30.00th=[   190], 40.00th=[   273], 50.00th=[   379], 60.00th=[   523],
     | 70.00th=[   742], 80.00th=[  1139], 90.00th=[  2409], 95.00th=[  6063],
     | 99.00th=[ 21890], 99.50th=[ 49546], 99.90th=[175113], 99.95th=[219153],
     | 99.99th=[308282]
   bw (  KiB/s): min=    7, max=10168, per=0.39%, avg=1473.90, stdev=1771.22, samples=4908
   iops        : min=    1, max= 2542, avg=368.44, stdev=442.81, samples=4908
  write: IOPS=40.8k, BW=160MiB/s (167MB/s)(1605MiB/10060msec)
    slat (nsec): min=1708, max=86770k, avg=13879.61, stdev=450914.50
    clat (nsec): min=289, max=404766k, avg=1843763.52, stdev=10434117.84
     lat (usec): min=22, max=404777, avg=1857.93, stdev=10454.86
    clat percentiles (usec):
     |  1.00th=[    35],  5.00th=[    51], 10.00th=[    71], 20.00th=[   126],
     | 30.00th=[   194], 40.00th=[   277], 50.00th=[   383], 60.00th=[   529],
     | 70.00th=[   742], 80.00th=[  1139], 90.00th=[  2376], 95.00th=[  5932],
     | 99.00th=[ 21627], 99.50th=[ 51119], 99.90th=[179307], 99.95th=[219153],
     | 99.99th=[308282]
   bw (  KiB/s): min=    7, max= 4496, per=0.39%, avg=637.40, stdev=763.57, samples=4868
   iops        : min=    1, max= 1124, avg=159.31, stdev=190.90, samples=4868
  lat (nsec)   : 500=0.01%, 750=0.01%, 1000=0.01%
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=5.09%
  lat (usec)   : 100=10.94%, 250=21.22%, 500=21.29%, 750=11.73%, 1000=7.14%
  lat (msec)   : 2=10.87%, 4=5.04%, 10=3.61%, 20=1.89%, 50=0.66%
  lat (msec)   : 100=0.24%, 250=0.23%, 500=0.03%
  cpu          : usr=0.33%, sys=0.72%, ctx=1381126, majf=0, minf=3812
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=957995,410883,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=372MiB/s (390MB/s), 372MiB/s-372MiB/s (390MB/s-390MB/s), io=3742MiB (3924MB), run=10060-10060msec
  WRITE: bw=160MiB/s (167MB/s), 160MiB/s-160MiB/s (167MB/s-167MB/s), io=1605MiB (1683MB), run=10060-10060msec

Disk stats (read/write):
  nvme0n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%
#
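
As a sanity check, the headline numbers in the fio output above can be cross-checked against each other. The sketch below is an addition for illustration, not something the original measurement ran: it recomputes IOPS from the issued I/O counts and the runtime, derives bandwidth from IOPS times block size, and uses Little's law (outstanding I/Os = throughput x latency) to estimate the mean latency implied by 256 jobs at queue depth 1. All variable names are assumptions; the input values are copied from the output.

```python
# Values copied from the fio command line and its output.
BLOCK_SIZE = 4096          # bytes, from --bs=4k
NUM_JOBS = 256             # from --numjobs=256
IODEPTH = 1                # from --iodepth=1
RUNTIME_SEC = 10.060       # run=10060msec
READS_ISSUED = 957_995     # issued rwts: total=957995,...
WRITES_ISSUED = 410_883    # issued rwts: total=...,410883,...

# IOPS is the number of completed I/Os divided by the runtime.
read_iops = READS_ISSUED / RUNTIME_SEC
write_iops = WRITES_ISSUED / RUNTIME_SEC

# Bandwidth in MiB/s is IOPS times the block size.
read_bw_mib = read_iops * BLOCK_SIZE / 2**20
write_bw_mib = write_iops * BLOCK_SIZE / 2**20

# Little's law: mean latency = outstanding I/Os / total throughput.
outstanding = NUM_JOBS * IODEPTH
est_latency_ms = outstanding / (read_iops + write_iops) * 1000.0

print(f"read:  {read_iops / 1000:.1f}k IOPS, {read_bw_mib:.0f} MiB/s")
print(f"write: {write_iops / 1000:.1f}k IOPS, {write_bw_mib:.0f} MiB/s")
print(f"estimated mean latency: {est_latency_ms:.2f} ms")
```

The recomputed figures line up with the reported 95.2k/40.8k IOPS and 372/160 MiB/s, and the latency estimate lands close to the reported average of roughly 1.86 ms. This suggests that at this queue depth the throughput is bounded by per-I/O latency, since only 256 I/Os are in flight at any time.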