在VMware ESXi上对三星PCIe 4.0 NVMe SSD 980 PRO硬盘进行直通DirectPath和虚拟化Virtualization的读写测试

在VMware ESXi上对三星PCIe 4.0 NVMe SSD 980 PRO硬盘进行直通DirectPath和虚拟化Virtualization的读写测试
存储虚拟化后,读性能损失约21.5%,写性能损失约18.4%

Read and Write Tests of Samsung PCIe 4.0 NVMe SSD 980 PRO Hard Drive with DirectPath and Virtualization on VMware ESXi

物理服务器 NF5280M5
硬盘 PCIEx4转NMVe
虚拟化软件 ESXi 7.03
虚拟机配置 8核心 16G内存 120G系统盘
操作系统 CentOS7.9
数据盘 三星 PCIe 4.0 NVMe SSD 980 PRO
测试软件 fio

1、软件安装

yum -y install epel-release
yum -y install fio

2、挂载数据目录

sudo vgcreate lvmgroup /dev/vdb
sudo lvcreate -l 100%FREE -n data lvmgroup
sudo mkfs.xfs /dev/lvmgroup/data
sudo mkdir /data
sudo echo "/dev/lvmgroup/data /data xfs defaults 0 0" >>/etc/fstab
sudo mount -av

3、编写 fio 测试配置文件:创建一个用于 fio 测试的配置文件。您可以使用文本编辑器创建一个新文件,例如 fio_test_config.fio

[global]
ioengine=libaio
direct=1
thread=1
rw=randrw
bs=4k
numjobs=4
size=32G
runtime=300
group_reporting

[randwrite]

[global]:这部分指定了全局配置,适用于所有作业。
ioengine=libaio:指定I/O引擎为libaio,表示使用异步I/O。
direct=1:启用直接I/O,绕过操作系统的文件系统缓存。
thread=1:每个作业使用一个线程。
rw=randrw:读写模式为随机读写。
bs=4k:块大小为4KB。
numjobs=4:作业数量为4,表示并发的作业数。
size=32G:每个作业的测试文件大小为32GB。
runtime=300:测试运行时间为300秒。
group_reporting:指定所有线程的统计信息应汇总在一起报告。

直通DirectPath

[root@10-53-220-58 ~]# cd /data/
[root@10-53-220-58 data]# ls
[root@10-53-220-58 data]# sudo dd if=/dev/zero of=/data/fio_test_file bs=1G count=32
32+0 records in
32+0 records out
34359738368 bytes (34 GB) copied, 36.9548 s, 930 MB/s
[root@10-53-220-58 data]# vi fio_test_config.fio
[root@10-53-220-58 data]# sudo fio fio_test_config.fio
randwrite: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
...
fio-3.7
Starting 4 threads
randwrite: Laying out IO file (1 file / 32768MiB)
randwrite: Laying out IO file (1 file / 32768MiB)
randwrite: Laying out IO file (1 file / 32768MiB)
randwrite: Laying out IO file (1 file / 32768MiB)
Jobs: 4 (f=4): [m(4)][100.0%][r=102MiB/s,w=103MiB/s][r=26.2k,w=26.3k IOPS][eta 00m:00s] 
randwrite: (groupid=0, jobs=4): err= 0: pid=9048: Fri Apr 12 13:55:32 2024
   read: IOPS=26.0k, BW=102MiB/s (107MB/s)(29.8GiB/300001msec)
    slat (nsec): min=3381, max=69136, avg=4802.53, stdev=1677.41
    clat (usec): min=21, max=19215, avg=113.61, stdev=218.66
     lat (usec): min=34, max=19220, avg=118.47, stdev=218.66
    clat percentiles (usec):
     |  1.00th=[   48],  5.00th=[   51], 10.00th=[   53], 20.00th=[   55],
     | 30.00th=[   57], 40.00th=[   59], 50.00th=[   65], 60.00th=[   74],
     | 70.00th=[   77], 80.00th=[   85], 90.00th=[   93], 95.00th=[  343],
     | 99.00th=[ 1270], 99.50th=[ 1369], 99.90th=[ 1516], 99.95th=[ 1565],
     | 99.99th=[ 1958]
   bw (  KiB/s): min=20680, max=36128, per=24.99%, avg=26004.68, stdev=2194.12, samples=2397
   iops        : min= 5170, max= 9032, avg=6501.10, stdev=548.54, samples=2397
  write: IOPS=25.0k, BW=102MiB/s (106MB/s)(29.7GiB/300001msec)
    slat (usec): min=4, max=135, avg= 6.16, stdev= 1.90
    clat (nsec): min=1292, max=1175.4k, avg=27396.71, stdev=3962.20
     lat (usec): min=19, max=1181, avg=33.62, stdev= 4.09
    clat percentiles (nsec):
     |  1.00th=[18304],  5.00th=[21888], 10.00th=[23680], 20.00th=[24960],
     | 30.00th=[26496], 40.00th=[26752], 50.00th=[27264], 60.00th=[27520],
     | 70.00th=[28288], 80.00th=[29312], 90.00th=[31616], 95.00th=[34048],
     | 99.00th=[39168], 99.50th=[41216], 99.90th=[43776], 99.95th=[45312],
     | 99.99th=[50944]
   bw (  KiB/s): min=20280, max=36008, per=24.99%, avg=25980.38, stdev=2247.11, samples=2397
   iops        : min= 5070, max= 9002, avg=6495.02, stdev=561.78, samples=2397
  lat (usec)   : 2=0.04%, 4=0.01%, 10=0.01%, 20=0.90%, 50=50.88%
  lat (usec)   : 100=44.43%, 250=1.06%, 500=0.46%, 750=0.46%, 1000=0.53%
  lat (msec)   : 2=1.22%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu          : usr=4.49%, sys=11.07%, ctx=15587579, majf=0, minf=10
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=7804560,7797520,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=102MiB/s (107MB/s), 102MiB/s-102MiB/s (107MB/s-107MB/s), io=29.8GiB (31.0GB), run=300001-300001msec
  WRITE: bw=102MiB/s (106MB/s), 102MiB/s-102MiB/s (106MB/s-106MB/s), io=29.7GiB (31.9GB), run=300001-300001msec

Disk stats (read/write):
    dm-2: ios=7801557/7794611, merge=0/0, ticks=862107/186550, in_queue=1048584, util=100.00%, aggrios=7804560/7797562, aggrmerge=0/1, aggrticks=857542/181607, aggrin_queue=1039149, aggrutil=100.00%
  nvme0n1: ios=7804560/7797562, merge=0/1, ticks=857542/181607, in_queue=1039149, util=100.00%
www.zhangfangzhou.cn

虚拟化Virtualization

[root@10-53-220-44 data]# sudo dd if=/dev/zero of=/data/fio_test_file bs=1G count=32
32+0 records in
32+0 records out
34359738368 bytes (34 GB) copied, 71.8294 s, 478 MB/s
[root@10-53-220-44 data]# vi fio_test_config.fio
[root@10-53-220-44 data]# sudo fio fio_test_config.fio
randwrite: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
...
fio-3.7
Starting 4 threads
randwrite: Laying out IO file (1 file / 32768MiB)
randwrite: Laying out IO file (1 file / 32768MiB)
randwrite: Laying out IO file (1 file / 32768MiB)
randwrite: Laying out IO file (1 file / 32768MiB)
Jobs: 4 (f=4): [m(4)][100.0%][r=78.2MiB/s,w=77.8MiB/s][r=20.0k,w=19.9k IOPS][eta 00m:00s]
randwrite: (groupid=0, jobs=4): err= 0: pid=9016: Fri Apr 12 14:43:21 2024
   read: IOPS=20.4k, BW=79.7MiB/s (83.6MB/s)(23.4GiB/300001msec)
    slat (usec): min=4, max=114, avg= 6.78, stdev= 1.98
    clat (usec): min=42, max=38534, avg=134.39, stdev=190.16
     lat (usec): min=53, max=38545, avg=141.24, stdev=190.14
    clat percentiles (usec):
     |  1.00th=[   79],  5.00th=[   87], 10.00th=[   89], 20.00th=[   92],
     | 30.00th=[   93], 40.00th=[   96], 50.00th=[   98], 60.00th=[  101],
     | 70.00th=[  104], 80.00th=[  106], 90.00th=[  113], 95.00th=[  161],
     | 99.00th=[ 1205], 99.50th=[ 1319], 99.90th=[ 1467], 99.95th=[ 1516],
     | 99.99th=[ 2057]
   bw (  KiB/s): min=   73, max=24856, per=16.35%, avg=13347.32, stdev=9723.44, samples=2397
   iops        : min=   18, max= 6214, avg=3336.67, stdev=2431.02, samples=2397
  write: IOPS=20.4k, BW=79.6MiB/s (83.5MB/s)(23.3GiB/300001msec)
    slat (usec): min=5, max=115, avg= 8.06, stdev= 1.98
    clat (nsec): min=1403, max=38544k, avg=44914.21, stdev=37185.93
     lat (usec): min=32, max=38560, avg=53.04, stdev=37.23
    clat percentiles (usec):
     |  1.00th=[   36],  5.00th=[   39], 10.00th=[   40], 20.00th=[   42],
     | 30.00th=[   43], 40.00th=[   44], 50.00th=[   45], 60.00th=[   46],
     | 70.00th=[   47], 80.00th=[   48], 90.00th=[   50], 95.00th=[   52],
     | 99.00th=[   57], 99.50th=[   59], 99.90th=[   69], 99.95th=[   87],
     | 99.99th=[  135]
   bw (  KiB/s): min=   71, max=25352, per=16.36%, avg=13341.18, stdev=9726.48, samples=2397
   iops        : min=   17, max= 6338, avg=3335.13, stdev=2431.78, samples=2397
  lat (usec)   : 2=0.01%, 10=0.01%, 20=0.01%, 50=45.37%, 100=33.22%
  lat (usec)   : 250=19.16%, 500=0.43%, 750=0.43%, 1000=0.46%
  lat (msec)   : 2=0.91%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  cpu          : usr=3.53%, sys=11.43%, ctx=12237590, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=6121601,6116013,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=79.7MiB/s (83.6MB/s), 79.7MiB/s-79.7MiB/s (83.6MB/s-83.6MB/s), io=23.4GiB (25.1GB), run=300001-300001msec
  WRITE: bw=79.6MiB/s (83.5MB/s), 79.6MiB/s-79.6MiB/s (83.5MB/s-83.5MB/s), io=23.3GiB (25.1GB), run=300001-300001msec

Disk stats (read/write):
    dm-2: ios=6120646/6115090, merge=0/0, ticks=806337/255779, in_queue=1062184, util=100.00%, aggrios=6121601/6116035, aggrmerge=0/1, aggrticks=801719/251319, aggrin_queue=1051124, aggrutil=100.00%
  sdb: ios=6121601/6116035, merge=0/1, ticks=801719/251319, in_queue=1051124, util=100.00%
www.zhangfangzhou.cn

4、结果
直通 (DirectPath)
read: IOPS=26.0k, BW=102MiB/s (107MB/s)(29.8GiB/300001msec)
write: IOPS=25.0k, BW=102MiB/s (106MB/s)(29.7GiB/300001msec)

虚拟化Virtualization
read: IOPS=20.4k, BW=79.7MiB/s (83.6MB/s)(23.4GiB/300001msec)
write: IOPS=20.4k, BW=79.6MiB/s (83.5MB/s)(23.3GiB/300001msec)

 

读取 (Read) 写入 (Write) 带宽 (Bandwidth)
直通 (DirectPath) 26.0k 25.0k 102MiB/s (107MB/s)
虚拟化 (Virtualization) 20.4k 20.4k 79.7MiB/s (83.6MB/s)

存储虚拟化后,读性能损失约21.5%,写性能损失约18.4%