在VMware ESXi上对三星PCIe 4.0 NVMe SSD 980 PRO硬盘进行直通DirectPath和虚拟化Virtualization的读写测试

在VMware ESXi上对三星PCIe 4.0 NVMe SSD 980 PRO硬盘进行直通DirectPath和虚拟化Virtualization的读写测试
存储虚拟化后,读性能损失约21.5%,写性能损失约18.4%

Read and Write Tests of Samsung PCIe 4.0 NVMe SSD 980 PRO Hard Drive with DirectPath and Virtualization on VMware ESXi

物理服务器 NF5280M5
硬盘 PCIEx4转NMVe
虚拟化软件 ESXi 7.03
虚拟机配置 8核心 16G内存 120G系统盘
操作系统 CentOS7.9
数据盘 三星 PCIe 4.0 NVMe SSD 980 PRO
测试软件 fio

1、软件安装

yum -y install epel-release
yum -y install fio

2、挂载数据目录

sudo vgcreate lvmgroup /dev/vdb
sudo lvcreate -l 100%FREE -n data lvmgroup
sudo mkfs.xfs /dev/lvmgroup/data
sudo mkdir /data
sudo echo "/dev/lvmgroup/data /data xfs defaults 0 0" >>/etc/fstab
sudo mount -av

3、编写 fio 测试配置文件:创建一个用于 fio 测试的配置文件。您可以使用文本编辑器创建一个新文件,例如 fio_test_config.fio

[global]
ioengine=libaio
direct=1
thread=1
rw=randrw
bs=4k
numjobs=4
size=32G
runtime=300
group_reporting

[randwrite]

[global]:这部分指定了全局配置,适用于所有作业。
ioengine=libaio:指定I/O引擎为libaio,表示使用异步I/O。
direct=1:启用直接I/O,绕过操作系统的文件系统缓存。
thread=1:每个作业使用一个线程。
rw=randrw:读写模式为随机读写。
bs=4k:块大小为4KB。
numjobs=4:作业数量为4,表示并发的作业数。
size=32G:每个作业的测试文件大小为32GB。
runtime=300:测试运行时间为300秒。
group_reporting:指定所有线程的统计信息应汇总在一起报告。

直通DirectPath

[root@10-53-220-58 ~]# cd /data/
[root@10-53-220-58 data]# ls
[root@10-53-220-58 data]# sudo dd if=/dev/zero of=/data/fio_test_file bs=1G count=32
32+0 records in
32+0 records out
34359738368 bytes (34 GB) copied, 36.9548 s, 930 MB/s
[root@10-53-220-58 data]# vi fio_test_config.fio
[root@10-53-220-58 data]# sudo fio fio_test_config.fio
randwrite: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
...
fio-3.7
Starting 4 threads
randwrite: Laying out IO file (1 file / 32768MiB)
randwrite: Laying out IO file (1 file / 32768MiB)
randwrite: Laying out IO file (1 file / 32768MiB)
randwrite: Laying out IO file (1 file / 32768MiB)
Jobs: 4 (f=4): [m(4)][100.0%][r=102MiB/s,w=103MiB/s][r=26.2k,w=26.3k IOPS][eta 00m:00s] 
randwrite: (groupid=0, jobs=4): err= 0: pid=9048: Fri Apr 12 13:55:32 2024
   read: IOPS=26.0k, BW=102MiB/s (107MB/s)(29.8GiB/300001msec)
    slat (nsec): min=3381, max=69136, avg=4802.53, stdev=1677.41
    clat (usec): min=21, max=19215, avg=113.61, stdev=218.66
     lat (usec): min=34, max=19220, avg=118.47, stdev=218.66
    clat percentiles (usec):
     |  1.00th=[   48],  5.00th=[   51], 10.00th=[   53], 20.00th=[   55],
     | 30.00th=[   57], 40.00th=[   59], 50.00th=[   65], 60.00th=[   74],
     | 70.00th=[   77], 80.00th=[   85], 90.00th=[   93], 95.00th=[  343],
     | 99.00th=[ 1270], 99.50th=[ 1369], 99.90th=[ 1516], 99.95th=[ 1565],
     | 99.99th=[ 1958]
   bw (  KiB/s): min=20680, max=36128, per=24.99%, avg=26004.68, stdev=2194.12, samples=2397
   iops        : min= 5170, max= 9032, avg=6501.10, stdev=548.54, samples=2397
  write: IOPS=25.0k, BW=102MiB/s (106MB/s)(29.7GiB/300001msec)
    slat (usec): min=4, max=135, avg= 6.16, stdev= 1.90
    clat (nsec): min=1292, max=1175.4k, avg=27396.71, stdev=3962.20
     lat (usec): min=19, max=1181, avg=33.62, stdev= 4.09
    clat percentiles (nsec):
     |  1.00th=[18304],  5.00th=[21888], 10.00th=[23680], 20.00th=[24960],
     | 30.00th=[26496], 40.00th=[26752], 50.00th=[27264], 60.00th=[27520],
     | 70.00th=[28288], 80.00th=[29312], 90.00th=[31616], 95.00th=[34048],
     | 99.00th=[39168], 99.50th=[41216], 99.90th=[43776], 99.95th=[45312],
     | 99.99th=[50944]
   bw (  KiB/s): min=20280, max=36008, per=24.99%, avg=25980.38, stdev=2247.11, samples=2397
   iops        : min= 5070, max= 9002, avg=6495.02, stdev=561.78, samples=2397
  lat (usec)   : 2=0.04%, 4=0.01%, 10=0.01%, 20=0.90%, 50=50.88%
  lat (usec)   : 100=44.43%, 250=1.06%, 500=0.46%, 750=0.46%, 1000=0.53%
  lat (msec)   : 2=1.22%, 4=0.01%, 10=0.01%, 20=0.01%
  cpu          : usr=4.49%, sys=11.07%, ctx=15587579, majf=0, minf=10
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=7804560,7797520,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=102MiB/s (107MB/s), 102MiB/s-102MiB/s (107MB/s-107MB/s), io=29.8GiB (31.0GB), run=300001-300001msec
  WRITE: bw=102MiB/s (106MB/s), 102MiB/s-102MiB/s (106MB/s-106MB/s), io=29.7GiB (31.9GB), run=300001-300001msec

Disk stats (read/write):
    dm-2: ios=7801557/7794611, merge=0/0, ticks=862107/186550, in_queue=1048584, util=100.00%, aggrios=7804560/7797562, aggrmerge=0/1, aggrticks=857542/181607, aggrin_queue=1039149, aggrutil=100.00%
  nvme0n1: ios=7804560/7797562, merge=0/1, ticks=857542/181607, in_queue=1039149, util=100.00%
www.zhangfangzhou.cn

虚拟化Virtualization

[root@10-53-220-44 data]# sudo dd if=/dev/zero of=/data/fio_test_file bs=1G count=32
32+0 records in
32+0 records out
34359738368 bytes (34 GB) copied, 71.8294 s, 478 MB/s
[root@10-53-220-44 data]# vi fio_test_config.fio
[root@10-53-220-44 data]# sudo fio fio_test_config.fio
randwrite: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
...
fio-3.7
Starting 4 threads
randwrite: Laying out IO file (1 file / 32768MiB)
randwrite: Laying out IO file (1 file / 32768MiB)
randwrite: Laying out IO file (1 file / 32768MiB)
randwrite: Laying out IO file (1 file / 32768MiB)
Jobs: 4 (f=4): [m(4)][100.0%][r=78.2MiB/s,w=77.8MiB/s][r=20.0k,w=19.9k IOPS][eta 00m:00s]
randwrite: (groupid=0, jobs=4): err= 0: pid=9016: Fri Apr 12 14:43:21 2024
   read: IOPS=20.4k, BW=79.7MiB/s (83.6MB/s)(23.4GiB/300001msec)
    slat (usec): min=4, max=114, avg= 6.78, stdev= 1.98
    clat (usec): min=42, max=38534, avg=134.39, stdev=190.16
     lat (usec): min=53, max=38545, avg=141.24, stdev=190.14
    clat percentiles (usec):
     |  1.00th=[   79],  5.00th=[   87], 10.00th=[   89], 20.00th=[   92],
     | 30.00th=[   93], 40.00th=[   96], 50.00th=[   98], 60.00th=[  101],
     | 70.00th=[  104], 80.00th=[  106], 90.00th=[  113], 95.00th=[  161],
     | 99.00th=[ 1205], 99.50th=[ 1319], 99.90th=[ 1467], 99.95th=[ 1516],
     | 99.99th=[ 2057]
   bw (  KiB/s): min=   73, max=24856, per=16.35%, avg=13347.32, stdev=9723.44, samples=2397
   iops        : min=   18, max= 6214, avg=3336.67, stdev=2431.02, samples=2397
  write: IOPS=20.4k, BW=79.6MiB/s (83.5MB/s)(23.3GiB/300001msec)
    slat (usec): min=5, max=115, avg= 8.06, stdev= 1.98
    clat (nsec): min=1403, max=38544k, avg=44914.21, stdev=37185.93
     lat (usec): min=32, max=38560, avg=53.04, stdev=37.23
    clat percentiles (usec):
     |  1.00th=[   36],  5.00th=[   39], 10.00th=[   40], 20.00th=[   42],
     | 30.00th=[   43], 40.00th=[   44], 50.00th=[   45], 60.00th=[   46],
     | 70.00th=[   47], 80.00th=[   48], 90.00th=[   50], 95.00th=[   52],
     | 99.00th=[   57], 99.50th=[   59], 99.90th=[   69], 99.95th=[   87],
     | 99.99th=[  135]
   bw (  KiB/s): min=   71, max=25352, per=16.36%, avg=13341.18, stdev=9726.48, samples=2397
   iops        : min=   17, max= 6338, avg=3335.13, stdev=2431.78, samples=2397
  lat (usec)   : 2=0.01%, 10=0.01%, 20=0.01%, 50=45.37%, 100=33.22%
  lat (usec)   : 250=19.16%, 500=0.43%, 750=0.43%, 1000=0.46%
  lat (msec)   : 2=0.91%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  cpu          : usr=3.53%, sys=11.43%, ctx=12237590, majf=0, minf=11
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=6121601,6116013,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=79.7MiB/s (83.6MB/s), 79.7MiB/s-79.7MiB/s (83.6MB/s-83.6MB/s), io=23.4GiB (25.1GB), run=300001-300001msec
  WRITE: bw=79.6MiB/s (83.5MB/s), 79.6MiB/s-79.6MiB/s (83.5MB/s-83.5MB/s), io=23.3GiB (25.1GB), run=300001-300001msec

Disk stats (read/write):
    dm-2: ios=6120646/6115090, merge=0/0, ticks=806337/255779, in_queue=1062184, util=100.00%, aggrios=6121601/6116035, aggrmerge=0/1, aggrticks=801719/251319, aggrin_queue=1051124, aggrutil=100.00%
  sdb: ios=6121601/6116035, merge=0/1, ticks=801719/251319, in_queue=1051124, util=100.00%
www.zhangfangzhou.cn

4、结果
直通 (DirectPath)
read: IOPS=26.0k, BW=102MiB/s (107MB/s)(29.8GiB/300001msec)
write: IOPS=25.0k, BW=102MiB/s (106MB/s)(29.7GiB/300001msec)

虚拟化Virtualization
read: IOPS=20.4k, BW=79.7MiB/s (83.6MB/s)(23.4GiB/300001msec)
write: IOPS=20.4k, BW=79.6MiB/s (83.5MB/s)(23.3GiB/300001msec)

 

读取 (Read) 写入 (Write) 带宽 (Bandwidth)
直通 (DirectPath) 26.0k 25.0k 102MiB/s (107MB/s)
虚拟化 (Virtualization) 20.4k 20.4k 79.7MiB/s (83.6MB/s)

存储虚拟化后,读性能损失约21.5%,写性能损失约18.4%

CentOS Linux release 7.9.2009 编译安装最新版HAProxy 2.8 LTS长期支持版,并创建Shell监听脚本

编译安装HAProxy 2.8 LTS版本,官方源码包下载地址:http://www.haproxy.org/download/

由于CentOS7 之前版本自带的lua版本比较低并不符合HAProxy要求的lua最低版本(5.3)的要求,因此需要编译安装较新版本的lua环境,然后才能编译安装HAProxy。

1、查看现有的lua版本,如果版本太旧需要先更新lua

lua -v 
Lua 5.1.4  Copyright (C) 1994-2008 Lua.org, PUC-Rio

2、安装基础命令及编译依赖环境

yum install gcc readline-devel

3、下载和编译lua-5.4.6

cd /usr/local/src
wget --no-check-certificate https://www.lua.org/ftp/lua-5.4.6.tar.gz
tar xvf lua-5.4.6.tar.gz

开始编译

cd  lua-5.4.6
make linux test

查看编译安装的版本
src/lua -v
Lua 5.4.6  Copyright (C) 1994-2023 Lua.org, PUC-Rio

把旧版本移走,换成新的版本
mv /usr/bin/lua /usr/bin/lua.old
cp /usr/local/src/lua-5.4.6/src/lua /usr/bin/lua

确认现在的版本
lua -v 
Lua 5.4.6  Copyright (C) 1994-2023 Lua.org, PUC-Rio

4、安装Haproxy

创建用户
useradd -r -s /sbin/nologin haproxy

5、安装依赖

yum -y install gcc openssl-devel pcre-devel systemd-devel

6、下载和编译

wget --no-check-certificate  https://www.haproxy.org/download/2.8/src/haproxy-2.8.9.tar.gz
tar xvf haproxy-2.8.9.tar.gz
cd haproxy-2.8.9/


make TARGET=linux-glibc USE_OPENSSL=1 USE_PCRE=1 USE_SYSTEMD=1 USE_LUA=1 LUA_INC=/usr/local/src/lua-5.4.6/src LUA_LIB=/usr/local/src/lua-5.4.6/src
make install PREFIX=/etc/haproxy

创建软连接
ln -s /etc/haproxy/sbin/haproxy /usr/bin/haproxy
www.zhangfangzhou.cn
查看版本
haproxy -v
HAProxy version 2.8.9-1842fd0 2024/04/05 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2028.
Known bugs: http://www.haproxy.org/bugs/bugs-2.8.9.html
Running on: Linux 3.10.0-1160.105.1.el7.x86_64 #1 SMP Thu Dec 7 15:39:45 UTC 2023 x86_64

7、创建配置文件

vi /etc/haproxy/haproxy.cfg
global
  maxconn 100000
  chroot /etc/haproxy
  stats socket /var/tmp/haproxy.sock mode 600 level admin
  user haproxy
  group haproxy
  daemon
  pidfile /var/tmp/haproxy.pid

defaults
  option http-keep-alive
  option forwardfor
  maxconn 100000
  mode http
  timeout connect 300000ms
  timeout client 300000ms
  timeout server 300000ms

listen stats
  mode http
  bind 0.0.0.0:3306
  stats enable
  log global
  stats uri /haproxy-status
  stats auth haadmin:Admin@2018..

listen db_port
  bind :3306
  mode tcp
  log global
  balance roundrobin
  option tcplog
  server db1 10.53.123.104:3306 check inter 3000 fall 2 rise 3
  server db2 10.53.123.105:3306 check inter 3000 fall 2 rise 3
  server db3 10.53.123.106:3306 check inter 3000 fall 2 rise 3


8、创建service服务

vi /usr/lib/systemd/system/haproxy.service
[Unit]
Description=HAProxy Load Balancer
After=syslog.target network.target

[Service]
ExecStartPre=/usr/bin/haproxy -f /etc/haproxy/haproxy.cfg -c -q
ExecStart=/usr/bin/haproxy -Ws -f /etc/haproxy/haproxy.cfg
ExecReload=/bin/kill -USR2 $MAINPID
LimitNOFILE=100000

[Install]
WantedBy=multi-user.target

systemctl enable haproxy
systemctl start haproxy
9、设置监听脚本

vi ha_watchdog.sh
#!/bin/bash
#www.zhangfangzhou.cn
# 定义日志文件路径
LOG_FILE="/var/log/haproxy_watchdog.log"

# 定义 HAProxy 进程的名称
HAPROXY_PROCESS="haproxy"

# 检查 HAProxy 进程是否在运行
check_haproxy_process() {
    # 使用 pgrep 命令检查是否存在名为 haproxy 的进程
    pgrep -x $HAPROXY_PROCESS > /dev/null
}

# 启动 HAProxy 进程
start_haproxy() {
    echo "$(date +'%Y-%m-%d %H:%M:%S') - Starting HAProxy..." >> $LOG_FILE
    systemctl start haproxy
}

# 主循环
while true; do
    # 检查 HAProxy 进程是否在运行
    if ! check_haproxy_process; then
        # 如果进程不存在,则重新启动 HAProxy 并记录日志
        start_haproxy
    fi
    # 休眠一段时间后再次检查
    sleep 60
done


#设置开机启动
chmod +x ha_watchdog.sh

vi /etc/rc.local
nohup /root/ha_watchdog.sh &

一台 Linux 服务器最多能支撑多少个 TCP 连接?

一台 Linux 服务器最多能支撑多少个 TCP 连接?一台Linux机器上最多能建立多少个TCP连接?

困惑很多人的并发问题

我发现有很多同学对一个基础问题始终是没有彻底搞明白。那就是一台服务器最大究竟能支持多少个网络连接?我想我有必要单独发一篇文章来好好说一下这个问题。

很多同学看到这个问题的第一反应是65535。原因是:“听说端口号最多有65535个,那长连接就最多保持65535个了”。是这样的吗?还有的人说:“应该受TCP连接里四元组的空间大小限制,算起来是200多万亿个!”
如果你对这个问题也是理解的不够彻底,那么今天讲个故事讲给你听!

一次关于服务器端并发的聊天

"TCP连接四元组是源IP地址、源端口、目的IP地址和目的端口。任意一个元素发生了改变,那么就代表的是一条完全不同的连接了。拿我的Nginx举例,它的端口是固定使用80。另外我的IP也是固定的,这样目的IP地址、目的端口都是固定的。剩下源IP地址、源端口是可变的。所以理论上我的Nginx上最多可以建立2的32次方(ip数)×2的16次方(port数)个连接。这是两百多万亿的一个大数字!!"

"进程每打开一个文件(linux下一切皆文件,包括socket),都会消耗一定的内存资源。如果有不怀好心的人启动一个进程来无限的创建和打开新的文件,会让服务器崩溃。所以linux系统出于安全角度的考虑,在多个位置都限制了可打开的文件描述符的数量,包括系统级、用户级、进程级。这三个限制的含义和修改方式如下:"

  • 系统级:当前系统可打开的最大数量,通过fs.file-max参数可修改
  • 用户级:指定用户可打开的最大数量,修改/etc/security/limits.conf
  • 进程级:单个进程可打开的最大数量,通过fs.nr_open参数可修改

"我的接收缓存区大小是可以配置的,通过sysctl命令就可以查看。"

sysctl -a | grep rmem
net.ipv4.tcp_rmem = 4096 87380 8388608
net.core.rmem_default = 212992
net.core.rmem_max = 8388608

"其中在tcp_rmem"中的第一个值是为你们的TCP连接所需分配的最少字节数。该值默认是4K,最大的话8MB之多。也就是说你们有数据发送的时候我需要至少为对应的socket再分配4K内存,甚至可能更大。"

"TCP分配发送缓存区的大小受参数net.ipv4.tcp_wmem配置影响。"

sysctl -a | grep wmem
net.ipv4.tcp_wmem = 4096 65536 8388608
net.core.wmem_default = 212992
net.core.wmem_max = 8388608

"在net.ipv4.tcp_wmem"中的第一个值是发送缓存区的最小值,默认也是4K。当然了如果数据很大的话,该缓存区实际分配的也会比默认值大。"

服务端百万连接达成记

“准备啥呢,还记得前面说过Linux对最大文件对象数量有限制,所以要想完成这个实验,得在用户级、系统级、进程级等位置把这个上限加大。我们实验目的是100W,这里都设置成110W,这个很重要!因为得保证做实验的时候其它基础命令例如ps,vi等是可用的。“

活动连接数量确实达到了100W:

ss -n | grep ESTAB | wc -l  
1000024

当前机器内存总共是3.9GB,其中内核Slab占用了3.2GB之多。MemFree和Buffers加起来也只剩下100多MB了:

cat /proc/meminfo
MemTotal:        3922956 kB
MemFree:           96652 kB
MemAvailable:       6448 kB
Buffers:           44396 kB
......
Slab:          3241244KB kB

通过slabtop命令可以查看到densty、flip、sock_inode_cache、TCP四个内核对象都分别有100W个:

结语

互联网后端的业务特点之一就是高并发. 但是一台服务器最大究竟能支持多少个TCP连接,这个问题似乎却又在困惑着很多同学。希望今天过后,你能够将这个问题踩在脚下摩擦!

转载于 https://mp.weixin.qq.com/s/Lkyj42NtvqEj63DoCY5btQ

CentOS 7.8使用devtoolset-9使用高版本gcc version 9.3.1 20200408 (Red Hat 9.3.1-2) (GCC)编译安装Redis 6.0.5

CentOS 7.8 编译安装Redis 6.0.5报错
In file included from server.c:30:0:
server.h:1045:5: error: expected specifier-qualifier-list before ‘_Atomic’
_Atomic unsigned int lruclock; /* Clock for LRU eviction */
^
server.c: In function ‘serverLogRaw’:
server.c:1028:31: error: ‘struct redisServer’ has no member named ‘logfile’
int log_to_stdout = server.logfile[0] == '\0';
^
server.c:1031:23: error: ‘struct redisServer’ has no member named ‘verbosity’
if (level < server.verbosity) return;
^
server.c:1033:47: error: ‘struct redisServer’ has no member named ‘logfile’
fp = log_to_stdout ? stdout : fopen(server.logfile,"a");
^
server.c:1046:47: error: ‘struct redisServer’ has no member named ‘timezone’
nolocks_localtime(&tm,tv.tv_sec,server.timezone,server.daylight_active);
^
server.c:1046:63: error: ‘struct redisServer’ has no member named ‘daylight_active’
nolocks_localtime(&tm,tv.tv_sec,server.timezone,server.daylight_active);
......
......

问题原因
CentOS 7的gcc版本为4.8.5,Redis 6.0.5最低需要gcc4.9,因此需要升级gcc版本
from redis 6.0.5, building redis from source code needs C11 support.The version of gcc in CentOS 7 is 4.8.5, but C11 was introduced in 4.9.

解决办法
1、手动编译gcc大于4.9的版本
2、安装 devtoolset-9(使用高版本gcc version 9.3.1 20200408 (Red Hat 9.3.1-2) (GCC))编译安装Redis 6.0.5
yum install centos-release-scl -y
yum install devtoolset-9 -y

临时使用高版本gcc version 9.3.1 20200408 (Red Hat 9.3.1-2) (GCC) (推荐使用这个方法)
export CC=/opt/rh/devtoolset-9/root/usr/bin/gcc
export CPP=/opt/rh/devtoolset-9/root/usr/bin/cpp
export CXX=/opt/rh/devtoolset-9/root/usr/bin/c++

wget http://download.redis.io/releases/redis-stable.tar.gz
tar zxf redis-stable.tar.gz && cd redis-stable
make -j2 && make install
if [ -f "/root/redis-stable/src/redis-server" ]; then
mkdir -p /usr/local/redis/{bin,etc,var}
/bin/cp /root/redis-stable/src/{redis-benchmark,redis-check-aof,redis-check-rdb,redis-cli,redis-sentinel,redis-server} /usr/local/redis/bin/
/bin/cp /root/redis-stable/redis.conf /usr/local/redis/etc/

cd /root/redis-stable/src
id -u redis >/dev/null 2>&1
[ $? -ne 0 ] && useradd -M -s /sbin/nologin redis

chown -R redis:redis /usr/local/redis/{var,etc}
#
if [ -e /bin/systemctl ]; then
cat > /lib/systemd/system/redis-server.service << "EOF"
[Unit]
Description=Redis In-Memory Data Store
After=network.target

[Service]
Type=forking
PIDFile=/var/run/redis/redis.pid
User=redis
Group=redis

Environment=statedir=/var/run/redis
PermissionsStartOnly=true
ExecStartPre=/bin/mkdir -p ${statedir}
ExecStartPre=/bin/chown -R redis:redis ${statedir}
ExecStart=/usr/local/redis/bin/redis-server /usr/local/redis/etc/redis.conf
ExecStop=/bin/kill -s TERM $MAINPID
Restart=always
LimitNOFILE=1000000
LimitNPROC=1000000
LimitCORE=1000000

[Install]
WantedBy=multi-user.target
EOF
systemctl enable redis-server

#确认Redis 6.0.5版本
[www@zhangfangzhou_cn ~]# redis-server -v
Redis server v=6.0.5 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=1987ac006866aa00

CentOS7.7安装 devtoolset-8(使用高版本gcc (GCC) 8.3.1 20190311)编译安装aria2-1.35.0
CentOS6安装devtoolset(使用高版本gcc)GCC 4.8 GCC 4.9 GCC 5.2

如何编译Linux Kernel(linux-5.6.12内核)并制作成rpm文件

如何编译内核及制作RPM包
CentOS 7 编译Linux Kernel(linux-5.6.12内核)并制作成rpm文件
1、下载Latest Stable Kernel
wget https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.6.12.tar.xz
tar -Jxf linux-5.6.12.tar.xz

2、安装依赖包
yum -y install openssh-devel elfutils-libelf-devel bc

3、从 /boot 目录将现有版本的内核编译config配置文件拷过来到放到新的内核源码解压目录内并重命名为.config的隐藏文件(这个文件保存了在安装系统时内核所安装的模块配置信,否则需要重新手动指定每一个模块的编译配置)
cd linux-5.6.12
cp /boot/config-3.10.0-1062.el7.x86_64 ./.config
或者
cp /boot/config-$(uname -r) ./.config

3、安装开发工具包组
yum -y groupinstall "development tools"

4、安装ncurse-devel包 (make menuconfig 文本界面窗口依赖包)
yum -y install ncurses-devel
运行 make menuconfig,开启文本界面的编译选项菜单窗口,可以对内核加载的模块编译选项进行调整,如修改编译后的内核名称、新添加之前系统缺少的模块等
make menuconfig

(1)修改内核名称
General setup --->local version -append to kernel release #注意不要有空格

-----------
出现空格的话会产生错误错误
[root@www.zhangfangzhou.cn linux-5.6.12]# sudo make modules_install
ln: target ‘5.6.12_zhangfangzhou.cn_20200510/source’ is not a directory
make[1]: *** [_modinst_] Error 1
make: *** [sub-make] Error 2
-----------

(2)新添加NTFS文件系统支持模块
File systems --->DOS/FAT/NT Filesystems --->NTFS file system support

5、确认配置文件中NTFS功能是否添加成功
vi .config

6、编译内核 #时间较长,具体时间根据硬件性能决定
make -j `cat /proc/cpuinfo | grep 'model name'|wc -l`
或者
## get thread or cpu core count using nproc command ##
make -j $(nproc)

7、编译安装模块
编译完成后执行make modules_install 安装内核模块
make modules_install

8、安装内核核心文件
make install

9、制作成linux-5.6.12内核rpm文件
yum -y install rpmdevtools

cd linux-5.6.12
make rpm-pkg ##同时构建源和二进制RPM软件包
或者
make binrpm-pkg ##仅构建二进制RPM软件包

Checking for unpackaged file(s): /usr/lib/rpm/check-files /root/rpmbuild/BUILDROOT/kernel-5.6.12_zhangfangzhou.cn_20200510-1.x86_64
Wrote: /root/rpmbuild/SRPMS/kernel-5.6.12_zhangfangzhou.cn_20200510-1.src.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/kernel-5.6.12_zhangfangzhou.cn_20200510-1.x86_64.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/kernel-headers-5.6.12_zhangfangzhou.cn_20200510-1.x86_64.rpm
Wrote: /root/rpmbuild/RPMS/x86_64/kernel-devel-5.6.12_zhangfangzhou.cn_20200510-1.x86_64.rpm

10、CentOS 7 更换最新内核
egrep ^menuentry /etc/grub2.cfg | cut -f 2 -d \' #查看内核版本
grub2-set-default 0

reboot重启

11、Debian/Ubuntu 更换最新内核
sudo update-initramfs -c -k 5.6.12
sudo update-grub

12、查看内核版本
uname -msr
Linux 5.6.12_zhangfangzhou.cn_20200510 x86_64
----------
#https://linuxconfig.org/how-to-compile-vanilla-linux-kernel-from-source-on-fedora
#https://linuxhint.com/compile-linux-kernel-centos7/

百度云磁盘CDS Linux数据盘扩展分区

百度云磁盘CDS Linux数据盘扩展分区
随着Web业务的持续增加,原先400G的Web服务器数据磁盘已经不能满足使用,为满足Web业务的需要,决定扩容数据磁盘到1000G。
在百度云磁盘CDS扩容只是将磁盘的存储容量扩大,不会扩展分区和文件系统。

实战记录百度云磁盘CDS Linux数据盘扩展分区,Web 服务器为CentOS release 6.10 (Final)系统,数据磁盘的格式为EXT4。

1、查看Web服务器数据磁盘挂载信息
#mount | grep "/dev/vdb"
/dev/vdb1 on /data type ext4 (rw)

2、查看是否有程序在Web服务器数据磁盘正在运行
fuser -m /dev/vdb1
/dev/vdb1: 5115

如果有的话,查找并结束掉进程
ps aux | grep 5115
#ps aux | grep 5115
apache 5115 0.2 0.1 449060 29048 ? S 10:56 0:03 /usr/sbin/httpd

3、关闭正在使用Web服务器数据磁盘的服务或者进程
service httpd stop 或者 kill -9 5115

4、编辑fstab,取消Web服务器数据磁盘自动挂载
vi /etc/fstab
#/dev/vdb1 /data ext4 defaults 0 0

5、卸载Web服务器数据磁盘
#umount /dev/vdb1

#umount /dev/vdb1 如果有程序正在运行,则会这样提示
umount: /data: target is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
取消正在运行的程序 比如webserver mysql

6、调整分区大小
(1)查看分区表
fdisk /dev/vdb

p print the partition table
p
Command (m for help): p
Disk /dev/vdb: 1073.7 GB, 1073741824000 bytes
16 heads, 63 sectors/track, 2080507 cylinders
Units = cylinders of 1008 * 512 = 516096 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x2c594bf0

Device Boot Start End Blocks Id System
/dev/vdb1 1 2080507 1048575496+ 83 Linux

(2) 删除当前分区。当屏幕显示"已选择分区 1
d delete a partition
d

(3)新建分区
n add a new partition
n

(4)保存修改并退出
w write table to disk and exit
w
The partition table has been altered!

----------------------------
如果fstab有自动挂载则会出现这样的提示,需要提前执行第4步。
Calling ioctl() to re-read partition table.

WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table. The new table will be used at
the next reboot or after you run partprobe(8) or kpartx(8)
Syncing disks.
---------------------------
8、执行命令 lsblk /dev/vdb 查看当前文件系统是否已增加分区表
lsblk /dev/vdb
#lsblk /dev/vdb
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vdb 252:16 0 1000G 0 disk
└─vdb1 252:17 0 1000G 0 part /data

9、执行命令 e2fsck -n /dev/vdb1 查看当前文件系统的状态是否为 clean
e2fsck -n /dev/vdb1
如果有异常则会提示
/dev/vdb1 contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/vdb1: 5641753/26214400 files (2.5% non-contiguous), 97404495/104857570 blocks

修复
fsck.ext4 -C0 /dev/vdb1

10、通知内核更新分区表 通知内核需同步数据盘的分区表信息。
yum -y install parted
partprobe /dev/vdb

11、完成文件系统扩展
resize2fs /dev/vdb1
resize2fs 1.41.12 (17-May-2010)
Resizing the filesystem on /dev/vdb1 to 262142637 (4k) blocks.
The filesystem on /dev/vdb1 is now 262142637 blocks long.

Web服务器数据磁盘采用 ext4 文件系统,可以执行 resize2fs /dev/vdb1 命令以扩展文件系统大小。
对于 xfs 文件系统,需要先运行 mount /dev/vdb1 /data/ 命令,再运行xfs_growfs /dev/vdb1 以完成文件系统拓展。

12、编辑fstab,启用Web服务器数据磁盘自动挂载
vi /etc/fstab
/dev/vdb1 /data ext4 defaults 0 0

13、挂载全部
mount -a

14、确认文件系统扩展完毕
df -hT
Filesystem Type Size Used Avail Use% Mounted on
/dev/vda1 ext4 20G 8.3G 11G 45% /
tmpfs tmpfs 12G 0 12G 0% /dev/shm
/dev/vdb1 ext4 985G 366G 569G 40% /data
/dev/vdc1 ext4 394G 270G 105G 73% /backup

---------------------------
https://cloud.baidu.com/doc/CDS/s/Lk0629a17

CentOS7.7安装 devtoolset-8(使用高版本gcc (GCC) 8.3.1 20190311)编译安装aria2-1.35.0

Aria2是一个支持HTTP/HTTPS, FTP, SFTP, BitTorrent和Metalink等协议的轻量级命令行下载工具。
编译安装aria2-1.35.0的时候出现错误,提示需要高版本gcc支持。
......
checking whether g++ supports C++11 features with -std=c++11 ... no
checking whether g++ supports C++11 features with -std=c++11 -stdlib=libc++... no
checking whether g++ supports C++11 features with -std=c++0x ... no
checking whether g++ supports C++11 features with -std=c++0x -stdlib=libc++... no
configure: error: *** A compiler with support for C++11 language features is required.
......

CentOS7.7默认的GCC版本为gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
#gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
#cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)

解决方法有两种
1、手动编译gcc>4.8的版本
自行编译
2、安装devtoolset-8(使用高版本gcc (GCC) 8.3.1 20190311
yum install centos-release-scl -y
yum install devtoolset-8 -y
启用devtoolset-8
#scl enable devtoolset-8 -- bash
#source /opt/rh/devtoolset-8/enable

临时编译前使用高版本gcc (GCC) 8.3.1 20190311 (推荐使用这个方法)
export CC=/opt/rh/devtoolset-8/root/usr/bin/gcc
export CPP=/opt/rh/devtoolset-8/root/usr/bin/cpp
export CXX=/opt/rh/devtoolset-8/root/usr/bin/c++
继续编译安装aria2-1.35.0
......

CentOS6安装devtoolset(使用高版本gcc)GCC 4.8 GCC 4.9 GCC 5.2

CentOS6安装devtoolset(使用高版本gcc)GCC 4.8 GCC 4.9 GCC 5.2

echo 0 > /proc/sys/kernel/hung_task_timeout_secs” disables this message.

tail /var/log/messages
echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 19 05:27:57 web kernel: Call Trace:
May 19 05:27:57 web kernel: [<ffffffff811d10a0>] ? sync_buffer+0x0/0x50
May 19 05:27:57 web kernel: [<ffffffff81549183>] io_schedule+0x73/0xc0
May 19 05:27:57 web kernel: [<ffffffff811d10e0>] sync_buffer+0x40/0x50
May 19 05:27:57 web kernel: [<ffffffff81549c6f>] __wait_on_bit+0x5f/0x90
May 19 05:27:57 web kernel: [<ffffffff811d10a0>] ? sync_buffer+0x0/0x50
May 19 05:27:57 web kernel: [<ffffffff81549d18>] out_of_line_wait_on_bit+0x78/0x90
May 19 05:27:57 web kernel: [<ffffffff810a6920>] ? wake_bit_function+0x0/0x50
May 19 05:27:57 web kernel: [<ffffffff811d1096>] __wait_on_buffer+0x26/0x30
May 19 05:27:57 web kernel: [<ffffffffa00707ef>] jbd2_journal_commit_transaction+0x117f/0x14f0 [jbd2]
May 19 05:27:57 web kernel: [<ffffffff8108fb2b>] ? try_to_del_timer_sync+0x7b/0xe0
May 19 05:27:57 web kernel: [<ffffffffa0075a38>] kjournald2+0xb8/0x220 [jbd2]
May 19 05:27:57 web kernel: [<ffffffff810a68a0>] ? autoremove_wake_function+0x0/0x40
May 19 05:27:57 web kernel: [<ffffffffa0075980>] ? kjournald2+0x0/0x220 [jbd2]
May 19 05:27:57 web kernel: [<ffffffff810a640e>] kthread+0x9e/0xc0
May 19 05:27:57 web kernel: [<ffffffff8100c28a>] child_rip+0xa/0x20
May 19 05:27:57 web kernel: [<ffffffff810a6370>] ? kthread+0x0/0xc0
May 19 05:27:57 web kernel: [<ffffffff8100c280>] ? child_rip+0x0/0x20
May 19 05:28:40 web kernel: end_request: I/O error, dev vdb, sector 564522839
May 19 05:33:42 web kernel: end_request: I/O error, dev vdb, sector 564522839
May 19 05:38:44 web kernel: end_request: I/O error, dev vdb, sector 564522839

进程等待IO时,经常处于D状态,即TASK_UNINTERRUPTIBLE状态,处于这种状态的进程不处理信号,所以kill不掉,如果进程长期处于D状态,那么肯定不正常,
原因可能有二
1)IO路径上的硬件出问题了,比如硬盘坏了(只有少数情况会导致长期D,通常会返回错误)
2)内核自己出问题了
这种问题不好定位,而且一旦出现就通常不可恢复,kill不掉,通常只能重启恢复了。
内核针对这种开发了一种hung task的检测机制。
基本原理是:定时检测系统中处于D状态的进程,如果其处于D状态的时间超过了指定时间(默认120s,可以配置),则打印相关堆栈信息,也可以通过proc参数配置使其直接panic。

1、查看是否存在坏块
/sbin/badblocks -v /dev/sdc

2、问题分析
May 19 05:27:57 web kernel: [<ffffffff811d10a0>] ? sync_buffer+0x0/0x50
May 19 05:27:57 web kernel: [<ffffffff81549183>] io_schedule+0x73/0xc0
May 19 05:27:57 web kernel: [<ffffffff811d10e0>] sync_buffer+0x40/0x50
May 19 05:27:57 web kernel: [<ffffffff81549c6f>] __wait_on_bit+0x5f/0x90

3、临时方案
根据应用程序情况,对vm.dirty_ratio,vm.dirty_background_ratio两个参数进行调优设置。
# sysctl -w vm.dirty_ratio=10
# sysctl -w vm.dirty_background_ratio=5
# sysctl -p

如果系统永久生效,修改/etc/sysctl.conf文件。
#vi /etc/sysctl.conf

vm.dirty_background_ratio = 5
vm.dirty_ratio = 10

重启系统生效
http://www.361way.com/kernel-hung-task-analysis/4326.html