Automated installation of ESXi 7 with PXE boot and kickstart (UEFI) using a DHCP server

UEFI-based PXE installation of ESXi 7
Automated ESXi 7 installation over PXE network boot, explained
Bulk-installing ESXi 7 with PXE + Kickstart
Booting the ESXi 7 installer over PXE and TFTP

1. Recorded video: automated ESXi 7 installation with PXE boot and kickstart (UEFI) from a DHCP server

2. Recorded video: the same automated installation on an Inspur NF5280M5 server

# On CentOS 7, install the DHCP server, TFTP server, and web server packages
yum install -y dhcp tftp-server httpd

1. Configure the DHCP server

Daemon: /usr/sbin/dhcpd
Service unit: dhcpd.service (managed by systemd on CentOS 7)
Ports: 67/udp (bootps) on the server, 68/udp (bootpc) on the client
Configuration file: /etc/dhcp/dhcpd.conf
Lease database: /var/lib/dhcpd/dhcpd.leases
Sample configuration: cp /usr/share/doc/dhcp-4.2.5/dhcpd.conf.example /etc/dhcp/dhcpd.conf
To pin dhcpd to a specific listening interface, edit: vi /etc/sysconfig/dhcpd
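
For example (a minimal sketch; ens192 is a placeholder for the NIC that faces the PXE network):

# /etc/sysconfig/dhcpd
DHCPDARGS=ens192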

# Edit the DHCP server configuration file
vi /etc/dhcp/dhcpd.conf

default-lease-time 600;
max-lease-time 7200;
log-facility local7;

subnet 10.53.220.0 netmask 255.255.255.0 {
  range 10.53.220.41 10.53.220.49;
  option domain-name-servers 10.1.0.1, 114.114.114.114;   # DNS server addresses
  option routers 10.53.220.1;                             # Default gateway
  next-server 10.53.220.224;                              # TFTP server address
  filename "/mboot.efi";                                  # Network boot image file
}
# Static IP reservation (MAC-to-IP binding)
host zhangfangzhou {                                      # the host label (zhangfangzhou here) is arbitrary but must be unique
  hardware ethernet 00:50:56:99:06:b7;
  fixed-address 10.53.220.48;
}
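
Before starting the service, it is worth syntax-checking the configuration; dhcpd exits non-zero and reports the offending line on errors:

dhcpd -t -cf /etc/dhcp/dhcpd.conf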

# Enable dhcpd at boot and start it
systemctl enable dhcpd && systemctl start dhcpd

######## Firewall settings
firewall-cmd --zone=public --add-port=67/udp --permanent
firewall-cmd --zone=public --add-port=68/udp --permanent
# Reload the firewall configuration
firewall-cmd --reload
firewall-cmd --list-all # show the active rules (only the policy defined in /etc/firewalld/zones/public.xml)

2. Configure the TFTP server

Enable the tftp service
vi /etc/xinetd.d/tftp

# default: off
# description: The tftp server serves files using the trivial file transfer \
#       protocol.  The tftp protocol is often used to boot diskless \
#       workstations, download configuration files to network-aware printers, \
#       and to start the installation process for some operating systems.
service tftp
{
        socket_type             = dgram
        protocol                = udp
        wait                    = yes
        user                    = root
        server                  = /usr/sbin/in.tftpd
        server_args             = -s /var/lib/tftpboot # TFTP root directory
        disable                 = no    # changed from yes to no
        per_source              = 11
        cps                     = 100 2
        flags                   = IPv4
}

# Enable the TFTP service at boot and start it (on CentOS 7 the in.tftpd server is socket-activated, so enable the socket unit)
systemctl enable tftp.socket && systemctl start tftp.socket

######## Firewall settings (TFTP uses UDP port 69; the firewalld tftp service also loads the conntrack helper needed for the transfer ports)
firewall-cmd --zone=public --add-service=tftp --permanent
firewall-cmd --zone=public --add-port=69/udp --permanent
# Reload the firewall configuration
firewall-cmd --reload
firewall-cmd --list-all # show the active rules (only the policy defined in /etc/firewalld/zones/public.xml)

3. Mount the ESXi 7 ISO image

# Create directories for downloading and mounting the ESXi 7 ISO
mkdir -p /var/lib/tftpboot/{iso,ESXi70u2}
wget -P /var/lib/tftpboot/iso  http://10.53.123.144/ISO/ESXi/VMware-VMvisor-Installer-7.0U2a-17867351.x86_64.iso
cd /var/lib/tftpboot
mount /var/lib/tftpboot/iso/VMware-VMvisor-Installer-7.0U2a-17867351.x86_64.iso /var/lib/tftpboot/ESXi70u2     # the mount does not persist across reboots
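
If you want the ISO mount to survive reboots, one option is a read-only loop entry in /etc/fstab (a sketch; keep the whole entry on one line):

echo "/var/lib/tftpboot/iso/VMware-VMvisor-Installer-7.0U2a-17867351.x86_64.iso /var/lib/tftpboot/ESXi70u2 iso9660 loop,ro 0 0" >> /etc/fstab
mount -a    # verify the entry mounts cleanly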

#1. UEFI boot of the ESXi 7 installer
Copy the boot image referenced in the DHCP configuration above into the TFTP root. The example above names it "mboot.efi", so copy the file under that name.
cd /var/lib/tftpboot/
cp -p /var/lib/tftpboot/ESXi70u2/efi/boot/bootx64.efi /var/lib/tftpboot/mboot.efi

Also copy boot.cfg, which describes the boot settings, directly into the tftpboot directory.
cp -p /var/lib/tftpboot/ESXi70u2/efi/boot/boot.cfg /var/lib/tftpboot/boot.cfg
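
A quick way to confirm TFTP is serving the files (run from another machine; assumes the tftp client package is installed):

tftp 10.53.220.224 -c get mboot.efi
tftp 10.53.220.224 -c get boot.cfg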
#2. Legacy BIOS boot of the ESXi 7 installer
To boot legacy BIOS clients you need version 3.86 of the syslinux package. Download it from https://www.kernel.org/pub/linux/utils/boot/syslinux/
cd /tmp
wget https://mirrors.edge.kernel.org/pub/linux/utils/boot/syslinux/3.xx/syslinux-3.86.tar.gz
tar xvzf syslinux-3.86.tar.gz
cp -p syslinux-3.86/core/pxelinux.0 /var/lib/tftpboot/    # for BIOS clients, point the DHCP filename option at pxelinux.0 instead of mboot.efi
#3. Edit boot.cfg
Strip the leading "/" from the file paths in boot.cfg:
sed -i 's|/||g' boot.cfg

4. Review boot.cfg

Adjust "title" and "prefix" for your environment. Giving "title" a descriptive name is a good idea, since it is the string displayed while the automated installation runs. "prefix" names the directory that holds the ESXi installer files; in this case it is "ESXi70u2".

Set the title and prefix values:
prefix=ESXi70u2 (ESXi70u2 is the directory the ESXi ISO was mounted on above)

# vi /var/lib/tftpboot/boot.cfg
bootstate=0
title=Loading ESXi installer www.zhangfangzhou.cn
timeout=5
prefix=ESXi70u2
kernel=b.b00
kernelopt=runweasel cdromBoot
modules=jumpstrt.gz --- useropts.gz --- features.gz --- k.b00 --- uc_intel.b00 --- uc_amd.b00 --- uc_hygon.b00 --- procfs.b00 --- vmx.v00 --- vim.v00 --- tpm.v00 --- sb.v00 --- s.v00 --- atlantic.v00 --- bnxtnet.v00 --- bnxtroce.v00 --- brcmfcoe.v00 --- brcmnvme.v00 --- elxiscsi.v00 --- elxnet.v00 --- i40enu.v00 --- iavmd.v00 --- icen.v00 --- igbn.v00 --- irdman.v00 --- iser.v00 --- ixgben.v00 --- lpfc.v00 --- lpnic.v00 --- lsi_mr3.v00 --- lsi_msgp.v00 --- lsi_msgp.v01 --- lsi_msgp.v02 --- mtip32xx.v00 --- ne1000.v00 --- nenic.v00 --- nfnic.v00 --- nhpsa.v00 --- nmlx4_co.v00 --- nmlx4_en.v00 --- nmlx4_rd.v00 --- nmlx5_co.v00 --- nmlx5_rd.v00 --- ntg3.v00 --- nvme_pci.v00 --- nvmerdma.v00 --- nvmxnet3.v00 --- nvmxnet3.v01 --- pvscsi.v00 --- qcnic.v00 --- qedentv.v00 --- qedrntv.v00 --- qfle3.v00 --- qfle3f.v00 --- qfle3i.v00 --- qflge.v00 --- rste.v00 --- sfvmk.v00 --- smartpqi.v00 --- vmkata.v00 --- vmkfcoe.v00 --- vmkusb.v00 --- vmw_ahci.v00 --- clusters.v00 --- crx.v00 --- elx_esx_.v00 --- btldr.v00 --- esx_dvfi.v00 --- esx_ui.v00 --- esxupdt.v00 --- tpmesxup.v00 --- weaselin.v00 --- loadesx.v00 --- lsuv2_hp.v00 --- lsuv2_in.v00 --- lsuv2_ls.v00 --- lsuv2_nv.v00 --- lsuv2_oe.v00 --- lsuv2_oe.v01 --- lsuv2_oe.v02 --- lsuv2_sm.v00 --- native_m.v00 --- qlnative.v00 --- vdfs.v00 --- vmware_e.v00 --- vsan.v00 --- vsanheal.v00 --- vsanmgmt.v00 --- tools.t00 --- xorg.v00 --- gc.v00 --- imgdb.tgz --- basemisc.tgz --- resvibs.tgz --- imgpayld.tgz
build=7.0.2-0.0.17867351
updated=0

At this point a plain (interactive) PXE network installation of ESXi already works.

5. Configure the web server

# Default document root: /var/www/html

systemctl enable httpd && systemctl start httpd

######## Firewall settings
firewall-cmd --zone=public --add-port=80/tcp --permanent
firewall-cmd --zone=public --add-port=443/tcp --permanent
# Reload the firewall configuration
firewall-cmd --reload
firewall-cmd --list-all # show the active rules (only the policy defined in /etc/firewalld/zones/public.xml)

6. Configure kickstart for ESXi 7

A sample kickstart file ships with ESXi at /etc/vmware/weasel/ks.cfg. Place your kickstart file in /var/www/html on the HTTP server.

# Example 1: kickstart file that assigns ESXi 7 a static IP address
vi /var/www/html/ks.cfg
#####################
# Accept the VMware End User License Agreement
vmaccepteula
 
# Set the root password for the DCUI and Tech Support Mode
rootpw P@ssw0rd
 
# Install on the first local disk, overwriting any existing VMFS
install --firstdisk --overwritevmfs
 
# Set the network on the first network adapter
network --bootproto=static --device=vmnic0 --ip=10.53.220.199 --netmask=255.255.255.0 --vlanid=0 --gateway=10.53.220.1 --hostname=10.53.220.199 --nameserver=10.1.0.1

#vmserialnum
vmserialnum --esx=5U4TK-DML1M-M8550-XK1QP-1A052
# reboot after install
reboot
 
# run the following command only on the firstboot
%firstboot --interpreter=busybox

sleep 10
# enable & start remote ESXi Shell (SSH)
vim-cmd hostsvc/enable_ssh
vim-cmd hostsvc/start_ssh
sleep 1
# enable & start ESXi Shell (TSM)
vim-cmd hostsvc/enable_esx_shell
vim-cmd hostsvc/start_esx_shell
sleep 1
# enable High Performance
esxcli system settings advanced set --option=/Power/CpuPolicy --string-value="High Performance"
sleep 1
#Disable ipv6
esxcli network ip set --ipv6-enabled=0
sleep 1
# Opt out of the Customer Experience Improvement Program (CEIP)
esxcli system settings advanced set -o /UserVars/HostClientCEIPOptIn -i 2
sleep 1
#enable ntp
/bin/esxcli system ntp set --server=ntp.aliyun.com
sleep 1
/bin/chkconfig ntpd on
sleep 1
#Disable ShellWarning
esxcli system settings advanced set -o /UserVars/SuppressShellWarning -i 1
sleep 1
cat << EOF > /etc/ssh/keys-root/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAgEA6PP8/RDdYIqUq6DE3zj9Qs8AF3uzfYH5lYrB+mMxxfI8kq2NIzQlsW1KDaH/fWYTkkK120lPUu97lWic9Li3Z3iFR6Nh4q6PVTfBNt4xOx754Ipqtpefk+9sLZAYEPK8pnRP0QZv7CtDFv842tfUYIrVNnecRQTFfNtnGDcXnO1u2RE1kq6Tr5N3595PbPLDKczjOFnS+jy0MeKKHPJcZfYz7TUTSzTwTHYbPRRaQ8/0eihUwzpAmXRo9NYNle26qp6+SlRsjGSBcUr0rh0wSe6r/C2btnpOUd//aFvcl8plhyb++nivlhB71v+6I0UcPoSXOIVs/1QuHEMbv7Ircjb2emqHtZDpk8KSYhgV0ZdbAq9XOcux76eok//xtjbleKPAcDMY/KotIEh7QX4NLQxSJOm5gCLkn5kbrHfVx6nWlGzVVds1sDlcnSAWul5lFiI5ZkArXFKcnm+aCnStPpx5SSCpZpMUdZvt8ZA7vLx3xjMDFwv5HTuTFwB9mlZrdfqp5USC2mWC3eAAPE7GxDSfJv9epteIYywIP9NVT3Z4ng9z6jrcFfA4GFlfLrk8J71cnxZ/AWZXXUwp3ooE/Cp3jc473VpK7FZwjQ7Xz9PD8WQgMOO4xnGQhPWlxhRuoTYyQVa0xOBO9gh9Cuc6zq5FQgYQEcSB+/FBJ/YNDIc= www.zhangfangzhou.cn
EOF
sleep 1
date > /finished.stamp

# Restart a last time
reboot
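
Before PXE-booting a target host, confirm the kickstart file is reachable over HTTP from the PXE network:

curl -I http://10.53.220.224/ks.cfg    # expect HTTP/1.1 200 OK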


# Example 2: kickstart file where ESXi 7 obtains its IP address via DHCP
vi /var/www/html/ks.cfg

# Accept the VMware End User License Agreement
vmaccepteula
 
# Set the root password for the DCUI and Tech Support Mode
rootpw P@ssw0rd
 
# Install on the first local disk, overwriting any existing VMFS
install --firstdisk --overwritevmfs
 
# Set the network on the first network adapter
network --bootproto=dhcp

#vmserialnum
vmserialnum --esx=5U4TK-DML1M-M8550-XK1QP-1A052
# reboot after install
reboot
 
# run the following command only on the firstboot
%firstboot --interpreter=busybox

sleep 10
# enable & start remote ESXi Shell (SSH)
vim-cmd hostsvc/enable_ssh
vim-cmd hostsvc/start_ssh
sleep 1
# enable & start ESXi Shell (TSM)
vim-cmd hostsvc/enable_esx_shell
vim-cmd hostsvc/start_esx_shell
sleep 1
# enable High Performance
esxcli system settings advanced set --option=/Power/CpuPolicy --string-value="High Performance"
sleep 1
#Disable ipv6
esxcli network ip set --ipv6-enabled=0
sleep 1
# Opt out of the Customer Experience Improvement Program (CEIP)
esxcli system settings advanced set -o /UserVars/HostClientCEIPOptIn -i 2
sleep 1
#enable ntp
/bin/esxcli system ntp set --server=ntp.aliyun.com
sleep 1
/bin/chkconfig ntpd on
sleep 1
#Disable ShellWarning
esxcli system settings advanced set -o /UserVars/SuppressShellWarning -i 1
sleep 1
cat << EOF > /etc/ssh/keys-root/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAgEA6PP8/RDdYIqUq6DE3zj9Qs8AF3uzfYH5lYrB+mMxxfI8kq2NIzQlsW1KDaH/fWYTkkK120lPUu97lWic9Li3Z3iFR6Nh4q6PVTfBNt4xOx754Ipqtpefk+9sLZAYEPK8pnRP0QZv7CtDFv842tfUYIrVNnecRQTFfNtnGDcXnO1u2RE1kq6Tr5N3595PbPLDKczjOFnS+jy0MeKKHPJcZfYz7TUTSzTwTHYbPRRaQ8/0eihUwzpAmXRo9NYNle26qp6+SlRsjGSBcUr0rh0wSe6r/C2btnpOUd//aFvcl8plhyb++nivlhB71v+6I0UcPoSXOIVs/1QuHEMbv7Ircjb2emqHtZDpk8KSYhgV0ZdbAq9XOcux76eok//xtjbleKPAcDMY/KotIEh7QX4NLQxSJOm5gCLkn5kbrHfVx6nWlGzVVds1sDlcnSAWul5lFiI5ZkArXFKcnm+aCnStPpx5SSCpZpMUdZvt8ZA7vLx3xjMDFwv5HTuTFwB9mlZrdfqp5USC2mWC3eAAPE7GxDSfJv9epteIYywIP9NVT3Z4ng9z6jrcFfA4GFlfLrk8J71cnxZ/AWZXXUwp3ooE/Cp3jc473VpK7FZwjQ7Xz9PD8WQgMOO4xnGQhPWlxhRuoTYyQVa0xOBO9gh9Cuc6zq5FQgYQEcSB+/FBJ/YNDIc= www.zhangfangzhou.cn
EOF
sleep 1
date > /finished.stamp

# Restart a last time
reboot

7. Final boot.cfg

# vi /var/lib/tftpboot/boot.cfg
bootstate=0
title=Loading ESXi installer www.zhangfangzhou.cn
timeout=5
prefix=ESXi70u2
kernel=b.b00
kernelopt=ks=http://10.53.220.224/ks.cfg
modules=jumpstrt.gz --- useropts.gz --- features.gz --- k.b00 --- uc_intel.b00 --- uc_amd.b00 --- uc_hygon.b00 --- procfs.b00 --- vmx.v00 --- vim.v00 --- tpm.v00 --- sb.v00 --- s.v00 --- atlantic.v00 --- bnxtnet.v00 --- bnxtroce.v00 --- brcmfcoe.v00 --- brcmnvme.v00 --- elxiscsi.v00 --- elxnet.v00 --- i40enu.v00 --- iavmd.v00 --- icen.v00 --- igbn.v00 --- irdman.v00 --- iser.v00 --- ixgben.v00 --- lpfc.v00 --- lpnic.v00 --- lsi_mr3.v00 --- lsi_msgp.v00 --- lsi_msgp.v01 --- lsi_msgp.v02 --- mtip32xx.v00 --- ne1000.v00 --- nenic.v00 --- nfnic.v00 --- nhpsa.v00 --- nmlx4_co.v00 --- nmlx4_en.v00 --- nmlx4_rd.v00 --- nmlx5_co.v00 --- nmlx5_rd.v00 --- ntg3.v00 --- nvme_pci.v00 --- nvmerdma.v00 --- nvmxnet3.v00 --- nvmxnet3.v01 --- pvscsi.v00 --- qcnic.v00 --- qedentv.v00 --- qedrntv.v00 --- qfle3.v00 --- qfle3f.v00 --- qfle3i.v00 --- qflge.v00 --- rste.v00 --- sfvmk.v00 --- smartpqi.v00 --- vmkata.v00 --- vmkfcoe.v00 --- vmkusb.v00 --- vmw_ahci.v00 --- clusters.v00 --- crx.v00 --- elx_esx_.v00 --- btldr.v00 --- esx_dvfi.v00 --- esx_ui.v00 --- esxupdt.v00 --- tpmesxup.v00 --- weaselin.v00 --- loadesx.v00 --- lsuv2_hp.v00 --- lsuv2_in.v00 --- lsuv2_ls.v00 --- lsuv2_nv.v00 --- lsuv2_oe.v00 --- lsuv2_oe.v01 --- lsuv2_oe.v02 --- lsuv2_sm.v00 --- native_m.v00 --- qlnative.v00 --- vdfs.v00 --- vmware_e.v00 --- vsan.v00 --- vsanheal.v00 --- vsanmgmt.v00 --- tools.t00 --- xorg.v00 --- gc.v00 --- imgdb.tgz --- basemisc.tgz --- resvibs.tgz --- imgpayld.tgz
build=7.0.2-0.0.17867351
updated=0

At this point, PXE boot from the DHCP server installs ESXi 7 fully automatically: it sets the root password, accepts the license agreement, adds the product key, enables the SSH and ESXi Shell services, sets the power policy to High Performance, disables IPv6, opts out of the Customer Experience Improvement Program (CEIP), configures NTP, and installs the authorized SSH public key.

https://www.zhangfangzhou.cn/pxe-kickstart-install-uefi-esxi-7.html

Inspur NF5280M4 server CPU upgrade: hands-on notes and lessons learned

1. Background

To build a KVM virtualization test platform and make full use of existing hardware, an Inspur NF5280M4 server was selected for a targeted upgrade: without replacing the platform itself, the original Intel Xeon E5-2630 v4 processors were upgraded to Intel Xeon E5-2690 v4.

2. CPU specification comparison
Item              E5-2630 v4      E5-2690 v4
Architecture      Broadwell-EP    Broadwell-EP
Process           14 nm           14 nm
Cores             10              14
Threads           20              28
Base frequency    2.2 GHz         2.6 GHz
Max turbo         3.1 GHz         3.5 GHz
L3 cache          25 MB           35 MB
TDP               85 W            135 W
Memory support    DDR4-2400       DDR4-2400

3. Tools and materials

Anti-static wrist strap;
Phillips screwdriver;
Alcohol wipes / lint-free cloth;
Thermal paste.

4. Power down and open the server
Shut down the operating system normally;
Disconnect the power cords and all external cables;
Press the static-discharge button (if present) and wear the anti-static wrist strap;
Remove the server top cover.

5. CPU removal order
Removal order:
Remove CPU2 first (the secondary CPU, usually on the expansion-slot side);
Then remove CPU1 (the primary CPU, usually next to memory channel 0).

Rationale:
Avoids accidental contact with the primary CPU area while working;
Leaves more room to handle the heatsinks.

6. Remove the original CPUs (E5-2630 v4): lift each CPU straight up out of the socket and store it in an anti-static box.

7. Install the new CPUs (E5-2690 v4): align the gold triangle on the CPU with the mark on the socket; lower the CPU straight in and confirm it is fully seated; close and lock the socket levers in order; clean the old thermal paste off the heatsink base, apply fresh paste evenly, then refit the heatsink and tighten its screws.
In a dual-socket server, CPU1 and CPU2 must both be replaced with the same processor model to avoid compatibility and performance problems.

8. Power-on check and verification
Refit the top cover, reconnect power, and log in to the operating system;
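
A quick way to confirm the new processors are recognized after power-on, using standard Linux tools:

lscpu | grep -E "Model name|Socket|Core|Thread"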

Citrix XenDesktop: scheduled logon access control with Citrix Gateway

1. Overall Citrix virtual desktop architecture

Citrix Virtual Apps and Desktops uses a layered, centrally managed architecture consisting of an access and security layer, a control layer, a resource layer, and an infrastructure layer, which together provide high availability, security, and scalability.

In the access and security layer, Citrix ADC (Citrix Gateway) is deployed as the unified entry point, handling external user access control, traffic steering, and security policy enforcement. Through load balancing, access policy control, identity integration (e.g. LDAP, two-factor authentication), and session protection, the ADC centrally governs user access and is the first line of defense for the desktop service.

In the control layer, the Delivery Controller (DDC) publishes virtual desktops and applications and brokers sessions; StoreFront (SF) provides the unified user access portal; Director is used for monitoring, session management, and troubleshooting. These core components rely on a backend database cluster that stores site configuration and runtime data, and their stability directly determines the platform's overall service capability.

In the resource and infrastructure layer, the desktops run on a unified virtualization platform (such as VMware or Hyper-V), with centralized storage providing persistence for desktop images, user data, and profiles, ensuring stable performance and data safety.


2. Why restrict logon hours

Given the architecture above and the platform's actual security posture, enforcing logon time windows on Citrix ADC (Citrix Gateway) and closing the logon entry point outside office hours markedly reduces the chance of being hit by vulnerability scans, brute-force attempts, and automated attacks, cutting risk at the source.

1. Log in to Citrix ADC (Citrix Gateway) and locate the LDAP policy

One script to show CPU, memory, and per-disk usage automatically after logging in to Ubuntu/Rocky

After logging in to a Linux server, the system information you usually care about is displayed automatically, at a glance.
1. What it looks like

2. The script
vi /etc/profile.d/sysinfo.sh

#!/usr/bin/env bash

# ================== Colors ==================
GREEN="\033[1;32m"
YELLOW="\033[1;33m"
CYAN="\033[1;36m"
RESET="\033[0m"

# ================== Separator ==================
LINE_LEN=60
LINE=$(printf '%*s' "$LINE_LEN" '' | tr ' ' '-')

# ================== OS & Kernel ==================
if [ -f /etc/os-release ]; then
    OS_NAME=$(awk -F= '/^PRETTY_NAME/ {gsub(/"/,""); print $2}' /etc/os-release)
else
    OS_NAME=$(uname -s)
fi

KERNEL_VER=$(uname -r)

# ================== Basic Info ==================
HOSTNAME=$(hostname)

UPTIME=$(uptime -p 2>/dev/null | sed 's/^up //')
[ -z "$UPTIME" ] && UPTIME=$(uptime | awk -F',' '{print $1}')

LOADAVG=$(awk '{print $1", "$2", "$3}' /proc/loadavg)

# ================== Memory ==================
read MEM_TOTAL MEM_USED <<<$(free -m | awk '/^Mem:/ {print $2, $3}')
MEM_PCT=$(( MEM_USED * 100 / MEM_TOTAL ))

# ================== IP Address ==================
IP_ADDR=$(hostname -I 2>/dev/null | awk '{print $1}')
IP_ADDR=${IP_ADDR:-"N/A"}

# ================== CPU Usage (top, fast) ==================
CPU_IDLE=$(top -bn1 2>/dev/null \
  | grep -E "Cpu\(s\)|%Cpu\(s\)" \
  | sed 's/,/\n/g' \
  | awk '/id/ {print $1}')

CPU_USAGE=$(awk "BEGIN {printf \"%.0f\", 100 - $CPU_IDLE}")

# ================== Output ==================
echo -e "\n${GREEN}Welcome, system information overview (系统信息概览)${RESET}"
echo -e "${YELLOW}${LINE}${RESET}"

printf "| %-16s | %-38s |\n" "Resource (资源)" "Status (使用情况)"
printf "|------------------|----------------------------------------|\n"
printf "| %-16s | %-38s |\n" "Hostname" "$HOSTNAME"
printf "| %-16s | %-38s |\n" "OS Version" "$OS_NAME"
printf "| %-16s | %-38s |\n" "Kernel Version" "$KERNEL_VER"
printf "| %-16s | %-38s |\n" "IP Address" "$IP_ADDR"
printf "| %-16s | %-38s |\n" "CPU Usage" "${CPU_USAGE}%"
printf "| %-16s | %-38s |\n" "Memory" "${MEM_USED}MB / ${MEM_TOTAL}MB (${MEM_PCT}%)"
printf "| %-16s | %-38s |\n" "Load Average" "$LOADAVG"
printf "| %-16s | %-38s |\n" "Uptime" "$UPTIME"

echo -e "${YELLOW}${LINE}${RESET}"
echo -e "${CYAN}Disk Usage (磁盘使用情况)${RESET}"
echo -e "${YELLOW}${LINE}${RESET}"

printf "| %-20s | %-10s | %-10s | %-6s |\n" \
       "Mount Point" "Used" "Total" "Use%"
printf "|----------------------|------------|------------|--------|\n"

df -h -x tmpfs -x devtmpfs | awk 'NR>1 {
    printf "| %-20s | %-10s | %-10s | %-6s |\n", $6, $3, $2, $5
}'

echo -e "${YELLOW}${LINE}${RESET}"
echo -e "${GREEN}Operate carefully. (谨慎操作)${RESET}\n"

Compiling Nginx 1.20.2 with the Sticky and nginx_upstream_check_module modules

nginx_upstream_check_module adds active health checks and some load-balancing enhancements for Nginx upstream (backend) servers.

1. What it does:
Active health checks
By default, an Nginx upstream only discovers dead backends passively (through connection failures or response timeouts).
With nginx_upstream_check_module you can:
Periodically send HTTP requests or TCP probes to the backend servers.
Determine whether each backend is available.
Automatically mark unavailable servers as down and remove them from the load-balancing pool.

2. Supported check types:
type=http → for HTTP services.
type=https → for HTTPS services.
type=tcp → for plain TCP services (non-HTTP).

interval: check interval; for example, `interval=5000` checks every 5 seconds.
rise: number of consecutive successes before a node is considered healthy again.
fall: number of consecutive failures before a node is considered unavailable.
timeout: probe timeout in milliseconds.

3. The HTTP probe request can be customized, and the response status code is checked against an expected set such as 2xx or 3xx; see the sketch below.
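
For reference, a minimal sketch of an HTTP-type check in an upstream block (the backend addresses and the /health URI are placeholders; adjust them to your environment):

upstream web_backend {
    server 10.53.121.51:8080;
    server 10.53.121.52:8080;
    # Probe /health every 5 seconds; 2 successes mark a node up, 3 failures mark it down
    check interval=5000 rise=2 fall=3 timeout=3000 type=http;
    check_http_send "GET /health HTTP/1.0\r\n\r\n";
    check_http_expect_alive http_2xx http_3xx;
}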

4. Set up usable package repositories

wget -O /etc/yum.repos.d/epel.repo https://mirrors.aliyun.com/repo/epel-7.repo
wget -4 --no-check-certificate -O /etc/yum.repos.d/CentOS-Base.repo https://www.zhangfangzhou.cn/third/Centos-7.repo

5. Download the sources

### Nginx source
wget https://nginx.org/download/nginx-1.20.2.tar.gz
tar -zxvf nginx-1.20.2.tar.gz

### zlib (optional, for data compression)
wget http://www.zlib.net/zlib-1.2.13.tar.gz
tar -zxf zlib-1.2.13.tar.gz

### PCRE (regular expression support)
wget --no-check-certificate https://ftp.pcre.org/pub/pcre/pcre-8.45.tar.gz
tar -zxf pcre-8.45.tar.gz

### OpenSSL (HTTPS support)
wget https://www.openssl.org/source/openssl-1.1.1t.tar.gz
tar -zxf openssl-1.1.1t.tar.gz
mv openssl-1.1.1t openssl

### Sticky module
wget https://bitbucket.org/nginx-goodies/nginx-sticky-module-ng/get/08a395c66e42.zip -O sticky.zip
unzip sticky.zip
mv nginx-goodies-nginx-sticky* nginx-sticky

### nginx_upstream_check_module
wget https://github.com/yaoweibin/nginx_upstream_check_module/archive/master.zip -O nginx_upstream_check_module-master.zip
unzip nginx_upstream_check_module-master.zip

6. Compile and install

yum -y install libtool  
#zlib provides data compression (building it from source is optional)
cd ~
cd zlib-1.2.13
./configure --prefix=/usr/local/zlib
make && make install
echo "/usr/local/zlib/lib" > /etc/ld.so.conf.d/zlib.conf
ldconfig

#PCRE is a set of functions implementing regular expression pattern matching with the same syntax and semantics as Perl 5 (building it from source is optional)
cd ~
sudo yum install gcc-c++ -y
tar -zxf pcre-8.45.tar.gz
cd pcre-8.45
./configure --prefix=/usr/local/pcre --enable-utf8
make && make install
~/pcre-8.45/libtool --finish  /usr/local/pcre/lib/
echo "/usr/local/pcre/lib/" > /etc/ld.so.conf.d/pcre.conf
ldconfig

## Install the patch tool (needed to apply the nginx_upstream_check_module patch)
yum install -y patch

ngstable=1.20.2
#Install Nginx
cd ~
yum -y install gzip man
tar -zxf nginx-${ngstable}.tar.gz
#
#Custom nginx name
sed -i 's@^#define NGINX_VER          "nginx/" NGINX_VERSION@#define NGINX_VER          "Microsoft-IIS/10.0/" NGINX_VERSION@g'  ~/nginx-${ngstable}/src/core/nginx.h
sed -i 's@^#define NGINX_VAR          "NGINX"@#define NGINX_VAR          "Microsoft-IIS"@g'  ~/nginx-${ngstable}/src/core/nginx.h
sed -i '30,40s@nginx@Microsoft-IIS@g'  ~/nginx-${ngstable}/src/http/ngx_http_special_response.c
sed -i '45,50s@nginx@Microsoft-IIS@g' ~/nginx-${ngstable}/src/http/ngx_http_header_filter_module.c
#
#Nginx shows the file name length of a static directory file
sed -i 's/^#define NGX_HTTP_AUTOINDEX_PREALLOCATE  50/#define NGX_HTTP_AUTOINDEX_PREALLOCATE  150/'  ~/nginx-${ngstable}/src/http/modules/ngx_http_autoindex_module.c
sed -i 's/^#define NGX_HTTP_AUTOINDEX_NAME_LEN     50/#define NGX_HTTP_AUTOINDEX_NAME_LEN     150/'  ~/nginx-${ngstable}/src/http/modules/ngx_http_autoindex_module.c
#
yum install -y patch
cd ~/nginx-${ngstable}
patch -p1 < ../nginx_upstream_check_module-master/check_1.20.1+.patch
# Expected output:
patching file src/http/modules/ngx_http_upstream_hash_module.c
patching file src/http/modules/ngx_http_upstream_ip_hash_module.c
patching file src/http/modules/ngx_http_upstream_least_conn_module.c
patching file src/http/ngx_http_upstream_round_robin.c
patching file src/http/ngx_http_upstream_round_robin.h

#Copy NGINX manual page to /usr/share/man/man8:
cp -f ~/nginx-${ngstable}/man/nginx.8 /usr/share/man/man8
gzip /usr/share/man/man8/nginx.8
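
The configure flags below run nginx as the www user and group; if they do not exist yet, a minimal way to create them first:

id www >/dev/null 2>&1 || useradd -r -s /sbin/nologin www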

cd ~/nginx-${ngstable}
./configure --prefix=/usr/local/nginx --user=www --group=www \
--build=CentOS \
--modules-path=/usr/local/nginx/modules \
--with-openssl=/root/openssl \
--with-pcre=/root/pcre-8.45 \
--with-zlib=/root/zlib-1.2.13 \
--add-module=/root/nginx-sticky \
--add-module=/root/nginx_upstream_check_module-master \
--with-http_stub_status_module \
--with-http_secure_link_module \
--with-threads \
--with-file-aio \
--with-http_v2_module \
--with-http_ssl_module \
--with-http_gzip_static_module \
--with-http_gunzip_module \
--with-http_realip_module \
--with-http_flv_module \
--with-http_mp4_module \
--with-http_sub_module \
--with-http_dav_module \
--with-stream \
--with-stream=dynamic \
--with-stream_ssl_module \
--with-stream_realip_module \
--with-stream_ssl_preread_module
make -j$(nproc)
make install
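
After installation, confirm that the sticky and upstream_check modules were compiled in:

/usr/local/nginx/sbin/nginx -V 2>&1 | grep -o -e nginx-sticky -e nginx_upstream_check_module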

7. Seamless binary upgrade of nginx

mv /usr/local/nginx/sbin/nginx /usr/local/nginx/sbin/nginx.old

# Then copy in the newly built binary:
cp ~/nginx-${ngstable}/objs/nginx /usr/local/nginx/sbin/

# Test the configuration
nginx -t
# Signal the running master to start a new master process with the new binary
kill -USR2 `cat /var/run/nginx.pid`
# Gracefully shut down the old master and its workers
kill -QUIT `cat /var/run/nginx.pid.oldbin`

8. Update the configuration

upstream oah {
    sticky;                       # session persistence
    server 10.53.121.51:8080;
    server 10.53.121.52:8080;
    server 10.53.121.53:8080;
    server 10.53.121.66:8080;
    server 10.53.121.67:8080;
    # TCP health check: only verifies that the port is reachable
    check interval=5000 rise=2 fall=3 timeout=3000 type=tcp;
}

9. Add a status page
location /status {
    check_status;
}
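
With the location in place, the check status page can be fetched over HTTP; the module can also return CSV or JSON via the format request argument (the URLs below assume the status location is served on port 80 of this host):

curl http://127.0.0.1/status
curl "http://127.0.0.1/status?format=json"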

How to install GeForce RTX 4090 and GeForce RTX 2080 Ti drivers and CUDA on Ubuntu 22.04

How to Install GeForce RTX 4090 Drivers and CUDA on Ubuntu 22.04

How to Install GeForce RTX 2080 Ti Drivers and CUDA on Ubuntu 22.04

1. Open a terminal and update the system
Update the apt package lists and upgrade the installed packages:

sudo apt update && sudo apt upgrade -y  
sudo apt install build-essential dkms linux-headers-$(uname -r) -y

2. Identify the available drivers
List the NVIDIA drivers Ubuntu offers for the detected hardware:

sudo ubuntu-drivers list
......
nvidia-driver-565, (kernel modules provided by nvidia-dkms-565)
nvidia-driver-535, (kernel modules provided by linux-modules-nvidia-535-generic)
nvidia-driver-575, (kernel modules provided by nvidia-dkms-575)
nvidia-driver-555-open, (kernel modules provided by nvidia-dkms-555-open)
nvidia-driver-575-open, (kernel modules provided by nvidia-dkms-575-open)
nvidia-driver-570-server, (kernel modules provided by linux-modules-nvidia-570-server-generic)
nvidia-driver-555, (kernel modules provided by nvidia-dkms-555)
nvidia-driver-535-server-open, (kernel modules provided by linux-modules-nvidia-535-server-open-generic)
nvidia-driver-560, (kernel modules provided by nvidia-dkms-560)
nvidia-driver-570-server-open, (kernel modules provided by linux-modules-nvidia-570-server-open-generic)
nvidia-driver-570-open, (kernel modules provided by linux-modules-nvidia-570-open-generic)
nvidia-driver-535-server, (kernel modules provided by linux-modules-nvidia-535-server-generic)
nvidia-driver-560-open, (kernel modules provided by nvidia-dkms-560-open)
nvidia-driver-565-open, (kernel modules provided by nvidia-dkms-565-open)
nvidia-driver-550, (kernel modules provided by linux-modules-nvidia-550-generic)
nvidia-driver-550-open, (kernel modules provided by linux-modules-nvidia-550-open-generic)
nvidia-driver-570, (kernel modules provided by linux-modules-nvidia-570-generic)
nvidia-driver-535-open, (kernel modules provided by linux-modules-nvidia-535-open-generic)
open-vm-tools-desktop

3. Disable the Nouveau driver: create the blacklist file, rebuild the initramfs, and reboot

cat <<EOF | sudo tee /etc/modprobe.d/blacklist-nvidia.conf
blacklist nouveau
blacklist nvidiafb
blacklist rivafb
EOF
sudo update-initramfs -u
sudo reboot
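
After the reboot, confirm nouveau is no longer loaded (the command should print nothing):

lsmod | grep -i nouveau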

4. Install the driver

sudo apt install nvidia-driver-570 -y    #sudo apt install nvidia-driver-580 -y

5. Verify that the GeForce RTX 4090 driver loaded correctly

root@lenovo-ThinkStation-PX:~# nvidia-smi 
Fri Jul 18 17:13:35 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.169                Driver Version: 570.169        CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        Off |   00000000:2A:00.0 Off |                  Off |
| 32%   42C    P8             34W /  450W |      87MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 4090        Off |   00000000:3D:00.0 Off |                  Off |
| 30%   34C    P8             16W /  450W |      15MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   2  NVIDIA GeForce RTX 4090        Off |   00000000:BD:00.0 Off |                  Off |
| 32%   32C    P8              6W /  450W |      15MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   3  NVIDIA GeForce RTX 4090        Off |   00000000:E1:00.0 Off |                  Off |
| 32%   33C    P8             15W /  450W |      15MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A            2453      G   /usr/lib/xorg/Xorg                       46MiB |
|    0   N/A  N/A            2564      G   /usr/bin/gnome-shell                     13MiB |
|    1   N/A  N/A            2453      G   /usr/lib/xorg/Xorg                        4MiB |
|    2   N/A  N/A            2453      G   /usr/lib/xorg/Xorg                        4MiB |
|    3   N/A  N/A            2453      G   /usr/lib/xorg/Xorg                        4MiB |
+-----------------------------------------------------------------------------------------+

Verify that the GeForce RTX 2080 Ti driver loaded correctly

root@su:~# nvidia-smi 
Sat Nov  1 12:04:44 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.95.05              Driver Version: 580.95.05      CUDA Version: 13.0     |
+-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 2080 Ti     Off |   00000000:00:0B.0 Off |                  N/A |
| 16%   35C    P0             86W /  250W |       1MiB /  11264MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

6. Install CUDA

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-8

Confirm the CUDA version with nvcc --version. If it reports that the nvcc command is not found, edit ~/.bashrc:

vim ~/.bashrc
export PATH=/usr/local/cuda-12.8/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

# Reload the environment variables
source ~/.bashrc

root@su:~# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:23:50_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0

7. Hold the driver and toolkit versions to prevent upgrade conflicts

sudo apt-mark hold nvidia-driver-570
sudo apt-mark hold cuda-toolkit-12-8
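
Verify the holds took effect:

apt-mark showhold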

GeForce RTX 4090

GeForce RTX 2080TI

How to install the NVIDIA L40 GPU driver and CUDA (12.8 / 13.1) on Ubuntu 24.04

To install the NVIDIA L40 driver and CUDA on Ubuntu 24.04
Applies to NVIDIA L40 / L40S data center GPUs, for deploying deep learning, AI inference, and CUDA compute environments on Ubuntu 24.04 LTS.

1. Environment
Operating system: Ubuntu 24.04 LTS (x86_64)
GPU model: NVIDIA L40 (data center GPU)
NVIDIA driver:
nvidia-driver-570-server (recommended; stable for production)
nvidia-driver-590-server (optional, needed for CUDA 13.1)
CUDA Toolkit:
CUDA 12.8 (long-term stable release, recommended)
CUDA 13.1 (newer feature release, for testing/validation)
BIOS boot mode: UEFI (do not enable EFI Secure Boot; Secure Boot blocks unsigned or non-officially-signed kernel modules, and the NVIDIA driver, VFIO, and DKMS modules usually do not meet its signing requirements; see the checks right after this list)
Above 4G Decoding: enabled (required for GPU passthrough / large-BAR memory)
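
Two quick pre-checks before installing anything (mokutil ships in the mokutil package; install it if missing):

mokutil --sb-state        # should report that Secure Boot is disabled
lspci | grep -i nvidia    # the L40 should be listed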

2. Open a terminal and update the system
Update the apt package lists and upgrade the installed packages:

sudo apt update && sudo apt upgrade -y

3. List the NVIDIA driver versions available on Ubuntu
List the drivers Ubuntu offers for the detected hardware:

sudo ubuntu-drivers list
......
nvidia-driver-565, (kernel modules provided by nvidia-dkms-565)
nvidia-driver-535, (kernel modules provided by linux-modules-nvidia-535-generic)
nvidia-driver-575, (kernel modules provided by nvidia-dkms-575)
nvidia-driver-555-open, (kernel modules provided by nvidia-dkms-555-open)
nvidia-driver-575-open, (kernel modules provided by nvidia-dkms-575-open)
nvidia-driver-570-server, (kernel modules provided by linux-modules-nvidia-570-server-generic)
nvidia-driver-590-server, (kernel modules provided by linux-modules-nvidia-590-server-generic)
nvidia-driver-555, (kernel modules provided by nvidia-dkms-555)
nvidia-driver-535-server-open, (kernel modules provided by linux-modules-nvidia-535-server-open-generic)
nvidia-driver-560, (kernel modules provided by nvidia-dkms-560)
nvidia-driver-570-server-open, (kernel modules provided by linux-modules-nvidia-570-server-open-generic)
nvidia-driver-570-open, (kernel modules provided by linux-modules-nvidia-570-open-generic)
nvidia-driver-535-server, (kernel modules provided by linux-modules-nvidia-535-server-generic)
nvidia-driver-560-open, (kernel modules provided by nvidia-dkms-560-open)
nvidia-driver-565-open, (kernel modules provided by nvidia-dkms-565-open)
nvidia-driver-550, (kernel modules provided by linux-modules-nvidia-550-generic)
nvidia-driver-550-open, (kernel modules provided by linux-modules-nvidia-550-open-generic)
nvidia-driver-570, (kernel modules provided by linux-modules-nvidia-570-generic)
nvidia-driver-535-open, (kernel modules provided by linux-modules-nvidia-535-open-generic)
open-vm-tools-desktop

4. Install the driver
For data center cards such as the L40 / A100 / H100, the server driver branch is a better fit than the desktop driver for multi-GPU servers and headless environments.

sudo apt install nvidia-driver-570-server    # server-branch driver for data center cards such as the L40: more stable, and supports data-center features such as MIG on GPUs that offer it

If you need a newer branch, install:
sudo apt install nvidia-driver-590-server

5. Reboot the system

sudo reboot

6. Verify the driver loaded correctly

root@su:~# nvidia-smi
Thu Jun 12 06:14:12 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 570.133.20             Driver Version: 570.133.20     CUDA Version: 12.8     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA L40                     Off |   00000000:13:00.0 Off |                    0 |
| N/A   29C    P0             79W /  300W |       0MiB /  46068MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                                                         
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI              PID   Type   Process name                        GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+

7. Install CUDA
Install CUDA 12.8

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install cuda-toolkit-12-8

Confirm the CUDA version with nvcc --version. If it reports that the nvcc command is not found, edit ~/.bashrc to configure the CUDA environment variables:

vim ~/.bashrc
export PATH=/usr/local/cuda-12.8/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-12.8/lib64:${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

# Reload the environment variables
source ~/.bashrc

root@su:~# nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Fri_Feb_21_20:23:50_PST_2025
Cuda compilation tools, release 12.8, V12.8.93
Build cuda_12.8.r12.8/compiler.35583870_0

Install CUDA 13.1

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-ubuntu2404.pin
sudo mv cuda-ubuntu2404.pin /etc/apt/preferences.d/cuda-repository-pin-600
wget https://developer.download.nvidia.com/compute/cuda/13.1.1/local_installers/cuda-repo-ubuntu2404-13-1-local_13.1.1-590.48.01-1_amd64.deb
sudo dpkg -i cuda-repo-ubuntu2404-13-1-local_13.1.1-590.48.01-1_amd64.deb
sudo cp /var/cuda-repo-ubuntu2404-13-1-local/cuda-*-keyring.gpg /usr/share/keyrings/
sudo apt-get update
sudo apt-get -y install cuda-toolkit-13-1


Confirm the CUDA version with nvcc --version. If it reports that the nvcc command is not found, edit ~/.bashrc to configure the CUDA 13.1 environment variables:

vim ~/.bashrc
export PATH=/usr/local/cuda-13.1/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-13.1/lib64:${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

# Reload the environment variables
source ~/.bashrc

nvcc --version should now report the 13.1 release.

8. Hold the driver and toolkit versions to prevent upgrade conflicts

sudo apt-mark hold nvidia-driver-570-server
sudo apt-mark hold cuda-toolkit-12-8



Q1: Does the CUDA version have to match what nvidia-smi shows?
It does not have to match exactly; the driver version just needs to be ≥ the version the CUDA toolkit requires.

Q2: Can multiple CUDA versions be installed side by side?
Yes; switch between them manually via the /usr/local/cuda symlink or PATH, for example as sketched below.
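
For example, switching the default toolkit by repointing the symlink (a sketch; both toolkits are assumed to be installed under /usr/local):

sudo ln -sfn /usr/local/cuda-13.1 /usr/local/cuda
# or point back to 12.8
sudo ln -sfn /usr/local/cuda-12.8 /usr/local/cuda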

Planning virtual machine MAC addresses from the VMware Organizationally Unique Identifier (OUI) and the vCenter Server ID

VMware vCenter Server MAC address allocation and conflict avoidance

1. How vCenter Server allocates MAC addresses

In enterprise VMware environments, several vCenter Servers may coexist. By default, vCenter Server generates virtual machine MAC addresses from the VMware Organizationally Unique Identifier (OUI) and the vCenter Server ID.

MAC address format

VMware OUI: 00:50:56
Full format: 00:50:56:XX:YY:ZZ

  • XX is computed as 0x80 + the vCenter Server ID (expressed in hexadecimal)

  • YY and ZZ are randomly assigned hexadecimal values
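
A quick way to compute the fourth octet for a given vCenter Server ID (here ID 25, which yields the 00:50:56:99 prefix seen in the DHCP reservation earlier):

printf '00:50:56:%02X:YY:ZZ\n' $((0x80 + 25))   # -> 00:50:56:99:YY:ZZ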

Address allocation range

  • Applies to vCenter Server instance IDs between 0 and 63

  • Each vCenter Server can allocate up to 64,000 unique MAC addresses

  • Address ranges:

    • 00:50:56:80:YY:ZZ (vCenter ID = 0)

    • 00:50:56:81:YY:ZZ (vCenter ID = 1)

    • ...

    • 00:50:56:BF:YY:ZZ (vCenter ID = 63)

Method   vCenter Server ID   MAC address range

OUI 0 00:50:56:80:YY:ZZ
OUI 1 00:50:56:81:YY:ZZ
OUI 2 00:50:56:82:YY:ZZ
OUI 3 00:50:56:83:YY:ZZ
OUI 4 00:50:56:84:YY:ZZ
OUI 5 00:50:56:85:YY:ZZ
OUI 6 00:50:56:86:YY:ZZ
OUI 7 00:50:56:87:YY:ZZ
OUI 8 00:50:56:88:YY:ZZ
OUI 9 00:50:56:89:YY:ZZ
OUI 10 00:50:56:8A:YY:ZZ
OUI 11 00:50:56:8B:YY:ZZ
OUI 12 00:50:56:8C:YY:ZZ
OUI 13 00:50:56:8D:YY:ZZ
OUI 14 00:50:56:8E:YY:ZZ
OUI 15 00:50:56:8F:YY:ZZ
OUI 16 00:50:56:90:YY:ZZ
OUI 17 00:50:56:91:YY:ZZ
OUI 18 00:50:56:92:YY:ZZ
OUI 19 00:50:56:93:YY:ZZ
OUI 20 00:50:56:94:YY:ZZ
OUI 21 00:50:56:95:YY:ZZ
OUI 22 00:50:56:96:YY:ZZ
OUI 23 00:50:56:97:YY:ZZ
OUI 24 00:50:56:98:YY:ZZ
OUI 25 00:50:56:99:YY:ZZ
OUI 26 00:50:56:9A:YY:ZZ
OUI 27 00:50:56:9B:YY:ZZ
OUI 28 00:50:56:9C:YY:ZZ
OUI 29 00:50:56:9D:YY:ZZ
OUI 30 00:50:56:9E:YY:ZZ
OUI 31 00:50:56:9F:YY:ZZ
OUI 32 00:50:56:A0:YY:ZZ
OUI 33 00:50:56:A1:YY:ZZ
OUI 34 00:50:56:A2:YY:ZZ
OUI 35 00:50:56:A3:YY:ZZ
OUI 36 00:50:56:A4:YY:ZZ
OUI 37 00:50:56:A5:YY:ZZ
OUI 38 00:50:56:A6:YY:ZZ
OUI 39 00:50:56:A7:YY:ZZ
OUI 40 00:50:56:A8:YY:ZZ
OUI 41 00:50:56:A9:YY:ZZ
OUI 42 00:50:56:AA:YY:ZZ
OUI 43 00:50:56:AB:YY:ZZ
OUI 44 00:50:56:AC:YY:ZZ
OUI 45 00:50:56:AD:YY:ZZ
OUI 46 00:50:56:AE:YY:ZZ
OUI 47 00:50:56:AF:YY:ZZ
OUI 48 00:50:56:B0:YY:ZZ
OUI 49 00:50:56:B1:YY:ZZ
OUI 50 00:50:56:B2:YY:ZZ
OUI 51 00:50:56:B3:YY:ZZ
OUI 52 00:50:56:B4:YY:ZZ
OUI 53 00:50:56:B5:YY:ZZ
OUI 54 00:50:56:B6:YY:ZZ
OUI 55 00:50:56:B7:YY:ZZ
OUI 56 00:50:56:B8:YY:ZZ
OUI 57 00:50:56:B9:YY:ZZ
OUI 58 00:50:56:BA:YY:ZZ
OUI 59 00:50:56:BB:YY:ZZ
OUI 60 00:50:56:BC:YY:ZZ
OUI 61 00:50:56:BD:YY:ZZ
OUI 62 00:50:56:BE:YY:ZZ
OUI 63 00:50:56:BF:YY:ZZ

Potential conflict risks

If two vCenter Servers have the same instance ID, they may generate identical MAC addresses, leading to:

  • Network conflicts (packet loss, abnormal communication)

  • IP binding errors (the DHCP server assigning the wrong IP)


2. Change the vCenter Server instance ID to avoid conflicts

Steps (vSphere Client)

  1. Navigate to vCenter Server > Configure

  2. Select General > Edit

  3. Adjust the Runtime Settings

    • Enter a unique value between 0 and 63 for the vCenter Server unique ID

    • Set the vCenter Server managed address (IPv4/IPv6/FQDN)

    • Make sure the vCenter Server name is correct

  4. Save the changes and restart vCenter Server

Important notes

  • Changing the instance ID does not affect MAC addresses that have already been assigned

  • To regenerate a MAC address, switch the virtual machine's network adapter to a manual MAC address and then back to automatic assignment

www.zhangfangzhou.cn

MySQL master-slave replication: Last_Error: Error 'Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs' on query

MySQL version: MySQL 5.7.31
MySQL topology: master-slave replication
Alerting node: the slave (replica)
Environment: a large manufacturing company

             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 1118
                   Last_Error: Error 'Row size too large. The maximum row size for the used table type, not counting BLOBs, is 65535. This includes storage overhead, check the manual. You have to change some columns to TEXT or BLOBs' on query. Default database: 'ekp'. Query: 'alter table ekp_ff4cf69105f28480891e add column fd_3ddc5be2f5270e varchar(4000)'

Solution: adjust the MySQL configuration parameters as below, then re-run the data synchronization.

[mysqld]
port=3360
bind_address=0.0.0.0
server-id = 128
log_bin = mysql-bin
binlog_cache_size = 4M
binlog_format = mixed
expire_logs_days = 99
relay-log=mysqld-relay-bin
init-connect = 'SET NAMES utf8mb4'
character-set-server = utf8mb4
datadir=/home/mysql
socket=/home/mysql/mysql.sock
symbolic-links=0
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
open_files_limit = 5000
key_buffer_size = 256M
query_cache_size = 64M
thread_cache_size = 64
lower_case_table_names = 1
default_storage_engine = InnoDB
innodb_file_format = Barracuda        # enable Barracuda, which supports large rows
innodb_file_per_table = 1             # one tablespace per table, easier to manage
innodb_default_row_format = DYNAMIC   # default to the DYNAMIC row format to relax row-size limits
innodb_buffer_pool_size = 2048M       # larger buffer pool (size it to available memory)
innodb_log_file_size = 512M           # larger redo log files to handle big table changes
innodb_log_buffer_size = 16M          # log buffer, improves write performance
innodb_large_prefix = 1               # required on MySQL 5.7 for long index prefixes
performance_schema = 0
explicit_defaults_for_timestamp
skip-external-locking
skip-name-resolve
max_allowed_packet=2560M
wait_timeout=60000
[client]
port = 3360
socket=/home/mysql/mysql.sock
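
A sketch of applying the change on the replica and resuming replication (service name and credentials depend on your installation; if the failed ALTER already broke consistency, re-seed the replica from a fresh dump first):

systemctl restart mysqld
mysql -uroot -p -e "STOP SLAVE; START SLAVE;"
mysql -uroot -p -e "SHOW SLAVE STATUS\G" | grep -E "Slave_IO_Running|Slave_SQL_Running|Last_Error"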