corosync & pacemaker cluster: commands
2024-06-28 16:01:43
Contributed by: a site user
Configuring a corosync & pacemaker cluster with the pcs shell

Pacemaker

Pacemaker is the Cluster Resource Manager (CRM). It manages the whole HA stack, and clients manage and monitor the entire cluster through it.
The CRM supports two resource types, ocf and lsb: OCF resource scripts live under /usr/lib/ocf/resource.d/, and LSB scripts are usually under /etc/rc.d/init.d/.

1. Common cluster management tools:
(1) Command line: crm shell / pcs
(2) Graphical: pygui / hawk / lcmc / pcs

2. Related resource files:
(1) /usr/lib/ocf/resource.d: location of the pacemaker resource agents. Install the resource-agents package to get more OCF resources.
(2) /usr/sbin/fence_***: names of the fencing device executables. Install the fence-agents package to get more fencing device agents.

3. Viewing the documentation:
 [shell]# man ocf_heartbeat_***    ## OCF resource documentation, e.g. man ocf_heartbeat_apache
 [shell]# man fence_***            ## fencing device documentation, e.g. man fence_vmware

4. Reference documents
https://github.com/ClusterLabs
http://clusterlabs.org/doc/
http://www.linux-ha.org/doc/man-pages/man-pages.html
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Configuring_the_Red_Hat_High_Availability_Add-On_with_Pacemaker/index.html

While configuring the cluster I consulted many excellent articles on the Internet; thanks to their authors!
The notes below record some of the steps for configuring a corosync & pacemaker cluster with pcs commands on VMware ESXi 5.5 + CentOS 6.6. My expertise is limited, so treat them as a reference only:
--------------------------------------------------
1. Install the cluster software:
 [shell]# yum -y install corosync pacemaker pcs
 [shell]# yum -y install fence-agents resource-agents

2. Copy the configuration file and set up the startup scripts
 [shell]# mkdir -p /etc/cluster/
 [shell]# ln -s /etc/rc.d/init.d/corosync /etc/rc.d/init.d/cman
 [shell]# ln -s /usr/sbin/corosync-cmapctl /usr/sbin/corosync-objctl
 [shell]# cp /etc/corosync/corosync.conf.example /etc/corosync/corosync.conf
Note: the cluster requires strict time synchronization, and if a firewall is enabled the required ports must be opened.
--------------------------------------------------
PCS (Pacemaker/Corosync configuration system) cluster configuration examples:

I. Create the cluster
1. Authenticate the cluster nodes as the hacluster user:
 [shell]# pcs cluster auth node11 node12
2. Create a two-node cluster
 [shell]# pcs cluster setup --name mycluster node11 node12
 [shell]# pcs cluster start --all    ## start the cluster
3. Set the default resource stickiness (prevents resources from failing back)
 [shell]# pcs resource defaults resource-stickiness=100
 [shell]# pcs resource defaults
4. Set the default resource operation timeout
 [shell]# pcs resource op defaults timeout=90s
 [shell]# pcs resource op defaults
5. With two nodes, ignore loss of quorum
 [shell]# pcs property set no-quorum-policy=ignore
6. Without a fencing device, disable the STONITH component
Note that with stonith-enabled="false", resources such as the Distributed Lock Manager (DLM) and all services that depend on DLM (for example cLVM2, GFS2 and OCFS2) cannot start.
 [shell]# pcs property set stonith-enabled=false
 [shell]# crm_verify -L -V    ## validate the cluster configuration

II. Create cluster resources
1. List available resources
 [shell]# pcs resource list                   ## list supported resources, e.g. pcs resource list ocf:heartbeat
 [shell]# pcs resource describe agent_name    ## show a resource's parameters, e.g. pcs resource describe ocf:heartbeat:IPaddr2
2. Configure a virtual IP
 [shell]# pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \
     ip="192.168.10.15" cidr_netmask=32 nic=eth0 op monitor interval=30s
3. Configure Apache (httpd)
 [shell]# pcs resource create WebServer ocf:heartbeat:apache \
     httpd="/usr/sbin/httpd" configfile="/etc/httpd/conf/httpd.conf" \
     statusurl="http://localhost/server-status" op monitor interval=1min
4. Configure Nginx
 [shell]# pcs resource create WebServer ocf:heartbeat:nginx \
     httpd="/usr/sbin/nginx" configfile="/etc/nginx/nginx.conf" \
     statusurl="http://localhost/ngx_status" op monitor interval=30s
5.1 Configure a Filesystem
 [shell]# pcs resource create WebFS ocf:heartbeat:Filesystem \
     device="/dev/sdb1" directory="/var/www/html" fstype="ext4"
 [shell]# pcs resource create WebFS ocf:heartbeat:Filesystem \
     device="-U 32937d65eb" directory="/var/www/html" fstype="ext4"
5.2 Configure a Filesystem on NFS
 [shell]# pcs resource create WebFS ocf:heartbeat:Filesystem \
     device="192.168.10.18:/MySQLdata" directory="/var/lib/mysql" fstype="nfs" \
     options="-o username=your_name,password=your_password" \
     op start timeout=60s op stop timeout=60s op monitor interval=20s timeout=60s
6. Configure iSCSI
 [shell]# pcs resource create WebData ocf:heartbeat:iscsi \
     portal="192.168.10.18" target="iqn.2008-08.com.starwindsoftware:" \
     op monitor depth="0" timeout="30" interval="120"
 [shell]# pcs resource create WebFS ocf:heartbeat:Filesystem \
     device="-U 32937d65eb" directory="/var/www/html" fstype="ext4" options="_netdev"
7. Configure DRBD
 [shell]# pcs resource create WebData ocf:linbit:drbd \
     drbd_resource=wwwdata op monitor interval=60s
 [shell]# pcs resource master WebDataClone WebData \
     master-max=1 master-node-max=1 clone-max=2 \
     clone-node-max=1 notify=true
 [shell]# pcs resource create WebFS ocf:heartbeat:Filesystem \
     device="/dev/drbd1" directory="/var/www/html" fstype="ext4"
8. Configure MySQL
 [shell]# pcs resource create MySQL ocf:heartbeat:mysql \
     binary="/usr/bin/mysqld_safe" config="/etc/my.cnf" datadir="/var/lib/mysql" \
     pid="/var/run/mysqld/mysql.pid" socket="/tmp/mysql.sock" \
     op start timeout=180s op stop timeout=180s op monitor interval=20s timeout=60s
9. Configure pingd to check connectivity between the nodes and the target hosts
 [shell]# pcs resource create PingCheck ocf:heartbeat:pingd \
     dampen=5s multiplier=100 host_list="192.168.10.1 router" \
     op monitor interval=30s timeout=10s
10. Create resource clones; a cloned resource starts on all nodes
 [shell]# pcs resource clone PingCheck
 [shell]# pcs resource clone ClusterIP clone-max=2 clone-node-max=2 globally-unique=true    ## clone-max=2: packets are split two ways
 [shell]# pcs resource update ClusterIP clusterip_hash=sourceip    ## distribute requests by source IP

III. Adjust cluster resources
1. Configure resource constraints
 [shell]# pcs resource group add WebSrvs ClusterIP                   ## create a resource group; resources in a group run on the same node
 [shell]# pcs resource group remove WebSrvs ClusterIP                ## remove the given resource from the group
 [shell]# pcs resource master WebDataClone WebData                   ## configure a multi-state resource, such as DRBD master/slave
 [shell]# pcs constraint colocation add WebServer ClusterIP INFINITY ## colocate the two resources
 [shell]# pcs constraint colocation remove WebServer                 ## remove a resource from the colocation constraints
 [shell]# pcs constraint order ClusterIP then WebServer              ## set the resource start order
 [shell]# pcs constraint order remove ClusterIP                      ## remove a resource from the ordering constraints
 [shell]# pcs constraint                                             ## show the resource constraints; see also pcs constraint --full
2. Configure resource location
 [shell]# pcs constraint location WebServer prefers node11           ## prefer a node; node=50 sets the score to add
 [shell]# pcs constraint location WebServer avoids node11            ## avoid a node; node=50 sets the score to subtract
 [shell]# pcs constraint location remove location-WebServer          ## remove a location constraint by ID; get the ID from pcs config
 [shell]# pcs constraint location WebServer prefers node11=INFINITY  ## move a resource manually by giving the node a score of INFINITY
 [shell]# crm_simulate -sL                                           ## check the resource scores on each node
3. Modify resource configuration
 [shell]# pcs resource update WebFS    ## update the resource configuration
 [shell]# pcs resource delete WebFS    ## delete the given resource
4. Manage cluster resources
 [shell]# pcs resource disable ClusterIP            ## disable the resource
 [shell]# pcs resource enable ClusterIP             ## enable the resource
 [shell]# pcs resource failcount show ClusterIP     ## show the failure count of the given resource
 [shell]# pcs resource failcount reset ClusterIP    ## reset the failure count of the given resource
 [shell]# pcs resource cleanup ClusterIP            ## clear the state and failure count of the given resource

IV. Configure fencing devices and enable STONITH
1. Query fence device resources
 [shell]# pcs stonith list                   ## list supported fence agents
 [shell]# pcs stonith describe agent_name    ## show a fence agent's parameters, e.g. pcs stonith describe fence_vmware_soap
2. Configure a fence device resource
 [shell]# pcs stonith create ipmi-fencing fence_ipmilan \
     pcmk_host_list="pcmk-1 pcmk-2" ipaddr="10.0.0.1" login=testuser passwd=acd123 \
     op monitor interval=60s
Notes:
If the device does not support the standard port parameter or may provide additional ones, you may also need to set the special pcmk_host_argument parameter. See man stonithd for details.
If the device does not know how to fence nodes based on their uname, you may also need to set the special pcmk_host_map parameter. See man stonithd for details.
If the device does not support the list command, you may also need to set the special pcmk_host_list and/or pcmk_host_check parameters. See man stonithd for details.
If the device does not expect the victim to be specified with the port parameter, you may also need to set the special pcmk_host_argument parameter.
See man stonithd for details.
Example:
 pcmk_host_argument="uuid"
 pcmk_host_map="node11:4;node12:5;node13:6"
 pcmk_host_list="node11,node12"
 pcmk_host_check="static-list"
3. Configure VMware (fence_vmware_soap)
Special note: in this setup, fencing only worked with the third form in 3.2 (pcs stonith create vmware-fencing fence_vmware_soap), the one that sets the pcmk_* parameters.
3.1 Check the state of the VMware virtual machines:
 [shell]# fence_vmware_soap -o list -a vcenter.example.com -l cluster-admin -p <password> -z                ## get the virtual machine UUIDs
 [shell]# fence_vmware_soap -o status -a vcenter.example.com -l cluster-admin -p <password> -z -U <UUID>    ## check the status
 [shell]# fence_vmware_soap -o status -a vcenter.example.com -l cluster-admin -p <password> -z -n <vm name>
3.2 Configure fence_vmware_soap
 [shell]# pcs stonith create vmware-fencing-node11 fence_vmware_soap \
     action="reboot" ipaddr="192.168.10.10" login="vmuser" passwd="vmuserpd" ssl="1" \
     port="node11" shell_timeout=60s login_timeout=60s op monitor interval=90s
 [shell]# pcs stonith create vmware-fencing-node11 fence_vmware_soap \
     action="reboot" ipaddr="192.168.10.10" login="vmuser" passwd="vmuserpd" ssl="1" \
     uuid="421dec5f-c484-3d69-ddfb-65af46530581" shell_timeout=60s login_timeout=60s op monitor interval=90s
 [shell]# pcs stonith create vmware-fencing fence_vmware_soap \
     action="reboot" ipaddr="192.168.10.10" login="vmuser" passwd="vmuserpd" ssl="1" \
     pcmk_host_argument="uuid" pcmk_host_check="static-list" pcmk_host_list="node11,node12" \
     pcmk_host_map="node11:421dec5f-c484-3d69-ddfb-65af46530581;node12:421dec5f-c484-3d69-ddfb-65af46530582" \
     shell_timeout=60s login_timeout=60s op monitor interval=90s
Note: if the fence_vmware_soap device does not recognize port=<vm name> during testing, use uuid=<vm uuid> instead. It is recommended to state the node-to-device-port mapping explicitly with the pcmk_host_argument, pcmk_host_map, pcmk_host_check and pcmk_host_list parameters, in this format:
 pcmk_host_argument="uuid" pcmk_host_map="node11:uuid4;node12:uuid5;node13:uuid6"
 pcmk_host_list="node11,node12,node13" pcmk_host_check="static-list"
4. Configure SCSI
 [shell]# ls /dev/disk/by-id/wwn-*    ## get the ID of the fencing disk; the disk must be unformatted
 [shell]# pcs stonith create iscsi-fencing fence_scsi \
     action="reboot" \
     devices="/dev/disk/by-id/wwn-0x600e002" meta provides=unfencing
5. Configure Dell DRAC
 [shell]# pcs stonith create dell-fencing-node11 fence_drac.....
6. Manage STONITH
 [shell]# pcs resource clone vmware-fencing       ## clone the stonith resource so it can start on multiple nodes
 [shell]# pcs property set stonith-enabled=true   ## enable the STONITH component
 [shell]# pcs stonith cleanup vmware-fencing      ## clear the state and failure count of the fence resource
 [shell]# pcs stonith fence node11                ## fence the given node

V. Cluster operation commands
1. Verify the cluster installation
 [shell]# pacemakerd -F                      ## show the pacemaker components; see also ps axf | grep pacemaker
 [shell]# corosync-cfgtool -s                ## show the corosync ring status
 [shell]# corosync-cmapctl | grep members    ## corosync 2.3.x
 [shell]# corosync-objctl | grep members     ## corosync 1.4.x
2. View cluster resources
 [shell]# pcs resource standards     ## list supported resource standards
 [shell]# pcs resource providers     ## list resource providers
 [shell]# pcs resource agents        ## list all resource agents
 [shell]# pcs resource list          ## list supported resources
 [shell]# pcs stonith list           ## list supported fence agents
 [shell]# pcs property list --all    ## show the cluster properties and their defaults
 [shell]# crm_simulate -sL           ## check the resource scores
3. Use cluster scripts
 [shell]# pcs cluster cib ra_cfg            ## save the cluster resource configuration to the given file
 [shell]# pcs -f ra_cfg resource create     ## create a resource in that file (not in the running configuration)
 [shell]# pcs -f ra_cfg resource show       ## show the configuration in the file; once it checks out,
 [shell]# pcs cluster cib-push ra_cfg       ## load the file into the running configuration
4. STONITH device operations
 [shell]# stonith_admin -I                  ## list fence devices
 [shell]# stonith_admin -M -a agent_name    ## show a fence agent's metadata, e.g. stonith_admin -M -a fence_vmware_soap
 [shell]# stonith_admin --reboot nodename   ## test the STONITH device
5. View the cluster configuration
 [shell]# crm_verify -L -V    ## check the configuration for errors
 [shell]# pcs property        ## show the cluster properties
 [shell]# pcs stonith         ## show the stonith resources
 [shell]# pcs constraint      ## show the resource constraints
 [shell]# pcs config          ## show the cluster resource configuration
 [shell]# pcs cluster cib     ## show the cluster configuration as XML
6. Manage the cluster
 [shell]# pcs status                         ## show the cluster status
 [shell]# pcs status cluster
 [shell]# pcs status corosync
 [shell]# pcs cluster stop [node11]          ## stop the cluster
 [shell]# pcs cluster start --all            ## start the cluster
 [shell]# pcs cluster standby node11         ## put the node into standby; undo with pcs cluster unstandby node11
 [shell]# pcs cluster destroy [--all]        ## destroy the cluster; [--all] also restores the corosync.conf file
 [shell]# pcs resource cleanup ClusterIP     ## clear the state and failure count of the given resource
 [shell]# pcs stonith cleanup vmware-fencing ## clear the state and failure count of the fence resource
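
For the fence_vmware_soap setup in the STONITH section, pcmk_host_map and pcmk_host_list have to name the same nodes, and typing both by hand makes it easy for them to drift apart. The following is only a sketch of one way to derive both values from a single list of node=uuid pairs. The helper names build_host_map and build_host_list are hypothetical (they are not part of pcs), the node names and UUIDs are the sample values from this article, and the script only prints the pcs command for review instead of running it.

```shell
#!/bin/sh
# Hypothetical helpers (not part of pcs): derive pcmk_host_map and
# pcmk_host_list from the same "node=uuid" pairs so they stay consistent.

# Print "node:uuid;node:uuid", the format used for pcmk_host_map.
build_host_map() {
    map=""
    for pair in "$@"; do
        node=${pair%%=*}    # text before the first "="
        uuid=${pair#*=}     # text after the first "="
        map="${map:+$map;}${node}:${uuid}"
    done
    printf '%s\n' "$map"
}

# Print "node,node", the format used for pcmk_host_list.
build_host_list() {
    list=""
    for pair in "$@"; do
        list="${list:+$list,}${pair%%=*}"
    done
    printf '%s\n' "$list"
}

# Sample node names and UUIDs taken from this article.
set -- \
    "node11=421dec5f-c484-3d69-ddfb-65af46530581" \
    "node12=421dec5f-c484-3d69-ddfb-65af46530582"

# Dry run: print the command for review instead of executing it.
cat <<EOF
pcs stonith create vmware-fencing fence_vmware_soap \\
    action="reboot" ipaddr="192.168.10.10" login="vmuser" passwd="vmuserpd" ssl="1" \\
    pcmk_host_argument="uuid" pcmk_host_check="static-list" \\
    pcmk_host_list="$(build_host_list "$@")" \\
    pcmk_host_map="$(build_host_map "$@")" \\
    shell_timeout=60s login_timeout=60s op monitor interval=90s
EOF
```

Adding a third node then only means appending one more node=uuid pair to the `set --` list; the printed command can be checked by eye and pasted on one cluster node.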