
RHCS + Oracle Dual-Node Hot-Standby Installation and Configuration

24 Mar
By: admin | Category: DBA Operations


The environment: two IBM X3850 X5 servers and one HP EVA4400 storage array, running RedHat AS 5.4 64-bit with Oracle 10.2.0.4. Node 1 is kms1, node 2 is kms2.
kms1: 133.0.104.45
133.0.104.49 (IBM BMC address)
kms2: 133.0.104.46
133.0.104.48 (IBM BMC address)

Installation steps:
1. Install the packages
yum install cluster*
yum install rgmanage*
yum install cman*
yum install *ipmi*
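After installation, it helps to confirm the packages are present and to enable the cluster daemons at boot. A minimal sketch (exact package names can vary across RHEL 5 minor releases):

```shell
# Confirm the core cluster packages installed
rpm -q cman rgmanager OpenIPMI OpenIPMI-tools

# Enable the cluster daemons at boot (SysV init on RHEL 5)
chkconfig cman on
chkconfig rgmanager on
```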
2. NIC bonding
Bond eth0 and eth1 into bond0.

vi /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
TYPE=Ethernet
USERCTL=no
IPADDR=133.0.104.45
NETMASK=255.255.255.0
NETWORK=133.0.104.0
BROADCAST=133.0.104.255
GATEWAY=133.0.104.62

vi /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
MASTER=bond0
SLAVE=yes
Note: do not include the physical NIC's MAC address (HWADDR) in these files.
vi /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
USERCTL=no
MASTER=bond0
SLAVE=yes
Do the same on the other host, using IP 133.0.104.46.
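On RHEL 5 the ifcfg files alone are not enough; the bonding driver also has to be mapped to bond0 in /etc/modprobe.conf on both nodes. A sketch (mode=1 is active-backup; the mode and miimon values are assumptions to adjust for your switch setup):

```shell
# /etc/modprobe.conf (append on both nodes)
alias bond0 bonding
options bonding mode=1 miimon=100
```

Afterwards run `service network restart` to bring up the bond.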
3. Edit /etc/hosts
Add the IP-to-hostname mappings for both hosts. Append the following to /etc/hosts on both nodes:
133.0.104.45   kms1
133.0.104.46   kms2
4. Configure the cluster
(1) Run system-config-cluster to open the graphical cluster configuration tool, then click "Create New Configuration" to start a new cluster configuration.
(2) Name the cluster. Two clusters on the same LAN must not share a name. RHCS 5 uses a multicast heartbeat by default; checking "Custom Configure Multicast" lets you set a custom multicast address.
(3) Select Cluster Nodes on the left and click "Add a Cluster Node" to add each node. The node names entered here must resolve correctly, e.g. via the /etc/hosts entries above, and must resolve to the heartbeat IPs.
(4) Add the fence devices (power controllers). Enter each IBM server's BMC IP address and its management user and password.
(To configure the IBM BMC: press F1 during boot to enter the BIOS, choose Advanced Setup -> BMC Setup -> BMC Network, and set the management IP.
Under BMC Account Settings you can also define the management user and password; by default the account is USERID with password PASSW0RD, where the "0" is the digit zero, not the letter O.)
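Before trusting fencing, the BMC credentials can be sanity-checked from the peer node, either with the fence agent itself or with ipmitool (IPs and credentials as above):

```shell
# Query power status through the RHCS fence agent
fence_ipmilan -a 133.0.104.49 -l USERID -p PASSW0RD -o status

# Or directly over IPMI
ipmitool -I lan -H 133.0.104.49 -U USERID -P PASSW0RD chassis power status
```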
(5) Assign each configured fence device to the corresponding cluster node; note that each node gets its own IPMI fence. Confirm.
(6) Create a failover domain and add the nodes created earlier to it.
(7) Create the resources the service needs, filling in the fields as required, e.g. the virtual IP resource.
(8) Create the service, adding the resources built above (IP, file system, scripts) in their actual startup order.
(9) When finished, choose File -> Save to write the settings to /etc/cluster/cluster.conf, then copy cluster.conf to every node.
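For step (9), the initial copy can be done over ssh; once the cluster is running, later changes are propagated by bumping config_version in the file and running ccs_tool. A sketch:

```shell
# Initial copy, from kms1
scp /etc/cluster/cluster.conf kms2:/etc/cluster/cluster.conf

# Later updates, after incrementing config_version in the file
ccs_tool update /etc/cluster/cluster.conf
```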
5. Start the cluster
kms1: service cman start
kms2: service cman start
Note: after running the command on kms1, run it on kms2 promptly; otherwise kms1 cannot find kms2 and will fence it, rebooting the node. A reboot does no harm: once the node is back up, rerun the command and it rejoins the cluster.
kms1: service rgmanager start
kms2: service rgmanager start
Start the resources.
`ip addr list` should now show the VIP.
The `clustat` command shows the cluster status.
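Membership and quorum can be double-checked with cman_tool once both nodes are up (a sketch of the usual checks):

```shell
cman_tool nodes     # both kms1 and kms2 should show status "M" (member)
cman_tool status    # quorum and vote summary
clustat             # service state and owner node
ip addr show bond0  # the VIP 133.0.104.47 appears on the active node
```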
6. Summary
Under normal conditions the configuration above is enough. In this deployment, however, the carrier had blocked multicast, so when node 2 was started after node 1 it could not discover node 1 and fenced it outright; as soon as both nodes ran the service, each kept power-cycling the other. The fix is to switch the heartbeat to broadcast, or to use a quorum disk. Broadcast was used here: adding broadcast="yes" to the <cman> element in cluster.conf solved it. Below is the log from when the heartbeat partner could not be found:
tail -f /var/log/messages

Dec  6 13:16:23 kms1 openais[8161]: [CMAN ] CMAN 2.0.115 (built Aug  5 2009 08:24:57) started
Dec  6 13:16:23 kms1 openais[8161]: [MAIN ] Service initialized 'openais CMAN membership service 2.01'
Dec  6 13:16:23 kms1 openais[8161]: [SERV ] Service initialized 'openais extended virtual synchrony service'
Dec  6 13:16:23 kms1 openais[8161]: [SERV ] Service initialized 'openais cluster membership service B.01.01'
Dec  6 13:16:23 kms1 openais[8161]: [SERV ] Service initialized 'openais availability management framework B.01.01'
Dec  6 13:16:23 kms1 openais[8161]: [SERV ] Service initialized 'openais checkpoint service B.01.01'
Dec  6 13:16:23 kms1 openais[8161]: [SERV ] Service initialized 'openais event service B.01.01'
Dec  6 13:16:23 kms1 openais[8161]: [SERV ] Service initialized 'openais distributed locking service B.01.01'
Dec  6 13:16:23 kms1 openais[8161]: [SERV ] Service initialized 'openais message service B.01.01'
Dec  6 13:16:23 kms1 openais[8161]: [SERV ] Service initialized 'openais configuration service'
Dec  6 13:16:23 kms1 openais[8161]: [SERV ] Service initialized 'openais cluster closed process group service v1.01'
Dec  6 13:16:23 kms1 openais[8161]: [SERV ] Service initialized 'openais cluster config database access v1.01'
Dec  6 13:16:23 kms1 ccsd[8152]: Initial status:: Quorate
Dec  6 13:16:23 kms1 openais[8161]: [SYNC ] Not using a virtual synchrony filter.
Dec  6 13:16:23 kms1 openais[8161]: [TOTEM] Creating commit token because I am the rep.
Dec  6 13:16:23 kms1 openais[8161]: [TOTEM] Saving state aru 0 high seq received 0
Dec  6 13:16:23 kms1 openais[8161]: [TOTEM] Storing new sequence id for ring 10
Dec  6 13:16:23 kms1 openais[8161]: [TOTEM] entering COMMIT state.
Dec  6 13:16:23 kms1 openais[8161]: [TOTEM] entering RECOVERY state.
Dec  6 13:16:23 kms1 openais[8161]: [TOTEM] position [0] member 133.0.104.45
Dec  6 13:16:23 kms1 openais[8161]: [TOTEM] previous ring seq 12 rep 133.0.104.45
Dec  6 13:16:23 kms1 openais[8161]: [TOTEM] aru 0 high delivered 0 received flag 1
Dec  6 13:16:23 kms1 openais[8161]: [TOTEM] Did not need to originate any messages in recovery.
Dec  6 13:16:23 kms1 openais[8161]: [TOTEM] Sending initial ORF token
Dec  6 13:16:23 kms1 openais[8161]: [CLM  ] CLM CONFIGURATION CHANGE
Dec  6 13:16:23 kms1 openais[8161]: [CLM  ] New Configuration:
Dec  6 13:16:23 kms1 openais[8161]: [CLM  ] Members Left:
Dec  6 13:16:23 kms1 openais[8161]: [CLM  ] Members Joined:
Dec  6 13:16:23 kms1 openais[8161]: [CLM  ] CLM CONFIGURATION CHANGE
Dec  6 13:16:23 kms1 openais[8161]: [CLM  ] New Configuration:
Dec  6 13:16:23 kms1 openais[8161]: [CLM  ]     r(0) ip(133.0.104.45)
Dec  6 13:16:23 kms1 openais[8161]: [CLM  ] Members Left:
Dec  6 13:16:23 kms1 openais[8161]: [CLM  ] Members Joined:
Dec  6 13:16:23 kms1 openais[8161]: [CLM  ]     r(0) ip(133.0.104.45)
Dec  6 13:16:23 kms1 openais[8161]: [SYNC ] This node is within the primary component and will provide service.
Dec  6 13:16:24 kms1 openais[8161]: [TOTEM] entering OPERATIONAL state.
Dec  6 13:16:24 kms1 openais[8161]: [CMAN ] quorum regained, resuming activity
Dec  6 13:16:24 kms1 openais[8161]: [CLM  ] got nodejoin message 133.0.104.45
Dec  6 13:18:10 kms1 fenced[8180]: kms2 not a cluster member after 60 sec post_join_delay
Dec  6 13:18:10 kms1 fenced[8180]: fencing node "kms2"
Dec  6 13:18:24 kms1 fenced[8180]: fence "kms2" success

Below is the finished cluster.conf:
[root@kms1 ~]# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster alias="kms_rhcs" config_version="8" name="kms_rhcs">
<fence_daemon post_fail_delay="0" post_join_delay="60"/>
<clusternodes>
<clusternode name="kms1" nodeid="1" votes="1">
<fence>
<method name="1">
<device name="kms1_fence"/>
</method>
</fence>
</clusternode>
<clusternode name="kms2" nodeid="2" votes="1">
<fence>
<method name="1">
<device name="kms2_fence"/>
</method>
</fence>
</clusternode>
</clusternodes>
<cman expected_votes="1" two_node="1" broadcast="yes"/>
<fencedevices>
<fencedevice agent="fence_ipmilan" auth="" ipaddr="133.0.104.49" login="USERID" name="kms1_fence" passwd="PASSW0RD"/>
<fencedevice agent="fence_ipmilan" auth="" ipaddr="133.0.104.48" login="USERID" name="kms2_fence" passwd="PASSW0RD"/>
</fencedevices>
<rm>
<failoverdomains>
<failoverdomain name="kms_domain" ordered="0" restricted="1">
<failoverdomainnode name="kms1" priority="1"/>
<failoverdomainnode name="kms2" priority="1"/>
</failoverdomain>
</failoverdomains>
<resources>
<ip address="133.0.104.47" monitor_link="1"/>
</resources>
<service autostart="1" domain="kms_domain" name="kms_serv">
<ip ref="133.0.104.47"/>
</service>
</rm>
</cluster>
Only the single IP resource is shown above; the remaining resources were configured later, but that version of the file was not kept.
Oracle start/stop script:
#!/bin/bash
export ORACLE_HOME=/oracle/product/10.2.0/db_1
export ORACLE_SID=kmsdb
start() {
su - oracle<<EOF
echo "Starting Listener..."
$ORACLE_HOME/bin/lsnrctl start
echo "Starting Oracle10g Server..."
sqlplus / as sysdba
startup
exit;
EOF
}
stop() {
su - oracle<<EOF
echo "Shutting down Listener..."
$ORACLE_HOME/bin/lsnrctl stop
echo "Shutting down Oracle10g Server..."
sqlplus / as sysdba
shutdown immediate;
exit
EOF
}
case "$1" in
start)
start
;;
stop)
stop
;;
*)
echo "Usage: $0 {start|stop}"
;;
esac
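Note that rgmanager also invokes a script resource with a status argument, which the script above does not handle, so status checks would fail. A minimal sketch of a status action, assuming the standard ora_pmon_<SID> background process name:

```shell
#!/bin/bash
export ORACLE_SID=kmsdb

# Return 0 if the instance's PMON background process is alive, 1 otherwise.
# The [o] bracket keeps grep from matching its own command line in ps output.
status() {
    if ps -ef | grep "[o]ra_pmon_${ORACLE_SID}" > /dev/null; then
        echo "running"
        return 0
    else
        echo "stopped"
        return 1
    fi
}

status
echo "status returned $?"
```

A real resource script would wire this in as a third branch of the case statement above.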
