1. Current environment
(1) BWP and BWD run on the same database host. BWD uses instance number 03 and BWP uses instance number 02.
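2. Problem description
(1) BWP and BWD cannot be started at the same time.
(2) When BWD is running, starting BWP fails with: Exiting with Return-Code 3. (No more child processes)
(3) When BWP is running, BWD never starts successfully and eventually times out.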
3. Checking the logs
(1) The relevant trace is shown below:
HANABW:/usr/sap/BWP/HDB02/hanabw/trace> vi nameserver_alert_hanabw.trc
The trace indicates that system replication is enabled and that a port it needs is already occupied:
[55589]{-1}[-1/-1] 2021-07-31 14:27:07.072139 e tns_ddl TNSClient.cpp(00917) : setServiceType no topology databaseId use service default: 2
[55602]{-1}[-1/-1] 2021-07-31 14:27:07.158369 e TrexNet Responder.cpp(00568) : can't listen on port 127.0.0.1:30302: host unknown
[55602]{-1}[-1/-1] 2021-07-31 14:27:07.158621 e TrexNet Responder.cpp(00568) : can't listen on port 127.0.0.2:30302: host unknown
[55602]{-1}[-1/-1] 2021-07-31 14:27:07.945683 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(00637) : Listener cannot be started, because port 30301 is already in use!
[55602]{-1}[-1/-1] 2021-07-31 14:27:07.945695 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(00638) : A system replication primary uses replication ports in the range of the next instancenr (own=02, next=03)
[55602]{-1}[-1/-1] 2021-07-31 14:27:07.945700 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(00639) : Please check, that there is no other system on this machine using instancenr 03! This is just a hint and possibly not the root cause ..
[55602]{-1}[-1/-1] 2021-07-31 14:27:07.945701 e sr_dataaccess DisasterRecoveryPrimaryImpl.cpp(00640) : In general the port range 30300-30399 must not be used by any other process when system replication is turned on!
[55602]{-1}[-1/-1] 2021-07-31 14:27:07.945772 f PersistenceLayer PersistenceController.cpp(00604) : startup failed exception 1: no.2110008 (Basis/IO/Stream/impl/NetworkChannel.cpp:736)
Error address in use: $msg$, rc=98: Address already in use
$NetworkChannel$=
ListenerChannel FD 32 [0x00002ab1cf59a5c0] {refCnt=1, idx=18446744073709551615} 0.0.0.0/30301_tcp->(invalid) New,[----]
exception throw location:
1: 0x00002aaff0e4c0d0 in Stream::NetworkChannelBase::bindLocal()+0xd0 at NetworkChannel.cpp:736 (libhdbbasis.so)
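
To pick the conflicting port out of a long trace file, the lines can be filtered with a small script. The following is a minimal Python sketch, assuming the message wording matches the sr_dataaccess line above. The function name ports_in_use is hypothetical and is not part of any SAP tool.

```python
import re

# Matches the sr_dataaccess message quoted in the trace above.
PORT_IN_USE = re.compile(r"port (\d+) is already in use")


def ports_in_use(trace_lines):
    """Return the ports that the trace reports as already in use, in order."""
    ports = []
    for line in trace_lines:
        match = PORT_IN_USE.search(line)
        if match:
            ports.append(int(match.group(1)))
    return ports


# Two lines copied from the trace above; only the second one matches.
sample = [
    "[55602]{-1}[-1/-1] 2021-07-31 14:27:07.158369 e TrexNet Responder.cpp(00568) : "
    "can't listen on port 127.0.0.1:30302: host unknown",
    "[55602]{-1}[-1/-1] 2021-07-31 14:27:07.945683 e sr_dataaccess "
    "DisasterRecoveryPrimaryImpl.cpp(00637) : Listener cannot be started, "
    "because port 30301 is already in use!",
]
print(ports_in_use(sample))
```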
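4. Problem analysis
(1) I had previously set up synchronous system replication for BWP to a standby machine as a disaster-recovery test, and I did not turn it off afterwards. While replication is enabled, BWP uses ports in the range 30300-30399.
(2) BWD uses instance number 03, so its own services listen on ports in the 303xx range, for example 30303.
(3) BWP uses instance number 02, so its own services use the 302xx range, for example 30302. With replication enabled, however, BWP acts as a replication primary and also listens on ports in the range of the next instance number (03). The trace shows it failing on port 30301. That range belongs to BWD, so whichever system starts second finds the ports already in use.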
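
The overlap described above can be worked out with a few lines of arithmetic. The following is a minimal Python sketch, assuming the 3<instance number>00-3<instance number>99 port scheme and the "next instance number" rule quoted in the trace. All function names are hypothetical, and instance number 99 (which has no next number) is not handled.

```python
def instance_port_range(instance_nr):
    """Ports 3<nr>00..3<nr>99 that belong to the given instance number."""
    base = 30000 + instance_nr * 100
    return range(base, base + 100)


def replication_port_range(instance_nr):
    """Extra range a replication primary listens on, per the trace:
    the range of the next instance number."""
    return instance_port_range(instance_nr + 1)


def replication_conflicts(primary_nr, other_nr):
    """True if the primary's replication range overlaps the other instance's range."""
    a = replication_port_range(primary_nr)
    b = instance_port_range(other_nr)
    return a.start < b.stop and b.start < a.stop


# BWP is instance 02 with replication enabled, BWD is instance 03.
print(replication_conflicts(2, 3))
```

With these assumptions, moving BWD to an instance number other than 03 would also avoid the overlap, although the note itself resolves the problem by turning replication off.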
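5. Solution
I turned off system replication (the disaster-recovery feature) on BWP. BWP then no longer uses ports in the 30300-30399 range at startup, so it no longer conflicts with BWD.
(1) After the BWP HANA database was shut down, it could not be started again and reported: HDB Daemon not running
(2) Because of memory usage, I shut down the BWD HANA database.
(3) I restarted BWP HANA, and it started normally.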
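
Before starting both systems again, it can help to confirm that the port from the trace is actually free. The following is a minimal Python sketch, assuming a plain TCP bind attempt is an adequate check. It fails with the same "Address already in use" error (rc=98 on Linux) that the trace shows. The function name port_is_free is hypothetical.

```python
import errno
import socket


def port_is_free(port, host="0.0.0.0"):
    """Try to bind a TCP socket to host:port; False means the address is in use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return False
            raise  # any other error (e.g. permissions) is reported as-is
        return True


# 30301 is the port the trace reported as already in use.
print("port 30301 free:", port_is_free(30301))
```

The check only reflects the moment it runs, so it should be run on the HANA host shortly before the start attempt.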