Patch Name: PHSS_31015

Patch Description: s700_800 11.X MC/ServiceGuard and SG-OPS Edition A.11.14

Creation Date: 04/07/08

Post Date: 04/07/16

Hardware Platforms - OS Releases:
    s700: 11.00 11.11
    s800: 11.00 11.11

Products:
    MC/ServiceGuard A.11.14
    ServiceGuard OPS Edition A.11.14

Filesets:
    DLM-Pkg-Mgr.CM-PKG,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP
    Package-Manager.CM-PKG,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP
    DLM-Pkg-Mgr.CM-PKG-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP
    Package-Manager.CM-PKG-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP
    DLM-ATS-Core.ATS-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP
    ATS-CORE.ATS-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP
    DLM-ATS-Core.ATS-RUN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP
    ATS-CORE.ATS-RUN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP
    DLM-NMAPI.CM-NMAPI,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP
    DLM-Clust-Mon.CM-CORE,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP
    Cluster-Monitor.CM-CORE,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP
    DLM-Clust-Mon.CM-CORE-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP
    Cluster-Monitor.CM-CORE-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP

Automatic Reboot?: No

Status: General Release

Critical: Yes

PHSS_31015: MEMORY_LEAK ABORT

When vgdisplay is issued on a volume group activated in shared mode, there is a small memory leak in cmlvmd.

When cmquerycl, cmcheckconf, or cmapplyconf is run, and a node has no LVM configured, cmclconfd may abort.

PHSS_30769: ABORT

ServiceGuard daemon cmcld might abort if cluster is configured with multiple heartbeats and one of the networks is highly loaded or a local switch is happening on it.

PHSS_30448: ABORT PANIC MEMORY_LEAK

In 2 node ServiceGuard cluster using serial heartbeat link, cmcld can abort with segmentation violation or bus error resulting in a node TOC.
One observed stack trace is:

    (gdb) bt
    #0 0xa2f58 in cl_comm_reply+0x9f0 ()
    #1 0xbc81c in rcomm_health_event_handler+0x288 ()
    #2 0x146784 in cl_event_loop+0x458 ()
    #3 0x1f146c in cma__thread_base+0x204 ()
    #4 0x1f3c40 in cma__thread_start1+0x38 ()
    #5 0x1f36d8 in cma__thread_start0_PA20+0xc ()

There is a 16k memory leak exposed in the subagent, /usr/lbin/cmsnmpd, when retrieving new ServiceGuard cluster configuration information when there are CVM or System Multi Node packages configured in the cluster.

PHSS_30028: OTHER

ServiceGuard can reduce the number of file descriptors available for package applications, which may cause applications requiring 1024 or more file descriptors to fail.

If ServiceGuard cluster reformation happens when a cmhaltpkg command is in progress, the cmhaltpkg command may return an error even when package has halted successfully. The error would look like:

    cmhaltpkg : Package orafintest is not currently running.
    Check the syslog and pkg log files for more detailed information.

This can also happen if cmhaltpkg is called from the control script of another package at package start time.

If a user attempts to add a quorum server while the cluster is running, the cmapplyconf does not add the cluster lock (correct behavior), but it succeeds without logging an error message. For example, a user can create a one node cluster with no cluster lock, start the cluster, and then use cmapplyconf to add a node and a quorum server. The result is a two node cluster with no cluster lock.

If the root filesystem is full, a package may fail to halt successfully. In the package log file, the following error messages can be seen:

    umount_pidsxxxx: Cannot find or open the file.
    vgchange_pidsxxxx: Cannot create the specified file.
The package log file will be:

    /etc/cmcluster/(package name)/(package name).control.log

PHSS_29915: ABORT HANG

In a cluster configured with 3 or more heartbeat LANs, the ServiceGuard daemon cmcld may wait indefinitely for replies to a message it has sent out and not complete a crucial step in a cluster reformation. The node will TOC.

The ServiceGuard daemon cmcld may be triggered by unreliable network traffic to begin consuming 100% of the CPU. On a single CPU system this could result in a system hang.

PHSS_29561: HANG ABORT PANIC

The cmcld daemon may log the message "timers delayed x.x seconds" due to kernel latency issues, or a network partition may separate nodes in the cluster. A ServiceGuard cluster of more than 2 nodes with a cluster lock, after experiencing such a hang or partition, may result in the formation of 2 clusters. This is a corner case where the hang or partition happens while a node is joining a previously formed 2-node cluster. The joining node forms a cluster with the original coordinator node, while the non-coordinator node forms a cluster by itself.

cmcld may hang in an accept() call on the local communications socket if the socket pops but there is no connection to accept. This causes various threads to hang and frequent cluster reformations. Eventually when a connection comes along and the accept() call proceeds, all the threads resume execution but the processing of all the backed up activity results in a deadlock. The node is unable to respond to a sync request and aborts.

cmcld may hang in an accept() call on a remote communications socket if the socket pops but there is no connection to accept. If the cluster contains more than one node at the time, the problem node may TOC. If the cluster contains only one node at the time, cmcld may hang, commands may hang and other nodes may not join the cluster.
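The two accept() hangs described for PHSS_29561 follow a familiar pattern: a listening socket is reported ready, the pending connection is no longer there, and a blocking accept() then waits indefinitely. This patch text does not say how cmcld was changed, so the Python sketch below only illustrates one common defensive technique, a non-blocking listening socket. The helper name accept_if_pending and its timeout parameter are hypothetical and are not part of ServiceGuard.

```python
import select
import socket


def accept_if_pending(listener, timeout=0.0):
    """Accept one queued connection, or return None instead of blocking.

    Hypothetical helper for illustration only; not ServiceGuard code.
    """
    # With a non-blocking listener, accept() fails fast if readiness was
    # reported but the connection has gone away in the meantime.
    listener.setblocking(False)

    # Wait up to `timeout` seconds for the listener to become readable.
    readable, _, _ = select.select([listener], [], [], timeout)
    if not readable:
        return None

    try:
        conn, addr = listener.accept()
    except (BlockingIOError, ConnectionAbortedError, InterruptedError):
        # Nothing left to accept: report "no connection" rather than hang.
        return None
    return conn, addr
```

A caller would invoke the helper from its event loop and treat a None result as "nothing to do", so other threads are never held up behind a stalled accept().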
PHSS_29122: ABORT CORRUPTION HANG PANIC

If an EMS resource is configured with no RESOURCE_UP_VALUE criteria, a later online change of the resource may result in cmcld abort.

If an online delete for a package is in progress and at the same time another resource becomes available on a node that satisfies the requirements to run that package, then cmcld may core dump after the online delete.

During a cmhaltnode, the VxVM-CVM-pkg will be halted and package switching is disabled. If the shutdown of the node does not complete successfully, the VxVM-CVM-pkg will still be disabled and cannot be restarted. Since the VxVM-CVM-pkg is up on other nodes in the cluster, it will get stuck in the STARTING state. A cmhaltnode or cmhaltcl and cmhaltpkg will hang after that point.

If, during the cmhaltcl or cmhaltnode, CVM disk groups are still active at VxVM-CVM-pkg halt time, the VxVM-CVM-pkg halt will time out and fail to halt. vxclustd will hang waiting for the disk groups to be deactivated. Because FAILFAST is set for this system package, the timeout of the VxVM-CVM-pkg will cause the node to TOC.

In rare circumstances cmhaltnode will cause a node to TOC or panic. The final entries in syslog will look something like the following:

    Jul 9 17:20:53 lead cmcld: Timed out node zinc. It may have failed.
    Jul 9 17:20:53 lead cmcld: Attempting to form a new cluster
    . . . . . .
    Jul 9 17:20:54 lead cmclconfd[4166]: Data corruption detected during message read.

The final message in the log cache of the cmcld core file will look like the following:

    40104010: Aborting: cmdsrv/cmdsrv_rops.c 547 (Shutdown failure - sleep timed out)

When a package is in the starting state and ServiceGuard enters the final part of a node shutdown, the package is ignored and will stay up, leading to possible data corruption. Packages can be in this state when a cmmodpkg or cmrunpkg is issued after a cmhaltnode has begun.
PHSS_28851: HANG

After an upgrade of a ServiceGuard cluster from version A.10.06 or earlier to A.11.14, the cmrunnode command may fail.

If a port-scanning utility such as the Linux application "nmap" is executed against a node running ServiceGuard, cmcld on the node may hang and unexpectedly fail.

PHSS_27725: ABORT

cmcld could abort when a package with a name longer than 36 characters and an IP address is halted.

cmcld could abort (with assertion failure) if it goes for the quorum server and the qs lock is granted, but there is a reconfiguration before the lock granted message arrives.

Under rare circumstances, if a node cannot update its system clock for an extended period of time, the node and one more node in the cluster will fail. If the cluster has 3 or fewer nodes, the whole cluster will fail. If the cluster has 4 nodes with a cluster lock, or more than 4 nodes, the remaining nodes should reform a cluster. The node which experienced the system clock problem does a TOC first, while the other node does a TOC shortly after that. The syslog on the node with the system clock problem may not log any information. One of the other nodes in the cluster will log the message below in the syslog at an interval equal to the node timeout. For example, if the node timeout is 3 seconds, then every 3 seconds the following message will be seen in syslog (in addition to other messages):

    10:10:03 Timed out node NODEA. It may have failed.
    10:10:03 Attempting to adjust cluster membership
    ......
    ......
    10:10:06 Timed out node NODEA.
    10:10:06 Attempting to adjust cluster membership
    ......

PHSS_27246: ABORT

When using a contributed support tool to disable and then re-enable safety time, the tool fails to restore safety time protection properly, resulting in node TOC.

If the cmviewcl command is issued while a package-configured resource, subnet, or service is being deleted from the configuration, cmviewcl may fail with a SIGSEGV and create a core file.
A series of single point network card or hub failures may cause a cl_sync timeout resulting in the entire cluster going down.

At cmcld start up, i.e. cmrunnode or cmruncl, syslog shows this message:

    cmcld: Assertion failed: pnet != NULL, file: comm_link.c, line: 140.

cmcld immediately aborts and dumps core.

Service Assistant Daemon (cmsrvassistd) can dump core in /var/adm/cmcluster if SIGCHLD is delivered while in the middle of a syslog call.

PHSS_26056: ABORT CORRUPTION OTHER

Service Assistant Daemon (cmsrvassistd) may abort with a core dump if SIGCHLD is delivered while in the middle of a syslog call.

ServiceGuard daemon (cmcld) may abort with a core dump when a package is started or halted.

ServiceGuard daemon (cmcld) may abort with a core dump when multiple cmapplyconf or cmrunnode commands are issued and any one of them is aborted.

A ServiceGuard OPS node may TOC after a false unclean shutdown.

ServiceGuard daemon (cmcld) may abort with core dump due to DLPI errors.

ServiceGuard daemon (cmcld) may abort with core dump during cluster formation.

A package may be started on two nodes, causing data corruption.

Category Tags: defect_repair enhancement general_release critical panic halts_system corruption memory_leak

Path Name: /hp-ux_patches/s700_800/11.X/PHSS_31015

Symptoms:

PHSS_31015:

1. When vgdisplay is issued on a volume group activated in shared mode, there is a small memory leak in cmlvmd.

2. When cmquerycl, cmcheckconf, or cmapplyconf is run, and a node has no LVM configured, cmclconfd may abort.

3. If the OpenView Operations (OVO) library exists at /opt/OV/lib/libopccv.sl then every time cmsnmpd attempts to send an OPC message an error is written to the cmsnmpd log at /var/adm/SGsnmpsuba.log and the call fails so OVO does not get the message.
If the OVO library does not exist at this location, then no errors are logged, since cmsnmpd will not attempt to send OPC messages. The error logged is:

    ***Could not load the shared library: '/opt/OV/lib/libopccv.sl', Exec format error

This error is repeated many times, once for every attempt to send a message. The full error reported by dld when it tries to do the load is:

    ***/usr/lib/dld.sl: Can't shl_load() a library containing Thread Local Storage: /usr/lib/libpthread.1
    /usr/lib/dld.sl: Exec format error

4. cmquerycl issued with the -c option and with -n on a cluster with dual cluster lock will fail with the error message:

    cmquerycl : Node does not have access to the cluster lock physical volume cluster lock

PHSS_30769:

1. ServiceGuard daemon cmcld might abort if the cluster is configured with multiple heartbeats and one of the networks is highly loaded or a local switch is happening on it. The following messages can be seen in syslog:

    node2 cmcld: Pausing HB connection to xx.xx.xx.xx
    node2 cmcld: Timed out node node1.
    node2 cmcld: Attempting to form a new cluster
    node2 cmcld: Assertion failed: icp->in_state == CL_CONN_INBOUND_READY, file: rcomm/comm_ip_state.c, line: 183

2. ServiceGuard does not handle an EMS error correctly when the monitoring for the resource has failed. The SG package depending on the resource continues to run and is not halted.

3. Configuration commands such as cmgetconf fail after reporting that disks do not have an ID when they do:

    Warning: The disk at /dev/dsk/c25t0d0 on node kelvin does not have an ID, or a disk label.
    Error: Unable to determine a unique identifier for physical volume /dev/dsk/c25t0d0 on node kelvin. Use pvcreate to give the disk an identifier.

The following errors are reported in syslog:

    Feb 6 20:01:07 kelvin cmclconfd[6345]: Unable to open disk /dev/dsk/c25t0d0: Resource temporarily unavailable
    Feb 6 20:02:20 kelvin cmclconfd[6345]: Physical volume /dev/dsk/c25t0d0 in volume group /dev/vgXX does not have an ID!

PHSS_30448:

1.
In ServiceGuard cluster using Fiber Channel cluster lock disk, cmcld can get stuck on partially open cluster lock disk. The following messages can be observed in syslog:

    cmcld: Unable to query the health of cluster lock disk /dev/dsk/c3t6d6: Device busy
    cmcld: Check device, power, and cables.

Issuing diskinfo on the cluster lock disk in this state fails with device busy.

2. In 2 node ServiceGuard cluster using serial heartbeat link, cmcld can abort with segmentation violation or bus error resulting in a node TOC. One observed stack trace is:

    (gdb) bt
    #0 0xa2f58 in cl_comm_reply+0x9f0 ()
    #1 0xbc81c in rcomm_health_event_handler+0x288 ()
    #2 0x146784 in cl_event_loop+0x458 ()
    #3 0x1f146c in cma__thread_base+0x204 ()
    #4 0x1f3c40 in cma__thread_start1+0x38 ()
    #5 0x1f36d8 in cma__thread_start0_PA20+0xc ()

3. In a stressed ServiceGuard environment there may be the following cmsnmpd subagent errors logged in /var/adm/SGsnmpsuba.log:

    "Protocol failure talking with cmclconfd"
    "***Error: reading status of SUBNET: -16"
    "***Error: retrieving package statuses: -26"
    "***Error: retrieving package statuses: -9"

4. There is a 16k memory leak exposed in the subagent, /usr/lbin/cmsnmpd, when retrieving new ServiceGuard cluster configuration information when there are CVM or System Multi Node packages configured in the cluster.

5. These messages show up in syslog when a configuration command (cmquerycl, cmcheckconf, cmapplyconf) is issued:

    "Unable to query the I/O interface: Key is undefined for specified token, or token is NULL."
    "Unable to get interface type for disk [disk name]"

PHSS_30028:

1. cmscancl produces the wrong output in the "network connection checking" section when HyperFabric interfaces are present. If clic0 (HyperFabric interface) and lan0 (Ethernet interface) are each in a point-to-point connection, cmscancl shows that the connection between clic0 (HyperFabric interface) and lan0 (Ethernet interface) is OK, which is incorrect.

2.
ServiceGuard can reduce the number of file descriptors available for package applications, which may cause applications requiring 1024 or more file descriptors to fail.

3. If ServiceGuard cluster reformation happens when a cmhaltpkg command is in progress, the cmhaltpkg command may return an error even when package has halted successfully. The error would look like:

    cmhaltpkg : Package orafintest is not currently running.
    Check the syslog and pkg log files for more detailed information.

This can also happen if cmhaltpkg is called from the control script of another package at package start time.

4. If a user attempts to add a quorum server while the cluster is running, the cmapplyconf does not add the cluster lock (correct behavior), but it succeeds without logging an error message. For example, a user can create a one node cluster with no cluster lock, start the cluster, and then use cmapplyconf to add a node and a quorum server. The result is a two node cluster with no cluster lock.

5. If the root filesystem is full, a package may fail to halt successfully. In the package log file, the following error messages can be seen:

    umount_pidsxxxx: Cannot find or open the file.
    vgchange_pidsxxxx: Cannot create the specified file.

The package log file will be:

    /etc/cmcluster/(package name)/(package name).control.log

PHSS_29915:

1. The ServiceGuard daemon, cmcld, is aborted in the presence of CPU starvation and/or frequent network packet loss. One of the following errors is shown in syslog:

    cmcld: Assertion failed: node != (cm_node_t *)0, file: cm/comm.c, line: 930

or

    cmcld: Assertion failed: icp->state == CL_CONN_CLOSING, file: rcomm/comm_ip_state.c, line: 171

2. Applications which need a number of file descriptors larger than 2048 might fail if starting as ServiceGuard packages.

3. The ServiceGuard daemon cmcld may be triggered by unreliable network traffic to begin consuming 100% of the CPU. On a single CPU system this could result in a system hang.

4.
In a cluster configured with 3 or more heartbeat LANs, the ServiceGuard daemon cmcld may wait indefinitely for replies to a message it has sent out and not complete a crucial step in a cluster reformation. The node will TOC with the following messages in syslog:

    cmcld: Halting to preserve data integrity
    cmcld: Reason: This node did not reach sync step 0 for activity 3 within timeout
    cmcld: Aborting! This node did not reach sync step 0 for activity 3 within timeout (file: utils.c, line: 228)

PHSS_29561:

1. cmcld may hang in an accept() call on the local communications socket if the socket pops but there is no connection to accept. This causes various threads to hang and frequent cluster reformations. Eventually when a connection comes along and the accept() call proceeds, all the threads resume execution but the processing of all the backed up activity results in a deadlock. The node is unable to respond to a sync request and aborts with the following syslog messages:

    vmunix: Halting to preserve data integrity
    vmunix: Reason: This node did not reach sync step 0 for activity 3 within timeout
    cmcld: Daemon exiting due to halt message from node
    vmunix: Service Guard Aborting!
    vmunix: Cause: This node did not reach sync step 0 for activity 3 within timeout(File: utils.c, Line: 228)
    cmcld: Halting to preserve data integrity
    cmcld: Reason: This node did not reach sync step 0 for activity 3 within timeout
    cmcld: Aborting! This node did not reach sync step 0 for activity 3 within timeout (file: utils.c, line: 228)

2. cmquerycl shows an inappropriate error message. For example, if you have an existing 3 node cluster configuration which does not include a site LAN in the configuration and then you attempt to create a new cluster configuration that includes a 4th node which has a LAN card connected to the site LAN, you would see:

Perform cmquerycl -c -n -n -n -n. The following error is printed:

    Error: Heartbeat subnet is not available on all nodes.
Site LAN is obviously not configured on the first 3 nodes, and just because cmquerycl finds it on the new node does not mean this constitutes an error.

3. Package control scripts do not display the patch ID of the ServiceGuard version they are generated from.

4. When there is a network component between two interfaces that does not allow any data link level (DLPI) traffic through, commands such as cmquerycl, cmcheckconf and cmapplyconf do not report the illegal configuration, but may instead print a misleading error message:

    Error: Non-uniform connections detected, successfully received from but did not receive from . This is probably due to heavy network traffic or heavy load on .

5. cmcld may hang in an accept() call on a remote communications socket if the socket pops but there is no connection to accept. If the cluster contains more than one node at the time, the problem node may TOC. If the cluster contains only one node at the time, cmcld may hang, commands may hang and other nodes may not join the cluster. A typical stack trace of cmcld obtained during the hang may look like:

    #0 0x1e0150 in cma__dispatch ()
    #1 0x1dfe5c in cma__block ()
    #2 0x1d7f0c in cma__int_wait ()
    #3 0x1f3418 in cma__io_wait ()
    #4 0x1bfe04 in cma_accept ()
    #5 0xc0213a28 in accept () from /usr/lib/libc.2
    #6 0x155468 in sg_accept ()
    #7 0xab5e4 in cl_comm_ip_accept ()
    #8 0xa3480 in cl_comm_ip_loop ()
    #9 0x1ee32c in cma__thread_base ()
    #10 0x1f0b00 in cma__thread_start1 ()
    #11 0x1f0598 in cma__thread_start0_PA20 ()

6. cmclconfd gets EIO from recv if cmcld exits in the middle of a TCP connection and prints this message to the syslog:

    Data corruption detected during message read.

7. The cmcheckconf/cmapplyconf will fail with inappropriate error messages if CLUSTER_NAME or NODE_NAME in the cluster ascii file is more than 39 characters. A similar problem exists if PKG_NAME or SERVICE_NAME in the package ascii file is more than 40 characters.
cmapplyconf may succeed with a CLUSTER_NAME of 40 to 42 characters in length, but after that, other cluster commands may fail, stating that the cluster cannot be found.

8. In a ServiceGuard cluster configured with a serial heartbeat link, cmcld may abort if the heartbeat LAN becomes congested and experiences delays. When this happens the following messages will be logged to syslog.log:

    cmcld: Out of order message 1346602 > 1346601 from node 1
    cmcld: Received REQ msg 1346602 req 0 from node 1 for group 1 service 2
    cmcld: cl_abort: abort cl_kepd_printf failed: Invalid argument
    cmcld: cl_kepd_printf, fstat: kepd_fd=7, st_dev=1073741827, st_ino=658, st_rdev=-486539264
    cmcld: Aborting! out of order message

9. In the SG package control scripts it is possible to specify that VxVM disk imports are done in parallel by setting the variable CONCURRENT_DISKGROUP_OPERATIONS to something other than 1. This was added on the theory that it would improve performance, but the design of vxclustd is such that it only works in a serial fashion. In some rare cases multiple concurrent requests can cause problems and result in failed disk group (dg) imports.

10. In a few cases, when DLPI primitives fail to complete and ServiceGuard configuration commands such as cmquerycl, cmcheckconf, or cmapplyconf fail as a result, there are no error messages that end-users can see other than non-specific indications of network driver problems.

11. The cmcheckconf and cmapplyconf commands succeed in adding a node into a cluster even when the node is already a member of another cluster. The commands should instead fail in this case.

12. The ServiceGuard daemon, cmcld, is aborted after it fails to receive UDP data with EAGAIN errno while trying to join a cluster. The following error is shown in syslog:

    cmcld: recvfrom failed: 11 Resource temporarily unavailable
    cmcld: Aborting! UDP recvfrom failed (file: rcomm/comm_ip_recv.c, line 662)

13.
The commands cmhaltnode and cmhaltcl can hang when cmtaped is set to be started by SG but is not running when the commands are issued.

14. The cmcld daemon may log the message "timers delayed x.x seconds" due to kernel latency issues, or a network partition may separate nodes in the cluster. A ServiceGuard cluster of more than 2 nodes with a cluster lock, after experiencing such a hang or partition, may result in the formation of 2 clusters. This is a corner case where the hang or partition happens while a node is joining a previously formed 2-node cluster. The joining node forms a cluster with the original coordinator node, while the non-coordinator node forms a cluster by itself.

15. Inactive TCP connections that are stale between nodes may never be detected and cleaned up. The stale connections do not cause any problems with cluster behaviour, but they should be cleaned up.

PHSS_29122:

1. When executing cmapplyconf, if the VxVM-CVM-pkg package is specified along with the cluster configuration and/or failover packages, the command may fail with:

    Unable to link to /pkgs/VxVM-CVM-pkg, object does not exist
    cdb_db_prepare - 2 error occurred
    Error: Unable to apply the configuration change: No such file or directory

2. When a cluster is applied using cmapplyconf, an error may be seen with the message "cmapplyconf - number of nodes specified exceeds maximum allowed".

3. When running CVM using the VxVM-CVM-pkg, vxclustd may abort during cluster reformation while trying to query the heartbeat network information. VxVM-CVM-PKG.log will show:

    ERROR: Cluster volume manager is inactive after 60 secs

4. If an EMS resource is configured with no RESOURCE_UP_VALUE criteria, a later online change of the resource may result in cmcld abort. Syslog will show:

    Jun 18 21:02:24 cmcld: Aborting: cl_ems_support.c 1448 (Unknown resource type

5.
If a node is shutting down, and during that time the following things happen in order:

    a) cmmodpkg command is issued (ignoring the cmhaltnode warning)
    b) online apply deletes the same package (ignoring the cmhaltnode warning)
    c) lvmd exits the shutdown, due to an external vg still active

then it is possible that cmcld on the node that is shutting down will abort. The stack trace from the core dump will contain:

    #3 0x4355020:0 in cl_assfail (module=2, assertion=0x4000bca0 "p_ptr !=NULL", file=0x4000bcb0 "pkg/pkg_owner_handler.c", line=1635) at utils/cl_log.c:921

And syslog will show:

    cmcld: Assertion failed: p_ptr != NULL, file: pkg/pkg_owner_handler.c, line: 1635

6. If an online delete for a package is in progress and at the same time another resource becomes available on a node that satisfies the requirements to run that package, then cmcld may core dump after the online delete. Syslog may contain something like the following:

    Mar 29 16:38:53 node1 cmcld: Unknown package 32548 for message op 27
    Mar 29 16:38:53 node1 cmcld: Aborting: pkg/pkg_coord_handler.c 115 (Unknown package).

7. If a node is shutting down, and at the same time an online cmapplyconf deletes a package with EMS resources, it is possible for cmcld to abort and drop core. A stack trace of the core will contain something like the following:

    #2 0x60000000c04584d0:0 in abort+0x190 () from /usr/lib/hpux32/libc.so.1
    #3 0x432b330:0 in cl_list_next (element=0x401a6d80) at utils/cl_list.c:314
    #4 0x417f3b0:0 in pm_resource_shutdown () at pkg/pkg_resource.c:1085
    #5 0x40c5e60:0 in resource_shutdown () at daemon/cld.c:2816

cmclconfd will detect that cmcld has aborted and will put a message in syslog indicating that there has been a lost connection.

8. During a cmhaltnode, the VxVM-CVM-pkg will be halted and package switching is disabled. If the shutdown of the node does not complete successfully, the VxVM-CVM-pkg will still be disabled and cannot be restarted.
Since the VxVM-CVM-pkg is up on other nodes in the cluster, it will get stuck in the STARTING state. A cmhaltnode or cmhaltcl and cmhaltpkg will hang after that point.

9. If the configuration of a package contains an EMS resource with "RESOURCE_START DEFERRED" and has duplicate "RESOURCE_UP_VALUE"s, it is possible for the cmhaltpkg and cmhaltnode commands to hang.

10. Packages that have EMS resources configured with "RESOURCE_START DEFERRED" and duplicate RESOURCE_UP_VALUEs, for example:

    RESOURCE_UP_VALUE > 2
    RESOURCE_UP_VALUE < 5 AND > 2

may not halt cleanly. This could cause a cmhaltnode -f command to fail. The package control log will contain:

    Invalid resource name

11. If an EMS resource monitor is exhibiting unpredictable behavior and keeps restarting constantly, it is possible for cmcld to keep calling the same set of functions recursively, causing a stack overflow and leading to a cmcld core. The stack trace from the core will look like the following:

    #0 0xc01ef644 in __ldfcvt_r+0x54 () from /usr/lib/libc.2
    #1 0xc01ef588 in _doprnt+0x48 () from /usr/lib/libc.2
    #2 0xc0200aa8 in sprintf+0x60 () from /usr/lib/libc.2
    #3 0x1556d4 in cdb_lookup_node_path+0x13c ()
    #4 0x1588d4 in cdb_lookup_ip_address_path+0x78 ()
    #5 0x159010 in cdb_lookup_ip+0xb0 ()
    #6 0x7384c in pm_check_subnet_status+0x2e4 ()
    #7 0x7349c in pm_owner_eval+0x1bc ()
    #8 0x7a3fc in pm_resource_event+0xc68 ()
    #9 0x5e37c in pm_check_and_deliver_owner_status_events+0x16c ()
    #10 0x763e4 in notify_request+0x330 ()
    #11 0x7a4a0 in pm_resource_event+0xd0c ()
    #12 0x5e37c in pm_check_and_deliver_owner_status_events+0x16c ()
    #13 0x763e4 in notify_request+0x330 ()
    #14 0x7a4a0 in pm_resource_event+0xd0c ()
    ...
    #109 0x763e4 in notify_request+0x330 ()
    #110 0x8084c in pm_run_script_completed+0xce0 ()
    #111 0x83510 in pm_exec_shell_script_completed+0x1f0 ()
    #112 0x83e58 in pm_script_status_event+0x6a4 ()
    #113 0x5e950 in pm_status_event+0x13c ()
    #114 0x5eb38 in pm_event_handler+0xb0 ()
    #115 0x15bf08 in cl_event_loop+0x72c ()
    #116 0xc005b168 in __pthread_body+0x44 () from /usr/lib/libpthread.1
    #117 0xc00649ec in __pthread_start+0x14 () from /usr/lib/libpthread.1

12. If, during the cmhaltcl or cmhaltnode, CVM disk groups are still active at VxVM-CVM-pkg halt time, the VxVM-CVM-pkg halt will time out and fail to halt. vxclustd will hang waiting for the disk groups to be deactivated. Because FAILFAST is set for this system package, the timeout of the VxVM-CVM-pkg will cause the node to TOC.

13. If the lock disk returns NOT_READY during the hourly cluster lock health check, no message about this is logged in syslog at that time. If the cluster later reforms and tries to form a one node cluster, the reformation might fail with the messages:

    cmcld: Obtaining Cluster Lock
    cmcld: Request to obtain cluster lock /dev/dsk/c7t1d0 failed: Device busy
    cmcld: Failed to request cluster lock.
    cmcld: Failed to get Cluster Lock.

ServiceGuard gives no earlier indication of the problem during the health check.

14. Under rare circumstances, cmgetconf can dump core when the user runs cmgetconf and kills cmcld simultaneously. The stack trace would look like:

    Program terminated with signal 11, Segmentation fault.
    #0 0x080e6352 in cf_free_object_space_in_cl (cl=0x0, logh=0xbffe8970) at config/config_cdb_utils.c:174 in config/config_cdb_utils.c
    #0 0x080e6352 in cf_free_object_space_in_cl (cl=0x0, logh=0xbffe8970) at config/config_cdb_utils.c:1741
    #1 0x080ecb99 in cf_destroy_cluster (cl=0xbffe9bcc) at config/config_cluster.c:1970
    #2 0x080d37b6 in cf_cdb_to_cf (cl=0xbffe9bcc, trans_info=0x81af148, logh=0xbffe9b90) at config/config_cdb_load.c:358
    #3 0x0813be15 in load_cdb (cluster_handle=0x81ae898, load_type=CDB_ANY_LOAD, config_file_cl=0xbffe9bcc, cdb_version=0xbffe9b4c, logh=0xbffe9b90) at config/config_query.c:1125
    #4 0x0813c4ff in cf_get_existing_config (cluster_name=0x0, cl=0xbffe9c80, flags=2, err_list=0xbffe9c50, verbose=0) at config/config_query.c:1361
    #5 0x080908d6 in getconf_main (argc=3, argv=0xbffec0e4) at cmd/cmd_config_too.c:328
    #6 0x08093b83 in main (argc=3, argv=0xbffec0e4) at cmd/cmd_main.c:270
    #7 0x4004b657 in __libc_start_main (main=0x8093494 , argc=3, ubp_av=0xbffec0e4, init=0x804b2b0 <_init>, fini=0x814e7b0 <_fini>, rtld_fini=0x4000dcd4 <_dl_fini>, stack_end=0xbffec0dc) at ../sysdeps/generic/libc-start.c:129

15. When VxVM-CVM-pkg package or any package with FAILFAST enabled cannot start, the node will TOC even if it's a single node cluster. This is expected behavior, however, more logging messages need to be added to clarify what has happened.

16. Node TOCs with this message in syslog:

    Assertion failed: ntohl(icp->hdr[which].node_id) == inp->id, file: comm_ip_recv.c, line: 957

17. In rare circumstances cmhaltnode will cause a node to TOC or panic. The final entries in syslog will look something like the following:

    Jul 9 17:20:53 lead cmcld: Timed out node zinc. It may have failed.
    Jul 9 17:20:53 lead cmcld: Attempting to form a new cluster
    . . . . . .
    Jul 9 17:20:54 lead cmclconfd[4166]: Data corruption detected during message read.
The final message in the log cache of the cmcld core file will look like the following:

    40104010: Aborting: cmdsrv/cmdsrv_rops.c 547 (Shutdown failure - sleep timed out)

18. If cmhaltnode is performed, the VxVM-CVM-pkg package on the node is halted, but the node is not halted since the LVM Volume Group is still activated. During this situation, the state of the VxVM-CVM-pkg package is shown as "starting". With the VxVM-CVM-pkg package in the "starting" state, no operation involving nodes can be performed.

19. In rare circumstances an assertion happens during shutdown. /var/log/messages:

    ...
    Nov 4 23:46:31 pabst cmcld: Request from node bud to start package pkg2966_4 on node pabst.
    Nov 4 23:46:31 pabst cmcld: Assertion failed: p_ptr->p_current_script == NULL, file: pkg/pkg_shell.c, line: 183
    Nov 4 23:46:31 pabst cmcld: coredump.c: signal_handler: Begin.
    ...

20. EMS resources are counted incorrectly in cmapplyconf. ServiceGuard allows for monitoring of up to 60 EMS resources. When reapplying a package you get the following:

    "Error: 61 resources exceeds the maximum number of resources of 60 per cluster. "

You may get this error even though you only have 60 resources defined in your configuration.

21. When a package is in the starting state and ServiceGuard enters the final part of a node shutdown, the package is ignored and will stay up, leading to possible data corruption. Packages can be in this state when a cmmodpkg or cmrunpkg is issued after a cmhaltnode has begun.

PHSS_28851:

1. After an upgrade of a ServiceGuard cluster from version A.10.06 or earlier to A.11.14, the cmrunnode command may fail with:

    # cmrunnode
    cmrunnode : Unable to determine the nodes on the current cluster
    cmrunnode : Either no cluster configuration file exists, or the file is corrupted, or cmclconfd is unable to run

Also, if the /usr/sbin/convert command is issued manually, it fails without any error message and with an exit code of 1:

    # convert -f /etc/cmcluster/cmclconfig
    NOTE: Executing the conversion tool.
    # echo $?
    1
    #

2.
When a service configured with SERVICE_FAIL_FAST_ENABLED set to "YES" fails, cmviewcl may display the node on which the service was running as "down" and "unknown", while displaying the package as "up" and "running", until the cluster reforms. 3. If a port-scanning utility such as the Linux application "nmap" is executed against a node running ServiceGuard, cmcld on the node may hang and unexpectedly fail. 4. ServiceGuard commands cmcheckconf/cmapplyconf -k option may fail if the volume groups mentioned in the cluster ascii file are not present on all the nodes in the cluster. 5. When SAM is used to create package scripts, the following will occur: All LVM Volume Groups that are added for the package are incorrectly identified as CVM STORAGE_GROUPs in the package ascii configuration script. The STORAGE_GROUP value is added even when there are no CVM disk groups configured on the system. The cmapplyconf of this package ascii configuration file will fail because the LVM Volume Group names are incorrectly specified as a CVM STORAGE_GROUPs, which do not exist. PHSS_27725: 1. The cmviewcl command intermittently fails with an error message: cmviewcl : Unable to query status for all packages: Device busy 2. The hpmcSGClusterDown trap is never generated or sent and the hpmcClusterState mib variable is never set to down by the cmsnmpd subagent when the cluster is halted. 3. When multiple package commands (cm*pkg) are issued during a cluster reformation, only the command issued last will succeed. The rest of them may hang. 4. Admin functions executed through ServiceGuard Manager are not logged in syslog on a cluster node where operation takes place. 5. The man page for cmcheckconf gives the wrong information for the -k option. The information for this option is actually for the cmquerycl -k option. 6. 
During an online package modification which removes previously-existing EMS resources, the previously-existing resources are not unregistered from EMS and deleted from memory properly, resulting in lingering monitor requests and unfreed heap space. 7. When FS_MOUNT_RETRY_COUNT is set to 1 and the mount command fails to mount the file system because the mount point is busy, the script returns a failure even though it successfully kills the processes using that mount point and then mounts the file system on the second try. As a result the package fails to start on the node. Messages like the following will appear in the package log file: ERROR: Function check_and_mount ERROR: Failed to mount /dev/vg01/lvol01 8. An SG command could hang shortly after a cluster formation. 9. Online node reconfiguration with cmapplyconf fails occasionally because of a race condition problem. 10. In the cmlvmd shutdown routine, SLVM Shutdown ioctl is issued unnecessarily for SG OPS. syslog will show a message saying that SLVM is already initialized. 11. cmcld could abort when a package with a name longer than 36 characters and an IP address is halted. 12. cmcld may log the following error message to syslog while still functioning properly: cmcld: Failed to connect to : (Interrupted system call) 13. If two clusters have their private subnets bridged to the public subnet and there is no route from cluster A's interfaces to cluster B's private subnet, there is a race condition between the non-routable udp broadcast packets coming from A's private subnet and the udp broadcast packets coming from A's public subnet that would prevent cluster A from discovering cluster B. 14. cmcld could abort (with an assertion failure) if it requests the quorum server lock and the lock is granted, but a cluster reformation occurs before the lock-granted message arrives. 15.
If any problem occurs during package startup after NFS services have been started, causing package start to fail, package restart would fail again even after fixing the problem. 16. It is possible that when a ServiceGuard package unmounts filesystems in umount_fs, not all filesystems are unmounted and the volume group deactivation fails with device busy. This is most likely to be true when CONCURRENT_MOUNT_AND_UMOUNT_OPERATIONS is set to a large number. 17. The following error message is found in the package control log file: "fuser: illegal option --C" 18. The cmviewconf command does not show the quorum server (if one is configured). 19. Erroneous hpmcSGPkgDown traps with blank nodenames are generated by cmsnmpd during cluster start up or reformations caused when SG nodes fail, halt or are started. 20. The cmsnmpd subagent doesn't update the hpmcSGPkgSubnetStatus MIB variable when a package's subnet fails or comes up. 21. When a node is halted, cmsnmpd shows inaccurate hpmcNodeRole mib variables on the halted node. 22. During a normal cluster shutdown, cmsnmpd doesn't receive a node halted event from the SG subagent api on the last node in the cluster to be halted. This causes cmsnmpd to assume that the last node in the cluster failed, and sets the hpmcNodeStatus mib to "failed" and doesn't send out the appropriate hpmcSGNodeHalted trap. 23. When an Oracle RAC database instance goes down unexpectedly or is shutdown because "shutdown abort" is used, the surviving Oracle RAC database instances can take longer than normal to finish database recovery and return to normal operation. 24. If a user tries to change a package script timeout using SAM/GUI without any other modifications, SAM/GUI will ignore the changes and package configuration will remain unchanged. 25. Under rare circumstances, if a node cannot update its system clock for an extended period of time, the node and one more node in the cluster will fail.
If the cluster has no more than 3 nodes, the whole cluster will fail. If the cluster has 4 nodes with a cluster lock, or more than 4 nodes, the remaining nodes should reform a cluster. The node which experienced the system clock problem does a TOC first, and the other node does a TOC shortly after that. The syslog on the node with the system clock problem may not log any information. One of the other nodes in the cluster will log the message below in the syslog at an interval equal to the node timeout. For example, if the node timeout is 3 seconds, then every 3 seconds the following message will be seen in syslog (in addition to other messages): 10:10:03 Timed out node NODEA. It may have failed. 10:10:03 Attempting to adjust cluster membership ...... ...... 10:10:06 Timed out node NODEA. 10:10:06 Attempting to adjust cluster membership ...... ...... 10:10:09 Timed out node NODEA. 10:10:09 Attempting to adjust cluster membership ...... ...... 10:10:12 Timed out node NODEA. 10:10:12 Attempting to adjust cluster membership 26. The ServiceGuard SNMP subagent, cmsnmpd, will sever its socket with cmcld if there are no packages configured in the cluster. This prohibits any Cluster-related SNMP traps or MIB variables from being generated or updated. An error message similar to the following will be observed in the subagent log file in /var/adm/SGsnmpsuba.log: ***Error: reading status of SUBNET PHSS_27246: 1. When using the unsupported contributed cmsetsafety tool to disable and then re-enable safety time, the tool fails to restore safety time protection properly, resulting in node TOC. 2. If cmviewcl is issued at the same time that a package resource, subnet or service is being deleted from the configuration by another session using cmapplyconf, then cmviewcl may fail with a SIGSEGV creating a core.
The stack trace by GDB typically contains: #0 0xc01ffd40 in kill () from /usr/lib/libc.2 #1 0xc019b3b4 in raise () from /usr/lib/libc.2 #2 0xc01db550 in abort_C () from /usr/lib/libc.2 #3 0xc01db5ac in abort () from /usr/lib/libc.2 #4 0xd5ca8 in cdb_get_resource_list (cluster_handle=0x40025f38 "", pkg_name=0x40032b38 "pkg9424_2", num_resources=2, resource_list=0x400325f0) at config/config_cdb_data.c:1138 #5 0x8f560 in view_resource (cluster_handle=0x40025f38 "", pkg_name=0x40032b38 "pkg9424_2", node_name=0x40036df0 "buf", print_for_unowned=1) at cmd/cmd_view.c:2581 #6 0x8d224 in view_unowned_pkg (cluster_handle=0x40025f38 "", pkg_name=0x40032b38 "pkg9424_2", package_format=0x40022990 " vflag=1, lflag=1, plimit=1, pkg_status=0x40009340 "down", pkg_state=0x40009338 "halted", pkg_switching=0x400093a0 "disabled", pkg_owner=0x400354f8 "unowned", status_str=0x40008ed8 "up") at cmd/cmd_view.c:1859 #7 0x8ae88 in view_cluster (cluster_name=0x40026118 "STRESS_lvk_0419", vflag=1, lflag=1, numpkgs=0, pkgs=0x40025430, numnodes=0, nodes=0x40025440, climit=0, plimit=1, nlimit=0, gflag=0) #8 0x87a5c in view_main (argc=4, argv=0x7f7f01a4) at cmd/cmd_view.c:289 #9 0x76758 in main (argc=4, argv=0x7f7f01a4) at cmd/cmd_main.c:220 3. A package configured to use the large number of file systems spread across the multiple volume groups takes longer to mount the file systems. Also there is no mechanism provided in the control script for the user to specify additional options to fsck and umount commands used in the package control script. 4. When cmcld is running with more than ten network interface cards configured on a cluster node, its CPU utilization percentage raises significantly. This problem is mostly exposed with Superdome machines, or systems with large VLAN configuration. 5. A series of single point network card or hub failures may cause a cl_sync timeout resulting in the entire cluster going down. Syslog reports error: "Node id X did not reach sync step 0 for activity 3" 6. 
A cmapplyconf succeeds with an unquoted 2-word value for a string resource. For example: RESOURCE_UP_VALUE = very stable The cmapplyconf would succeed, but the resource would be "UP" when its value was "very" not "very stable". 7. In SAM/GUI a user is not able to see a hierarchy of EMS resources in package configuration screens. 8. The cmsnmpd subagent will store the package status as unknown, instead of down in the ServiceGuard MIB table when a package's node fails and the package is not restarted. 9. If cmrunnode or cmruncl times out, in a subsequent cluster formation a package configured with automatic start resources may fail to come up on its primary node. 10. ServiceGuard commands cmcheckconf/cmapplyconf with the -P option and without the -C option can take longer to finish. Even specifying the -k option does not improve performance. This can also be noticed if ServiceGuard is upgraded from 11.09 or an earlier version to 11.13 or 11.14. 11. ServiceGuard commands cmcheckconf/cmapplyconf with the -k option can take a long time if there are a large number of disks and volume groups configured on the system while only very few of them are mentioned in the cluster ascii file. 12. At cmcld start up, i.e. cmrunnode or cmruncl, syslog shows this message, "cmcld: Assertion failed: pnet != NULL, file: comm_link.c, line: 140." cmcld immediately aborts and dumps core. 13. Service Assistant Daemon (cmsrvassistd) can dump core in /var/adm/cmcluster if SIGCHLD is delivered while in the middle of a syslog call.
The stack trace in the core dump would look like this: #0 0x400c942f in tz_compute (tm=0xbffff064) at ../sysdeps/i386/bits/string.h:343 #1 0x400c95c4 in __tz_convert (timer=0xbfffeee8, use_localtime=1, tp=0xbffff064) at tzset.c:593 #2 0x400c576b in __localtime_r (t=0xbfffeee8, tp=0xbffff064) at localtime.c:33 #3 0x4010298d in vsyslog (pri=27, fmt=0x805f740 "Unable to send 64 bytes (Software caused connection abort).\n", ap=0xbffff0dc) at syslog.c:170 #4 0x401028a9 in syslog (pri=27, fmt=0x805f740 "Unable to send 64 bytes (Software caused connection abort).\n") at syslog.c:102 #5 0x0804ba2c in cl_vsyslog (private_data=0x0, category=131072, level=0, module=5, fmt=0x8059dc0 "Unable to send %d bytes (%s).\n", ap=0xbffff130) at utils/cl_syslog.c:91 #6 0x0805859b in cl_clog (clog_handle=0x0, category=131072, level=0, module=5, fmt=0x8059dc0 "Unable to send %d bytes (%s).\n") at utils/cl_clog.c:123 #7 0x0804c356 in cl_local_cl_send (fd=0, service_id=12,msg=0xbffff1ac, msg_length=32, flags=1, reply=0x0, timeout=0x0,logh=0x0 at lcomm/local_client.c:497 #8 0x0804b885 in handle_sig_chld (in=17) at servsen/serv_assist.c:1236 #9 Also, the message the user will see is: "Process creation daemon terminated due to a signal(11)." 14.After customer modified the hostname, packages using VxVM disk groups failed to start. 15.At package start up, busy mount point might not be freed up and re-mounted appropriately. PHSS_26056: 1. cmviewconf displays an incorrect HALT_SCRIPT_TIMEOUT value for a package when the RUN_SCRIPT_TIMEOUT is set to NO_TIMEOUT (0) and the HALT_SCRIPT_TIMEOUT is set to a non-zero value. 2. After an upgrade of a ServiceGuard cluster to version 11.13 from version 11.12 or earlier, if any package with an EMS resource has been added and deleted before upgrade, then the addition of any new package to the cluster after upgrade may fail. The cmapplyconf command will return error messages like: Error: Unable to apply the configuration change: Unknown error: 3015. 
Check the syslog file(s) for additional information. cmapplyconf : Unable to apply the configuration The syslog may contain the error messages like: cmcld: cdb_db_prepare - 3015 error occurred 5 3. If cmrunnode or cmapplyconf are stopped in the middle of execution and there are multiple such commands running concurrently, then the cmcld may fail with a SIGSEGV or SIGBUS creating a core in /var/adm/cmcluster/core. The syslog will contain the messages like, cmlvmd: Could not read messages from /usr/lbin/cmcld: Software caused connection abort cmlvmd: CLVMD exiting cmsrvassistd[]: The cluster daemon aborted our connection. cmsrvassistd[]: Lost connection with ServiceGuard cluster daemon (cmcld): Software caused connection abort The stack trace by GDB typically contains: #0 0x105d94 in cdb_client_port_close () from /usr/lbin/cmcld #1 0x1413a0 in cl_thread_start () from /usr/lbin/cmcld #2 0x1aa8e8 in cma__thread_base () from /usr/lbin/cmcld #3 0x1aca38 in cma__thread_start1 () from /usr/lbin/cmcld #4 0x1ac4d4 in cma__thread_start0 () from /usr/lbin/cmcld #5 0x105f0c in cdb_client_port_close () from /usr/lbin/cmcld 4. If a configuration operation gets aborted during a cluster reformation with a down node joining the cluster, cmcld may abort on the node that is rejoining with the following messages: Action - Invalid transaction state of NO_TRANS for node id x, (ABORTED) Internal error - Aborting: cdb/cdb_coord_comm.c 517 (Invalid transaction state) 5. When a local LAN failover fails, no error messages about the failure are logged to syslog. 6. When the concurrent fsck's have been defined in the package control script, the fsck's executed on the Journaled File System during a package start up log messages in a random order. As a result it is hard to associate the messages from the package control script log with the volume groups being checked. 7. When SAM GUI switches environments, certain tasks are no longer available. 8.
Certain network load balancers or switches may not be able to complete local switch within ten seconds after a local switch occurs in ServiceGuard. This can result in the client side not experiencing the failover performance benefit that the network load balancer can provide. 9. The ServiceGuard daemon, cmcld, may experience SIGSEGV and accordingly dump core when a package is started or halted. The resulting stack trace will show segmentation violation. 10. The cmcheckconf/cmapplyconf command will fail for a package if EMS resource is not available on the node where command is issued even if that package only runs on other nodes in the cluster where the resource is available. The commands will fail with output similar to: Error: ems subclass request for failed, resource type (3016) Error: Failed to get type information for on node 11. Primarily on ServiceGuard OPS clusters, the cmrunnode command executed from the cmcluster rc script may fail. When this happens, other nodes in the cluster may log messages in syslog such as: cmcld: Detected different configuration data on node cmcld: Can not form cluster with node cmcld: Quitting due to configuration data version mismatch 12. If an EMS monitor on the system is not yet ready to monitor a resource on which a ServiceGuard package is dependent, the package will fail to start. The following messages may be seen in syslog: cmcld: ems monitor for is not ready above message repeats 2 times cmcld: Resource set to "UP". cmcld: Package cannot run on this node because resource does not meet package RESOURCE_UP_VALUE. 13. A package configured with a deferred start resource may start and halt immediately when the cluster starts up. The following messages may be seen in syslog: cmcld: Started package on node . cmcld: Package cannot run on this node because resource does not meet package RESOURCE_UP_VALUE. cmcld: Resource in package does not meet RESOURCE_UP_VALUE. cmcld: Executing ' stop' for package , as service . 14. 
Issuing the command "cmsetlog -M RES" to turn up logging in the resource module does not work. 15. A ServiceGuard OPS node may fail to halt resulting in a TOC. The following error messages may be seen in syslog: cmcld: CMGMSD successfully halted cmcld: Failed to unregister all resource monitor requests. cmcld: This node () has ceased cluster activities. cmcld: Daemon exiting cmcld: CMGMSD/GMS halted but unable to halt SG. Rebooting... 16. ServiceGuard A.11.14 now supports adding a new resource previously not configured in any package while the cluster is running. 17. ServiceGuard daemon cmcld aborts with the message "DLPI error! dl_errno: 1, dl_unix_errno: 0." in syslog. This leads to a system TOC. 18. SG supports only 60 packages. With this patch it now supports 150 packages. After installing this patch on all nodes in the cluster, the cluster must be brought down in order to increase the MAX_CONFIGURED_PACKAGES parameter in the cluster ascii file. Once this is changed, up to 150 packages may be configured. Note that once a cluster has more than 60 packages configured, any upgrade to SG version 11.14 MUST include patch PHSS_26056. So, after halting a node and upgrading that node to 11.14, PHSS_26056 must be applied before bringing that node back into the running cluster. If this procedure is not followed, any nodes running 11.14 without the patch could crash (TOC) due to cmcld dying with SIGSEGV or SIGBUS. Be sure to set AUTOSTART_CMCLD to 0 in /etc/rc.config.d/cmcluster before beginning the upgrade to 11.14. The SG11.14 patch PHSS_26056 will be released later this spring. Until it is released, customers using > 60 packages must remain on 11.13 and not upgrade to 11.14. Note that running 150 packages requires systems that have a lot of capacity. If your systems are not powerful enough, some of your packages may not start or may partially start. In this case, you will need to reduce the number of packages. 
Test each node by running the cluster on that node only (cmruncl -n node), and make sure all packages start that are configured to run on that node. NOTE: At this time ServiceGuard Manager does not support more than 60 packages per cluster. 19. When nodes configured for a particular package are only a subset of the nodes in the cluster, a call to cmGetstatus(CM_PKG_STATUS) may return -26, causing cmsnmpd to sever the socket connection with the ServiceGuard cmcld daemon. This behavior will happen when the user brings up the cluster or node and/or restarts the cmsnmpd subagent. Once the socket connection with SG is severed, no MIB variables or SNMP traps will be updated or sent, which results in stale data in the SG MIB table. 20. During cluster formation, cmcld can exit with a segmentation violation. The stack trace of the resulting core looks like: cl_local_srv_free\952 (00121284) (`thread(24)) ss_monitor_operation_phase_II\441 (001285A0) ss_cl_local_reply_event\944 (00129A88) ss_event_handler\1069 (00129F20) ss_event_handler (hpux_export stub) (00129E38) cl_event_loop\434 (001C34E0) cl_event_loop (hpux_export stub) (001C2D18) cma__thread_base+01e8 (002283B0) cma__thread_start1+0030 (0022A500) cma__thread_start0+0004 (00229F9C) 21. Shortly after a cluster starts (via cmruncl or cmrunnode on all nodes), cmcld can exit with the following message: Fatal internal error - Assertion failed: ntohl(node_ptr->node_info.p_state) == P_NOT_OWNED, file: pkg_list.c, line: 298 It is possible that before cmcld exits, packages may have started up on this node, however these packages will not be halted. So when the remaining nodes in the cluster take over the packages that were running on this node, it is possible for data corruption to occur if VxVM disk groups are used in the packages, since they are activated on more than one node. 
Also, software components which communicate with the affected application may experience connection problems associated with the package's IP address appearing on two nodes at the same time. 22. A package configured with automatic start resources may start on an adoptive node instead of on the primary node during cluster startup, due to the resources being registered with EMS earlier on the adoptive node than on the primary node. 23. Not all resource monitor requests are unregistered with EMS when cmcld exits, so the next time cmcld starts up and registers the same requests with EMS, it will not get immediate notifications regarding the state of the resources, and packages will not be able to start. 24. After a package has been added to a cluster, cmsnmpd may not update the MIB and hence the package may not be available as a resource to be monitored by another package. 25. When the PACKAGE environment variable is improperly set in the package control script, the script fails with errors such as the following: cmmodnet : Subnet is not a configured subnet. cmmodnet : Use the "netstat -in" command to list the configured subnets. No errors are logged to syslog. 26.If there are multiple cluster nodes issuing configuration queries at roughly the same time, cmgetconf can silently timeout. 27.When a node with a node ID that is not the first or last node ID in the cluster is removed from a ServiceGuard OPS Cluster, the "cmviewcl -l group" command will return an error message like: cmviewcl : Failed to convert node_name xxx to node_id. 28.In a 2 node ServiceGuard cluster, if cmcld on one node experiences a long kernel hang and again tries to join the cluster then the whole cluster can crash. This can be seen on more than 2 nodes if cmcld on all the nodes except on one node experiences long kernel hang. The syslog on node which does not experience the kernel hang will log messages like: cmcld: Timed out node . It may have failed. 
cmcld: Attempting to form a new cluster cmcld: Safety time set for 128.96 seconds from now cmcld: Did not receive all votes: 1 out of 2 cmcld: All votes (100) are required at this point. vmunix: SCSI: Reset requested from above -- lbolt: 246237, bus: 2^M^M cmcld: Got at least 50 votes: 1 out of 2 last active nodes. cmcld: Obtaining Cluster Lock cmcld: Successfully issued request for cluster lock /dev/dsk/c2t8d0 vmunix: SCSI: Resetting SCSI -- lbolt: 246337, bus: 2^M^M vmunix: SCSI: Reset detected -- lbolt: 246337, bus: 2^M^M cmcld: Cluster lock disk /dev/dsk/c2t8d0 appears healthy cmcld: Successfully obtained the Cluster Lock cmcld: lock id: 6 cmcld: Turning off safety time protection since the cluster cmcld: may now consist of a single node. If ServiceGuard cmcld: fails, this node will not automatically halt cmcld: Active node has voted for me cmcld: Enabling safety time protection cmcld: Enabled safety time with 257774 cmcld: Attempting to adjust cluster membership cmcld: Safety time set for 7.71 seconds from now cmcld: Active node has voted for me cmcld: Clearing Cluster Lock 29.When a shutdown(1m) command is run from two nodes concurrently, it can cause cmhaltnode to fail. This can happen if one node has completed its cmhaltnode and the other node is still running cmhaltnode. This problem can also be seen if a cmhaltnode command is halting the cluster on one node and another node in the cluster does a TOC or a reboot before the cmhaltnode command completes. The /etc/rc.log.old will contain messages or command will exit with messages like: Warning: Do not modify or enable packages until the halt operation is completed. Halting Package cmhaltnode : Unable to halt package : Socket is not connected Check the syslog and pkg log files for more detailed information: cmhaltnode : Warning : node failed to HALT ERROR: Unable to halt cluster on this node. 
30.Large numbers of the following message are logged to the syslog.log file: Mar 18 10:00:48 HGALUX07 cmclconfd[15865]: Unable to attach to network interface 1. This happens whenever customers try to view properties of objects in SG MGR, or when cmquerycl, cmcheckconf, cmapplyconf are issued. 31.A series of short kernel hangs on one node lead to cluster reformation and continues during reformation. This opens a small timing window where the node that is healthy hits the assertion failure, cmcld: Assertion failed: !node->hb_eligible, file: election.c, line: 5699. 32.cmapplyconf continually fails with Error: Unable to begin the configuration change 33. cmsnmpd will not store cluster name in the mib definition when started while cluster or local node are halted. A call to "resls /cluster/status" will result in output which is missing the cluster name. 34. When nodes configured for a particular package are only a subset of the nodes in the cluster, a call to cmGetstatus(CM_PKG_STATUS) may return -26, causing cmsnmpd to sever the socket connection with the ServiceGuard cmcld daemon. This behavior will happen when the user brings up the cluster or node and/or restarts the cmsnmpd subagent. Once the socket connection with SG is severed, no MIB variables or SNMP traps will be updated or sent, which results in stale data in the SG MIB table. Defect Description: PHSS_31015: 1. The memory leak is happening as cmlvmd was not freeing up the memory it created while constructing the reply to be sent to the command. Resolution: Free up the memory. 2. When no LVM is configured, cmclconfd may try to free an uninitialized stack variable. If the variable happens to be non-zero at the time, cmclconfd may abort. Resolution: Initialize the affected variables to NULL and only free them if they are not NULL. 3. The OpenView Operations (OVO) library /opt/OV/lib/libopccv.sl requires libpthread.1 to be able to load. 
However, since the OPC functionality is no longer supported, ServiceGuard A.11.14 patches and later are no longer linked to this library file. This leads to a loading problem of the OVO library. Resolution: Remove OPC functionality altogether. 4. The cmquerycl command validates that all cluster volume groups are correctly configured on new nodes which are being added. During that validation, the second cluster lock is not validated correctly and the command exits with an error. Resolution: Validation code is fixed to do correct validation for the second cluster lock. PHSS_30769: 1. In the case of multiple heartbeat connections, if a local switch is taking place on one network or if one network is under heavy load, the heartbeat messages might be slower on that network. ServiceGuard handles this by pausing other connections, allowing the slow connections to catch up. If a cluster reformation takes place during this time, the paused connections are not cleaned up correctly and an assertion is hit. Resolution: Clean up paused connections correctly. 2. The problem occurs when the internal sdb_data is not checked or verified for its data type. The problem was discovered and fixed in A.11.15 and the fix is included in this A.11.14 patch. Resolution: Add the same data type check to A.11.14. 3. This device open failure with EAGAIN occurs only from FC60 disk arrays. The occurrence is very transient, and the "resource temporarily not available" return message can be due to timing in the hardware and its firmware. Investigation concluded that an immediate retry of the device open will always be successful. Resolution: Retry the device open only when it fails with EAGAIN. PHSS_30448: 1. The cmcld opens a physical link so that a bus reset can be done before cluster lock acquisition to clear any pending I/O.
For a Fiber Channel cluster lock disk, this open returns successfully but the disk can get partially open (its LUN size is 0) and all subsequent tries of cmcld to access the disk would fail. To fix this problem at HP-UX level the device needs to be closed by any process. Resolution: Bus resets are not supported for Fiber Channel storage, so cmcld no longer opens Fiber Channel cluster lock devices. 2. In a 2 node ServiceGuard cluster using a serial heartbeat link, in one code path an uninitialised pointer is referenced, which can result in a cmcld abort. The code path is only executed if a serial heartbeat is configured. Resolution: Removed the uninitialised pointer reference and used another variable which serves the same purpose. 3. Subagent errors are caused by race conditions in a stressed environment when the subagent is trying to retrieve cluster status information while there is an online reconfiguration, a node is halting, or the subagent has unexpectedly lost its connection with the cmcld daemon. Resolution: The expected errors will be handled accordingly when cmcld goes down. Also added a few more retry iterations to avoid a race condition when getting cluster information while an online reconfiguration is happening. 4. When the subagent prepares to get new SG configuration information it deletes the entire package structure before freeing the dependencies and storage group lists. Resolution: Free package dependencies and storage group lists first, before freeing the entire package structure. 5. When SG does disk probing during the configuration process, it tries to query the I/O interface of the disk and only expects to see type "INTERFACE". When it is another type, "VIRTBUS" in this case, SG goes ahead and tries to go up and query the parent node in the I/O tree, but cannot. Resolution: Make a change so that SG recognizes type "VIRTBUS". PHSS_30028: 1.
The PPA values of the HyperFabric interfaces are the same as those of the ethernet network interfaces, thus causing the linkloop command used in cmscancl to give incorrect results. Clic0 (HyperFabric interface) has PPA 0, same as lan0 (Ethernet interface). Linkloop uses the card PPA number, and so cannot distinguish between the cards. Resolution: Skip the network connectivity check for non-LAN hardware (e.g. HyperFabric, ATM etc.), if any, since the linkloop command is supported only for LAN hardware. 2. The cmclconfd and cmcld daemons recalculate the number of file descriptors for their own environment and set their value in the system (currently 1024 set by cmclconfd). The same value gets passed to cmcld and from cmcld to the applications starting from ServiceGuard, thus creating a problem. Resolution: Even if cmclconfd and cmcld daemons recalculate and set a new file descriptor limit, the original value will be restored for child processes so that they will have the original system limit. 3. In this case the cmhaltpkg command was in progress while waiting for the package control script to finish. If a reconfiguration happens during that time, the command does not know whether the halt of the control script was successful or not. The retry to learn the status might have found that the package is not running even though it might have halted successfully. So the command exits with an error. Resolution: As this is a race condition and multiple things are happening at a particular time, the error cannot be avoided in all cases. So a clearer message is displayed if this condition is encountered. 4. cmapplyconf does not properly check for adding a cluster lock while the cluster is running. Resolution: Modified code to make this check correctly. 5. When a package starts to halt, the package control script will create a temporary file on the root filesystem to save the pids for vgchange and umount.
If the root filesystem is full, the temporary file will fail to create and thus the package will fail to halt. Resolution: Use the local variables to keep the pid information instead of creating the temporary file. PHSS_29915: 1. While in the middle of re-initializing a redundant heartbeat TCP connection, another heartbeat connection is closed and the cleanup logic did not handle this combination of events correctly. Resolution: The logic error is fixed. 2. The cmcld daemon recalculates the number of file descriptors for its own environment and sets its value in the system (currently 2048 is set by cmcld). The same value gets passed to processes starting from cmcld, thus creating the problem. Resolution: Even if cmcld daemon recalculates and sets a new file descriptors limit, the original value will be restored & will be set by cmsrvassistd so that child processes will have the original system limit. 3. Unreliable network traffic may result in inconsistent network connections and cause cmcld to keep checking for but not actually reading incoming data in a loop. Resolution: Made changes such that cmcld will not keep checking for incoming data in a loop and such that inconsistent network connections will be cleaned up. 4. Due to a logic error, a node may discard a message on all its connections and never reply to the sender. Resolution: Fixed the logic error. PHSS_29561: 1. The local communications socket is blocking, so if the socket pops but there is no connection to accept, it will cause cmcld to hang. Resolution: Changed socket to be non-blocking. 2. cmquerycl checks for heartbeat subnet even when there is no subnet configured on the node. Resolution: Added a check to make sure that the code path to check for heartbeat subnet is executed only in case of a subnet configured on the node. 3. cmmakepkg only puts the ServiceGuard revision (e.g. 11.14) but not the patch ID in package control scripts. 
Resolution: Changed cmmakepkg to put the patch ID in package control scripts. 4. Commands do not check for the described illegal configuration. Resolution: Changed commands to check to make sure all connections that can communicate on the IP level can also communicate on the data link level. 5. The remote communications sockets are blocking, so if one pops but there is no connection to accept, it will cause cmcld to hang. Resolution: Changed sockets to be non-blocking. 6. This message is misleading to the customer. There is no data corruption in the customer's package. EIO from recv means the tcp connection was closed during recv, which may happen when cmcld shuts down while cmclconfd is in the middle of recv. Resolution: Changed the message to: I/O error detected during message read. 7. The ServiceGuard manual mentions the limit on CLUSTER_NAME and on some other parameters as 40. Internally strcpy is used to copy the parameters into ServiceGuard data structures. As strcpy does not enforce this limit, a longer value creates a problem and the command returns an inappropriate error message. Resolution: Enforce the length of the string while reading the ascii file. If it exceeds the maximum allowed limit then print the error. 8. During heartbeat exchange, some other messages are also exchanged, for example the health message. If an error is encountered while processing such a message, the error is returned to the sender. Due to varying network speed and serial link speed, the error returned on the serial link was delayed while another message on the ip network arrived first. This created an out of order sequence and thus cmcld aborted. Resolution: Make sure that only appropriate traffic, meaning only heartbeat, is directed on the serial link. For other traffic only the ip network is used. 9. vxclustd does not support the concurrent activation of disk groups (dgs) and doing so might create a problem. Resolution: Remove the option of CONCURRENT_DISKGROUP_OPERATIONS provided in the package control script.
This will ensure that dgs are activated sequentially. 10. The ServiceGuard config daemon, cmclconfd, does not log DLPI errors in a few cases when the DLPI primitives fail to complete. This makes debugging of problems more difficult on the ServiceGuard production bits. Resolution: Make change to the config daemon so that DLPI error messages can now be logged. 11. The ServiceGuard command cmcheckconf or cmapplyconf does not recognize the fact that a node that is being added to a cluster is already a member of another cluster. Resolution: The commands will keep track of all the clusters that each node in the ASCII file currently belongs to, if any, so that they can report the error accordingly. 12. The ServiceGuard daemon, cmcld, was not prepared to deal with an EAGAIN error when the kernel temporarily runs out of resources, which causes the abort. Resolution: The daemon is now more resilient to transient errors such as EAGAIN when it fails to receive UDP data. Doing so will keep the daemon running instead of aborting when a temporary error occurs. 13. When halting a service, ServiceGuard expects that in the routine where it actually does the halting, an event will be posted, so after calling the routine, it goes ahead and deletes the event without checking to see if the routine returns successfully. Resolution: If the routine returns with error, log an error message and do not delete the event. 14. While a node is joining a 2-node cluster, there is a kernel hang on the coordinator node or a network partition that separates the 2 non-joining nodes. The non-coordinator node gets the cluster lock and forms a 1-node cluster. Once the coordinator node resumes execution, a logic error allows it to set or clear the cluster lock and form a 2 node cluster with the joining node. Resolution: The logic error is fixed and assertions added to ensure that the same kind of error is not introduced in the future. 15.
ServiceGuard does not set the keep alive option on connections, which would detect staleness after a certain period of time. Resolution: Set SO_KEEPALIVE option on all connections. PHSS_29122: 1. Due to dependencies that must be satisfied concerning CVM disk groups, during the execution of cmapplyconf the VxVM-CVM-pkg package must be processed first before the cluster configuration and/or failover packages with CVM disk groups are processed. If cmapplyconf is invoked with the VxVM-CVM-pkg package along with the cluster configuration and/or other failover packages, it is not guaranteed that the VxVM-CVM-pkg package is processed first, which may produce the aforementioned error. Resolution: Impose new limitations on cmapplyconf. The user may invoke cmapplyconf with the VxVM-CVM-pkg package only, or cmapplyconf with the cluster configuration and/or failover packages but not the VxVM-CVM-pkg package. The cmapplyconf man page has also been updated to reflect these new limitations. 2. The function cmdlm_info() determines the maximum number of nodes that can be supported by the node where cmapplyconf is run. But that value is valid only if the return value of that function is 0 and not -1. Currently the caller function doesn't consider the return value before considering the value determined as valid. This could cause a problem. Resolution: Added a check in the caller code to check for the return value before considering the value determined by cmdlm_info(). 3. The API to get the cluster configuration information has a check to see if the configuration version has changed since the lookups occurred. It is possible on an SGeRAC cluster that cmgmsd transactions could slip in and bump the version, causing the API call to fail. Resolution: The fix is to turn off this checking for realtime API clients (which is only vxclustd). This is ok because vxclustd does not care if the cmgmsd part of the CDB tree is modified. 
COM B.01.04.01 Patch PHSS_29123 or later will also be needed to complete this fix (for the cmprovider part), but the commands change in this SG patch can stand alone without harm. 4. cmapplyconf does not check to make sure at least one RESOURCE_UP_VALUE criterion is defined for each resource. When a resource is configured with no criteria defined, invalid information is stored in cmcld, which makes for fatal comparison operations. Resolution: Change cmapplyconf to check that at least one RESOURCE_UP_VALUE criterion is defined for each resource. 5. During a node shutdown, the halting node postponed the status update requests, causing the node to core if the shutdown was aborted. Resolution: Fixed the code to process the status update requests even when the node is in the process of shutting down. 6. During an online delete of a package, if a resource becomes available and the package is runnable on a node, it is possible that cmcld may core after the package delete, as it did not handle the notify ownership request appropriately. Resolution: Fixed the code to handle the notify message from the owner appropriately during an online package delete operation. 7. When a cmhaltnode is in the middle of shutting down a node, and the user ignores the warning message and does an online change which deletes a package with EMS resources, it may cause data corruption in cmcld, leading to a cmcld core. This occurred because there was no mutex protection for a critical region accessed by two different threads simultaneously. Resolution: Modified code so that the critical region is accessed by only one thread at a time. 8. The System Multinode Package got node disabled during a cmhaltnode sequence. If the node exits the halting sequence, and a cmrunpkg is issued for the System Multinode package, the package will be stuck in starting state as the package is not running on all nodes. Any cmhaltnode command issued after that would hang.
Resolution: Fixed the code to not node disable the System Multinode Package when processing a cmhaltnode command. 9. Even though there are multiple instances of the same resource criteria for the same package, cmcld still registers only one request with the EMS framework. So, when multiple stop requests are issued, only the first one succeeds while the duplicate ones can't succeed, as they are already unregistered. Resolution: The fix is to not add duplicate entries for the same resource criteria (upper or lower). 10. When cmstopres was issued the second time, for the second instance of the same criteria, the entry was not found in the resource list. This generated an ENOENT error, which failed the cmstopres command, thus failing the haltpkg and then cmhaltnode. Resolution: Duplicate stop requests are no longer considered to be an error, and the cmstopres command returns success. 11. If there are many EMS resources configured in a package, and the resource monitor associated with those EMS resources keeps failing and restarting constantly, it may cause cmcld to call the same set of functions recursively, causing the stack to grow too large and overflow, leading to a cmcld core. Resolution: Fixed the code to iteratively handle the EMS resource failure events. 12. If a ServiceGuard cluster is configured with packages using CVM dgs, then the CVM daemon, vxclustd, is started under the system multinode package, VxVM-CVM-pkg. During cmhaltcl or cmhaltnode, before cluster halting, this package is halted. But if some dgs are active which have been activated outside a package (manually), or some package did not deactivate them properly, then the halt of vxclustd will fail. This will result in a VxVM-CVM-pkg halt failure. As this runs as a system multinode package which is set to NODE_FAIL_FAST_ENABLED, the node will TOC. Resolution: Before the actual halt of VxVM-CVM-pkg, a preshutdown script will be issued against vxclustd to make sure that no dgs are active.
If there are, then appropriate error messages will be logged and the command will fail. If not, then the halt will proceed. The script will be provided by Veritas in VxVM patch PHCO_29600. The script location is /etc/cmcluster/cvm/VxVM-CVM-pkg-preshutdown.sh. If the script does not exist on the system, behaviour defaults to the original behaviour. 13. During the hourly health check, cmcld does realise that the cluster lock disk is returning NOT_READY (EBUSY). But there is no log message which indicates that this kind of problem has been experienced. Actually there is already a message logged at a higher logging level, so if tuning is on then such a message can be seen. Resolution: Log the message indicating the problem at the default level so that it appears in the syslog and the user can correct the problem early. 14. When cmcld is killed while cmgetconf is running, cmgetconf might dump core in a corner case. Resolution: Enhanced the checking mechanism to make sure that an empty data structure is not read. 15. This is not a defect but a designed behavior. Since VxVM-CVM-pkg uses the FAILFAST option for both service and node, once it fails, the node will TOC. Resolution: Logging messages are added to notify the customer when FAILFAST is enabled and that the node can TOC when the package fails, even if it's the only node in the cluster. 16. This is a corner case where the node received a garbage message in which the id of the sender is not valid. This causes the assertion to fail and the node to TOC. Resolution: To make the node resilient against any garbage message it receives, this assertion was removed and the connection will be dropped if this ever happens again. 17. Cluster membership routines in cmcld allowed halting nodes to participate in the election process of forming a new cluster. Resolution: Cluster membership code in cmcld was modified to prevent halting nodes from voting in new elections.
The coordinator was also changed to ask for a new election if a vote is received from a halting node. 18. System Multi Node package gets stuck in starting state. Resolution: Fix pkg_shutdown to delay halting of system-multi-node packages until after the lvm daemon is halted. Also fix cancel to use pm event. 19. Assertion happened during shutdown because a package run or halt came in after part of cmcld was stopped. Resolution: Added synchronization in node shutdown to prevent processing package commands. 20. Test for maximum number of resources was done before it was determined if it was a new resource. Resolution: Moved test to after resource was added. 21. Currently, if a package is in starting state when the final part of a node shutdown is entered, the package is ignored and will stay up, possibly leading to data corruption. Packages can be in this state when a cmmodpkg or cmrunpkg is issued after a cmhaltnode has begun. Resolution: The fix is to quit the shutdown if any package is starting when we begin to shut the node down. PHSS_28851: 1. Due to a defect in the convert utility which converts old binary configuration files of ServiceGuard into a new format, the conversion fails and the cmrunnode command fails. If convert is invoked manually then it will also fail without any error. Resolution: The defect in convert utility is fixed. 2. Node information is often obtained in a different way from package information, so inconsistent data may be displayed. Resolution: Do extra checking to make sure the information displayed is consistent between nodes and packages. 3. This is actually a problem in CMA threads. When nmap is running and trying to connect with cmcld ports, cma_accept is invoked to receive data. This cma_accept in turn will call fstat on the file descriptor passed to it. There are times when fstat returns error numbers that represent transient problems and cma is supposed to handle these errnos appropriately.
However, the problem is that cma_accept exits and the thread terminates abnormally. This leads to cmcld abort. Resolution: A fix is provided from the CMA team and since ServiceGuard is linked with libcma statically, in order to get the fix it is necessary that customers experiencing this problem install this patch, which is already linked with libcma. 4. With the -k option, cmcheckconf/cmapplyconf commands send a list of volume groups (as mentioned in the cluster ascii file) to be verified to all nodes. If a volume is not present on any node then the command treats it as an error and fails. Resolution : Even if the volume groups mentioned in the cluster ascii file are not present on some nodes, cmcheckconf and cmapplyconf will not fail. Rather it will correctly configure them in the cluster as is done without the -k option. 5. ServiceGuard maintains an internal list of volume/disk groups. The SAM GUI uses this list in configuration for LVM volume groups. The commands cmapplyconf/cmcheckconf use this same list for CVM disk groups. Each entry was clearly marked as CVM or LVM. The defect was exposed only when a package was created with SAM GUI and retrieved with cmgetconf. Resolution : Modified cmgetconf, cmcheckconf and cmapplyconf to ignore LVM volume groups in internal list. PHSS_27725: 1. During cluster reconfiguration, if a node trying to join the cluster responds last to the config_com probe, it updates the pnode information for other nodes as EBUSY, which results in the subsequent failures. Resolution: Don't update the configd cache if responding node is in the EBUSY state. 2. The hpmcSGClusterDown trap was never implemented because there was no reliable way for the SG subagent api to send a "cluster down" event because each node halts independently and once cmcld stops running, cmsnmpd can no longer retrieve the cluster status from the SG subagent api. 
Resolution: When cmsnmpd receives an event that the local node has halted or failed, it will locally check the cluster status, set the hpmcClusterStatus mib variable to "down", and send a cluster down trap if the cluster is down. 3. During a reconfiguration, if multiple package commands are issued, only the command issued last will succeed. The rest of them will hang. The timer that was created to retry the package commands was being overwritten by successive commands. So only the command that was issued last succeeds, and the rest of the commands are lost without being replied to, which makes those commands hang. Resolution: Fixed the timer creation code for the commands to create an individual timer for each command, so that all the commands can be retried and replied to correctly. 4. This is an enhancement request. Remote admin requests executed through Object Manager were not tracked through syslog on the cluster node where the operation takes place. Resolution: Enhanced admin code paths in the config library and ServiceGuard daemons to log more information about the user making the request, where the request is issued from, and how it is issued. 5. The manual entry for cmcheckconf incorrectly described cmcheckconf -k as cmquerycl -k. Resolution: Make change so the manual page displays the right description. 6. Due to a coding error, resources that do not exist in the new package configuration are skipped over and not handled. Resolution: When the configuration transaction is being committed, obtain resource information from the delete list compiled during the prepare transaction phase instead of from CDB. 7. A package with FS_MOUNT_RETRY_COUNT set to 1 failed to start even after the script successfully cleaned up the busy mount point and successfully mounted the file system. Resolution: Handle the scenario described above correctly in the script by returning a success to the caller of the freeup_busy_mountpoint_and_mount_fs().
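The mount retry behaviour described in item 7 above can be sketched as follows. This is an illustrative Python model only, not the actual package control script (which is a shell script). The names mount_with_retry, try_mount and free_busy_mountpoint are hypothetical stand-ins for the logic around freeup_busy_mountpoint_and_mount_fs(), and retry_count plays the role of FS_MOUNT_RETRY_COUNT.

```python
def mount_with_retry(try_mount, free_busy_mountpoint, retry_count):
    """Model of a mount with cleanup-and-retry (all names are hypothetical).

    try_mount: callable returning True when the mount succeeds.
    free_busy_mountpoint: callable that clears users of the mount point.
    retry_count: number of cleanup-and-retry attempts allowed
                 (stands in for FS_MOUNT_RETRY_COUNT).
    Returns 0 on success and 1 on failure, like a shell exit status.
    """
    if try_mount():
        return 0
    for _ in range(retry_count):
        free_busy_mountpoint()
        if try_mount():
            # Success after cleanup is reported to the caller as success,
            # even when this was the last permitted retry.
            return 0
    return 1


# Usage: the first mount attempt fails, the attempt after cleanup succeeds.
attempts = iter([False, True])
status = mount_with_retry(lambda: next(attempts), lambda: None, retry_count=1)
print(status)
```

With a retry count of 1, a mount that succeeds after the single cleanup pass is reported as a success, which is the outcome the resolution describes.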
8. During cluster formation, timers may be inadvertently cancelled. Resolution: Correct the code that cancels timers during cluster reformation. 9. The cdb thread and reconfig thread have a potential race condition during online node reconfiguration. Resolution: Rectify the code to remove the race condition. 10. SLVM initialization ioctl is not given for SG, but SLVM shutdown ioctl is issued. Resolution: Do not issue SLVM shutdown ioctl for SG. 11. A message sent from cmmodnet to cmcld contained too little space for the package name. cmcld read off the end of the message, causing the core. Resolution: Increased the size of the message. 12. During a call to connect(), EINTR (interrupted system call) is returned, causing the call to fail. This occurs within local messaging mechanisms, so normal cluster operations are not affected. Resolution: Retry the connect() call if EINTR is returned, since it is just a transient error. 13. If the non-routable probe reaches the config daemon on B, B will try to respond to this probe, but the response will be dropped by the kernel because there is no route to A's private subnet from B. Then when the packet from the public subnet reaches B, the config daemon drops this probe because B already received a probe from A and considers this a duplicate probe. Resolution: Instead of doing a limited udp broadcast, do a subnet directed broadcast, which would be received on all subnets but will be ignored by the driver if the subnet specified in the broadcast does not match the subnet of the interface itself. 14. cmcld could abort (with assertion failure) if it goes for the quorum server and the qs lock is granted, but there is a reconfiguration before the lock granted message arrives. Resolution: Remove the assertion; this state should be allowed. 15. Since package start failed after NFS services had been started, the undo sequence should take care of halting the NFS services.
However, this step is skipped leading to package restart failure since control script tries to start NFS services which are already running. Resolution: Add the step of halting NFS services to undo sequence in package control script to handle the case where error occurs after NFS services started. 16. The mount and umount patches, PHCO_24635 and PHCO_26451 introduced a retry mechanism so mount command could get a correct result from mount table without locking it. This mechanism however still sometimes returns an incorrect list of mounted file systems since retry is not done indefinitely. Since SG relies on result from mount command, there are times when SG is not able to unmount all file systems because of incorrect result it got from mount command. Resolution: Executing mount once instead of CONCURRENT_MOUNT_AND_UMOUNT_OPERATIONS time reduces the chance of mount returning incorrect information, thus reducing the chance of umount_fs missing file systems needed to be unmounted. 17. This is a coding mistake. The option -c was meant to be used to display the use of the busy mount point together with file systems under that mount point. Resolution: Get rid of -C option. The option -c is not used either since in the case there is no file systems under the busy mount point, fuser command will fail, although -c option is just for verbose purpose. 18. This is an enhancement submitted by WTEC to include the quorum server and its parameters in the cmviewconf output. Resolution: Added code to display the quorum server and its properties if one is defined. 19. This defect was introduced by SG 11.13 patch PHSS_27087 and SG 11.14 patch PHSS_27246 and is caused because there is no efficient way to send a PkgDown trap when a coordinator node failed and packages went down without all cluster nodes knowing the previous package statuses. Resolution: Changed cmsnmpd and package manager to replicate package status information on all nodes in the cluster. 
Any package status change from "up" to "down" will be detected by all nodes in the cluster and trigger the appropriate PkgDown trap to be sent with the corresponding node name. 20. The subnet down/up SG api events didn't trigger the cmsnmpd to correctly update the package subnet status mib variable on all running nodes in the cluster. Resolution: cmsnmpd correctly updates the package subnet mib variable on all running nodes and sets the subnet status to "unknown" on all halted nodes. 21. The node halted event doesn't trigger cmsnmpd to set all hpmcNodeRole mibs to "unknown" on the halted node. Resolution: Set all hpmcNodeRole mibs to "unknown" when a node is halted. 22. The problem is a side effect of performance improvements in the shutdown path of cmcld. Because cmcld is shutting down faster, messages cached on the client side may not be delivered to the subagent in time (race condition). Resolution: Make sure the client side delivers the cached messages regardless of whether cmcld is still there or not. If the connection drops (cmcld shutdown), put the local connection (lcomm) into cache flush mode. After the cache is flushed then deliver the drop connection errno. 23. When an Oracle process goes away, it can take up to 10 seconds for cmgmsd to detect the death of a process. After detecting the death of a process, cmgmsd starts group membership reconfiguration to clean up this process and inform the surviving Oracle instances about the new group membership. Oracle cannot start database recovery until it receives the new group membership. Resolution: Customers should adjust the maxfiles parameter to make sure there are enough file descriptors to support the needed number of Oracle foreground processes. The reason for this is as follows: In the current implementation, cmgmsd checks the health of the local group member processes every 10 seconds by looking up the kernel process table.
cmgmsd is changed to keep a socket connection to every local group member process. When the process goes away, cmgmsd can detect immediately that the socket connection is broken because the kernel closes the socket connections of dead processes. This allows cmgmsd to detect dead processes very quickly. The monitoring of cmgmsd registered Oracle processes has been enhanced to use the select call. The result is faster detection of crashed Oracle processes. Since the socket used for cmgmsd client registration is now used for monitoring, socket connections will not be released. So, each cmgmsd client will require one file descriptor for the socket during run time. Clients of cmgmsd can be Oracle foreground and background processes. Foreground processes are needed to serve Oracle DB clients. Depending on how Oracle DBMS is configured, each foreground process can serve one or more DB clients. 24. SAM/GUI code sets the flag indicating that script timeout changes have been made, but then does not check it when deciding whether to update the configuration. Resolution: Fixed SAM/GUI code to check whether the flag indicating changes in script timeouts is set. 25. This problem happens due to the system clock not being updated for a prolonged duration. The ServiceGuard daemon cmcld relies on the system clock to create internal events like sending a heartbeat after each heartbeat interval, etc. But cmcld does respond to external events created by other nodes in the cluster. So if the system clock stops working on a node, then cmcld on that node runs in an unstable manner. This unstable cmcld creates a problem for itself and one of the other nodes in the cluster, which ultimately results in those 2 nodes failing. And if only 1 node remains, then that node will also TOC due to lack of quorum. Resolution: Changes are made to the cmcld daemon so that it can detect a system clock problem. This detection is driven by external events.
If the system clock stops working for a time equal to 2 node timeouts then a warning will be logged into syslog until there have been 5 node timeouts. After that the node will kill itself so that other nodes in the cluster can form a new cluster, excluding the problematic node. If a node TOC should not happen after a short while then an increase in node timeout is required. For more details on this see Special Installation Instructions section. 26. Previously the ServiceGuard subagent would retrieve subnet statuses only if package subnets were configured in the cluster. Changes were made to retrieve all subnet statuses even when no packages or package subnets are configured in the cluster. A defect was exposed in the ServiceGuard subagent api: when no packages are configured, an error code is returned to the subagent. This resulted in the subagent severing the socket connection with cmcld, and not updating or generating any status-related SNMP MIB variables or traps. The work-around for this behavior is to configure at least one package. After a package is configured, the ServiceGuard SNMP subagent, cmsnmpd, should be stopped and restarted using: /sbin/init.d/cmsnmpagt stop /sbin/init.d/cmsnmpagt start Note that neither the package nor the cluster need to be running or restarted. Resolution: The subagent api is modified to return a non error code when no packages are found when requesting for IPv6 or IPv4 subnet statuses. PHSS_27246: 1. When safety time is disabled, a timer is started to simulate safety time protection, but when safety time is re-enabled, the timer is not cancelled and eventually pops, leading to node TOC. Resolution: Change support tool to cancel the timer when enabling safety time. 2. cmviewcl command tried to reference the CDB object that no longer exists after online delete operation. Resolution: While getting object list from CDB, add an extra check to verify that the number of objects retrieved from the CDB are as expected. 3. 
The package control script used the default file system type while mounting the file systems. The mount command spent additional time in determining the file system type required to mount the file system. Also the array variables to provide additional options to the fsck and umount commands didn't exist in the package control script. Resolution: Add variables FS_UMOUNT_OPT, FS_FSCK_OPT and FS_TYPE to package control script template. These additional variables can be used as described in the comment section of the package control script. 4. Due to the support of online hotswap LAN cards and APA product, ServiceGuard's network manager inefficiently checks for change of MAC address of each LAN card on a regular basis. This check consumes a lot of CPU power, and the problem becomes apparent when there are many LAN cards configured in the cluster node where cmcld is running. Resolution: Redesign the checking mechanism to be efficient so that it will not take a lot of system CPU power while keeping the supported features intact. 5. Network connections (heartbeat and general service) are not reestablished when the physical network is restored until cluster reformation time. Connections are not cleaned up fast enough when the physical network goes down. This defect was originally root caused in JAG ad94082. A quick fix was put into PHSS_25499. That fix has been backed out. This is the complete fix for that problem. Resolution: Add the 'rcomm health monitor' to monitor health of connections. Reestablish responding connections, disconnect non-responding connections. 6. The parser which reads and parses the package ascii file was looking for the keyword "AND". If a token was not "AND" or "and", it was ignored. Resolution: The package ascii parser was modified to print an error when the token after the first up value is not "AND". 7. SAM/GUI did not use a proper routine to look up EMS resources and therefore could not go beyond "/".
Resolution: The package configuration code was modified to properly traverse the EMS hierarchy. 8. When cmsnmpd tries to determine if a package status has changed when a node failure causes a cluster reconfig, the valid/invalid bit is never checked and the package's local flags aren't updated. Resolution: Changed cmsnmpd to update a package's local flags and to identify a package status change when a package status changes from invalid to valid after a node failure or cluster reconfig. 9. Resource monitor requests are not unregistered with EMS when cmcld exits from a cmrunnode/cmruncl time-out, so the next time cmcld starts up and registers the same requests with EMS, it will not get immediate notifications regarding the state of the resources, and a package will not be able to start on that node. Resolution: Unregister resource monitor requests before cmcld exits from a cmrunnode/cmruncl time-out. 10. In ServiceGuard release 11.12 and later, a probing mechanism is added when only the -P option is used without the -C option. This was mainly done to validate CVM disk groups. This probing can take a long time, particularly on systems having a large number of disks and/or volume groups. Resolution: Don't do probing if only the -P option is specified without the -C option. 11. ServiceGuard commands cmcheckconf/cmapplyconf, even with the -k option, open all volume groups found in the lvmtab file. No matter how many volume groups are mentioned in the cluster ascii file, all volume groups will be probed. Resolution: When the -k option is specified, probe the volume groups which are mentioned in the cluster ascii file and skip the rest of the volume groups found in the lvmtab file. 12. This problem happens when the customer tries to modify the bridged net configuration. If the cluster has an existing binary configuration, cmcheckconf/cmapplyconf are supposed to update the binary configuration accordingly.
However, these commands fail to do so, and only when cmruncl/cmrunnode do their own network probing does ServiceGuard realize the bridged net configuration has been changed. At this time, cmcld goes through the list of network cards it found, compares it with what exists in the binary configuration generated by cmapplyconf, and cannot find a match, hence the assertion failure.
Resolution: Made a change so cmcheckconf/cmapplyconf update the binary configuration correctly.
13. We are not properly blocking signals in the Service Assistant Daemon.
Resolution: We should only unblock signals before entering the select call, and they should be blocked at all other times.
14. This is actually not a defect. When VxVM is initialized, it stores the hostname as a variable called 'hostid'. The package control script uses both this hostid and the value of the hostname command. As a result, this hostid and hostname should always match, which means that whenever the hostname is modified, hostid should be updated accordingly.
Resolution: Added a comment in the package control script specifying that hostid needs to be changed, using the vxdctl command, if the hostname is changed.
15. The function used to free up a mount point was not called with the right option.
Resolution: Added the -c option to the fuser function call so all files beneath the busy mount point would be displayed and all the processes using the files would receive SIGKILL.
PHSS_26056:
1. cmviewconf checks the wrong variable when determining what value to display for HALT_SCRIPT_TIMEOUT.
Resolution: Modify cmviewconf to check the correct variable when determining what value to display for HALT_SCRIPT_TIMEOUT.
2. While adding package information to the cluster database on ServiceGuard version 11.13, cmapplyconf tries to get rid of unused EMS resources. There is a coding error there which leads to the command failure.
Resolution: The coding error is fixed and the correct routine is now called to remove the resources completely from the cluster database.
3. Multiple commands create multiple transactions in the queue. When one of the commands is aborted, the corresponding transaction is also aborted. A lock is released and a pointer is moved to the next transaction. As the lock is released, another thread may come and delete the next transaction thinking that it has been aborted. Later, when that deleted transaction is referenced, cmcld dumps core with SIGBUS or SIGSEGV.
Resolution: The fix is to always go back to the first transaction when a transaction is aborted and destroy the transaction. Also make sure that no transaction pointers are held while the lock is released. Instead, a re-lookup will be done to find the correct transaction.
4. There is an invalid assertion in the code that checks that all nodes are in a legal state corresponding to the reply message received from a node. It is asserted that a state of NO_TRANS is not legal when it is.
Resolution: The fix was to change the code so that NO_TRANS is considered a legal state at this point.
5. Error messages describing local LAN failover failures are not logged to syslog in a production environment.
Resolution: Make a change such that the error messages are logged to syslog in a production environment.
6. This is an enhancement request to make the control script messages easier to read.
Resolution: The package control script template is updated to use the -p option during fsck on the Journal File Systems.
7. The problem occurs due to SAM GUI code not properly going through necessary checks.
Resolution: A fix has been implemented to properly transmit code checks.
8. The ServiceGuard unsolicited ARP reply broadcasts are not sent in rapid intervals after a local switch. This causes a delay in receiving the ARP reply from network devices.
Resolution: The unsolicited ARP replies are now sent every second during the first ten broadcasts, and then the interval starts increasing exponentially.
9. The package start or halt notification message may fail to send due to connection abort, but an unexpected reply message is received while the data associated with the reply has been cleaned up. Thus a segmentation violation occurs.
Resolution: Do not send the reply message if the request message has failed due to connection abort.
10. The cmcheckconf/cmapplyconf commands try to contact the EMS resource registrar on the remote node where the package is supposed to run and the resource is available. But due to a linking problem with the EMS toolkit, the command ends up talking to the local node where the resource is unavailable.
Resolution: Corrected the linking problem of the commands with the EMS toolkit.
11. The cmrunnode command collects cluster configuration information from all nodes and copies the latest one before starting the cluster. But sometimes during system startup, when all systems are starting, the cmrunnode command can fail to collect the cluster configuration information, which can result in failure of cluster formation.
Resolution: A fix is added to make sure that the cmrunnode command collects the correct cluster configuration version, and if unable to do so, it fails with an error. The startup script will then retry the command for 10 minutes and, if still unsuccessful, it will then give up.
12. ServiceGuard does not retry registering a resource if the EMS monitor returns RM_NOT_READY.
Resolution: When an EMS monitor returns RM_NOT_READY, keep on trying to register the resource until the monitor returns RM_ACCEPT.
13. The control script for a package configured with a deferred start resource may complete successfully before the resource is actually registered with EMS. When the ServiceGuard daemon checks to make sure the resource is monitored and up, the check will fail and the package will be halted.
Resolution: Changed the cmstartres command used in the package control script to not complete until the resource is registered with EMS.
14. The cmsetlog command does not accept "RES" as a valid module.
Resolution: Fixed cmsetlog to accept "RES" as a valid module and turn up logging in the resource module appropriately.
15. A coding error in the ServiceGuard daemon shutdown sequence makes it possible for the daemon to interpret the unregistering of resource monitor requests to be unsuccessful even in a successful case. On a ServiceGuard OPS node, the shutdown sequence must be clean, otherwise the node will be rebooted (TOC).
Resolution: Fixed the coding error such that the success of the unregistering operation is determined correctly.
16. There has been some confusion as to whether this feature is supported. The ServiceGuard lab is now making an official announcement that it is. Please note that this feature only works through cmapplyconf on the command line. Changes are being made to SAM GUI to allow both online and offline resource addition (JAGae16264).
17. This is actually a DLPI bug. The DLS provider somehow returns dl_errno as 1, which means bad address, for a temporary resource shortage. It should return dl_errno as 4 with unix_errno as ENOBUFS or ENOSR instead so ServiceGuard could handle this transient problem accordingly.
Resolution: A DLPI patch will be released to fix this problem. The workaround in ServiceGuard is to abort only if dl_errno 1 is received too frequently over a relatively long period of time, which indicates a permanent, serious problem. Otherwise, the problem is transient and will be ignored.
18. Not a defect.
19. cmGetstatus() incorrectly uses the number of nodes the package can run on instead of the number of nodes in the cluster when validating that each node the package is configured to run on is actually in the cluster. The only workaround is to configure each package to run on only the first P nodes in the cluster.
Resolution: cmGetstatus() was changed to use the number of nodes in the cluster when validating each node that the package can run on.
20. If sending a message to cmsrvassistd fails, the service sensor can free a pointer twice, resulting in a segmentation violation.
Resolution: Changed the service sensor to only free reply messages when the send succeeded.
21. When a package is to be started, the coordinator node sends a "start request" to the node that's supposed to run the package. If the reply to that message indicates a problem due to an upcoming cluster formation, then the state of that package was being reset to "not busy". If another event arrives before the cluster formation (e.g. a resource becomes available on the coordinator node), then the package may be started. However, the original message could have made it to the node that was supposed to run the package, so the package could be run on both nodes.
Resolution: Keep the package state "busy" so nothing else will happen to the package until the cluster formation event arrives.
22. During cluster startup, on each node the Package Manager thread queues up events on the EMS thread to register resources with EMS without waiting for the events to complete. The EMS thread on one node may get CPU time earlier than the EMS thread on another node, so at the time nodes are evaluated for package ownership, resources for a certain package may be registered on an adoptive node but not on the primary node.
Resolution: Make the Package Manager thread wait for the EMS thread to finish registering resources with EMS before carrying on with other initialization operations.
23. This defect was originally addressed in JAGad93682, the fix for which was included in PHSS_25124.
We are now backing out the fix for JAGad93682 (when registering a monitor request, send START/STOP/START to unregister any lingering request and register a fresh request) and implementing a more correct solution, namely making sure that all monitor requests are unregistered with EMS before cmcld exits.
Resolution: Make the main daemon thread wait until the EMS thread finishes unregistering all monitor requests before deleting the EMS thread and exiting.
24. cmsnmpd may hold stale cluster data that prevents it from updating the MIB correctly. The workaround is to restart cmsnmpd.
Resolution: Refresh the cluster data held by cmsnmpd every time a configuration change occurs.
25. The PACKAGE environment variable should never be explicitly set in the package control script, since it is obtained from ServiceGuard when the script is executed. However, in case the user sets it unknowingly, more intuitive error messages should be logged.
Resolution: Log the following message to syslog when the PACKAGE environment variable specifies a package name that cannot be found in the configuration: cmcld: Unable to lookup package . Documentation in the package control script has also been enhanced to warn the user not to set the PACKAGE environment variable.
26. When a query is sent it includes an id. The ids are unique within a node but not within a cluster. This can cause the receiver to believe it has already sent a reply to a specific query even though the query is really from another node.
Resolution: Include the sender's node name as part of the query id.
27. The node_id is not changed after the cluster is reconfigured. So when a node with a node ID that is not the first or last node ID in the cluster is removed from the cluster, there will be a free slot in the node_id list, and then cmviewcl will not be able to get the node name for the removed node_id.
Resolution: Continue to check the next node_id instead of reporting this error.
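The query id collision described in item 26 can be illustrated with a minimal Python sketch. This is not ServiceGuard code: the names ReplyTracker, should_reply, sender and query_id are hypothetical and exist only to show why an id that is unique per node has to be paired with the sender's node name before it is used to detect duplicates.

```python
# Hypothetical sketch: none of these names come from ServiceGuard.

class ReplyTracker:
    """Remembers which queries have already been answered."""

    def __init__(self, include_sender):
        # include_sender=False models the defect: the key is the bare id.
        # include_sender=True models the fix: the key is (sender, id).
        self.include_sender = include_sender
        self.answered = set()

    def _key(self, sender, query_id):
        return (sender, query_id) if self.include_sender else query_id

    def should_reply(self, sender, query_id):
        """Return True the first time a given query is seen."""
        key = self._key(sender, query_id)
        if key in self.answered:
            return False  # treated as a duplicate, so no reply is sent
        self.answered.add(key)
        return True


def replies_sent(tracker, queries):
    """Count how many of the (sender, query_id) pairs get a reply."""
    return sum(tracker.should_reply(s, q) for s, q in queries)


# Two different nodes happen to use the same locally unique id.
queries = [("nodeA", 7), ("nodeB", 7)]

print(replies_sent(ReplyTracker(include_sender=False), queries))  # 1
print(replies_sent(ReplyTracker(include_sender=True), queries))   # 2
```

With the bare id as the key, the second node's query is mistaken for a repeat of the first and goes unanswered. With the composite key, both queries are answered.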
28. The problem happens when one node of a cluster hangs, causing a cluster reformation, and then returns immediately before the cluster reformation completes (late vote). If the cluster reformation is in the last phase when the hung node returns and votes, the coordinator must determine if it will accept the node back into the election. There is a small window during which this determination is done incorrectly.
Resolution: The fix is to accurately determine whether the hung node should be accepted back into the election. This prevents the election from being restarted and both nodes TOCing when the safety timer expires. In some cases, the hung node will be allowed back in, and in other cases it will TOC.
29. The cmhaltnode command halts packages first. If other nodes in the cluster reboot or halt while the packages are being halted, the cluster communications for halting the package may get disconnected, resulting in an error, ENOTCONN. This error causes the cmhaltnode command to exit without halting the cluster.
Resolution: If an ENOTCONN error is generated before completing the cmhaltnode command, the command will now handle this and will retry halting cluster services, but this time the rebooted or halted node will not be used for the cluster communications for the package being halted.
30. At the network probing phase, ServiceGuard tries to bind to network interfaces of unsupported type.
Resolution: Check for and skip LAN cards of unsupported type.
31. The SG design assumed that once a node votes late and gets deferred, it is no longer heartbeating with the coordinator. It turns out that, although rarely, such a node can still be heartbeating.
Resolution: At election timeout, drop any node that's hb_eligible but did not send a vote.
32. A cmapplyconf command is started, but goes away immediately. The proxy server does not know this because the proxy server did not check for a bind failure on the command's lcomm port.
The proxy server believes the command is there, so it starts a transaction (acquires the config lock) and waits for the transaction to start. The proxy will never know that the command is already gone. Subsequent applyconfs will fail since the config lock is already held.
Resolution: Make sure the transaction is not started until after the bind has completed successfully. If the command goes away after the bind has completed, the transaction will be cleaned up.
33. The fix submitted for JAGad68565 in PHSS_24678 (SG11.13) and PHSS_24536 (SG11.09) to initialize all Emanate Cluster related variables to empty strings when the cluster or local node wasn't running caused resls to show the cluster name as an empty string if the local node is halted.
Resolution: Change was made to initialize all status variables to empty strings when cmsnmpd first starts, independent of whether the cluster or local node is up or down.
34. cmGetstatus() incorrectly uses the number of nodes the package can run on instead of the number of nodes in the cluster when validating that each node the package is configured to run on is actually in the cluster. The only workaround is to configure each package to run on only the first P nodes in the cluster.
Resolution: cmGetstatus() was changed to use the number of nodes in the cluster when validating each node that the package can run on.
Enhancement: No (superseded patches contained enhancements)
PHSS_28849: This patch contains changes to accommodate a new infrastructure for testing. As a result the size of this patch is larger than the previous patch in this patch line.
PHSS_27725: This patch delivers new functionality for logging the audit messages into the syslog.log for the admin operations. This patch delivers new functionality for including the quorum server and its parameters in the cmviewconf command.
PHSS_27246: This patch delivers new functionality for package control script to do parallel fsck and umount.
For this purpose variables FS_UMOUNT_OPT FS_FSCK_OPT and FS_TYPE are added to package control script. The comment section of package control script describes use of these variables. SR: 8606231688 8606233054 8606230826 8606229591 8606232772 8606238968 8606227696 8606242547 8606250049 8606229966 8606236658 8606232561 8606241953 8606242718 8606237295 8606229495 8606245169 8606248970 8606234353 8606237504 8606245185 8606232614 8606246814 8606251394 8606231669 8606248834 8606251434 8606251633 8606248845 8606254001 8606254986 8606247612 8606249052 8606244410 8606257766 8606251320 8606245185 8606254920 8606259876 8606256716 8606260131 8606262131 8606233259 8606244305 8606260489 8606249878 8606264328 8606268205 8606261124 8606247648 8606258432 8606242547 8606256106 8606256331 8606208266 8606260426 8606264135 8606214892 8606261781 8606255339 8606272001 8606280988 8606271637 8606278861 8606278820 8606269861 8606282343 8606283230 8606267626 8606280203 8606269037 8606281709 8606281543 8606269292 8606287005 8606284273 8606289077 8606283370 8606287690 8606284468 8606297528 8606304286 8606294079 8606280420 8606308755 8606287685 8606287688 8606307146 8606304271 8606307795 8606316216 8606302528 8606304014 8606309796 8606298749 8606296542 8606298565 8606296320 8606307156 8606321422 8606308105 8606306620 8606321306 8606319826 8606290226 8606285726 8606293511 8606299079 8606321527 8606301298 8606325992 8606320321 8606329554 8606259371 8606318010 8606323190 8606327010 8606310984 8606326802 8606330137 8606322722 8606329533 8606331371 8606331747 8606331750 8606333423 8606331371 8606335370 8606330279 8606340496 8606343063 8606343072 8606339173 8606351564 8606299501 8606319400 8606352983 8606312342 8606304114 8606316560 8606323570 8606345054 8606358660 8606357053 8606355632 8606343588 8606364598 8606363789 8606365756 8606370366 Patch Files: DLM-Pkg-Mgr.CM-PKG,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: Package-Manager.CM-PKG,fr=A.11.14,fa=HP-UX_B.11.00_32/64, v=HP: /usr/lbin/cm/C/CMpack.ou 
/usr/lbin/cm/C/CMpackadmin.ui /usr/lbin/cm/C/CMpackconf.ui /usr/lbin/cm/C/pkgconfig.xpm /usr/lib/libcmpkg.1 /usr/sbin/cmhaltpkg /usr/sbin/cmhaltserv /usr/sbin/cmmakepkg /usr/sbin/cmmigrate /usr/sbin/cmmodnet /usr/sbin/cmmodpkg /usr/sbin/cmrunpkg /usr/sbin/cmrunserv /usr/sbin/cmstartres /usr/sbin/cmstopres DLM-Pkg-Mgr.CM-PKG-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64, v=HP: Package-Manager.CM-PKG-MAN,fr=A.11.14, fa=HP-UX_B.11.00_32/64,v=HP: /usr/share/man/man1m.Z/cmhaltpkg.1m /usr/share/man/man1m.Z/cmhaltserv.1m /usr/share/man/man1m.Z/cmmakepkg.1m /usr/share/man/man1m.Z/cmmigrate.1m /usr/share/man/man1m.Z/cmmodnet.1m /usr/share/man/man1m.Z/cmmodpkg.1m /usr/share/man/man1m.Z/cmrunpkg.1m /usr/share/man/man1m.Z/cmrunserv.1m /usr/share/man/man1m.Z/cmstartres.1m /usr/share/man/man1m.Z/cmstopres.1m DLM-ATS-Core.ATS-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: ATS-CORE.ATS-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: /usr/share/man/man1m.Z/stapplyconf.1m /usr/share/man/man1m.Z/stcheckconf.1m /usr/share/man/man1m.Z/stdeleteconf.1m /usr/share/man/man1m.Z/stgetconf.1m /usr/share/man/man1m.Z/stquerycl.1m /usr/share/man/man1m.Z/streclaim.1m /usr/share/man/man1m.Z/stviewcl.1m /usr/share/man/man4.Z/atsconf.4 /usr/share/man/man5.Z/ats.5 DLM-ATS-Core.ATS-RUN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: ATS-CORE.ATS-RUN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: /etc/cmcluster/sharedtape/ats_tapelibs /usr/lbin/cmtaped /usr/lib/nls/msg/C/ats.cat /usr/sbin/stapplyconf /usr/sbin/stcheckconf /usr/sbin/stdeleteconf /usr/sbin/stdisplay /usr/sbin/stgetconf /usr/sbin/stquerycl /usr/sbin/streclaim /usr/sbin/stsetlog /usr/sbin/stviewcl DLM-NMAPI.CM-NMAPI,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: /opt/nmapi/nmapi2/lib/libnmapi2.1 /opt/nmapi/nmapi2/lib/libnmapi2.sl /opt/nmapi/nmapi2/lib/pa20_64/libnmapi2.1 /opt/nmapi/nmapi2/lib/pa20_64/libnmapi2.sl /usr/contrib/bin/gmsetlog /usr/lbin/cmgmsd /usr/lib/libcmdlm.1 /usr/lib/libcmdlm.dlm.1 DLM-Clust-Mon.CM-CORE,fr=A.11.14,fa=HP-UX_B.11.00_32/64, v=HP: 
Cluster-Monitor.CM-CORE,fr=A.11.14,fa=HP-UX_B.11.00_32/64, v=HP: /etc/cmcluster.conf /sbin/init.d/cmcluster /usr/contrib/bin/cmsetlog /usr/contrib/bin/cmsetsafety /usr/contrib/bin/get_sn /usr/contrib/bin/sscnfmtr /usr/lbin/cm/C/CMcore.ou /usr/lbin/cm/C/CMcoreadmin.ui /usr/lbin/cm/C/CMcoreconf.ui /usr/lbin/cm/C/clconfig.xpm /usr/lbin/cm/C/cmcluster.xpm /usr/lbin/cmclconfd /usr/lbin/cmcld /usr/lbin/cmlogd /usr/lbin/cmlvmd /usr/lbin/cmsnmpd /usr/lbin/cmsrvassistd /usr/lbin/cmui /usr/lib/libcmcore.1 /usr/lib/libcmcore.sl /usr/lib/libcmdlm.sl /usr/lib/libcmpkg.sl /usr/lib/libcmres.1 /usr/lib/libcmres.sl /usr/lib/libsgcl.2 /usr/lib/libsgcl.sl /usr/newconfig/usr/lib/libcmdlm.1 /usr/newconfig/usr/lib/libcmpkg.1 /usr/obam/lib/help/C/cm/cm.hv /usr/obam/lib/help/C/cm/cm.hvk /usr/obam/lib/help/C/cm/cm00.ht /usr/obam/lib/help/C/cm/cm01.ht /usr/obam/lib/help/C/cm/cm02.ht /usr/obam/lib/help/C/cm/cm03.ht /usr/sbin/cmapplyconf /usr/sbin/cmcheckconf /usr/sbin/cmdeleteconf /usr/sbin/cmgetconf /usr/sbin/cmhaltcl /usr/sbin/cmhaltnode /usr/sbin/cmquerycl /usr/sbin/cmruncl /usr/sbin/cmrunnode /usr/sbin/cmscancl /usr/sbin/cmviewcl /usr/sbin/cmviewconf /usr/sbin/convert DLM-Clust-Mon.CM-CORE-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64, v=HP: Cluster-Monitor.CM-CORE-MAN,fr=A.11.14, fa=HP-UX_B.11.00_32/64,v=HP: /usr/share/man/man1m.Z/cmapplyconf.1m /usr/share/man/man1m.Z/cmcheckconf.1m /usr/share/man/man1m.Z/cmdeleteconf.1m /usr/share/man/man1m.Z/cmgetconf.1m /usr/share/man/man1m.Z/cmhaltcl.1m /usr/share/man/man1m.Z/cmhaltnode.1m /usr/share/man/man1m.Z/cmquerycl.1m /usr/share/man/man1m.Z/cmruncl.1m /usr/share/man/man1m.Z/cmrunnode.1m /usr/share/man/man1m.Z/cmscancl.1m /usr/share/man/man1m.Z/cmsnmpd.1m /usr/share/man/man1m.Z/cmviewcl.1m /usr/share/man/man1m.Z/cmviewconf.1m /usr/share/man/man5.Z/cm.5 what(1) Output: DLM-Pkg-Mgr.CM-PKG,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: /usr/lbin/cm/C/CMpack.ou: RCS $Header: CMpack.ou,v 82.2 98/10/19 19:13:55 ssa Exp $ /usr/lbin/cm/C/CMpackadmin.ui: $Revision: 
82.2 $ /usr/lbin/cm/C/CMpackconf.ui: $Revision: 82.2 $ /usr/lbin/cm/C/pkgconfig.xpm: $Revision: 82.2 $ /usr/lib/libcmpkg.1: MC/ServiceGuard Product $Revision: 82.2 $ Build date: Fri Jul 2 13:35:28 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux /usr/sbin/cmhaltpkg: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015 /usr/sbin/cmhaltserv: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015 /usr/sbin/cmmakepkg: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015 /usr/sbin/cmmigrate: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015 /usr/sbin/cmmodnet: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015 /usr/sbin/cmmodpkg: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: 
ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015 /usr/sbin/cmrunpkg: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015 /usr/sbin/cmrunserv: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015 /usr/sbin/cmstartres: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015 /usr/sbin/cmstopres: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015 DLM-Pkg-Mgr.CM-PKG-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64, v=HP: /usr/share/man/man1m.Z/cmhaltpkg.1m: None /usr/share/man/man1m.Z/cmhaltserv.1m: None /usr/share/man/man1m.Z/cmmakepkg.1m: None /usr/share/man/man1m.Z/cmmigrate.1m: None /usr/share/man/man1m.Z/cmmodnet.1m: None /usr/share/man/man1m.Z/cmmodpkg.1m: None /usr/share/man/man1m.Z/cmrunpkg.1m: None /usr/share/man/man1m.Z/cmrunserv.1m: None /usr/share/man/man1m.Z/cmstartres.1m: None /usr/share/man/man1m.Z/cmstopres.1m: None DLM-ATS-Core.ATS-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: 
/usr/share/man/man1m.Z/stapplyconf.1m: None /usr/share/man/man1m.Z/stcheckconf.1m: None /usr/share/man/man1m.Z/stdeleteconf.1m: None /usr/share/man/man1m.Z/stgetconf.1m: None /usr/share/man/man1m.Z/stquerycl.1m: None /usr/share/man/man1m.Z/streclaim.1m: None /usr/share/man/man1m.Z/stviewcl.1m: None /usr/share/man/man4.Z/atsconf.4: None /usr/share/man/man5.Z/ats.5: None DLM-ATS-Core.ATS-RUN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: /etc/cmcluster/sharedtape/ats_tapelibs: Advanced Tape Services A.11.09 /usr/lbin/cmtaped: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:44:13 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ /usr/lib/nls/msg/C/ats.cat: None /usr/sbin/stapplyconf: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:57:38 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ /usr/sbin/stcheckconf: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:57:38 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ /usr/sbin/stdeleteconf: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:57:38 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ /usr/sbin/stdisplay: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:44:19 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build 
platform: hpux /usr/sbin/stgetconf: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:57:38 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ /usr/sbin/stquerycl: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:57:38 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ /usr/sbin/streclaim: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:57:38 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ /usr/sbin/stsetlog: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:57:38 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ /usr/sbin/stviewcl: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:57:38 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ DLM-NMAPI.CM-NMAPI,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: /opt/nmapi/nmapi2/lib/libnmapi2.1: A.11.14 Date: 07/02/04 Patch: PHSS_31015 Build date: Fri Jul 2 13:50:34 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux /opt/nmapi/nmapi2/lib/libnmapi2.sl: A.11.14 Date: 07/02/04 Patch: PHSS_31015 Build date: Fri Jul 2 13:50:34 PDT 2004 Build id: ibld_sgops_a1114patch_makefile 
Build platform: hpux /opt/nmapi/nmapi2/lib/pa20_64/libnmapi2.1: Build date: Fri Jul 2 13:51:53 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux - 64 bit A.11.14 Date: 07/02/04 Patch: PHSS_31015 /opt/nmapi/nmapi2/lib/pa20_64/libnmapi2.sl: Build date: Fri Jul 2 13:51:53 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux - 64 bit A.11.14 Date: 07/02/04 Patch: PHSS_31015 /usr/contrib/bin/gmsetlog: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:48:12 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 /usr/lbin/cmgmsd: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:46:49 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 /usr/lib/libcmdlm.1: Build date: Fri Jul 2 13:35:33 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux /usr/lib/libcmdlm.dlm.1: Build date: Fri Jul 2 13:35:33 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux DLM-Clust-Mon.CM-CORE,fr=A.11.14,fa=HP-UX_B.11.00_32/64, v=HP: /etc/cmcluster.conf: None /sbin/init.d/cmcluster: $Revision: 82.2 $ /usr/contrib/bin/cmsetlog: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:43:09 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ /usr/contrib/bin/cmsetsafety: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:43:09 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ /usr/contrib/bin/get_sn: get_sn Revision 1.7 Build date: Fri Jul 2 13:44:43 PDT 2004 
Build id: ibld_sgops_a1114patch_makefile Build platform: hpux /usr/contrib/bin/sscnfmtr: sscnfmtr Revision 1.1 Build date: Fri Jul 2 13:44:40 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux /usr/lbin/cm/C/CMcore.ou: None /usr/lbin/cm/C/CMcoreadmin.ui: $Revision: 82.2 $ /usr/lbin/cm/C/CMcoreconf.ui: $Revision: 82.2 $ /usr/lbin/cm/C/clconfig.xpm: $Revision: 82.2 $ /usr/lbin/cm/C/cmcluster.xpm: $Revision: 82.2 $ /usr/lbin/cmclconfd: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:39:14 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Config Daemon Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015 /usr/lbin/cmcld: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:42:27 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Daemon Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ NET: Version: B.11.00 $Date: 97/10/15 10:44:23 $ plumb.c + JAGae66196 Testing $Revision: 1.2.119.8 $ $Date: 97/07/21 09:54:39 $ /usr/lbin/cmlogd: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:42:37 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Log Daemon /usr/lbin/cmlvmd: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:44:55 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ /usr/lbin/cmsnmpd: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $R evision: 74.03 $ Build date: Fri Jul 2 13:43:57 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Copyright 1992-1996 SNMP Research, 
Incorporated SNMP Research Distribution version 14.0.0.0 A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $
/usr/lbin/cmsrvassistd: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $ Build date: Fri Jul 2 13:43:17 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015
/usr/lbin/cmui: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $ Build date: Fri Jul 2 13:43:27 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $
/usr/lib/libcmcore.1: Cluster Monitor Product $Revision: 82.2 $ Build date: Fri Jul 2 13:35:21 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux
/usr/lib/libcmcore.sl: Cluster Monitor Product $Revision: 82.2 $ Build date: Fri Jul 2 13:35:21 PDT 2004 Build id: ibld_sgops_a1114patch_makefile
Build platform: hpux
/usr/lib/libcmdlm.sl: Build date: Fri Jul 2 13:35:33 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux
/usr/lib/libcmpkg.sl: MC/ServiceGuard Product $Revision: 82.2 $ Build date: Fri Jul 2 13:35:28 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux
/usr/lib/libcmres.1: Build date: Fri Jul 2 13:45:04 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux MC/ServiceGuard Resource Lib $Revision: 82.2 $
/usr/lib/libcmres.sl: Build date: Fri Jul 2 13:45:04 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux MC/ServiceGuard Resource Lib $Revision: 82.2 $
/usr/lib/libsgcl.2: Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ Build date: Fri Jul 2 13:48:04 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux
/usr/lib/libsgcl.sl: Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ Build date: Fri Jul 2 13:48:04 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux
/usr/newconfig/usr/lib/libcmdlm.1: Build date: Fri Jul 2 13:35:31 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux
/usr/newconfig/usr/lib/libcmpkg.1: MC/ServiceGuard Product $Revision: 82.2 $ Build date: Fri Jul 2 13:35:25 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux
/usr/obam/lib/help/C/cm/cm.hv: None
/usr/obam/lib/help/C/cm/cm.hvk: None
/usr/obam/lib/help/C/cm/cm00.ht: None
/usr/obam/lib/help/C/cm/cm01.ht: None
/usr/obam/lib/help/C/cm/cm02.ht: None
/usr/obam/lib/help/C/cm/cm03.ht: None
/usr/sbin/cmapplyconf: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015
/usr/sbin/cmcheckconf: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O)
$Revision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015
/usr/sbin/cmdeleteconf: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015
/usr/sbin/cmgetconf: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015
/usr/sbin/cmhaltcl: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015
/usr/sbin/cmhaltnode: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015
/usr/sbin/cmquerycl: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015
/usr/sbin/cmruncl: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id:
ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015
/usr/sbin/cmrunnode: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015
/usr/sbin/cmscancl: None
/usr/sbin/cmviewcl: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $ Build date: Fri Jul 2 13:42:59 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015
/usr/sbin/cmviewconf: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $ Build date: Fri Jul 2 13:44:22 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux A.11.14 Date: 07/02/04 Patch: PHSS_31015 Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $
/usr/sbin/convert: HP92453-02A.10.20 HP-UX SYMBOLIC DEBUGGER (END.O) $Revision: 74.03 $ Build date: Fri Jul 2 13:44:33 PDT 2004 Build id: ibld_sgops_a1114patch_makefile Build platform: hpux Cluster Monitor Product $Revision: 82.2 $ MC/ServiceGuard Product $Revision: 82.2 $ A.11.14 Date: 07/02/04 Patch: PHSS_31015
DLM-Clust-Mon.CM-CORE-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP:
/usr/share/man/man1m.Z/cmapplyconf.1m: None
/usr/share/man/man1m.Z/cmcheckconf.1m: None
/usr/share/man/man1m.Z/cmdeleteconf.1m: None
/usr/share/man/man1m.Z/cmgetconf.1m: None
/usr/share/man/man1m.Z/cmhaltcl.1m: None
/usr/share/man/man1m.Z/cmhaltnode.1m: None
/usr/share/man/man1m.Z/cmquerycl.1m: None
/usr/share/man/man1m.Z/cmruncl.1m: None
/usr/share/man/man1m.Z/cmrunnode.1m: None
/usr/share/man/man1m.Z/cmscancl.1m: None
/usr/share/man/man1m.Z/cmsnmpd.1m:
None /usr/share/man/man1m.Z/cmviewcl.1m: None /usr/share/man/man1m.Z/cmviewconf.1m: None /usr/share/man/man5.Z/cm.5: None cksum(1) Output: DLM-Pkg-Mgr.CM-PKG,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: 931382500 622 /usr/lbin/cm/C/CMpack.ou 1779608406 65684 /usr/lbin/cm/C/CMpackadmin.ui 3998340222 65789 /usr/lbin/cm/C/CMpackconf.ui 2580877693 3083 /usr/lbin/cm/C/pkgconfig.xpm 342853724 12288 /usr/lib/libcmpkg.1 1378783710 2815696 /usr/sbin/cmhaltpkg 1378783710 2815696 /usr/sbin/cmhaltserv 1378783710 2815696 /usr/sbin/cmmakepkg 1378783710 2815696 /usr/sbin/cmmigrate 1378783710 2815696 /usr/sbin/cmmodnet 1378783710 2815696 /usr/sbin/cmmodpkg 1378783710 2815696 /usr/sbin/cmrunpkg 1378783710 2815696 /usr/sbin/cmrunserv 1378783710 2815696 /usr/sbin/cmstartres 1378783710 2815696 /usr/sbin/cmstopres DLM-Pkg-Mgr.CM-PKG-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64, v=HP: 3322328795 2051 /usr/share/man/man1m.Z/cmhaltpkg.1m 4246628612 1711 /usr/share/man/man1m.Z/cmhaltserv.1m 2682816919 5917 /usr/share/man/man1m.Z/cmmakepkg.1m 4080129418 2486 /usr/share/man/man1m.Z/cmmigrate.1m 2633729816 1743 /usr/share/man/man1m.Z/cmmodnet.1m 940193978 4223 /usr/share/man/man1m.Z/cmmodpkg.1m 3141703228 2275 /usr/share/man/man1m.Z/cmrunpkg.1m 2657909477 2304 /usr/share/man/man1m.Z/cmrunserv.1m 3351545495 1653 /usr/share/man/man1m.Z/cmstartres.1m 4266023591 1590 /usr/share/man/man1m.Z/cmstopres.1m DLM-ATS-Core.ATS-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: 777275294 1552 /usr/share/man/man1m.Z/stapplyconf.1m 3833761248 1246 /usr/share/man/man1m.Z/stcheckconf.1m 1836821640 1475 /usr/share/man/man1m.Z/stdeleteconf.1m 546650879 1151 /usr/share/man/man1m.Z/stgetconf.1m 3734418992 1657 /usr/share/man/man1m.Z/stquerycl.1m 3206086086 1096 /usr/share/man/man1m.Z/streclaim.1m 3100984753 1447 /usr/share/man/man1m.Z/stviewcl.1m 3727882504 2582 /usr/share/man/man4.Z/atsconf.4 889809455 1663 /usr/share/man/man5.Z/ats.5 DLM-ATS-Core.ATS-RUN,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: 272811593 595 
/etc/cmcluster/sharedtape/ats_tapelibs 4197870272 1300176 /usr/lbin/cmtaped 4215879058 40853 /usr/lib/nls/msg/C/ats.cat 2934433428 2897616 /usr/sbin/stapplyconf 2934433428 2897616 /usr/sbin/stcheckconf 2934433428 2897616 /usr/sbin/stdeleteconf 225683879 108240 /usr/sbin/stdisplay 2934433428 2897616 /usr/sbin/stgetconf 2934433428 2897616 /usr/sbin/stquerycl 2934433428 2897616 /usr/sbin/streclaim 2934433428 2897616 /usr/sbin/stsetlog 2934433428 2897616 /usr/sbin/stviewcl DLM-NMAPI.CM-NMAPI,fr=A.11.14,fa=HP-UX_B.11.00_32/64,v=HP: 36080002 303104 /opt/nmapi/nmapi2/lib/libnmapi2.1 36080002 303104 /opt/nmapi/nmapi2/lib/libnmapi2.sl 681762763 166936 /opt/nmapi/nmapi2/lib/pa20_64/libnmapi2.1 681762763 166936 /opt/nmapi/nmapi2/lib/pa20_64/libnmapi2.sl 349433831 231120 /usr/contrib/bin/gmsetlog 2493902080 272080 /usr/lbin/cmgmsd 2240388594 12288 /usr/lib/libcmdlm.1 2240388594 12288 /usr/lib/libcmdlm.dlm.1 DLM-Clust-Mon.CM-CORE,fr=A.11.14,fa=HP-UX_B.11.00_32/64, v=HP: 2695878994 415 /etc/cmcluster.conf 953829478 8009 /sbin/init.d/cmcluster 2559905767 2479824 /usr/contrib/bin/cmsetlog 2559905767 2479824 /usr/contrib/bin/cmsetsafety 218212224 65536 /usr/contrib/bin/get_sn 1338314454 53248 /usr/contrib/bin/sscnfmtr 1204333406 547 /usr/lbin/cm/C/CMcore.ou 1864590287 67664 /usr/lbin/cm/C/CMcoreadmin.ui 3261135812 67494 /usr/lbin/cm/C/CMcoreconf.ui 2246167907 2918 /usr/lbin/cm/C/clconfig.xpm 427854504 2921 /usr/lbin/cm/C/cmcluster.xpm 2119745231 3458768 /usr/lbin/cmclconfd 636018082 3839696 /usr/lbin/cmcld 1546609115 247504 /usr/lbin/cmlogd 1359416131 3081936 /usr/lbin/cmlvmd 2676804635 2848464 /usr/lbin/cmsnmpd 3444337282 259792 /usr/lbin/cmsrvassistd 1771981440 3557072 /usr/lbin/cmui 1825500974 12288 /usr/lib/libcmcore.1 1825500974 12288 /usr/lib/libcmcore.sl 2240388594 12288 /usr/lib/libcmdlm.sl 342853724 12288 /usr/lib/libcmpkg.sl 1398003844 12288 /usr/lib/libcmres.1 1398003844 12288 /usr/lib/libcmres.sl 3023652439 2244608 /usr/lib/libsgcl.2 3023652439 2244608 
/usr/lib/libsgcl.sl 2845033350 12288 /usr/newconfig/usr/lib/libcmdlm.1 3529261786 12288 /usr/newconfig/usr/lib/libcmpkg.1 2762498041 60805 /usr/obam/lib/help/C/cm/cm.hv 1562564889 38 /usr/obam/lib/help/C/cm/cm.hvk 1574646855 1012 /usr/obam/lib/help/C/cm/cm00.ht 1388000456 54488 /usr/obam/lib/help/C/cm/cm01.ht 3220109143 16620 /usr/obam/lib/help/C/cm/cm02.ht 2207438358 104406 /usr/obam/lib/help/C/cm/cm03.ht 1378783710 2815696 /usr/sbin/cmapplyconf 1378783710 2815696 /usr/sbin/cmcheckconf 1378783710 2815696 /usr/sbin/cmdeleteconf 1378783710 2815696 /usr/sbin/cmgetconf 1378783710 2815696 /usr/sbin/cmhaltcl 1378783710 2815696 /usr/sbin/cmhaltnode 1378783710 2815696 /usr/sbin/cmquerycl 1378783710 2815696 /usr/sbin/cmruncl 1378783710 2815696 /usr/sbin/cmrunnode 198071293 17567 /usr/sbin/cmscancl 1378783710 2815696 /usr/sbin/cmviewcl 165038505 2500304 /usr/sbin/cmviewconf 2972436981 2541264 /usr/sbin/convert DLM-Clust-Mon.CM-CORE-MAN,fr=A.11.14,fa=HP-UX_B.11.00_32/64, v=HP: 2368840658 5219 /usr/share/man/man1m.Z/cmapplyconf.1m 1305326567 3047 /usr/share/man/man1m.Z/cmcheckconf.1m 3070612920 2689 /usr/share/man/man1m.Z/cmdeleteconf.1m 3518871518 2602 /usr/share/man/man1m.Z/cmgetconf.1m 799481426 1542 /usr/share/man/man1m.Z/cmhaltcl.1m 3275518081 1926 /usr/share/man/man1m.Z/cmhaltnode.1m 1604918929 7498 /usr/share/man/man1m.Z/cmquerycl.1m 416588552 1725 /usr/share/man/man1m.Z/cmruncl.1m 172159675 1631 /usr/share/man/man1m.Z/cmrunnode.1m 3351906598 2614 /usr/share/man/man1m.Z/cmscancl.1m 4284727292 5047 /usr/share/man/man1m.Z/cmsnmpd.1m 2689553493 4141 /usr/share/man/man1m.Z/cmviewcl.1m 189023689 1457 /usr/share/man/man1m.Z/cmviewconf.1m 3027341212 1376 /usr/share/man/man5.Z/cm.5 Patch Conflicts: None Patch Dependencies: None Hardware Dependencies: None Other Dependencies: None Supersedes: PHSS_26056 PHSS_27246 PHSS_27725 PHSS_28851 PHSS_29122 PHSS_29561 PHSS_29915 PHSS_30028 PHSS_30448 PHSS_30769 Equivalent Patches: None Patch Package Size: 10890 KBytes Installation 
Instructions:
    Please review all instructions and the Hewlett-Packard
    SupportLine User Guide or your Hewlett-Packard support terms
    and conditions for precautions, scope of license,
    restrictions, and limitation of liability and warranties
    before installing this patch.
    ------------------------------------------------------------
    1. Back up your system before installing a patch.

    2. Log in as root.

    3. Copy the patch to the /tmp directory.

    4. Move to the /tmp directory and unshar the patch:

            cd /tmp
            sh PHSS_31015

    5. Run swinstall to install the patch:

            swinstall -x autoreboot=true -x patch_match_target=true \
                -s /tmp/PHSS_31015.depot

    By default swinstall will archive the original software in
    /var/adm/sw/save/PHSS_31015. If you do not wish to retain a
    copy of the original software, include the patch_save_files
    option in the swinstall command above:

            -x patch_save_files=false

    WARNING: If patch_save_files is false when a patch is installed,
             the patch cannot be deinstalled. Please be careful
             when using this feature.

    For future reference, the contents of the PHSS_31015.text file
    are available in the product readme:

            swlist -l product -a readme -d @ /tmp/PHSS_31015.depot

    To put this patch on a magnetic tape and install from the
    tape drive, use the command:

            dd if=/tmp/PHSS_31015.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions:
    For ServiceGuard OPS Edition clusters using OPS 8.1.6 or
    higher, do the following:
    1) Halt OPS and ServiceGuard on the node the patch is to be
       installed on.
    2) Install this patch on that node.
    3) Restart ServiceGuard and OPS on that node.
    4) Install the patch on all nodes in the cluster.

    For MC/ServiceGuard clusters, do the following:
    1) Halt ServiceGuard on the node the patch is to be
       installed on.
    2) Install this patch on that node.
    3) Restart ServiceGuard on that node.
    4) Install the patch on all nodes in the cluster.
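    The per-node sequence for MC/ServiceGuard clusters can be
    sketched as a dry run that only prints the commands for
    review. This is a minimal sketch under stated assumptions:
    print_patch_plan is a hypothetical helper name, cmhaltnode and
    cmrunnode are assumed to be the appropriate way to halt and
    restart ServiceGuard on a node at your site, and the OPS halt
    and restart steps are omitted because they are site-specific.
    The swinstall line is the one given in step 5.

```shell
# Dry-run sketch: print, but do not execute, the per-node patch sequence.
# print_patch_plan is a hypothetical helper, not part of ServiceGuard.
print_patch_plan() {
    for node in "$@"; do
        echo "# --- run on node $node ---"
        # Assumption: cmhaltnode/cmrunnode are how ServiceGuard is halted
        # and restarted on this node; adjust to local procedure.
        echo "cmhaltnode $node"
        echo "swinstall -x autoreboot=true -x patch_match_target=true -s /tmp/PHSS_31015.depot"
        echo "cmrunnode $node"
    done
}
```

    For example, "print_patch_plan nodeA nodeB" prints four lines
    per node, one node at a time, so that the cluster is patched
    node by node rather than all at once.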
    If installing the patch on an unpatched MC/ServiceGuard
    cluster, do the following:
    1) Kill all EMS monitors (e.g. diskmond, mibmond) on each
       node before starting ServiceGuard on that node.

    Defect 25 (JAGae48414), listed for patch PHSS_27725, requires
    some consideration of the node timeout for certain customers.
    The fix changes ServiceGuard behavior when the system clock
    is not updated for a period of time: the node will TOC if the
    system clock does not advance for 5 node timeout periods.
    This ensures that the whole cluster does not fail, and that
    Mission Critical applications are started on another node
    that does not exhibit the system clock problem.

    Large systems (many CPUs, large amounts of memory, or large
    I/O configurations) are more susceptible to this phenomenon
    than small systems. For large systems, a higher node timeout
    value, in the range of 5 to 8 seconds, is recommended.

    A node timeout of 5 to 8 seconds is also recommended for
    systems where any of the following symptoms have been seen:
    - before installation of this patch, a series of
      reconfigurations spaced by the node timeout value, for no
      apparent reason and resulting in the same membership
    - after installation of this patch, the following message in
      the syslog:
        Warning : Kernel ticks_since_boot is not advanced in the
        past xx seconds.
    - a system crash with the following message on the console or
      in the crash dump:
        FAILURE : Kernel ticks_since_boot has not been advanced
        for xx seconds, which is greater than or equal to maximum
        allowable interval of XX seconds.

    This additional consideration is only required for defect 25
    in PHSS_27725. It is not required for any other fix in this
    or other patches.
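    The ticks_since_boot messages quoted above can be counted in
    a log file with a short helper. This is a minimal sketch, not
    an HP-supplied tool: scan_tick_messages is a hypothetical
    name, and the file to scan (for example a copy of the syslog
    or saved console output) is passed in by the caller because
    its location is an assumption about the local system.

```shell
# scan_tick_messages FILE
# Counts the two ticks_since_boot messages in a text file.
# Returns 0 if none are found, 1 if any are found, 2 if FILE is unreadable.
# Hypothetical helper; the message fragments are taken from the patch text.
scan_tick_messages() {
    log=$1
    [ -r "$log" ] || { echo "cannot read $log" >&2; return 2; }
    # grep -c prints 0 and exits non-zero when nothing matches;
    # only the printed count is used here.
    warnings=$(grep -c 'Kernel ticks_since_boot is not advanced' "$log" || true)
    failures=$(grep -c 'Kernel ticks_since_boot has not been advanced' "$log" || true)
    echo "warnings=$warnings failures=$failures"
    [ "$warnings" -eq 0 ] && [ "$failures" -eq 0 ]
}
```

    A non-zero count suggests reviewing the node timeout setting
    as described above. On HP-UX the syslog is commonly
    /var/adm/syslog/syslog.log, so a typical call would be
    "scan_tick_messages /var/adm/syslog/syslog.log"; treat that
    path as an assumption and adjust it to the local setup.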
    Defect 1 (JAGae67631), listed for patch PHSS_28851, requires
    the convert utility to be run manually on each node in the
    cluster after the patch is installed to correct the problem.
    Assuming that the old configuration file is located at
    /etc/cmcluster/cmclconfig, use the following command:

            # convert -f /etc/cmcluster/cmclconfig

    The cmrunnode command should then be reissued on each node.

    This is required only if symptoms are similar to Defect 1
    listed in PHSS_28851. This step is not required for any other
    fix in this or other patches.
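    After installation, an installed file can be compared against
    the corresponding entry in the cksum(1) Output listing earlier
    in this patch text. This is a minimal sketch, assuming a POSIX
    cksum is available; verify_cksum is a hypothetical helper name
    and is not shipped with the patch.

```shell
# verify_cksum CRC SIZE FILE
# Returns 0 when FILE's cksum(1) CRC and byte count match the expected
# values, 1 on mismatch, 2 if FILE cannot be read.
# Hypothetical helper for illustration only.
verify_cksum() {
    expected_crc=$1
    expected_size=$2
    file=$3
    [ -r "$file" ] || { echo "MISSING $file"; return 2; }
    # When reading standard input, cksum prints "CRC SIZE" with no file name.
    set -- $(cksum < "$file")
    if [ "$1" = "$expected_crc" ] && [ "$2" = "$expected_size" ]; then
        echo "OK $file"
    else
        echo "MISMATCH $file (got $1 $2, expected $expected_crc $expected_size)"
        return 1
    fi
}
```

    For example, using the values listed for /usr/lbin/cmcld:
    "verify_cksum 636018082 3839696 /usr/lbin/cmcld".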