Patch Name: PHKL_28720

Patch Description: s700_800 11.22 MCA patch

Creation Date: 03/03/05

Post Date: 03/03/07

Hardware Platforms - OS Releases:
    s700: 11.22
    s800: 11.22

Products: N/A

Filesets:
    OS-Core.CORE2-KRN,fr=B.11.22,fa=HP-UX_B.11.22_IA,v=HP

Automatic Reboot?: Yes

Status: General Release

Critical:
    Yes
    PHKL_28720: OTHER
        System will not perform a crash dump when Fatal MCAs occur,
    PHKL_28093: PANIC
    PHKL_28203: OTHER
        Crash dump generated for INIT does not contain all relevant
        register information required by kernel debuggers and crash
        analyzers.

Category Tags:
    defect_repair hardware_enablement general_release critical panic
    manual_dependencies

Path Name: /hp-ux_patches/s700_800/11.X/PHKL_28720

Symptoms:
    PHKL_28720:
    ( SR:8606274247 CR:JAGae38324 )
    An MCA is a Machine Check Abort and indicates a serious system
    error; MCAs on IPF systems are equivalent to HPMCs on PA
    platforms. Note that this is an IPF only Patch.

    For IPF platforms, there are a number of MCA related issues in
    the 11.22 initial release of HP-UX. This patch will supply
    support engineers with better memory crash dumps for analysis
    in the case of an MCA. This helps them better support customers
    who are experiencing difficulties with IPF platforms.

    Some MCAs may cause the system to reboot with no indication of
    what happened. No console output will be seen. No crash dump
    will be taken for support engineers to analyze.

    The "stack unwind problem" refers to the debugger's inability
    to get a stack trace of the processor crash events from a
    system memory crash dump after an MCA. After the dump is
    written, and the system reboots, the expected behavior for
    debuggers such as kwdb, or q4 is a display of each processor's
    stack trace, using the "trace event X" command. But instead the
    "trace event" command fails with an assertion failure message
    or unwind internal library error.
    PHKL_28093:
    ( SR:8606276344 CR:JAGae40422 )
    On the Intel OEM i870 platform only, HP-UX will fail to
    initialize the 4 port Fast Ethernet card. This console message
    will print repeatedly:

        btlan: Initializing 10/100BASE-TX card at 0/3/31/0/2/0/4/0.
        btlan: Failed to link driver ISR.
        wsio_claim init failed isc=0xe00000010dff1400 name=btlan

    PHKL_28203:
    ( SR:8606276202 CR:JAGae40280 )
    On IPF, the stack unwind problem after INIT refers to the
    inability to get a stack trace of the processor crash events
    from a system memory dump after an INIT. After the dump is
    written, and the system reboots, the expected behavior for
    debuggers such as kwdb, q4, or p4 is a display of each
    processor's stack trace, using the "trace event X" command.
    But instead the "trace event" command fails with an assertion
    failure message or unwind internal library error. This problem
    exists for IPF systems only.

Defect Description:
    PHKL_28720:
    ( SR:8606274247 CR:JAGae38324 )
    Fatal MCAs are MCAs that will cause a PCI bus to be reset thus
    leaving all the devices on that bus in a reset turn-on state.
    When such an MCA occurs, the OS_MCA handler cannot access any
    I/O devices resulting in no console output and no crash dump
    taken.

    The complete internal state of the system at the time of an MCA
    is not being preserved by the kernel in the crash event
    structures. Additionally, the debuggers are using the incorrect
    unwind context to unwind the stack after an MCA.

    Resolution:
    On IPF platforms, the patch will allow the correct information
    to be stored in a crash dump in the case of an MCA. The kernel
    fixes in this patch include changes to save all register
    information to be later used by the debuggers. This will help
    support engineers in analyzing problems more effectively when
    they have the correct stack trace and processor save state
    information.
    This patch PHKL_28720 will allow PCI devices to function enough
    to take a crash dump and print to the serial console after a
    FATAL MCA occurs and causes PCI buses to be reset.

    Restoration of the VGA console when using an ATI Radeon based
    graphics card will not be supported at this time. If a FATAL
    MCA occurs when using the VGA console (as in a workstation
    setup) the system will perform a crash dump but no output will
    be printed to the screen. Fire-GL Cards are supported and will
    print to the screen. For this to work X windows must be running
    and VGA must be the only console selected. All serial consoles
    are supported.

    The following dump devices will be supported when this patch is
    installed (without the need for other supporting patches):

    Workstations
    -----------------
    On board IDE on zx2000
    Add-in SCSI on zx2000 and zx6000 (A6829A, A6828A)
    Add-in Fiber Channel on zx2000 and zx6000 (A6795A)

    Servers
    ---------
    Add-in SCSI on rx2600 and rx5670 (A6829A, A6828A)
    Add-in Fiber Channel on rx2600 and rx5670 (A6795A)

    The following dump devices will be supported only when this
    patch is installed together with the supporting patch
    indicated. Without the indicated patch, the crash dump will not
    occur on the specified device:

    Workstations
    -----------------
    On board SCSI on zx6000 (PHSS_27990)

    Servers
    ---------
    On Board SCSI on rx2600 (PHSS_27990)
    On Board SCSI on rx5670 (PHKL_28787) released 6/03

    Firmware Dependencies
    ---------------------
    For HP platforms (rx5670, rx2600, zx6000, and zx2000), a
    firmware upgrade to the following version is needed for a
    complete solution for the stack unwind problem. Without this
    firmware update a correct stack trace may not be obtainable:

        rx5670 - SFW 2.11 or better
        rx2600, zx6000, zx2000 - SFW 1.61 or better

    For HP platforms (rx5670, rx2600, and zx6000), a firmware
    upgrade to the versions below is required so the systems will
    not reboot before a crash dump is performed.
    These firmware dependencies are only needed on systems with
    more than one processor. Without upgrading to the firmware
    revisions listed below the system may still suffer from reset
    after an MCA, though less often than if the patch were not
    installed at all:

        rx5670 (SFW with Madison Support) or better, released 7/03
        rx2600, zx6000 - SFW 1.90 or better

    Other Dependencies
    ------------------
    Patch Dependency: PHCO_28066
    The PHCO_28066 patch adds support for Q4 and Kwdb. PHKL_28720
    and PHCO_28066 must be installed together.

    PHKL_28093:
    ( SR:8606276344 CR:JAGae40422 )
    On the Intel OEM i870 platform, HP-UX does not properly
    initialize PCI cards that contain a bus bridge, such as the
    4 port Fast Ethernet card. They will not be seen by HP-UX.

    Resolution:
    The fix was made to support the following configuration:

        PCI Root Bridge
              |
        Built-in PCI-PCI Bridge
              |
        Add-on Generic PCI-PCI Bridge
              |
        I/O Controller

    The root cause of this defect is that HP-UX assumes only one
    level of PCI-to-PCI bus bridge. The 4 port Fast Ethernet card
    has a PCI-to-PCI bus bridge on it, and so uncovered two places
    in the HP-UX kernel that assumed one level of bridge.

    PHKL_28203:
    ( SR:8606276202 CR:JAGae40280 )
    On IPF the complete internal state of the system at the time of
    an INIT is not being preserved by the kernel in the crash event
    structures. Additionally, the debuggers are using the incorrect
    unwind context to unwind the stack after an INIT.

    Resolution:
    The kernel fixes include changes to save all register
    information to be later used by the debuggers.
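    The bridge configuration described under PHKL_28093 above can
    be pictured with a small shell model. This is only an
    illustrative sketch, not the kernel's enumeration code: the
    TOPOLOGY table, the node names, and the field_of and
    bridge_depth helpers are all hypothetical. It counts the
    PCI-to-PCI bridges between a node and the root bridge, which
    shows why logic that assumes a single bridge level cannot
    reach the I/O controller in the supported configuration.

```shell
#!/bin/sh
# Hypothetical model of the configuration in the PHKL_28093 resolution.
# Each record is: <node> <parent> <kind>
TOPOLOGY='root_bridge - root
builtin_p2p root_bridge p2p
addon_p2p builtin_p2p p2p
io_controller addon_p2p device'

# field_of NODE COLUMN: print one column of the record for NODE.
field_of() {
    printf '%s\n' "$TOPOLOGY" | awk -v n="$1" -v c="$2" '$1 == n { print $c }'
}

# bridge_depth NODE: print how many PCI-to-PCI bridges lie between
# NODE and the root bridge.
bridge_depth() {
    node=$1
    depth=0
    while :; do
        parent=$(field_of "$node" 2)
        # Stop at the root bridge or at a node missing from the table.
        if [ -z "$parent" ] || [ "$parent" = "-" ]; then
            break
        fi
        if [ "$(field_of "$parent" 3)" = "p2p" ]; then
            depth=$((depth + 1))
        fi
        node=$parent
    done
    echo "$depth"
}

for dev in builtin_p2p addon_p2p io_controller; do
    d=$(bridge_depth "$dev")
    if [ "$d" -le 1 ]; then
        echo "$dev: $d bridge level(s), reachable with a one-level assumption"
    else
        echo "$dev: $d bridge level(s), needs multi-level bridge support"
    fi
done
```

    In this model only io_controller sits behind two bridge
    levels, matching the add-on card case the fix addresses.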
Enhancement: No

SR:
    8606274247 8606276202 8606276344

Patch Files:
    OS-Core.CORE2-KRN,fr=B.11.22,fa=HP-UX_B.11.22_IA,v=HP:
    /usr/conf/lib/libdump-pdk.a(asm_crash.o)
    /usr/conf/lib/libdump-pdk.a(mca.o)
    /usr/conf/lib/libdump-pdk.a(mca_asm.o)
    /usr/conf/lib/libio-pdk.a(ia64_psm.o)
    /usr/conf/lib/libpci.a(p2pb_cdio.o)
    /usr/conf/lib/libpci.a(pci_cdio.o)

what(1) Output:
    OS-Core.CORE2-KRN,fr=B.11.22,fa=HP-UX_B.11.22_IA,v=HP:
    /usr/conf/lib/libdump-pdk.a(asm_crash.o):
        asm_crash.s $Date: 2003/02/19 08:24:16 $Revision: r11.22/2
            PATCH_11.22 (PHKL_28720)
    /usr/conf/lib/libdump-pdk.a(mca.o):
        mca.c $Date: 2003/02/19 08:24:16 $Revision: r11.22/1
            PATCH_11.22 (PHKL_28720)
    /usr/conf/lib/libdump-pdk.a(mca_asm.o):
        mca_asm.s $Date: 2003/02/19 08:24:16 $Revision: r11.22/1
            PATCH_11.22 (PHKL_28720)
    /usr/conf/lib/libio-pdk.a(ia64_psm.o):
        ia64_psm.c $Date: 2003/02/19 08:24:16 $Revision: r11.22/1
            PATCH_11.22 (PHKL_28720)
    /usr/conf/lib/libpci.a(p2pb_cdio.o):
        p2pb_cdio.c $Date: 2002/10/21 09:21:40 $Revision: r11.22/1
            PATCH_11.22 (PHKL_28093)
    /usr/conf/lib/libpci.a(pci_cdio.o):
        pci_cdio.c $Date: 2003/02/19 08:24:16 $Revision: r11.22/2
            PATCH_11.22 (PHKL_28720)

cksum(1) Output:
    OS-Core.CORE2-KRN,fr=B.11.22,fa=HP-UX_B.11.22_IA,v=HP:
    776155804 85664 /usr/conf/lib/libdump-pdk.a(asm_crash.o)
    3621983446 119440 /usr/conf/lib/libdump-pdk.a(mca.o)
    1072969818 8048 /usr/conf/lib/libdump-pdk.a(mca_asm.o)
    3563885995 254368 /usr/conf/lib/libio-pdk.a(ia64_psm.o)
    3657876146 20544 /usr/conf/lib/libpci.a(p2pb_cdio.o)
    1149294561 138416 /usr/conf/lib/libpci.a(pci_cdio.o)

Patch Conflicts: None

Patch Dependencies:
    s700: 11.22: PHCO_28066
    s800: 11.22: PHCO_28066

Hardware Dependencies: None

Other Dependencies:
    Firmware Dependencies
    ---------------------
    For HP platforms (rx5670, rx2600, zx6000, and zx2000), a
    firmware upgrade to the following version is needed for a
    complete solution for the stack unwind problem.
    Without this firmware update a correct stack trace may not be
    obtainable:

        rx5670 - SFW 2.11 or better
        rx2600, zx6000, zx2000 - SFW 1.61 or better

    For HP platforms (rx5670, rx2600, and zx6000), a firmware
    upgrade to the versions below is required so the systems will
    not reboot before a crash dump is performed. These firmware
    dependencies are only needed on systems with more than one
    processor. Without upgrading to the firmware revisions listed
    below the system may still suffer from reset after an MCA,
    though less often than if the patch were not installed at all:

        rx5670 (SFW with Madison Support) or better, released 7/03
        rx2600, zx6000 - SFW 1.90 or better

    For HP platforms (rx5670, rx2600, zx6000, and zx2000), a
    firmware upgrade to the following version is desirable for a
    complete solution:

        rx5670 - SFW 2.11 or better
        rx2600, zx6000, zx2000 - SFW 1.61 or better

Supersedes:
    PHKL_28203 PHKL_28093

Equivalent Patches: None

Patch Package Size: 650 KBytes

Installation Instructions:
    Please review all instructions and the Hewlett-Packard
    SupportLine User Guide or your Hewlett-Packard support terms
    and conditions for precautions, scope of license,
    restrictions, and limitation of liability and warranties,
    before installing this patch.
    ------------------------------------------------------------
    1. Back up your system before installing a patch.

    2. Login as root.

    3. Copy the patch to the /tmp directory.

    4. Move to the /tmp directory and unshar the patch:

        cd /tmp
        sh PHKL_28720

    5. Run swinstall to install the patch:

        swinstall -x autoreboot=true -x patch_match_target=true \
            -s /tmp/PHKL_28720.depot

    By default swinstall will archive the original software in
    /var/adm/sw/save/PHKL_28720. If you do not wish to retain a
    copy of the original software, include the patch_save_files
    option in the swinstall command above:

        -x patch_save_files=false

    WARNING: If patch_save_files is false when a patch is
             installed, the patch cannot be deinstalled.
             Please be careful when using this feature.
    For future reference, the contents of the PHKL_28720.text file
    are available in the product readme:

        swlist -l product -a readme -d @ /tmp/PHKL_28720.depot

    To put this patch on a magnetic tape and install from the
    tape drive, use the command:

        dd if=/tmp/PHKL_28720.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions: None
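    After installation, the values in the cksum(1) Output section
    can be compared against the installed archive members. The
    sketch below assumes that the listed values are cksum(1)
    results for members extracted from the archives with ar -p;
    this notice does not state how they were produced, so a
    mismatch is a reason to look further rather than proof of a
    bad install. The check_cksum helper is hypothetical.

```shell
#!/bin/sh
# Sketch: compare archive members with the cksum(1) Output table.

# check_cksum EXPECTED_SUM EXPECTED_SIZE
# Reads data on standard input and succeeds only if cksum(1) reports
# the expected checksum and byte count.
check_cksum() {
    want_sum=$1
    want_size=$2
    # With no file operand, cksum prints "<checksum> <byte count>".
    set -- $(cksum)
    [ "$1" = "$want_sum" ] && [ "$2" = "$want_size" ]
}

# Records copied from the cksum(1) Output section:
# <checksum> <size> <archive> <member>
while read -r sum size archive member; do
    if [ ! -f "$archive" ]; then
        echo "SKIP $archive($member): archive not found"
        continue
    fi
    # ar -p writes the named member to standard output.
    if ar -p "$archive" "$member" | check_cksum "$sum" "$size"; then
        echo "OK   $archive($member)"
    else
        echo "FAIL $archive($member)"
    fi
done <<'EOF'
776155804 85664 /usr/conf/lib/libdump-pdk.a asm_crash.o
3621983446 119440 /usr/conf/lib/libdump-pdk.a mca.o
1072969818 8048 /usr/conf/lib/libdump-pdk.a mca_asm.o
3563885995 254368 /usr/conf/lib/libio-pdk.a ia64_psm.o
3657876146 20544 /usr/conf/lib/libpci.a p2pb_cdio.o
1149294561 138416 /usr/conf/lib/libpci.a pci_cdio.o
EOF
```

    On a system where the archives are absent, every record is
    reported as SKIP and nothing is modified.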