Patch Name: PHKL_29346

Patch Description: s700_800 11.00 NFS read ahead, page cache, pageout hang

Creation Date: 03/07/31

Post Date: 03/10/21

Hardware Platforms - OS Releases:
    s700: 11.00
    s800: 11.00

Products: N/A

Filesets:
    OS-Core.CORE2-KRN,fr=B.11.00,fa=HP-UX_B.11.00_32,v=HP
    OS-Core.CORE2-KRN,fr=B.11.00,fa=HP-UX_B.11.00_64,v=HP

Automatic Reboot?: Yes

Status: General Release

Critical:
    Yes
    PHKL_29346: HANG
    PHKL_26007: PANIC
    PHKL_28694: PANIC
    PHKL_22792: PANIC
    PHKL_21511: PANIC CORRUPTION

Category Tags:
    defect_repair general_release critical panic halts_system
    corruption

Path Name: /hp-ux_patches/s700_800/11.X/PHKL_29346

Symptoms:
    PHKL_29346:
    ( SR:8606303331 CR:JAGae66686 )
    A uniprocessor machine will hang. If a TOC is taken, vhand's
    stack trace will look similar to this:

        csuperpage_lock
        devswap_vfdcheck
        for_val3
        for_val2
        for_val2
        foreach_valid
        devswap_pageout
        stealpages
        vhand
        main

    On a multiprocessor system, vhand will consume 100% of a cpu
    in system mode as seen using a performance monitoring tool.

    This has been observed in a system under high memory pressure
    while starting up a large database application.

    PHKL_26007:
    ( SR:8606225327 CR:JAGad94414 )
    Duplicate ( SR:8606225843 CR:JAGad94916 )
    When under heavy memory/swap pressure, the system may panic
    with a data page fault in page handling routines.
    Two panics that have been seen have the following stack
    traces, but there could be others:

        panic+0x14
        report_trap_or_int_and_panic+0x84
        trap+0xd9c
        nokgdb+0x8
        hdl_cwfault+0xb80
        prot_fault+0xa8
        pfault+0x15c
        trap+0x4ec
        nokgdb+0x8

    and

        panic+0x14
        report_trap_or_int_and_panic+0x84
        interrupt+0x1d4
        ihandler+0x928
        pdv_modset2_0+0x348
        pdmodset+0x170
        hdl_unsetbits+0x264
        pageiocleanup+0x11c
        pageiodone+0x2c
        biodone+0x1f0
        lv_complete+0xc4
        lv_terminate+0xd4
        lv_parwrite_done+0x470
        lv_end+0x174
        biodone+0x1f0
        scsi_fast_cbfn+0xb7c
        c720_call_cbfns+0x60
        c720_isr+0x64c
        sapic_interrupt+0x2c
        mp_ext_interrupt+0x318
        ihandler+0x904

    PHKL_28694:
    ( SR:8606293168 CR:JAGae56918 )
    When an application mmap's an NFS or UFS file that is larger
    than 2GB, the system may panic when accessing the file. For
    an NFS file, the panic stack trace may be similar to:

        panic: pdremap

        panic+0x6c
        pdremap+0x5a4
        hdl_addtrans+0x358
        hdl_kmap_bp+0x39c
        nfs_read_ahead+0x3ac
        start_next_read_ahead+0xb8
        checkprotid+0x318
        hdl_pfault+0x4c4
        pfault+0x120
        trap+0x444
        thandler+0xd20

    The panic has not been seen on UFS, although theoretically it
    could occur. This panic will not be seen on local VxFS files.

    PHKL_22792:
    ( SR: 8606147554 CR: JAGad16896 )
    Customer will see a data-page-fault panic in pdprotget().
    A typical panic stack would look like:

        panic+0x14
        report_trap_or_int_and_panic+0x80
        trap+0xdb8
        nokgdb+0x8
        pdprotget+0xa0
        process_read_ahead_pages+0x1a8
        hdl_pfault+0x6a0
        pfault+0x104
        copyin+0x170
        uiomove+0xc0
        rwip+0x398
        ufs_rdwr+0x124
        vno_rw+0x84
        writev+0x17c
        syscall+0x480
        $syscallrtn+0x0

    ( SR: 8606170083 CR: JAGad39347 )
    Customer will see a data-page-fault panic in pdprotget().
    A typical panic stack would look like:

        panic+0x14
        assfail+0x3c
        _assfail+0x2c
        sl_pre_check+0x120
        spinlock+0x18
        vfault+0xd8
        trap+0x10e4
        nokgdb+0x8
        pdprotget+0x224
        process_read_ahead_pages+0x36c
        hdl_vfault+0x518
        vfault+0x2b8
        trap+0x1370
        nokgdb+0x8
        copyin+0xbc
        uiomove+0x4a4
        rw3vp+0x7ec
        nfs3_write+0x140
        nfs3_rdwr+0x8c
        vno_rw+0xa4
        4_2fa4_cl_rwuio+0x298
        write+0x78
        syscall+0x5fc
        $syscallrtn+0x0

    ( SR: 8606156441 CR: JAGad25776 )
    pfd corruption -- manifests as data-page-fault in
    pdv_modset2_0 or pdv_protaccset2_0.
    A typical panic stack would look like:

        panic+0x14
        report_trap_or_int_and_panic+0x84
        interrupt+0x1d4
        $ihndlr_rtn+0x0
        pdv_modset2_0+0x27c
        pdmodset+0x14c
        hdl_unsetbits+0x264
        pageiocleanup+0x11c
        pageiodone+0x2c
        biodone+0x1f0
        lv_complete+0xc0
        lv_terminate+0xd4
        lv_parwrite_done+0x148
        lv_end+0x128
        biodone+0x1f0
        scsi_fast_cbfn+0x260
        c720_isr+0x52c
        sapic_interrupt+0x2c
        up_ext_interrupt+0x2c8
        ivti_patch_to_nop2+0x0
        idle+0x1c4
        swidle_exit+0x0

    PHKL_21511:
    ( SR: 8606130257 CR: JAGac95128 )
    System panics with a data page fault for various reasons,
    such as a virtual address not being mapped, allocation of
    page zero, or use of a page of size zero. In most cases, the
    system is doing heavy I/O across NFS mounts.
    A typical panic stack would look like:

        panic+0x14
        allocate_page+0x12c
        allocpfd+0x24
        vfdfill+0x58
        vm_fill_in_pages+0x64
        vm_prepare_io+0x28
        nfs_pagein+0x20c
        virtual_fault+0x1c4
        vfault+0xf4
        trap+0x714

Defect Description:
    PHKL_29346:
    ( SR:8606303331 CR:JAGae66686 )
    Vhand uses an incorrect algorithm to lock all the sub-pages
    of a very large superpage (e.g. 256MB). If it fails to lock
    the first sub-page, it will try to relock and unlock all the
    sub-pages for each sub-page in the superpage. This consumes
    the entire cpu that vhand is running on. Vhand will
    ultimately recover, but only after a long time (>20 minutes).

    Resolution:
    Detect when vhand cannot lock the first sub-page, and skip
    the entire superpage, proceeding to the next.
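The resolution above can be pictured with a small user-space sketch in C. This is not the vhand source: struct subpage, try_lock(), lock_superpage() and scan() are hypothetical names invented for illustration, and the lock is modeled as a plain flag. It only shows the idea of abandoning a superpage as soon as its first sub-page cannot be locked, so the cost of a busy superpage is one lock attempt rather than repeated passes over every sub-page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sub-page descriptor; not the kernel's pfd. */
struct subpage {
    bool locked_by_other;   /* another thread currently holds this sub-page */
    bool locked_by_us;      /* set when our scan has locked it */
};

/* Counts lock attempts so the cost of a scan can be inspected. */
static unsigned long lock_attempts;

static bool try_lock(struct subpage *sp)
{
    lock_attempts++;
    if (sp->locked_by_other)
        return false;
    sp->locked_by_us = true;
    return true;
}

static void unlock(struct subpage *sp)
{
    sp->locked_by_us = false;
}

/*
 * Lock every sub-page of one superpage. If the first sub-page is busy,
 * give up immediately so the caller can skip the whole superpage.
 * If a later sub-page is busy, release what was taken and give up.
 */
static bool lock_superpage(struct subpage *sp, size_t n)
{
    assert(n > 0);
    if (!try_lock(&sp[0]))
        return false;               /* skip the entire superpage */
    for (size_t i = 1; i < n; i++) {
        if (!try_lock(&sp[i])) {
            while (i-- > 0)         /* undo sub-pages 0 .. i-1 */
                unlock(&sp[i]);
            return false;
        }
    }
    return true;
}

/* Walk consecutive superpages; return how many were fully locked. */
static size_t scan(struct subpage *pages, size_t nsuper, size_t per_super)
{
    size_t locked = 0;
    for (size_t s = 0; s < nsuper; s++)
        if (lock_superpage(pages + s * per_super, per_super))
            locked++;
    return locked;
}
```

With two superpages of four sub-pages each, and the first sub-page of the first superpage held elsewhere, scan() makes one failed attempt on the first superpage and four successful attempts on the second.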
    PHKL_26007:
    ( SR:8606225327 CR:JAGad94414 )
    Duplicate ( SR:8606225843 CR:JAGad94916 )
    Under heavy memory/swap pressure, it is possible for only a
    partial subset of pages from a large page to be in the page
    cache. This can lead to an inconsistent state in the page
    cache that may subsequently cause panics in many ways, but
    typically in the fault handler path.

    Resolution:
    When there is swap pressure and we bring a page in, we
    forcefully remove it from the page cache, so that the problem
    of having sub-pages in the page cache does not occur.

    PHKL_28694:
    ( SR:8606293168 CR:JAGae56918 )
    When calculating the next read ahead address for an NFS or
    UFS file, data types used in the calculation may cause sign
    extension, resulting in an incorrect address being generated.
    When the kernel tries to access the faulty address, it causes
    a panic.

    Resolution:
    Added casts for variables in the next read ahead address
    calculations to prevent sign extension.

    PHKL_22792:
    ( SR: 8606147554 CR: JAGad16896 )
    process_read_ahead_pages(), in both its forward and backward
    loops, does not correctly check for the presence of a
    translation.

    Resolution:
    Check both for the existence of an address translation for
    the physical page, and make sure it is translated to our
    address.

    ( SR: 8606170083 CR: JAGad39347 )
    Incorrect parameter passed to pdprotget() call.

    Resolution:
    Pass the updated virtual address instead of the starting
    address again and again.

    ( SR: 8606156441 CR: JAGad25776 )
    We were freeing the pfd at the granularity of the superpage
    instead of the 4K sub-page.

    Resolution:
    Use freepfd_one() to release a single 4K sub-page, instead
    of freepfd().

    PHKL_21511:
    ( SR: 8606130257 CR: JAGac95128 )
    For better performance, the user data access pattern is
    tracked and read-ahead is implemented for sequential I/Os.
    A defect in the read-ahead code decrements the reference
    count of pages within a large page incorrectly. This results
    in freeing still in-use pages and corrupting the memory
    freelist.
    When invalid pages are returned to the allocation routines,
    the system panics with a data page fault.

    Resolution:
    Decrement reference count of large pages correctly.

Enhancement: No

SR:
    8606130257 8606147554 8606156441 8606170083 8606225327
    8606225843 8606293168 8606303331

Patch Files:
    OS-Core.CORE2-KRN,fr=B.11.00,fa=HP-UX_B.11.00_32,v=HP:
    /usr/conf/lib/libhp-ux.a(vfs_vm.o)
    /usr/conf/lib/libhp-ux.a(vm_devswap.o)

    OS-Core.CORE2-KRN,fr=B.11.00,fa=HP-UX_B.11.00_64,v=HP:
    /usr/conf/lib/libhp-ux.a(vfs_vm.o)
    /usr/conf/lib/libhp-ux.a(vm_devswap.o)

what(1) Output:
    OS-Core.CORE2-KRN,fr=B.11.00,fa=HP-UX_B.11.00_32,v=HP:
    /usr/conf/lib/libhp-ux.a(vfs_vm.o):
        vfs_vm.c $Date: 2003/07/15 04:04:08 $Revision: r11ros/11 PATCH_11.00 (PHKL_29346)
    /usr/conf/lib/libhp-ux.a(vm_devswap.o):
        vm_devswap.c $Date: 2003/07/15 04:04:08 $Revision: r11ros/7 PATCH_11.00 (PHKL_29346)

    OS-Core.CORE2-KRN,fr=B.11.00,fa=HP-UX_B.11.00_64,v=HP:
    /usr/conf/lib/libhp-ux.a(vfs_vm.o):
        vfs_vm.c $Date: 2003/07/15 04:04:08 $Revision: r11ros/11 PATCH_11.00 (PHKL_29346)
    /usr/conf/lib/libhp-ux.a(vm_devswap.o):
        vm_devswap.c $Date: 2003/07/15 04:04:08 $Revision: r11ros/7 PATCH_11.00 (PHKL_29346)

cksum(1) Output:
    OS-Core.CORE2-KRN,fr=B.11.00,fa=HP-UX_B.11.00_32,v=HP:
    3835301424 43276 /usr/conf/lib/libhp-ux.a(vfs_vm.o)
    3156765309 22992 /usr/conf/lib/libhp-ux.a(vm_devswap.o)

    OS-Core.CORE2-KRN,fr=B.11.00,fa=HP-UX_B.11.00_64,v=HP:
    593362189 94440 /usr/conf/lib/libhp-ux.a(vfs_vm.o)
    646197296 45696 /usr/conf/lib/libhp-ux.a(vm_devswap.o)

Patch Conflicts: None

Patch Dependencies:
    s700: 11.00: PHKL_18543
    s800: 11.00: PHKL_18543

Hardware Dependencies: None

Other Dependencies: None

Supersedes:
    PHKL_28694 PHKL_26007 PHKL_22792 PHKL_21511

Equivalent Patches:
    PHKL_27825:
    s700: 11.11
    s800: 11.11

Patch Package Size: 120 KBytes

Installation Instructions:
    Please review all instructions and the Hewlett-Packard
    SupportLine User Guide or your Hewlett-Packard support terms
    and conditions for precautions, scope of license,
    restrictions, and
    limitation of liability and warranties, before installing
    this patch.
    ------------------------------------------------------------
    1. Back up your system before installing a patch.

    2. Log in as root.

    3. Copy the patch to the /tmp directory.

    4. Move to the /tmp directory and unshar the patch:

        cd /tmp
        sh PHKL_29346

    5. Run swinstall to install the patch:

        swinstall -x autoreboot=true -x patch_match_target=true \
            -s /tmp/PHKL_29346.depot

    By default swinstall will archive the original software in
    /var/adm/sw/save/PHKL_29346. If you do not wish to retain a
    copy of the original software, include the patch_save_files
    option in the swinstall command above:

        -x patch_save_files=false

    WARNING: If patch_save_files is false when a patch is
             installed, the patch cannot be deinstalled. Please
             be careful when using this feature.

    For future reference, the contents of the PHKL_29346.text
    file are available in the product readme:

        swlist -l product -a readme -d @ /tmp/PHKL_29346.depot

    To put this patch on a magnetic tape and install from the
    tape drive, use the command:

        dd if=/tmp/PHKL_29346.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions: None
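For readers tracing the PHKL_28694 defect described earlier, the following C sketch illustrates how sign extension can corrupt an offset once a file passes 2GB, and how a cast before the arithmetic avoids it. It is not the kernel's read ahead code: the function names, the 4 KB page size (PAGE_SHIFT of 12) and the use of a page index are assumptions made only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumed 4 KB pages, for illustration only */

/*
 * Defective pattern: the byte offset is computed in 32 bits and kept in
 * a signed 32-bit variable. Once the offset reaches 2GB the top bit is
 * set, so widening to 64 bits sign-extends it into a negative value.
 * (The shift is done unsigned to avoid undefined behavior; the
 * conversion back to int32_t relies on the usual two's-complement
 * wraparound that common compilers provide.)
 */
static int64_t next_offset_buggy(int32_t page_index)
{
    int32_t bytes32 = (int32_t)((uint32_t)page_index << PAGE_SHIFT);
    return bytes32;     /* implicit widening sign-extends */
}

/*
 * Corrected pattern: cast to a 64-bit type before shifting, so the
 * arithmetic is carried out at full width and no sign extension occurs.
 */
static int64_t next_offset_fixed(int32_t page_index)
{
    assert(page_index >= 0);
    return (int64_t)page_index << PAGE_SHIFT;
}
```

Under these assumptions, page index 0x80000 (the page that starts at exactly 2GB) yields 0x80000000 from the corrected function and -2147483648 from the defective one, while any page below 2GB gives the same result from both.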