Patch Name: PHKL_30373

Patch Description: s700_800 11.11 select(2) delay/hang and poll(2) hang

Creation Date: 04/07/06

Post Date: 04/08/20

Hardware Platforms - OS Releases:
    s700: 11.11
    s800: 11.11

Products: N/A

Filesets:
    OS-Core.CORE2-KRN,fr=B.11.11,fa=HP-UX_B.11.11_32,v=HP
    OS-Core.CORE2-KRN,fr=B.11.11,fa=HP-UX_B.11.11_64,v=HP

Automatic Reboot?: Yes

Status: General Release

Critical:
    No (superseded patches were critical)
    PHKL_25233: HANG

Category Tags:
    defect_repair general_release critical halts_system

Path Name: /hp-ux_patches/s700_800/11.X/PHKL_30373

Symptoms:
    PHKL_30373:
    ( SR:8606314509 CR:JAGae77273 )
    A select(2) system call with a timeout of less than 10ms
    delays longer than the requested timeout.

    ( SR:8606347877 CR:JAGaf08699 )
    A select(2) system call with a nonzero number of file
    descriptors and zero file descriptor masks can expire
    prematurely.

    PHKL_25233:
    ( SR:8606217425 CR:JAGad86577 )
    poll(2) hangs when the timeout is set to 1msec for 0 file
    descriptors.

    ( SR:8606199534 CR:JAGad68721 )
    On a multi-processor system, a thread may hang in select(2)
    sleeping on a pipe or socket read, even when data is present
    on those channels. This is most likely to happen when the
    system is heavily loaded, or when multiple threads call
    select(2) on the same read channel simultaneously.

    PHKL_23843:
    ( SR:8606177051 CR:JAGad46286 )
    The select(2) system call may never return when it is used as
    a fast sleep/wakeup mechanism (with the parameter "range of
    file descriptors" set to 0), if it periodically receives
    signals at an interval shorter than the timeout interval of
    the select(2).

    ( SR:8606180428 CR:JAGad49649 )
    On a multi-processor system, a thread may hang in select(2)
    sleeping on a pipe or socket read, even when data is present
    on those channels. This is most likely to happen when the
    system is heavily loaded, or when multiple threads call
    select(2) on the same read channel simultaneously.
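
    The timer-style calls named in the symptoms above can be
    exercised with a small user-level probe. The following is a
    minimal sketch, not an HP-supplied test: the function names
    timed_select_sleep() and timed_poll_sleep() are hypothetical,
    the headers are the POSIX ones (HP-UX 11.11 header locations may
    differ), and "zero file descriptor masks" is assumed here to mean
    fd_set masks with no bits set. Each function issues one call and
    returns the measured elapsed time in microseconds.

```c
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Microseconds between two gettimeofday() samples. This is wall-clock
 * time, so a clock adjustment during the call would distort the result. */
static long elapsed_us(const struct timeval *t0, const struct timeval *t1)
{
    return (long)(t1->tv_sec - t0->tv_sec) * 1000000L
         + (long)(t1->tv_usec - t0->tv_usec);
}

/* Hypothetical helper: call select(2) purely as a timer.
 * nfds may be 0 (the "range of file descriptors" is 0) or nonzero with
 * all-zero masks (assumed reading of "zero file descriptor masks").
 * Returns the measured delay in microseconds, or -1 if select() failed. */
long timed_select_sleep(int nfds, long timeout_us)
{
    fd_set rd, wr, ex;
    struct timeval tv, t0, t1;

    FD_ZERO(&rd);
    FD_ZERO(&wr);
    FD_ZERO(&ex);
    tv.tv_sec = timeout_us / 1000000L;
    tv.tv_usec = timeout_us % 1000000L;

    gettimeofday(&t0, NULL);
    if (select(nfds, &rd, &wr, &ex, &tv) < 0)
        return -1;
    gettimeofday(&t1, NULL);
    return elapsed_us(&t0, &t1);
}

/* Hypothetical helper: call poll(2) with 0 file descriptors and a
 * millisecond timeout. Returns the measured delay in microseconds,
 * or -1 if poll() failed. */
long timed_poll_sleep(int timeout_ms)
{
    struct timeval t0, t1;

    gettimeofday(&t0, NULL);
    if (poll(NULL, 0, timeout_ms) < 0)
        return -1;
    gettimeofday(&t1, NULL);
    return elapsed_us(&t0, &t1);
}
```

    For example, timed_select_sleep(0, 5000) requests a 5ms delay,
    timed_select_sleep(4, 5000) uses a nonzero descriptor count with
    empty masks, and timed_poll_sleep(1) issues the 1msec poll(2)
    call with 0 file descriptors. Comparing the returned figure with
    the requested timeout shows whether a call ran long or expired
    early; a call that never returns matches the hang symptoms.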
Defect Description:
    PHKL_30373:
    ( SR:8606314509 CR:JAGae77273 )
    A select(2) system call with a timeout of less than 10ms
    delays longer than the requested timeout. This happens because
    the timeout variable is incremented by 1 clock tick if the
    input is less than 10ms.

    Resolution:
    The timeout variable is no longer incremented by 1 clock tick
    if the specified value is less than 10ms. This change takes
    effect only when the tunable select_enh (provided in
    PHKL_30516) is set.

    ( SR:8606347877 CR:JAGaf08699 )
    When select(2) is called with a nonzero number of file
    descriptors and zero file descriptor masks, it is being used
    as a sleep/wakeup mechanism, but the code follows the regular
    select path.

    Resolution:
    select(2) follows the select_as_nanosleep( ) path when called
    with a nonzero number of file descriptors and zero file
    descriptor masks.

    PHKL_25233:
    ( SR:8606217425 CR:JAGad86577 )
    When poll(2) is called with 0 file descriptors and a timeout
    of 1msec, it hangs in unselect(). This is due to a race in
    unselect(). The race occurs because the thread's status shows
    it in a running state while it is actually being scheduled to
    be put to sleep by making an entry in its sleep channel.

    Resolution:
    When a thread is in unselect() and its state is running
    (TSRUN), check whether its sleep channel is set, and remove
    the thread from the sleep queue so that it does not miss a
    wakeup.

    ( SR:8606199534 CR:JAGad68721 )
    The wakeup call to a thread sleeping in select(2) may be lost
    on a multi-processor system under the following conditions:
    1) The thread migrates from one processor to another just
       before going to sleep in select(2). This could cause the
       wakeup routine to use the wrong sleep lock.
    2) Multiple threads calling select(2) on the same read channel
       go to sleep at the same time. This sleep collision could go
       undetected within a certain time window.

    Resolution:
    The thread remembers the cpu on which it initially started,
    and uses this cpu id to get the sleep lock. This cpu id is
    used throughout the life of the thread, even if the thread
    migrates to another cpu. This resolves problems related to
    thread migration.
    sleep has been enhanced so that putting the thread to sleep
    and setting its sleep channel happen atomically. This resolves
    the race between sleep and thread collision.
    The patches PHNE_25084 and PHKL_25389 are required by this
    patch as they deliver some of the new functionality described
    above.

    PHKL_23843:
    ( SR:8606177051 CR:JAGad46286 )
    select(2) restarts upon receiving a signal if SA_RESTART has
    been set for the interrupting signal. When it is used as a
    fast sleep/wakeup mechanism and receives a signal, select(2)
    restarts with its original timeout interval. If it receives
    signals periodically at an interval shorter than its own
    timeout interval, select() appears never to return until the
    signals stop.

    Resolution:
    Force select(2) not to restart when it is used as a
    sleep/wakeup mechanism, that is, when its "range of file
    descriptors" parameter is 0.

    ( SR:8606180428 CR:JAGad49649 )
    The wakeup call to a thread sleeping in select(2) may be lost
    on a multi-processor system under the following conditions:
    1) The thread migrates from one processor to another just
       before going to sleep in select(2). This could cause the
       wakeup routine to use the wrong sleep lock.
    2) Multiple threads calling select(2) on the same read channel
       go to sleep at the same time. This sleep collision could go
       undetected within a certain time window.

    Resolution:
    Added a check to make sure that no migration is taking place
    when a sleep lock is chosen. Closed the window in which a
    sleep collision could go undetected.
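
    Because select(2) used as a sleep/wakeup mechanism is no longer
    restarted after a signal, and because a timer-style select(2)
    may also return earlier than requested, a caller that needs the
    full delay has to track the remaining time itself. The following
    is a minimal application-level sketch of that pattern, not part
    of the patch. The name select_sleep_us() is hypothetical, the
    headers are the POSIX ones, and it is assumed (the patch text
    does not say so) that an interrupted call fails with EINTR. It
    does not work around the unpatched kernel, where the restart
    happens without the caller seeing a return.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: sleep for total_us microseconds using select(2)
 * with no file descriptors. After every return (timeout, early expiry,
 * or EINTR) the remaining time is recomputed against a fixed deadline,
 * so the original timeout is never reused.
 * Returns 0 once the deadline has passed, or -1 with errno set on error. */
int select_sleep_us(long total_us)
{
    struct timeval now, deadline, tv;
    long remaining;

    if (total_us < 0) {
        errno = EINVAL;
        return -1;
    }

    /* Compute the absolute wall-clock deadline once, up front. */
    gettimeofday(&deadline, NULL);
    deadline.tv_sec += total_us / 1000000L;
    deadline.tv_usec += total_us % 1000000L;
    if (deadline.tv_usec >= 1000000L) {
        deadline.tv_sec += 1;
        deadline.tv_usec -= 1000000L;
    }

    for (;;) {
        gettimeofday(&now, NULL);
        remaining = (long)(deadline.tv_sec - now.tv_sec) * 1000000L
                  + (long)(deadline.tv_usec - now.tv_usec);
        if (remaining <= 0)
            return 0;               /* deadline reached */

        tv.tv_sec = remaining / 1000000L;
        tv.tv_usec = remaining % 1000000L;

        /* nfds == 0: select() acts purely as a timer. */
        if (select(0, NULL, NULL, NULL, &tv) < 0 && errno != EINTR)
            return -1;              /* genuine failure */
        /* On timeout or EINTR, loop and re-check the deadline. */
    }
}
```

    A call such as select_sleep_us(250000) then waits about 250ms in
    total even if signals arrive more often than that. The deadline
    is taken from gettimeofday(), so a clock adjustment during the
    wait would shift it.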
Enhancement: No

SR:
    8606177051 8606180428 8606199534 8606217425 8606314509
    8606347877

Patch Files:
    OS-Core.CORE2-KRN,fr=B.11.11,fa=HP-UX_B.11.11_32,v=HP:
    /usr/conf/lib/libfs.a(select.o)

    OS-Core.CORE2-KRN,fr=B.11.11,fa=HP-UX_B.11.11_64,v=HP:
    /usr/conf/lib/libfs.a(select.o)

what(1) Output:
    OS-Core.CORE2-KRN,fr=B.11.11,fa=HP-UX_B.11.11_32,v=HP:
    /usr/conf/lib/libfs.a(select.o):
        select.c $Date: 2004/07/04 09:16:13 $Revision:
        r11.1 1/4 PATCH_11.11 (PHKL_30373)

    OS-Core.CORE2-KRN,fr=B.11.11,fa=HP-UX_B.11.11_64,v=HP:
    /usr/conf/lib/libfs.a(select.o):
        select.c $Date: 2004/07/04 09:16:13 $Revision:
        r11.1 1/4 PATCH_11.11 (PHKL_30373)

cksum(1) Output:
    OS-Core.CORE2-KRN,fr=B.11.11,fa=HP-UX_B.11.11_32,v=HP:
    3238114226 14872 /usr/conf/lib/libfs.a(select.o)

    OS-Core.CORE2-KRN,fr=B.11.11,fa=HP-UX_B.11.11_64,v=HP:
    3102173598 41576 /usr/conf/lib/libfs.a(select.o)

Patch Conflicts: None

Patch Dependencies:
    s700: 11.11: PHKL_25389 PHKL_30516 PHNE_25084
    s800: 11.11: PHKL_25389 PHKL_30516 PHNE_25084

Hardware Dependencies: None

Other Dependencies: None

Supersedes:
    PHKL_25233 PHKL_23843

Equivalent Patches: None

Patch Package Size: 50 KBytes

Installation Instructions:
    Please review all instructions and the Hewlett-Packard
    SupportLine User Guide or your Hewlett-Packard support terms
    and conditions for precautions, scope of license,
    restrictions, and limitation of liability and warranties,
    before installing this patch.
    ------------------------------------------------------------
    1. Back up your system before installing a patch.

    2. Login as root.

    3. Copy the patch to the /tmp directory.

    4. Move to the /tmp directory and unshar the patch:

        cd /tmp
        sh PHKL_30373

    5. Run swinstall to install the patch:

        swinstall -x autoreboot=true -x patch_match_target=true \
            -s /tmp/PHKL_30373.depot

    By default swinstall will archive the original software in
    /var/adm/sw/save/PHKL_30373. If you do not wish to retain a
    copy of the original software, include the patch_save_files
    option in the swinstall command above:

        -x patch_save_files=false

    WARNING: If patch_save_files is false when a patch is
             installed, the patch cannot be deinstalled. Please
             be careful when using this feature.

    For future reference, the contents of the PHKL_30373.text file
    are available in the product readme:

        swlist -l product -a readme -d @ /tmp/PHKL_30373.depot

    To put this patch on a magnetic tape and install from the
    tape drive, use the command:

        dd if=/tmp/PHKL_30373.depot of=/dev/rmt/0m bs=2k

Special Installation Instructions: None