Patch Name:  PHKL_2874

Patch Description: s700 8.07 Patch for NFS corruption/EACCESS/automount

This patch fixes data corruption and/or premature end-of-file conditions
across NFS.  The problem has been experienced by two customers in two
different ways:

   Case1:  The customer was using the -async option on the NFS server.
           The client was doing sequential writes using the O_APPEND
           flag.  The client was also running multiple biods.

           The O_APPEND forces the client to lseek to the end-of-file
           before processing the write.  Therefore, if the end-of-file
           is incorrect, the client will overwrite sections of the file
           and the file will end up shorter than expected.

           The root cause for this case was simply that the client was
           processing NFS write replies out-of-order.  This was caused
           by the fact that the server was sending the replies too close
           to each other due to the use of -async.

           A good workaround to this problem is to run with 0 biods or
           to run with 1 biod.  Not so good workarounds include using
           O_SYNC along with O_APPEND or not using the -async option
           on the server.

   Case2:  The customer was using a Sun server running NFS version 4.2.
           The client was doing sequential writes with a random read
           executed once-in-a-while to verify the last 512 bytes written.
           The read was sometimes failing with 0 bytes returned, indicating
           end-of-file.  (That is called a premature end-of-file.)

           The root cause for this case was in the behavior of the Sun
           NFS 4.2 server.  The Sun 4.2 server code returns stale file
           attributes whenever a duplicate write request is made.  The
           HP client was receiving incorrect information from the server,
           specifically, the location of the end-of-file (i.e. the file
           size was too small).  Note that this would happen only on the
           reply to a duplicate write request.

           Since the read was happening on the last 512 bytes, the HP
           client would believe the end-of-file coming back from the
           Sun server (as a result of a duplicate write) and return 0
           bytes to the user program.

The fix for these two problems was to force the HP client code to double
check with the server BEFORE truncating the file size.  Essentially, if
a write reply comes back with a file size that is smaller than expected,
the client will go across-the-wire with a getattr call to double check.
If the getattr returns the smaller file size again, then the HP client
assumes the file has honestly been truncated.
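
The double check described above can be sketched in C as follows.  This is
only an illustration under stated assumptions, not the actual HP-UX client
code: struct fake_server, struct file_state, fake_getattr() and
apply_write_reply() are hypothetical names invented for the example.

```c
/* Hypothetical stand-in for the NFS server: it holds the true file size
 * and counts how many getattr calls the client sends across the wire. */
struct fake_server {
    long size;
    int  getattr_calls;
};

/* Hypothetical per-file client state: the size the client believes. */
struct file_state {
    long cached_size;
};

/* Stand-in for the over-the-wire getattr call. */
static long fake_getattr(struct fake_server *srv)
{
    srv->getattr_calls++;
    return srv->size;
}

/* Apply the file size carried in a write reply.  A size that is not
 * smaller than expected is accepted at once.  A smaller size is never
 * taken from the write reply itself; the client asks again with getattr
 * and uses that answer instead. */
long apply_write_reply(struct file_state *f, long reply_size,
                       struct fake_server *srv)
{
    if (reply_size < f->cached_size)
        f->cached_size = fake_getattr(srv);   /* double check first */
    else
        f->cached_size = reply_size;
    return f->cached_size;
}
```

In this sketch a stale size in a write reply costs one extra getattr call and
leaves the cached size alone, while a real truncation is still seen because
the getattr reports the smaller size as well.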

==================================================================

PHKL_2475:

        This patch fixes a bug that occurs in an NFS server when it receives a
        duplicate CREATE request.  The bug is that this duplicate request may
        fail with EACCES even though the original request succeeded.  This
        patch allows the NFS server to properly process the duplicate CREATE
        request.  The corresponding 9.01/S700 patch is PHKL_2476.

A typical scenario where this problem may occur is many NFS clients
working very heavily (> 100 NFS requests per second; see nfsstat(1M)) on a
single server.  One customer experienced the problem when running lots of
rm/cp/mv commands; once in a while a cp or mv command would fail with
"permission denied" which translates into an EACCES returned from a
creat system call.

The problem is caused by the following sequence of events:
=========================================================

        o The client generates a CREATE request with 0000 permissions.
        o The server performs the CREATE operation and sends a reply.
        o The reply is dropped and never received by the client.
        o The client generates a second/duplicate CREATE request.
        o The server attempts to do a second CREATE which sets the
          u.u_error flag to EACCES.
        o The server identifies the second CREATE as a duplicate request.
        o The server fails to reset the u.u_error flag to 0; this is key!
        o The server performs a LOOKUP operation to return the file id.
        o The entry is not in the directory name lookup cache (dnlc) which
          causes the server to invoke native file system code.
        o The native file system code sees the EACCES in u.u_error and
          returns immediately.

The fix was simply to clear the u.u_error flag after detecting a duplicate
request and before performing the LOOKUP operation.
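
The sequence of events and the fix can be modelled with a small C sketch.
It is a simplified illustration, not the kernel source: struct uarea,
fs_lookup() and handle_duplicate_create() are hypothetical names, and the
clear_error argument exists only so that the old and new behavior can be
compared side by side.

```c
#include <errno.h>

/* Hypothetical model of the per-request error slot (u.u_error). */
struct uarea {
    int u_error;
};

/* Stand-in for the native file system lookup: it returns immediately
 * with the pending error if one is already set. */
static int fs_lookup(const struct uarea *u)
{
    if (u->u_error != 0)
        return u->u_error;
    return 0;   /* lookup succeeded, file id can be returned */
}

/* Process a CREATE that turns out to be a duplicate of an earlier,
 * successful request.  clear_error == 0 models the unpatched server,
 * clear_error != 0 models the patched one. */
int handle_duplicate_create(struct uarea *u, int clear_error)
{
    /* The second CREATE on the mode 0000 file fails. */
    u->u_error = EACCES;

    /* The request is recognized as a duplicate at this point. */
    if (clear_error)
        u->u_error = 0;   /* the fix: forget the expected failure */

    /* LOOKUP misses the dnlc and falls through to the file system. */
    return fs_lookup(u);
}
```

With clear_error set to 0 the function returns EACCES, which is the
"permission denied" the customer saw; with clear_error set to 1 it
returns 0.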

PHKL_1785: Fixes NFS data loss.

This patch fixes a problem where some writes to an NFS file can be lost if
another process is stat(2)'ing the file.  The process writing the file must
also be doing an lseek(2) to the end of the file between writes.  This
combination is likely to cause the problem.  It is also possible that multiple
processes stat()'ing the file while one is writing it could cause the problem.

PHKL_1602: All NFS kernel patches.

   This patch contains all the kernel patches for NFS.  It is designed to
   make it easier for customers to get all the patches at once.  There are
   seven problems fixed with this patch that were not fixed with PHKL_1102,
   the previous NFS super KERNEL patch.  They are the first seven problems
   described below.

   This patch was originally released with a bug in nfs_server.o that
   could cause spurious error messages of "rfs_readdir:  bad directory:
   mangled entry."  This would rarely occur but was not a real error.
   This patch has been updated to fix this bug and you should make sure
   that the what strings in your kernel match the one given below for
   nfs_server.o.

   This patch fixes a problem where NFS file systems may be incorrectly marked
   as "ignore" in /etc/mnttab on diskless systems.  This would cause commands
   like bdf to not report disk space for the file systems.  The "ignore"
   designation is only used by the automounter but sometimes a diskless client
   would mark a file system as "ignore" in /etc/mnttab.  The problem would
   occur whether or not the automounter was running on the system.  Also if
   the automounter was running on the system, file systems may not have been
   marked "ignore" when they should have been.  This patch fixes both symptoms
   of the problem.

   This patch allows the automounter to work correctly on diskless clusters.
   Without this patch the clients can not access the automounted file systems
   which are configured as direct-mapped mounts.  You should also install
   patch PHCO_1527 to fix a booting problem when the automounter is used in a
   diskless cluster.

   The patch lowers some timeout values that are used to control how often
   the lock manager retries remote locks.  Without this patch, retries will
   take 15 seconds which can be very painful.

   The patch also fixes a memory leak that can occur using NFS.  The amount
   of memory available to the system can be bled off until there is not
   enough left to execute processes.  This only occurs if a non-HP machine
   is an NFS client.

   The patch fixes an infinite loop that occurs if NFS tries to read a mangled
   directory entry.  In the case of a readdir(2) system call, if the directory
   length is 0, the kernel can loop forever, making the processor unavailable
   to any process.  The processor will still respond to interrupts, so the
   machine will respond to ping(1M), but it is essentially locked up.
   Note that readdir() is used by the ls(1) command, so just using ls over
   NFS can cause the hang.
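
   The general shape of this kind of hang can be shown with a short C
   sketch.  It assumes, for illustration only, that entries are packed in
   a buffer as variable-length records that each begin with their own
   length; struct rec_hdr and count_entries() are hypothetical and do not
   reflect the real HP-UX directory format or kernel code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record header: each entry starts with its total length
 * in bytes, header included. */
struct rec_hdr {
    unsigned short reclen;
};

/* Walk the entries packed in buf and return how many there are.
 * Returns -1 for a mangled entry.  Without the length check, a record
 * length of 0 would leave off unchanged and the loop would never end. */
int count_entries(const unsigned char *buf, size_t len)
{
    size_t off = 0;
    int n = 0;

    while (off + sizeof(struct rec_hdr) <= len) {
        struct rec_hdr h;

        memcpy(&h, buf + off, sizeof h);

        /* Reject entries that cannot advance the walk or that run
         * past the end of the buffer. */
        if (h.reclen < sizeof h || off + h.reclen > len)
            return -1;

        off += h.reclen;
        n++;
    }
    return n;
}
```

   The point of the sketch is only that every pass through the loop must
   either advance or stop with an error.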

   This patch also fixes a problem where a process that opens a NFS file and
   then reads and writes to it in a loop runs slowly.  The problem was that
   every read after a write on the same file was causing a flush of the
   written data to the server.

   Without this patch, automount will not clean up direct-mapped mount
   points correctly when it is killed.  This can cause processes
   to hang if they access these mount points.  This was originally in
   PHKL_0836.  This fix is in HP-UX 9.0.

   The fix in patch PHKL_1374 is included in this patch since it was
   mistakenly made so that applying PHKL_1374 would undo part of
   PHKL_1102 or vice-versa.  It fixes a rare problem where a full or
   nearly full disk that has a fragment size greater than 1k can cause a panic
   with the message "freeing free frag".  Only SDS disk arrays have
   larger blocksizes by default, but other filesystems can have large
   blocksizes if created using options to makefs or newfs.  This fix is
   in HP-UX 9.0.

   This patch also fixes a problem where using the automounter can cause
   processes to hang if too many processes try to talk to the automounter
   daemon at the same time.  The processes would use all the CLIENT
   structures, but the automounter needed one to service the process's
   request so it would sleep waiting for one to be freed, while the processes
   would hold the CLIENT structures until the automounter responded.  Classic
   deadlock.  This patch fixes the problem by increasing the number of CLIENT
   structures from 6 to 100.  This was originally in PHKL_0876.  This fix is
   in HP-UX 9.0.

   A problem is fixed where making a directory, then mounting an NFS
   file system on that directory with invalid options appears to succeed
   but actually fails.  An example of an invalid option is rsize=-1.  This
   fix is in HP-UX 9.0.

   Finally, this patch also fixes a problem where clients may keep stale data
   indefinitely.  If a file on the NFS server is updated within a second of
   the last access by an NFS client, the client may keep the stale data
   indefinitely.  This fix is in HP-UX 9.0.

The NFS Problem on HP-UX:
========================
After the first access, the NFS client kernel has the file attributes and
file data in its cache.  In a subsequent access, the kernel client code will:

        - go across the wire to refresh the file attributes
        - compare the new modification time (coming over the wire),
          using the seconds only, to the previous modification time (stored
          in the client's rnode).
             o if the seconds are the same (namely, the server managed to
               modify the file within a second of the client's last
               access), the client kernel code will NOT invalidate the
               file data in its buffer cache and will therefore keep the
               stale file data in its buffer cache indefinitely.

How come Sun's implementation works?
===================================
The Sun code compares seconds and microseconds; it's very unlikely that
the server could modify the file within a microsecond of the client's
last access.
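
The difference between the two comparisons can be made concrete with a
small C sketch.  The function names are hypothetical and the code is an
illustration of the two rules as described here, not an excerpt from
either kernel.

```c
#include <sys/time.h>

/* Seconds-only rule: the file counts as changed only when the seconds
 * differ.  A change made within the same second goes unnoticed. */
int changed_seconds_only(const struct timeval *cached,
                         const struct timeval *fresh)
{
    return cached->tv_sec != fresh->tv_sec;
}

/* Seconds-and-microseconds rule: any difference in either field marks
 * the file as changed. */
int changed_full(const struct timeval *cached,
                 const struct timeval *fresh)
{
    return cached->tv_sec != fresh->tv_sec ||
           cached->tv_usec != fresh->tv_usec;
}
```

For a cached time of 1000 seconds and 100 microseconds and a fresh time
of 1000 seconds and 900 microseconds, changed_seconds_only() returns 0
(cached data is kept) while changed_full() returns 1 (cached data is
invalidated).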

Why is HP-UX different?
======================
The system call utime(2) allows user level programs to change the
modification time for any particular file.  Unfortunately, this
system call takes time_t as an argument (HP-UX/POSIX/Bell), not
a struct timeval (see <sys/time.h>).  The time_t limitation limits
the granularity of the time stamp to seconds; consequently, utime(2)
resets the microseconds field to 0.  (NOTE: Sun's implementation of
utime(2) is BSD-based and does take a struct timeval as an argument.)
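
As a reminder of what the time_t based interface looks like, here is a
short C sketch using the POSIX form of utime(2), which passes the times
in a struct utimbuf holding two time_t values.  The helper name
mtime_after_utime() and the scratch file it creates are assumptions made
for the example.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <utime.h>

/* Create a scratch file at path, set its access and modification times
 * to secs with utime(2), and return the modification time that stat(2)
 * then reports, or -1 on any failure.  The scratch file is removed. */
long mtime_after_utime(const char *path, time_t secs)
{
    struct utimbuf times;
    struct stat st;
    long result = -1;
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    fclose(fp);

    /* struct utimbuf carries whole seconds only; there is no field in
     * which a caller could supply microseconds. */
    times.actime = secs;
    times.modtime = secs;

    if (utime(path, &times) == 0 && stat(path, &st) == 0)
        result = (long)st.st_mtime;

    remove(path);
    return result;
}
```

A caller can ask for a modification time of, say, 86400 seconds, but has
no way to ask for 86400 seconds and 500000 microseconds through this
interface.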

The HP-UX NFS client code change has a comment stating that "We've had
instances of things like fbackup(1), ftio(1), file(1) etc. zero out the
microseconds field and cause an unexpected "killed on text modification
error" even though the contents of the file had not really changed."  In
other words, with the original Sun code, if a backup utility on the server
uses utime(2) on an executable that is running on a client, the client kernel
code will kill the process because it thinks that the executable has been
modified on the server.

The change for HP-UX was made to eliminate this problem for executables.
In essence, the HP-UX client code ignores differences in the microseconds
fields, which unfortunately, creates the problem we are now patching.

PHKL_0736: system panics with bus error in nfs kernel code.

   The system panics with a bus error in the kernel routine do_bio().

   The file fixed was nfs/nfs_vnops.c

PHKL_0836: automount fails to clean-up properly when killed.

   Without this patch, automount will not clean up direct-mapped mount
   points correctly when it is killed.  This can cause processes
   to hang if they access these mount points.

   The file fixed was nfs/nfs_vnops.c

PHKL_0876: Processes hang when using NFS automounter

   This patch fixes a problem where using the automounter can cause processes
   to hang if too many processes try to talk to the automounter daemon at the
   same time.  The processes would use all the CLIENT structures, but the
   automounter needed one to service the process's request so it would sleep
   waiting for one to be freed, while the processes would hold the CLIENT
   structures until the automounter responded.  Classic deadlock.  This patch
   fixes the problem by increasing the number of CLIENT structures from 6 to
   100.

   The file fixed was nfs/nfs_subr.o

PHKL_0942: automounter patches.  "Data segmentation fault" in do_bio.

   This patch is a compilation of all previous patches that affect the
   automounter.  It is designed to make it easier for customers.

PHKL_1102: automounter patches.  NFS patches.

   This patch is a compilation of all previous patches that affect the
   automounter and NFS.  It is designed to make it easier for customers
   to get all the patches at once.

PHKL_1374: Fixes freeing free frag for systems with > 1k frag sizes


Path Name:  /hp-ux_patches/s700/8.X/PHKL_2874

Effective Date:  930825

Patch Files:  
/system/PHKL_2874/new/dux_lookup.o
/system/PHKL_2874/new/dux_mount.o
/system/PHKL_2874/new/klm_lckmgr.o
/system/PHKL_2874/new/nfs_server.o
/system/PHKL_2874/new/nfs_subr.o
/system/PHKL_2874/new/nfs_vnops.o
/system/PHKL_2874/new/ufs_alloc.o

SR/APR#:  5003094326

"what" string/timestamp:  
dux_lookup.o:
         PATCH_8.07:  dux_lookup.o  1.7.61.4    92/10/21 PHKL_1602
dux_mount.o:
         PATCH_8.07:   dux_mount.o   1.8.61.5   92/10/21 PHKL_1602
klm_lckmgr.o:
         PATCH_8.07:   klm_lckmgr.o  1.4.61.3   92/10/21 PHKL_1602
nfs_server.o:
         nfs_server.c       $Date: 93/04/28 16:22:16 $ $Revision: 1.17.77.4 $
PATCH_8.07 (PHKL_2475)
nfs_subr.o:
         PATCH_8.07:       nfs_subr.o     1.12.61.2    92/03/02 PHKL_0942
PHKL_1102 PHKL_1602
nfs_vnops.o:
         PATCH_8.07 nfs_vnops.c Revision: 1.17.77.11 Date: 93/08/25 18:37:32
PHKL_2874
ufs_alloc.o:
         PATCH_8.07:   ufs_alloc.o     1.29.61.2    92/11/16 PHKL_0942
PHKL_1102 PHKL_1374 PHKL_1602


"sum" output:  
8835 25 dux_lookup.o
50705 34 dux_mount.o
35170 11 klm_lckmgr.o
50502 46 nfs_server.o
50480 28 nfs_subr.o
9146 49 nfs_vnops.o
43189 34 ufs_alloc.o


Dependencies:  None.

Supersedes: PHKL_0736 PHKL_0836 PHKL_0876 PHKL_0942 PHKL_1102 PHKL_1374
            PHKL_1602 PHKL_1785 PHKL_2475

Patch Package Size:  185 Kbytes

Installation Instructions:  

     Please review all instructions and the Hewlett-Packard
     SupportLine User Guide or your Hewlett-Packard support terms
     and conditions for precautions, scope of license,
     restrictions, and limitation of liability and warranties,
     before installing this patch.

     Note: Please back up your system before you patch.

---------------------------------------------------------------------------
After getting the patch onto your machine, unshar the patch (sh PHKL_2874).

To install this patch do the following:
1) Run /etc/update (Note: you must be logged in as root to update
   a system).
2) Once in the update "Main Menu" move the highlighted line to "Change
   Source or Destination ->" and press "Return" or "Select Item".
3) Make sure the highlighted item in the "Change Source or Destination"
   window is "From Tape Device to Local System ...", then press "Return" or
   "Select Item".
4) You should now be in the "From Tape Device to Local System" window.
   Change the "Source:  /dev/rmt/0m" to "Source: /tmp/PHKL_2874.updt"
   (this assumes that you are in the /tmp directory where PHKL_2874.updt
   has been placed).  Note: You must enter the complete path name.
5) Press "Done".
6) From here on follow the standard directions for update.

The customized script that update runs will move the original software
to /system/PHKL_2874/orig.  HP recommends keeping this software there in
order to recover from any potential problems.  It is also recommended that
you move the PHKL_2874.text file to /system/PHKL_2874 to be retained for
future reference.

If you wish to put this patch on a magnetic tape and update from the tape
drive, dd a copy of the patch to the tape drive.  As an example the
following will create a copy of the patch that update can read:
dd if=PHKL_2874.updt of=/dev/rmt/0m bs=2048