Our goal was not to build tools but to evaluate them. Nevertheless, some tool-building was inevitable. To achieve the necessary expressibility with ESC/Java, we needed to add one new pragma and some flexibility for the use of so-called ghost variables. With the help of Sanjay Ghemawat, we implemented Eraser for Java. The Eraser algorithm finds races by checking all loads and stores to detect shared variables that are accessed by different threads without using a lock to ensure mutual exclusion.
While checking Mercator, we identified six basic concurrency control techniques:
ESC Java's annotation language was designed to specify the monitored object technique, but all the others represent specification challenges. Eraser deals with the first four techniques automatically, and the rest require minimal annotations. We have designed annotation strategies to allow ESC/Java checking of all the techniques. Our strategies make heavy use of ghost variables and of the ESC/Java annotations "defined_if" and "writable_if" (the latter of which we had to add in the course of our research).
We haven't annotated all of Mercator for ESC/Java, so all of the races that we found were found with Eraser. Mercator consists of approximately 25,000 lines of code, and Eraser produced fifteen warnings on the first run. Of these warnings, two were real races and the rest were false alarms. The two races were fixed, and sixteen lines of Eraser annotations were added to supress the false alarms.
We have also checked Pachyderm [4], the Java mail and news system built by Andrew Birrell et. al. There were a large number of warnings, primarily because Pachyderm uses the Java windowing classes, which seem to be a fulsome source of races. Andrew claimed that three of the warnings were worth acting on in his code.
Our first conclusion is that Mercator uses more synchronization techniques than were envisioned by the designers of ESC/Java.
Our second conclusion is that Eraser is more effective than ESC at finding race conditions, especially given a large, unannotated program.
Third, ESC/Java requires far more annotation work. A result of this, however, is that it allows for the documentation of the synchronization design of the program.
We have not experienced coverage to be a problem with either tool, but it seems still to be an open issue for each tool, for different reasons. In Eraser we don't have a clear coverage measure, because in addition to passing through each branch of the program at least once, one needs to check that each shared variable has been accessed by at least two threads.
[1] Stefan Savage, Michael Burrows, Greg Nelson, Patrick Sobalvarro, and Thomas E. Anderson. Eraser: A dynamic data race detector for multi-threaded programs. ACM Transactions on Computer Systems (TOCS), 15(4):391-411, November 1997. Also appeared in Proceedings of the Sixteenth ACM Symposium on Operating System Principles, October 5-8, 1997, St. Malo, France, Operating System Review 31(5), ACM Press, 1997, ISBN 0-89791-916-5, pp 27-37.
[2] David Detlefs, K. Rustan M. Leino, Greg Nelson, and James B. Saxe. Extended static checking. Research Report 159, Compaq Systems Research Center, Palo Alto, CA, December 1998. Also see: Extended Static Checking Web page.
[3] Allan Heydon and Marc Najork. Mercator: A Scalable, Extensible Web Crawler. To appear in World Wide Web, December 1999. See also: The Home page of the Mercator Web Crawler.
[4] The Pachyderm Email System