I attended the USENIX 2002 conference. I hope you and the LtU readers might be interested in some of my notes. The usual disclaimer about subjectivity applies.

* X Meets Z: Verifying Correctness in the Presence of POSIX Threads
Bart Massey, Portland State University; and Robert T. Bauer, Rational Software Corp. Freenix track.

The paper showed how formal methods helped solve a really hard multi-threaded problem. The problem arose during the development of XCB, a client library for the X protocol (see the XCB/XCL talk below). The library is supposed to be used _transparently_ in a multi-threaded environment. Making the library thread-safe is hard because the X protocol does not, by its design, prevent deadlocks. Therefore, the client library (XCB in our case) must be careful to avoid these potential deadlocks. Alas, such a careful design is hard. The author tried a brute-force approach and failed: an algorithm would prevent a deadlock in one case and allow it in another; Bart Massey would correct the algorithm, only to find another fatal flaw later. After four attempts, he realized that there must be a better way.

As a professor at Portland State University, Bart Massey teaches Z. The textbook in his class is "The Way of Z: Practical Programming with Formal Methods". Interestingly, this book was written by a radiologist. Bart Massey decided then to follow what he teaches. He wanted to specify POSIX threads in Z, specify the X server behavior in Z, describe the properties of the desired client algorithm in Z, and then formally prove that the whole system is deadlock-free. The latter part -- the formal proof -- is still wanting. Unfortunately, the semantics of POSIX threads are not well specified. Still, the mere act of describing the client behavior in Z gave a crucial insight that led to the correct algorithm. That algorithm is implemented in the current version of XCB.
Although there is no formal proof of its correctness (yet), extensive testing and use have revealed no problems. Thus the lesson of this talk is that formal methods are indeed useful: good ideas emerge merely from writing down a formal specification. Engineers use math and logic extensively in their work; software engineers and open-source developers should do that too.

* XCL: An Xlib Compatibility Layer for XCB
Jamey Sharp and Bart Massey, Portland State University. Freenix track.

XCB is a C binding for the X protocol. Unlike Xlib, XCB deals only with communication, marshaling, and similar issues. XCB offers a far simpler and more consistent interface to the X protocol than Xlib does. XCB is very small (27 KB compiled) and is therefore particularly suitable for embedded systems. XCB can be used in a multi-threaded environment. The code for XCB is mainly _automatically_ generated from the specification of the X protocol!

Xlib contains much more functionality than a mere client side of the X protocol. XCL is an attempt to implement some of that functionality using XCB to communicate with the server. XCL+XCB is meant as a replacement for Xlib (a lighter version of Xlib, to be precise). The goal is largely achieved: XCL+XCB is 55 KB compiled, whereas Xlib is 665 KB (some of which could be reclaimed by replacing pervasive macros with procedure calls). Some of the XCL code is automatically generated; most of the rest is copied from the Xlib source code and adjusted to use XCB.

XCL does not implement all of Xlib. In particular, XCL does not support X color management, i18n, and caches. As the presenter showed, these features, although accounting for more than half of Xlib, are hardly used at all: most toolkits do i18n and color management on their own. The author reported success in relinking several X Window applications (rxvt and gw) to use XCL rather than Xlib.
http://xcb.cs.pdx.edu/

* Cyclone: A Safe Dialect of C
Trevor Jim, AT&T Labs-Research; and Greg Morrisett, Dan Grossman, Michael Hicks, James Cheney, and Yanling Wang, Cornell. General track.

The talk made a flimsy case for Cyclone. The presenter never justified his assertions that developers really need to know the precise details of data representation and to access the bits and bytes of complex data structures. Incidentally, if these assertions were true, developers should not be using Cyclone instead of C.

Cyclone introduces several kinds of pointers, including a fat pointer. A fat pointer carries the range information needed for run-time bounds checks. A fat pointer is not merely a pointer, yet its details are opaque to developers. A dereference operation via a fat pointer is not atomic (which creates problems for using Cyclone in multi-threaded programs).

The idea of a never-null pointer, and of static reasoning about such pointers, is interesting. Let's consider

    int @f = x;

where '@' marks a never-null pointer, similar to '*' marking a regular pointer. An assignment or initialization of a never-null pointer entails an _implicit_ run-time null check. However, if the value to assign is known to the compiler to be a never-null pointer, no run-time check is inserted. Note that the Cyclone compiler adds implicit run-time checks -- which make the language safer but remove the programmer from "the bare metal." If a programmer doesn't mind an abstraction over the bare metal, the programmer would be more comfortable with a more expressive language such as OCaml or Java.

Performance: from zero overhead to a 3x slowdown in pointer-intensive applications. After the talk, several people noted that a 3x slowdown is unacceptable. Cyclone still supports neither multi-threading nor completely manual memory management (it does support arena management and, it seems, GC).
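To make the two pointer disciplines concrete, here is a minimal Python sketch of the checking behavior described above. The names and structure are my own illustration, not Cyclone code: a fat pointer carries its bounds and checks them at every dereference, while a never-null pointer is checked once, at assignment, after which dereferences need no check.

```python
class FatPtr:
    """Models a Cyclone '?' pointer: base, bounds, and offset travel together."""
    def __init__(self, buf, offset=0):
        self.buf, self.offset = buf, offset

    def deref(self):
        # The run-time bounds check the Cyclone compiler inserts implicitly.
        if not (0 <= self.offset < len(self.buf)):
            raise IndexError("fat-pointer bounds violation")
        return self.buf[self.offset]

    def advance(self, n):
        # Arithmetic is unchecked; the check is deferred to the dereference.
        return FatPtr(self.buf, self.offset + n)


def never_null(p):
    """Models assignment to an '@' pointer: one implicit null check here,
    after which every dereference may be compiled without any check."""
    if p is None:
        raise ValueError("null check failed at assignment")
    return p
```

The sketch also shows why fat-pointer dereference is not atomic: it touches several fields (base, bounds, offset), which is exactly what causes trouble under pre-emptive threads.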
One person asked a funny and witty question: with never-null pointers (marked by '@' rather than the regular '*'), fat pointers (marked by '?'), and contemplated forward-only pointers, won't the designers of Cyclone run out of characters to mark these pointers? Have they considered Unicode?

http://www.cs.cornell.edu/projects/cyclone/

* The AGFL Grammar Work Lab
Cornelis H.A. Koster and Erik Verbruggen, KUN. Freenix track.

The talk described the first _free_ parser generator for natural languages, released under the GNU licenses (GPL/LGPL). The parser generator takes a grammar and a lexicon and yields a parser. The parser reads text and converts it into a Head-Modifier (H-M) form. Generated parsers can be used commercially, royalty-free.

The H-M form is a set of pairs [head, modifiers] where 'head' and 'modifiers' are (possibly empty) strings of words. For example, the phrase "new walking shoes" is translated into

    [N:shoes, A:walking] [N:shoes, A:new]

Both the head and the modifiers are typed: N (noun), P (pronoun), A (adjective), V (verb form). H-M pairs represent the _major_ relations expressed in the text; "unimportant" words are eliminated. The parser can also normalize the tense of verbs.

Here's a more elaborate example: "Every PhD student gets a reduced price for the ETAP conference" is parsed into

    [N:student, N:PhD] [N:student, V:gets] [V:gets, N:price]
    [P:it, V:reduced] [V:reduced, N:price]
    [N:price, for N:conference] [N:conference, N:ETAP]

The conversion from the passive participle phrase "a reduced price" to an active verb construction, [P:it, V:reduced] [V:reduced, N:price], is remarkable. The original phrase did not say who authorized the reduction of the price for the conference. The parser determined that the actor is missing and inserted a dummy actor 'it': [P:it, V:reduced].

H-M pairs produced by a parser can be used for text classification, information retrieval, and question-answering systems. The parser can also be used in educational software.
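The H-M output format is easy to model downstream as typed records. The little Python sketch below is my own illustration of consuming such pairs (the record names are mine, not AGFL's):

```python
from collections import namedtuple

# A typed term: tag is N (noun), P (pronoun), A (adjective), or V (verb form).
Term = namedtuple("Term", "tag word")
# One head-modifier pair, the unit of the parser's output.
HMPair = namedtuple("HMPair", "head modifier")


def parse_hm(text):
    """Turn a 'N:shoes, A:walking' style string into an HMPair record."""
    head, modifier = (Term(*part.strip().split(":")) for part in text.split(","))
    return HMPair(head, modifier)


# "new walking shoes" as given in the talk:
shoes = [parse_hm("N:shoes, A:walking"), parse_hm("N:shoes, A:new")]
```

A classifier or retrieval system would then count such records rather than raw words, which is exactly what the Peking system described below does.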
In fact, the parser has been used by the authors in a text (patent application) classification system called Peking. The AGFL package comes with a sample grammar and lexicon of English. AGFL stands for "affix grammar over a finite lattice", which is a _context-free_ grammar with set-valued attributes. The parser is non-deterministic and therefore not fast. It is, however, robust: it does not crash when presented with unknown words or an incorrect phrase. Again, the package is free.

The AGFL package is available at: http://www.cs.kun.nl/agfl/
The grammar of English is very illuminating: http://www.cs.kun.nl/agfl/npx/grammar.html

Given an ambiguous phrase, the parser currently returns only one parsing alternative (chosen more or less arbitrarily and heuristically). Version 2.0 will incorporate statistics and probabilities. The parser is more scalable and faster than DCG in Prolog. The system is written in the CDL3 language. The CDL system is also a parser generator, which yields fast _deterministic_ parsers.

When I asked the speaker to compare his approach with the "dominant" pure statistical approach, he said that extremes are rarely sufficient. Statistical approaches need a lot of statistics to represent the basic facts of the language (e.g., the Subject-Verb-Object order of English); the latter can be expressed more succinctly and conveniently by rules. It's more productive to use rules for most of the grammar and statistics to resolve ambiguities.

Cornelis H.A. Koster conducted a BoF on natural language processing. The main topic of the BoF was again statistical vs. rule-based classification and retrieval of documents. He mentioned the interesting fact that stemming and singular/plural normalization do not improve classification much. One drawback of the pure-statistical approach to information retrieval is that the relevance score is highly non-linear.
A relevance score of 99% indicates that the document in question is most likely highly relevant to the query. On the other hand, a relevance score of 70% often means that the document has nothing to do with the topic. The precision and recall of purely statistical classification and retrieval reach a plateau; you need to add grammar rules to get past it.

Dr. Koster briefly described his document classification system, Peking. The documents to classify are patent applications. First the system parses a document into H-M pairs, as outlined above. It then collects statistics on the H-M pairs and winnows out stiff and noisy terms. What's left are the significant pairs, which describe "the essence" of a set of documents.

* Plan9 and Inferno

As in the year before, Vita Nuova had a booth at the USENIX Expo. Two exhibitors were constantly busy demonstrating the two remarkable operating systems. Inferno now runs natively on an iPAQ. Inferno supports only one language: Limbo. Because the language is safe, Inferno does not need a kernel/user-space boundary, which makes the system fast. The Limbo virtual machine, Dis, is rather similar to the JVM; it's possible to map (all?) JVM codes to Dis. Alas, Dis does not support tail recursion.

Roger Peppe, one of the exhibitors, showed a Tk emulation in Limbo -- essentially, he wrote a Limbo/Tk. He also showed the amazingly transparent networking in Inferno/Plan9. A serial port in Plan9 is represented by several files, so you can control the port with the regular 'echo' command. He demonstrated a driver for a digital camera, written in "shell" (in Limbo). The interface to the driver is, again, a file system. In Plan9/Inferno, it's possible to create 'virtual files' that are associated with a pair of handlers rather than with storage. When you load the camera driver, you see a set of files in a directory you designate. Doing "echo zoom 0.5 > kodak/ctl" changes the zoom factor. You can see the pictures by executing "ls -l kodak/img/" at the shell prompt.
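The everything-is-a-file control style is easy to mimic: writes to a 'ctl' file are parsed as textual commands and dispatched to handlers instead of hitting storage. The Python below is my own toy model of the idea (in Plan9 such files are served over the Styx protocol; here a plain object stands in for the file server, and the 'zoom' command is taken from the demo):

```python
class CtlFile:
    """Toy model of a Plan9-style ctl file: each write is a textual
    command dispatched to a handler, not data stored on disk."""
    def __init__(self):
        self.zoom = 1.0

    def write(self, line):
        # Equivalent of the shell's: echo zoom 0.5 > kodak/ctl
        cmd, *args = line.split()
        if cmd == "zoom":
            self.zoom = float(args[0])
        else:
            raise ValueError("unknown ctl command: " + cmd)


kodak = CtlFile()
kodak.write("zoom 0.5")
```

Because the interface is just file reads and writes, any tool that can open a file -- echo, a GUI, or a program on a remote machine that mounted the directory -- can drive the device, which is the point of the demo below.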
Roger Peppe then showed a GUI, again written in shell (Limbo/Tk, to be precise), that talks to the camera by reading and writing the files in the kodak/ directory. You can mount the kodak directory on another computer -- an iPAQ connected to the Plan9 laptop via a wireless link -- and indeed control the camera from the iPAQ. The transparency of networking in Plan9/Inferno is astounding. The underlying protocol, Styx, is well thought out and concise: it takes fewer than 10 pages to describe it completely. OS developers could greatly enhance their systems by implementing Styx.

Distributed programming in Limbo is based on channels, as in communicating sequential processes. Threads are synchronized through rendezvous, as in Ada. From the distributed-systems point of view, Limbo programming looks surprisingly similar to Erlang.

Plan9 and Inferno binaries are freely available, even for commercial use. The source (the core) costs money. The source cannot be freely distributed, but it can be modified and used in a commercial application royalty-free. Changes do not have to be reported back to Vita Nuova. See the FAQs at www.vitanuova.com

[Below are three summaries from USENIX02 that talk about threads -- and programming languages' support for threads, or the lack of it. The common theme is that threads are really a bad idea, just as John Ousterhout said in 1996.]

* Cooperative Tasking Without Manual Stack Management
Atul Adya, Jon Howell, Marvin Theimer, Bill Bolosky, and John Douceur, Microsoft Research. General track.

The talk stirred up a lively debate, which continued after the talk and all through lunch. There are two major concurrency models: multi-threading and event-driven computation. As Ousterhout pointed out, threads are a bad idea for most purposes. The acrimony between the two camps is caused by conflating several concepts. The talk took time to explain the different axes along which a concurrent application can be placed.
** task management

*** serial task management

At one end is serial task management: a task executes on a CPU until it is finished, and no other task is scheduled until the first task completes. This strategy is the easiest to reason about, yet it is inefficient: the CPU idles while the task waits for I/O completion.

*** pre-emptive task management

Execution of tasks can interleave on a uniprocessor or overlap on multiprocessors. This strategy seems efficient -- until we consider communication between tasks. If tasks are completely separate and do not share any data, there is no danger in their being interleaved or overlapped [see the data-partitioning axis below]. However, tasks do communicate. In the threaded model, tasks share the same address space and communicate by updating shared variables. With pre-emptive multi-threading, a thread can generally never be sure that it is working with consistent data structures: a thread can be pre-empted at any moment, and a newly scheduled thread may alter the data structures. To be sure its data are consistent, a thread must lock the data structures and then verify (and, if needed, restore) invariants. However, a thread must not hold locks for a long time. Furthermore, to avoid deadlocks, a thread must release its locks in certain situations (when it fails to acquire all the needed locks, for example, or before doing slow I/O). Once the thread has released the locks and re-acquired them, it must verify the invariants all over again.

*** co-operative task management

This is a compromise between serial and pre-emptive management. Unlike serial task management, a task may surrender the CPU before completion and let another task (thread) run in its address space. Unlike pre-emptive management, a thread yields control of its own accord, at well-defined points in its execution. Cooperative multi-threading does not eliminate the problem of data-structure consistency.
After a thread yields control and then regains it, it has to verify all its invariants. However, a thread yields at points defined by its programmer; between those points, the computation can be assumed serial, and no locks are needed to guarantee the consistency of the thread's data. Therefore, the cooperative multitasking model is significantly less error-prone than the pre-emptive one. Cooperative multitasking often performs better, too, owing to the elimination of locking overhead. The work on the Flash web server and last year's USENIX presentation ["High-performance memory-based web servers: kernel and user-space performance" by Philippe Joubert et al.] showed that multi-threaded web servers are not competitive against a finite state machine with efficient event notification.

** stack management

Stack management is the business of making sure a task can access its local data after the task has lost and then regained control. Automatic stack management is what C thread libraries, for example, perform: the thread library saves the stack and other context of the thread that has lost control, and restores the context of the thread scheduled to run next. Manual stack management is when the programmer must pack a callback and all the data for it into a closure; the programmer thus manually creates a continuation frame, to be executed when an event arrives.

Other axes are: I/O management (synchronous vs. asynchronous), conflict management (optimistic vs. pessimistic; semaphores vs. monitors), and data partitioning. The paper explicitly considers only the task- and stack-management axes.

In a multi-threaded application, pre-emption of a thread is transparent to the programmer: the thread library or the OS takes care of saving and restoring the thread's stack. On the other hand, an event-driven application often must explicitly specify which function to execute when an event arrives and which data to pass to it. These data must be manually packed into a closure.
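The contrast is easy to see in miniature. In this Python sketch (my own illustration, not the paper's code), the first version does manual stack management -- the function is ripped apart at the blocking point, and its live locals are packed into a hand-built frame -- while the second lets the language construct the closure:

```python
# Manual stack management: the programmer packs live locals into a
# continuation record and splits the function at each blocking point.
def start_request(url, events):
    frame = {"url": url, "retries": 0}        # hand-built closure
    events.append(("io_done", step2, frame))  # register the callback

def step2(frame, result):
    # On re-entry, locals must be unpacked from the frame by hand.
    return frame["url"], result


# With real closures, the compiler builds the frame for us:
def start_request_closure(url, events):
    def on_done(result):                      # captures url automatically
        return url, result
    events.append(("io_done", on_done))
```

In C, every `frame` field, every `step2`-style continuation function, and every unpack has to be written and maintained by hand, which is exactly the "ripped apart" style the talk illustrated.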
Manual stack management rips a C function apart and leads to code that is difficult to maintain. The talk gave several good illustrations of the horrors of event-based programming in a language that does not natively support closures and continuations. The paper mentions that the transparency of a thread's pre-emption and resumption is actually a drawback, which makes multi-threaded programs difficult to reason about.

The paper contends that the immensely convoluted style of typical event-passing programs has nothing to do with cooperative task management. The convoluted style is the consequence of the manual _stack_ management. The presenter specifically mentioned that Scheme doesn't suffer from that style problem: in Scheme, event-based programming looks elegant and natural. As the paper says, "some languages have a built-in facility for transparently constructing closures; Scheme's call-with-current-continuation is an obvious example. Such a facility obviates the idea of manual stack management altogether. The paper focuses on the stack management problem in conventional languages without elegant closures."

The rest of the paper and of the talk showed techniques to make cooperative-multitasking programming, specifically in C, more pleasant. The paper demonstrated adapters that allow mixing cooperative-thread and pre-emptive-thread code (written by disparate programmers). Note a USENIX01 talk, "A toolkit for user-level file systems" by David Mazieres, who used C++ templates to automate the building of continuations. The solution advocated in the talk is to use co-routines based on "fibers". I think, though, that the subliminal message of the talk was to use Scheme.

* Improving Wait-Free Algorithms for Interprocess Communication in Embedded Real-Time Systems
Hai Huang, Padmanabhan Pillai, and Kang G. Shin, University of Michigan. General track.

This talk, perhaps unwittingly, confirmed the previous talk's observation that threads are really a bad idea.
The gist of the threading model is (conceptually) parallel execution of computations that communicate via shared data. This model is fault-prone, deadlock-prone, and priority-inversion-prone, and it exhibits poor scalability due to locking. The model is hard to reason about; e.g., it's hard to compute the worst-case execution time of a multi-threaded task. The latter is a particular drawback in real-time (RT) systems.

A better IPC model is wait-free inter-process communication. Concurrently running tasks can communicate through shared state -- but in a way that requires no locking and still guarantees race-free execution. The drawback is bigger space requirements.

The talk described a memory-efficient wait-free algorithm for the single-writer, multiple-reader problem. The algorithm is a combination of two kinds of non-blocking protocols. One is the non-blocking write (NBW) protocol, which is as fast as a simple write to shared memory without any synchronization. The other class of lock-free write/read protocols relies on the presence of certain atomic operations: compare-and-swap, or atomic increment and decrement. Both classes of lock-free protocols trade space for safety.

The interesting result of the paper is that by combining the two classes, it's possible to design a fast lock-free protocol that takes less memory than either of the original algorithms applied separately. The insight is an appropriate partitioning of the reader tasks into slow and fast readers. The fast readers and the writer interact via an NBW protocol; the slow readers interact via one of the protocols of the second class (e.g., a double-buffering protocol). The hybrid protocol takes less space than either NBW or the double-buffering protocol applied to all readers, and it is faster than the double-buffering algorithm alone. The paper and the talk showed the results of several benchmarks.
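The reader-side retry at the heart of an NBW-style protocol is simple enough to sketch. Below is my own single-threaded Python model of the idea (in the real protocol the counter and value live in shared memory and the retry loop guards against a genuinely concurrent writer): the writer bumps a concurrency-control counter to an odd value before writing and back to even afterwards; a reader accepts a snapshot only if the counter was even and unchanged around the read.

```python
class NBWBuffer:
    """Single-writer shared value with a concurrency-control counter:
    odd counter = write in progress; readers retry on any change."""
    def __init__(self, value):
        self.ccf = 0          # concurrency-control field
        self.value = value

    def write(self, value):   # writer side: never blocks, never waits
        self.ccf += 1         # now odd: write in progress
        self.value = value
        self.ccf += 1         # even again: value is consistent

    def read(self):           # reader side: lock-free retry loop
        while True:
            before = self.ccf
            if before % 2 == 0:          # no write in progress
                snapshot = self.value
                if self.ccf == before:   # no write interfered
                    return snapshot
            # otherwise a writer was active; retry
```

Note that readers pay with (potentially unbounded) retries rather than blocking, which is why such protocols trade space -- extra buffers bound the retries -- for safety.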
The graphs demonstrated that the best locking solution to the single-writer, multiple-reader problem has the smallest space requirements but absolutely horrible worst-case performance.

* An Implementation of Scheduler Activations on the NetBSD Operating System
Nathan J. Williams, Wasabi Systems Inc. Freenix track.

The talk started with an overview of three thread implementation models. In one model, threads exist only at the user level, and all threads run within the same kernel process. The implementation is fast; however, when one thread enters the kernel or blocks, all other threads within the same process are blocked too.

At the other extreme are kernel threads. Like processes, kernel threads are scheduled or blocked independently. Unlike processes, different threads may share the same address space, job identification, user credentials, etc. Although thread switching is faster than process switching, it is still relatively expensive. More importantly, each kernel thread consumes kernel resources, which are limited; kernel thread synchronization involves a system call and is therefore expensive.

The third model -- the topic of the talk -- is a two-level implementation: N kernel threads run M (M >= N) user threads. Thread synchronization and switching occur at the user level. When a kernel thread executing a user thread blocks on I/O or another system call, the kernel "sends a signal" to the application, telling it that a user thread has blocked. The signal will be handled by the remaining kernel threads assigned to the application. If there are no other threads to handle the "signal", the kernel may launch a new thread and give it to the application. The "signal" handler (called an upcall handler) will mark the user thread that got itself blocked as non-runnable and will switch to another runnable user thread, if any. This approach implies co-operation between the kernel and the user application.
The kernel scheduler notifies the application about scheduling decisions that affect it: creation and deletion of kernel threads, and a thread becoming blocked, unblocked, or pre-empted. In the latter case, the pre-emption event is delivered to the application _after_ it is resumed (so the application cannot stave off the pre-emption, but it can still learn that it was off the CPU for some time).

The NetBSD implementation described in the talk uses so-called scheduler activations. Scheduling event notifications are called upcalls; they behave pretty much like POSIX signals. An application registers its intent to be notified of scheduling events by registering an upcall handler. The application must also allocate a set of stack frames on which to run the handler.

The implementation described in the talk looks surprisingly similar to call/cc-based threads and co-routines. When the kernel executes an upcall, it runs special trampoline code in user space. The trampoline saves the stack pointer and the context of the previously running code, sets up the stack frame, and jumps to the upcall handler. Upcall handlers are supposed to "switch" to an appropriate user thread; hence, they do not return. To avoid running out of stack frames for upcall handlers, the application must determine which stack frames are no longer "relevant" and manually "garbage-collect" them (i.e., mark them as free). The talk clearly showed kernel hackers re-discovering call/cc.

Scheduler activations will be in the next release of NetBSD. The maximum number of concurrent threads run without encountering resource limitations was 7,000-8,000. After the talk, Clem Cole commented that they have to support more threads and have to spend an enormous amount of time on debugging. Clem Cole said that DEC/Compaq thought they had gotten threads right in their Tru64 OS -- until they tried to run a Netcenter, which launched 10,000-100,000 threads. A lot of bugs showed up then.
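The call/cc flavor of the design comes from the way control always returns to a handler that chooses the next runnable continuation and never itself returns. Here is my own toy Python model of that control structure, with generators standing in for suspended user threads and the scheduler loop playing the role of the upcall handler:

```python
# Toy user-level scheduler in the spirit of scheduler activations:
# when a task "blocks" (yields), control returns to the scheduler,
# which switches to another runnable task.
def run(tasks):
    log = []
    while tasks:
        task = tasks.pop(0)
        try:
            log.append(next(task))  # run the task until it yields ("blocks")
            tasks.append(task)      # still runnable: requeue it
        except StopIteration:
            pass                    # task finished; "garbage-collect" its frame
    return log


def worker(name, steps):
    for i in range(steps):
        yield f"{name}:{i}"         # each yield models blocking on I/O
```

The analogy is loose, of course: the real implementation must also handle frames allocated for upcalls, pre-emption notifications delivered after the fact, and kernel threads appearing and disappearing underneath.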
* Biglook: A Widget Library for the Scheme Programming Language
Erick Gallesio, University of Nice; and Manuel Serrano, INRIA. Freenix track.

Like Tcl, Biglook relies on a programming language for building graphical user interfaces. It uses a CLOS-like object layer [generic functions rather than class methods] to represent widgets, and Scheme closures to handle events. Biglook is a (Bigloo) Scheme library; Biglook code can be compiled by the Bigloo compiler into C or into JVM bytecode. A full-fledged IDE for Scheme, Bee, is implemented in Biglook.

Biglook is a thin wrapper on top of native back-end libraries. Currently, two native widget libraries are supported: GTK+ and Swing. The wrapping overhead is quite low: about 15% in additional allocations.

The paper stresses several points that greatly contribute to Biglook's expressiveness: 'virtual slots' in records, keyword arguments, and closures as callbacks. Virtual slots are the mechanism that relays actions on widget classes to the back-end widget library. Virtual slots also make the API concise and expressive; for example, the three classical functions 'iconify', 'de-iconify', and 'window-iconified' are represented in Biglook by a single virtual slot, 'visible', in the window class.

Variable numbers of arguments and keyword arguments are essential to declarative GUI programming. Without them, the creation of a widget must be split into two steps: the instantiation of the widget and the setting of its attributes. The alternative -- passing a list of attributes to the widget constructor -- involves run-time overhead and disables compile-time checking of attributes and their values. In Biglook, keyword parameters are resolved at macro-expansion (i.e., compile) time. A macro, instantiate::widget (automatically generated from the widget's class declaration), does all the work of resolving keyword attributes. Other automatically generated macros are getters and setters, and a 'with-access::widget' binding form.
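The single-expression style that keyword arguments enable can be illustrated in any language that has them; the Python below is my own analogue, not Biglook code (and note that Python resolves keywords at run time, whereas Biglook's instantiate::widget macro resolves them at compile time):

```python
# With keyword arguments, a widget can be created, configured, packed,
# and wired to its event handler in one expression.
class Button:
    def __init__(self, label="", on_click=None, packed_in=None):
        self.label, self.on_click = label, on_click
        if packed_in is not None:
            packed_in.append(self)   # packing as part of construction


panel = []
clicks = []
Button(label="Quit",
       on_click=lambda: clicks.append("quit"),  # closure as callback
       packed_in=panel)
```

Without keyword arguments, the same code degenerates into the two-step instantiate-then-set-attributes style the paper criticizes.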
During the talk, Manuel showed several GUIs along with their complete code, each ten lines long or shorter. Of special note is a comparison of Biglook, Tcl/Tk, Swing, and GTK+ on the example of a widget with several buttons. In Biglook, the creation of a button, its packing, and the setting of its event handler are all part of the same expression. In Tcl/Tk, the code that deals with a button is spread over three places; in Swing, the same code is split across four places; in GTK+, across five. The paper notes that in the authors' experience, Biglook source code is about half the size of the same interface implemented in Tcl/Tk, and about a quarter the size of the interface implemented in C with GTK+.

http://kaolin.unice.fr/Biglook