Archive of UserLand's first discussion group, started October 5, 1998.

Language Design, or: Why WORA Didn't Fly

Author:Paul Snively
Posted:11/9/1999; 4:49:56 PM
Topic:T. Nelson Critique of Embedded Markup
Msg #:12906 (In response to 12893)
Prev/Next:12905 / 12907

Erik Neu wrote:

Not to debate the point, but it's slightly ironic coming from someone who writes so well ;)

Thanks! I have to credit my parents, both of whom were educators of the old school that insisted that the point of teaching was to provide students with the tools to think critically. Among human beings, facile use of language is a crucial tool.

On to the main point, and I've altered the subject appropriately, I hope:

>>The most pernicious example of the latter problem is, of course, Java, which was designed by people who knew better but for whatever reason didn't follow through.>>

It's getting off-topic, but I'd be very interested to see you expound on this.

I'll try to be brief.

Within Computer Science, there exists a subdiscipline concerned with the creation of programming languages. One of the tasks involved in creating a programming language is to construct some mapping from constructs of the language to "computable functions." The question as to what is a computable function and what is not was addressed well before the invention of the digital computer by mathematical logicians such as Alonzo Church and Haskell Curry. Alonzo Church invented the Lambda Calculus in order to provide a formal foundation for mathematical logic and computation. This will become important a bit later on.

The invention of the digital computer immediately raised the question as to how a machine could be made to compute any computable function, which is turn raised the question of "minimum requirements" to even be a candidate for being able to compute any computable function. A brilliant mathematical logical prodigy in England named Alan Turing answered these questions definitively when he specified a general mechanical computing device that he proved was capable of computing any and all computable functions. The device has come to be known as a "Turing machine." One of the tricky intellectual questions of the day was whether it was possible in principle to specify an algorithm for the Turing machine that could determine whether the execution of an arbitrary algorithm on an equivalent Turing machine would halt or not (the "Halting Problem.") It was proven that it was in fact not possible, so the Halting Problem is the canonical example of a non-computable function.

Fast forwarding a bit, we come to the observation that "all modern programming languages are Turing equivalent," which means that they, too, can compute any computable function. In CS circles, it's popular to dismiss language X is better than language Y arguments by observing, hey, what's the difference? They're both Turing equivalent! This is always offered tongue in cheek, as a brief look at what a Turing machine is will immediately explain why one might wish to use a slightly more abstract language, and that expressive power isn't the only criterion to measure a language's value by.

So thanks to Turing and his predecessors, there is a deep relationship between mechanical/electronic computing and the whole realm of computability itself. When designing a programming language, at some point you have to specify what each construct means. The technical term for this realm is "specifying the language's semantics." Thanks to Alonzo Church's work on the Lambda Calculus, we have at least one formal foundation for specifying the semantics of any computation whatsoever, and hence for specifying the semantics of programming languages. Further work by folks like Scott and Strachy led to the formation of the "denotational semantics," which is probably still the most popular way to formally define the semantics of a programming language.

Some languages that have such formal semantics are: Scheme, Standard ML, and Ada(!) There are no doubt others, especially in the "purely functional language" camp. The point here is that it's possible to prove, mathematically, whether any given implementation of a language with formal semantics implements those semantics or not. In practice, it's painfully difficult, as anyone who's attempted Ada implementation certification can attest to. Also, at some point, you're likely to end up relying on code with no such formal semantics, e.g. if your Ada compiler is implemented in C or C++. Even if it were, how trustworthy is your understanding of the semantics of the processor your code will run on? Microprocessors' semantics are always specified formally (typically in a language called Z), but that didn't prevent the Pentium floating-point bug.

Nevertheless, even if a language implementation hasn't been proven correct, merely the existence of a formal semantics tends to imply that an implementation is far more likely to be conformant than if the specification is written in a much more ambiguous language like English, and if a question ever arises, a formal proof of correctness can be undertaken if needs be.

Now, to Java. Java is a very new language--about four and a half years old as far as its public availability is concerned, and a few years older in terms of its life within Sun. Regardless of which measure you use, it's certainly young enough that its development should have caught the eye of people within Sun who are aware of formal language semantics, and in particular, enough people associated with Java have also been associated with Scheme (e.g. Guy Steele) that, even if Java started out having no formal semantics, it should have received one, perhaps as part of the 1.0-1.1 transition. Then it would have been possible to use a proof-theoretic approach to conformance-testing as opposed to the purely operational JCK approach.

Granted, even with a formal semantics for Java, there'd still be major hurdles: I don't know of any OS whose API has a formal semantics, so the moment the JVM calls the OS, it becomes impossible to prove anything about the behavior (although this may change in the OS world; see EROS for an example). Solving this problem probably boils down to the JVM formally specifying semantics of APIs that it relies on (e.g. an abstract, minimalist thread package) and then demonstrating how the various OS thread APIs (POSIX, Win32, MacOS, etc.) conform to those minmal requirements.

Incidentally, just because Sun failed to provide a formal semantics for Java, all is not lost; see <http://www.ercim.org/publication/Ercim_News/enw36/attali.html> and <http://www-sop.inria.fr/oasis/java> and finally <http://www-sop.inria.fr/oasis/java>

Given that an explicit design goal of Java was correct functionality across platforms, I'm surprised that more consideration does not appear to have been given to formally specifying how Java was supposed to work! I'm especially surprised because James Gosling and crew are obviously extremely intelligent, highly experienced people. Nevertheless, I'm glad to see that other organizations are on the ball, and perhaps one day we'll see a widely-accepted semantics for Java, including formal specifications of the requirements for underlying OS APIs, and implementations that are proven to be compliant.




This page was archived on 6/13/2001; 4:53:26 PM.

© Copyright 1998-2001 UserLand Software, Inc.