ESC/Java User Manual       ESCJ 21--draft!         26 May 1998
--------------------

This document describes the programming tool ESC/Java, which attempts
to statically find common run-time errors in Java programs.  This
document is supposed to be easy to read and aims at getting ESC/Java
users up and running quickly.  This means that common situations are
described.  For information beyond what is described here, please
consult the ESC/Java Reference Manual or equivalent.


Invoking ESC/Java
~~~~~~~~~~~~~~~~~
The command line interface to ESC/Java resembles that of the javac(5)
compiler:

    escjava sourcefiles*

For example, if you would compile a program "Cup.java" with the
command line

    javac Cup.java

then you would run ESC/Java on this file with the command line

    escjava Cup.java


Overview
~~~~~~~~
ESC/Java tries to find program points where the given Java program may
cause a "checked run-time error".  Checked run-time errors include
null dereferences, array access errors, division by zero, and type
cast errors.  Java specifies that the run-time system will detect such
run-time errors and turn them into the throwing of an exception, which
the Java program can catch at run-time.  ESC/Java follows a stricter
discipline and will complain if a program may cause an checked
run-time error, even if the program actually catches the resulting
exception.  For example, ESC/Java would complain about a null
dereference on the third line of the following program fragment:

    Object[] a = null;
    try {
      int n = a.length;  // this will dereference null
    }
    catch (NullPointerException e) { }

despite the fact that the program fragment includes an exception
handler that catches the exception.

In addition to checking for checked run-time errors, ESC/Java can be
instructed to look for other errors, including concurrency errors like
race conditions and deadlocks.

ESC/Java is a modular checker.  This means that it can check
individual parts of a program and does not need global program
information.  For example, ESC/Java checks the implementation of a
class without using the program text of present or future subclasses
of that class.  Similarly, it checks the implementation of a method
without the code that calls the method and without the code that the
implementation calls.  These virtues of a modular checker can be
likened to those of most compilers, which can generate code for a
method implementation without knowing the code that calls the method
or the code that the method implementation calls.

How can a checker be modular?  After all, the other parts of a program
matter, too.  A compiler requires that every method have a declared
signature, which contains information like the number of parameters,
the types of parameters, and the return type.  In laying down target
code for a call, for example, the compiler needs only the signature of
the callee, not the implementation of the callee.  And in an
object-oriented program, there is not necessarily a one-to-one mapping
from call sites to implementations; nevertheless, compilers can still
cope because all possible callees for a given call site have the same
signature.  A modular checker relies on method specifications and
other specifications, similar to the way a compiler uses signatures.

A specification is a contract between a caller and a callee, or 
between a user of a data structure and its supplier.  As such,
specifications record programmer design decision, which usually cannot
be inferred from the program text itself.  Therefore, ESC/Java has an
annotation language with which a programmer can record various design
decisions.  From the information provided by annotations, a checker
may produce fewer spurious warnings.  At the same time, ESC/Java will
use the annotations to attempt to find places where the program
violates the contracts spelled out.

The ESC/Java annotation language has been designed to be simple and
useful in revealing errors and determining non-errors in programs.
ESC/Java does not catch every program error, nor does every warning
it reports indicate an actual program error.  The annotation language
includes facilities for suppressing particular warning messages
without introducing reciprocal checks.  A programmer can (and a good
engineer will) use these facilities when it appears that the effort
involved in writing down further specifications would outweigh the
benefit that such specifications will bring.

ESC/Java annotations go beyond the type annotations required by Java,
but have been designed so that the effort of writing them down is
somewhat analogous to that of writing type annotations.  In return,
the ESC/Java checker can find errors that a type checker never could
find.  This explains the name ESC/Java:  Extended Static Checker for
Java.


Format of annotations
~~~~~~~~~~~~~~~~~~~~~
ESC/Java annotations, or \pragmas/ as they are also called, are
supplied in special Java comments.  When the very first character
after the "/*" or "//" that begins a Java comment is "@", ESC/Java
parses the body of the comment as a sequence ESC/Java pragmas.  Such a
comment is called a pragma-containing comment.

Pragmas fall into four categories:  lexical pragmas, statement
pragmas, declaration pragmas, and modifier pragmas.  All pragmas
contained in one pragma-containing comment must be of the same
category.

A pragma-containing comment with lexical pragmas can occur anywhere a
Java comment can occur.  A pragma-containing comment with statement
pragmas can occur essentially wherever a Java statement can.  A
pragma-containing comment with declaration pragmas can occur wherever
the declaration of a class or interface member can occur.

A pragma-containing comment with modifier pragmas can occur anywhere a
Java modifier (like "private" or "final") can occur.  In addition, for
style and readability, such a comment can occur just before the ";"
that terminates a variable declaration and just before the "{" that
opens a method body.  For example, the two declarations

    /*@ monitored */ int x;

and

    int x /*@ monitored */ ;

are equivalent.  Similarly, the two declarations

    /*@ requires p != null */ void m(Object p) { ... }

and

    void m(Object p) /*@ requires p != null */ { ... }

are equivalent.  Note that the "p" in "p != null" refers to "m"'s
parameter, regardless of whether the pragma is placed before or after
the method's signature.  The order and alternative placement of
modifier pragmas is irrelevant.

A modifier pragma applies to all identifiers introduced by the
declaration.  For example,

    /*@ non_null */ Object x, y, z;

and equivalently

    Object x, y, z /*@ non_null */ ;

declare all of "x", "y", and "z" to be "non_null".  Note that a
pragma-containing comment must be placed either before the type of the
declaration or in the alternate position.  For example,

    Object x, y /*@ non_null */, z;

is not allowed.


Annotations
~~~~~~~~~~~
This section describes the most common ESC/Java pragmas.

----  non_null  ----

Perhaps the simplest way to get started with ESC/Java is to use the
"non_null" modifier pragma, which can be applied to any program
variable (static field, instance variable, local variable, or
parameter).  If a variable "x" is declared with

    /*@ non_null */ T x;

then ESC/Java will check that every assignment to "x" assigns a
non-null value.  This also means that "x" can always be deferenced
without any danger of causing a null-dereference error.

----  invariant  ----

At the heart of ESC/Java is the invariant declaration pragma.  By
putting the declaration

    /*@ invariant J; */

in a class "T", where "J" is a boolean expression that can be written
in the scope of "T".  This declares an \object invariant/ "J".  More
specially, if "J" mentions (implicitly or explicitly) the special
variable "this", then "J" is called an \instance invariant/;
otherwise, "J" is called a \static invariant/.  Roughly speaking,
object invariants are supposed to hold on every method boundary, that
is, on entry to and exit from every method.

Some examples of useful object invariants are

    int f, g;  /*@ invariant f < g; */

    char[] a;  /*@ invariant a == null || (0 <= n && n <= a.length); */
    int n;

--- Specification expressions
The expression "J" in "invariant J" is boolean \specification
expression/.  A specification expression is essentially a Java
expression that is free from subexpressions with possible side
effects.  In particular, a specification expression cannot contain any
assignment expressions or method invocations.  A specification
expression can include some subexpressions that are not part of Java.
These are described next.

In addition to the Java's boolean operators (like "!", "&&", and
"||"), a specification expression can use implication, written "==>".
Implication has lower binding power than "!", "&&", and "||".  Thus,
one of the example invariants displayed above can alternatively be
written as

    /*@ invariant a != null ==> 0 <= n && n < a.length; */

A specification can also contain a quantification.  This is useful to
express properties about the elements of an array.  For example,

    int[] a;
    /*@ invariant a != null &&
          (forall int i; 0 <= i && i < a.length ==> 0 <= a[i]); */

says that every element of "a" is a natural number.  The existential
quantification, which is used much less often, is written similarly:

    int[] a;
    /*@ invariant a != null &&
          (exists int i; 0 <= i && i < a.length && a[i] < 0); */

One more specification expression feature is worth mentioning at this
time.  In addition to the Java expression

    E instanceof T

which is "true" when "E" is null or when the dynamic type of "E" is
a subtype of "T", a specification expression can refer more generally
to the dynamic types of objects:  the dynamic type of an object "E" is
written "typeof(x)", a type "T" can be written (in places other than
the second argument to "instanceof") as "type(T)", the element type of
an array type "T" is written "elemType(T)", and types can be compared
using "==" or the subtype relation "<:".  For example, the invariant
in

    class Node {
      Node next;
      /*@ invariant next == null || typeof(next) == type(Node); */
      ...

says that the dynamic type of any successor object must be exactly
"T" (not a proper subtype of "T").  The invariant in

    class Node {
      Node next;
      /*@ invariant next == null || typeof(next) == typeof(this); */
      ...

says that lists chained by the "next" field are homogeneous.  For
example, if "o" is an object of a subtype "DoublyLinkedNode" of
"Node", then "o.next" is (either null or) also an object of type
"DoublyLinkedNode".  (Preconditions of binary methods can easily be
written in a similar way.)  Finally, the invariant in

    T[] a;  /* invariant a != null && type(U) <: elemType(typeof(a)); */

where "U" is some subtype of "T", implies that any expression whose
static type is "U" can be assigned to an element of "a" without any
danger of causing an "ArrayStoreException".  This invariant can be
written equivalently as

    T[] a;  /* invariant a != null && type(U[]) <: typeof(a); */

--- Establishing invariants initially
An instance invariant in a class "T" is established for a new
object by a constructor of "T".  The constructor bodies in Java
run from superclasses (beginning with "Object") to subclasses (ending
with the class that is the allocated type of the object).  Therefore,
the instance invariants of subclasses have not yet been established
when a superclass constructor is executed.  This means that
constructors of any non-"final" class "T" should not "give away" the
object being constructed ("this"), even if the object meets all
instance invariants declared in class "T".  ESC/Java will warn about
any use of "this" in a constructor other than that of assigning to the
fields of "this".  Note that a method call on "this" counts as such a
use.

A class invariant is established initially by the static initializers
and bodies of a class.  Checking this code is tricky, because any
cyclic dependencies between initialization code of different classes
can easily cause the checker to miss important errors.  ESC/Java takes
some preventive measures by imposing some restrictions, but these
restrictions are not described here.

--- When an object invariant is checked to hold
Object invariants need not hold after every assignment; they are
checked only on method boundaries.  In particular, they are assumed to
hold upon entry to a method implementation and are checked to hold
upon exit from a method implementation.

Checking object invariants only on method boundaries allows a method
to perform several assignments before an object invariant has been
re-established.  However, sometimes the re-establishing of an object
invariant "J" may require a few method calls.  In such a case, it
would be too strict to insist that all object invariants, including
"J", hold on every method boundary.

ESC/Java attempts to remedy this problem by not requiring that \all/
object invariants hold on every method boundary:  at a call site,
ESC/Java checks that all static invariants hold, and that all instance
invariants hold for objects stored in static fields and passed as
parameters to the method being called.  The rationale is that the
instance invariants a callee is most likely to rely on are the
instance invariants of the callee's parameters.  ESC/Java assumes that
a callee does not falsify any object invariant.

----  assert  ----

ESC/Java features the well-known \assert/ statement as a statement
pragma.  It has the form

    /*@ assert P; */

where "P" is a boolean specification expression.  ESC/Java will check
that "P" holds at this program point.

Asserts are useful because they provide some redundancy--a programmer
can write down a condition believed to be true at a particular program
point.  ESC/Java will check that the program text and the assert
are consistent.

----  nowarn  ----

Many of ESC/Java's warning messages point out not real program
errors but rather places where a programmer design decision has been
made.  The common response to such warnings is to add an annotation
that documents the design decision.  However, sometimes the
correctness of a piece of code boils down to something that would
require considerable effort to specify with annotations.  In such a
case, it is better to instruct ESC/Java to ignore the hazard, and move
on.

A crude way of telling ESC/Java to ignore a particular hazard is the
"nowarn" pragma, which specifies a list of hazards for which the
programmer will take responsibility.  The pragma applies to the
program line on which it is given, and applies to all places on that
line where an error of the specified kind may possibly appear.  For
example, suppose ESC/Java outputs a message like

    Cup.java(221,7): Null dereference error (Null)

for the program fragment

    219:  ...
    220:  if (x < y) {
    221:    z = o.f;
    222:  }

This message warns about the possibility of a null dereference error
on line 221, column 7, of the source file Cup.java.  The parenthesized
"(Null)" in the error message is a label that can be used with the
"nowarn" pragma.  If the reasoning behind why "x < y" implies that "o"
is non-null is non-trivial, then perhaps this is a good application
for the "nowarn" pragma.  The following annotated program fragment
will not produce the warning message:

    219:  ...
    220:  if (x < y) {
    221:    z = o.f;      //@ nowarn Null
    222:  }

----  assume  ----

While giving up on trying to further specify a program to avoid
spurious warning messages is, at some point, a good idea, use of the
"nowarn" pragma may seem undesirable since it does not document \why/
a programmer thinks a particular error won't occur.  Also, the
"nowarn" pragma applies to an entire source line, which may sometimes
be too course grained.  ESC/Java provides a different pragma that
improves on the "nowarn" pragma in these two respects, the "assume"
pragma.

Both an "assert" and an "assume" state a condition that a programmer
believes to hold at a particular program point.  The difference is
that ESC/Java checks the asserted condition, whereas it simply takes
the assumed condition for granted.  That is, the programmer takes
responsibility for that the assumed condition will hold.

The following annotated program fragment shows how to use an "assume"
with the example presented with "nowarn" above:

    ...
    if (x < y) {
      /*@ assume o != null; */
      z = o.f;
    }

Regrettably, no checker knows every true fact of life.  The "assume"
pragma can be used here, too.  For example, if the absence of checked
run-time errors in your program depend on certain properties of
multiplication, such as knowing that a square is never negative:

    if (x*x < a.length) {
      y = a[x*x];
    }

you can use an "assume" to make ESC/Java more informed:

    if (x*x < a.length) {
      /*@ assume 0 <= x*x; */
      y = a[x*x];
    }

This assume pragma states that the square of the program variable "x"
is non-negative.  A more natural way to write the pragma may be

    if (x*x < a.length) {
      /*@ assume (forall int n; 0 <= n*n); */
      y = a[x*x];
    }

which states the multiplication property more generally.

Note that, since the programmer is responsible for anything written in
an assume pragma, the checker may become less effective in finding
real errors after an assume that mentions an untruth.  For example, if
the programmer would accidentally write

    /*@ assume (forall int n; n*n < 0); */

then the checking downstream from this pragma is suspect.  Indeed,
ESC/Java will do no checking downstream from a pragma

    /*@ assume false; */

The moral of this story is:  Use "assume" pragmas, because you're
likely to need them, but be careful when you do.

----  defined_if  ----

A static field, instance variable, or local variable "x" of type "T"
can be annotated with a pragma

    T x /*@ defined_if P */;

where "P" is a boolean specification expression.  This pragma causes
ESC/Java to check that "P" is "true" any time the value of "x" is
read.

For example, suppose that a variable "n" keeps track of how many
elements of an array "a" are in use, but that "a" may be null if it is
not needed.  Then, one can declare

    int[] a;
    int n  /*@ defined_if a != null */;

This annotation records the design decision that the value of "n" is
irrelevant when "a" is null.  In this case, a program is likely to be
in error if it ever reads the value of "n" when "a" is null; the
annotation will prompt ESC/Java to look for this error.  A different
design decision for the same problem can be recorded as

    int[] a  /*@ defined_if 0 < n */;
    int n;

Here, the design decision is that "n" can always be read, but unless
"n" is positive, "a" should not be used.  Thus, in the first example,
a program would test "a != null && 0 < n" to find out if any items
are stored in the array, whereas in the second example, a program
would simply test "0 < n".

  Remark.
  For the picky, it may seem that "defined_only_if" is a better name
  for this pragma than "defined_if".  Yet a more precise name would be
  "meaningful_only_if".
  (End of Remark.)

  Remark.
  Perhaps unfortunately, ESC/Java does not provide a way to say when
  particular array elements are defined.  This could be useful since
  one could then express "a[i] is meaningful only if 0 <= i && i < n",
  which says that only the first "n" elements of "a" are in use.
  (End of Remark.)

----  uninitialized  ----

Java defines a strict and often helpful rule that every local variable
must have a \definitely assigned/ value when any access of its value
occurs [JLS, ch.16].  This essentially means that a Java program won't
compile if one of its methods contains a path between the declaration
and use of a local variable with no intervening assignments to that
variable, whether or not execution along that path is ever dynamically
possible.  This rule can catch common mistakes of forgetting to
initialize a variable.  However, the conservative nature of the test
rules out some programs where nothing could go wrong at run-time.
As an extreme example, any method containing the following
program fragment will result in a compile-time error:

    int x;
    if (false) {
      int y = x;
    }

A more common and useful example of a program fragment that won't
compile because of Java's definite assignment rule is one of the form:

    int x;             // declaration of "x"
    if (B) {
      x = E;           // assignment to "x"
    }
    ...;               // code that doesn't mention "x"
    if (C) {
      y = ... x ...;   // use of "x"
    }

If "C" is true only when "B" was true earlier, then this piece of code
make sense.  However, because there is no \definite/ assignment to "x"
before its use (caused by the conservative rules that Java uses to
define defintely assignment), this program fragment will cause a
compile-time error.  There is a simple workaround, however:  provide a
"dummy" initializer for "x":

    int x = 0;
    ...

With this assignment to "x", the compiler will no longer complain.
On the other hand, the compiler will not catch initialization errors
either, so no benefit is reaped from Java's definte assignment rule.
In particular, what if the programmer is wrong about thinking that "C"
is will true only if "B" was?  Then the use of "x" will return the
dummy value (here, 0), which means that the programming error goes
unnoticed, which is arguably worse.  (Raymie Stata has often expressed
a wish that Java's definite assignment rule would result in a run-time
error, rather than a compile-time error.)

ESC/Java to the rescue.  By annotating a local variable declaration
with the the pragma "uninitialized", ESC/Java will check that the
initializing value is never actually used.  Thus, one would write the
code fragment above as:

    /*@ uninitialized */ int x = 0;
    ...

The assignment to "x" will silence Java's conservative compile-time
checks, and the "uninitialized" pragma will engage ESC/Java more
precise compile-time checks.

----  unreachable  ----

Programmers commonly write "assert false" in places where control is
believed never to enter.  ESC/Java provides a special annotation that
expresses this fact in a less mysterious way, the statement pragma
"unreachable":

    /*@ unreachable; */

----  requires  ----

As alluded to in the introduction, methods and constructors have
specifications, which act as contracts between callers and
implementations.  

TBW.

----  ensures and modifies  ----

TBW.


----  loop_invariant  ----

TBW.

----  monitored_by  ----

TBW.

----  monitored  ----

TBW.

----  axiom  ----

TBW.


Advanced comment on syntax.
Pragmas whose syntax is more than a single keyword end with an
optional semi-colon.  In general, we advocate a annotation style where
a terminating semi-colon is used whenever a corresponding Java
construct would use a semi-colon.  That is, statement pragmas and
declaration pragmas look nice with a terminating semi-colon, whereas a
modifier pragmas often doesn't.  Without semi-colon, there may be
ambiguities, however, because most ESC/Java "keywords" are actually
parsed as identifiers which are at times compared with a set of
supported "keywords".  For example,

    /*@ requires (T) ensures modifies */

may parse as

    /*@ requires (T) ensures */
    /*@ modifies */

which seems to specify a precondition expression "(T) ensures"--that
is, an identifier "ensures" cast to a "T"--and an empty modifies
clause, or may parse as

    /*@ requires (T) */
    /*@ ensures modifies */

which seems to specify a precondition expression "T" and a
postcondition expression consisting of an identifier "modifies".  In
these cases, ESC/Java will follow the rule of parsing a pragma for as
long as possible.  If this is not what the programmer intended,
mysterious type errors may be reported.  Therefore, a programmer may
well want to take advantage of the fact that terminating semi-colons
are allowed.  For example, one may then write either

    /*@ requires (T) ensures ;
        modifies */

or

    /*@ requires (T) ;
        ensures modifies */

(End of Advanced comment on syntax.)