ESCJ 1:  Translating Java into units.

Raymie Stata and Rustan Leino.

[From post to src.sparta dated 25 Mar 97.]

[From Raymie and Rustan. Update of previous message based on discussion
last Thursday.]

This message describes a translation of Java declarations into
declarations in a language like Ecstatic [KRML 65] plus units.  We are
considering using this translation inside the ESC/Java tool.  We feel
that it provides a nice framework for generating "small" verification
conditions (in particular, it makes it easy to include checking of
only the "necessary" invariants in the verification condition).  In
that respect, it also provides a philosophy for what declarations and
invariants ought to be visible where.  Also, it provides a nice
framework for generating "small" background predicates.
 

** Background on Java

Java classes and interfaces live in "packages" (open-ended collections
of classes).

Java defines four levels of "access protection" on members of classes:
private, package, protected, and public.  Private members of a class
are visible only in the class itself; package members are visible in
any class inside the class's package; protected members are likewise
visible in any class inside the class's package and are also visible
to subclasses; public members are visible to all classes.

Classes and interfaces themselves have two levels of protection.  A
"public" class or interfaces is visible to all classes and interfaces
in all packages.  A "package" class or interface in package P is
visible only to other classes and interfaces inside P.  Java allows
members to be declared "public" in "package" classes and interfaces,
but it turns around and treats such members as if they were "package."
In our discussion below, we ignore "public" members declared in
"package" classes and interfaces, assuming instead that they've been
desugared into "package" members.

Java doesn't use the keyword "package" to designate package-level
protection; the lack of a keyword designates this level, and this
level is often called "default-access" in the official Java spec.  To
be more explicit, below we use the made-up keyword "package".
 

** Background on Ecstatic plus units

A unit is a collection of declarations.  A unit M is defined using the
syntax:

  unit M

A declaration is included in M using the syntax:

  M: <decl>

where <decl> is one of:

  import N,...  // Make decls of N and other units visible in M
  type T <: U,... // Declare T and make it a subtype of U,...
  var x : T  // Declare a variable of type T
  proc m(...) S  // Declare a procedure m with specification S
   // (specifications are not discussed in this message)
  impl m(...) is S // Declare an implementation of m with body S.
   // (More than one impl can be given for a proc,
   // and an impl may be given with a more
   // specific signature.)
  invariant J  // Declare an object invariant J

The syntax "unit M imports N,O" is shorthand for "unit M", "M: import
N", "M: import O".

We say that a unit "contains" identifiers declared in the unit.  We
say that a unit "makes visible" identifiers contained in the unit and
identifiers made visible by other units imported by the unit.  A unit
is allowed to references only those identifiers it makes visible.

** The translation

We assume that all Java names have been mangled so as to all be
distinct.

Every Java program gives rise to the following Ecstatic declarations:

    --> unit Public
        Public: type Object
        Public: ...a bunch of other, boiler-plate declarations

For every C -- either public or package -- our translation produces a
declaration "type C..." contained in the Public unit.  In addition,
for each "public" member m of a public class, our translation produces
a declaration for m contained in the Public unit.

Each package in Java gives rise to a unit:

  package P
    --> unit P import Public

In our translation, a unit like P makes visible all declarations
visible to all classes in P.  It does this by importing the Public
unit, by importing special units containing declarations of
"protected" members defined in P, and by itself containing
declarations of all "package" members of classes and interfaces
defined in P.  This will be explained in more detail below.

Every Java class definition produces three units and two declarations.
Assume "public class C extends D implements J, K { ... }" appears in
package P; the following translation results:

  public class C extends D implements J, K
    --> unit protC import Public
        unit allprotC import protC, allprotD
        unit privC import P, allprotC
        Public: type C <: D,J,K
        P: import protC

The unit "protC" contains protected members declared by C.  This unit
is imported by P, making these members visible to all classes defined
in P.  The unit "allprotC" makes visible all protected members of C
including inherited ones (note that allprotC is _not_ imported by P,
because protected members inherited from other classes are not
necessarily visible to all classes in P).  Finally, the unit "privC"
contains both private members and the code of C.

The following declarations are assumed to be declared in the class
"C" just mentioned.

  private T x;
    --> privC:  var x: C --> T

  package T x;
    --> P:  var x: C --> T

  protected T x;
    --> protC:  var x: C --> T

  public T x;
    --> Public:  var x: C --> T

If a variable "x" is declared as "static", then the rules above change
only in that "var x: C --> T" becomes "var x: T".  For example:

  private static T x;
    --> privC:  var x: T

A Java method declaration turns into two Ecstatic declarations, one
declaring the method and one giving an implementation thereof:

  protected T m(...) { S };
    --> protC:  method res: T := m(this: C, ...)
        privC:  impl res: T := m(this: C, ...) is S

This example shows a protected method.  A method declared with a
different access modifier is placed in the unit suggested by the
instance variable translation above.  If the Java method is declared
as "static", then the "this: C" parameter is omitted in the Ecstatic
translation.  If the Java declaration declares a new, "abstract"
method, then only the Escstatic "method" declaration is generated and
no "impl".  If the Java declaration overrides a superclass method
without changing the access protection of the method, then only the
Ecstatic "impl" declaration is generated, not the "method"
declaration.  If the method override also widens the access protection
of a method -- for example, turning a protected method public -- then
both the "method" and "impl" are generated.  (If we allow subclasses
to refine specifications of methods, then this would cause a new
"method" declaration to be generated.)

We have shown a class "C" declared as a "public class".  Had "C" been
declared as a "package class", then the translation would've been the
same (keeping in mind that "public" members are really "package"
members in the context of "package" classes).  This implies that
"package" classes are still declared in the Public unit.  This is
necessary because the signature of a public or protected method might
include a "package" class in it.  If this signature is inherited by
classes in another package, the type generated by this "package" class
must still be visible.
 

Now for interfaces:

  interface J extends K
    --> Public:  type J <: K

Interfaces are different from classes.  Every member of a "public
interface" is "public", and every member of a "package interface" is
"package".  Further, all methods are "abstract", and all fields are
"static" and "final".  With these caveats, translation of interface
members is as for classes.
 

** Annotations

Now for our annotations.  As stated above, every variable is declared
as "private", "package", "protected", or "public".  In addition, our
annotations declare each variable as "writable-deferred",
"writable-private", "writable-package", "writable-protected", or
"writable-public" a la KRML 74.  By default, the write attribute
matches the corresponding read attribute.  We define a function
"write-unit(x, C)" for every variable "x" and class "C".  In the
context of a class "C" and package "P", variable declarations give
rise to the following rules about "write-unit":

  writable-deferred x
    --> write-unit(x, C) is not defined

  writable-private x
    --> write-unit(x, C) = privC

  writable-package x
    --> write-unit(x, C) = P

  writable-protected x
    --> write-unit(x, C) = protC

  writable-private x
    --> write-unit(x, C) = Public

Also, the following rule about subtypes applies to any inherited
variable "x" for which the subtype contains no write attribute:

  class D extends C
    --> write-unit(x, D) = write-unit(x, C) if write-unit(x, C) defined
        otherwise write-unit(x, D) is not defined

For an update statement

  o.x = E

mentioned in a class "C", the checker emits a warning unless
"write-unit(x, T)" is imported by "privC", where "T" is the
*static* type of "o".
 

A class may contain invariant declarations.  If the context of a class
"C", we have:

  invariant J;
    --> X:  invariant (ALL this: C ::  this == null  \/  J)

where "X" is a unit such that for every variable "y" mentioned in "J",
"write-unit(y, C) = X".  If no such "X" exists, the Java invariant
declaration in not allowed.  (If "J" contains no variables, the
Ecstatic invariant declaration is placed in unit "privC".)
 

Legal Statement Privacy Statement