Changes in this revision:

Closures for the Java Programming Language (v0.5)

Gilad Bracha, Neal Gafter, James Gosling, Peter von der Ahé

Modern programming languages provide a mixture of primitives for composing programs. Most notably Scheme, Smaltalk, Ruby, and Scala have direct language support for parameterized delayed-execution blocks of code, variously called lambda, anonymous functions, or closures. These provide a natural way to express some kinds of abstractions that are currently quite awkward to express in Java. For programming in the small, anonymous functions allow one to abstract an algorithm over a piece of code; that is, they allow one to more easily extract the common parts of two almost-identical pieces of code. For programming in the large, anonymous functions support APIs that express an algorithm abstracted over some computational aspect of the algorithm. For example, they enable writing methods that act as library-defined control constructs.

Closure Literals

We introduce a syntactic form for constructing an anonymus function value:

Primary:
ClosureLiteral
ClosureLiteral:
{ FormalParametersopt => BlockStatementsopt Expressionopt }

Evaluating the closure literal expression results in a closure instance . A closure instance is converted to some object type by a closure conversion. In the nominal version of the specification, it is a compile-time error if a closure literal appears in a context where it is not subject to a closure conversion. In the functional version of the specification, if a closure literal is not subject to a closure conversion it is converted to the corresponding function type of the closure literal, which is the function type with: identical argument types; a return type that is the type of the final expression, if one exists, or java.lang.Unreachable if the closure literal's body cannot complete normally, or void otherwise; and a throws type list corresponding to the checked exception types that can be thrown from the body of the closure literal. The conversion, in either case, occurs entirely at compile-time.

Example: the following closure literal takes two integers and yields their sum:

{int x, int y => x+y}

A closure literal captures a block of code - the block statements and the expression - parameterized by the closure literal's formal parameters. All free lexical bindings - that is, lexical bindings not defined within the closure literal - are bound at the time of evaluation of the closure literal to their meaning in the lexical context in which the closure literal appears. Free lexical bindings include references to variables from enclosing scopes, and the meaning of this, break, continue, and return. Evaluating the closure literal does not cause the statements or expression to be evaluated, but packages them up at runtime with a representation of the lexical context to be invoked later.

Rationale: One purpose for closure literals is to allow a programmer to refactor common code into a shared utility, with the difference between the use sites being abstracted into a closure literal by the client. The code to be abstracted sometimes contains a break, continue, or return statement. This need not be an obstacle to the transformation. One implication of the specification is that a break or continue statement appearing within a closure literal's body may transfer to a matching enclosing statement. A return statement always returns from the nearest enclosing method or constructor. A function may outlive the target of control transfers appearing within it.

At runtime, if a break statement is executed that would transfer control out of a statement that is no longer executing, or is executing in another thread, the VM throws a new unchecked exception, UnmatchedNonlocalTransfer. Similarly, an UnmatchedNonlocalTransfer is thrown when a continue statement attempts to complete a loop iteration that is not executing in the current thread. Finally, an UnmatchedNonlocalTransfer is thrown when a return statement is executed if the method invocation to which the return statement would transfer control is not on the stack of the current thread.

Function Types

We introduce a syntactic form for a function type:

Type:
FunctionType
FunctionType:
{ Typesopt => Type Throwsopt }
Types:
Type
Type
, Types

Informally, a function type describes the set of functions that accept a given list of argument types, result in a value of the given type, and may throw the indicated checked exception types.

Example: the following assigns to the local variable plus a function that computes the sum of its two int arguments:

{int,int=>int} plus = {int x, int y => x+y};
  

A function type is translated into an instantiation of a generic system-generated interface type with a single method. The interface has a type parameter for each argument type and return type that is a reference type, and one additional type parameter for the exception signature. The generic instance of the interface used to represent a function type has covariant wildcards for the type arguments representing the return type and exception type, and contravariant wildcards for function argument types, except when a function type is used in the context of a declared supertype, in which case the type arguments appear without wildcards. [This paragraph is a placeholder for a more precise description, including the name of the interface, required for interoperability among compilers. We expect the generated interfaces to be in the package java.lang.function, which will be closed to user-defined code, and the naming scheme to be an extension of the JVM method signature scheme.]

Example. The variable declaration

{int,String=>Number throws IOException} xyzzy;

is translated into

interface Function1<R,A2,throws E> { // system-generated
    R invoke(int x1, A2 x2) throws E;
}
Function1<? extends Number,? super String,? extends IOException> xyzzy;

With the exception that the generated interface name and parameter names would be synthetic, and the compiler, runtime system, and debuggers should display these types using the function type syntax rather than in terms of the translation involving generics and wildcards.

Because a function type is an interface type, it can be extended by a user-defined interface and it can be implemented by a user-defined class.

This translation scheme causes a function type

{ T0 ... Tn => Tr throws E0, ... Em }

to be defined as a system-generated interface type with a single abstract method

Tr invoke(T0 x0, ... Tn xn) throws E0, ... Em;

that obeys the following subtyping relationship:

A function type { T0 ... Tn => Tr throws E0, ... Em } is a subtype of function type { U0 ... Ul => Ur throws X0, ... Xk } iff all of the following hold:

In English, these rules are

While the subtype rules for function types may at first glance appear arcane, they are defined this way for very good reason: this is the classical arrow rule, which has proven itself indispensible in languages that offer support for higher-order programming. And while the rules seem complex, function types do not add complexity to Java's type system because function types and their subtype relations can be understood as a straightforward application of generics and wildcards (existing constructs). From the programmer's perspective, function types just "do the right thing."

Closure Conversion

A closure literal may be assigned to a variable or parameter of any compatible interface type by the closure conversion:

There is a closure conversion from a closure literal to every interface type that has a single method m such that the closure literal is compatible with m. A closure literal is compatible with a method m iff all of the following hold:

The closure conversion supports converting a function that yields no result to an interface whose method returns java.lang.Void. In this case a null value is returned from the function. This is necessary to write APIs that support completion transparency: that is, in which an invocation of the API can return normally if and only if the function that is passed to it can complete normally.

If the target of the closure conversion extends the marker interface java.lang.RestrictedFunction then

The closure conversion to a "restricted" interface only allows functions that obey the current restrictions on anonymous instances. The motivation for this is to help catch inadvertent use of non-local control flow in situations where it would be inappropriate. Examples would be when the function is passed to another thread to run asynchronously, or stored in a data structure to be invoked later.

Example: We can create a closure instance that adds two to its argument like this:
interface IntFunction {
    int invoke(int i);
}
IntFunction plus2 = {int x => x+2};

Alternatively, using function types:

{int=>int} plus2 = {int x => x+2};
We can use the existing Executor framework to run a closure instance in the background:
void sayHello(java.util.concurrent.Executor ex) {
    ex.execute({=> System.out.println("hello"); });
}

Exception type parameters

To support exception transparency (that is, the design of control-invocation APIs in which the compiler can infer that exceptions thrown by the closure are thrown by the API), we add a new type of generic formal type argument to receive a set of thrown types. [This deserves a more formal treatment]

TypeParameter:
throws TypeVariable TypeBoundopt
ActualTypeArgument:
throws ExceptionList
ExceptionList:
ReferenceType
ReferenceType | ExceptionList

What follows is an example of how the proposal would be used to write an exception-transparent version of a method that locks and then unlocks a java.util.concurrent.Lock, before and after a user-provided block of code. In the nominal version of the proposal:

interface Block<T,throws E> {
    T invoke() throws E;
}
public static <T,throws E extends Exception>
T withLock(Lock lock, Block<T,E> block) throws E {
    lock.lock();
    try {
        return block.invoke();
    } finally {
        lock.unlock();
    }
}

Or using the functional version of the proposal:

public static <T,throws E extends Exception>
T withLock(Lock lock, {=>T throws E} block) throws E {
    lock.lock();
    try {
        return block.invoke();
    } finally {
        lock.unlock();
    }
}

This can be used as follows:

withLock(lock, {=>
    System.out.println("hello");
});

This uses a newly introduced form of generic type parameter. The type parameter E is declared to be used in throws clauses. If the extends clause is omitted, a type parameter declared with the throws keyword is considered to extend Exception (instead of Object, which is the default for other type parameters). This type parameter accepts multiple exception types. For example, if you invoke it with a block that can throw IOException or NumberFormatException the type parameter E would be given as IOException|NumberFormatException. In those rare cases that you want to use explicit type parameters with multiple thrown types, the keyword throws is required in the invocation, like this:

Locks.<throws IOException|NumberFormatException>withLock(lock, {=>
    System.out.println("hello");
});

You can think of this kind of type parameter accepting disjunction, "or" types. That is to allow it to match the exception signature of a function that throws any number of different checked exceptions. If the block throws no exceptions, the type argument would be the type null (see below).

Type parameters declared this way can be used only in a throws clause or as a type argument to a generic method, interface, or class whose type argument was declared this way too.

Throws type parameters are indicated in class files by the presence of the character '^' preceding the type parameter.

We introduce a meaning for the keyword null as a type name; it designates the type of the expression null. A class literal for the type of null is null.class. These are necessary to allow reflection, type inference, and closure literals to work for functions that result in the value null or that throw no checked exceptions. We also add the non-instantiable class java.lang.Null as a placeholder, and its static member field TYPE as a synonym for null.class.

Definite assignment

The body of a closure literal may not assign to a final variable declared outside the closure literal.

A closure literal does not affect the DA/DU status of any free variables it names.

A f ree variable referenced inside a closure literal receives its initial DA state from the DA state of the variable at the point where the closure literal appears.

The type Unreachable

We add the non-instantiable type java.lang.Unreachable. Values of this type appear where statements or expressions cannot complete normally. This is necessary to enable writing APIs that provide transparency for functions that do not complete normally. Unreachable is a subtype of every obect type. The null type is a subtype of Unreachable. At runtime, a NullPointerException is thrown when a value of type Unreachable would appear. This occurs when a programmer converts null to Unreachable.

Example: The following illustrates a function being assigned to a variable of the correct type.

interface NullaryFunction<T, throws E> {
    T invoke() throws E;
}
NullaryFunction<Unreachable,null> thrower = {=> throw new AssertionError(); };

Reachable statements

The initial statement of a closure literal is reachable.

An expression statement in which the expression is of type Unreachable cannot complete normally.

Control invocation syntax

A new invocation statement syntax is added to make closures convenient for control abstraction:

ControlInvocationStatement:
foropt Primary ( FormalParameters : ExpressionListopt ) Statement
foropt Primary ( ExpressionListopt ) Statement

This syntax is a shorthand for the following statement:

Primary ( ExpressionList, { FormalParametersopt => Statement } );

(See the later section loop abstractions for the meaning of the optional for keyword) This syntax makes some kinds of function-receiving APIs more convenient to use to compose statements.

Note: There is some question of the correct order in the rewriting. On the one hand, the function seems most natural in the last position when not using the abbreviated syntax. On the other hand, that doesn't work well with varargs methods. Which is best remains an open issue.

We could use the shorthand to write our previous example this way

withLock(lock) {
    System.out.println("hello");
}

Rationale: This is not an expression form for a very good reason: it looks like a statement, and we expect it to be used most commonly as a statement for the purpose of writing APIs that abstract patterns of control. If it were an expression form, an invocation like this would require a trailing semicolon after the close curly brace of a controlled block. Forgetting the semicolon would probably be a common source of error. The syntax is convenient for both synchronous (e.g. see withLock) and asynchronous (e.g. see Executor.execute) use cases.

Another example of its use would be a an API that closes a java.io.Closeable after a user-supplied block of code, discarding any exception from Closeable.close:

class OneArgBlock<R, T, throws E> {
    R invoke(T t) throws E;
}
<R, T extends java.io.Closeable, throws E>
R with(T t, OneArgBlock<R, ? super T,E> block) throws E {
    try {
        return block.invoke(t);
    } finally {
        try { t.close(); } catch (IOException ex) {}
    }
} 

Or using the functional version:

<R, T extends java.io.Closeable, throws E>
R with(T t, {T=>R throws E} block) throws E {
    try {
        return block.invoke(t);
    } finally {
        try { t.close(); } catch (IOException ex) {}
    }
} 

We could use the shorthand with this API to close a number of streams at the end of a block of code:

with(FileReader in : makeReader()) with(FileWriter out : makeWriter()) {
    // code using in and out
}

Loop Abstractions

The specification supports writing methods that act as control statements, but when used to support loops the programmer should be able to specify that the statement should capture the meaning of break and continue in a way analogous to the built-in loop statements. For this purpose we allow the for keyword at the beginning of a control invocation statement (see above), and introduce the use of the keyword for as a modifier on method declarations:

MethodDeclarator:
foropt Identifier ( FormalParamterListopt )

An overriding or implementing method must be declared with the keyword for if and only if the method being overridden or implemented was declared with the keyword for.

A control invocation statement must use the for keyword if and only if the method being invoked was declared with the keyword for.

Within the controlled statement of a control invocation statement using the keyword for:

Example: The following illustrates a method used to loop through the key-value pairs in a map. The method

<K,V,throws X>
void for eachEntry(Map<K,V> map, {K,V=>void throws X} block)
        throws X {
    for (Map.Entry<K,V> entry : map.entrySet()) {
        block.invoke(entry.getKey(), entry.getValue());
    }

}

allows us to rewrite this

void test(Map<String,Integer> map) {
    for (Map.Entry<String,Integer> entry : map.entrySet()) {
        String name = entry.getKey();
        Integer value = entry.getValue();
        if ("end".equals(name)) break;
        if (name.startsWith("com.sun.")) continue;
        System.out.println(name + ":" + value);
    }

}

like this

void test(Map<String,Integer> map) {
    for eachEntry(String name, Integer value : map) {
        if ("end".equals(name)) break;
        if (name.startsWith("com.sun.")) continue;
        System.out.println(name + ":" + value);
    }

}

Acknowledgments

The authors would like to thank the following people whose discussions and insight have helped us craft, refine, and improve this proposal:

C. Scott Ananian, Lars Bak, Cedric Beust, Joshua Bloch, Martin Buchholz, Danny Coward, Erik Ernst, Rémi Forax, Christian Plesner Hansen, Kohsuke Kawaguchi, Danny Lagrouw, Doug Lea,"crazy" Bob Lee, Martin Odersky, Tim Peierls, Cris Perdue, John Rose, Ken Russell, Guy Steele, Mads Torgersen, Jan Vitek, and Dave Yost.

Creative Commons License
Closures for Java by Gilad Bracha, Neal Gafter, James Gosling, Peter von der Ahé is licensed under a Creative Commons Attribution-Share Alike 3.0 United States License.