I figure most of you know that goto is a reserved keyword in the Java language but is not actually used. And you probably also know that goto is a Java Virtual Machine (JVM) opcode. I reckon all the sophisticated control flow structures of Java, Scala and Kotlin are, at the JVM level, implemented using some combination of goto and ifeq, ifle, iflt, etc.
Looking at the JVM spec https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.goto_w I see there's also a goto_w opcode. Whereas goto takes a 2-byte branch offset, goto_w takes a 4-byte branch offset. The spec states that
Although the goto_w instruction takes a 4-byte branch offset, other factors limit the size of a method to 65535 bytes (§4.11). This limit may be raised in a future release of the Java Virtual Machine.
It sounds to me like goto_w is future-proofing, like some of the other *_w opcodes. But it also occurs to me that maybe goto_w could be used with the two more significant bytes zeroed out and the two less significant bytes the same as for goto, with adjustments as needed.
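To make the "zeroed-out upper bytes" idea concrete, here is a minimal sketch of how the same branch offset would be laid out for goto and for goto_w. The opcode values 0xA7 and 0xC8 and the big-endian operand order come from the JVM spec; the class name BranchEncoding, its helper methods, and the sample offsets 32 and -10 are made up for illustration.

```java
import java.nio.ByteBuffer;

final class BranchEncoding {
    static final int GOTO = 0xA7;    // opcode 167, followed by a signed 16-bit offset
    static final int GOTO_W = 0xC8;  // opcode 200, followed by a signed 32-bit offset

    /** Encodes goto; rejects offsets that do not fit in a signed 16-bit value. */
    static byte[] encodeGoto(int offset) {
        if (offset < Short.MIN_VALUE || offset > Short.MAX_VALUE) {
            throw new IllegalArgumentException("offset needs goto_w: " + offset);
        }
        // ByteBuffer is big-endian by default, matching class-file operand order.
        return ByteBuffer.allocate(3).put((byte) GOTO).putShort((short) offset).array();
    }

    /** Encodes goto_w for any int offset. */
    static byte[] encodeGotoW(int offset) {
        return ByteBuffer.allocate(5).put((byte) GOTO_W).putInt(offset).array();
    }

    /** Formats bytes as space-separated upper-case hex. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(hex(encodeGoto(32)));    // A7 00 20
        System.out.println(hex(encodeGotoW(32)));   // C8 00 00 00 20
        System.out.println(hex(encodeGoto(-10)));   // A7 FF F6
        System.out.println(hex(encodeGotoW(-10)));  // C8 FF FF FF F6
    }
}
```

For a forward offset the two extra bytes of goto_w are indeed zero, but for a backward (negative) offset they are FF FF, so the widening is a sign extension rather than a zero fill.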
For example, given this Java Switch-Case (or Scala Match-Case):
12: lookupswitch {
112785: 48 // case "red"
3027034: 76 // case "blue"
98619139: 62 // case "green"
default: 87
}
48: aload_2
49: ldc #17 // String red
51: invokevirtual #18
// Method java/lang/String.equals:(Ljava/lang/Object;)Z
54: ifeq 87
57: iconst_0
58: istore_3
59: goto 87
62: aload_2
63: ldc #19 // String green
65: invokevirtual #18
// Method java/lang/String.equals:(Ljava/lang/Object;)Z
68: ifeq 87
71: iconst_1
72: istore_3
73: goto 87
76: aload_2
77: ldc #20 // String blue
79: invokevirtual #18
// etc.
we could rewrite it as
12: lookupswitch {
112785: 48
3027034: 80
98619139: 64
default: 91
}
48: aload_2
49: ldc #17 // String red
51: invokevirtual #18
// Method java/lang/String.equals:(Ljava/lang/Object;)Z
54: ifeq 91 // 00 5B
57: iconst_0
58: istore_3
59: goto_w 91 // 00 00 00 5B
64: aload_2
65: ldc #19 // String green
67: invokevirtual #18
// Method java/lang/String.equals:(Ljava/lang/Object;)Z
70: ifeq 91
73: iconst_1
74: istore_3
75: goto_w 91
80: aload_2
81: ldc #20 // String blue
83: invokevirtual #18
// etc.
I haven't actually tried this, since I've probably made a mistake changing the "line numbers" to accommodate the goto_ws. But since it's in the spec, it should be possible to do it.
My question is: is there a reason a compiler or other generator of bytecode might use goto_w with the current 65535 limit, other than to show that it can be done?
The size of the method code can be as large as 64K.
The branch offset of the short goto is a signed 16-bit integer: from -32768 to 32767.
So the short offset is not enough to make a jump from the beginning of a 64K method to the end.
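The range argument can be checked with a few lines of Java. This is only a sketch: the class BranchRange and the method fitsShortGoto are hypothetical names, and the address 60000 is an arbitrary example of a target far into a large method.

```java
final class BranchRange {
    /**
     * True if a jump from the instruction at address {@code from} to address
     * {@code to} can be expressed with goto's signed 16-bit relative offset.
     */
    static boolean fitsShortGoto(int from, int to) {
        int offset = to - from;
        return offset >= Short.MIN_VALUE && offset <= Short.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println(fitsShortGoto(59, 87));    // true: offset 28
        System.out.println(fitsShortGoto(0, 60000));  // false: offset 60000 > 32767
        System.out.println(fitsShortGoto(60000, 0));  // false: offset -60000 < -32768
    }
}
```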
Even javac sometimes emits goto_w. Here is an example:
public class WideGoto {
public static void main(String[] args) {
for (int i = 0; i < 1_000_000_000; ) {
i += 123456;
// ... repeat 10K times ...
}
}
}
Disassembling with javap -c:
public static void main(java.lang.String[]);
Code:
0: iconst_0
1: istore_1
2: iload_1
3: ldc #2
5: if_icmplt 13
8: goto_w 50018 // <<< Here it is! A jump to the end of the loop
...
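The repeated loop body does not have to be typed by hand. The following sketch writes out such a source file; the generator class WideGotoGenerator is hypothetical, and reading "10K" as exactly 10,000 repetitions is an assumption.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class WideGotoGenerator {
    /** Builds the WideGoto source with the given number of "i += 123456;" statements. */
    static String generate(int repetitions) {
        StringBuilder src = new StringBuilder();
        src.append("public class WideGoto {\n")
           .append("    public static void main(String[] args) {\n")
           .append("        for (int i = 0; i < 1_000_000_000; ) {\n");
        for (int n = 0; n < repetitions; n++) {
            src.append("            i += 123456;\n");
        }
        src.append("        }\n")
           .append("    }\n")
           .append("}\n");
        return src.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: 10,000 repetitions, matching "repeat 10K times" above.
        byte[] source = generate(10_000).getBytes(StandardCharsets.UTF_8);
        Files.write(Paths.get("WideGoto.java"), source);
    }
}
```

After running the generator, javac WideGoto.java followed by javap -c WideGoto should show a listing like the one above; the exact addresses may differ between compiler versions.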
There is no reason to use goto_w when the branch fits into a goto. But you seem to have missed that the branches are relative, using a signed offset, as a branch can also go backward.
You don’t notice it when looking at the output of a tool like javap, as it calculates the resulting absolute target address before printing.
So goto’s range of -32768 … +32767 is not always enough to address each possible target location in the 0 … +65535 range.
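The conversion from the stored relative offset to the absolute address that javap prints can be sketched as follows. The class BranchTarget is a hypothetical helper, and the operand bytes were worked out by hand from two addresses quoted in this thread (59: goto 87 and 8: goto_w 50018), not read from a real class file.

```java
final class BranchTarget {
    /** Absolute target of the goto or goto_w instruction starting at index pc. */
    static int target(byte[] code, int pc) {
        int opcode = code[pc] & 0xFF;
        switch (opcode) {
            case 0xA7: { // goto: signed 16-bit big-endian offset
                short offset = (short) (((code[pc + 1] & 0xFF) << 8) | (code[pc + 2] & 0xFF));
                return pc + offset;
            }
            case 0xC8: { // goto_w: signed 32-bit big-endian offset
                int offset = ((code[pc + 1] & 0xFF) << 24)
                        | ((code[pc + 2] & 0xFF) << 16)
                        | ((code[pc + 3] & 0xFF) << 8)
                        | (code[pc + 4] & 0xFF);
                return pc + offset;
            }
            default:
                throw new IllegalArgumentException("not a goto or goto_w at " + pc);
        }
    }

    public static void main(String[] args) {
        // goto at address 59 with operand bytes 00 1C (offset 28)
        byte[] code = new byte[62];
        code[59] = (byte) 0xA7;
        code[61] = 0x1C;
        System.out.println(target(code, 59)); // 87

        // goto_w at address 8 with operand bytes 00 00 C3 5A (offset 50010)
        byte[] wide = new byte[13];
        wide[8] = (byte) 0xC8;
        wide[11] = (byte) 0xC3;
        wide[12] = 0x5A;
        System.out.println(target(wide, 8)); // 50018
    }
}
```

Read as a signed 16-bit value, the low bytes C3 5A would mean -15526 rather than 50010, which is why that jump cannot be expressed as a plain goto.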
For example, the following method will have a goto_w instruction at the beginning:
public static void methodWithLargeJump(int i) {
for(; i == 0;) {
try {x();} finally { switch(i){ case 1: try {x();} finally { switch(i){ case 1:
try {x();} finally { switch(i){ case 1: try {x();} finally { switch(i){ case 1:
try {x();} finally { switch(i){ case 1: try {x();} finally { switch(i){ case 1:
try {x();} finally { switch(i){ case 1: try {x();} finally { switch(i){ case 1:
try {x();} finally { switch(i){ case 1: try {x();} finally { switch(i){ case 1:
} } } } } } } } } } } } } } } } } } } }
}
}
static void x() {}
Compiled from "Main.java"
class LargeJump {
public static void methodWithLargeJump(int);
Code:
0: iload_0
1: ifeq 9
4: goto_w 57567
…
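The ifeq 9 / goto_w 57567 pair in this listing follows a pattern worth spelling out: conditional branches such as ifeq and ifne only have a 16-bit offset and no wide variant, so a far conditional jump has to be written as the inverted condition hopping over a goto_w. Below is a minimal sketch of that pattern; the class FarConditional and the method farIfne are hypothetical names, and the code illustrates the resulting byte layout rather than javac's actual implementation.

```java
import java.nio.ByteBuffer;

final class FarConditional {
    static final int IFEQ = 0x99;    // opcode 153
    static final int GOTO_W = 0xC8;  // opcode 200

    /**
     * Emits the equivalent of "ifne target" at address pc for a target that may
     * lie outside the 16-bit range: an inverted ifeq that skips a goto_w.
     */
    static byte[] farIfne(int pc, int target) {
        int gotoWAddress = pc + 3;          // ifeq occupies 3 bytes
        return ByteBuffer.allocate(8)
                .put((byte) IFEQ)
                .putShort((short) (3 + 5))  // skip the ifeq (3) and the goto_w (5)
                .put((byte) GOTO_W)
                .putInt(target - gotoWAddress) // offset is relative to the goto_w itself
                .array();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (byte b : farIfne(1, 57567)) {
            sb.append(String.format("%02X ", b));
        }
        System.out.println(sb.toString().trim()); // 99 00 08 C8 00 00 E0 DB
    }
}
```

With pc = 1 and target = 57567, these bytes disassemble to 1: ifeq 9 and 4: goto_w 57567, the same shape as the javap output above.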
Main with methodWithLargeJump() compiles to almost 400KB.
finally blocks get duplicated for normal and exceptional flow (mandatory since Java 6), so nesting ten of them implies ×2¹⁰. Then, switch always has a default target, so together with the iload, it needs ten bytes plus padding. I also added a nontrivial statement in each branch to prevent optimizations. Exploiting limits is a recurring topic: nested expressions, lambdas, fields, constructors…
It appears that in some compilers (tried in 1.6.0 and 11.0.7), if a method is large enough to ever need goto_w, the compiler uses goto_w exclusively. Even for very local jumps, it still uses goto_w.
// ... repeat 10K times ...
That compiles? I know there is a limit to the size of a single source class... but I don't know what it precisely is (code generation is the only time I've seen something actually hit it).