
Why is the JVM stack-based and the Dalvik VM register-based?

I'm curious, why did Sun decide to make the JVM stack-based and Google decide to make the DalvikVM register-based?

I suppose the JVM can't really assume that a certain number of registers is available on the target platform, since it is supposed to be platform independent. Therefore it just postpones register allocation etc. to the JIT compiler. (Correct me if I'm wrong.)

So the Android guys thought, "hey, that's inefficient, let's go for a register-based VM right away..."? But wait, there are multiple different Android devices, so what number of registers did Dalvik target? Are the Dalvik opcodes hardcoded for a certain number of registers?

Do all current Android devices on the market have about the same number of registers? Or, is there a register re-allocation performed during dex-loading? How does all this fit together?

Was it Google's decision to make the DalvikVM register-based? I think the DalvikVM was implemented before Google acquired Android Inc.
You're right of course. (Not very relevant to the question though ;)

Mark Bessey

There are a few attributes of a stack-based VM that fit in well with Java's design goals:

- A stack-based design makes very few assumptions about the target hardware (registers, CPU features), so it's easy to implement a VM on a wide variety of hardware.
- Since the operands for instructions are largely implicit, the object code will tend to be smaller. This is important if you're going to be downloading the code over a slow network link.

Going with a register-based scheme probably means that Dalvik's code generator doesn't have to work as hard to produce performant code. Running on an extremely register-rich or register-poor architecture would probably handicap Dalvik, but that's not the usual target - ARM is a very middle-of-the-road architecture.

I had also forgotten that the initial version of Dalvik didn't include a JIT at all. If you're going to interpret the instructions directly, then a register-based scheme is probably a winner for interpretation performance.
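
To make the interpretation argument concrete, here is a minimal sketch of two toy interpreters evaluating d = a + b * c. This is not the real JVM or Dalvik instruction set: the opcode names, the tuple layout, and the functions run_stack and run_register are all made up for illustration. The only point is that the stack version goes through the dispatch loop six times, while the register version does it twice, because each register instruction names its operands explicitly.

```python
def run_stack(code, local_vars):
    """Interpret toy stack bytecode. Returns (local_vars, dispatch_count)."""
    stack = []
    dispatched = 0
    for op, *args in code:
        dispatched += 1  # one trip through the dispatch loop per instruction
        if op == "load":
            stack.append(local_vars[args[0]])
        elif op == "store":
            local_vars[args[0]] = stack.pop()
        elif op == "add":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == "mul":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs * rhs)
        else:
            raise ValueError(f"unknown stack op: {op}")
    return local_vars, dispatched


def run_register(code, regs):
    """Interpret toy three-address bytecode. Returns (regs, dispatch_count)."""
    dispatched = 0
    for op, dst, src_a, src_b in code:
        dispatched += 1
        if op == "add":
            regs[dst] = regs[src_a] + regs[src_b]
        elif op == "mul":
            regs[dst] = regs[src_a] * regs[src_b]
        else:
            raise ValueError(f"unknown register op: {op}")
    return regs, dispatched


# d = a + b * c, with a, b, c, d in slots 0..3 (hypothetical layout).
STACK_PROGRAM = [
    ("load", 0), ("load", 1), ("load", 2),
    ("mul",), ("add",), ("store", 3),
]
# Same computation; v4 is a scratch register for b * c.
REGISTER_PROGRAM = [
    ("mul", 4, 1, 2),
    ("add", 3, 0, 4),
]

stack_result, stack_dispatches = run_stack(STACK_PROGRAM, [2, 3, 4, 0])
reg_result, reg_dispatches = run_register(REGISTER_PROGRAM, [2, 3, 4, 0, 0])

print("stack:   ", stack_result[3], "in", stack_dispatches, "dispatches")
print("register:", reg_result[3], "in", reg_dispatches, "dispatches")
```

In a pure interpreter every dispatch costs a fetch, a decode and a branch, so fewer, fatter instructions tend to help, even though the "registers" here are just slots in an in-memory list.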


Ok, that's interesting. So does the DalvikVM assume any minimal number of registers on the target device?
Also, I've read that some people are installing Android on their laptops since it's a "light-weight" OS... That seems like a bad idea if the laptop is not ARM, and perhaps has an architecture with many registers?
OK, I've just learned that dex bytecode is defined in terms of an infinite register machine, and when it comes to efficiency, it seems to be mostly about memory footprint.
I couldn't remember whether Dalvik was infinite-register based, or had a fixed register file size. If it's infinite, then it'll tend to perform optimally on architectures which have "enough" registers for whatever code you're running.
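
A rough sketch of what "infinite register machine" means for an interpreter: each method declares how many virtual registers it needs, and the VM just allocates a frame of that size in memory, whatever the hardware has. The field names registers_size and ins_size are borrowed from the dex format's code_item; the Method class and the new_frame helper are hypothetical. I'm assuming the documented dex convention that arguments land in the last registers of the frame.

```python
from dataclasses import dataclass


@dataclass
class Method:
    name: str
    registers_size: int  # virtual registers this method declares
    ins_size: int        # how many of those hold incoming arguments


def new_frame(method, args):
    """Allocate an in-memory register frame for one invocation (hypothetical helper)."""
    if len(args) != method.ins_size:
        raise ValueError("argument count does not match ins_size")
    if method.ins_size > method.registers_size:
        raise ValueError("ins_size cannot exceed registers_size")
    frame = [0] * method.registers_size
    # Arguments occupy the last ins_size registers of the frame.
    first_arg = method.registers_size - method.ins_size
    frame[first_arg:] = args
    return frame


small = new_frame(Method("add3", registers_size=5, ins_size=3), [7, 8, 9])
large = new_frame(Method("big", registers_size=300, ins_size=0), [])
print(small)       # v0..v1 are scratch, v2..v4 hold the arguments
print(len(large))  # far more "registers" than any ARM or x86 core has
```

So the opcodes aren't tied to the register count of any particular CPU; mapping virtual registers onto real ones only becomes a question once a JIT or ahead-of-time compiler gets involved.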
More detailed explanation can be found here: markfaction.wordpress.com/2012/07/15/…
Community

I can't find a reference, but I think Sun chose the stack-based bytecode approach because it makes it easy to run the JVM on an architecture with few registers (e.g. IA-32).

In Dalvik VM Internals from Google I/O 2008, the Dalvik creator Dan Bornstein gives the following arguments for choosing a register-based VM on slide 35 of the presentation slides:

Register Machine

Why?
- avoid instruction dispatch
- avoid unnecessary memory access
- consume instruction stream efficiently (higher semantic density per instruction)

and on slide 36:

Register Machine

The stats
- 30% fewer instructions
- 35% fewer code units
- 35% more bytes in the instructions stream
  - but we get to consume two at a time

According to Bornstein this is "a general expectation what you could find when you convert a set of class files to dex files".

The relevant part of the presentation video starts at 25:00.
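
As a small illustration of "fewer instructions, but more bytes, consumed two at a time", the sketch below encodes the single statement c = a + b both ways. The opcode values are the ones I remember from the JVM spec and the Dalvik bytecode docs (check them before relying on this); encode_23x and the summary dict are just helper names invented here. One statement is obviously not representative of the percentages on the slide.

```python
import struct

# JVM: c = a + b with a, b, c in local slots 1, 2, 3.
# Every instruction is one byte; the operands are implied by the opcode.
JVM_CODE = bytes([
    0x1B,  # iload_1
    0x1C,  # iload_2
    0x60,  # iadd
    0x3E,  # istore_3
])
JVM_INSTRUCTIONS = 4

# Dalvik: add-int v3, v1, v2 (format 23x: two little-endian 16-bit code units).
DALVIK_ADD_INT = 0x90


def encode_23x(opcode, dst, src_a, src_b):
    """Pack 'op vAA, vBB, vCC' as the code units AA|op and CC|BB."""
    return struct.pack("<HH", (dst << 8) | opcode, (src_b << 8) | src_a)


DALVIK_CODE = encode_23x(DALVIK_ADD_INT, dst=3, src_a=1, src_b=2)
DALVIK_INSTRUCTIONS = 1

summary = {
    "jvm": {
        "instructions": JVM_INSTRUCTIONS,
        "bytes": len(JVM_CODE),
        "code_units": len(JVM_CODE),          # JVM code units are 1 byte
    },
    "dalvik": {
        "instructions": DALVIK_INSTRUCTIONS,
        "bytes": len(DALVIK_CODE),
        "code_units": len(DALVIK_CODE) // 2,  # dex code units are 2 bytes
    },
}

for vm, stats in summary.items():
    print(vm, stats)
```

Here both encodings happen to take 4 bytes, but the interpreter dispatches once instead of four times and fetches two 16-bit units instead of four single bytes.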

There is also an insightful paper titled "Virtual Machine Showdown: Stack Versus Registers" by Shi et al. (2005), which explores the differences between stack- and register-based virtual machines.


Jonas

I don't know why Sun decided to make the JVM stack-based. Erlang's virtual machine, BEAM, is register-based for performance reasons. And Dalvik also seems to be register-based for performance reasons.

From Pro Android 2:

Dalvik uses registers as primarily units of data storage instead of the stack. Google is hoping to accomplish 30 percent fewer instructions as a result.

And regarding the code size:

The Dalvik VM takes the generated Java class files and combines them into one or more Dalvik Executables (.dex) files. It reuses duplicate information from multiple class files, effectively reducing the space requirement (uncompressed) by half from traditional .jar file. For example, the .dex file of the web browser app in Android is about 200k, whereas the equivalent uncompressed .jar version is about 500k. The .dex file of the alarm clock is about 50k, and roughly twice that size in its .jar version.

And as I remember, Computer Architecture: A Quantitative Approach also concludes that a register machine performs better than a stack-based machine.


If I had to guess, I'd say Sun decided to make the JVM stack based because it's easier to implement than a register machine. (But at a non-trivial performance cost, as noted here.)
I can't find a reference, but I think Sun decided for the stack-based bytecode approach because it makes it easy to run the JVM on a low register architecture.
For a hardware ISA, yes, register machines have won. Basically every CPU / microcontroller is a register machine, because everything else sucks by comparison. Some have very few registers, like just an accumulator and maybe one or two pointer or index registers, but that's still more like a register machine in the theory-of-computation sense. But we're talking about VMs that are interpreted, so the "register file", if there is one, would actually be in memory (unless you JIT-compile to native machine code). So the reasons a register design beats a stack design in an interpreter are very different from the reasons it wins in hardware.