
Why is LLVM considered unsuitable for implementing a JIT?

Many dynamic languages implement (or want to implement) a JIT compiler in order to speed up their execution times. Inevitably, someone from the peanut gallery asks why they don't use LLVM. The answer is often, "LLVM is unsuitable for building a JIT." (For example, Armin Rigo's comment here.)

Why is LLVM unsuitable for building a JIT?

Note: I know LLVM has its own JIT. If LLVM used to be unsuitable but is now suitable, please say what changed. I'm not talking about running LLVM bitcode on the LLVM JIT; I'm talking about using the LLVM libraries to implement a JIT for a dynamic language.

Hmm... stackoverflow.com/questions/4077396/llvm-jit-speed-up-choices/… says the answer is "because it's too slow."
-1 LLVM is not considered unsuitable for implementing a JIT.
Well Jon, I have several good answers below. Maybe you can write one about your I-implemented-a-JIT-with-LLVM-and-it-was-awesome experience?
For my next trick, I'll tell the Iron Chef how to make waffles. (smacks self.)
S'ok. Was worse on this other question where I got 4 downvotes despite being the only answerer to have written several low-latency commercial applications in functional languages! stackoverflow.com/a/4479114/13924

J D

Why is LLVM unsuitable for building a JIT?

I wrote HLVM, a high-level virtual machine with a rich static type system including value types, tail call elimination, generic printing, a C FFI, and POSIX threads, with support for both static and JIT compilation. In particular, HLVM offers incredible performance for a high-level VM. I even implemented an ML-like interactive front-end with variant types and pattern matching using the JIT compiler, as seen in this computer algebra demonstration. All of my HLVM-related work combined totals just a few weeks' work (and I am not a computer scientist, just a dabbler).

I think the results speak for themselves and demonstrate unequivocally that LLVM is perfectly suitable for JIT compilation.


Interesting. Your work is on statically typed functional languages, whereas the complaints I heard (linked in the post) were from people implementing dynamically typed imperative/OO languages. I wonder whether the typing or the functional style has the bigger impact.
Both static typing and functional style have a big impact, but you can address the mismatch. LLVM's own mem2reg optimization pass actually transforms imperative code over value types (ints, floats, etc.) from memory operations into static single-assignment form, the kind of code HLVM generates naturally. Dynamic typing is harder because it makes it impossible to attain predictably good performance, but some simple solutions should be effective, such as representing all values as unboxed tagged unions or compiling a specialization of each function for every combination of argument types, on demand (see the sketch after these comments).
Is this thing still alive? For which languages is it being used?
I haven't actively developed HLVM for many years and AFAIK it has no users.
Julia actually implements your latter suggestion, with excellent performance.
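To make the second strategy from that comment thread concrete, here is a minimal C++ sketch of on-demand specialization: look up (and, in a real JIT, compile) a separate implementation for each combination of argument type tags, and cache it. The TypeTag and Specializer names and the stand-in add functions are illustrative assumptions, not HLVM, Julia, or any LLVM API.

```cpp
#include <cstdio>
#include <map>
#include <utility>

enum class TypeTag { Int, Float };

using CompiledFn = double (*)(double, double);

// Stand-ins for what a real JIT would emit through LLVM for each signature.
static double add_int(double a, double b)   { return (long)a + (long)b; }
static double add_float(double a, double b) { return a + b; }

struct Specializer {
    std::map<std::pair<TypeTag, TypeTag>, CompiledFn> cache;

    CompiledFn get(TypeTag a, TypeTag b) {
        auto key = std::make_pair(a, b);
        auto it = cache.find(key);
        if (it != cache.end()) return it->second;  // already specialized
        // In a real system, this is where LLVM IR for the (a, b)
        // signature would be generated and JIT-compiled.
        CompiledFn fn = (a == TypeTag::Int && b == TypeTag::Int)
                            ? add_int : add_float;
        cache[key] = fn;
        return fn;
    }
};

int main() {
    Specializer jit;
    std::printf("%g\n", jit.get(TypeTag::Int, TypeTag::Int)(2, 3));
    std::printf("%g\n", jit.get(TypeTag::Float, TypeTag::Float)(2.5, 3.25));
}
```

The point of the cache is that each specialization pays the compile cost once, after which every call with those argument types runs fully unboxed code.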
Mikhail Korobov

There are some notes about LLVM in the Unladen Swallow post-mortem blog post: http://qinsb.blogspot.com/2011/03/unladen-swallow-retrospective.html

Unfortunately, LLVM in its current state is really designed as a static compiler optimizer and back end. LLVM code generation and optimization is good but expensive. The optimizations are all designed to work on IR generated by static C-like languages. Most of the important optimizations for optimizing Python require high-level knowledge of how the program executed on previous iterations, and LLVM didn't help us do that.


Many people seem to come to LLVM with the dream that you flick a switch and it will magically optimize your poorly generated code instantaneously, but that is not the case. Garbage in, garbage out. If you want LLVM to generate fast code then you must generate optimized IR yourself. Dynamically typed languages like Python are particularly hard-hit because any upcasting/boxing destroys the static type information that LLVM's optimization passes rely upon.
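A hedged illustration of that garbage-in, garbage-out point: the toy Box type below mimics a CPython-style boxed value (the layout is invented, not CPython's real one). IR emitted over boxed values is dominated by allocation, tag tests, and loads that LLVM cannot see through, whereas the unboxed form is exactly what its passes were built to optimize.

```cpp
#include <cstdlib>

struct Box {
    int tag;                        // 0 = int, 1 = double (invented layout)
    union { long i; double d; } v;
};

// What a naive JIT emits for a dynamic "a + b": allocation, tag tests,
// and loads that LLVM's optimizer cannot constant-fold or vectorize.
// (The result leaks here; a real runtime would garbage-collect it.)
Box *boxed_add(const Box *a, const Box *b) {
    Box *r = static_cast<Box *>(std::malloc(sizeof(Box)));
    if (a->tag == 0 && b->tag == 0) {
        r->tag = 0;
        r->v.i = a->v.i + b->v.i;
    } else {
        double x = a->tag == 0 ? (double)a->v.i : a->v.d;
        double y = b->tag == 0 ? (double)b->v.i : b->v.d;
        r->tag = 1;
        r->v.d = x + y;
    }
    return r;
}

// The same operation once types are known: the straight-line form that
// LLVM's passes (mem2reg, instcombine, the vectorizers) were designed for.
long unboxed_add(long a, long b) { return a + b; }
```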
Necrolis

There is a presentation on using LLVM as a JIT backend where they address many of the concerns raised as to why it's bad; most of it seems to boil down to people building a static compiler as a JIT instead of building an actual JIT.


parkovski

The biggest complaint is that it takes a long time to start up. However, this is less of an issue if you do what Java does: start in interpreter mode, and use LLVM to compile only the most heavily used parts of the program.
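A minimal sketch of that tiered strategy, assuming a per-function invocation counter and an arbitrary hotness threshold; compiled_square stands in for what an LLVM-backed compile would actually produce.

```cpp
#include <cstdio>

constexpr int kHotThreshold = 1000;  // illustrative; real VMs tune this

using Compiled = long (*)(long);

static long interpret_square(long x) { return x * x; }  // tier 1: slow path
static long compiled_square(long x)  { return x * x; }  // pretend LLVM output

struct Tiered {
    int calls = 0;
    Compiled fast = nullptr;

    long call(long x) {
        if (fast) return fast(x);            // tier 2: native code
        if (++calls >= kHotThreshold)
            fast = compiled_square;          // stand-in for an LLVM compile
        return interpret_square(x);          // tier 1: interpreter
    }
};

int main() {
    Tiered fn;
    long sum = 0;
    for (long i = 0; i < 2000; ++i) sum += fn.call(i);
    std::printf("%ld\n", sum);
}
```

The design pushes LLVM's expensive compile time off the startup path: cold code never pays it, and hot code amortizes it over many calls.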

Also, while there are arguments like this scattered all over the internet, Mono has been using LLVM as a JIT compiler successfully for a while now (though it's worth noting that it defaults to its own faster but less efficient backend, and they also modified parts of LLVM).

For dynamic languages, LLVM might not be the right tool, simply because it was designed for optimizing system programming languages like C and C++, which are strongly/statically typed and support very low-level features. In general, the optimizations performed on C don't really make dynamic languages fast, because you're just creating an efficient way of running a slow system. Modern dynamic-language JITs do things like inlining functions that are only known at runtime, or optimizing based on what type a variable has most of the time, which LLVM is not designed for.
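As a rough illustration of the "optimize for the type a variable usually has" point: the JIT emits a cheap guard and a fast path specialized for the commonly observed type, falling back to the generic path when the guess is wrong. The Value layout and tag convention below are invented for the example.

```cpp
struct Value { int tag; long i; };   // tag 0 = int; other tags elided

// Generic path: in a real runtime this would dispatch on both tags.
static long generic_add(const Value &a, const Value &b) {
    return a.i + b.i;
}

// Fast path emitted after observing mostly-int operands: a cheap guard,
// then straight-line code LLVM optimizes well; on a guard failure we
// "deoptimize" to the generic path.
long speculative_add(const Value &a, const Value &b) {
    if (a.tag == 0 && b.tag == 0)    // guard: the common case we observed
        return a.i + b.i;
    return generic_add(a, b);        // fall back on the rare case
}
```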


C/C++ are not strongly typed. Strongly typed means each value is permanently associated with a type that cannot be changed, which is true of most dynamic languages but is not true of C and C++ with their reinterpret_cast.
@JanHudec you can do the same exact thing in Haskell, does that make it weakly typed?
@alternative: Are you saying that you can give Haskell a block of raw bytes and tell it to start treating it as, for example, IO()? In the core language? (The unsafe part intended for binding C functions must be able to do that kind of thing, but that is an interoperability extension and I wouldn't count that as Haskell).
@JanHudec hackage.haskell.org/package/base-4.6.0.1/docs/…. So yes. I'm not arguing that you are wrong according to common convention, but that the logic that drives the common convention is simply wrong.
I'm looking at this area in a bit more depth, and LLVM appears to work fine for JITting dynamic languages, provided you can establish what the concrete types are and are able to throw stuff away and backtrack to a different strategy when you're wrong.
Sean McMillan

Update: as of July 2014, LLVM has added a feature called "patchpoints", which is used to support polymorphic inline caches in Safari's FTL JavaScript JIT. This covers exactly the use case complained about in Armin Rigo's comment on the original question.
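A sketch of the polymorphic inline cache that patchpoints enable: the call site caches the handlers for the last few object "shapes" (hidden classes) it has seen and falls back to a full lookup on a miss. With llvm.experimental.patchpoint the JIT can rewrite the native code at the site in place; everything below is illustrative C++, not the LLVM API.

```cpp
#include <cstddef>

using Handler = long (*)(void *obj);
using SlowLookup = Handler (*)(const void *shape);

struct CacheEntry { const void *shape; Handler handler; };

// One call site's cache. A patchpoint-based JIT would patch the
// dispatch sequence directly into machine code rather than scan a table.
struct InlineCache {
    static constexpr std::size_t kSize = 4;
    CacheEntry entries[kSize] = {};
    std::size_t used = 0;

    long dispatch(const void *shape, void *obj, SlowLookup slow) {
        for (std::size_t i = 0; i < used; ++i)
            if (entries[i].shape == shape)       // hit: fast path
                return entries[i].handler(obj);
        Handler h = slow(shape);                 // miss: full method lookup
        if (used < kSize) entries[used++] = {shape, h};
        return h(obj);
    }
};
```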


Sean McMillan

For a more detailed rant about the LLVM IR, see here: LLVM IR is a compiler IR.


Thanks. Realised too late to edit the comment that I'd linked partway through the thread.
