
Why does the JVM full GC need to stop-the-world?

I think it is because the JVM needs to move objects, is that correct?

Have a look at the Wikipedia page on garbage collection: en.wikipedia.org/wiki/…


First, the Garbage Collection article on Wikipedia is really good reading.

In general, GC does not require a stop-the-world (STW) pause. There are JVM implementations that are (almost) pause free (e.g. Azul Zing JVM). Whether a JVM requires STW to collect garbage depends on the algorithm it uses.

Mark Sweep Compact (MSC) is a popular algorithm used in HotSpot by default. It is implemented in STW fashion and has 3 phases:

MARK - traverses the live object graph to mark reachable objects

SWEEP - scans memory to find unmarked (unreachable) objects

COMPACT - relocates marked objects to defragment free memory

When relocating an object in the heap, the JVM must update all references to that object. During the relocation process the object graph is inconsistent, which is why an STW pause is required.
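
To make the three phases concrete, here is a minimal sketch in Java. It is a toy model and not how HotSpot lays out its heap: the class ToyMarkCompact, the Obj type and the idea of using array slot indices as references are all assumptions made for illustration. The point to notice is the gap inside compact() between moving the objects and fixing the references. In that window the stored indices point at the wrong slots, which is the inconsistency that a running application must not observe.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Toy heap: slots hold objects and references are slot indices. Not HotSpot's layout. */
class ToyMarkCompact {
    static final class Obj {
        final String label;
        final int[] refs;      // slot indices of the objects this one references
        boolean marked;
        Obj(String label, int... refs) { this.label = label; this.refs = refs; }
    }

    final Obj[] heap;
    final int[] roots;         // slot indices referenced from "the stack"

    ToyMarkCompact(Obj[] heap, int... roots) { this.heap = heap; this.roots = roots; }

    /** MARK: traverse from the roots and flag everything reachable. */
    void mark() {
        Deque<Integer> work = new ArrayDeque<>();
        for (int r : roots) work.push(r);
        while (!work.isEmpty()) {
            Obj o = heap[work.pop()];
            if (o == null || o.marked) continue;
            o.marked = true;
            for (int ref : o.refs) work.push(ref);
        }
    }

    /** SWEEP: scan every slot and free the ones that were not marked. */
    int sweep() {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !heap[i].marked) { heap[i] = null; freed++; }
        }
        return freed;
    }

    /** COMPACT: slide survivors to the front, then rewrite every reference. */
    void compact() {
        int[] forward = new int[heap.length];   // old slot -> new slot, -1 if free
        Arrays.fill(forward, -1);
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) forward[i] = next++;
        }
        for (int i = 0; i < heap.length; i++) {
            if (forward[i] >= 0 && forward[i] != i) { heap[forward[i]] = heap[i]; heap[i] = null; }
        }
        // Until the loops below finish, refs and roots still hold OLD slot numbers.
        for (int i = 0; i < next; i++) {
            Obj o = heap[i];
            for (int k = 0; k < o.refs.length; k++) o.refs[k] = forward[o.refs[k]];
            o.marked = false;                   // reset for the next cycle
        }
        for (int k = 0; k < roots.length; k++) roots[k] = forward[roots[k]];
    }

    List<String> labels() {
        List<String> out = new ArrayList<>();
        for (Obj o : heap) out.add(o == null ? "-" : o.label);
        return out;
    }

    /** A -> C -> E are reachable from the root; B and D are garbage. */
    static ToyMarkCompact demoHeap() {
        Obj[] heap = {
            new Obj("A", 2), new Obj("B"), new Obj("C", 4), new Obj("D", 0), new Obj("E")
        };
        return new ToyMarkCompact(heap, 0);
    }

    public static void main(String[] args) {
        ToyMarkCompact gc = demoHeap();
        gc.mark();
        int freed = gc.sweep();
        System.out.println("after sweep:   " + gc.labels() + " freed=" + freed);
        gc.compact();
        System.out.println("after compact: " + gc.labels());
    }
}
```

In the demo heap, B and D are freed and the survivors A, C and E slide into slots 0 to 2, so the reference from A to C has to change from slot 2 to slot 1. A mutator reading A's reference between the move and the fix-up would land on the wrong object.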

Concurrent Mark Sweep (CMS) is another algorithm in the HotSpot JVM. It does most of its old space collection without an STW pause (old space collection is not exactly the same thing as a full collection).

CMS uses a write barrier (a hook that runs each time a reference is written to the Java heap) to implement a concurrent version of MARK, and it does not use COMPACT. The lack of compaction may result in fragmentation, and if background garbage collection is not fast enough the application can still be blocked. In these cases CMS falls back to an STW mark-sweep-compact collection.
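
A rough sketch of why a write barrier helps concurrent marking. This is a single-threaded simulation of one unlucky interleaving, not CMS's actual card-marking code: ToyConcurrentMark, Node, writeRef and remark are hypothetical names, and the barrier shown is a simplified "remember the modified holder and rescan it later" scheme. The mutator stores a reference to a new object into an object the marker has already visited. Without the barrier the new object is never marked; with it, the holder is recorded as dirty and rescanned.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model of concurrent marking with a "remember the dirty holder" write barrier. */
class ToyConcurrentMark {
    static final class Node {
        final String label;
        final List<Node> refs = new ArrayList<>();
        boolean marked;
        Node(String label) { this.label = label; }
    }

    private final boolean barrierEnabled;
    private boolean marking;
    private final Set<Node> dirty = new LinkedHashSet<>();

    ToyConcurrentMark(boolean barrierEnabled) { this.barrierEnabled = barrierEnabled; }

    /** Every reference store goes through here, like a compiler-inserted barrier. */
    void writeRef(Node holder, Node target) {
        holder.refs.add(target);
        if (barrierEnabled && marking) dirty.add(holder);   // remember holder for a rescan
    }

    /** Plain graph traversal that marks everything reachable from start. */
    static void trace(Node start) {
        Deque<Node> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            Node n = work.pop();
            if (n.marked) continue;
            n.marked = true;
            for (Node r : n.refs) work.push(r);
        }
    }

    /** Rescan holders that were written to while marking was in progress. */
    void remark() {
        for (Node holder : dirty) {
            for (Node r : holder.refs) trace(r);
        }
        dirty.clear();
    }

    /** Returns true if an object stored after its holder was visited still gets marked. */
    static boolean lateStoreSurvives(boolean barrierEnabled) {
        ToyConcurrentMark gc = new ToyConcurrentMark(barrierEnabled);
        Node root = new Node("root");
        gc.marking = true;
        trace(root);                       // the marker has already visited root
        Node fresh = new Node("fresh");
        gc.writeRef(root, fresh);          // the mutator stores a new reference afterwards
        gc.remark();
        gc.marking = false;
        return fresh.marked;               // unmarked would mean "swept by mistake"
    }

    public static void main(String[] args) {
        System.out.println("without barrier: " + lateStoreSurvives(false)); // false
        System.out.println("with barrier:    " + lateStoreSurvives(true));  // true
    }
}
```

The rescan step in this sketch plays the role of a remark phase. Keeping it short is the whole game: the less the mutator dirtied during concurrent marking, the less work is left for the pause.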

There is also G1, which is an incremental variation of MSC. You can read more about GC algorithms in the HotSpot JVM in my blog.


There has since been another GC algorithm thanks to RedHat: openjdk.java.net/projects/shenandoah
According to this Oracle page, docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/…, the CMS collector pauses an application twice during a concurrent collection cycle: an initial mark pause and a remark pause. And if there is too much fragmentation and no large enough free block, this results in a third pause.
amanuel2

With the throughput GC, the JVM needs STW pauses to free as much memory as possible. It is most effective only when it can use such pauses.

With the low-pause collector (CMS), the old generation is cleaned concurrently, without pausing your application. The drawback is that the old generation becomes fragmented. If it is too fragmented and needs compaction, a Full GC (STW) happens. However, you can always tune your application so that you do not get any Full GC.

G1 GC is a special case. Its current primary goal is to keep heap fragmentation low while still being concurrent (like CMS). When it cannot reach this goal, the JVM also reverts to an STW pause so that the heap is entirely cleaned and compacted.
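
If you want to see which collectors your own JVM is running and how often they have kicked in, the standard java.lang.management API exposes that. A small sketch: the class name GcReport and the allocation loop are just for illustration, and the collector names you get back depend on the JVM and on the GC selected at startup. Note that getCollectionTime() is documented as approximate accumulated collection time, so for concurrent collectors it should not be read as pure pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Prints the collectors the running JVM reports, with counts and accumulated time. */
class GcReport {
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both values may be -1 if the collector does not report them.
            lines.add(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", approxMillis=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Churn through some short-lived garbage so there is something to report.
        byte[][] junk = new byte[64][];
        for (int i = 0; i < 10_000; i++) {
            junk[i % junk.length] = new byte[64 * 1024];
        }
        describeCollectors().forEach(System.out::println);
    }
}
```

Running this once per candidate collector and comparing the old generation counts is a cheap first check on whether full collections are happening at all.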


Ludwig Wensauer

Stop-the-world guarantees that new objects are not allocated and that objects do not suddenly become unreachable while the collector is running.

The advantage is that it is both simpler to implement and faster than incremental garbage collection.
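
As an analogy only, the guarantee can be modeled with a read-write lock: application threads ("mutators") take the shared lock for each allocation, and the collector takes the exclusive lock for the whole pause. Real JVMs do not do it this way (HotSpot uses safepoint polling, not a lock per allocation), and ToyStopTheWorld and its methods are made-up names. The sketch only shows why the collector's job is simple once nothing else can run: what it reads at the start of the pause is still true at the end.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Analogy: mutators share a read lock, the collector takes the write lock to pause them. */
class ToyStopTheWorld {
    // Fair mode so the collector is not starved by a steady stream of mutators.
    private final ReentrantReadWriteLock world = new ReentrantReadWriteLock(true);
    private final AtomicLong allocations = new AtomicLong();

    /** Mutator side: one "allocation", only possible while the world is running. */
    void allocate() {
        world.readLock().lock();
        try {
            allocations.incrementAndGet();
        } finally {
            world.readLock().unlock();
        }
    }

    /** Collector side: stop the world, look at the heap twice, resume. */
    boolean collectAndCheckStable() {
        world.writeLock().lock();           // returns once no mutator is inside allocate()
        try {
            long before = allocations.get();
            long deadline = System.nanoTime() + 20_000_000L;   // about 20 ms of pretend GC work
            while (System.nanoTime() < deadline) { /* busy "marking" */ }
            return before == allocations.get();                // nothing changed under us
        } finally {
            world.writeLock().unlock();
        }
    }

    static boolean demo() {
        ToyStopTheWorld heap = new ToyStopTheWorld();
        AtomicBoolean running = new AtomicBoolean(true);
        Thread[] mutators = new Thread[4];
        for (int i = 0; i < mutators.length; i++) {
            mutators[i] = new Thread(() -> { while (running.get()) heap.allocate(); });
            mutators[i].start();
        }
        try {
            return heap.collectAndCheckStable();
        } finally {
            running.set(false);
            for (Thread t : mutators) {
                try {
                    t.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("heap stable during pause: " + demo()); // true
    }
}
```

The cost is visible in the same sketch: for those 20 ms all four mutator threads make no progress, which is the pause the incremental and concurrent collectors try to shrink.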


Objects becoming unreachable after the GC has marked them as reachable is not a big problem. It's correct, and only delays the reclamation of memory by one GC cycle. A much more serious problem is new objects being allocated during GC, with the only references to them being stored in objects already visited: such an object would then be reclaimed erroneously. However, this is not an unsolvable problem. In fact, it has been solved since the 70s.
Community

A short stop-the-world phase is needed to scan for references on the stack in almost any garbage collection scheme, even in most schemes that minimize pauses. Great detailed explanation in this answer. Incremental and concurrent algorithms work hard to keep these pauses to a minimum but still have them in most cases.

There actually are even moving/compacting methods that don't need to stop the world while moving objects (Staccato comes to mind).