ChatGPT解决这个技术问题 Extra ChatGPT

Java's WeakHashMap and caching: Why is it referencing the keys, not the values?

Java's WeakHashMap is often cited as being useful for caching. It seems odd though that its weak references are defined in terms of the map's keys, not its values. I mean, it's the values I want to cache, and which I want to get garbage collected once no-one else besides the cache is strongly referencing them, no?

In which way does it help to hold weak references to the keys? If you do a ExpensiveObject o = weakHashMap.get("some_key"), then I want the cache to hold on to 'o' until the caller doesn't hold the strong reference anymore, and I don't care at all about the string object "some_key".

Am I missing something?

Java API is full of weird quirks. You can always rewrite WeakHashMap using WeakReference.

D
Doc

WeakHashMap isn't useful as a cache, at least the way most people think of it. As you say, it uses weak keys, not weak values, so it's not designed for what most people want to use it for (and, in fact, I've seen people use it for, incorrectly).

WeakHashMap is mostly useful to keep metadata about objects whose lifecycle you don't control. For example, if you have a bunch of objects passing through your class, and you want to keep track of extra data about them without needing to be notified when they go out of scope, and without your reference to them keeping them alive.

A simple example (and one I've used before) might be something like:

WeakHashMap<Thread, SomeMetaData>

where you might keep track of what various threads in your system are doing; when the thread dies, the entry will be removed silently from your map, and you won't keep the Thread from being garbage collected if you're the last reference to it. You can then iterate over the entries in that map to find out what metadata you have about active threads in your system.

See WeakHashMap in not a cache! for more information.

For the type of cache you're after, either use a dedicated cache system (e.g. EHCache) or look at Guava's MapMaker class; something like

new MapMaker().weakValues().makeMap();

will do what you're after, or if you want to get fancy you can add timed expiration:

new MapMaker().weakValues().expiration(5, TimeUnit.MINUTES).makeMap();

Just to update this for Aug 2013: Google Collections is now named Guava, and the cache creation logic is now part of the CacheBuilder class.
Note, I think in your example for MapMaker you were meant to say new MapMaker().softValues().makeMap(), as calling weakValues() gives you the same result as a WeakHashMap. There's a great example here for how to build a cache with MapMaker - stackoverflow.com/questions/3737140/…
jklp - no, weakValues() isn't the same as a WeakHashMap at all. weakKeys() is the same as a weak hash map. A WeakHashMap holds STRONG references to its values! Agree that for most cases, soft may make more sense than weak; however, this is a matter of degree (a soft reference may, but is not required to, be kept around as long as there is no memory pressure; a weak reference will be thrown away next GC cycle) and the semantics are no different.
Links are broken.
r
rogerdpack

The main use for WeakHashMap is when you have mappings which you want to disappear when their keys disappear. A cache is the reverse---you have mappings which you want to disappear when their values disappear.

For a cache, what you want is a Map<K,SoftReference<V>>. A SoftReference will be garbage-collected when memory gets tight. (Contrast this with a WeakReference, which may be cleared as soon as there is no longer a hard reference to its referent.) You want your references to be soft in a cache (at least in one where key-value mappings don't go stale), since then there is a chance that your values will still be in the cache if you look for them later. If the references were weak instead, your values would be gc'd right away, defeating the purpose of caching.

For convenience, you might want to hide the SoftReference values inside your Map implementation, so that your cache appears to be of type <K,V> instead of <K,SoftReference<V>>. If you want to do that, this question has suggestions for implementations available on the net.

Note also that when you use SoftReference values in a Map, you must do something to manually remove key-value pairs which have had their SoftReferences cleared---otherwise your Map will only grow in size forever, and leak memory.


using this solution over time leaves you with many hashmap items that the value has been gc-ed . is there an alternative that uses a similar approach ?
A Map<K, SoftRereference<?>> approach leaves leaves instances of SoftReference in the map that contain null referents after GC has run. I think the internal implementation of this map must periodically clear all mappings with a value that is a soft reference holding a null referent, for a nice cleanup.
(continuation) Strictly speaking, if a naive programmer uses the implementation HashMap<K, SoftReference<V>> then this will induce a memory leak. You might consider to include this in your answer. Have a look at how WeakHashMap does it, the Oracle JDK has a private method expungeStaleEntries in it that takes care of this cleanup.
@Timmos, Do you mean that the internal JVM reference from obj instance to SoftReference would prevent collection? Then it's a JVM bug that needs to be fixed.
No, just the fact that the GC'ed values referenced by SoftReference are still in the map without any purpose, is a memory leak if you fail to take action to remove these SoftReferences from the map. The values contained in SoftReferences will be GC'ed eventually, but the SoftReferences themselves will not. This is the task of the programmer.
S
Shannon

Another thing to consider is that if you take the Map<K, WeakReference<V>> approach, the value may disappear, but the mapping will not. Depending on usage, you may as a result end up with a Map containing many entries whose Weak References have been GC'd.


Map<K, V>, not Map<K, WeakReference<V>>. This answer seems to make sense on the surface, but note the missing mappings can be removed every time the user calls Map.get, and seeing that this is exactly how WeakHashMap removes keys, it couldn't have been that the Java team didn't realize this.
@Pacerier You seem to be describing a custom implementation of Map which removes the entry when get is called and it finds the WeakReference's reference is null. That is still vulnerable to memory leakage, depending on usage, because it relies on calls to get with specific keys. Definitely better to use a cache with a specific eviction mechanism.
P
Pacerier

You need two maps: one which maps between the cache key and weak referenced values and one in the opposite direction mapping between the weak referenced values and the keys. And you need a reference queue and a cleanup thread.

Weak references have the ability to move the reference into a queue when the referenced object can not accessed any longer. This queue has to be drained by a cleanup thread. And for the cleanup it is necessary to get the key for a reference. This is the reason why the second map is required.

The following example shows how to create a cache with a hash map of weak references. When you run the program you get the following output:

$ javac -Xlint:unchecked Cache.java && java Cache
{even: [2, 4, 6], odd: [1, 3, 5]}
{even: [2, 4, 6]}

The first line shows the contents of the cache before the reference to the odd list has been deleted and the second line after the odds have been deleted.

This is the code:

import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Cache<K,V>
{
    ReferenceQueue<V> queue = null;
    Map<K,WeakReference<V>> values = null;
    Map<WeakReference<V>,K> keys = null;
    Thread cleanup = null;

    Cache ()
    {
        queue  = new ReferenceQueue<V>();
        keys   = Collections.synchronizedMap (new HashMap<WeakReference<V>,K>());
        values = Collections.synchronizedMap (new HashMap<K,WeakReference<V>>());
        cleanup = new Thread() {
                public void run() {
                    try {
                        for (;;) {
                            @SuppressWarnings("unchecked")
                            WeakReference<V> ref = (WeakReference<V>)queue.remove();
                            K key = keys.get(ref);
                            keys.remove(ref);
                            values.remove(key);
                        }
                    }
                    catch (InterruptedException e) {}
                }
            };
        cleanup.setDaemon (true);
        cleanup.start();
    }

    void stop () {
        cleanup.interrupt();
    }

    V get (K key) {
        return values.get(key).get();
    }

    void put (K key, V value) {
        WeakReference<V> ref = new WeakReference<V>(value, queue);
        keys.put (ref, key);
        values.put (key, ref);
    }

    public String toString() {
        StringBuilder str = new StringBuilder();
        str.append ("{");
        boolean first = true;
        for (Map.Entry<K,WeakReference<V>> entry : values.entrySet()) {
            if (first)
                first = false;
            else
                str.append (", ");
            str.append (entry.getKey());
            str.append (": ");
            str.append (entry.getValue().get());
        }
        str.append ("}");
        return str.toString();
    }

    static void gc (int loop, int delay) throws Exception
    {
        for (int n = loop; n > 0; n--) {
            Thread.sleep(delay);
            System.gc(); // <- obstinate donkey
        }
    }

    public static void main (String[] args) throws Exception
    {
        // Create the cache
        Cache<String,List> c = new Cache<String,List>();

        // Create some values
        List odd = Arrays.asList(new Object[]{1,3,5});
        List even = Arrays.asList(new Object[]{2,4,6});

        // Save them in the cache
        c.put ("odd", odd);
        c.put ("even", even);

        // Display the cache contents
        System.out.println (c);

        // Erase one value;
        odd = null;

        // Force garbage collection
        gc (10, 10);

        // Display the cache again
        System.out.println (c);

        // Stop cleanup thread
        c.stop();
    }
}

Great answer. Worth noting that a ReferenceQueue, unlike many other kinds of collection, blocks until a value can be returned from queue.remove(). This means the cleanup thread isn't the non-waiting infinite loop that first glance might suggest.
@Gordon If you use weak keys and weak values, everything is weak and will garbage collected just after you have added it to the cache.
This is the answer. So we can say that therefore it's for optimizing the cleanup phase that the API is implemented as such.