ChatGPT解决这个技术问题 Extra ChatGPT

How to handle :java.util.concurrent.TimeoutException: android.os.BinderProxy.finalize() timed out after 10 seconds errors?

We're seeing a number of TimeoutExceptions in GcWatcher.finalize, BinderProxy.finalize, and PlainSocketImpl.finalize. 90+% of them happen on Android 4.3. We're getting reports of this from Crittercism from users out in the field.

https://i.stack.imgur.com/1PqL5.png

The error is a variation of: "com.android.internal.BinderInternal$GcWatcher.finalize() timed out after 10 seconds"

java.util.concurrent.TimeoutException: android.os.BinderProxy.finalize() timed out after 10 seconds
at android.os.BinderProxy.destroy(Native Method)
at android.os.BinderProxy.finalize(Binder.java:459)
at java.lang.Daemons$FinalizerDaemon.doFinalize(Daemons.java:187)
at java.lang.Daemons$FinalizerDaemon.run(Daemons.java:170)
at java.lang.Thread.run(Thread.java:841)

So far we haven't had any luck reproducing the problem in house or figuring out what might have caused it.

Any ideas what can cause this? Any idea how to debug this and find out which part of the app causes this? Anything that sheds light on the issue helps.

More Stacktraces:

1   android.os.BinderProxy.destroy  
2   android.os.BinderProxy.finalize Binder.java, line 482
3   java.lang.Daemons$FinalizerDaemon.doFinalize    Daemons.java, line 187
4   java.lang.Daemons$FinalizerDaemon.run   Daemons.java, line 170
5   java.lang.Thread.run    Thread.java, line 841  

2

1   java.lang.Object.wait   
2   java.lang.Object.wait   Object.java, line 401
3   java.lang.ref.ReferenceQueue.remove ReferenceQueue.java, line 102
4   java.lang.ref.ReferenceQueue.remove ReferenceQueue.java, line 73
5   java.lang.Daemons$FinalizerDaemon.run   Daemons.java, line 170
6   java.lang.Thread.run

3

1   java.util.HashMap.newKeyIterator    HashMap.java, line 907
2   java.util.HashMap$KeySet.iterator   HashMap.java, line 913
3   java.util.HashSet.iterator  HashSet.java, line 161
4   java.util.concurrent.ThreadPoolExecutor.interruptIdleWorkers    ThreadPoolExecutor.java, line 755
5   java.util.concurrent.ThreadPoolExecutor.interruptIdleWorkers    ThreadPoolExecutor.java, line 778
6   java.util.concurrent.ThreadPoolExecutor.shutdown    ThreadPoolExecutor.java, line 1357
7   java.util.concurrent.ThreadPoolExecutor.finalize    ThreadPoolExecutor.java, line 1443
8   java.lang.Daemons$FinalizerDaemon.doFinalize    Daemons.java, line 187
9   java.lang.Daemons$FinalizerDaemon.run   Daemons.java, line 170
10  java.lang.Thread.run

4

1   com.android.internal.os.BinderInternal$GcWatcher.finalize   BinderInternal.java, line 47
2   java.lang.Daemons$FinalizerDaemon.doFinalize    Daemons.java, line 187
3   java.lang.Daemons$FinalizerDaemon.run   Daemons.java, line 170
4   java.lang.Thread.run
Never mind, found it bugzilla.mozilla.org/show_bug.cgi?id=864102 I can also confirm is affecting our apps, it smells like a Google Play Services issue
The line of code which the error is thrown was introduced Version 4.3_r1, which was release in 5 June 2013. Might be the problem is happening since then.
Android version 4.2.2 also started to throw this exception so maybe its a google play update that is the source.
@EvelioTarazona I have it in some app that does not use play-services
@ligi is it the same stack-trace for you?

n
nyedidikeke

Full disclosure - I'm the author of the previously mentioned talk in TLV DroidCon.

I had a chance to examine this issue across many Android applications, and discuss it with other developers who encountered it - and we all got to the same point: this issue cannot be avoided, only minimized.

I took a closer look at the default implementation of the Android Garbage collector code, to understand better why this exception is thrown and on what could be the possible causes. I even found a possible root cause during experimentation.

The root of the problem is at the point a device "Goes to Sleep" for a while - this means that the OS has decided to lower the battery consumption by stopping most User Land processes for a while, and turning Screen off, reducing CPU cycles, etc. The way this is done - is on a Linux system level where the processes are Paused mid run. This can happen at any time during normal Application execution, but it will stop at a Native system call, as the context switching is done on the kernel level. So - this is where the Dalvik GC joins the story.

The Dalvik GC code (as implemented in the Dalvik project in the AOSP site) is not a complicated piece of code. The basic way it work is covered in my DroidCon slides. What I did not cover is the basic GC loop - at the point where the collector has a list of Objects to finalize (and destroy). The loop logic at the base can be simplified like this:

take starting_timestamp, remove object for list of objects to release, release object - finalize() and call native destroy() if required, take end_timestamp, calculate (end_timestamp - starting_timestamp) and compare against a hard coded timeout value of 10 seconds, if timeout has reached - throw the java.util.concurrent.TimeoutException and kill the process.

Now consider the following scenario:

Application runs along doing its thing.

This is not a user facing application, it runs in the background.

During this background operation, objects are created, used and need to be collected to release memory.

Application does not bother with a WakeLock - as this will affect the battery adversely, and seems unnecessary.

This means the Application will invoke the GC from time to time.

Normally the GC runs is completed without a hitch.

Sometimes (very rarely) the system will decide to sleep in the middle of the GC run.

This will happen if you run your application long enough, and monitor the Dalvik memory logs closely.

Now - consider the timestamp logic of the basic GC loop - it is possible for the device to start the run, take a start_stamp, and go to sleep at the destroy() native call on a system object.

When it wakes up and resumes the run, the destroy() will finish, and the next end_stamp will be the time it took the destroy() call + the sleep time.

If the sleep time was long (more than 10 seconds), the java.util.concurrent.TimeoutException will be thrown.

I have seen this in the graphs generated from the analysis python script - for Android System Applications, not just my own monitored apps.

Collect enough logs and you will eventually see it.

Bottom line:

The issue cannot be avoided - you will encounter it if your app runs in the background.

You can mitigate by taking a WakeLock, and prevent the device from sleeping, but that is a different story altogether, and a new headache, and maybe another talk in another con.

You can minimize the problem by reducing GC calls - making the scenario less likely (tips are in the slides).

I have not yet had the chance to go over the Dalvik 2 (a.k.a ART) GC code - which boasts a new Generational Compacting feature, or performed any experiments on an Android Lollipop.

Added 7/5/2015:

After reviewing the Crash reports aggregation for this crash type, it looks like these crashes from version 5.0+ of Android OS (Lollipop with ART) only account for 0.5% of this crash type. This means that the ART GC changes has reduced the frequency of these crashes.

Added 6/1/2016:

Looks like the Android project has added a lot of info on how the GC works in Dalvik 2.0 (a.k.a ART).

You can read about it here - Debugging ART Garbage Collection.

It also discusses some tools to get information on the GC behavior for your app.

Sending a SIGQUIT to your app process will essentially cause an ANR, and dump the application state to a log file for analysis.


In my case, I'm also planning to try to mitigate this by finding ways to reduce the amount of code/time I'm running in the background. Thanks for your research on the topic.
removing any background processing done in your app will greatly help reduce the problem.
For what it is worth, this happens still in Marshmallow (6.0.1). That said, I have only ever received this error once. So it doesn't seem to be a gigantic problem. Thank you for your thorough explanation.
After some time, I got the distinct impression that fixing this problem in the OS is very problematic, and requires cooperation between Google and the OEMs. I do not expect this to be fixed any time soon.
I'm using wakelock but still encountered this issue on Android 4.4.2. My app has some background operations but mainly designed to work all day long while charge cable mounted. Is there any different way to mitigate this issue?
C
Community

We see this constantly, all over our app, using Crashlytics. The crash usually happens way down in platform code. A small sampling:

android.database.CursorWindow.finalize() timed out after 10 seconds java.util.regex.Matcher.finalize() timed out after 10 seconds android.graphics.Bitmap$BitmapFinalizer.finalize() timed out after 10 seconds org.apache.http.impl.conn.SingleClientConnManager.finalize() timed out after 10 seconds java.util.concurrent.ThreadPoolExecutor.finalize() timed out after 10 seconds android.os.BinderProxy.finalize() timed out after 10 seconds android.graphics.Path.finalize() timed out after 10 seconds

The devices on which this happens are overwhelmingly (but not exclusively) devices manufactured by Samsung. That could just mean that most of our users are using Samsung devices; alternately it could indicate a problem with Samsung devices. I'm not really sure.

I suppose this doesn't really answer your questions, but I just wanted to reinforce that this seems quite common, and is not specific to your application.


Its happening for Android 5.0.1 version as well and doesn't seem to be restricted to Samsung devices. It happened on Nexus 6.
I have this issue on android 4.4.4 with device manufactured by XIAOMI
Just wanted to chime in that we are seeing the majority of these crashes on samsung tablets. Not sure what samsung did differently with how tablets handle backgrounded apps.
i have this issue on android 4.4.4. device manufactured by HUAWEI.
My app crashes after if I use leak canary library on android 5.0.2 Samsung device. If I disable the library initialization, the app works just fine.
C
Christopher

I found some slides about this issue.

http://de.slideshare.net/DroidConTLV/android-crash-analysis-and-the-dalvik-garbage-collector-tools-and-tips

In this slides the author tells that it seems to be a problem with GC, if there are a lot of objects or huge objects in heap. The slide also include a reference to a sample app and a python script to analyze this issue.

https://github.com/oba2cat3/GCTest

https://github.com/oba2cat3/logcat2memorygraph

Furthermore I found a hint in comment #3 on this side: https://code.google.com/p/android/issues/detail?id=53418#c3


k
kiwi

Here is an effective solution from didi to solve this problem, Since this bug is very common and difficult to find the cause, It looks more like a system problem, Why can't we ignore it directly?Of course we can ignore it, Here is the sample code:

final Thread.UncaughtExceptionHandler defaultUncaughtExceptionHandler = 
        Thread.getDefaultUncaughtExceptionHandler();
Thread.setDefaultUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
    @Override
    public void uncaughtException(Thread t, Throwable e) {
        if (t.getName().equals("FinalizerWatchdogDaemon") && e instanceof TimeoutException) {
        } else {
            defaultUncaughtExceptionHandler.uncaughtException(t, e);
        }
    }
});

By setting a special default uncaught exception handler, application can change the way in which uncaught exceptions are handled for those threads that would already accept whatever default behavior the system provided. When an uncaught TimeoutException is thrown from a thread named FinalizerWatchdogDaemon, this special handler will block the handler chain, the system handler will not be called, so crash will be avoided.

Through practice, no other bad effects were found. The GC system is still working, timeouts are alleviated as CPU usage decreases.

For more details see: https://mp.weixin.qq.com/s/uFcFYO2GtWWiblotem2bGg


how to use this to overcome exception?
Heads up, defaultUncaughtExceptionHandler can be null if no default is set
S
SqueezyMo

We solved the problem by stopping the FinalizerWatchdogDaemon.

public static void fix() {
    try {
        Class clazz = Class.forName("java.lang.Daemons$FinalizerWatchdogDaemon");

        Method method = clazz.getSuperclass().getDeclaredMethod("stop");
        method.setAccessible(true);

        Field field = clazz.getDeclaredField("INSTANCE");
        field.setAccessible(true);

        method.invoke(field.get(null));

    }
    catch (Throwable e) {
        e.printStackTrace();
    }
}

You can call the method in Application's lifecycle, like attachBaseContext(). For the same reason, you also can specific the phone's manufacture to fix the problem, it's up to you.


Doesn't work for us, I cannot understand why. Code completes without exceptions but we still receive those issues in Crashlytics reports and Google Play Console.
d
danny117

Broadcast Receivers timeout after 10 seconds. Possibly your doing an asynchronous call (wrong) from a broadcast receiver and 4.3 actually detects it.


Seems useless to detect it and not tell you enough about it. Letting us know which broadcast would be nice.
Pardon if I'm wrong, but I don't think the broadcast receiver timeout causes this particular crash. It's good practice to avoid the 10s limit, but that's a different issue than the requester is having.
I just have 10 seconds on the brain. developer.android.com/training/articles/perf-anr.html IDK if it was causing the crash either.
Your point is solid and a good practice. However, the original poster has a specific question about a specific set of devices. I'd advise other viewers of this post to check Christopher's answer and oba's answer if they're seeing the same symptoms (Samsung devices (esp. Galaxy S 4), etc.)
I'm not here to bash device manufactures it would be against the terms.
S
Sankalp Sharma

One thing which is invariably true is that at this time, the device would be suffocating for some memory (which is usually the reason for GC to most likely get triggered).

As mentioned by almost all authors earlier, this issue surfaces when Android tries to run GC while the app is in background. In most of the cases where we observed it, user paused the app by locking their screen. This might also indicate memory leak somewhere in the application, or the device being too loaded already. So the only legitimate way to minimize it is:

to ensure there are no memory leaks, and

to reduce the memory footprint of the app in general.


D
Dima Kozhevin
try {
    Class<?> c = Class.forName("java.lang.Daemons");
    Field maxField = c.getDeclaredField("MAX_FINALIZE_NANOS");
    maxField.setAccessible(true);
    maxField.set(null, Long.MAX_VALUE);
} catch (ClassNotFoundException e) {
    e.printStackTrace();
} catch (NoSuchFieldException e) {
    e.printStackTrace();
} catch (IllegalAccessException e) {
    e.printStackTrace();
}

This will not solve the problem in case the Sleep duration is > 100 seconds. Why not set it to MAX_INT?
Yes,I'm just do example~
This should not work because of constant inlining. Changing the field value will not affect the value inlined to callers.
Y
Yessy

The finalizeQueue may be too long

i think that java may require GC.SuppressFinalize() & GC.ReRegisterForFinalize() to let user reduce the finalizedQueue length explicitly

if the JVM' source code is available, may implement these method ourself, such as android ROM maker


v
vbevans94

It seems like a Android Runtime bug. There seems to be finalizer that runs in its separate thread and calls finalize() method on objects if they are not in the current frame of the stacktrace. For example following code(created to verify this issue) ended with the crash.

Let's have some cursor that do something in finalize method(e.g. SqlCipher ones, do close() which locks to the database that is currently in use)

private static class MyCur extends MatrixCursor {


    public MyCur(String[] columnNames) {
        super(columnNames);
    }

    @Override
    protected void finalize() {
        super.finalize();

        try {
            for (int i = 0; i < 1000; i++)
                Thread.sleep(30);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }
}

And we do some long running stuff having opened cursor:

for (int i = 0; i < 7; i++) {
        new Thread(new Runnable() {
            @Override
            public void run() {
                MyCur cur = null;
                try {
                    cur = new MyCur(new String[]{});
                    longRun();
                } finally {
                    cur.close();
                }
            }

            private void longRun() {
                try {
                    for (int i = 0; i < 1000; i++)
                        Thread.sleep(30);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        }).start();
    }

This causes following error:

FATAL EXCEPTION: FinalizerWatchdogDaemon
                                                                        Process: la.la.land, PID: 29206
                                                                        java.util.concurrent.TimeoutException: MyCur.finalize() timed out after 10 seconds
                                                                            at java.lang.Thread.sleep(Native Method)
                                                                            at java.lang.Thread.sleep(Thread.java:371)
                                                                            at java.lang.Thread.sleep(Thread.java:313)
                                                                            at MyCur.finalize(MessageList.java:1791)
                                                                            at java.lang.Daemons$FinalizerDaemon.doFinalize(Daemons.java:222)
                                                                            at java.lang.Daemons$FinalizerDaemon.run(Daemons.java:209)
                                                                            at java.lang.Thread.run(Thread.java:762)

The production variant with SqlCipher is very similiar:

12-21 15:40:31.668: E/EH(32131): android.content.ContentResolver$CursorWrapperInner.finalize() timed out after 10 seconds 12-21 15:40:31.668: E/EH(32131): java.util.concurrent.TimeoutException: android.content.ContentResolver$CursorWrapperInner.finalize() timed out after 10 seconds 12-21 15:40:31.668: E/EH(32131): at java.lang.Object.wait(Native Method) 12-21 15:40:31.668: E/EH(32131): at java.lang.Thread.parkFor$(Thread.java:2128) 12-21 15:40:31.668: E/EH(32131): at sun.misc.Unsafe.park(Unsafe.java:325) 12-21 15:40:31.668: E/EH(32131): at java.util.concurrent.locks.LockSupport.park(LockSupport.java:161) 12-21 15:40:31.668: E/EH(32131): at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:840) 12-21 15:40:31.668: E/EH(32131): at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:873) 12-21 15:40:31.668: E/EH(32131): at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197) 12-21 15:40:31.668: E/EH(32131): at java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:200) 12-21 15:40:31.668: E/EH(32131): at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:262) 12-21 15:40:31.668: E/EH(32131): at net.sqlcipher.database.SQLiteDatabase.lock(SourceFile:518) 12-21 15:40:31.668: E/EH(32131): at net.sqlcipher.database.SQLiteProgram.close(SourceFile:294) 12-21 15:40:31.668: E/EH(32131): at net.sqlcipher.database.SQLiteQuery.close(SourceFile:136) 12-21 15:40:31.668: E/EH(32131): at net.sqlcipher.database.SQLiteCursor.close(SourceFile:510) 12-21 15:40:31.668: E/EH(32131): at android.database.CursorWrapper.close(CursorWrapper.java:50) 12-21 15:40:31.668: E/EH(32131): at android.database.CursorWrapper.close(CursorWrapper.java:50) 12-21 15:40:31.668: E/EH(32131): at android.content.ContentResolver$CursorWrapperInner.close(ContentResolver.java:2746) 12-21 15:40:31.668: E/EH(32131): at android.content.ContentResolver$CursorWrapperInner.finalize(ContentResolver.java:2757) 12-21 15:40:31.668: E/EH(32131): at java.lang.Daemons$FinalizerDaemon.doFinalize(Daemons.java:222) 12-21 15:40:31.668: E/EH(32131): at java.lang.Daemons$FinalizerDaemon.run(Daemons.java:209) 12-21 15:40:31.668: E/EH(32131): at java.lang.Thread.run(Thread.java:762)

Resume: Close cursors ASAP. At least on Samsung S8 with Android 7 where the issue have been seen.


B
Ben

For classes that you create (ie. are not part of the Android) its possible to avoid the crash completely.

Any class that implements finalize() has some unavoidable probability of crashing as explained by @oba. So instead of using finalizers to perform cleanup, use a PhantomReferenceQueue.

For an example check out the implementation in React Native: https://github.com/facebook/react-native/blob/master/ReactAndroid/src/main/java/com/facebook/jni/DestructorThread.java