
Debugger times out at "Collecting data..."

I am debugging a Python (3.5) program with PyCharm (PyCharm Community Edition 2016.2.2; Build #PC-162.1812.1, built on August 16, 2016; JRE: 1.8.0_76-release-b216 x86; JVM: OpenJDK Server VM by JetBrains s.r.o.) on Windows 10.

The problem: when stopped at some breakpoints, the Debugger window is stuck at "Collecting data...", which eventually times out (with "Unable to display frame variables").

The data to be displayed is neither special nor particularly large. It is somehow available to PyCharm, since a conditional breakpoint on some values of the said data works fine (the program breaks) -- it looks like only the process that gathers the data for display (as opposed to for execution) fails.

When I step into a function around the place where I have my breakpoint, its data is displayed correctly. When I go up the stack (to the calling function, the one I stepped down from and where I initially wanted the breakpoint), I am stuck with the "Collecting data" timeout again.

There have been numerous issues raised about this same problem since at least 2005. Some were fixed, some not. The fixes were usually updates to the latest version (which I have).

Is there a general direction I can take to fix or work around this family of problems?

EDIT: a year later the problem is still there and there is still no reaction from the devs/support after the bug was raised.

EDIT April 2018: It looks like the problem is solved in the 2018.1 version; the following code, which used to hang when setting a breakpoint on the print line, now works (I can see the variables):

import threading

def worker():
    a = 3
    print('hello')

threading.Thread(target=worker).start()
I'm encountering the exact same problem. Have you found a solution or at least an explanation?
Unfortunately not. I opened a ticket with the devs but there was zero reaction (the same with another ticket for another issue). While the product is great, the support is non-existent.
I'm fitting LSTM networks in Keras, and I get this when I try to call model.predict from the debugger console. It didn't happen when I did the same thing with feedforward networks. The code runs just fine when not in the debugger/console. Weird and annoying.
I get this too when debugging with large objects. Is there still no workaround?
I get the same when debugging a separate process

Stephen Rauch

I had the same problem when I used PyCharm 2018.2 to debug my web application.

The project is a complex Flask web server combined with SocketIO.

When I set a breakpoint inside the code and pressed the debug button, it stopped at the breakpoint, but the variables didn't load; it just hung at "Collecting data". In the end I made a tweak to the debugger settings and this made it work. See the following image for the setting to change (enable "Gevent compatible" in the Python Debugger settings):

https://i.stack.imgur.com/VtjWv.png
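
For context, here is a minimal sketch (my own illustration, with hypothetical handler names; it assumes the flask, flask-socketio and gevent packages) of the kind of gevent-backed server where this setting matters:

from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
# async_mode='gevent' runs the server on gevent greenlets; without the
# "Gevent compatible" flag, the debugger can hang at "Collecting data..."
socketio = SocketIO(app, async_mode='gevent')

@socketio.on('message')
def handle_message(msg):
    reply = msg.upper()   # a breakpoint here is where the variables pane stalls
    socketio.send(reply)

if __name__ == '__main__':
    socketio.run(app, port=5000)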


If you're using Gevent, this fixes it indeed!
I'm using PyCharm 2019.2 with PyTorch (a DL framework), which has very large (tensor) objects, and have had the same issue. I've been wracking my brain and wasting time, and am very grateful for this solution (enabling "Gevent compatible" fixed the debugger hang-up)!
Same problem while debugging PyTorch code - this should be the correct answer!
Excellent... this was such a problem that I was ready to dump PyCharm in favor of VS Code.
Guys, instead of everyone saying thank you, just upvote the first one. It all means the same with much less clutter :)
Ursin Brunner

In case you landed here because you are using PyTorch (or any other deep learning library) and trying to debug in PyCharm (torch 1.3.1, PyCharm 2019.2 in my case), but it's super slow:

Enable "Gevent compatible" in the Python Debugger settings, as linkliu mayuyu pointed out. The problem might be caused by debugging large deep learning models (a BERT transformer in my case), but I'm not entirely sure about this.

I'm adding this answer because it's the end of 2019 and this doesn't seem to be fixed yet. Furthermore, I think this affects many engineers using deep learning, so I hope my answer formatting triggers their Stack Overflow algorithm :-)

Note (June 2020): While enabling "Gevent compatible" allows you to debug PyTorch models, it will prevent you from debugging your Flask application in PyCharm! My breakpoints no longer worked, and it took me a while to figure out that this flag was the reason. So make sure to enable it only on a per-project basis.
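
As a hedged illustration of the situation described above (my own sketch, assuming only that torch is installed): the variables pane has to render a large tensor at the breakpoint, and that rendering is what used to stall.

import torch

# A large tensor (roughly 400 MB of float32) that the variables pane must render.
weights = torch.randn(10000, 10000)
total = weights.sum()
# Breakpoint on the next line: before enabling "Gevent compatible", stopping
# here could hang at "Collecting data..." while `weights` was being rendered.
print(total)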


Also helpful with gensim. Thanks
Cody Gray

I also had this issue while working on code using sympy and the Python module Lea, aiming to calculate probability distributions.

The action that resolved the timeout issue was to change the 'Variables Loading Policy' in the debugger settings from the default 'Asynchronously' to 'Synchronously'.

https://i.stack.imgur.com/FYtg3.png
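
To illustrate the kind of variable that can stall the variables pane, here is a hedged sketch (my own, with a hypothetical class name): any object whose repr is expensive to compute. With the 'On demand' policy mentioned in the comments below, such a value is only rendered when you explicitly ask for it.

import time

class SlowRepr:
    """Stand-in for an object that is expensive to render (e.g. a huge DataFrame)."""
    def __repr__(self):
        time.sleep(30)           # simulates tens of seconds of rendering work
        return 'SlowRepr()'

x = SlowRepr()
print('set a breakpoint here')   # the variables pane must render `x` to show it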


Thanks! This is what I was looking for.
In my case I needed to set this to 'On demand' to solve the problem. I had many variables holding very large pandas DataFrames, which can each take tens of seconds to render in the debugger. Sometimes the debugger would attempt to render many of the variables at once and get stuck doing so for near eternity.
Thanks, this fixes the problem in my case!
Manel B

I think that this is caused by some classes having a default __str__() method that is too verbose. PyCharm calls this method to display the local variables when it hits a breakpoint, and it gets stuck while loading the string. A trick I use to overcome this is to manually edit the class that is causing the error and substitute its __str__() method with something less verbose.

As an example, it happens for the PyTorch _TensorBase class (and all tensor classes extending it), and can be solved by editing the PyTorch source torch/tensor.py, changing the __str__() method as follows:

def __str__(self):
    # Short-circuit the verbose tensor rendering so the debugger does not
    # hang while collecting the string representation.
    return 'Use .numpy() to print'
    # Original implementation (commented out). It encoded unicode strings
    # under Python 2 and let Python pick replacement characters if needed:
    # if sys.version_info > (3,):
    #     return _tensor_str._str(self)
    # else:
    #     if hasattr(sys.stdout, 'encoding'):
    #         return _tensor_str._str(self).encode(
    #             sys.stdout.encoding or 'UTF-8', 'replace')
    #     else:
    #         return _tensor_str._str(self).encode('UTF-8', 'replace')

Far from optimal, but it comes in handy.
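
A less invasive variant of the same idea (a sketch under my own assumptions, not tested advice): monkey-patch the rendering methods at the start of a debug session instead of editing the installed source. torch.Tensor is a Python-level class, so its methods can be replaced at runtime.

import torch

def _short_str(self):
    # Cheap one-line rendering so the debugger never walks the full tensor.
    return 'Tensor(shape={}) - use .numpy() to print'.format(tuple(self.shape))

torch.Tensor.__str__ = _short_str
torch.Tensor.__repr__ = _short_str   # the debugger may render via __repr__ instead

print(torch.randn(3, 3))             # -> Tensor(shape=(3, 3)) - use .numpy() to print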

UPDATE: The error seems to be solved in the latest PyCharm version (2018.1), at least for the case that was affecting me.


This is good information, but in my case it is my own code that is hanging (also without classes). This happens mostly when debugging multiprocess programs.
Thanks for the heads-up. I checked this with threaded code and indeed, the problem seems to be gone (I added sample code in an edit of my question).
Flamingo

I ran into the same problem when trying to run some deep learning scripts written in PyTorch (PyCharm 2019.3).

I finally figured out that the problem was that I had set num_workers in the DataLoader to a large value (20 in my case).

So, in debug mode, I would suggest setting num_workers to 1.
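
As a hedged sketch (my own illustration, not from the answer), one way to automate this is to lower num_workers only when a debugger is attached; sys.gettrace() returning a non-None trace function is an assumption that holds for most Python debuggers, including PyCharm's.

import sys
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(1000, 10), torch.randint(0, 2, (1000,)))

# Most Python debuggers install a trace function; use that to detect debugging.
debugging = sys.gettrace() is not None
loader = DataLoader(dataset, batch_size=32, num_workers=0 if debugging else 20)

for batch, labels in loader:
    pass  # the training loop would go here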


Setting num_workers from 1 to 0 worked for me here. I guess the debug thread somehow has no access to the threads running on the workers.
Neel0507

For me, the solution was removing manual watches every time before starting to debug. If there were any existing manual watches in the "Variables" window, it would remain stuck at "Collecting data...".


Federico Baù

Using Odoo or Other Large Python Servers

None of the above solutions worked for me, even though I tried them all. Debugging normally works, but it randomly gives this annoying Collecting data... or sometimes a Timed Out... message.

The solution is to restart PyCharm and set as few breakpoints as possible. After that, it starts to work again.

I don't know why it does that (maybe too many breakpoints), but it worked.