Debugger times out at "Collecting data..."
I am debugging a Python (3.5
) program with PyCharm (PyCharm Community Edition 2016.2.2 ; Build #PC-162.1812.1, built on August 16, 2016 ; JRE: 1.8.0_76-release-b216 x86 ; JVM: OpenJDK Server VM by JetBrains s.r.o
) on Windows 10.
The problem: when stopped at some breakpoints, the Debugger window is stuck at "Collecting data", which eventually timeout. (with Unable to display frame variables)
The data to be displayed is neither special, nor particularly large. It is somehow available to PyCharm since a conditional break point on some values of the said data works fine (the program breaks) -- it looks like the process to gather it for display only (as opposed to operational purposes) fails.
When I step into a function around the place I have my breakpoint, its data is displayed correctly. When I go up the stack (to the calling function, the one I stepped down from and where I wanted initially to have the breakpoint) - I am stuck with the "Collecting data" timeout again.
There have been numerous issues raised with the same point since at least 2005. Some were fixed, some not. The fixes were usually updates to the latest version (which I have).
Is there a general direction I can go to in order to fix or work around this family of problems?
EDIT: a year later the problem is still there and there is still no reaction from the devs/support after the bug was raised.
EDIT April 2018: It looks like the problem is solved in the 2018.1 version, the following code which was hanging when setting a breakpoint on the print
line now works (I can see the variables):
import threading
def worker():
a = 3
print('hello')
threading.Thread(target=worker).start()
I had the same question when i use pycharm2018.2 to debug my web application.
The project is a complex flask web server that combined with SocketIO.
When I made a debug breakpoint inside the code then pressed the debug button, it stopped at the breakpoint, but the variables didn't load. It just collected data data. I made some tweaks to the debugger settings in the end and this made it work. See the following image for the setting to change:
In case you landed here because you are using PyTorch (or any other deep learning library) and try to debug in PyCharm (torch 1.31, PyCharm 2019.2 in my case) but it's super slow:
Enable Gevent compatible
in the Python Debugger
settings as linkliu mayuyu pointed out. The problem might be caused due to debugging large deep learning models (BERT transformer in my case), but I'm not entirely sure about this.
I'm adding this answer as it's end of 2019 and this doesn't seem to be fixed yet. Further I think this is affecting many engineers using deep learning, so I hope my answer-formatting triggers their stackoverflow algorithm :-)
Note (June 2020):
While adding the Gevent compatible
allows you to debug PyTorch models, it will prevent you from debug your Flask application in PyCharm! My breakpoints were not working anymore and it took me a while to figure out that this flag is the reason for it. So make sure to enable it only on a per-project base.
I also had this issue when I was working on code using sympy and the Python module 'Lea' aiming to calculate probability distributions.
The action I took that resolved the timeout issue was to change the 'Variables Loading Policy' in the debug setting from the default 'Asynchronously' to 'Synchronously'.
I think that this is caused by some classes having a default method __str__() that is too verbose. Pycharm calls this method to display the local variables when it hits a breakpoint, and it gets stuck while loading the string. A trick I use to overcome this is manually editing the class that is causing the error and substitute the __str__() method for something less verbose.
As an example, it happens for pytorch _TensorBase class (and all tensor classes extending it), and can be solved by editing the pytorch source torch/tensor.py, changing the __str__() method as:
def __str__(self):
# All strings are unicode in Python 3, while we have to encode unicode
# strings in Python2. If we can't, let python decide the best
# characters to replace unicode characters with.
return str() + ' Use .numpy() to print'
#if sys.version_info > (3,):
# return _tensor_str._str(self)
#else:
# if hasattr(sys.stdout, 'encoding'):
# return _tensor_str._str(self).encode(
# sys.stdout.encoding or 'UTF-8', 'replace')
# else:
# return _tensor_str._str(self).encode('UTF-8', 'replace')
Far from optimum, but comes in hand.
UPDATE: The error seems solved in the last PyCharm version (2018.1), at least for the case that was affecting me.