How to interpret this stack trace

I recently released a Windows Phone 8 app. The app sometimes seems to crash randomly, but it crashes without breaking into the debugger, and the only information I get is a message in the Output window saying that there was an access violation, with no further details. After releasing, I was able to obtain some more information from the crash reports, but it is rather cryptic to me.

The information is:

Problem function: unknown //not very useful
Exception type: c0000005 //this is the code for Access violation exception
Stack trace: 
Frame    Image        Function      Offset 
0        qcdx9um8960                0x00035426 
1        qcdx9um8960                0x000227e2

I'm not used to working with memory pointers and the like, and I'm not used to seeing a stack trace like that.

So I have these questions:

  1. How should I interpret/read this information, and what does each piece of it mean?
  2. Is there a way to use this information to narrow down my search for the problem?
  3. Is there a way to get this information while debugging in VS2012?

Notes:

  • I'm not asking what an Access Violation is
  • I tagged this as C# and C++ because my code is in C#, but the exception is generated (I'm semi-guessing) by the C++ implementation of the WebBrowser component

edit:

I tried setting the Debug type to Native only, which let me obtain the same information I had in the crash report on the Dev Center. This way the debugger breaks when the exception is thrown and lets me see the disassembled code. Unfortunately there's no qcdx9um8960 .pdb file (not even on the Microsoft Symbol Server), so I don't know the name of the function that caused the error.


Solution 1:

Curiously, a search on the web for the image name "qcdx9um8960" returns several results referencing Windows Phone 8 and the WebBrowser control. Gathering the answers and replies (some even by MSFT), here is what you should possibly look into:

  • If you upgraded your application from Windows Phone 6/7 to 8, make sure you are not still referencing any 6/7 DLLs. 1
  • Make sure you aren't testing or publishing your software in Debug mode. There is a "qcdx9um8960.pdb" file that might be missing, causing the access violation. 1
  • "...there is a possible race condition known issue if the app has multiple copies of WebBrowser open. See if your code perhaps inadvertently makes more than one instance." 1
  • That image, "qcdx9um8960" is referencing a Qualcomm DirectX driver DLL. Perhaps it's not the WebBrowser component's fault, but the DirectX driver it might be using to render the web pages. 2
  • The name of the image suggests that the crash is happening on devices powered by a Qualcomm Snapdragon S4 Plus with model number MSM8960. 3
  • Assuming the processor above, and cross referencing Windows phones that use that chip, you're likely looking at the issue occurring on the Nokia Lumia 920T. 3 That's not to say that the driver doesn't work on several processor architectures or phones.

There are several other hits regarding crashes and issues debugging in the presence of that DLL, so unfortunately for you, I think you might be at the mercy of some third party software that has a few unresolved issues.
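As for reading the frames themselves, here is a minimal C++ sketch of one way to turn a row of the report into something you can look up in the disassembly. It assumes that, when the Function column is empty, Offset is measured from the start of the image, so the faulting address is the module's load address plus the offset. The names `Frame`, `parse_frame`, `absolute_address` and `exception_name` are made up for illustration, and the load address has to come from your own debugging session since the report does not include it.

```cpp
#include <cstdint>
#include <exception>
#include <sstream>
#include <string>

// One row of the crash report's "Stack trace" table (hypothetical type).
struct Frame {
    int index = 0;             // frame number; 0 is the innermost frame
    std::string image;         // module name, e.g. "qcdx9um8960"
    std::uint32_t offset = 0;  // assumed: byte offset from the image's load address
};

// Parses a row such as "0        qcdx9um8960                0x00035426".
// Assumes the Function column is empty, as in the report above.
// Returns false for rows that do not match (e.g. the header row).
bool parse_frame(const std::string& line, Frame& out) {
    std::istringstream in(line);
    std::string offset_text;
    if (!(in >> out.index >> out.image >> offset_text)) return false;
    try {
        // Base 16 accepts an optional "0x" prefix.
        out.offset = static_cast<std::uint32_t>(std::stoul(offset_text, nullptr, 16));
    } catch (const std::exception&) {
        return false;
    }
    return true;
}

// Address to inspect = where the module was loaded + offset from the report.
// image_base is a value you read from the debugger, not from the report.
std::uint32_t absolute_address(std::uint32_t image_base, const Frame& f) {
    return image_base + f.offset;
}

// Names the exception code; only the code seen in this report is listed.
std::string exception_name(std::uint32_t code) {
    return code == 0xC0000005u ? "STATUS_ACCESS_VIOLATION" : "unknown";
}
```

When debugging natively in Visual Studio, the Modules window lists the load address of each module. Passing that value as `image_base` along with frame 0 gives an address you can jump to in the Disassembly window. Without symbols for the driver that still won't give you a function name, but it does let you check whether different crashes hit the same instruction.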


References

1 Access Violation since updated to WP8

2 [Toolkit][WP8] Performance issues with DepthStencilBuffer

3 Snapdragon (system on chip)

Solution 2:

This kind of crash "should" never be caused by managed code, so you could go looking for a case where your app invokes some system or library API incorrectly. That's tedious. And the problem might have nothing to do with your app; it might be entirely internal to someone else's code. E.g., maybe WebBrowser crashes when the user browses to some malicious page. Or the failing code could be running on a thread that never even runs your code. From your observation that the debugger doesn't show any message before the access violation, and the fact that there are only 2 frames on the call stack, I suspect a failure internal to someone else's code is the most likely case.

So you should focus first on getting a (fairly) reliable repro scenario: the (minimal) set of steps that will (often or usually) produce the crash. This may involve interviewing the users who experienced the crash, or maybe some test automation on your part to try to accelerate the failure rate.

Once you have that, Microsoft (or another 3rd party) should accept responsibility, since managed code is never supposed to be able to cause an unhandled exception like an access violation. The scenario might also give you a hint about how to change your app's behavior to avoid the problem, which matters because a real fix might take a long time to be released and distributed.