Collecting debug information when your GPU hangs
After having my i965 hang twice this morning, I decided to create a small script to make it easier to capture the relevant information when this sort of bug happens. Because the X server stops running, the display is useless, and it’s convenient to be able to get the relevant information by running a single command (I do this using ConnectBot on my phone).
It’s designed to be invoked manually by the user while the system is hung, but if we can somehow detect that it’s locked up, then we could run it automatically.
It collects dmesg, /proc/interrupts, /proc/dri and (for Intel cards) intel_gpu_dump output at the time of the hang. It then leaves behind a crash report in /var/crash, so that after the user recovers their system, apport will collect the usual information and submit a bug on the appropriate package.
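To give a rough idea of the collection step, here is a minimal sketch in Python. It is not the actual script: the real one uses the apport libraries to build a crash report in /var/crash, whereas this version uses only the standard library and writes a plain text file. The function names, the report keys, and the output filename are all made up for illustration, and it assumes that dmesg and intel_gpu_dump, if installed, can be run without arguments.

```python
import os
import subprocess
import time


def read_file(path):
    """Return the contents of path, or a short note if it cannot be read."""
    try:
        with open(path, "r", errors="replace") as f:
            return f.read()
    except OSError as e:
        return "unreadable: %s" % e


def read_tree(root):
    """Concatenate every file found under root (for example /proc/dri)."""
    parts = []
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(dirpath, name)
            parts.append("==> %s <==\n%s" % (path, read_file(path)))
    return "\n".join(parts)


def run(cmd):
    """Return a command's combined output, or a note if it cannot be run."""
    try:
        result = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, timeout=30
        )
        return result.stdout.decode(errors="replace")
    except (OSError, subprocess.TimeoutExpired) as e:
        return "failed: %s" % e


def collect(proc="/proc"):
    """Gather the items listed above; the key names are hypothetical."""
    return {
        "Dmesg": run(["dmesg"]),
        "Interrupts": read_file(os.path.join(proc, "interrupts")),
        "Dri": read_tree(os.path.join(proc, "dri")),
        # Only meaningful on Intel hardware with intel_gpu_dump installed.
        "IntelGpuDump": run(["intel_gpu_dump"]),
    }


def write_report(info, directory):
    """Write the collected data to a timestamped text file, return its path."""
    path = os.path.join(directory, "gpu-hang-%d.txt" % int(time.time()))
    with open(path, "w") as f:
        for key, value in info.items():
            f.write("%s:\n%s\n\n" % (key, value))
    return path
```

Run over ssh while the machine is hung, something like `write_report(collect(), "/tmp")` would leave a single file to attach to a bug by hand. Every step tolerates failure (a missing tool or unreadable file becomes a note in the report), since a half-complete report from a hung system is still better than none.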
If this seems useful, it could be added to x11-common or to apport.
Will it also work for this bug? (X frozen but ssh access available)
https://bugs.launchpad.net/bugs/383973
Nicolò Chieffo
June 17, 2009 at 15:28
On Wed, Jun 17, 2009 at 02:28:30PM +0000,
mdz
June 17, 2009 at 17:47
hmm, it seems replying by mail doesn’t work so well.
The script will “work” for that situation, but since you’ve already gathered most of the info anyway and filed a bug, it won’t help you much.
Your bug may actually be the same one I was chasing, too.
mdz
June 17, 2009 at 22:04
Funnily enough, I’ve added some debug notes about this kind of thing today on my blog.
http://smackerelofopinion.blogspot.com/2009/06/looking-at-intel-x-hangs.html
Colin King
June 17, 2009 at 20:14
Great stuff! Maybe you could add your suggestions to https://bugs.edge.launchpad.net/ubuntu/+source/xorg/+bug/388467 ? It should be easy enough to incorporate them into the script.
mdz
June 17, 2009 at 22:04
How about adding it upstream instead?
foo
June 18, 2009 at 05:02
Dear foo,
Where do you suggest it be added? It could be useful with more than one X driver, and they’re all shipped separately. Since it uses the apport libraries to collect and store the system state, I could submit it upstream to apport, but I don’t think that’s what you meant.
mdz
June 19, 2009 at 14:42