Bad DMA/do_IRQ errors on suspend/resume, with occasional freezing

Solution 1:

1. "Bad DMA"

Let's deal with the "bad dma" errors first, since they're the only consistent ones which are reflected in your logs.

  • These, as well as any problems suspending/resuming, are caused by your internal USB 3G modem, which from the MAC address is an Ericsson F3507g.
    • Yes, you read that right. Not every USB device has to be external or plugged into one of the visible USB ports. Modern laptops will run a whole bunch of internal peripherals such as Wireless/3G cards, bluetooth, webcams, etc. from an internal USB "hub".

Notice this tell-tale sequence, which repeats every time the "bad dma" errors occur:

[171783.085166] usb 2-1.6: USB disconnect, device number 10
[171783.086623] ehci_hcd 0000:00:1d.0: dma_pool_free buffer-128, eafaa000/2afaa000 (bad dma)
[171783.087046] cdc_ncm 2-1.6:1.6: usb0: unregister 'cdc_ncm' usb-0000:00:1d.0-1.6, CDC NCM
[171783.092382] done.
[171783.129959] ehci_hcd 0000:00:1d.0: dma_pool_free buffer-128, eb1aa000/2b1aa000 (bad dma)
  • The cdc_ncm module is implicated; it is the low-level USB driver used for high-speed cellular modems.
  • This bug indicates that the F3507g WWAN cards have had similar problems with Ubuntu/Linux before, and a kernel update fixed it.
    • The error should only affect suspend/resume/freezing, and should NOT affect normal operation of the 3G card.
    • But I'd recommend you try one of the mainline kernels (or the Quantal 3.5 kernel), to see if it makes any difference.
    • The other extreme alternative, of course, is to either disable your 3G card in the BIOS, or if you actively use it, consider replacing it with another brand/model.
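
To check whether the same disconnect-then-"bad dma" sequence shows up in your own logs, something like the following may help. It is only a sketch: the function name bad_dma_context is made up for illustration, and the default path /var/log/kern.log is the usual Ubuntu location, which is an assumption about your setup.

```shell
# Hypothetical helper: print every "bad dma" line together with the two
# log lines that precede it, so the USB disconnect that triggers it is visible.
bad_dma_context() {
    logfile="${1:-/var/log/kern.log}"   # assumed default log location
    grep -B2 'bad dma' "$logfile"
}

# Usage (reading the log may require sudo):
#   bad_dma_context /var/log/kern.log
```

If every "bad dma" line is preceded by a "USB disconnect" for the same device (2-1.6 in the excerpt above), that supports the modem being the culprit.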

2. "do_IRQ" and "sdb1"

It's harder to debug these isolated warnings without context (which can be the key, as shown above). So we'll just have to guesstimate until you can provide a kern.log containing one or both of these errors.

  • "do_IRQ" seems to stem most often from PCI-Express bus issues, including graphics cards, and VIA chipsets are often implicated.
    • This message can otherwise be safely ignored.
  • Given that your SMART logs look OK, the "sdb1" errors probably come from even more USB communication issues with the external drive.

    • If you find more USB errors around these, I'd chalk it up to an occasional USB incompatibility and not worry; but if they occur only by themselves, it may indicate a problem with the drive. A more complete log would help :)
  • Again, I'd recommend trying one of the Quantal 3.5 kernels and seeing if things change, especially for the "do_IRQ".

3. Trying the 3.5-series Quantal Kernel (or a mainline build)

  • Once Ubuntu 12.10 is released, its kernel will be made available for 12.04 as a "backport" (the same goes for 13.04 and 13.10).
  • Right now, you can get the "beta" kernels from the Ubuntu-X team PPA
  • BUT this PPA also contains a number of extra packages which you have no need to upgrade.
  • So I've made just the backported kernel available in another PPA
  • To install:

    sudo apt-add-repository ppa:auanswers/lts-backported-kernels-prerelease
    sudo apt-get update
    sudo apt-get install linux-generic-lts-quantal
    
  • Reboot, and you should boot into the new kernel (check with uname -a). Nvidia/AMD graphics and Broadcom wireless cards may be problematic. You can always select your old 3.2-series kernel by keeping Shift pressed at boot until the Grub menu shows, and then going into "Previous Linux Versions"

  • For even more bleeding-edge kernels, you can try one of the mainline builds. Please see this question and answer for more information:

Should I upgrade to the "mainline" kernels?
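
After rebooting, a quick way to confirm you are really on a 3.5-or-newer kernel is to compare the output of uname -r against the version you expect. The helper below is a rough sketch; kernel_at_least is a made-up name, and it relies on GNU sort's -V (version sort) option, which the coreutils shipped with Ubuntu should provide.

```shell
# Hypothetical helper: succeed if the running kernel is at least the wanted version.
# The second argument is optional and only there to make the check testable.
kernel_at_least() {
    wanted="$1"
    current="${2:-$(uname -r)}"
    # sort -V orders version strings; if the wanted version sorts first,
    # the current kernel is the same version or newer.
    lowest=$(printf '%s\n%s\n' "$wanted" "$current" | sort -V | head -n1)
    [ "$lowest" = "$wanted" ]
}

# Usage:
#   kernel_at_least 3.5 && echo "running the new kernel" || echo "still on the old one"
```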

Solution 2:

The errors you added in your edit seem to point to a bad disk sector.

Have you tried running fsck or badblocks?

I suggest doing everything from a Live CD environment, as follows:

  1. Boot the live Ubuntu CD (or any other distro)
  2. Scan for disks and partitions with fdisk

    sudo fdisk -l
    
  3. Once you have identified the correct device name (for example /dev/sda1), try running these two commands. The -c option makes e2fsck run badblocks to scan for bad blocks and mark them so they are no longer used

    sudo e2fsck -cv /dev/sda1
    sudo badblocks -sv /dev/sda
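
e2fsck should not be run on a mounted filesystem, which is the main reason for working from the Live CD. If you want to double-check before running the commands above, a small sketch like this could be used. The name is_mounted is hypothetical, and it assumes the device appears in /proc/mounts under the same name you pass in (for example /dev/sda1, not a /dev/disk/by-uuid path).

```shell
# Hypothetical helper: succeed if the given device is listed as mounted.
# The second argument is optional and defaults to the kernel's mount table.
is_mounted() {
    dev="$1"
    mounts="${2:-/proc/mounts}"
    # The first field of each line in /proc/mounts is the device name.
    awk -v d="$dev" '$1 == d { found = 1 } END { exit !found }' "$mounts"
}

# Usage:
#   is_mounted /dev/sda1 && echo "unmount it first" || echo "safe to check"
```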
    

Solution 3:

For the "no irq for vector" issue, try adding "pci=nomsi" to the kernel boot options.
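
On Ubuntu, permanent kernel boot options normally go into the GRUB_CMDLINE_LINUX_DEFAULT line of /etc/default/grub. You can simply edit that line by hand; the function below is just a sketch of the same edit, with add_kernel_param being a made-up name. It assumes the value is wrapped in double quotes, as in the stock file, and it uses GNU sed's -i option.

```shell
# Hypothetical helper: append a parameter to GRUB_CMDLINE_LINUX_DEFAULT
# unless the line already contains it (simple substring check).
add_kernel_param() {
    param="$1"
    file="${2:-/etc/default/grub}"   # assumed default location on Ubuntu
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$file"; then
        return 0                     # already present, nothing to do
    fi
    # Insert the parameter just before the closing double quote.
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 $param\"/" "$file"
}

# Usage (from a root shell, e.g. after "sudo -s"):
#   add_kernel_param pci=nomsi
```

Whichever way you make the change, run sudo update-grub afterwards and reboot for it to take effect. To try the option once without changing any files, you can instead press "e" on the entry in the Grub menu and add pci=nomsi to the end of the line starting with "linux".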