How to understand the Audio Subsystem?
Solution 1:
During my tenure maintaining the PC audio stack in Ubuntu, I made several presentations on debugging the desktop aspect, and you may find this one the most gentle or informative. The gist of the matter is that combinations of ego (aka "technical pride"), politics, and ownership collisions (aka "no maintainer") have caused a morass of APIs to crop up over the past decade (and a few years), and many incorrect assumptions have been made all over the place. We're continuing to eliminate many of the inconsistencies and poor assumptions. For Ubuntu specifically, we have a developer Launchpad group. I can't speak for the other members, because they're all Canonical employees and may have additional restrictions, but certainly feel free to contact me through Launchpad; I'm happy to walk you through the debugging process individually as our schedules mesh.
It's absolutely vital to note that PulseAudio places more stringent requirements on one's audio hardware, and does so in a fundamentally different manner from traditional applications. These requirements continue to expose poor assumptions in the kernel, the sound drivers, and underlying libraries. For a long time, people's "solution" for "fixing broken sound" has consisted of removing PulseAudio, which merely sidesteps the broken aspects of the kernel, the sound drivers, and underlying libraries rather than fixing them. So, while you "get sound back," the problem remains. In the end, it's much better to fix the problems completely.
To summarize, there are two general approaches for methodical audio debugging. Either one starts at the application level, say, Banshee, or one starts at the hardware level. For the sake of consistency I'll refer to the former as "high level," i.e., higher in the stack, and the latter as "low level," or lower in the stack closest to the hardware. I'm most comfortable troubleshooting from the lowest level upward, since my involvement in audio over the past decade (from creating and maintaining device drivers up through application integration) has seen OEMs repeating mistakes that Linux, i.e., the kernel, has to work around (or "hack around" as we developers sometimes call it). Yet hardware people are no more guilty than software people. These problems have many bases: it doesn't matter whether they lie in the BIOS, the power supply, the motherboard bridges, sound controllers and codecs, the kernel, or the userspace portions (libraries, APIs, PulseAudio, applications, and so on). In the end the issue is quite straightforward: we all insufficiently handle out-of-bounds conditions from every level below and across.
At the PC hardware level, we begin debugging by identifying precisely which components are used. Most modern desktop audio hardware has two significant pieces of information: the PCI subsystem identifier and the audio codec subsystem identifier. You can find the former via lspci -nv; look for the PCI class codes 0401 (AC'97) or 0403 (High Definition Audio). Depending on hardware, the latter can be identified through information exposed by the ALSA driver itself in /proc/asound (the developers' debugging script I just linked automates much of that information-gathering). It's absolutely vital to realize and remember that similar, even identical, symptoms often have varying hardware and software causes. For that reason, a clear bug report makes life easiest for the people fixing the bugs, and that's why in Ubuntu we have ubuntu-bug alsa-base for driver problems and ubuntu-bug pulseaudio for application problems. If you aren't sure what's at fault, just choose one, or use ubuntu-bug's audio symptom. Regardless, we'll request additional information as necessary.
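To make the lspci step concrete, here is a minimal Python sketch that pulls the audio-class devices and their PCI subsystem IDs out of lspci -nv style output. The function name audio_devices and the sample text are my own illustration (the IDs are made up, not from a real bug report), and the parsing assumes the usual "slot class: vendor:device" header line followed by an indented "Subsystem:" line.

```python
import re

# PCI class codes named in the text above (AC'97-era and HD Audio devices).
AUDIO_CLASSES = {"0401", "0403"}

# Header line, e.g. "00:1b.0 0403: 8086:293e (rev 03)" (assumed layout).
HEADER = re.compile(r"^(\S+)\s+([0-9a-f]{4}):\s+([0-9a-f]{4}:[0-9a-f]{4})")
# Indented detail line, e.g. "\tSubsystem: 1028:0270".
SUBSYSTEM = re.compile(r"^\s+Subsystem:\s+([0-9a-f]{4}:[0-9a-f]{4})")


def audio_devices(lspci_text):
    """Return one dict per audio-class device found in `lspci -nv` style text."""
    devices = []
    current = None
    for line in lspci_text.splitlines():
        header = HEADER.match(line)
        if header:
            slot, pci_class, device_id = header.groups()
            current = None  # a new device starts; forget the previous one
            if pci_class in AUDIO_CLASSES:
                current = {"slot": slot, "class": pci_class,
                           "id": device_id, "ssid": None}
                devices.append(current)
            continue
        sub = SUBSYSTEM.match(line)
        if sub and current is not None:
            current["ssid"] = sub.group(1)
    return devices


# Illustrative sample only; these IDs are invented for the example.
SAMPLE = """\
00:02.0 0300: 8086:2a42 (rev 07)
\tSubsystem: 1028:0270
00:1b.0 0403: 8086:293e (rev 03)
\tSubsystem: 1028:0270
\tKernel driver in use: snd_hda_intel
"""

if __name__ == "__main__":
    for dev in audio_devices(SAMPLE):
        print(dev["slot"], dev["class"], dev["id"], dev["ssid"])
```

On a real machine the input would come from something like subprocess.run(["lspci", "-nv"], capture_output=True, text=True).stdout rather than from the sample string.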
The PCI SSID is important, because it is a record of which computer manufacturer integrates which set of audio components onto a motherboard. For example, you'll often find Dell, HP/Compaq, Acer, Samsung, Lenovo, and so on using different combinations of IDT/Sigmatel, Realtek, Cirrus, Analog Devices, and Conexant audio components. A Dell that has a Realtek 269 does not necessarily have the same behavior as an HP with a Realtek 269.
The codec SSID can be considered the analog of the PCI one from the perspective of the audio equipment manufacturer. Information associated with the codec SSID can be used to find which revision has been integrated onto the silicon.
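For HD Audio hardware, the codec information usually shows up in files such as /proc/asound/card0/codec#0. The following is a rough Python sketch of extracting the codec name, subsystem ID, and revision from that kind of text. The helper name parse_codec_info and the sample values are assumptions made for illustration, and other drivers (AC'97, USB) expose different files.

```python
def parse_codec_info(text):
    """Extract a few identifying fields from HD Audio codec#N style text."""
    # Maps the labels expected in the file to the keys we return.
    wanted = {
        "Codec": "name",
        "Vendor Id": "vendor_id",
        "Subsystem Id": "ssid",
        "Revision Id": "revision",
    }
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Only top-level (unindented) labels match; keep the first occurrence.
        if sep and key in wanted and wanted[key] not in info:
            info[wanted[key]] = value.strip()
    return info


# Illustrative sample only; the IDs are invented for the example.
SAMPLE_CODEC = """\
Codec: Realtek ALC269
Address: 0
Vendor Id: 0x10ec0269
Subsystem Id: 0x10280270
Revision Id: 0x100004
"""

if __name__ == "__main__":
    print(parse_codec_info(SAMPLE_CODEC))
```

To run it against a live system, read each file matched by glob.glob("/proc/asound/card*/codec#*") and pass its contents to the function.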
Unfortunately, here's where the problems begin. Assuming that the BIOS is ok (which is a rather blind assumption, since there are a lot of broken BIOSes that wreak havoc with Linux audio), sometimes the PCI SSID is reused incorrectly. In those situations, our recourse is to apply specific quirks in the driver based on codec SSID. Unfortunately still, sometimes the codec SSID is reused incorrectly. In those situations, we have to look at specific revisions of the codec in addition to more blindly brute-forcing attempts to reinitialize the driver.
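The fallback order described above (PCI SSID first, codec SSID when the PCI SSID is unreliable, codec revision when even the codec SSID is reused) can be pictured as a most-specific-match lookup. This is only a conceptual sketch in Python: the tables, fixup names, and the choose_quirk function are all hypothetical, and the real ALSA drivers are written in C and structured differently.

```python
# Every entry below is invented for illustration; none is a real driver quirk.
QUIRKS_BY_PCI_SSID = {"1028:0270": "pci-fixup-a"}
QUIRKS_BY_CODEC_SSID = {"0x10280270": "codec-fixup-b"}
QUIRKS_BY_REVISION = {("0x10280270", "0x100004"): "revision-fixup-c"}


def choose_quirk(pci_ssid, codec_ssid, revision):
    """Return the most specific matching quirk name, or None if nothing matches."""
    # Most specific: a particular codec SSID at a particular revision.
    by_revision = QUIRKS_BY_REVISION.get((codec_ssid, revision))
    if by_revision is not None:
        return by_revision
    # Next: the codec SSID alone, used when the PCI SSID cannot be trusted.
    by_codec = QUIRKS_BY_CODEC_SSID.get(codec_ssid)
    if by_codec is not None:
        return by_codec
    # Least specific: the PCI SSID recorded by the computer manufacturer.
    return QUIRKS_BY_PCI_SSID.get(pci_ssid)


if __name__ == "__main__":
    print(choose_quirk("1028:0270", "0x10280270", "0x100004"))
    print(choose_quirk("1028:0270", "0x10280270", "0x100001"))
    print(choose_quirk("ffff:ffff", "0x00000000", "0x0"))
```

The point of the ordering is simply that a narrower identifier overrides a broader one whenever the broader one has been reused across machines that behave differently.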
For most of the newest equipment, we tend to trust the BIOS instead of using our historical "just use this hard-coded definition" approach. In some situations, we have to use tools to emulate the codec to find which pin definitions perform what functions. Missing functions in the driver to handle jack sense, i.e., muting the internal speakers when headphones are inserted and unmuting them when headphones are removed, can be easily corrected in this fashion.
Once we've determined that your underlying hardware isn't broken and that the sound driver doesn't need to be patched, we look at the userspace layer, namely alsa-lib, alsa-plugins, and pulseaudio. The vast majority of issues lie in the driver or in (some interface to) PulseAudio.
alsa-lib is responsible for handling all native ALSA operations (thus, PulseAudio uses it, too). Problems here will manifest themselves regardless of whether PulseAudio is used. On the other hand, problems arising only when the 'default' ALSA virtual device is used can point to either the PulseAudio alsa-lib plugin (in alsa-plugins) or to PulseAudio itself. In standard Ubuntu 8.10 and Kubuntu 10.10 installations, 'default' routes through PulseAudio, so switching between the 'default' and native PulseAudio outputs (or inputs as the use case may be) should help narrow down which layer to further investigate.
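One way to read the narrowing-down logic above is as a small decision table. The Python sketch below encodes that reading. The function name suspect_layer and its three inputs are hypothetical, and the results are hints about where to look next, not a diagnosis.

```python
def suspect_layer(native_alsa_ok, default_ok, native_pulse_ok):
    """Suggest which layer to investigate next, given three playback tests.

    native_alsa_ok  -- playback works on a raw ALSA hardware device
    default_ok      -- playback works through the 'default' ALSA device
    native_pulse_ok -- playback works from a native PulseAudio client
    """
    if not native_alsa_ok:
        # Fails even without PulseAudio involved: look below it.
        return "driver or alsa-lib"
    if not native_pulse_ok:
        # Raw ALSA is fine but PulseAudio's own path fails.
        return "PulseAudio"
    if not default_ok:
        # Only the ALSA-to-PulseAudio bridge fails.
        return "PulseAudio alsa-lib plugin (alsa-plugins)"
    return "no fault found at these layers"


if __name__ == "__main__":
    print(suspect_layer(True, False, True))
```

Running the three playback tests by hand and feeding the outcomes in is enough; the value is in forcing each layer to be tested separately before filing the bug.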
Solution 2:
You might want to start with the PulseAudio entry on the Ubuntu wiki.