Emulating microphone input to Chrome inside a Docker container

Background

I am trying to control the input to a WebRTC web application running on Chrome, controlled by Selenium, inside a Docker container.

This is part of an automated test of the WebRTC application.
As part of the test, I need to be able to check that the audio is being received on the other side when it is supposed to be.
Basically, I want to check that if one client speaks, the other client hears it, and vice-versa, unless the client is on mute.

Now, I can easily get Chrome to emulate microphone (and camera) input by starting it with the command-line parameters:

--use-fake-ui-for-media-stream
--use-fake-device-for-media-stream

The problem with this is that the default sample contains a lot of silence, which makes the audio harder to detect. I can solve that by supplying my own audio file with more consistent audio:

--use-file-for-fake-audio-capture=/opt/media/audio1.wav

But this has another problem - if Chrome is both sending and receiving audio at the same time, the received audio is severely clobbered, almost into complete silence, as part of Chrome's echo cancellation functionality. (Echo cancellation is set as part of the WebRTC application, and not as part of Chrome itself, and I don't want to make changes to the code being tested to facilitate the test.)
Using two different samples (one for each client) helps slightly, but not very much.

The real problem is that both clients "talk" non-stop for as long as they are running, which both messes up the audio because of the aforementioned echo cancellation, and also isn't a realistic scenario to test because people don't usually talk over each other constantly.

I could theoretically use specially-created samples with intentional sections of noise/silence in them, but then aligning those samples between clients and with the test validation would be a nightmare.

Problem

What I really need is to be able to start and stop playing audio into the client on demand.

There doesn't seem to be any way to control the fake media stream in Chrome, so it seems that my best option is probably to somehow create a fake "microphone" audio input device inside the Docker container, and control that instead.

On a standard Linux system, you can use pulseaudio to loop the audio output back in as a capture device, which looks promising, but I don't know how to use that inside a Docker container.
The Docker container doesn't even have any audio devices to use it with.
I've found various guides on how to set up a Docker container to use the audio hardware of the host machine, but that isn't very useful since these containers are running on ESXi servers, which don't have any sound cards to use.
Pulseaudio also supports virtual devices, but those need drivers / kernel modules to work. I may be wrong, but I don't think you can use those inside a Docker container.

Question

Sorry if the above was a bit wordy, but I was trying to explain the problem and the various directions I've already looked into.

So, does anyone know a way that I could control audio input into Chrome's capture device inside a Docker container, either using a fake capture device, or through some other means?


I managed to find a solution to this. The basic concept is fairly simple, but it has a couple of gotchas to work around.

The solution involves making use of pulseaudio's ability to create virtual audio sources, and the paplay tool to play media into that audio device.

Setting up the docker container

I needed to make my own Docker image, based on the Ubuntu/Chrome/Selenium image that I was already using, to install the pulseaudio package, tweak the entrypoint to launch it, and add some audio files to play back.

Dockerfile:

FROM selenium/standalone-chrome-debug

# Install pulse audio
RUN apt-get -qq update && apt-get install -y pulseaudio

# Copy some media files into place
RUN mkdir -p /opt/media
COPY audio1.wav /opt/media/audio1.wav
COPY audio2.wav /opt/media/audio2.wav

# Use custom entrypoint
COPY entrypoint.sh /opt/bin/entrypoint.sh

ENTRYPOINT /opt/bin/entrypoint.sh

Then, I needed a custom entrypoint to start the pulseaudio server and configure a custom audio source, before handing over to the standard Selenium entry point.
There are two virtual output devices here, so that one can be used for normal audio playback without that audio being piped into the virtual microphone.

entrypoint.sh:

#!/bin/sh
# Start the pulseaudio daemon; --exit-idle-time=-1 stops it from exiting when idle
pulseaudio -D --exit-idle-time=-1

# Create virtual output device (used for audio playback)
pactl load-module module-null-sink sink_name=DummyOutput sink_properties=device.description="Virtual_Dummy_Output"

# Create virtual microphone output, used to play media into the "microphone"
pactl load-module module-null-sink sink_name=MicOutput sink_properties=device.description="Virtual_Microphone_Output"

# Set the default source device (for future sources) to use the monitor of the virtual microphone output
pacmd set-default-source MicOutput.monitor

# Create a virtual audio source linked up to the virtual microphone output
pacmd load-module module-virtual-source source_name=VirtualMic

# Allow pulseaudio to be accessed via TCP (from localhost only), so that other users can access the virtual devices
pacmd load-module module-native-protocol-tcp auth-ip-acl=127.0.0.1

# Configure the "seluser" user to use the network virtual soundcard
mkdir -p /home/seluser/.pulse
echo "default-server = 127.0.0.1" > /home/seluser/.pulse/client.conf
chown seluser:seluser /home/seluser/.pulse -R


# Start Selenium-Chrome-Standalone
/opt/bin/entry_point.sh
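
To confirm that the entrypoint created the devices, one option is to list the pulseaudio sources in the running container and look for VirtualMic. This is only a sketch: the helper name has_pulse_source is made up for illustration, and it assumes the tab-separated output of "pactl list short sources", where the second column is the source name.

```shell
#!/bin/sh
# has_pulse_source NAME (hypothetical helper, not part of pulseaudio)
# Reads `pactl list short sources` output on stdin and succeeds only if a
# source with exactly that name appears in the second column.
has_pulse_source() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}
```

From the host, it could be used along these lines (assuming the container name used later in this answer):

docker exec TestContainerName pactl list short sources | has_pulse_source VirtualMic && echo "VirtualMic is available"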

Because I want to use the audio device in a Selenium-controlled instance of Chrome, which is run as the "seluser" user, I needed to expose the virtual soundcard via TCP (for localhost connections only), and then configure the seluser to use that networked soundcard. No additional setup is required. The virtual source is the only audio input device on the Docker image, so Chrome will use it automatically. All that remains is building and running the docker container.
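
For completeness, building and running could look roughly like the sketch below. The image tag chrome-pulse-test is a hypothetical name, TestContainerName is simply the container name used in the exec command in this answer, and 4444 is assumed to be the Selenium port you want to publish. The DOCKER variable is only there so the commands can be printed (DOCKER=echo) instead of executed.

```shell
#!/bin/sh
# Assumed names: the image tag is hypothetical; the container name matches
# the one used with "docker exec" in this answer.
IMAGE_NAME="${IMAGE_NAME:-chrome-pulse-test}"
CONTAINER_NAME="${CONTAINER_NAME:-TestContainerName}"
# Set DOCKER=echo to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Build the image from the directory holding the Dockerfile, entrypoint.sh
# and the audio files, then start it detached with the Selenium port published.
build_and_run() {
    $DOCKER build -t "$IMAGE_NAME" . &&
    $DOCKER run -d --name "$CONTAINER_NAME" -p 4444:4444 "$IMAGE_NAME"
}
```

Calling build_and_run from the directory containing the Dockerfile should leave a container running that the Selenium client can connect to.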

Playing the audio

Once the container is running, I used paplay to send media into the virtual output device, which I named "MicOutput" above. That can be triggered via an exec command:

docker exec -t -i TestContainerName paplay --device=MicOutput /opt/media/audio2.wav
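
Since the goal was to start and stop audio on demand, the same command can be wrapped in a pair of helpers. This is a sketch rather than exactly what I ran: the function names are made up, "docker exec -d" is used so that the call returns while the file is still playing, and stopping early assumes pkill (from the procps package) is available in the image.

```shell
#!/bin/sh
CONTAINER_NAME="${CONTAINER_NAME:-TestContainerName}"
# Set DOCKER=echo to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"

# Start playing a WAV file (path inside the container) into the virtual
# microphone. "docker exec -d" detaches, so this returns immediately.
start_mic_audio() {
    $DOCKER exec -d "$CONTAINER_NAME" paplay --device=MicOutput "$1"
}

# Stop playback early by killing any running paplay process in the container.
# Assumes pkill (from procps) is installed in the image.
stop_mic_audio() {
    $DOCKER exec "$CONTAINER_NAME" pkill paplay
}
```

A test could then call start_mic_audio /opt/media/audio2.wav, check that the other client receives audio, and call stop_mic_audio before checking for silence.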

And that's it.

Of course, I also needed to use the "--use-fake-ui-for-media-stream" option in the Chrome capabilities when configuring my Selenium WebDriver, so that Chrome grants access to the device without prompting. I also had to make sure not to use the "--use-fake-device-for-media-stream" option, as that would replace my virtual input device with Chrome's built-in fake one.

Thanks to spacepickle's answer to this question for putting me on the right track, and to Eli Billauer's post on using pulseaudio with multiple users.