How can I change the voice used by Firefox Reader View (Narrator) in Ubuntu?

I managed to use the festival voice as default on Firefox.

enter image description here

In order to do that, we need to change some configurations on the file /etc/speech-dispatcher/speechd.conf. But first, I need to explain the basic idea of how it works. We can always see what voice is the default one used by speech-dispatcher using the command spd-say:

spd-say "Hello. How are you?"

On Ubuntu, the default Texto To Speech (TTS) voice that comes with speech-dispatcher is espeak. So we hear exactly the same voice when we use this other command:

espeak "Hello. How are you?"

That happens because spd-say is just using espeak voices as output. And well, Firefox does the same, it uses whatever voice is configured in speech-dispatcher as output to read web pages in the reader view mode (Ctrl+Alt+R).

So, what we need to do here is to change the voice that comes as output in the spd-say command and, once we do that, Firefox is going to use a different TTS voice as default as well. I'm going to describe the process of making it work with the festival voice, but I believe the procedure is the same if you want to run a different TTS voice. First, we need to install festival:

sudo apt-get install festival speech-dispatcher-festival

We can test its voice in the command line by typing:

echo "Hello. How are you?" | festival --tts

Now we need to change the file speechd.conf. So we type sudo vi /etc/speech-dispatcher/speechd.conf on the terminal and around the line 205 we'll see the following piece of commented configurations:

#AddModule "espeak"       "sd_espeak"   "espeak.conf"
AddModule "festival"     "sd_festival"  "festival.conf"
#AddModule "flite"        "sd_flite"     "flite.conf"
#AddModule "ivona"    "sd_ivona"    "ivona.conf"
#AddModule "pico"        "sd_pico"     "pico.conf"
#AddModule "espeak-generic" "sd_generic" "espeak-generic.conf"
#AddModule "espeak-mbrola-generic" "sd_generic" "espeak-mbrola-generic.conf"
#AddModule "swift-generic" "sd_generic" "swift-generic.conf"
#AddModule "epos-generic" "sd_generic"   "epos-generic.conf"
#AddModule "dtk-generic"  "sd_generic"   "dtk-generic.conf"
#AddModule "pico-generic"  "sd_generic"   "pico-generic.conf"
#AddModule "ibmtts"       "sd_ibmtts"    "ibmtts.conf"
#AddModule "cicero"        "sd_cicero"     "cicero.conf"

# DO NOT REMOVE the following line unless you have
# a specific reason -- this is the fallback output module
# that is only used when no other modules are in use
#AddModule "dummy"         "sd_dummy"      ""

# The output module testing doesn't actually connect to anything. It
# outputs the requested commands to standard output and reads
# responses from stdandard input. This way, Speech Dispatcher's
# communication with output modules can be tested easily.

# AddModule "testing"

# The DefaultModule selects which output module is the default.  You
# must use one of the names of the modules loaded with AddModule.

#DefaultModule espeak
DefaultModule festival

It's necessary to make two changes here:

  1. Uncomment the line AddModule "festival" "sd_festival" "festival.conf"
  2. Add the line DefaultModule festival

We need to run festival as a server in order to make speech-dispatcher use it as default. We can do that by adding the following line at the end of the file that's open when we use the command sudo crontab -e:

@reboot /usr/bin/festival --server

Now it's done!! After rebooting the system Firefox and spd-say will be using the festival voice as output.


Additional Information

I believe that the procedure to make new voices work in Firefox will be always the same:

  1. Uncomment the module of the new TTS voice that we installed (/etc/speech-dispatcher/speechd.conf).

  2. Set a new default line for the TTS voice that we want (/etc/speech-dispatcher/speechd.conf).

  3. Run a server on the port specified on the files inside the folder /etc/speech-dispatcher/modules/.

What called my attention on that is that there's a module for the Ivona voices there. Ivona is a proprietary product and today the only way to use it (as far as I know) is as a pay-as-you-go service on AWS, but its voices are really good and they sound very natural.

The file /etc/speech-dispatcher/modules/ivona.conf is configured to listen to a server on the port 9123. I think perhaps there's a way to run a local server that gets the Ivona voices using my AWS APIs ( I'm not sure, but perhaps using a part of this Node.js app that's already developed) ... And if that's possible, it means that it's also possible to run Ivona on Ubuntu as the default voice of the system and consequently use it with the reader view mode on Firefox . Although I don't know how to do it now, it looks like an interesting possibility.


The voices used by the narrate function of the firefox reader mode depend on the platform you run it on. On Linux, firefox will use speech-dispatcher to render text to artificial speech.

So whatever you have configured in your speech-dispatcher settings (/etc/speech-dispatcher/speechd.conf) should be picked up and used by firefox. There are various engines and voices available for speech-dispatcher, some of which can be installed via Ubuntu packages, e. g. speech-dispatcher-espeak-ng or speech-dispatcher-festival.

There is limited support for selecting voices/languages from within the firefox reader GUI, but most settings have to be made on the OS side, which is speechd.conf on linux.

Some settings are available via the about:config dialog if you search for "narrate":

enter image description here

I experimented quite a bit with different settings in both, about:config and speechd.conf, but could not get anything to work but the default that comes with Ubuntu. The feeling I get is that the interface between firefox and speech-dispatcher is not very stable, but maybe you are more lucky experimenting.

This guy: https://bbs.archlinux.org/viewtopic.php?id=217411 seems to have had more success on Archlinux configuring things to use festival as output. I tried to reproduce this on Ubuntu 18.04 but could never get firefox to run with it.


Thanks to Rafael Muynarsk to answer for kickstarting me. Here is what I did

Install Dependencies

apt install festival speech-dispatcher-festival festvox-{rablpc16k,kallpc16k,kdlpc16k} sox
  • festvox-{rablpc16k,kallpc16k,kdlpc16k} are voice languages for english
  • sox, without it only some part of the text where read

Edit config

sudo vim /etc/speech-dispatcher/speechd.conf

Disable espeak-related config and enable festival one

#AddModule "espeak-ng"    "sd_espeak-ng" "espeak-ng.conf"
AddModule "festival"     "sd_festival"  "festival.conf"

#DefaultModule espeak-ng
DefaultModule festival

Start festival server

Without it I got only some syntences.

/usr/bin/festival --server

Restart Firefox

Then go to reader view mode and try it.


Mbrola (easier method, 20.04 LTS, adapted from Édouard Lopez)

  1. sudo apt-get install espeak mbrola-us1 (or, to get a list of available voices and languages, run apt-cache search mbrola. Note the 'us1'. It will be set as default in step 4, and step 7 must be repeated for each voice installed!)

In /etc/speech-dispatcher/speechd.conf

  1. Enable espeak-mbrola-generic module:

    sudo sed -i '/#AddModule "espeak-mbrola-generic/s/^#//g' /etc/speech-dispatcher/speechd.conf
    
  2. Set Mbrola as default module:

    sudo sed -i '/DefaultModule espeak-ng/s/-ng/-mbrola-generic/g' /etc/speech-dispatcher/speechd.conf
    
  3. Use en1 language:

    sudo sed -i '/#LanguageDefaultModule "en"  "espeak"/s//LanguageDefaultModule "en1"  "espeak-mbrola-generic"/g' /etc/speech-dispatcher/speechd.conf
    

In /etc/speech-dispatcher/modules/espeak-mbrola-generic.conf

  1. Disable generic language:

    sudo sed -i '/GenericLanguage/s/^/#/g' /etc/speech-dispatcher/modules/espeak-mbrola-generic.conf
    
  2. Disable add voices

    sudo sed -i '/AddVoice/s/^/#/g' /etc/speech-dispatcher/modules/espeak-mbrola-generic.conf
    
  3. Enable US voices

    sudo sed -i '/#AddVoice.*us1"$/s/^#//g' /etc/speech-dispatcher/modules/espeak-mbrola-generic.conf`
    

Now, the default, indistinguishable voice will be replaced by a reasonable one that works for offline files. Happy listening!