lives vs lives: how can I correct the pronunciation of the 'say' command?

Is there a reliable way to correct the pronunciation of the say command without removing words or introducing pauses?

macOS's built-in say command mispronounces some words whose pronunciation depends on context. A persistently troublesome example is lives, the plural of the noun life, versus lives, the third-person singular of the verb live. One answer to this question suggests a table-driven approach to correcting mistakes, which is what I have been doing.

The lives/lives case defeats this approach: even if you substitute "lighvs" for "lives" to force a specific pronunciation, say will still say "livs" in some contexts and with some voices. The underlying speech engine apparently decides based on context, but its rules are opaque.

Example: with the "Kate" voice, lives is mispronounced in this sentence:

"Deputies did everything they could tonight to de-escalate, and they almost lost their lives to a 12-year-old and a 14-year-old," said Volusia County sheriff Mike Chitwood.

Changing it to

"Deputies did everything they could tonight to de-escalate, and they almost lost their lighvs to a 12-year-old and a 14-year-old," said Volusia County sheriff Mike Chitwood.

produces the same result.

Curiously, if you remove the word "almost", leaving

"Deputies did everything they could tonight to de-escalate, and they lost their lives to a 12-year-old and a 14-year-old," said Volusia County sheriff Mike Chitwood.

lives is pronounced as it should be. Replacing "lives" with ",lighvs" also forces a particular pronunciation, but at the cost of an unnatural pause.

Is there a reliable way to correct the pronunciation of the say command without removing words or introducing pauses?


Solution 1:

This may not be the elegant solution you want, but the macOS Speech Manager has speech tuning capabilities built in. You can use them by adding inline modifiers to your text. For example, to spell a word phonetically within a string of text, put [[inpt PHON]] before the word and [[inpt TEXT]] after it, then spell the word using the table of phonemes (Table B-1 in the guide linked below):

say "they almost lost their [[inpt PHON]]lIHvz[[inpt TEXT]]"  # short 'i' sound
say "they almost lost their [[inpt PHON]]lAYvz[[inpt TEXT]]"  # long 'i' sound

(Note that capitalization matters in this table.)

So this lets you avoid removing words and introducing pauses, but it costs readability and requires manual intervention for each occurrence. That may or may not be enough to meet your needs.
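If you need the markup in more than one place, a small shell helper may keep the text easier to read. This is only a sketch: the function name phon is made up for illustration and is not part of macOS, and the phoneme spelling is the one from the example above.

```shell
#!/bin/sh
# phon is a hypothetical helper, not a macOS command: it wraps a phoneme
# spelling in the [[inpt PHON]] ... [[inpt TEXT]] markup shown above.
phon() {
    printf '[[inpt PHON]]%s[[inpt TEXT]]' "$1"
}

# Build the sentence with the long 'i' spelling of "lives".
text="they almost lost their $(phon lAYvz) to a 12-year-old"
printf '%s\n' "$text"

# Speak it only where the say command exists (that is, on macOS).
if command -v say >/dev/null 2>&1; then
    say "$text"
fi
```

You still have to decide by hand which occurrences need which spelling; the helper only saves typing the delimiters.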

Full details are at https://developer.apple.com/library/archive/documentation/UserExperience/Conceptual/SpeechSynthesisProgrammingGuide/FineTuning/FineTuning.html#//apple_ref/doc/uid/TP40004365-CH5-SW11.

Solution 2:

This is the closest I could get:

Use lyyves for the long I (tested with Kate).

Use livvs for the short I (tested with Kate).

Confirmed working with the voices Daniel, Fred, Alex, Samantha, Bruce, and Zarvox too.
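To try a respelling against several voices yourself, something like the following sketch may help. The respell function is hypothetical and deliberately simple: it only replaces a word that stands alone between spaces (no attached punctuation), and it collapses runs of whitespace on any line it changes. Voices that are not installed are reported and skipped.

```shell
#!/bin/sh
# respell is a hypothetical helper: it reads text on stdin and replaces every
# space-delimited token equal to $1 with the respelling $2.
respell() {
    awk -v from="$1" -v to="$2" '{
        for (i = 1; i <= NF; i++)
            if ($i == from) $i = to
        print
    }'
}

sentence="they almost lost their lives to a 12-year-old"
respelled=$(printf '%s\n' "$sentence" | respell lives lyyves)
printf '%s\n' "$respelled"

# Voice names are the ones listed above; say -v selects a voice on macOS.
if command -v say >/dev/null 2>&1; then
    for voice in Kate Daniel Fred Alex Samantha Bruce Zarvox; do
        say -v "$voice" "$respelled" || echo "voice $voice not available" >&2
    done
fi
```

This does not decide which pronunciation a sentence needs; you still choose lyyves or livvs for each occurrence.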

Hope it helps.