How can I calculate the number of syllables in a word using only its IPA (International Phonetic Alphabet) spelling?

Here's a spreadsheet with English words, their IPA spellings and syllable counts: https://docs.google.com/spreadsheets/d/1EfFhhC7kcTzB8c2UhAC53txRiTLKl3R9C2AM7ee0AVM/edit#gid=104606017


After coming across this question and wanting to know the answer myself, I pulled a few different sources of information together into one Excel spreadsheet, which gave me just over 31,000 words with both their IPA pronunciations and their syllable counts. I also found some frequency data, which can be used to naively split the words into deciles according to roughly how often they are used. That means you can sort the words by both syllable count and frequency of use (which is a very rough measure of complexity).
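If you'd rather do the decile split outside a spreadsheet, here is a minimal Python sketch of the idea. The function names and the input dictionaries are my own placeholders, not anything from the spreadsheet, and I'm assuming decile 1 means "most frequent"; the spreadsheet's own formulas may number them differently.

```python
def assign_deciles(frequencies):
    """Map each word to a decile from 1 (most frequent) to 10 (least frequent).

    `frequencies` is a hypothetical dict of word -> usage count.
    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(frequencies, key=lambda w: (-frequencies[w], w))
    n = len(ranked)
    # Rank i (0-based) out of n words falls into bucket floor(10 * i / n) + 1.
    return {word: (i * 10) // n + 1 for i, word in enumerate(ranked)}


def sort_by_syllables_then_decile(syllables, deciles):
    """Return the words present in both dicts, ordered by syllable count,
    then by frequency decile, then alphabetically."""
    common = [w for w in syllables if w in deciles]
    return sorted(common, key=lambda w: (syllables[w], deciles[w], w))
```

With a full word list each decile holds about a tenth of the words; with fewer than ten words some deciles are simply left empty.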

Caveat: the pronunciation data I've pulled is UK English. It comes from a GitHub repo that holds IPA information for many languages, including a file of US English words and their pronunciations, linked here.

I haven't integrated it myself because I only need the UK data*, but you can pull the data into Excel fairly easily: the fields are split by whitespace, so Text-to-columns should separate the IPA from the words. If you're comfortable with Excel, it should be fairly simple to combine this with the other data to get a list of US English pronunciations and their syllable counts.
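For anyone who prefers a script to Text-to-columns, the same split-and-combine step could look roughly like this in Python. This is only a sketch: it assumes each line is a word, then whitespace, then the IPA, and that you already have the syllable counts as a dict keyed by lowercase word. Both function names are made up for illustration.

```python
def parse_ipa_lines(lines):
    """Build a dict of word -> IPA string from 'word<whitespace>IPA' lines.

    Only the first run of whitespace is used as the separator, so an IPA
    field that itself contains spaces (e.g. several variants) stays intact.
    Lines without two fields are skipped.
    """
    pronunciations = {}
    for line in lines:
        parts = line.strip().split(None, 1)
        if len(parts) == 2:
            word, ipa = parts
            pronunciations[word.lower()] = ipa.strip()
    return pronunciations


def join_with_syllables(pronunciations, syllable_counts):
    """Keep only words found in both sources, as word -> (IPA, syllables)."""
    return {
        word: (ipa, syllable_counts[word])
        for word, ipa in pronunciations.items()
        if word in syllable_counts
    }
```

To use it on a downloaded file, open it with `encoding="utf-8"` and pass the file object straight to `parse_ipa_lines`, since iterating over a file yields its lines.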

The rest of the sources for the data are linked in the spreadsheet itself.


* I did try to add the US word data to the Google Sheet, but Sheets complained that this would exceed the cell limit. I put this project together from UK data before I realised that you're probably from the US, and I built all the formulas around it, so it would take a bit of unpicking to switch it over fully. I might come back to it another day. I hope this is still of some use to you.