How much of the English lexis comes from each of its influences?
I was watching a video linked in this answer and it made the following claim:
[...], like most words in English, is derived from German.
That got me thinking. While I know that Germanic languages have greatly influenced English, so have the Latin and Celtic ones (and various others to a greater or lesser degree). Is it true that more than 50% of the English vocabulary is derived from Germanic roots?
More generally, can someone point me to data on this? I imagine attempts have been made to quantify the contribution of different languages to English; what were the results? What percentage of the lexis comes from each source?
Ideally I would like to see this expressed as a percentage of words, but I am aware that, at least to some linguists, attempting to quantify vocabulary is anathema (to give one simple reason, any language that allows number construction has an infinite vocabulary by definition). Alternative approaches to quantifying this are therefore also welcome.
Solution 1:
Wikipedia has a pie chart showing the origins of English words.
It gives the following breakdown:
- Latin (including words used only in scientific / medical / legal contexts) ≈ 29%
- French ≈ 29%
- Germanic ≈ 26%
- Greek ≈ 6%
- Others ≈ 10%
It cites some references that back up these numbers, but I don't have access to them.
To answer your question: it does not appear to be true that more than 50% of English words are Germanic; by these figures the share is about 26%. However, that probably depends on the context. If you exclude scientific, medical, and legal vocabulary, you will probably find a much lower incidence of Latin words. Given that English is itself a Germanic language, it is surprising that Germanic doesn't account for more of the vocabulary.
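As a quick sanity check on the figures quoted above, here is a minimal sketch that tallies the shares. The numbers are the approximate percentages from the list; grouping Latin and French together is my own assumption for illustration (French being a Romance language), not something the chart itself does.

```python
# Approximate shares (in percent) as quoted above from the Wikipedia pie chart.
shares = {
    "Latin": 29,
    "French": 29,
    "Germanic": 26,
    "Greek": 6,
    "Others": 10,
}

# The quoted shares should cover the whole vocabulary sample.
total = sum(shares.values())

# Assumption for illustration: count French together with Latin,
# since French vocabulary is largely Latin-derived.
latin_plus_french = shares["Latin"] + shares["French"]

# The claim being tested: is more than half of the vocabulary Germanic?
germanic_majority = shares["Germanic"] > 50

print(f"Total: {total}%")
print(f"Latin + French: {latin_plus_french}%")
print(f"Germanic: {shares['Germanic']}%")
print(f"Germanic majority: {germanic_majority}")
```

On these figures, Latin and French together come to more than twice the Germanic share, so the "most words" claim does not hold for this particular way of counting.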