How much of the English lexis comes from each of its influences?

I was watching a video linked in this answer and it made the following claim:

[...], like most words in English, is derived from German.

That got me thinking. While I know that Germanic languages have greatly influenced English, so have the Latin and Celtic ones (and various others to a greater or lesser degree). Is it true that more than 50% of the English vocabulary is derived from Germanic roots?

More generally, can someone point me to data on this? I imagine attempts have been made to quantify the contribution of different languages to English; what were the results? What percentage of the lexis comes from each source?

Ideally I would like to see this expressed in terms of % of words but I am aware that, at least to some linguists, attempting to quantify vocabulary is anathema (to give a simple reason, all languages that allow number construction have an infinite vocabulary by definition), so alternative approaches to quantifying this are also welcome.


Solution 1:

Wikipedia has a pie chart showing the origins of English words. It gives the breakdown as

  • Latin (including words used only in scientific / medical / legal contexts) ≈ 29%
  • French ≈ 29%
  • Germanic ≈ 26%
  • Greek ≈ 6%
  • Others ≈ 10%

The article cites references that back up these numbers, but I don't have access to them.

To answer your question: by these figures, it does not appear to be true that more than 50% of English words are Germanic. That probably depends on the context, though. If you exclude scientific, medical, and legal vocabulary, you will probably find a much lower incidence of Latin words. Given that English is itself a Germanic language, it is perhaps surprising that Germanic doesn't account for more of the vocabulary.
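As a quick sanity check on the arithmetic, the approximate figures from the breakdown above can be tallied in a few lines of Python. This is only a sketch: the dictionary and variable names are illustrative, and the numbers are the rounded percentages quoted above, not fresh data.

```python
# Approximate shares (percent of vocabulary) as quoted in the breakdown above.
shares = {
    "Latin": 29,
    "French": 29,
    "Germanic": 26,
    "Greek": 6,
    "Others": 10,
}

total = sum(shares.values())                           # should account for all words
latin_plus_french = shares["Latin"] + shares["French"]  # combined Latin and French share
germanic_majority = shares["Germanic"] > 50             # the claim being tested

print(f"Total: {total}%")
print(f"Latin + French: {latin_plus_french}%")
print(f"Germanic is more than half: {germanic_majority}")
```

On these figures, Latin and French together account for more than twice the Germanic share, so the "most words" claim does not hold for raw vocabulary counts.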