How do I find Oxford/Merriam Words by Year?

Is there a way to search for when a word became a recognized word? My guess is that words that have been around "forever" were part of early unabridged English dictionaries. For modern history though, how can we see when a word became recognized, perhaps by a dictionary?

What would be ideal is to see something like this by year.  For example, we know Oxford added bromance at some point. What year?


Solution 1:

How do you find Oxford/Merriam words by year? It's not terribly complicated but we should look at what all those things mean and imply first.

What is a word? I'm not going to try to explain that exactly except to explain what is around it. Speech came before writing, and writing is an attempt at recording a throat/mouth modulated sound stream into what we think is being said. Distinct parts of that sound stream are cognitively separable into what are generally known as 'words'.

Language, which existed long before writing, didn't pop out of thin air; from subtle variations in air passage grunts, it slowly accreted features, to become the information portal we have today. The actual process of language evolution is a difficult study. Nobody was there at the beginning to say 'Aha! Those grunts are now a language'. We can attempt that now with animals, and they show lots of language-like abilities. But it is not a sharp process where one day you don't have it and the next day you do.

Individual words, which existed long before dictionaries and writing, didn't pop out of thin air; they slowly gained currency, being modified little by little by sound changes, by semantic drift, by invading overlords and other-speaking childcare, by commonly made mistakes. 'Ward', for caretaker, is from Old English/Germanic; the Norman invaders speaking Old French had previously gotten what eventually became 'guard' from a different earlier German invasion and had already had a sound change from 'w-' to 'gu-'. The sound change happened slowly in northern Old French; the borrowing of the word 'guard' was a little faster but still not immediate.

Dictionaries, which existed long before Sam Johnson and/or English, are the product of one kind of book-making process, with lots of data gathering and structuring that needs to be cross-referenced and edited for quality and consistency. Word lists for a language have been compiled since the ancients, to help with learning or translation. Dictionaries with explanations or definitions evolved from those. With printing, they became even more elaborate, adding grammar or examples.

A handful of words are neologisms, new words that did, contrary to all I've been saying so far, pop out of thin air. Created by a journalist for humor, an engineer creating a new machine, a doctor who finds something they've never seen before. The neologizer stays up late at night and in a burst of inspiration, the word flashes across their eyes, they type it up, it's printed the next day (in the days of yore writers had to have their words 'printed' with 'ink' on 'paper').

But a dictionary is a technical work with human authors (usually in the plural because it is so labor intensive). Some dictionaries are written for school children, others are written for foreign language learners, some are even meant to be an ongoing attempt at recording the state of the language.

Even if there is in fact an exact timestamp of when a part of an utterance becomes this magical thing called a word, there is still the difficulty of others pin-pointing exactly when this may have occurred. Records aren't perfect. A printed work isn't universally known to the world. Even computer information isn't universally known to the world instantaneously. So it takes lots of readers to come up with earlier and earlier citations to get a more and more accurate date of first appearance.

When a large collection of writing (called a corpus) is digitized, it then becomes easier to apply computer methods to word analysis. But there may still be difficulties: if using optical character recognition, letters in strange older fonts may be mischaracterized (literally!) and dates for documents may be chosen inappropriately.

This is all a long explanation to say that currently, in English, the best resources for what you want are:

  • the Oxford English Dictionary - they give citations for the earliest known evidence of a word, and they update their database continuously. They've moved off a print model to a computer model (they print from their database now). For any particular word, there may be lots of evidence that shows that a particular neologism was printed in a certain publication, but often neologisms just aren't that new and someone else thought of it first, so there's always doubt. You can search the dictionary for a word and then look through its entry to find the meaning you want. Under that meaning there'll be a number of citations, starting with the earliest one found so far.

  • Google NGrams - they have a huge corpus of digitized library books. With computer techniques you can search for words, and they return a graph of the frequency of the word for each year, up to 2000. If the line is 0 up to a certain date and then shoots up afterwards, then you have good evidence (not incontrovertible) that that is when the word was invented.
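
If you end up with per-year frequencies for a word (copied off a graph, say), the "0 up to a certain date and then shoots up" test can be roughly sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `first_sustained_year`, the `run` parameter, and the sample numbers are all made up, and nothing here talks to Google or the OED. Requiring several non-zero years in a row is one way to skip a lone blip that may just be an OCR error or a misdated book.

```python
from typing import Optional


def first_sustained_year(freq_by_year: dict[int, float],
                         threshold: float = 0.0,
                         run: int = 3) -> Optional[int]:
    """Return the first year that starts `run` successive data points
    whose frequency is above `threshold`, or None if there is no such run.

    Hypothetical helper: "successive" means adjacent entries in the
    sorted data, so gaps in the years are not detected.
    """
    years = sorted(freq_by_year)
    for i, year in enumerate(years):
        window = years[i:i + run]
        if len(window) < run:
            break  # not enough data points left for a full run
        if all(freq_by_year[y] > threshold for y in window):
            return year
    return None


# Made-up frequencies for illustration only (not real NGram data).
sample = {
    1990: 0.0, 1991: 0.0,
    1992: 1e-9,            # isolated blip, possibly a misdated scan
    1993: 0.0, 1994: 0.0,
    1995: 2e-9, 1996: 5e-9, 1997: 9e-9, 1998: 1.2e-8,
}

print(first_sustained_year(sample))         # → 1995
print(first_sustained_year(sample, run=1))  # → 1992 (the blip counts)
```

Whatever year comes out is still only evidence of when the word started showing up in that particular corpus, so it is worth checking against the earliest dictionary citation.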