Why and when was the trilled R in Middle English replaced by the modern untrilled one?
As a commentator above has already mentioned, Ben Jonson (1623) described the R thus:
"It is the dogs letter, and hurreth in the sound; the tongue striking the inner palate, with a trembling about the teeth. It is sounded firm in the beginning of the words, and more liquid in the middle and ends; as in rarer, riper."
Jonson's description is not consistent with a General American or West Country type of pronunciation. The reference to the tongue "striking" and "trembling" in particular implies that a trill was heard in initial position, and that some other less "firm" sound came in other (postvocalic) positions.
Jonson's description on its own is vague, but would be completely plausible as a description of /r/ by a speaker of Modern Standard Greek, Tehran Persian, Standard Turkish or Antwerp Dutch. In Persian, for example, a trilled [r] occurs initially, a tapped [ɾ] intervocalically, and an approximant [ɹ] (usually with some partial frication) often occurs in coda before /t, d, s, z, ʃ, l, ʒ/, where it is in free variation with [ɾ]. The approximant [ɹ] can also occur idiolectally in other positions.
Jonson's description, moreover, is not the only evidence available for the historical realization of English /r/. Foreign or foreign-aimed descriptions of the sounds of English throughout the seventeenth and eighteenth centuries support the existence, one way or another, of a trill or tap of some kind.
Frenchmen, and Englishmen writing for French speakers, are quite illuminating on this point precisely because of how easy the R is for them to describe. Claude Mauger (1698) tells us that the English R "ne differre point de l'r François." Mather Flint (1740) too tells us that R is generally pronounced "comme en François" but that in coda position it is weaker and "presque muet."
This equation of the English and French R may sound strange to a modern reader. Modern English and French speakers are notoriously bad at pronouncing each other's R sounds. But they had no such difficulty until recently.
Articulatory descriptions of the Early Modern English R tend to be rather vague, perhaps because the precise way in which you pronounced your Rs was not seen as a salient social variable of any kind (though failure to pronounce your Rs at all was for quite some time stigmatized as vulgar). Descriptions of the Early Modern French R, on the other hand, are a bit clearer. The uvular "guttural" R now associated with French first appears on the scene as a substandard and highly stigmatized pronunciation in the 18th century and only becomes acceptable in good company after the Revolution.
Before the 19th century, the prestige pronunciation of French R was canonically a trill or intervocalic tap, readily described as such by Italians, Spaniards and Germans and quite comically but unambiguously described by the rhetorician in Molière's "Bourgeois Gentilhomme." There is some evidence of weakening and frication intervocalically, particularly in lower Parisian sociolects during the Middle Ages where it seems to have merged with /z/ for some speakers. There was also a tendency toward loss in the early 16th century (though this was restored early on, and never made it into the literary register completely) in coda position where /r/ had the effect of lowering a preceding /ɛ/.
Now it is all but certain that the English R ca. 1600 was subject to variation. It is just as certain that that variation was in many ways quite unlike that observable in Modern English. For one, a uvular /ʀ/ realization (now basically dead) was still alive and well in the north in places like Cumbria, and probably also in Tyneside and other places that now have the NORTH/NURSE merger.
R-ishness is a famously (and, for phoneticians, frustratingly) elastic category. All kinds of things may be perceived or treated as R-like in different languages, and they probably have something in common, but it is not clear what. And variation is not necessarily geographic or sociolectal. A phoneme with multiple possible variant realizations may, beyond allophonic or positional realizations, vary quite freely in the speech of a single speaker depending on a range of factors including everything from rate of speech to emotional state. The positional effects and the non-phonological factors can overlap in quite complex and sometimes just plain weird ways. Koen Sebregts's "Sociophonetics and Phonology of Dutch R" offers a detailed helping of this and more. (Nearly every sound that has been considered rhotic in cross-linguistic survey studies also shows up as a variant of Dutch /r/ in some context, somewhere.)
Taken together, the actual evidence we have for prestige London English R before the end of the 18th century supports more than anything else the (quite prominent) existence of a trill or tap, with weakened variant(s?) in intervocalic and especially in coda position, where it had various effects on the preceding vowel (schwa insertion before final /r/ is attested, in some words at least, from the mid 16th century in John Hart's monumental attempt at a reformed spelling). This would indeed make the modern American R of a word like "start" at the very least plausible in coda position, but a partially fricated alveolar tap without full oral closure could do the same thing. Given the wide distribution of approximant alveolar, retroflex or "bunched" R realizations in Modern Englishes around the world, and given the fact that most of these are descendants of exported 17th and 18th century Southern English varieties, it would be surprising if an alveolar approximant wasn't in the mix somewhere as a variant in Renaissance Southern English. But these global Englishes were not exported from London proper and, moreover, they continued to be influenced, to varying degrees at various times, by the prestige London standard through most of the 19th century, when such a trill would have been in retreat. (Trilled or tapped Rs did survive in upper class American and British speech well into the 19th century, long enough that older speakers were still around by the time audio recording technology became practical.)
In the English described to us by people like Ben Jonson, Claude Mauger, Mather Flint and even Benjamin Franklin, we have every indication that a word like ROUND had a quite different kind of R than that normally heard in the word's General American or Southern English pronunciation.
If you want to hear my reconstruction of how Benjamin Franklin may have sounded, click here
Again, wild speculation here, but my hunch is that approximant, trilled and flapped realizations of /r/ have coexisted for a longer period than the interval between Middle English and the present.
Recall that the /z/ phoneme of Proto-Germanic was rhoticised in North and West Germanic (contrast Gothic 'batiza' with English 'better', PGmc '*hauzijaną' with English 'hear'). The (to me) most plausible pathway includes an alveolar or retroflex approximant stage. The Wiki article on Proto-Norse suggests that Old Swedish maintained the distinction in runes for most of the runic period.
To my ear, most of the Scandiwegian languages still preserve vestiges of the approximant variant, at least allophonically. Many varieties of Dutch do as well, maybe for the same reason: in early NGmc and WGmc these sounds existed separately, but were eventually constrained to inhabit different environments. They could just as easily have been polarised along dialectal lines, and this is what I suspect happened in the Ingvaeonic or Anglo-Frisian period.
So methinks both approximant and trill/flap variants have been used by different groups of speakers, possibly since before the Anglo-Saxons arrived in Britain, but well before the Norman conquest in any case.
I admit I'm largely going on my undergrad comparative linguistic knowledge from 15 years ago. I'm also not considering what role contact with Celtic speakers might have played, so I'm quite cool about being proved wrong. :)
Trilled R is still common in Scotland and parts of Ireland. Many people in England, mostly the old and educated ones, those who speak neither Received Pronunciation nor vulgar accents, often pronounce a trilled R. Trilled R can be clearly heard in theatre, mostly classic theatre, as well as in opera and in the first "motion pictures" with sound. Here is a long speech by Charles Chaplin, who pronounces a tense, trilled R whenever it occurs before a vowel: https://www.youtube.com/watch?v=GU_rn1xzItk And here is a slow aria from Dido and Aeneas by Purcell: https://www.youtube.com/watch?v=bf92jTgicGg