Why are regular expressions called "regular" expressions?

Why are regular expressions called regular expressions?


Solution 1:

They are based on regular languages.

Solution 2:

Why are they called "regular expressions?"

Regular expressions trace back to the work of an American mathematician by the name of Stephen Kleene (one of the most influential figures in the development of theoretical computer science) who developed regular expressions as a notation for describing what he called "the algebra of regular sets." His work eventually found its way into some early efforts with computational search algorithms, and from there to some of the earliest text-manipulation tools on the Unix platform (including ed and grep). In the context of computer searches, the "*" is formally known as a "Kleene star."

From here.

Solution 3:

What Kleene meant by “regular events” was an event processed by a set of nerve cells–an event of perception or of thought. Kleene’s paper says nothing about computers, programming, matching patterns in text or searching for text on a computer the paper was not even composed on or near a computer, as the typescript would indicate.

As you may read in an excellent history of Regular Expressions, in the fine Christopher M. Kelty's book [Logical Instruments: Regular Expressions, AI and thinking about thinking] (2011)1

Regular Expressions originate in neurology and neurobiology in the work of McCulloch in 1930s. Later in 1940s, what McCulloch and Pitts achieved was far more influential in engineering, computer science and mathematics than it was in biology or neuroscience. Works that take McCulloch and Pitts logical calculus of nerve nets as a starting point have been extraordinarily bountiful in mathematics and computer science. Formalization entirely, beginning at least with McCulloch and Pitts themselves, whose 1947 paper “How we know universals” and the 1959 paper they wrote with Lettvin and Maturana, “What the Frogs Eye Tells the Frog’s brain” [Lettvin et al., 1959, Pitts and McCulloch, 1947] both abandon the strict formal equivalence with propositional calculi or the Turing machine, in favor of more complex biological models which are less amenable to logical manipulation.

McCulloch’s interest was initially in finding what he hypothesized as a “psychon”—or atomic unit of neural activity, which he first sought in his physiological research conducted during the 1930s in partnership with Yale physiologist J.G. Dusser de Barenne. In the early 1940s, McCulloch was introduced to Walter Pitts by Jerome Lettvin, and thereby to Nicholas Rashevsky’s Mathematical Biology group at the University of Chicago, where Walter Pitts had been actively working on models of neural activity with Rashevsky and mathematician Alston Householder.

The collaboration between the two was lopsided, at best. McCulloch was in his forties, Pitts was 17; McCulloch had spent his career in physiology and philosophy, Pitts was by various and sometimes unreliable accounts a mathematical prodigy who had run away from his home in Detroit and met Bertrand Russell in a park in Chicago [Smalheiser, 2000, Schlatter and Aizawa, 2008]. Together, however, they managed to piece together something that met in the middle, a paper that demonstrated the formal equivalence between a plausible model of neural activity, and a logical calculus.

Part of McCulloch and Pitts inspiration for their paper was Turing’s machine. As Tara Abraham puts it “Turing was able to define the complicated process of computation in ’mechanical’ terms, with the notion of a simple algorithm so exhaustive, rigorous and unambiguous that the executor would need no ’mathematical knowledge’ to carry out its task.” [Abraham, 2003, 18] This identification of computation with an automatic procedure provided the inspiration for McCulloch and Pitts to model a set of nerves as something that could also calculate “in the absence of mathematical knowledge.”

In hindsight, what McCulloch and Pitts achieved was far more influential in engineering, computer science and mathematics than it was in biology or neuroscience.

Kleene, Stephen C. (1956), "Representation of events in nerve nets and finite automata"

famous 1959 paper by J. Y. Lettvin, H. R. Maturana, W. S. McCulloch and W. H. Pitts, What the Frog's Eye Tells the Frog's Brain

In 1968, Ken Thompson published a short “Programming Techniques” paper for the CACM in which he described the “Regular Expression Search Algorithm”

Solution 4:

Because they used to in fact be regular. See http://en.wikipedia.org/wiki/Regular_language and http://en.wikipedia.org/wiki/Regular_expressions . Larry Wall advocates calling modern ones regexen because they are no longer anything like regular.