Estimating the "size" of the mathematical research literature

The other day I was telling one of my friends that mathematics, as a living science, possesses quite an extensive research literature. "How extensive, then?" she asked. Unfortunately, I didn't have enough information to give her a useful answer.

Here's a set of questions related to my friend's inquiry. Any insights will be warmly appreciated.

  1. What quantity would be a good measure of the "size" of a particular science's research literature?

  2. How many mathematical research periodicals are being published today in the whole world?

  3. I once heard it said that about 80% of all mathematical research literature was written after 1945. Is there any research to back this up?

  4. How extensive is the mathematical literature compared to that of other exact sciences, such as physics, chemistry, biology, or medicine?


Solution 1:

1. What quantity would be a good measure of the "size" of a particular science's research literature?

When considering such a broad question, it can be helpful to start with a narrower question that can be attacked more directly, and then relate the two scopes. So I would ask, what quantity is a good measure of the individual scientific contribution of a piece of literature?

The most fundamental answer, I think, is given by the Scientific Method, which has four parts (to borrow from U Rochester):

  1. Observation and description of a phenomenon or group of phenomena.
  2. Formulation of an hypothesis to explain the phenomena. In physics, the hypothesis often takes the form of a causal mechanism or a mathematical relation.
  3. Use of the hypothesis to predict the existence of other phenomena, or to predict quantitatively the results of new observations.
  4. Performance of experimental tests of the predictions by several independent experimenters and properly performed experiments.

A paper that instantiates any one of these components within a particular science can be said to meaningfully contribute to the "extensiveness" of its research literature.

Two parameters could suffice to measure the overall contribution of a scientific paper:

  1. The number of component instances.
  2. The "usefulness" of the paper.

A cheap & easy metric of usefulness could be the number of citations, which can be estimated from databases; it would follow that the usefulness of a contribution is better judged the longer it has been around, which makes sense historically.

Returning to the scope of your original question #1, a fair estimate of the "size" of a particular science's research literature could be the total sum contribution over all the published literature.

If one were to randomly select publications from a comprehensive research library and determine the sample mean contribution using the metric given above, statistical techniques could give a confidence interval for the population mean contribution (i.e. for the whole population of literature in that scientific field). Then one could naïvely multiply the lower confidence bound by the total number of publications. Someone more familiar with statistics could derive a more descriptive result based on the same metric, I'm sure.
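The sampling idea above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: each sampled paper is assumed to already have a single numeric contribution score (however one chooses to combine the two parameters), the interval uses a normal approximation with z = 1.96, and the function name, the sample scores, and the population size are all made up for the example.

```python
import math
import statistics


def estimate_total_contribution(sample_scores, population_size, z=1.96):
    """Return (point_estimate, lower_bound) for the summed contribution.

    sample_scores   -- contribution scores of randomly sampled papers
    population_size -- total number of publications in the field
    z               -- normal-approximation multiplier (1.96 ~ 95% interval)
    """
    n = len(sample_scores)
    if n < 2:
        raise ValueError("need at least two sampled publications")
    mean = statistics.mean(sample_scores)
    # Standard error of the sample mean.
    std_err = statistics.stdev(sample_scores) / math.sqrt(n)
    lower_mean = mean - z * std_err
    # Naive scaling: per-paper figure times the number of papers.
    return mean * population_size, lower_mean * population_size


# Hypothetical data: citation counts of ten sampled papers,
# and an invented total of 100,000 publications.
scores = [0, 3, 12, 1, 7, 0, 25, 4, 2, 9]
point, lower = estimate_total_contribution(scores, 100_000)
print(f"point estimate: {point:.0f}, lower bound: {lower:.0f}")
```

Citation counts are usually heavily skewed, so with a small sample the normal approximation is crude; a bootstrap interval or a much larger sample would probably be more trustworthy.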

This approach could be used to compare the "size" of the research literature of various scientific disciplines. The results could be interesting.

2. How many mathematical research periodicals are being published today in the whole world?

The American Mathematical Society keeps an updated database of mathematical serials being published around the world. Though I couldn't find an explicit total number on their website, you can count the number of entries on the citation abbreviation reference sheet here, which lists journals from Singapore to Slovenia and from Moscow to Vietnam. According to Wikipedia, it is an "essentially complete list of mathematical journals."

Counting the entries on a single page and multiplying by the number of pages yields an extrapolated total of around 2,500 as of April 2015.

3. I once heard it said that about 80% of all mathematical research literature was written after 1945. Is there any research to back this up?

Larsen and von Ins (2010) studied the growth rate of scientific literature (in general) from 1907 to 2007. The paper is freely available to read, but lacking the time to peruse all of it, I just grabbed a loosely relevant statistic from the first paragraph and did a back-of-the-envelope calculation with a surprising result.

"In 1961 Derek J. de Solla Price published the first quantitative data about the growth of science, covering the period from about 1650 to 1950. The first data used were the numbers of scientific journals. The data indicated a growth rate of about 5.6% per year ..."

Suppose the number of scientific journals in 1945 was $S_{1945}$. Using the 5.6% growth rate (applied linearly, without compounding), we have $S_{2015} = S_{1945} \cdot 0.056 \cdot (2015 - 1945) = 3.92\,S_{1945}$.

Then we look at the ratio $\frac{S_{2015}}{S_{1945} + S_{2015}} = \frac{3.92\,S_{1945}}{4.92\,S_{1945}} = \frac{3.92}{4.92} \approx 79.7\%$, only about 1/3 of a percentage point off 80%.

This merely suggests that your figure is consistent with the average growth of general scientific literature over 3 centuries. Edit: This is probably pure coincidence, as the growth rate of periodicals is a 2nd derivative. My apologies.
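To see how sensitive the result is to the growth model, here is a small Python sketch that reproduces the ratio computed above and sets it beside a compounded version. The compounded version rests on an assumption that is not from the paper: that the cumulative total itself grows by 5.6% every year. The function names are invented for the illustration.

```python
def linear_ratio(rate, years):
    """Ratio from the calculation above: growth taken as rate * years."""
    growth = rate * years          # S_2015 / S_1945 under the linear reading
    return growth / (1 + growth)   # S_2015 / (S_1945 + S_2015)


def compound_fraction_after(rate, years):
    """Fraction of a cumulative total added during the last `years` years,
    assuming the total grows by `rate` each year (an assumption)."""
    factor = (1 + rate) ** years   # total_now / total_then
    return 1 - 1 / factor


RATE = 0.056          # Price's figure as quoted above
YEARS = 2015 - 1945   # 70 years

print(f"linear reading:   {linear_ratio(RATE, YEARS):.1%}")
print(f"compound reading: {compound_fraction_after(RATE, YEARS):.1%}")
```

Under the compounding assumption the share comes out near 98% rather than 80%, which supports the suspicion that the close match is a coincidence of the linear shortcut.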

The paper looks fairly extensive at a glance, so you might find something supportive from a thorough perusal (i.e. beyond the first paragraph), but it's not specific to mathematics so YMMV. Tunneling through the bibliography might be helpful.

4. How extensive is the mathematical literature compared to that of other exact sciences, such as physics, chemistry, biology, or medicine?

That's a really great question; I'd be surprised if there aren't any studies on this. My approach to question #1 could answer this but is pretty tedious.