A known closed form for Borchardt mean (generalization of AGM) - why doesn't it work?
Solution 1:
Ok, my answer got a little bit out of hand, but I hope it helps to figure out what the problem is. In my opinion the problems are mainly due to typos that are carried over. My answer is based on two articles. One is the original article of Borchardt, where the link was provided in Ewan's answer.
[1] Borchardt, K. W. (1876): Hr. Borchardt las über das arithmetisch-geometrische Mittel aus vier Elementen. Monatsberichte der Königlich Preußischen Akademie der Wissenschaften zu Berlin (Berlin: Verl. d. Kgl. Akad. d. Wiss., 1877), Berlin. (http://bibliothek.bbaw.de/bibliothek-digital/digitalequellen/schriften/anzeige/index_html?band=09-mon/1876&seite:int=646)
[2] J. M. Borwein and P. B. Borwein. On the Mean Iteration $(a,b) \leftarrow \left(\frac{a+3b}{4},\frac{\sqrt{ab}+b}{2}\right)$. Mathematics of Computation, 53(187):311–326, 1989. (https://pdfs.semanticscholar.org/5ead/200192d495859702eb47bc54babb20ab75ae.pdf)
I didn't double check the mathematics (i.e. the reasoning and proofs) in [1] and [2], but I compared the relevant formulas and implemented them numerically. This finally led to a solution that I think is numerically satisfying (but see below).
Basically, [2] uses the same definitions as in the question, so I'll start from [1] and highlight the differences.
The first important difference are the conditions on $a,b,c,e$. In [1] (on top of page 612) the real numbers $a,b,c,e$ have to be such that (i) $a > b > c > e > 0$ and (ii) $a - b - c + e > 0$. I will call this condition $(C)$. On the other hand in [2] page 324 we have $a > b > c > e > 0$ such that $ae - bc > 0$, and I'll call this condition $(C^*)$.
The conditions $(C)$ and $(C^*)$ are clearly there to prevent problems with negative square roots etc., which should already solve some of the problems. However, it seems (numerically) that $(C^*)$ is the one that should finally be used (see below).
From the original article [1], page 612, I take $a_0 = a, b_0 = b, c_0 = c, e_0 = e$ and \begin{align} a_{n+1} &= (a_n+b_n+c_n+e_n)/4,\\ b_{n+1} &= (\sqrt{a_nb_n}+\sqrt{c_ne_n})/2,\\ c_{n+1} &= (\sqrt{a_nc_n}+\sqrt{b_ne_n})/2,\\ e_{n+1} &= (\sqrt{a_ne_n}+\sqrt{b_nc_n})/2, \end{align} and I set $g = \lim_{n \to \infty} a_n$ (the limits are all the same, so I pick wlog the one for $a_n$).
From [1], page 617, I also take \begin{align} A &= a+b+c+e,\\ B &= a+b-c-e,\\ C &= a-b+c-e,\\ E &= a-b-c+e, \end{align} and \begin{align} B_1 &= (\sqrt{AB}+\sqrt{CE})/2,\\ B_2 &= (\sqrt{AB}-\sqrt{CE})/2,\\ C_1 &= (\sqrt{AC}+\sqrt{BE})/2,\\ C_2 &= (\sqrt{AC}-\sqrt{BE})/2,\\ E_1 &= (\sqrt{AE}+\sqrt{BC})/2,\\ E_2 &= (\sqrt{AE}-\sqrt{BC})/2, \end{align} where it is important to notice that [1] defines $E_1 = (\sqrt{AE}-\sqrt{BC})/2 = E_2$ which I believe is a typo and hence I switch the $-$ for a $+$. So far we are in agreement with the definitions in the question and in [2], but considering the values of $\alpha_i$, $i=0,1,2,3$ there are differences. From [1], page 618, I take \begin{align} \Delta = (abce B_1 C_1 E_1 B_2 C_2 E_2)^{1/4}, \end{align} which is different from the question or the definition in [2], see page 325, in that I use the starting values $a,b,c,e$ and not the derived values $A,B,C,E$ (as stated in [1]). The values of $\alpha_i$ also differ between [1] and [2]. From [1], page 618, I take \begin{align} \alpha_0 &= \frac{acB_1}{\Delta},\\ \alpha_1 &= \frac{c C_1 E_1}{\Delta},\\ \alpha_2 &= \frac{a C_2 E_1}{\Delta},\\ \alpha_3 &= \frac{B_1 C_1 C_2}{\Delta}, \end{align} where again the starting values are used, and not $A,B,C,E$ as in the question or in [3], page 325.
All sources however agree on the definition of $R(x)$ as \begin{align} R(x) = x (x-\alpha_0)(x-\alpha_1)(x-\alpha_2)(x-\alpha_3). \end{align}
Now [1], page 618, states that \begin{align} 4 \frac{\pi^2}{g} = \int_{0}^{\alpha_3}\int_{\alpha_2}^{\alpha_1} \frac{x-y}{\sqrt{R(x)R(y)}} dx dy, \end{align} where it is important to notice the factor $4$ that is missing in [2] and in the question.
My numerical experiments show that neither [1] nor [2] is correct. In fact I think the following statement should hold:
Theorem: For $a > b > c > e > 0$ such that $ae - bc > 0$ we have \begin{align} \frac{\pi^2}{g} = \int_{0}^{\alpha_3}\int_{\alpha_2}^{\alpha_1} \frac{x-y}{\sqrt{R(x)R(y)}} dx dy, \end{align} where $g = \lim_{n \to \infty} a_n$ and $\alpha_i$, $i=0,1,2,3$ is defined as in [1], i.e., as above.
It is important to notice that the statement
- uses condition $(C^*)$ from [2], page 324,
- uses the definition from [1] regarding $\Delta$ and $\alpha_i$, $i=0,1,2,3$,
- and does not include the factor $4$ which is present in [1] but not in [2].
In summary, it's a wild mix of [1] and [2]...
I got to this basically following [1] and then using $(C^*)$ instead of $(C)$ and realizing that the value was $4$ times too high. You can double check this with your favorite numerical software tool, I have added my R implementation below. Finally, I have to say that the values don't match perfectly. The numerical integration of the rational function seems to be difficult, but the numbers are always (for random starting values obeying $(C^*)$) close (enough) together in my opinion. As long as I enforce $(C^*)$ I also don't experience any trouble with negative square roots or the like (but this is possible if $(C)$ is used).
R Implementation
#non-random starting values that work with Borwein & Borwein page 324 Theorem 3 condition
a <- 7
b <- 4
c <- 3
e <- 2
#get some random starting values that are OK
goodStart <- FALSE
while (goodStart==FALSE){
X <- sort(rexp(4,rate=20))
#Borwein & Borwein page 324 Theorem 3 condition
if ( X[4]*X[1] - X[2]*X[3] > 0){
a <- X[4]
b <- X[3]
c <- X[2]
e <- X[1]
goodStart = TRUE
}
}
#Borwein & Borwein page 324 Theorem 3 condition
if ( a*e - b*c <= 0){stop("Borwein & Borwein coefficient condition not ok")}
#Borchardt condition - seems to be wrong
#if ( a - b - c + e <= 0){stop("coefficient condition not ok")}
a_i <- a
b_i <- b
c_i <- c
e_i <- e
for (i in 1:1000) {
tmp_a <- (a_i+b_i+c_i+e_i)/4
tmp_b <- (sqrt(a_i*b_i) + sqrt(c_i*e_i))/2
tmp_c <- (sqrt(a_i*c_i) + sqrt(b_i*e_i))/2
tmp_e <- (sqrt(a_i*e_i) + sqrt(b_i*c_i))/2
a_i <- tmp_a
b_i <- tmp_b
c_i <- tmp_c
e_i <- tmp_e
}
#print(max(a_i,b_i,c_i,e_i)-min(a_i,b_i,c_i,e_i))
#just pick any of the four
g <- a_i
#compute close form solution
I1 <- pi^2/g
print("sequence")
print(I1)
#page 617 bottom
A <- a + b + c + e
B <- a + b - c - e
C <- a - b + c - e
E <- a - b - c + e
B1 <- (sqrt(A*B)+sqrt(C*E))/2
B2 <- (sqrt(A*B)-sqrt(C*E))/2
C1 <- (sqrt(A*C)+sqrt(B*E))/2
C2 <- (sqrt(A*C)-sqrt(B*E))/2
E1 <- (sqrt(A*E)+sqrt(B*C))/2 #this seems to be wrong in the original
E2 <- (sqrt(A*E)-sqrt(B*C))/2
Delta <- (a*b*c*e*B1*C1*E1*B2*C2*E2)^(1/4)
alpha0 <- a*c*B1/Delta
alpha1 <- c*C1*E1/Delta
alpha2 <- a*C2*E1/Delta
alpha3 <- B1*C1*C2/Delta
integrand <- function(x,y,param){
alpha0 <- param$alpha0
alpha1 <- param$alpha1
alpha2 <- param$alpha2
alpha3 <- param$alpha3
Rx <- x*(x-alpha0)*(x-alpha1)*(x-alpha2)*(x-alpha3)
Ry <- y*(y-alpha0)*(y-alpha1)*(y-alpha2)*(y-alpha3)
z <- (x-y) / sqrt(Rx*Ry)
return(z)
}
p <- list(alpha0=alpha0,alpha1=alpha1,alpha2=alpha2,alpha3=alpha3)
library(statmod)
gauss <- gauss.quad(500,kind="legendre")
w <- gauss$weights
x <- gauss$nodes
upper_x <- alpha1
lower_x <- alpha2
upper_y <- alpha3
lower_y <- 0
I <- 0
for (i in 1:length(x)) {
for (j in 1:length(x)) {
xx <- (upper_x-lower_x)*x[i]/2 + (upper_x+lower_x)/2
yy <- (upper_y-lower_y)*x[j]/2 + (upper_y+lower_y)/2
ww <- w[i]*w[j]
I <- I + ww*integrand(xx,yy,p)
}
}
I <- (upper_x-lower_x)*(upper_y-lower_y)*I/4
print("numerical integral:")
print(I)
Solution 2:
Here are just a few remarks.
Borchart's original article (referred to as Berl. Monatsber. , pages 611–621) is fully available (in German) at http://bibliothek.bbaw.de/bibliothek-digital/digitalequellen/schriften/anzeige/index_html?band=09-mon/1876&seite:int=647.
However this article itself uses work already done in a longer older article (where most of the hard compatations are done), (referred to as (Mem. Acad. Berlin, 1878, pp. 33-96), and also fully available online at http://bibliothek.bbaw.de/bbaw/bibliothek-digital/digitalequellen/schriften/anzeige/index_html?band=07-abh/1878&seite:int=93.
Your square-root of negative number problem can be solved very simply by starting from $1$ (or any nonzero index) instead of $0$. Because of Borchardt's identity $$ a_{n+1}+\varepsilon b_{n+1}+ \varepsilon' c_{n+1}+\varepsilon \varepsilon'd_{n+1}=\frac{(\sqrt{a_n}+\varepsilon \sqrt{b_n}+ \varepsilon'\sqrt{c_n}+\varepsilon \varepsilon'\sqrt{d_n})^2}{4} \tag{1} $$
for any two signs $\varepsilon,\varepsilon'$ we have for $n\geq 1$, $$a_{n+1}+\varepsilon b_{n+1}+ \varepsilon ' c_{n+1}+\varepsilon \varepsilon 'd_{n+1} \geq 0 \tag{2}$$
So if you use any $a_1,b_1,c_1,d_1$ instead of $a_0,b_0,c_0,d_0$ to define the $B,C,\alpha$ constants, you will get only real and positive numbers.