scipy is not optimizing and returns "Desired error not necessarily achieved due to precision loss"

I have the following code which attempts to minimize a log likelihood function.

#!/usr/bin/python
import math
import random
import numpy as np
from scipy.optimize import minimize

def loglikelihood(params, data):
    (mu, alpha, beta) = params
    tlist = np.array(data)
    r = np.zeros(len(tlist))
    for i in range(1, len(tlist)):
        r[i] = math.exp(-beta*(tlist[i]-tlist[i-1]))*(1+r[i-1])
    loglik  = -tlist[-1]*mu
    loglik = loglik+alpha/beta*sum(np.exp(-beta*(tlist[-1]-tlist))-1)
    loglik = loglik+np.sum(np.log(mu+alpha*r))
    return -loglik

atimes = [ 148.98894201,  149.70253172,  151.13717804,  160.35968355,
        160.98322609,  161.21331798,  163.60755544,  163.68994973,
        164.26131871,  228.79436067]
a= 0.01
alpha = 0.5
beta = 0.6
print(loglikelihood((a, alpha, beta), atimes))

res = minimize(loglikelihood, (0.01, 0.1, 0.1), method='BFGS', args=(atimes,))
print(res)

It gives me

28.3136498357
./test.py:17: RuntimeWarning: invalid value encountered in log
  loglik = loglik+np.sum(np.log(mu+alpha*r))
   status: 2
  success: False
     njev: 14
     nfev: 72
 hess_inv: array([[1, 0, 0],
       [0, 1, 0],
       [0, 0, 1]])
      fun: 32.131359359964378
        x: array([ 0.01,  0.1 ,  0.1 ])
  message: 'Desired error not necessarily achieved due to precision loss.'
      jac: array([ -2.8051672 ,  13.06962156, -48.97879982])

Notice that the optimizer hasn't moved the parameters away from the initial guess at all, and the reported minimum of about 32.13 is larger than the 28.31 you get with a = 0.01, alpha = 0.5, beta = 0.6. It's possible this problem could be avoided by choosing better initial guesses, but if so, how can I choose them automatically?

Solution 1:

I copied your example and experimented a little. It looks like, if you stick with the BFGS solver, after a few iterations mu + alpha * r contains some negative numbers, and that is how you get the RuntimeWarning: np.log of a negative number is nan.
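To check this outside the optimizer, the quantity passed to np.log can be evaluated directly. The sketch below does that for a single trial point. The helper name log_argument and the trial point with a negative alpha are my own choices for illustration, not values taken from the BFGS run.

```python
import numpy as np

atimes = [148.98894201, 149.70253172, 151.13717804, 160.35968355,
          160.98322609, 161.21331798, 163.60755544, 163.68994973,
          164.26131871, 228.79436067]

def log_argument(params, data):
    """Return mu + alpha * r, the array the question's code passes to np.log."""
    mu, alpha, beta = params
    tlist = np.asarray(data, dtype=float)
    r = np.zeros(len(tlist))
    for i in range(1, len(tlist)):
        r[i] = np.exp(-beta * (tlist[i] - tlist[i - 1])) * (1 + r[i - 1])
    return mu + alpha * r

# Hypothetical trial point with a negative alpha, chosen only for illustration.
trial = (0.01, -0.5, 0.1)
arg = log_argument(trial, atimes)
print("smallest log argument:", arg.min())

# Silence the warning here so that only the resulting nan is shown.
with np.errstate(invalid='ignore'):
    print("log-likelihood term is nan:", np.isnan(np.sum(np.log(arg))))
```

Once one entry of that array is negative, the whole objective becomes nan and the line search has nothing useful to work with.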

The easiest fix I can think of is to switch to the Nelder-Mead solver, which does not use gradients.

res = minimize(loglikelihood, (0.01, 0.1,0.1), method = 'Nelder-Mead',args = (atimes,))

And it will give you this result:

28.3136498357
  status: 0
    nfev: 159
 success: True
     fun: 27.982451280648817
       x: array([ 0.01410906,  0.68346023,  0.90837568])
 message: 'Optimization terminated successfully.'
     nit: 92

Solution 2:

Another solution (that worked for me) is to scale your function (and gradients) to values closer to 0. For example, my problem came up when I had to evaluate a log-likelihood of 60k points. This meant that my log-likelihood was a very large number. Conceptually, the log-likelihood was a very, very spiky function.

The gradients started off large (to climb this spiky mountain) and then became moderately small, but never dropped below the default gtol parameter of the BFGS routine (the threshold the gradient must fall below for termination). By that point I had essentially arrived at the correct values (I was using generated data, so I knew the true values).

What was happening was that my gradients were approximately 60k * (average individual gradient value), and even if the average individual gradient value was small, say on the order of 1e-8, 60k * 1e-8 = 6e-4, which is still larger than gtol (1e-5 by default for BFGS). So I was never satisfying the threshold even though I had arrived at the solution.

Conceptually, because of this very spiky mountain, the algorithm was making small steps but kept stepping over the true minimum, so the average individual gradient never got far below 1e-8 and my total gradient never went under gtol.

Two solutions:

1) Scale your log-likelihood and gradients by a factor, like 1/n where n is the number of samples.

2) Scale your gtol: for example "gtol": 1e-7 * n
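
Both options can be sketched on a toy problem. The example below is an illustration under my own assumptions, not the model from the question: the data are synthetic Gaussian samples, and the function names neg_loglik and neg_loglik_scaled are made up for this sketch. The gtol entry in options is the BFGS option discussed above.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic data (assumption): 60k draws from a normal distribution.
rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=1.5, size=60000)
n = len(samples)

def neg_loglik(theta, x):
    """Summed Gaussian negative log-likelihood (constants dropped).

    theta = (mean, log_sigma); using log_sigma keeps sigma positive.
    """
    mean, log_sigma = theta
    sigma = np.exp(log_sigma)
    return np.sum(0.5 * ((x - mean) / sigma) ** 2 + log_sigma)

def neg_loglik_scaled(theta, x):
    """Same objective divided by n, so values and gradients are O(1)."""
    return neg_loglik(theta, x) / len(x)

x0 = np.array([0.0, 0.0])

# Option 1: minimize the per-sample objective with the default gtol.
res_scaled = minimize(neg_loglik_scaled, x0, args=(samples,), method='BFGS')

# Option 2: keep the summed objective but loosen gtol in proportion to n.
res_gtol = minimize(neg_loglik, x0, args=(samples,), method='BFGS',
                    options={'gtol': 1e-7 * n})

print("scaled objective:", res_scaled.x, res_scaled.message)
print("scaled gtol:     ", res_gtol.x, res_gtol.message)
print("sample mean, log(sample std):", samples.mean(), np.log(samples.std()))
```

Dividing by n does not move the minimizer. It only changes the scale on which gtol is measured.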

Solution 3:

Facing the same warning, I solved it by rewriting the log-likelihood function to take log(params) and log(data) as arguments, instead of params and data.

This way I avoid calling np.log() inside the likelihood function or its Jacobian wherever possible.
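
A related sketch for the model in the question, under my own assumptions: it optimizes over the logarithms of the three parameters, so that mu, alpha and beta stay positive and the argument of np.log cannot become negative. It is not necessarily the exact rewrite described above, since it does not take log(data) and still calls np.log once. The function name neg_loglik_logparams is made up for this example.

```python
import numpy as np
from scipy.optimize import minimize

atimes = [148.98894201, 149.70253172, 151.13717804, 160.35968355,
          160.98322609, 161.21331798, 163.60755544, 163.68994973,
          164.26131871, 228.79436067]

def neg_loglik_logparams(theta, data):
    """Negative log-likelihood with theta = log((mu, alpha, beta))."""
    # exp() maps any finite theta to positive parameters
    # (barring floating-point underflow for extremely negative theta).
    mu, alpha, beta = np.exp(theta)
    tlist = np.asarray(data, dtype=float)
    r = np.zeros(len(tlist))
    for i in range(1, len(tlist)):
        r[i] = np.exp(-beta * (tlist[i] - tlist[i - 1])) * (1 + r[i - 1])
    loglik = -tlist[-1] * mu
    loglik += alpha / beta * np.sum(np.exp(-beta * (tlist[-1] - tlist)) - 1)
    loglik += np.sum(np.log(mu + alpha * r))  # argument is positive here
    return -loglik

# Same starting point as in the question, expressed in log space.
theta0 = np.log([0.01, 0.1, 0.1])
res = minimize(neg_loglik_logparams, theta0, args=(atimes,), method='BFGS')

# Map the result back to the original parameters.
params_hat = np.exp(res.x)
print("mu, alpha, beta:", params_hat)
print("objective:", res.fun, "|", res.message)
```

The optimizer then works on an unconstrained problem, and the positivity constraint is handled by the change of variables rather than by a penalty.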

Solution 4:

Watch out for negative arguments to log(): catch them and tell the optimizer that they are bad by adding a penalty:

#!/usr/bin/python
import math
import random
import numpy as np
from scipy.optimize import minimize

def loglikelihood(params, data):
    (mu, alpha, beta) = params
    tlist = np.array(data)
    r = np.zeros(len(tlist))
    for i in range(1, len(tlist)):
        r[i] = math.exp(-beta*(tlist[i]-tlist[i-1]))*(1+r[i-1])
    loglik = -tlist[-1]*mu
    loglik += alpha/beta*sum(np.exp(-beta*(tlist[-1]-tlist))-1)
    argument = mu + alpha * r
    limit = 1e-6
    if np.min(argument) < limit:
        # add a penalty for too small argument of log
        loglik += np.sum(np.minimum(0.0, argument - limit)) / limit
        # keep argument of log above the limit
        argument = np.maximum(argument, limit)
    loglik += np.sum(np.log(argument))
    return -loglik

atimes = [ 148.98894201,  149.70253172,  151.13717804,  160.35968355,
        160.98322609,  161.21331798,  163.60755544,  163.68994973,
        164.26131871,  228.79436067]
a= 0.01
alpha = 0.5
beta = 0.6
print(loglikelihood((a, alpha, beta), atimes))

res = minimize(loglikelihood, (0.01, 0.1, 0.1), method='BFGS', args=(atimes,))
print(res)

Solution 5:

I know I am late, but I do three optimizations in series. First I use Nelder-Mead to get close; without first getting close, I get far too many overflow errors. I then copy res.x into the starting parameters for the next optimizing routine. I have found that Powell is the most reliable and it usually does a pretty good job. BUT, I then do another minimization using Nelder-Mead again to avoid getting stuck in a local minimum.
Usually, there isn't much improvement after the Powell minimization.
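
A minimal sketch of that chain, applied to the likelihood from the question. The guard that returns a large finite value for invalid parameters is my own addition, so that the derivative-free methods never see nan; the name neg_loglik and the value 1e10 are arbitrary choices for this example.

```python
import numpy as np
from scipy.optimize import minimize

atimes = [148.98894201, 149.70253172, 151.13717804, 160.35968355,
          160.98322609, 161.21331798, 163.60755544, 163.68994973,
          164.26131871, 228.79436067]

def neg_loglik(params, data):
    mu, alpha, beta = params
    # Guard (my addition): reject invalid parameters with a large finite
    # value instead of letting np.log return nan.
    if mu <= 0 or alpha < 0 or beta <= 0:
        return 1e10
    tlist = np.asarray(data, dtype=float)
    r = np.zeros(len(tlist))
    for i in range(1, len(tlist)):
        r[i] = np.exp(-beta * (tlist[i] - tlist[i - 1])) * (1 + r[i - 1])
    loglik = -tlist[-1] * mu
    loglik += alpha / beta * np.sum(np.exp(-beta * (tlist[-1] - tlist)) - 1)
    loglik += np.sum(np.log(mu + alpha * r))
    return -loglik

x0 = np.array([0.01, 0.1, 0.1])

# Stage 1: Nelder-Mead to get close.
res1 = minimize(neg_loglik, x0, args=(atimes,), method='Nelder-Mead')
# Stage 2: Powell, started from the Nelder-Mead result.
res2 = minimize(neg_loglik, res1.x, args=(atimes,), method='Powell')
# Stage 3: Nelder-Mead again, started from the Powell result.
res3 = minimize(neg_loglik, res2.x, args=(atimes,), method='Nelder-Mead')

for name, res in [("Nelder-Mead", res1), ("Powell", res2), ("Nelder-Mead", res3)]:
    print(name, np.asarray(res.x), float(res.fun))
```

Comparing the three printed objective values shows how much each stage actually contributes on a given data set.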