Unable to run do-release-upgrade on Ubuntu 14.04

I'm currently trying to upgrade an Ubuntu 14.04 box to xenial. I'm running do-release-upgrade, and it's failing with errors like UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 382: invalid start byte

It looks like a known bug. I've tried what's suggested there and had no luck finding the offending package, and I've disabled/removed my 2 non-standard package.lst files for the nodesource and veeam repositories.

The traceback reads something like this:

Traceback (most recent call last):
  File "/tmp/ubuntu-release-upgrader-woadaq_z/xenial", line 8, in <module>
    sys.exit(main())
  File "/tmp/ubuntu-release-upgrader-woadaq_z/DistUpgrade/DistUpgradeMain.py", line 242, in main
    if app.run():
  File "/tmp/ubuntu-release-upgrader-woadaq_z/DistUpgrade/DistUpgradeController.py", line 1876, in run
    return self.fullUpgrade()
  File "/tmp/ubuntu-release-upgrader-woadaq_z/DistUpgrade/DistUpgradeController.py", line 1757, in fullUpgrade
    if not self.doPostInitialUpdate():
  File "/tmp/ubuntu-release-upgrader-woadaq_z/DistUpgrade/DistUpgradeController.py", line 943, in doPostInitialUpdate
    self.tasks = self.cache.installedTasks
  File "/tmp/ubuntu-release-upgrader-woadaq_z/DistUpgrade/DistUpgradeCache.py", line 806, in installedTasks
    for line in pkg._pcache._records.record.split("\n"):
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 382: invalid start byte
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/problem_report.py", line 416, in add_to_existing
    self.write(f)
  File "/usr/lib/python3/dist-packages/problem_report.py", line 369, in write
    block = f.read(1048576)
  File "/usr/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8b in position 1: invalid start byte

Original exception was:
Traceback (most recent call last):
  File "/tmp/ubuntu-release-upgrader-woadaq_z/xenial", line 8, in <module>
    sys.exit(main())
  File "/tmp/ubuntu-release-upgrader-woadaq_z/DistUpgrade/DistUpgradeMain.py", line 242, in main
    if app.run():
  File "/tmp/ubuntu-release-upgrader-woadaq_z/DistUpgrade/DistUpgradeController.py", line 1876, in run
    return self.fullUpgrade()
  File "/tmp/ubuntu-release-upgrader-woadaq_z/DistUpgrade/DistUpgradeController.py", line 1757, in fullUpgrade
    if not self.doPostInitialUpdate():
  File "/tmp/ubuntu-release-upgrader-woadaq_z/DistUpgrade/DistUpgradeController.py", line 943, in doPostInitialUpdate
    self.tasks = self.cache.installedTasks
  File "/tmp/ubuntu-release-upgrader-woadaq_z/DistUpgrade/DistUpgradeCache.py", line 806, in installedTasks
    for line in pkg._pcache._records.record.split("\n"):
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 382: invalid start byte
=== Command terminated with exit status 1 (Mon Apr  3 09:31:21 2017) ===

And there's nothing really helpful in the logs. How would I get do-release-upgrade to work?


What you have there is the upgrade script itself tripping over invalid data somewhere. You need to find and remove the invalid data.

In this case, it was the package veeamsnap. Removing that package should fix it. But because this is different for each case, I'll describe the steps taken to reach that conclusion. It is a fairly complicated process.

This is a fun one, because Python 3 strings are always supposed to hold valid Unicode text. What you have here (discovered after the fact) is a C module (apt_pkg) somehow getting non-UTF-8 data into a Python 3 string, which breaks every attempt at reading the string. Notice how the error handler itself threw an exception too?
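To see what "invalid start byte" means in isolation, here is a minimal sketch. The byte string is made up for illustration (the traceback does not show the real record text), and the guess that the stray byte came from a Windows-1252 en dash is only one plausible explanation, not something confirmed above.

```python
# Illustrative bytes only: a record-like string containing a lone 0x96 byte.
raw = b"Description: backup agent \x96 snapshot module"

# Strict UTF-8 decoding rejects 0x96, because it can only appear as a
# continuation byte and never as the first byte of a character.
try:
    raw.decode("utf-8")
    message = ""
except UnicodeDecodeError as exc:
    message = str(exc)
print(message)

# A lenient decode replaces the bad byte with U+FFFD instead of raising.
lenient = raw.decode("utf-8", errors="replace")
print(lenient)

# Assumption: if the text was written as Windows-1252, 0x96 is an en dash.
legacy = raw.decode("cp1252")
print(legacy)
```

The lenient decode is handy for inspecting a suspect record by hand, but it does not fix the underlying package data.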

Into the debugger we go!

The best way to diagnose issues like this is to make the debugger pause just before the line that fails. With Python, when you have a series of nested calls like this, the easiest way to add a debugger pause is to edit the file itself.

  1. Using your example, we can see that the failure in question is in the file /tmp/ubuntu-release-upgrader-woadaq_z/DistUpgrade/DistUpgradeCache.py line 806, so let's fire up a text editor and go to that line. The temp path will be different for each run, so make sure you use the one from your error output!

    screenshot of editor

  2. From here, we can first add a simple pause in the debugger, by inserting import pdb; pdb.set_trace(); on line 806 just before the error. Because this is Python, the indentation is important!

    screenshot of debugging statement

  3. Now we need to run the modified program. Don't run do-release-upgrade again; that will probably download a fresh copy and discard your edits. See the first line after "Original exception was" in the error output? The one with /tmp/ubuntu-release-upgrader-woadaq_z/xenial? That's the file you want to run. So run that file as root (or with sudo).

    Running that should get you into the debugger (pdb):

    screenshot of debugger

  4. From here, we figure out how many packages there are in total. The easy way to do that is to run sum(1 for _ in self). Wait a bit (this can take a while) and it will print a number. In this case, it was 76028.

    Now, since the error probably doesn't happen in the first few, and we don't want to manually step through >75000 packages, and we can't add an exception handler (because the error is so bad it breaks Python itself), we need an alternative.

  5. Remove the line added in step 2. Edit the code to print an incrementing number for every package. For example, add foo = 0 above the loop on line 802 and foo += 1; print(foo) on line 807 (just before the erroring line).

    screenshot of number printing code

  6. Run the code again, using the same command as in step 3. It will print a large list of numbers. Let it keep running until it prints the error again. You might need to enlarge your window:

    screenshot of number output

    That last number should be the package it crashed on. Keep note of that number.

  7. Now that you know which package/number causes the crash, it's time to add the debugger pause with a condition to only execute on that package. For example, if you crash on package 72285, add if foo == 72285: import pdb; pdb.set_trace() just after the line that prints foo:

    screenshot of new pdb pause

  8. Run the code again. Now when you get into pdb it should be on the package that causes the crash. You can type in the name of the variable pkg to print its value, which will tell you the current package's name:

    screenshot of package name

    More generally, typing in the name of any variable will print its value.

  9. Remove the offending package and try the upgrade again (from a clean do-release-upgrade).
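The two-pass technique from steps 5 to 8 (count until the crash, then pause only on that count) can be sketched outside the upgrader as follows. Everything here is a hypothetical stand-in: `records` is a made-up list of byte strings rather than the real apt cache, `scan` is not a function from DistUpgradeCache.py, and a callback takes the place of `pdb.set_trace()` so the script runs without an interactive session.

```python
def scan(records, log, pause_at=None, on_pause=None):
    """Walk the records, counting each one before the line that may fail."""
    count = 0
    for raw in records:
        count += 1
        log.append(count)  # the answer uses print(foo); a list is easier to inspect
        if count == pause_at and on_pause is not None:
            # In a real session this would be: import pdb; pdb.set_trace()
            on_pause(raw)
        raw.decode("utf-8")  # stands in for the line that raises
    return count


# Hypothetical records; the third one contains an invalid UTF-8 byte.
records = [
    b"Package: alpha\nTask: server",
    b"Package: beta\nTask: desktop",
    b"Package: gamma\nDescription: bad \x96 byte",
    b"Package: delta\nTask: server",
]

# Pass 1: run until the crash and note the last number recorded.
# The try/except only keeps this demo running; in the upgrader you would
# simply read the last number printed above the traceback.
log = []
try:
    scan(records, log)
except UnicodeDecodeError:
    pass
crash_at = log[-1]
print(crash_at)

# Pass 2: pause only on that record and look at it before it fails.
captured = []
try:
    scan(records, [], pause_at=crash_at, on_pause=captured.append)
except UnicodeDecodeError:
    pass
culprit = captured[0].split(b"\n")[0]
print(culprit)
```

The point of the second pass is that the pause fires before the failing line runs, so the offending item can still be inspected while the program is in a usable state.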