How to extract an embedded image from an SVG file?

Solution 1:

My own solution (or... workaround):

  1. Select the image in Inkscape
  2. Open the built-in XML Editor (Shift+Ctrl+X)
  3. Select the xlink:href attribute, which will contain the image as a data: URI
  4. Copy the entire data: URI
  5. Paste that data: URI into a browser, and save it from there.

Alternatively, I can open the SVG file in any text editor, locate the data: URI and copy it from there.
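
The browser step can also be replaced by a few lines of code. Here is a minimal Python 3 sketch, assuming the copied URI is base64-encoded (which is how Inkscape embeds raster images) and has no extra parameters before `;base64`. The function names `decode_data_uri` and `save_data_uri` are made up for this illustration:

```python
import base64
import re


def decode_data_uri(uri):
    """Split a base64 data: URI into (mime_type, raw_bytes)."""
    match = re.match(r"data:([\w.+/-]+);base64,(.*)", uri.strip(), re.DOTALL)
    if match is None:
        raise ValueError("not a base64 data: URI")
    mime_type, payload = match.groups()
    # The payload may be wrapped across lines; drop all whitespace first.
    payload = re.sub(r"\s+", "", payload)
    return mime_type, base64.b64decode(payload)


def save_data_uri(uri, stem):
    """Write the decoded image to '<stem>.<subtype>' and return that path."""
    mime_type, data = decode_data_uri(uri)
    path = "{}.{}".format(stem, mime_type.split("/")[1])
    with open(path, "wb") as handle:
        handle.write(data)
    return path
```

For example, calling `save_data_uri(copied_uri, "logo")` with an `image/png` URI would write `logo.png` in the current directory.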

Although this solution works, it's kinda cumbersome and I'd love to learn a better one.

Solution 2:

There's a better solution:

Go to Extensions -> Images -> Extract Image..., where you can save the selected raster image as a file. However, this extension behaves oddly and is rather slow (though the extracted image comes out fine).

Another note: this extension is cumbersome and dies silently on very large images. Also, with a large number of raster images it can spike Inkscape's memory usage to horrendous levels (like 3GB after only a handful of images extracted).

Because I've got about 20 SVG files with about 70 raster images in each, every image at least 1MB in size, I needed a different solution. After a short check using Denilson Sá's tip, I devised the following PHP script, which extracts images from SVG files:

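#!/usr/bin/env php
<?php

// Extract every base64-embedded image from each SVG file in the current directory.
$svgs = glob('*.svg');

// MD5 hashes of images already saved, shared across all files to skip duplicates.
$existing = array();

foreach ($svgs as $svg) {
    $dir = "./{$svg}.images";
    if (!is_dir($dir)) {
        mkdir($dir);
    }
    // The file is scanned line by line: preg_match() finds at most one image
    // per line, and a data: URI wrapped across lines will not match.
    $lines = file($svg);
    $img = 0;
    foreach ($lines as $line) {
        if (preg_match('%xlink:href="data:([a-z0-9-/]+);base64,([^"]+)"%i', $line, $regs)) {
            $type = $regs[1];
            $data = $regs[2];
            $md5 = md5($data);
            if (!in_array($md5, $existing)) {
                // Strip any whitespace inside the payload before decoding.
                $data = base64_decode(preg_replace('/\s+/', '', $data));
                $type = explode('/', $type);
                $save = "{$dir}/{$img}.{$type[1]}";
                file_put_contents($save, $data);
                $img++;
                $existing[] = $md5;
            }
        }
    }
}

// Number of unique images written.
echo count($existing), "\n";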

This way I can get all the images I want, and the MD5 check saves me from extracting the same image twice.
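
The deduplication idea is independent of PHP. A minimal Python 3 sketch of it follows, with two deliberate differences from the script above: it hashes the decoded bytes with SHA-256 instead of MD5, and it keeps the digests in a set so each lookup does not scan a growing list. The function name `unique_payloads` is made up for this illustration:

```python
import hashlib


def unique_payloads(payloads):
    """Yield each distinct bytes payload once, in first-seen order."""
    seen = set()
    for payload in payloads:
        digest = hashlib.sha256(payload).hexdigest()
        if digest not in seen:
            seen.add(digest)
            yield payload
```

Storing only the digests means the set stays small even when the images themselves are several megabytes each.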

I bet there must be a much simpler way, but it's up to the Inkscape devs to do it better.

Solution 3:

Finally, years later, I've written a script to correctly extract all images from an SVG file, using a proper XML library to parse the SVG code.

https://github.com/denilsonsa/small_scripts/blob/master/extract_embedded_images_from_svg.py

This script is written for Python 2.7 but should be quite easy to convert to Python 3. Even better, about 50 lines can be deleted after conversion to Python 3.4, due to the new features introduced in that version.
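
To give an idea of what the XML-based approach looks like, here is a minimal Python 3 sketch using the standard library's `xml.etree.ElementTree`. It is not the linked script, only an illustration of the same idea. The function name `extract_images` is made up, and the sketch assumes the images are base64 data: URIs on SVG `<image>` elements:

```python
import base64
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"
SVG_IMAGE = "{http://www.w3.org/2000/svg}image"


def extract_images(source):
    """Yield (mime_type, raw_bytes) for every embedded base64 image.

    `source` is a filename or a file object, as accepted by ET.parse().
    """
    root = ET.parse(source).getroot()
    for element in root.iter(SVG_IMAGE):
        # SVG 2 also allows a plain href attribute without the xlink prefix.
        href = element.get(XLINK_HREF) or element.get("href") or ""
        if not href.startswith("data:"):
            continue  # linked (not embedded) image
        header, _, payload = href.partition(",")
        if not header.endswith(";base64"):
            continue  # this sketch skips non-base64 data: URIs
        mime_type = header[len("data:"):-len(";base64")]
        # The XML parser turns line breaks inside attributes into spaces,
        # so join the chunks before decoding.
        yield mime_type, base64.b64decode("".join(payload.split()))
```

Looping over `enumerate(extract_images("drawing.svg"))` and writing each result to a numbered file would reproduce what the PHP script does. Because the parser handles attributes that span several lines, this avoids the one-URI-per-line limitation of a regex over individual lines.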