How to extract vectors from a PDF file?

Solution 1:

You can use Inkscape, a free, open-source, cross-platform vector graphics application. It lets you import PDF files and select the embedded vector objects, which you can then edit and process as you like.

Detailed documentation is available on the Inkscape website.

Note that on Linux it likely requires X11. There is also a native Windows version.

Alternatively, you may want to give Adobe Illustrator a go (paid software).

Solution 2:

While Inkscape is an awesome way to do it, those lacking X11 can also extract individual pages of a PDF into SVG format at the command line using pdftocairo from poppler-utils. For example, to extract just page 30:

$ pdftocairo -f 30 -l 30 -svg  somehugemanual.pdf  myextractedpage.svg
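
If you need several pages, the same command can be looped, one page per run. Here's a rough sketch, assuming pdftocairo is on your PATH; the function name extract_svg_pages and the page-N.svg output naming are my own choices for illustration, not something poppler provides.

```shell
# Hypothetical helper: write pages FIRST..LAST of a PDF to one SVG file each.
# Usage: extract_svg_pages somehugemanual.pdf 30 32
extract_svg_pages() {
  pdf=$1
  page=$2
  last=$3
  while [ "$page" -le "$last" ]; do
    # Same invocation as above, restricted to a single page per run.
    pdftocairo -f "$page" -l "$page" -svg "$pdf" "page-$page.svg" || return 1
    page=$((page + 1))
  done
}
```

The function stops at the first page that fails to convert, so a typo in the file name doesn't produce a long run of errors.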

You can then use your favorite vector editor (mine is Inkscape) to isolate the image from the text.

Alternatively, if you're a hardcore command-line user, you can extract to EPS (Encapsulated PostScript) and use sed to strip out all the text (which, in pdftocairo's output, happens to sit between BT and ET lines). Here's how:

$ pdftocairo -f 30 -l 30 -eps  manual.pdf  - | sed '/^BT$/,/^ET$/ d' > myimage.eps
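
To see what that sed expression does, here's a small self-contained sketch run on a made-up fragment. The sample lines are invented input, not real pdftocairo output, and strip_text_blocks is just a hypothetical wrapper name. Everything from a line that is exactly BT through the next line that is exactly ET is deleted; the rest passes through.

```shell
# Hypothetical wrapper around the same sed expression used above.
strip_text_blocks() {
  sed '/^BT$/,/^ET$/ d'
}

# Made-up sample: drawing lines surrounding one text block.
printf '%s\n' 'q' '0 0 m 10 10 l S' 'BT' '(Hello) Tj' 'ET' 'Q' | strip_text_blocks
```

Only the three lines outside the BT ... ET block are printed. Note that the pattern only matches BT and ET standing alone on a line, so it depends on the output being laid out that way.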

And, if you're really determined to avoid X11, you can even shrink the bounding box of the image from the command line using Ghostscript's eps2eps command:

$ eps2eps myimage.eps myimage-bb.eps
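
To check whether the bounding box actually changed, you can compare the %%BoundingBox header comment of the two files. A minimal sketch: eps_bbox is a hypothetical helper name, and it assumes the comment appears as a line starting with %%BoundingBox: (the usual convention for EPS headers).

```shell
# Hypothetical helper: print the first %%BoundingBox line of an EPS file.
eps_bbox() {
  sed -n '/^%%BoundingBox:/{p;q;}' "$1"
}

# Usage (file names from the commands above):
#   eps_bbox myimage.eps
#   eps_bbox myimage-bb.eps
```

If the second file reports a tighter box than the first, eps2eps did its job.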

I've tested this and it works great. However, personally, I find it easier to just use Inkscape.
