How to convert a .pdf file into a folder of images?
Solution 1:
OK well, I did some more research and although tohuwawohu's method does work, I found it easier to use a program called pdftoppm to achieve what I wanted done. Since I am pretty much a layperson when it comes to using command line apps, I will do my best to explain how I got this to work for me.
- Navigate to the folder containing the .pdf you wish to convert and open a terminal there. I did this with the following command:
cd ~/Documents/PDF
- Let's say the file I want to convert is called Sample.pdf. What I want to do is use pdftoppm to create an image file of each page of the .pdf. Several formats can be chosen (see the man pages link above), but I prefer to use .png. The basic command looks like this:
pdftoppm -FORMAT FILENAME.pdf PREFIX
or in the example above:
pdftoppm -png Sample.pdf Sample
This command creates one image file per page in the current folder (here, the same folder as the original .pdf), with names like Sample-01.png, Sample-02.png and so on; the number of digits in the page number depends on how many pages the PDF has. I have tried it successfully with the -png and -jpeg options. -jpg is apparently not supported.
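If you would rather have the pages land in their own folder instead of next to the PDF, the same command can be wrapped in a small shell function. This is only a minimal sketch: pdf_to_folder is a made-up helper name, and the 150 DPI resolution is an assumed value that you may want to change.

```shell
# Sketch: render every page of a PDF into its own folder of PNG files.
# pdf_to_folder is a hypothetical helper name, not part of pdftoppm.
pdf_to_folder() {
    local pdf="$1"
    local name
    name="$(basename "$pdf" .pdf)"   # Sample.pdf -> Sample
    local outdir="${2:-$name}"       # default: a folder named after the PDF
    mkdir -p "$outdir" || return 1
    # -png selects the output format, -r sets the resolution in DPI
    # (150 is an assumed value). The last argument is the output prefix,
    # so the pages are written as <outdir>/<name>-<page>.png.
    pdftoppm -png -r 150 "$pdf" "$outdir/$name"
}
```

After pasting the function into a terminal, `pdf_to_folder Sample.pdf` should write the pages into a folder called Sample, and `pdf_to_folder Sample.pdf pages` into a folder called pages.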
Then I just use Archive Manager by selecting all the newly-created image files, right-clicking, and choosing "Compress" from the context menu. I then choose the archive format I prefer (in this case .cbz or Comic Book Zip) and create the new archive.
Now I have a shiny new .cbz file called Sample.cbz which I can then view with my Comix reader!
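The archive step can also be done from the terminal, since a .cbz file is an ordinary zip archive with a different extension. The following is a rough sketch that assumes the zip program is installed and that the pages were created with -png; make_cbz is a made-up helper name.

```shell
# Sketch: pack the page images created by pdftoppm into a .cbz archive.
# make_cbz is a hypothetical helper name; it expects the same PREFIX
# that was given to pdftoppm (for example "Sample").
make_cbz() {
    local prefix="$1"
    # pdftoppm zero-pads the page numbers, so the shell glob lists the
    # pages in reading order. -j stores the files without folder names.
    zip -j "$prefix.cbz" "$prefix"-*.png
}
```

For the example above, `make_cbz Sample` should produce Sample.cbz and leave the .png files in place.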
Hopefully what I have posted above makes enough sense that someone else can learn from it. If I need to change it in any way please let me know.
Solution 2:
I'm not very familiar with *.cbr / *.cbz, but it seems you'll have to combine two steps:
- Convert PDF to Images
- Compress them into a ZIP / RAR archive.
Regarding step 1, you could use ImageMagick's convert command. You can feed convert a PDF comprising multiple pages, and it will return each page as a single graphics file. I've tested it with a text scanned at 400 dpi, and the following command resulted in nice single JPEGs:
$ convert -verbose -colorspace RGB -interlace none -density 400 -quality 100 yourPdfFile.pdf %03d.jpeg
(credits regarding the -quality option: this forum entry)
As a result, you get 000.jpeg, 001.jpeg and so on. Just zip them into a .cbz file, and you're done.
You could even combine both steps by "concatenating" them:
$ convert -verbose -colorspace RGB -interlace none -density 400 -quality 100 yourPdfFile.pdf %03d.jpg && zip -vm comic.cbz *.jpg
(make sure that there aren't any other JPEGs in your current working directory: because of the -m option, zip moves every file matching *.jpg into the .cbz file and deletes the original)
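One way to avoid sweeping up unrelated JPEGs is to let convert write into a temporary folder and zip only that folder's contents. This is a hedged sketch rather than a tested recipe: pdf_to_cbz is a made-up helper name, the options are trimmed to -density and -quality (the others from the command above can be added back), and on newer ImageMagick installations the command may be called magick instead of convert.

```shell
# Sketch: convert a PDF to JPEG pages in a scratch folder, then zip them.
# pdf_to_cbz is a hypothetical helper name.
pdf_to_cbz() {
    local pdf="$1" out="$2"
    local workdir rc
    workdir="$(mktemp -d)" || return 1
    # %03d numbers the pages 000, 001, ... so they sort in page order.
    # zip -j stores the files without the temporary folder's path.
    convert -density 400 -quality 100 "$pdf" "$workdir/%03d.jpg" &&
        zip -j "$out" "$workdir"/*.jpg
    rc=$?
    rm -rf "$workdir"   # remove the scratch folder whether or not it worked
    return "$rc"
}
```

Called as `pdf_to_cbz yourPdfFile.pdf comic.cbz`, this should leave any other JPEGs in the current folder untouched, because only the files in the scratch folder are handed to zip.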