How do I copy a table from a PDF as a table?
Solution 1:
You might try Tabula; it works pretty well for data-oriented content laid out in tables.
A short intro can be found on the homepage.
To use the tool on the PDF attached to this question:
- Download the file to your local disk.
- Install and start the tool following the instructions on the homepage.
- Upload the PDF and select Submit.
- Navigate to the first table and select the table. Ensure that you do not select the header and footer of the page to get a more accurate result.
- Choose Repeat this selection if you want to select the following tables as well using the same coordinates.
- Choose Download all data.
- Choose Download data to get a CSV file with the extracted tables. This file can be opened with MS Excel or any other application which can read the CSV format for further processing.
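For the "further processing" step, here is a minimal sketch in Python, assuming the export is an ordinary comma-separated file. The helper name `clean_rows` and the sample text are made up for illustration; they are not part of Tabula. It trims stray whitespace from each cell and drops rows that are completely empty, which may appear if a selection catches blank space around a table.

```python
import csv
import io


def clean_rows(csv_text):
    """Parse CSV text, trim each cell, and drop rows with no content."""
    reader = csv.reader(io.StringIO(csv_text))
    cleaned = []
    for row in reader:
        cells = [cell.strip() for cell in row]
        if any(cells):  # keep only rows with at least one non-empty cell
            cleaned.append(cells)
    return cleaned


# Hypothetical sample standing in for the contents of the downloaded CSV.
sample = "Name , Qty\n,,\n apple,3\n"
for row in clean_rows(sample):
    print(row)
```

To run it on the real export, read the downloaded file as text and pass its contents to `clean_rows`.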
Solution 2:
You can use the Okular document viewer, which is available on Linux and, through the http://windows.kde.org/ installer, on Windows.
It can select text as a table, where you can define rows and columns.
Solution 3:
MirzaD, thanks for suggesting Okular. I have it installed on my Ubuntu desktop and never took it seriously until now, thanks to you.
Okular is awesome in the features it packs, and it can certainly address the needs of the person asking the question. With Okular, you use the Table Selection Tool to define an area, click to place column borders that mark the fields, and then copy. When you paste, you get consistent tab-delimited output that any serious tool can be coaxed into handling as a CSV file.
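If the tool at hand does not take tab-delimited text directly, the coaxing can be done with a few lines of Python. This is only a sketch: the function name `tsv_to_csv` and the sample text are made up for illustration, and it assumes the pasted text has one table row per line with cells separated by tabs.

```python
import csv
import io


def tsv_to_csv(pasted_text):
    """Convert tab-delimited text (one table row per line) to CSV text."""
    # QUOTE_NONE: treat quote characters in the pasted text as ordinary data.
    reader = csv.reader(
        io.StringIO(pasted_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        if row:  # skip blank lines left over from the paste
            writer.writerow(row)
    return out.getvalue()


# Hypothetical sample standing in for what gets pasted from Okular.
sample = "Item\tPrice\nTea, green\t4.50\n"
print(tsv_to_csv(sample))
```

The CSV writer quotes any cell that itself contains a comma, so such cells stay in one column.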
I now have this same need (to extract a few tables from a PDF document) on a CentOS desktop running GNOME/Xfce, and installing Okular would mean installing a whole bunch of other KDE graphics tools. So I will try Tabula first (it looks very promising too), and if that fails, Okular it will have to be.
Would this work on Windows? Yes, KDE can be installed on Windows, but KDE applications come with a fair amount of overhead in the form of other, unneeded software. So, depending on how great your needs are, this may be a viable option even on Windows.
You can read more about Okular on its project site; the slogan, More Than a Reader, certainly fits. I am really impressed with what Okular can do, in a neat and fast enough application with a small footprint.
The KDE Windows project makes it easy to install a subset of excellent KDE apps on Windows.
Solution 4:
Open the document with Adobe Acrobat. Click File > Save As. Select "HTML 4.01 with CSS 1.0 (*.htm, *.html)" in "Save as type", then save.
You can then open the saved HTML file in Microsoft Word, and it will be displayed as a table instead of plain text.
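If you would rather skip Word, the rows can also be pulled out of the saved file with a short script. The sketch below uses only Python's standard library and assumes the export marks the table up with ordinary `<table>`, `<tr>`, `<td>` and `<th>` tags, which I have not verified for every Acrobat version. The names `TableRows` and `extract_rows` are made up for illustration, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell to single spaces.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # discard any cell left unclosed


def extract_rows(html_text):
    """Return the table cells found in html_text as a list of rows."""
    parser = TableRows()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Hypothetical sample standing in for the saved HTML file's contents.
sample = "<table><tr><th>Name</th><th>Qty</th></tr><tr><td> apple </td><td><b>3</b></td></tr></table>"
print(extract_rows(sample))
```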