Read pdf files with php

Check out FPDF (with FPDI):

http://www.fpdf.org/

http://www.setasign.de/products/pdf-php-solutions/fpdi/

These will let you open an pdf and add content to it in PHP. I'm guessing you can also use their functionality to search through the existing content for the values you need.

Another possible library is TCPDF: https://tcpdf.org/

Update to add a more modern library: PDF Parser


There is a php library (pdfparser) that does exactly what you want.

project website

http://www.pdfparser.org/

github

https://github.com/smalot/pdfparser

Demo page/api

http://www.pdfparser.org/demo

After including pdfparser in your project you can get all text from mypdf.pdf like so:

<?php
$parser = new \installpath\PdfParser\Parser();
$pdf    = $parser->parseFile('mypdf.pdf');  
$text = $pdf->getText();
echo $text;//all text from mypdf.pdf

?>

Simular you can get the metadata from the pdf as wel as getting the pdf objects (for example images).


Not exactly php, but you could exec a program from php to convert the pdf to a temporary html file and then parse the resulting file with php. I've done something similar for a project of mine and this is the program I used:

PdfToHtml

The resulting HTML wraps text elements in < div > tags with absolute position coordinates. It seems like this is exactly what you are trying to do.