How can I read XMP data from a JPG with PHP?

Solution 1:

XMP data is embedded literally in the image file, so you can extract it from the file's contents with PHP's string functions.

The following demonstrates this procedure (I'm using SimpleXML, but any other XML API, or even some simple and clever string parsing, should give you equivalent results):

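$content = file_get_contents($image);
$xmp_data_start = strpos($content, '<x:xmpmeta');
$xmp_data_end   = strpos($content, '</x:xmpmeta>');
$xmp            = false;
// Only parse when both the opening and the closing tag were found.
if ($xmp_data_start !== false && $xmp_data_end !== false) {
    $xmp_length = $xmp_data_end - $xmp_data_start;
    // + 12 keeps the closing tag itself: strlen('</x:xmpmeta>') === 12
    $xmp_data   = substr($content, $xmp_data_start, $xmp_length + 12);
    $xmp        = simplexml_load_string($xmp_data);
}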

Just two remarks:

  • XMP makes heavy use of XML namespaces, so you'll have to keep an eye on that when parsing the XMP data with some XML tools.
  • Considering the possible size of image files, you may not be able to use file_get_contents(), as this function loads the whole image into memory. Using fopen() to open a file stream resource and checking chunks of data for the key sequences <x:xmpmeta and </x:xmpmeta> will significantly reduce the memory footprint.
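
To illustrate the first remark, here is a minimal sketch of namespace-aware access with SimpleXML. The sample packet and its title text are made up for illustration; the namespace URIs are the usual ones for the x, rdf and dc prefixes, but check them against the packet you actually extract.

```php
<?php
// Hypothetical sample packet; XMP extracted from a real image has the same overall shape.
$xmp_data = <<<'XML'
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>
        <rdf:Alt>
          <rdf:li xml:lang="x-default">Sunset over the bay</rdf:li>
        </rdf:Alt>
      </dc:title>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
XML;

$xmp = simplexml_load_string($xmp_data);

// Prefixes have to be registered before they can be used in XPath expressions.
$xmp->registerXPathNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');
$xmp->registerXPathNamespace('dc', 'http://purl.org/dc/elements/1.1/');

$nodes = $xmp->xpath('//dc:title/rdf:Alt/rdf:li');
$title = $nodes ? (string) $nodes[0] : null;

echo $title, PHP_EOL; // prints "Sunset over the bay"
```

Without the registerXPathNamespace() calls, the prefixed XPath query would not resolve, which is the kind of surprise the remark above warns about.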

Solution 2:

I'm only replying to this after so much time because this seems to be the best result when searching Google for how to parse XMP data. I've seen this nearly identical snippet used in code a few times and it's a terrible waste of memory. Here is an example of the fopen() method Stefan mentions after his example.

<?php

function getXmpData($filename, $chunkSize)
{
    if (!is_int($chunkSize)) {
        throw new RuntimeException('Expected integer value for argument #2 (chunkSize)');
    }

    if ($chunkSize < 12) {
        throw new RuntimeException('Chunk size cannot be less than 12 argument #2 (chunkSize)');
    }

    if (($file_pointer = fopen($filename, 'rb')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }

    $startTag = '<x:xmpmeta';
    $endTag = '</x:xmpmeta>';
    $buffer = NULL;
    $hasXmp = FALSE;

    while (($chunk = fread($file_pointer, $chunkSize)) !== FALSE) {

        if ($chunk === "") {
            break;
        }

        $buffer .= $chunk;
        $startPosition = strpos($buffer, $startTag);
        $endPosition = strpos($buffer, $endTag);

        if ($startPosition !== FALSE && $endPosition !== FALSE) {
            $buffer = substr($buffer, $startPosition, $endPosition - $startPosition + 12);
            $hasXmp = TRUE;
            break;
        } elseif ($startPosition !== FALSE) {
            $buffer = substr($buffer, $startPosition);
            $hasXmp = TRUE;
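        } elseif (strlen($buffer) >= strlen($startTag)) {
            // No start tag yet: keep only the tail that could still hold a split tag,
            // so the buffer does not grow to the size of the file.
            $buffer = substr($buffer, -(strlen($startTag) - 1));
        }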
    }

    fclose($file_pointer);
    return ($hasXmp) ? $buffer : NULL;
}

Solution 3:

A simple way on Linux is to call the exiv2 program, available on Debian in a package of the same name.

$ exiv2 -e X extract image.jpg

This produces image.xmp, which contains the embedded XMP and is now yours to parse.
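
If you want to drive this from PHP, a rough sketch could look like the following. Both function names are hypothetical, and the code assumes that exiv2 is on the PATH, that it writes the sidecar next to the image with the extension replaced by .xmp, and that it exits with a non-zero status on failure; verify these against your exiv2 version.

```php
<?php
// Assumption: exiv2 writes the sidecar next to the image, swapping the extension for .xmp.
function xmpSidecarPath($imagePath)
{
    $info = pathinfo($imagePath);
    return $info['dirname'] . DIRECTORY_SEPARATOR . $info['filename'] . '.xmp';
}

// Hypothetical wrapper: runs the command shown above and returns the sidecar contents.
function extractXmpWithExiv2($imagePath)
{
    // escapeshellarg() protects against spaces and shell metacharacters in the path.
    $command = 'exiv2 -e X extract ' . escapeshellarg($imagePath) . ' 2>&1';
    exec($command, $output, $exitCode);

    $sidecar = xmpSidecarPath($imagePath);
    if ($exitCode !== 0 || !is_file($sidecar)) {
        throw new RuntimeException('exiv2 failed: ' . implode("\n", $output));
    }

    return file_get_contents($sidecar);
}
```

A call such as simplexml_load_string(extractXmpWithExiv2('image.jpg')) would then hand the sidecar contents to the XML parser. Note that a sidecar left over from an earlier run may make exiv2 ask before overwriting it, so remove stale files first.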

Solution 4:

I know... this is kind of an old thread, but it was helpful to me when I was looking for a way to do this, so I figured this might be helpful to someone else.

I took this basic solution and modified it so it handles the case where the tag is split between chunks. This allows the chunk size to be as large or small as you want.

<?php
function getXmpData($filename, $chunk_size = 1024)
{
	// The parameter is named $chunk_size; checking an undefined $chunkSize would always throw.
	if (!is_int($chunk_size)) {
		throw new RuntimeException('Expected integer value for argument #2 (chunk_size)');
	}

	if ($chunk_size < 12) {
		throw new RuntimeException('Chunk size cannot be less than 12 for argument #2 (chunk_size)');
	}

	if (($file_pointer = fopen($filename, 'rb')) === FALSE) {
		throw new RuntimeException('Could not open file for reading');
	}

	$tag = '<x:xmpmeta';
	$buffer = false;

	// find open tag
	while ($buffer === false && ($chunk = fread($file_pointer, $chunk_size)) !== false) {
		if(strlen($chunk) <= 10) {
			break;
		}
		if(($position = strpos($chunk, $tag)) === false) {
			// if open tag not found, back up just in case the open tag is on the split.
			fseek($file_pointer, -10, SEEK_CUR);
		} else {
			$buffer = substr($chunk, $position);
		}
	}

	if($buffer === false) {
		fclose($file_pointer);
		return false;
	}

	$tag = '</x:xmpmeta>';
	$offset = 0;
	while (($position = strpos($buffer, $tag, $offset)) === false && ($chunk = fread($file_pointer, $chunk_size)) !== FALSE && $chunk !== '') {
		// Resume just before the old end of the buffer in case the close tag is split between chunks; never go negative.
		$offset = max(0, strlen($buffer) - 12);
		$buffer .= $chunk;
	}

	fclose($file_pointer);

	if($position === false) {
		// this would mean the open tag was found, but the close tag was not.  Maybe file corruption?
		throw new RuntimeException('No close tag found.  Possibly corrupted file.');
	} else {
		$buffer = substr($buffer, 0, $position + 12);
	}

	return $buffer;
}
?>
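
To see why the overlap between chunks matters, here is a small self-contained sketch. It works on an in-memory string rather than a file, and findTagInChunks() is a hypothetical helper written for this demonstration only, not part of the function above.

```php
<?php
// Hypothetical helper: search a list of chunks for $tag. With $overlap enabled, the last
// strlen($tag) - 1 bytes are carried over so a tag split across two chunks is still found.
function findTagInChunks(array $chunks, $tag, $overlap)
{
    $tail = '';
    $consumed = 0; // number of input bytes that precede $tail
    foreach ($chunks as $chunk) {
        $window = $tail . $chunk;
        $position = strpos($window, $tag);
        if ($position !== false) {
            return $consumed + $position;
        }
        $keep = $overlap ? min(strlen($window), strlen($tag) - 1) : 0;
        $consumed += strlen($window) - $keep;
        $tail = $keep > 0 ? substr($window, -$keep) : '';
    }
    return false;
}

// The tag starts at byte 13, so with 16-byte chunks it straddles the first boundary.
$data = str_repeat('.', 13) . '<x:xmpmeta' . str_repeat('.', 9);
$chunks = str_split($data, 16);

$withoutOverlap = findTagInChunks($chunks, '<x:xmpmeta', false);
$withOverlap    = findTagInChunks($chunks, '<x:xmpmeta', true);

var_dump($withoutOverlap); // bool(false): neither chunk contains the whole tag
var_dump($withOverlap);    // int(13): found once the tail of the first chunk is kept
```

Keeping strlen($tag) - 1 bytes is the smallest overlap that still guarantees a split tag ends up whole in the next search window.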