Best way to automatically remove comments from PHP code
What’s the best way to remove comments from a PHP file?
I want to do something similar to strip-whitespace() - but it shouldn't remove the line breaks as well.
For example,
I want this:
<?PHP
// something
if ($whatsit) {
do_something(); # we do something here
echo '<html>Some embedded HTML</html>';
}
/* another long
comment
*/
some_more_code();
?>
to become:
<?PHP
if ($whatsit) {
do_something();
echo '<html>Some embedded HTML</html>';
}
some_more_code();
?>
(Although if the empty lines remain where comments are removed, that wouldn't be OK.)
It may not be possible, because of the requirement to preserve embedded HTML - that’s what’s tripped up the things that have come up on Google.
I'd use tokenizer. Here's my solution. It should work on both PHP 4 and 5:
$fileStr = file_get_contents('path/to/file');
$newStr = '';
$commentTokens = array(T_COMMENT);
if (defined('T_DOC_COMMENT')) {
$commentTokens[] = T_DOC_COMMENT; // PHP 5
}
if (defined('T_ML_COMMENT')) {
$commentTokens[] = T_ML_COMMENT; // PHP 4
}
$tokens = token_get_all($fileStr);
foreach ($tokens as $token) {
if (is_array($token)) {
if (in_array($token[0], $commentTokens)) {
continue;
}
$token = $token[1];
}
$newStr .= $token;
}
echo $newStr;
Use php -w <sourcefile>
to generate a file stripped of comments and whitespace, and then use a beautifier like PHP_Beautifier to reformat for readability.