Determining what classes are defined in a PHP class file
I needed something like this for a project I am working on, and here are the functions I wrote:
function file_get_php_classes($filepath) {
    $php_code = file_get_contents($filepath);
    $classes = get_php_classes($php_code);
    return $classes;
}

function get_php_classes($php_code) {
    $classes = array();
    $tokens = token_get_all($php_code);
    $count = count($tokens);
    for ($i = 2; $i < $count; $i++) {
        // A class declaration is T_CLASS, then whitespace, then the name.
        if (   $tokens[$i - 2][0] == T_CLASS
            && $tokens[$i - 1][0] == T_WHITESPACE
            && $tokens[$i][0] == T_STRING) {
            $class_name = $tokens[$i][1];
            $classes[] = $class_name;
        }
    }
    return $classes;
}
If you just want to check a file without loading it, use token_get_all():
<?php
header('Content-Type: text/plain');
$php_file = file_get_contents('c2.php');
$tokens = token_get_all($php_file);
$class_token = false;
foreach ($tokens as $token) {
    if (is_array($token)) {
        if ($token[0] == T_CLASS) {
            $class_token = true;
        } else if ($class_token && $token[0] == T_STRING) {
            echo "Found class: $token[1]\n";
            $class_token = false;
        }
    }
}
?>
Basically, this is a simple finite state machine. In PHP the sequence of tokens will be:
- T_CLASS: 'class' keyword;
- T_WHITESPACE: space(s) after 'class';
- T_STRING: name of class.
So this code will handle any weird spacing or newlines just fine, because it uses the same tokenizer PHP itself uses when it compiles the file. If token_get_all() can't tokenize it, neither can PHP.
By the way, you can use token_name() to turn a token number into its constant name.
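To see the token sequence described above for yourself, a minimal sketch like the following dumps each token with its name. The `$code` string is a made-up input, not the c2.php file from this answer, and `$names` exists only to collect the names for inspection.

```php
<?php
// Hypothetical input: a one-class snippet with odd spacing after "class".
$code = "<?php\nclass   Demo\n{\n}\n";

$names = [];
foreach (token_get_all($code) as $token) {
    if (is_array($token)) {
        // Array tokens are [id, text, line]; token_name() maps the id to its constant name.
        $names[] = token_name($token[0]);
        printf("%-14s %s\n", token_name($token[0]), json_encode($token[1]));
    } else {
        // Single-character tokens such as '{' or '}' come back as plain strings.
        $names[] = $token;
        printf("%-14s %s\n", '(char)', json_encode($token));
    }
}
```

The class keyword, the run of spaces, and the name show up as T_CLASS, T_WHITESPACE and T_STRING in that order, regardless of how much whitespace separates them.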
Here is my c2.php:
<?php
class MyClass {
    public function __construct() {
    }
}
class MyOtherClass {
    public function __construct() {
    }
}
?>
Output:
Found class: MyClass
Found class: MyOtherClass
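One caveat with the loop above: the "saw class" flag stays set until the next T_STRING, so an anonymous class (`new class { function run() {} }`) would make it report `run` as a class. A possible tightening, sketched here under the assumption that only whitespace and comments may sit between `class` and the name, is to clear the state on any other token. The function name `sketch_find_classes` and the `$demo` input are made up for illustration.

```php
<?php
// Sketch of a stricter state machine: the "saw class" state survives
// only whitespace and comments; any other token clears it.
function sketch_find_classes(string $code): array
{
    $found = [];
    $afterClass = false;
    foreach (token_get_all($code) as $token) {
        $id = is_array($token) ? $token[0] : null;
        if (in_array($id, [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT], true)) {
            continue; // these never change the state
        }
        if ($afterClass && $id === T_STRING) {
            $found[] = $token[1];
        }
        // '{' after "new class", or anything else, resets the state here.
        $afterClass = ($id === T_CLASS);
    }
    return $found;
}

// Hypothetical input mixing an anonymous class with a named one.
$demo = "<?php\n\$o = new class { function run() {} };\nclass Real {}\n";
print_r(sketch_find_classes($demo));
```

With this input only `Real` is collected; the method name inside the anonymous class is ignored.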
I needed to parse classes from files with namespaces, so I modified the code. In case somebody else needs it, here it is:
public function getPhpClasses($phpcode) {
    $classes = array();
    $namespace = 0;
    $tokens = token_get_all($phpcode);
    $count = count($tokens);
    $dlm = false;
    for ($i = 2; $i < $count; $i++) {
        if ((isset($tokens[$i - 2][1]) && ($tokens[$i - 2][1] == "phpnamespace" || $tokens[$i - 2][1] == "namespace")) ||
            ($dlm && $tokens[$i - 1][0] == T_NS_SEPARATOR && $tokens[$i][0] == T_STRING)) {
            // Start a new namespace name, or append the next segment to it.
            if (!$dlm) $namespace = 0;
            if (isset($tokens[$i][1])) {
                $namespace = $namespace ? $namespace . "\\" . $tokens[$i][1] : $tokens[$i][1];
                $dlm = true;
            }
        }
        elseif ($dlm && ($tokens[$i][0] != T_NS_SEPARATOR) && ($tokens[$i][0] != T_STRING)) {
            // The namespace name has ended.
            $dlm = false;
        }
        if (($tokens[$i - 2][0] == T_CLASS || (isset($tokens[$i - 2][1]) && $tokens[$i - 2][1] == "phpclass"))
            && $tokens[$i - 1][0] == T_WHITESPACE && $tokens[$i][0] == T_STRING) {
            $class_name = $tokens[$i][1];
            if (!isset($classes[$namespace])) $classes[$namespace] = array();
            $classes[$namespace][] = $class_name;
        }
    }
    return $classes;
}
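Note that PHP 8.0 changed how names are tokenized: a multi-part name such as `Foo\Bar` arrives as a single T_NAME_QUALIFIED token instead of T_STRING / T_NS_SEPARATOR pairs, so code that walks the separators may need adjusting. A version-tolerant sketch is shown below; it simply concatenates whatever name tokens sit between `namespace` and the terminating `;` or `{`. The function name `sketch_classes_by_namespace` and the `$demo` input are made up, and the sketch does not handle the `namespace\Foo` relative-name operator.

```php
<?php
// Sketch: map namespace => list of class names, for PHP 7 and PHP 8 token streams.
function sketch_classes_by_namespace(string $code): array
{
    $classes = [];
    $namespace = ''; // '' stands for the global namespace
    $skip = [T_WHITESPACE, T_COMMENT, T_DOC_COMMENT];
    $tokens = token_get_all($code);
    $count = count($tokens);

    for ($i = 0; $i < $count; $i++) {
        if (!is_array($tokens[$i])) {
            continue;
        }
        if ($tokens[$i][0] === T_NAMESPACE) {
            // Join every name token up to the ';' or '{' that ends the declaration.
            $namespace = '';
            for ($i++; $i < $count && $tokens[$i] !== ';' && $tokens[$i] !== '{'; $i++) {
                if (is_array($tokens[$i]) && !in_array($tokens[$i][0], $skip, true)) {
                    $namespace .= $tokens[$i][1];
                }
            }
        } elseif ($tokens[$i][0] === T_CLASS) {
            // Look past whitespace/comments; a T_STRING there is a declared class name.
            $j = $i + 1;
            while ($j < $count && is_array($tokens[$j]) && in_array($tokens[$j][0], $skip, true)) {
                $j++;
            }
            if ($j < $count && is_array($tokens[$j]) && $tokens[$j][0] === T_STRING) {
                $classes[$namespace][] = $tokens[$j][1];
            }
        }
    }
    return $classes;
}

// Hypothetical input with one namespace and two classes.
$demo = "<?php\nnamespace App\\Models;\n\nclass User {}\nabstract class Base {}\n";
print_r(sketch_classes_by_namespace($demo));
```

Looking ahead from T_CLASS rather than back from T_STRING also means anonymous classes and `Foo::class` constants are skipped, since neither is followed by a T_STRING.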
Or you could easily use AnnotationsParser from Nette\Reflection (installable using composer):
use Nette\Reflection\AnnotationsParser;
$classes = AnnotationsParser::parsePhp(file_get_contents($fileName));
var_dump($classes);
Output will be then something like this:
array(2) {
  ["Your\Class\Name"] =>
  array(...) {
    // property => comment
  },
  ["Your\Class\Second"] =>
  array(...) {
    // property => comment
  },
}
The parsePhp() method basically does something similar to the examples in the other answers, but you don't have to write or test the parsing yourself.
Here is my snippet as well. It can handle files containing multiple classes, interfaces, arrays and namespaces, and returns an array of classes with their types (class, abstract class, interface) grouped by namespace.
<?php
/**
 * Finds the classes, abstract classes and interfaces defined in a file,
 * grouped by namespace.
 *
 * @param string $file Path to file
 * @return array|null NULL if the file does not exist or nothing is found,
 *                    otherwise an array with the namespaces and classes found in the file
 */
function classes_in_file($file)
{
    $classes = $nsPos = $final = array();
    $foundNS = FALSE;
    $ii = 0;
    if (!file_exists($file)) return NULL;
    // Single-character tokens are plain strings, so indexing them below
    // may raise notices; silence those while scanning.
    $er = error_reporting();
    error_reporting(E_ALL ^ E_NOTICE);
    $php_code = file_get_contents($file);
    $tokens = token_get_all($php_code);
    $count = count($tokens);
    for ($i = 0; $i < $count; $i++)
    {
        if (!$foundNS && $tokens[$i][0] == T_NAMESPACE)
        {
            $nsPos[$ii]['start'] = $i;
            $foundNS = TRUE;
        }
        elseif ($foundNS && ($tokens[$i] == ';' || $tokens[$i] == '{'))
        {
            $nsPos[$ii]['end'] = $i;
            $ii++;
            $foundNS = FALSE;
        }
        elseif ($i - 2 >= 0 && $tokens[$i - 2][0] == T_CLASS && $tokens[$i - 1][0] == T_WHITESPACE && $tokens[$i][0] == T_STRING)
        {
            if ($i - 4 >= 0 && $tokens[$i - 4][0] == T_ABSTRACT)
            {
                $classes[$ii][] = array('name' => $tokens[$i][1], 'type' => 'ABSTRACT CLASS');
            }
            else
            {
                $classes[$ii][] = array('name' => $tokens[$i][1], 'type' => 'CLASS');
            }
        }
        elseif ($i - 2 >= 0 && $tokens[$i - 2][0] == T_INTERFACE && $tokens[$i - 1][0] == T_WHITESPACE && $tokens[$i][0] == T_STRING)
        {
            $classes[$ii][] = array('name' => $tokens[$i][1], 'type' => 'INTERFACE');
        }
    }
    error_reporting($er);
    if (empty($classes)) return NULL;
    if (!empty($nsPos))
    {
        foreach ($nsPos as $k => $p)
        {
            $ns = '';
            for ($i = $p['start'] + 1; $i < $p['end']; $i++)
            {
                $ns .= $tokens[$i][1];
            }
            $ns = trim($ns);
            // Classes of namespace $k are stored under $k + 1; a namespace
            // that declares no classes gets an empty list.
            $final[$k] = array(
                'namespace' => $ns,
                'classes' => isset($classes[$k + 1]) ? $classes[$k + 1] : array()
            );
        }
        $classes = $final;
    }
    return $classes;
}
Outputs something like this...
array
  'namespace' => string 'test\foo' (length=8)
  'classes' =>
    array
      0 =>
        array
          'name' => string 'bar' (length=3)
          'type' => string 'CLASS' (length=5)
      1 =>
        array
          'name' => string 'baz' (length=3)
          'type' => string 'INTERFACE' (length=9)
array
  'namespace' => string 'this\is\a\really\big\namespace\for\testing\dont\you\think' (length=57)
  'classes' =>
    array
      0 =>
        array
          'name' => string 'yes_it_is' (length=9)
          'type' => string 'CLASS' (length=5)
      1 =>
        array
          'name' => string 'damn_too_big' (length=12)
          'type' => string 'ABSTRACT CLASS' (length=14)
      2 =>
        array
          'name' => string 'fogo' (length=4)
          'type' => string 'INTERFACE' (length=9)
Might help someone!