Best way to avoid code injection in PHP

My website was recently attacked by, what seemed to me as, an innocent code:

<?php
  if ( isset( $ _GET['page'] ) ) {
    include( $ _GET['page'] . ".php" );
  } else {
    include("home.php");
  }
?>

There where no SQL calls, so I wasn't afraid for SQL Injection. But, apparently, SQL isn't the only kind of injection.

This website has an explanation and a few examples of avoiding code injection: http://www.theserverpages.com/articles/webmasters/php/security/Code_Injection_Vulnerabilities_Explained.html

How would you protect this code from code injection?


Use a whitelist and make sure the page is in the whitelist:

  $whitelist = array('home', 'page');

  if (in_array($_GET['page'], $whitelist)) {
        include($_GET['page'].'.php');
  } else {
        include('home.php');
  }

Another way to sanitize the input is to make sure that only allowed characters (no "/", ".", ":", ...) are in it. However don't use a blacklist for bad characters, but a whitelist for allowed characters:

$page = preg_replace('[^a-zA-Z0-9]', '', $page);

... followed by a file_exists.

That way you can make sure that only scripts you want to be executed are executed (for example this would rule out a "blabla.inc.php", because "." is not allowed).

Note: This is kind of a "hack", because then the user could execute "h.o.m.e" and it would give the "home" page, because all it does is removing all prohibited characters. It's not intended to stop "smartasses" who want to cute stuff with your page, but it will stop people doing really bad things.

BTW: Another thing you could do in you .htaccess file is to prevent obvious attack attempts:

RewriteEngine on
RewriteCond %{QUERY_STRING} http[:%] [NC]
RewriteRule .* /–http– [F,NC]
RewriteRule http: /–http– [F,NC]

That way all page accesses with "http:" url (and query string) result in an "Forbidden" error message, not even reaching the php script. That results in less server load.

However keep in mind that no "http" is allowed in the query string. You website might MIGHT require it in some cases (maybe when filling out a form).

BTW: If you can read german: I also have a blog post on that topic.