How do I get the HTML code of a web page in PHP?

Solution 1:

If your PHP server allows url fopen wrappers then the simplest way is:

$html = file_get_contents('https://stackoverflow.com/questions/ask');

If you need more control then you should look at the cURL functions:

$c = curl_init('https://stackoverflow.com/questions/ask');
curl_setopt($c, CURLOPT_RETURNTRANSFER, true);
//curl_setopt(... other options you want...)

$html = curl_exec($c);

if (curl_error($c))
    die(curl_error($c));

// Get the status code
$status = curl_getinfo($c, CURLINFO_HTTP_CODE);

curl_close($c);

Solution 2:

Also if you want to manipulate the retrieved page somehow, you might want to try some php DOM parser. I find PHP Simple HTML DOM Parser very easy to use.

Solution 3:

You may want to check out the YQL libraries from Yahoo: http://developer.yahoo.com/yql

The task at hand is as simple as

select * from html where url = 'http://stackoverflow.com/questions/ask'

You can try this out in the console at: http://developer.yahoo.com/yql/console (requires login)

Also see Chris Heilmanns screencast for some nice ideas what more you can do: http://developer.yahoo.net/blogs/theater/archives/2009/04/screencast_collating_distributed_information.html

Solution 4:

Simple way: Use file_get_contents():

$page = file_get_contents('http://stackoverflow.com/questions/ask');

Please note that allow_url_fopen must be true in you php.ini to be able to use URL-aware fopen wrappers.

More advanced way: If you cannot change your PHP configuration, allow_url_fopen is false by default and if ext/curl is installed, use the cURL library to connect to the desired page.