Best way to check if a URL is valid
I want to use PHP to check, if string stored in $myoutput
variable contains a valid link syntax or is it just a normal text. The function or solution, that I'm looking for, should recognize all links formats including the ones with GET parameters.
A solution, suggested on many sites, to actually query string (using CURL or file_get_contents()
function) is not possible in my case and I would like to avoid it.
I thought about regular expressions or another solution.
You can use a native Filter Validator
filter_var($url, FILTER_VALIDATE_URL);
Validates value as URL (according to » http://www.faqs.org/rfcs/rfc2396), optionally with required components. Beware a valid URL may not specify the HTTP protocol http:// so further validation may be required to determine the URL uses an expected protocol, e.g. ssh:// or mailto:. Note that the function will only find ASCII URLs to be valid; internationalized domain names (containing non-ASCII characters) will fail.
Example:
if (filter_var($url, FILTER_VALIDATE_URL) === FALSE) {
die('Not a valid URL');
}
Here is the best tutorial I found over there:
http://www.w3schools.com/php/filter_validate_url.asp
<?php
$url = "http://www.qbaki.com";
// Remove all illegal characters from a url
$url = filter_var($url, FILTER_SANITIZE_URL);
// Validate url
if (filter_var($url, FILTER_VALIDATE_URL) !== false) {
echo("$url is a valid URL");
} else {
echo("$url is not a valid URL");
}
?>
Possible flags:
FILTER_FLAG_SCHEME_REQUIRED - URL must be RFC compliant (like http://example)
FILTER_FLAG_HOST_REQUIRED - URL must include host name (like http://www.example.com)
FILTER_FLAG_PATH_REQUIRED - URL must have a path after the domain name (like www.example.com/example1/)
FILTER_FLAG_QUERY_REQUIRED - URL must have a query string (like "example.php?name=Peter&age=37")
Using filter_var() will fail for urls with non-ascii chars, e.g. (http://pt.wikipedia.org/wiki/Guimarães). The following function encode all non-ascii chars (e.g. http://pt.wikipedia.org/wiki/Guimar%C3%A3es) before calling filter_var().
Hope this helps someone.
<?php
function validate_url($url) {
$path = parse_url($url, PHP_URL_PATH);
$encoded_path = array_map('urlencode', explode('/', $path));
$url = str_replace($path, implode('/', $encoded_path), $url);
return filter_var($url, FILTER_VALIDATE_URL) ? true : false;
}
// example
if(!validate_url("http://somedomain.com/some/path/file1.jpg")) {
echo "NOT A URL";
}
else {
echo "IS A URL";
}
function is_url($uri){
if(preg_match( '/^(http|https):\\/\\/[a-z0-9_]+([\\-\\.]{1}[a-z_0-9]+)*\\.[_a-z]{2,5}'.'((:[0-9]{1,5})?\\/.*)?$/i' ,$uri)){
return $uri;
}
else{
return false;
}
}