What is a good regular expression to match a URL? [duplicate]

Regex if you want to ensure URL starts with HTTP/HTTPS:

https?:\/\/(www\.)?[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)

If you do not require HTTP protocol:

[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)

To try this out see http://regexr.com?37i6s, or for a version which is less restrictive http://regexr.com/3e6m0.

Example JavaScript implementation:

var expression = /[-a-zA-Z0-9@:%._\+~#=]{1,256}\.[a-zA-Z0-9()]{1,6}\b([-a-zA-Z0-9()@:%_\+.~#?&//=]*)?/gi;
var regex = new RegExp(expression);
var t = 'www.google.com';

if (t.match(regex)) {
  alert("Successful match");
} else {
  alert("No match");
}

(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})

Will match the following cases

  • http://www.foufos.gr
  • https://www.foufos.gr
  • http://foufos.gr
  • http://www.foufos.gr/kino
  • http://werer.gr
  • www.foufos.gr
  • www.mp3.com
  • www.t.co
  • http://t.co
  • http://www.t.co
  • https://www.t.co
  • www.aa.com
  • http://aa.com
  • http://www.aa.com
  • https://www.aa.com

Will NOT match the following

  • www.foufos
  • www.foufos-.gr
  • www.-foufos.gr
  • foufos.gr
  • http://www.foufos
  • http://foufos
  • www.mp3#.com

var expression = /(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})/gi;
var regex = new RegExp(expression);

var check = [
  'http://www.foufos.gr',
  'https://www.foufos.gr',
  'http://foufos.gr',
  'http://www.foufos.gr/kino',
  'http://werer.gr',
  'www.foufos.gr',
  'www.mp3.com',
  'www.t.co',
  'http://t.co',
  'http://www.t.co',
  'https://www.t.co',
  'www.aa.com',
  'http://aa.com',
  'http://www.aa.com',
  'https://www.aa.com',
  'www.foufos',
  'www.foufos-.gr',
  'www.-foufos.gr',
  'foufos.gr',
  'http://www.foufos',
  'http://foufos',
  'www.mp3#.com'
];

check.forEach(function(entry) {
  if (entry.match(regex)) {
    $("#output").append( "<div >Success: " + entry + "</div>" );
  } else {
    $("#output").append( "<div>Fail: " + entry + "</div>" );
  }
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div id="output"></div>

Check it in rubular - NEW version

Check it in rubular - old version


These are the droids you're looking for. This is taken from validator.js which is the library you should really use to do this. But if you want to roll your own, who am I to stop you? If you want pure regex then you can just take out the length check. I think it's a good idea to test the length of the URL though if you really want to determine compliance with the spec.

 function isURL(str) {
     var urlRegex = '^(?!mailto:)(?:(?:http|https|ftp)://)(?:\\S+(?::\\S*)?@)?(?:(?:(?:[1-9]\\d?|1\\d\\d|2[01]\\d|22[0-3])(?:\\.(?:1?\\d{1,2}|2[0-4]\\d|25[0-5])){2}(?:\\.(?:[0-9]\\d?|1\\d\\d|2[0-4]\\d|25[0-4]))|(?:(?:[a-z\\u00a1-\\uffff0-9]+-?)*[a-z\\u00a1-\\uffff0-9]+)(?:\\.(?:[a-z\\u00a1-\\uffff0-9]+-?)*[a-z\\u00a1-\\uffff0-9]+)*(?:\\.(?:[a-z\\u00a1-\\uffff]{2,})))|localhost)(?::\\d{2,5})?(?:(/|\\?|#)[^\\s]*)?$';
     var url = new RegExp(urlRegex, 'i');
     return str.length < 2083 && url.test(str);
}