Regular expression to find URLs within a string

Solution 1:

This is the one I use:

(http|ftp|https):\/\/([\w_-]+(?:(?:\.[\w_-]+)+))([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])

Works for me, should work for you too.
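
For anyone who wants to try it, here is a minimal sketch of how the pattern above could be used from Python. The name URL_RE and the sample text are made up for illustration and are not part of the original answer. The pattern has three capture groups, so re.findall would return tuples; the sketch uses finditer and group(0) to get whole URLs.

```python
import re

# Pattern copied verbatim from the answer above.
# URL_RE and `sample` are illustrative names, not part of the answer.
URL_RE = re.compile(
    r"(http|ftp|https):\/\/([\w_-]+(?:(?:\.[\w_-]+)+))"
    r"([\w.,@?^=%&:\/~+#-]*[\w@?^=%&\/~+#-])"
)

sample = "Docs at https://example.com/a/b?x=1, mirror at ftp://files.example.org/pub."

# group(0) is the whole match; trailing ',' and '.' are left out because
# the last character class does not allow them.
urls = [m.group(0) for m in URL_RE.finditer(sample)]
print(urls)
# ['https://example.com/a/b?x=1', 'ftp://files.example.org/pub']

# findall returns one (scheme, host, rest) tuple per match instead.
parts = URL_RE.findall(sample)
print(parts[0])
# ('https', 'example.com', '/a/b?x=1')
```

Note that the trailing comma and full stop in the sample are not swallowed into the URLs, which is usually what you want when pulling links out of prose.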

Solution 2:

I guess no regex is perfect for this, but here is a pretty solid one I found:

/(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[-A-Z0-9+&@#\/%=~_|$?!:,.])*(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[A-Z0-9+&@#\/%=~_|$])/igm

Some differences / advantages compared to the other ones posted here:

  • It does not match email addresses
  • It does match a host with a port when a scheme precedes it, e.g. http://localhost:12345
  • It won't detect something like moo.com without http or www

See here for examples
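
The three points above can be checked with a short Python sketch. This is only an illustration under a few assumptions: the JavaScript-style /.../igm literal is translated by hand (i becomes re.IGNORECASE, g becomes findall, and m is dropped because the pattern has no ^ or $ anchors), and the names PATTERN and sample are invented here.

```python
import re

# Body of the /.../igm literal above, split over three lines for readability.
# PATTERN and `sample` are illustrative names, not part of the answer.
PATTERN = re.compile(
    r"(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)"
    r"(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[-A-Z0-9+&@#\/%=~_|$?!:,.])*"
    r"(?:\([-A-Z0-9+&@#\/%=~_|$?!:,.]*\)|[A-Z0-9+&@#\/%=~_|$])",
    re.IGNORECASE,  # the "i" flag; "g" is covered by findall
)

sample = (
    "Mail bob@example.com, visit www.example.com/page_(1), "
    "dev server http://localhost:12345/status; moo.com is ignored."
)

# No capture groups in the pattern, so findall returns whole matches.
found = PATTERN.findall(sample)
print(found)
# ['www.example.com/page_(1)', 'http://localhost:12345/status']
```

The email address and the bare moo.com are skipped because neither starts with a scheme, www. or ftp., while the balanced parentheses in the second URL are kept.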

Solution 3:

import re

text = """The link of this question: https://stackoverflow.com/questions/6038061/regular-expression-to-find-urls-within-a-string
Also there are some urls: www.google.com, facebook.com, http://test.com/method?param=wasd, http://test.com/method?param=wasd&params2=kjhdkjshd
The code below catches all urls in text and returns urls in list."""

# Raw string, so the regex backslashes are not read as string escapes.
# The scheme is optional, which is why www.google.com and facebook.com match.
urls = re.findall(r'(?:(?:https?|ftp):\/\/)?[\w/\-?=%.]+\.[\w/\-&?=%.]+', text)
print(urls)

Output:

[
    'https://stackoverflow.com/questions/6038061/regular-expression-to-find-urls-within-a-string', 
    'www.google.com', 
    'facebook.com',
    'http://test.com/method?param=wasd',
    'http://test.com/method?param=wasd&params2=kjhdkjshd'
]
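
One thing to keep in mind: because the scheme is optional, this pattern accepts any token with a dot in the middle. The sketch below is only an illustration (the sample sentence, the names LOOSE_URL, candidates and likely_urls, and the prefix filter are all assumptions added here, not part of the answer). It shows the extra matches and one possible way to filter them.

```python
import re

# Same pattern as in the answer above; LOOSE_URL is an illustrative name.
LOOSE_URL = re.compile(r'(?:(?:https?|ftp):\/\/)?[\w/\-?=%.]+\.[\w/\-&?=%.]+')

sample = "Pi is 3.14, see notes.txt or http://test.com/docs for more"

candidates = LOOSE_URL.findall(sample)
print(candidates)
# ['3.14', 'notes.txt', 'http://test.com/docs']

# Hypothetical post-filter: keep only matches with an explicit prefix.
# This also drops bare domains such as facebook.com, so it is a trade-off.
PREFIXES = ("http://", "https://", "ftp://", "www.")
likely_urls = [u for u in candidates if u.startswith(PREFIXES)]
print(likely_urls)
# ['http://test.com/docs']
```

Whether the filter is worth it depends on the input: for text that mentions version numbers or file names it removes noise, but it gives up the bare-domain matches that this solution was written to catch.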