C# regex pattern to extract URLs from a given string - not only full HTML URLs but bare links as well
You can handle this with a fairly simple regular expression, or take the more traditional route of string splitting plus LINQ.
Regex
var linkParser = new Regex(@"\b(?:https?://|www\.)\S+\b", RegexOptions.Compiled | RegexOptions.IgnoreCase);
var rawString = "house home go www.monstermmorpg.com nice hospital http://www.monstermmorpg.com this is incorrect url http://www.monstermmorpg.commerged continue";
foreach (Match m in linkParser.Matches(rawString))
    MessageBox.Show(m.Value);
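Pattern explanation:
\b - matches a word boundary, i.e. the position between a word character (letter, digit or underscore) and a non-word character such as a space or period, or the start/end of the string.
(?: - begins a group; the ?: means the group is non-capturing, so its contents are not stored.
https?:// - matches http:// or https:// (the '?' after the "s" makes the "s" optional)
| - OR
www\. - matches the literal string www. (the \. means a literal ".")
) - ends the group
\S+ - matches a series of one or more non-whitespace characters.
\b - matches the closing word boundary, so the match ends on a word character and trailing punctuation (a final period, comma or slash) is left out.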
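Basically, the pattern looks for strings that start with http://, https:// or www. (the (?:https?://|www\.) part) and then matches all the characters up to the next whitespace.
Note that it does not validate the URL: http://www.monstermmorpg.commerged in the sample string is matched as well.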
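As a rough sketch of how the pattern behaves outside a WinForms app, the same regex can be wrapped in a small helper and run from a console program. The names UrlExtractor and ExtractUrls and the example.com/example.org inputs are made up for illustration; only the pattern itself comes from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class UrlExtractor
{
    // Same pattern as in the answer; compiled once and reused.
    private static readonly Regex LinkParser = new Regex(
        @"\b(?:https?://|www\.)\S+\b",
        RegexOptions.Compiled | RegexOptions.IgnoreCase);

    // Hypothetical helper: returns every match in the text as a string.
    public static List<string> ExtractUrls(string text)
    {
        var urls = new List<string>();
        foreach (Match m in LinkParser.Matches(text))
            urls.Add(m.Value);
        return urls;
    }
}

static class Program
{
    static void Main()
    {
        var text = "see www.example.com, or https://example.org/path?x=1.";
        foreach (var url in UrlExtractor.ExtractUrls(text))
            Console.WriteLine(url);
        // Prints:
        // www.example.com
        // https://example.org/path?x=1
    }
}
```

The trailing comma and period are dropped because the final \b forces the match to end on a word character. The same rule would also drop a trailing slash, which may or may not be what you want.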
Traditional String Options
var rawString = "house home go www.monstermmorpg.com nice hospital http://www.monstermmorpg.com this is incorrect url http://www.monstermmorpg.commerged continue";
// Where() is a LINQ extension method, so this needs "using System.Linq;" at the top of the file.
var links = rawString
    .Split("\t\n ".ToCharArray(), StringSplitOptions.RemoveEmptyEntries)
    .Where(s => s.StartsWith("http://") || s.StartsWith("www.") || s.StartsWith("https://"));
foreach (string s in links)
    MessageBox.Show(s);
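A possible variation on the split approach, shown here only as a sketch: passing StringComparison.OrdinalIgnoreCase to StartsWith makes the prefix test case-insensitive (like RegexOptions.IgnoreCase in the regex version) and avoids culture-sensitive comparison. LinkSplitter, ExtractLinks and the sample input are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LinkSplitter
{
    // Prefixes that mark a token as a link, taken from the answer above.
    private static readonly string[] Prefixes = { "http://", "https://", "www." };

    // Hypothetical helper: splits on whitespace and keeps tokens that start with a known prefix.
    public static List<string> ExtractLinks(string text)
    {
        return text
            .Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(token => Prefixes.Any(
                p => token.StartsWith(p, StringComparison.OrdinalIgnoreCase)))
            .ToList();
    }
}

static class Program
{
    static void Main()
    {
        var text = "go WWW.example.com now, then http://example.org.";
        foreach (var link in LinkSplitter.ExtractLinks(text))
            Console.WriteLine(link);
        // Prints:
        // WWW.example.com
        // http://example.org.
    }
}
```

Unlike the regex version, splitting keeps whatever punctuation is attached to the token (the final period here), so you may want to trim it afterwards.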
Using Nikita's reply, I can get the URL from the string very easily:
using System.Text.RegularExpressions;
string myString = "test =) https://google.com/";
Match url = Regex.Match(myString, @"http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?");
string finalUrl = url.ToString();
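One thing the snippet above does not show is what happens when the string contains no URL: Regex.Match then returns an unsuccessful match whose value is an empty string. A minimal sketch that checks Match.Success first, assuming a hypothetical FirstUrl.TryFind helper:

```csharp
using System;
using System.Text.RegularExpressions;

static class FirstUrl
{
    // Pattern copied from the snippet above.
    private const string Pattern = @"http(s)?://([\w-]+\.)+[\w-]+(/[\w- ./?%&=]*)?";

    // Hypothetical helper: returns true and sets url when a match is found,
    // otherwise returns false and sets url to an empty string.
    public static bool TryFind(string text, out string url)
    {
        Match m = Regex.Match(text, Pattern);
        url = m.Success ? m.Value : string.Empty;
        return m.Success;
    }
}

static class Program
{
    static void Main()
    {
        if (FirstUrl.TryFind("test =) https://google.com/", out string url))
            Console.WriteLine(url);            // prints https://google.com/
        else
            Console.WriteLine("no URL found");
    }
}
```

Be aware that the character class in the path part of this pattern includes a space, so when more text follows the URL the match can run on past the end of the link.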