How to validate a domain name using regex and PHP?
I want a solution that validates only domain names, not full URLs. The following examples show what I'm looking for:
domain.com -> true
domain.net -> true
domain.org -> true
domain.biz -> true
domain.co.uk -> true
sub.domain.com -> true
domain.com/folder -> false
domµ*$ain.com -> false
Solution 1:
The accepted answer is incomplete/wrong.
The regex pattern:
- should NOT validate domains such as: -domain.com, domain--.com, -domain-.-.com, domain.000, etc.
- should validate domains such as: schools.k12, newTLD.clothing, good.photography, etc.
After some further research, below is the most correct, cross-language, and compact pattern I could come up with:
^(?!\-)(?:(?:[a-zA-Z\d][a-zA-Z\d\-]{0,61})?[a-zA-Z\d]\.){1,126}(?!\d+)[a-zA-Z\d]{1,63}$
This pattern conforms with most* of the rules defined in the specs:
- Each label/level (separated by dots) may contain up to 63 characters.
- The full domain name may have up to 127 levels.
- The full domain name may not exceed the length of 253 characters in its textual representation.
- Each label can consist of letters, digits and hyphens.
- Labels cannot start or end with a hyphen.
- The top-level domain (extension) cannot be all-numeric.
Note 1: The full domain length check is not included in the regex. It should simply be checked by native methods, e.g. strlen(domain) <= 253.
Note 2: This pattern works with most languages including PHP, Javascript, Python, etc...
See DEMO here (for JS, PHP, Python)
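As a minimal PHP sketch of how the pattern and the separate length check from Note 1 could be combined: the helper name is_valid_domain() and the constant DOMAIN_PATTERN are hypothetical, and the trailing D modifier is an addition that is not part of the pattern above (in PCRE it keeps $ from matching just before a final newline).

```php
<?php
// Pattern from the answer, unchanged. The "D" modifier after the closing
// delimiter is an addition: without it, "$" also matches before a trailing "\n".
const DOMAIN_PATTERN = '/^(?!\-)(?:(?:[a-zA-Z\d][a-zA-Z\d\-]{0,61})?[a-zA-Z\d]\.){1,126}(?!\d+)[a-zA-Z\d]{1,63}$/D';

// Hypothetical helper, not part of PHP or any library.
function is_valid_domain(string $domain): bool
{
    // Note 1: the 253-character limit is checked outside the regex.
    if (strlen($domain) > 253) {
        return false;
    }

    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match(DOMAIN_PATTERN, $domain) === 1;
}
```

With this sketch, is_valid_domain('sub.domain.com') should return true, while is_valid_domain('domain.com/folder') and is_valid_domain('-domain.com') should return false. Like the pattern itself, this is only a format check.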
More Info:
- The regex above does not support IDNs.
- There is no spec that says the extension (TLD) should be between 2 and 6 characters. It actually supports up to 63 characters. See the current TLD list here. Also, some networks internally use custom/pseudo TLDs.
- Registration authorities might impose some extra, specific rules which are not explicitly supported in this regex. For example, .CO.UK and .ORG.UK names must have at least 3 characters, but fewer than 23, not including the extension. These kinds of rules are non-standard and subject to change. Do not implement them if you cannot maintain them.
- Regular expressions are great, but they are not the most effective or performant solution to every problem. So a native URL parser should be used instead whenever possible, e.g. Python's urlparse() method or PHP's parse_url() method.
- After all, this is just a format validation. A regex test does not confirm that a domain name is actually configured or exists! You should test its existence by making a request.
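The last two points can be sketched in PHP roughly as follows. Both helper names are hypothetical. The existence check here uses a DNS lookup via checkdnsrr() rather than an actual request, which is a substitution on my part; it needs network access, and a missing A/AAAA record does not prove that a domain is unregistered.

```php
<?php
// Hypothetical helper: extract the host from a full URL with parse_url().
function host_from_url(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);

    // parse_url() gives null when the URL has no host component
    // and false when the input is seriously malformed.
    return is_string($host) ? $host : null;
}

// Hypothetical helper: treat "has an A or AAAA record" as "exists".
// This is a DNS lookup, not an HTTP request, and it requires network access.
function domain_resolves(string $domain): bool
{
    return checkdnsrr($domain, 'A') || checkdnsrr($domain, 'AAAA');
}
```

For example, host_from_url('https://sub.domain.com/folder') returns 'sub.domain.com', which can then be passed to a format check. A bare 'domain.com/folder' has no scheme, so parse_url() treats it as a path and the helper returns null.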
Specs & References:
- IETF: RFC1035
- IETF: RFC1123
- IETF: RFC2181
- IETF: RFC952
- Wikipedia: Domain Name System
UPDATE (2019-12-21): Fixed leading hyphen with subdomains.
Solution 2:
How about:
^(?:[-A-Za-z0-9]+\.)+[A-Za-z]{2,6}$
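A quick way to see how this shorter pattern behaves on the examples from the question and from Solution 1 is to run it through preg_match(). This is only a sketch; the constant and function names are hypothetical.

```php
<?php
// Pattern from Solution 2, unchanged.
const SIMPLE_PATTERN = '/^(?:[-A-Za-z0-9]+\.)+[A-Za-z]{2,6}$/';

// Hypothetical helper wrapping preg_match() into a boolean.
function matches_simple_pattern(string $domain): bool
{
    return preg_match(SIMPLE_PATTERN, $domain) === 1;
}

// Print one line per sample so the results can be compared by eye.
$samples = ['domain.co.uk', 'domain.com/folder', '-domain.com', 'good.photography'];
foreach ($samples as $sample) {
    printf("%-20s %s\n", $sample, matches_simple_pattern($sample) ? 'true' : 'false');
}
```

It accepts domain.co.uk and rejects domain.com/folder as requested, but it also accepts -domain.com (hyphens are allowed anywhere in a label) and rejects good.photography (the TLD is limited to 2 to 6 letters).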