Are email addresses allowed to contain non-alphanumeric characters?
I'm building a website using Django. The website could have a significant number of users from non-English speaking countries.
I just want to know if there are any technical restrictions on what types of characters an email address could contain.
Are email addresses only allowed to contain English letters, numbers, and the characters _, @ and .?
Are they allowed to contain non-English letters like é or ü?
Are they allowed to contain Chinese or Japanese or other Unicode characters?
Solution 1:
An email address consists of two parts: the local part, which comes before the @, and the domain, which comes after it.
The rules for the two parts differ.
In the local part you can use these ASCII characters:
- Latin letters A - Z a - z
- digits 0 - 9
- special characters !#$%&'*+-/=?^_`{|}~
- dot ., provided it is not the first or last character and does not appear twice in a row
- space and "(),:;<>@[\] characters, with restrictions: they are only allowed inside a quoted string, where a backslash or double quote must be preceded by a backslash
In addition, since 2012 (RFC 6531) you can use international characters above U+007F, encoded as UTF-8, as long as the mail systems involved support it.
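The unquoted local-part rules above can be sketched in Python roughly like this. It is only an illustration: the name `is_unquoted_local_part` is made up for this example, quoted strings are not handled, and treating every code point above U+007F as acceptable is a simplifying assumption.

```python
import re

# Characters allowed in an unquoted local part: the ASCII letters, digits and
# special characters listed above, plus (assumption) any code point above U+007F.
_ATEXT = r"A-Za-z0-9!#$%&'*+\-/=?^_`{|}~\u0080-\U0010FFFF"

# One or more runs of allowed characters separated by single dots, so a dot
# can never be first, last, or doubled.
_LOCAL_RE = re.compile(rf"[{_ATEXT}]+(?:\.[{_ATEXT}]+)*")


def is_unquoted_local_part(local: str) -> bool:
    """Return True if `local` follows the unquoted local-part rules."""
    return _LOCAL_RE.fullmatch(local) is not None
```

For example, `is_unquoted_local_part("john.doe")` and `is_unquoted_local_part("ñoñó")` are accepted, while `".john"`, `"john..doe"` and `"john doe"` are rejected.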
The domain part is more restricted. Each dot-separated label may contain:
- Latin letters A - Z a - z
- digits 0 - 9
- hyphen -, provided it is not the first or last character; multiple hyphens in a row are allowed
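As a rough Python sketch of the domain rules, each dot-separated label can be checked on its own. The function names here are invented for the example. The second helper goes slightly beyond the list above: it assumes Python's built-in `idna` codec (which implements the older IDNA 2003 rules) is acceptable for turning a non-ASCII domain into its ASCII form before checking it.

```python
import re

# One label: letters, digits and hyphens, with no hyphen at either end.
_LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_ascii_domain(domain: str) -> bool:
    """Return True if every dot-separated label follows the rules above."""
    return all(_LABEL_RE.fullmatch(label) for label in domain.split("."))


def to_ascii_domain(domain: str) -> str:
    """Convert an internationalized domain to its ASCII (Punycode) form.

    Assumption: the standard-library "idna" codec is good enough here.
    """
    return domain.encode("idna").decode("ascii")
```

With this, `is_ascii_domain("my-site.example.com")` passes and `is_ascii_domain("-bad.example")` fails, while a name like `bücher.example` would first go through `to_ascii_domain`.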
A regex to validate an address:
^(([^<>()\[\]\.,;:\s@\"]+(\.[^<>()\[\]\.,;:\s@\"]+)*)|(\".+\"))@(([^<>()\[\]\.,;:\s@\"]+\.)+[^<>()\[\]\.,;:\s@\"]{2,})$
Hope this saves you some time.
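Since the question mentions Django, here is one way the regex could be used from Python. This is a sketch, not a complete validator: `looks_like_email` is a name made up for the example, and `fullmatch()` is used so the whole string has to match. Note that the pattern is built from negated character classes, so it is looser than the lists above: it accepts non-ASCII characters in both parts and only rejects a handful of punctuation characters and whitespace.

```python
import re

# The pattern from this answer, with the brackets escaped consistently.
# fullmatch() anchors it at both ends, so no ^ or $ is needed.
EMAIL_RE = re.compile(
    r'(([^<>()\[\]\.,;:\s@\"]+(\.[^<>()\[\]\.,;:\s@\"]+)*)|(\".+\"))'
    r'@(([^<>()\[\]\.,;:\s@\"]+\.)+[^<>()\[\]\.,;:\s@\"]{2,})'
)


def looks_like_email(address: str) -> bool:
    """Return True if `address` matches the pattern in full."""
    return EMAIL_RE.fullmatch(address) is not None
```

An address such as `user@example.com` or `ñoñó@example.com` matches, while `user@localhost` does not, because the pattern requires at least one dot in the domain.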
Solution 2:
Well, yes. Read (at least) this article from Wikipedia.
I live in Argentina, and addresses like ñoñó[email protected] are allowed here.
Solution 3:
The allowed syntax in an email address is described in [RFC 3696][1], and is pretty involved.
> The exact rule [for local part; the part before the '@'] is that any ASCII character, including control characters, may appear quoted, or in a quoted string. When quoting is needed, the backslash character is used to quote the following character [...]
>
> Without quotes, local-parts may consist of any combination of alphabetic characters, digits, or any of the special characters ! # $ % & ' * + - / = ? ^ _ ` . { | } ~ [...]
>
> Any characters, or combination of bits (as octets), are permitted in DNS names. However, there is a preferred form that is required by most applications...
...and so on, in some depth.

[1]: https://www.rfc-editor.org/rfc/rfc3696
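To make the quoting rule a bit more concrete, here is a small Python sketch that leaves a local part alone when it is valid without quotes and otherwise wraps it in a quoted string, backslash-escaping backslashes and double quotes. The helper name `quote_local_part` is hypothetical, and the sketch only considers printable ASCII input; control characters and non-ASCII text are outside what it tries to handle.

```python
import re

# A run of the characters that may appear without quotes (dot excluded).
_ATOM = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]+"

# Runs separated by single dots: the form that needs no quoting.
_DOT_ATOM_RE = re.compile(rf"{_ATOM}(?:\.{_ATOM})*")


def quote_local_part(local: str) -> str:
    """Return `local` unchanged if it is valid unquoted, else as a quoted string."""
    if _DOT_ATOM_RE.fullmatch(local):
        return local
    # Escape backslashes first, then double quotes, then wrap in quotes.
    escaped = local.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'
```

So `john.doe` comes back unchanged, while `john doe` becomes `"john doe"`.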