Are email addresses allowed to contain non-alphanumeric characters?

I'm building a website using Django. The website could have a significant number of users from non-English speaking countries.

I just want to know if there are any technical restrictions on what types of characters an email address could contain.

Are email addresses only allowed to contain English letters, numbers, _, @ and .?

Are they allowed to contain non-English alphabets like é or ü?

Are they allowed to contain Chinese or Japanese or other Unicode characters?


Solution 1:

An email address consists of two parts: the local part, which comes before the @, and the domain, which comes after it.
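As a minimal sketch of that split (the function name `split_address` is made up for illustration), you can cut at the last @, since a quoted local part may itself contain an @:

```python
def split_address(address: str) -> tuple[str, str]:
    """Split an address into (local part, domain) at the LAST '@'."""
    # rpartition cuts at the last '@', so '"a@b"@example.com' splits correctly.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a local@domain address: {address!r}")
    return local, domain
```

This only separates the two parts; it does not check either of them.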

The rules for the two parts are different:

In the local part you can use these ASCII characters:

  • Latin letters A - Z a - z
  • digits 0 - 9
  • special characters !#$%&'*+-/=?^_`{|}~
  • dot ., provided it is not the first or last character and does not appear twice in a row
  • space and "(),:;<>@[\] characters are allowed with restrictions (they are only allowed inside a quoted string, and inside it a backslash or double quote must be preceded by a backslash)

In addition, since 2012 (RFC 6531 and RFC 6532) the local part may contain international characters above U+007F, encoded as UTF-8.
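The unquoted form of these local-part rules could be checked roughly like this. It is a sketch, not a full parser: the names are invented for the example, quoted strings are deliberately not handled, and any code point above U+007F is accepted without further checks.

```python
import re

# ASCII characters allowed in an unquoted local part (letters, digits, specials).
_ATEXT_ASCII = r"A-Za-z0-9!#$%&'*+\-/=?^_`{|}~"
# An "atom" is a run of those characters or of any code point above U+007F.
_ATOM = rf"[{_ATEXT_ASCII}\u0080-\U0010FFFF]+"
# Atoms joined by single dots: no leading, trailing, or doubled dot.
_DOT_ATOM = re.compile(rf"{_ATOM}(?:\.{_ATOM})*")


def is_valid_unquoted_local_part(local: str) -> bool:
    """True if `local` follows the unquoted local-part rules listed above."""
    return _DOT_ATOM.fullmatch(local) is not None
```

For example, `is_valid_unquoted_local_part("ñoñó")` and `is_valid_unquoted_local_part("user+tag")` are accepted, while `".john"`, `"john..smith"` and `"john smith"` are rejected.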

The domain part is more restricted. Each dot-separated label may contain:

  • Latin letters A - Z a - z
  • digits 0 - 9
  • hyphen -, provided it is not the first or last character; consecutive hyphens are allowed.
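A rough sketch of the domain rules follows. The helper name `to_ascii_domain` is hypothetical. Non-ASCII domains are first converted to their ASCII (Punycode) form with Python's built-in `idna` codec, which implements the older IDNA 2003 rules, so treat the result as an approximation. The 63-character label limit and the requirement of at least two labels are assumptions added here, not rules from the list above.

```python
import re
from typing import Optional

# One label: letters, digits, hyphens; hyphen not first or last; 1 to 63 chars.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def to_ascii_domain(domain: str) -> Optional[str]:
    """Return the ASCII form of `domain`, or None if it breaks the rules."""
    try:
        # The stdlib codec turns e.g. "bücher.example" into "xn--bcher-kva.example".
        ascii_domain = domain.encode("idna").decode("ascii")
    except UnicodeError:
        return None  # empty label, over-long label, or disallowed character
    labels = ascii_domain.split(".")
    # Assumption: require at least two labels (so "localhost" is rejected).
    if len(labels) < 2 or not all(_LABEL.fullmatch(label) for label in labels):
        return None
    return ascii_domain
```

Consecutive hyphens inside a label pass, which is what lets the `xn--` prefix of encoded international names through.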

Regex to validate:

^(([^<>()\[\]\.,;:\s@\"]+(\.[^<>()\[\]\.,;:\s@\"]+)*)|(\".+\"))@(([^<>()[\]\.,;:\s@\"]+\.)+[^<>()[\]\.,;:\s@\"]{2,})$
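If you want to try this pattern from Python, something like the following should work. The function name `looks_like_email` is invented for the example, and `re.fullmatch` is used so that the whole string has to match, not just a prefix. Because the pattern is built from negated character classes, non-ASCII letters pass through it.

```python
import re

# The pattern from above, as a raw string; fullmatch() anchors it at both ends.
EMAIL_PATTERN = re.compile(
    r'^(([^<>()\[\]\.,;:\s@\"]+(\.[^<>()\[\]\.,;:\s@\"]+)*)|(\".+\"))'
    r'@(([^<>()[\]\.,;:\s@\"]+\.)+[^<>()[\]\.,;:\s@\"]{2,})'
)


def looks_like_email(address: str) -> bool:
    """Loose shape check only; it does not prove the mailbox exists."""
    return EMAIL_PATTERN.fullmatch(address) is not None
```

With this, `looks_like_email("ñoñó@example.com")` is accepted, while `"user@localhost"` (no dot in the domain) and `"a..b@example.com"` (doubled dot) are rejected.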

Hope this saves you some time.

Solution 2:

Well, yes. Read (at least) this article from Wikipedia.

I live in Argentina, and addresses like ñoñó[email protected] are allowed here.

Solution 3:

The allowed syntax in an email address is described in [RFC 3696][1], and is pretty involved.

The exact rule [for local part; the part before the '@'] is that any ASCII character, including control characters, may appear quoted, or in a quoted string. When quoting is needed, the backslash character is used to quote the following character
[...]
Without quotes, local-parts may consist of any combination of alphabetic characters, digits, or any of the special characters ! # $ % & ' * + - / = ? ^ _ ` . { | } ~
[...]
Any characters, or combination of bits (as octets), are permitted in DNS names. However, there is a preferred form that is required by most applications...

...and so on, in some depth.

[1]: https://www.rfc-editor.org/rfc/rfc3696