Regex for alphanumeric, but at least one letter

In my ASP.NET page, I have an input box that has to have the following validation on it:

Must be alphanumeric, with at least one letter (i.e. can't be ALL numbers).


Solution 1:

^\d*[a-zA-Z][a-zA-Z0-9]*$

Basically this means:

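  • Zero or more digits (in .NET, \d also matches non-ASCII Unicode digits unless RegexOptions.ECMAScript is set; use [0-9] if you want ASCII digits only);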
  • One alphabetic ASCII character;
  • Zero or more alphanumeric ASCII characters.

Try a few tests and you'll see this accepts any ASCII alphanumeric string that contains at least one letter, and rejects strings made up only of digits.

The key to this is the \d* at the front. Without it the regex gets much more awkward to write.
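
A quick way to run those tests is a small script. This is only a sketch, written in Python rather than ASP.NET; the pattern is unchanged, while the is_valid helper and the sample inputs are made up for illustration.

```python
import re

# The pattern from this answer, unchanged.
PATTERN = re.compile(r"^\d*[a-zA-Z][a-zA-Z0-9]*$")

def is_valid(text):
    """Return True if text is alphanumeric and contains at least one letter."""
    return PATTERN.match(text) is not None

# Hypothetical sample inputs: some with letters, some digits-only or invalid.
for sample in ["abc", "123abc", "12a34", "12345", "", "abc-123"]:
    print(repr(sample), is_valid(sample))
```

The first three samples should be accepted; the all-digit string, the empty string and the one containing a hyphen should be rejected.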

Solution 2:

Most answers to this question are correct, but there's an alternative that (in some cases) offers more flexibility if you want to change the rules later on:

^(?=.*[a-zA-Z].*)([a-zA-Z0-9]+)$

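This will match any sequence of alphanumeric characters, but only if the look-ahead (?=.*[a-zA-Z].*) also succeeds, i.e. only if the string contains at least one letter. A look-ahead checks what follows without consuming any characters, so the second group still gets to match the whole string. It's a little-known trick in regular expressions that allows you to handle some very difficult validation problems.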

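For example, say you need to add another constraint: the string should be between 6 and 12 characters long. The obvious solutions posted here can't express that easily, because the required letter splits the string into separately quantified parts whose lengths can't be summed. Using the look-ahead trick, the regex simply becomes: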

^(?=.*[a-zA-Z].*)([a-zA-Z0-9]{6,12})$
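
To see the two constraints working independently, here is a minimal sketch in Python (not ASP.NET; the pattern is the one above, and the is_valid name and sample strings are just for illustration):

```python
import re

# The look-ahead asserts "a letter occurs somewhere" without consuming input;
# the capturing group then enforces the allowed characters and the 6-12 length.
PATTERN = re.compile(r"^(?=.*[a-zA-Z].*)([a-zA-Z0-9]{6,12})$")

def is_valid(text):
    """Return True if text is 6-12 alphanumerics with at least one letter."""
    return PATTERN.match(text) is not None

# Hypothetical samples: valid, digits-only, too short, too long.
for sample in ["abc123", "123456", "abc12", "a123456789012"]:
    print(repr(sample), is_valid(sample))
```

Only the first sample should pass: the second has no letter, the third is 5 characters long and the last is 13.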

Solution 3:

^[\p{L}\p{N}]*\p{L}[\p{L}\p{N}]*$

Explanation:

  • [\p{L}\p{N}]* matches zero or more Unicode letters or numbers
  • \p{L} matches one letter
  • [\p{L}\p{N}]* matches zero or more Unicode letters or numbers
  • ^ and $ anchor the string, ensuring the regex matches the entire string. You may be able to omit these, depending on which regex matching function you call.

Result: you can have any alphanumeric string except there's got to be a letter in there somewhere.

\p{L} is similar to [A-Za-z] except that it includes all letters from all alphabets, with or without accents and diacritical marks. It is much more inclusive, drawing on a far larger set of Unicode characters. If you don't want that flexibility, substitute [A-Za-z]. A similar remark applies to \p{N}, which could be replaced by [0-9] if you want to keep it simple. See the MSDN page on character classes for more information.
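
To get a feel for what the Unicode categories accept, here is a rough sketch in Python. Python's built-in re module does not support \p{L}, so this swaps the regex for an equivalent check using unicodedata.category (general categories starting with "L" are letters, those starting with "N" are numbers). The helper names are hypothetical, and this illustrates the rule rather than being the .NET implementation.

```python
import unicodedata

def is_letter(ch):
    # General categories Lu, Ll, Lt, Lm, Lo correspond to \p{L}.
    return unicodedata.category(ch).startswith("L")

def is_number(ch):
    # General categories Nd, Nl, No correspond to \p{N}.
    return unicodedata.category(ch).startswith("N")

def is_valid(text):
    """Only letters/numbers allowed, and at least one letter required."""
    only_alnum = all(is_letter(ch) or is_number(ch) for ch in text)
    has_letter = any(is_letter(ch) for ch in text)
    return only_alnum and has_letter

# "\u00e9" is an accented e; "\u0661\u0662\u0663" are Arabic-Indic digits.
for sample in ["\u00e99", "\u0661\u0662\u0663", "abc_", "123"]:
    print(repr(sample), is_valid(sample))
```

The accented sample passes, while the Arabic-Indic digits fail for the same reason "123" does: they are numbers with no letter among them.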

The less fancy non-Unicode version would be

^[A-Za-z0-9]*[A-Za-z][A-Za-z0-9]*$