Regular expression for first and last name

Don't forget about names like:

Mathias d'Arras
Martin Luther King, Jr.
Hector Sausage-Hausen

This should do the trick for most things:

/^[a-z ,.'-]+$/i
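
As a quick check, the pattern above can be tried against the three example names. This is only a minimal Python sketch: the name `SIMPLE_NAME` and the two extra sample strings are my own additions, and `re.IGNORECASE` stands in for the `/i` flag.

```python
import re

# The simple pattern from above; re.IGNORECASE plays the role of the /i flag.
SIMPLE_NAME = re.compile(r"^[a-z ,.'-]+$", re.IGNORECASE)

samples = [
    "Mathias d'Arras",
    "Martin Luther King, Jr.",
    "Hector Sausage-Hausen",
    "José García",  # accented letters fall outside a-z
    "R2-D2",        # digits are not in the character class
]

for name in samples:
    verdict = "ok" if SIMPLE_NAME.match(name) else "rejected"
    print(f"{name!r}: {verdict}")
```

The three names from the list pass, while the accented name and the one with digits are rejected.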

OR Support international names with super sweet unicode:

/^[a-zA-ZàáâäãåąčćęèéêëėįìíîïłńòóôöõøùúûüųūÿýżźñçčšžÀÁÂÄÃÅĄĆČĖĘÈÉÊËÌÍÎÏĮŁŃÒÓÔÖÕØÙÚÛÜŲŪŸÝŻŹÑßÇŒÆČŠŽ∂ð ,.'-]+$/u

You are making false assumptions about the format of first and last names. It is probably better not to validate the name at all, apart from checking that it is not empty.
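
A minimal sketch of that "only reject empty input" approach, in Python. The function name `is_acceptable_name` is hypothetical, and treating whitespace-only input as empty is my assumption.

```python
def is_acceptable_name(raw: str) -> bool:
    """Accept any name with at least one non-whitespace character."""
    return bool(raw.strip())


print(is_acceptable_name("Martin Luther King, Jr."))
print(is_acceptable_name("陳大文"))
print(is_acceptable_name("   "))
```

Nothing about the script, length, or punctuation of the name is assumed here, so no real name is turned away.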

After going through all of these answers I found a way to build a tiny regex that supports most languages and only allows word characters. It also supports some special characters such as hyphens, spaces and apostrophes. I've tested it in Python and it supports the characters below:

^[\w'\-,.][^0-9_!¡?÷?¿/\\+=@#$%ˆ&*(){}|~<>;:[\]]{2,}$

Characters supported:

abcdefghijklmnopqrstuvwxyz
ABCDEFGHIJKLMNOPQRSTUVWXYZ
áéíóúäëïöüÄ'
陳大文
łŁőŐűŰZàáâäãåąčćęèéêëėįìíîïłńòóôöõøùúûüųū
ÿýżźñçčšžÀÁÂÄÃÅĄĆČĖĘÈÉÊËÌÍÎÏĮŁ
ŃÒÓÔÖÕØÙÚÛÜŲŪŸÝŻŹÑßÇŒÆČŠŽ.-
ñÑâê都道府県Федерации
আবাসযোগ্য জমির걸쳐 있는
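
Here is a short Python sketch for trying the pattern yourself. The helper `check` and the sample strings are my own, not part of the original answer; the pattern is copied unchanged.

```python
import re

# Pattern copied from above. In Python 3, \w is Unicode-aware for str patterns.
NAME = re.compile(r"^[\w'\-,.][^0-9_!¡?÷?¿/\\+=@#$%ˆ&*(){}|~<>;:[\]]{2,}$")


def check(name: str) -> bool:
    """Return True when the whole name is accepted by the pattern."""
    return NAME.match(name) is not None


for name in ["陳大文", "Hector Sausage-Hausen", "Федерации", "John Sm1th", "Jo", "1ab"]:
    print(f"{name!r}: {check(name)}")
```

Two things to be aware of: the `{2,}` means a name needs at least three characters in total, so "Jo" is rejected, and the first character is matched by `\w`, which also admits a digit or underscore, so "1ab" is accepted.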

I have created a custom regex to deal with names.

I have tried it on these types of names and found that it works for all of them:

John Smith
John D'Largy
John Doe-Smith
John Doe Smith
Hector Sausage-Hausen
Mathias d'Arras
Martin Luther King
Ai Wong
Chao Chang
Alzbeta Bara

My RegEx looks like this:

^([a-zA-Z]{2,}\s[a-zA-Z]{1,}'?-?[a-zA-Z]{2,}\s?([a-zA-Z]{1,})?)

MVC4 Model:

[RegularExpression("^([a-zA-Z]{2,}\\s[a-zA-Z]{1,}'?-?[a-zA-Z]{2,}\\s?([a-zA-Z]{1,})?)", ErrorMessage = "Valid Characters include (A-Z) (a-z) (' space -)") ]

Please note the double \\ for escape characters

For those of you who are new to RegEx, I thought I'd include an explanation.

^               // start of line
[a-zA-Z]{2,}    // will accept a name with at least two characters
\s              // will look for white space between name and surname
[a-zA-Z]{1,}    // needs at least 1 character
'?-?            // possibility of ' or - for double-barrelled and hyphenated surnames
[a-zA-Z]{2,}    // will accept a name with at least two characters
\s?             // possibility of another whitespace
([a-zA-Z]{1,})? // possibility of a second surname
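
To see how the pieces combine, here is a small Python sketch that runs the same pattern. The sample names "Ai Wu" and "John Smith123" are my own additions. Because the pattern has no trailing `$`, the sketch compares a prefix match with a whole-string match.

```python
import re

# Pattern from above, unchanged. Note that it has no trailing "$".
PATTERN = re.compile(r"^([a-zA-Z]{2,}\s[a-zA-Z]{1,}'?-?[a-zA-Z]{2,}\s?([a-zA-Z]{1,})?)")

for name in ["John D'Largy", "John Doe-Smith", "Ai Wu", "John Smith123"]:
    prefix = PATTERN.match(name)      # anchored at the start only
    whole = PATTERN.fullmatch(name)   # must consume the entire string
    print(f"{name!r}: prefix={bool(prefix)} whole={bool(whole)}")
```

"Ai Wu" fails because the surname part needs at least three letters (`{1,}` plus `{2,}`), and "John Smith123" passes as a prefix only. Adding `$` to the pattern (or matching the whole string, as `fullmatch` does) would reject the trailing digits.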

I have searched and searched and played and played with it and although it is not perfect it may help others making the attempt to validate first and last names that have been provided as one variable.

In my case, that variable is $name.

I used the following code for my PHP:

    $pattern = '/\b([A-Z]{1}[a-z]{1,30}[- ]{0,1}|[A-Z]{1}[- \']{1}[A-Z]{0,1}[a-z]{1,30}[- ]{0,1}|[a-z]{1,2}[ -\']{1}[A-Z]{1}[a-z]{1,30}){2,5}/';

    if (preg_match($pattern, $name)) {
        # pass - successful match - do something
    } else {
        # fail - unsuccessful match - do something
    }

I am learning RegEx myself, but I do have the explanation of the code as provided by RegexBuddy.
Here it is:

Assert position at a word boundary «\b»

Match the regular expression below and capture its match into backreference number 1
«([A-Z]{1}[a-z]{1,30}[- ]{0,1}|[A-Z]{1}[- \']{1}[A-Z]{0,1}[a-z]{1,30}[- ]{0,1}|[a-z]{1,2}[ -\']{1}[A-Z]{1}[a-z]{1,30}){2,5}»

Between 2 and 5 times, as many times as possible, giving back as needed (greedy) «{2,5}»

* I NEED SOME HELP HERE WITH UNDERSTANDING THE RAMIFICATIONS OF THIS NOTE *

Note: I repeated the capturing group itself. The group will capture only the last iteration. Put a capturing group around the repeated group to capture all iterations. «{2,5}»
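
To make that note concrete, here is a small Python sketch. Python's `re` module treats a repeated capturing group the same way: group 1 keeps only the last iteration. The sample name and the variable names are my own assumptions.

```python
import re

# The three alternatives from the PHP pattern, unchanged.
ALTERNATIVES = (
    r"[A-Z]{1}[a-z]{1,30}[- ]{0,1}"
    r"|[A-Z]{1}[- \']{1}[A-Z]{0,1}[a-z]{1,30}[- ]{0,1}"
    r"|[a-z]{1,2}[ -\']{1}[A-Z]{1}[a-z]{1,30}"
)

# Original shape: the capturing group itself is repeated.
repeated = re.compile(r"\b(" + ALTERNATIVES + r"){2,5}")
# Suggested shape: a capturing group around a repeated non-capturing group.
wrapped = re.compile(r"\b((?:" + ALTERNATIVES + r"){2,5})")

m = repeated.search("Martin Luther King")
print(m.group(0))  # the whole match
print(m.group(1))  # only the last iteration of the repeated group

w = wrapped.search("Martin Luther King")
print(w.group(1))  # every iteration, captured together
```

With the original shape, group 1 holds just "King" while the whole match is "Martin Luther King". So the note only matters if you read the captured group afterwards; if you only test whether the pattern matched, as the `if` statement above does, it has no effect.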

Match either the regular expression below (attempting the next alternative only if this one fails) «[A-Z]{1}[a-z]{1,30}[- ]{0,1}»

Match a single character in the range between “A” and “Z” «[A-Z]{1}»

Exactly 1 times «{1}»

Match a single character in the range between “a” and “z” «[a-z]{1,30}»

Between one and 30 times, as many times as possible, giving back as needed (greedy) «{1,30}»

Match a single character present in the list “- ” «[- ]{0,1}»

Between zero and one times, as many times as possible, giving back as needed (greedy) «{0,1}»

Or match regular expression number 2 below (attempting the next alternative only if this one fails) «[A-Z]{1}[- \']{1}[A-Z]{0,1}[a-z]{1,30}[- ]{0,1}»

Match a single character in the range between “A” and “Z” «[A-Z]{1}»

Exactly 1 times «{1}»

Match a single character present in the list below «[- \']{1}»

Exactly 1 times «{1}»

One of the characters “- ” «- »

A ' character «\'»

Match a single character in the range between “A” and “Z” «[A-Z]{0,1}»

Between zero and one times, as many times as possible, giving back as needed (greedy) «{0,1}»

Match a single character in the range between “a” and “z” «[a-z]{1,30}»

Between one and 30 times, as many times as possible, giving back as needed (greedy) «{1,30}»

Match a single character present in the list “- ” «[- ]{0,1}»

Between zero and one times, as many times as possible, giving back as needed (greedy) «{0,1}»

Or match regular expression number 3 below (the entire group fails if this one fails to match) «[a-z]{1,2}[ -\']{1}[A-Z]{1}[a-z]{1,30}»

Match a single character in the range between “a” and “z” «[a-z]{1,2}»

Between one and 2 times, as many times as possible, giving back as needed (greedy) «{1,2}»

Match a single character in the range between “ ” and “'” «[ -\']{1}»

Exactly 1 times «{1}»

Match a single character in the range between “A” and “Z” «[A-Z]{1}»

Exactly 1 times «{1}»

Match a single character in the range between “a” and “z” «[a-z]{1,30}»

Between one and 30 times, as many times as possible, giving back as needed (greedy) «{1,30}»

I know this validation assumes that every person filling out the form has a Western name, and that may eliminate the vast majority of people in the world. However, I feel like this is a step in the right direction. Perhaps this regular expression is too basic for the gurus to bother addressing, or maybe there is some other reason that I was unable to find the above code in my searches. I spent way too long trying to figure this out; you will probably notice just how foggy my mind is on all this if you look at my test names below.

I tested the code on the following names and the results are in parentheses to the right of each name.

STEVE SMITH (fail)
Stev3 Smith (fail)
STeve Smith (fail)
Steve SMith (fail)
Steve Sm1th (passed on the Steve Sm)
d'Are to Beaware (passed on the Are to Beaware)
Jo Blow (passed)
Hyoung Kyoung Wu (passed)
Mike O'Neal (passed)
Steve Johnson-Smith (passed)
Jozef-Schmozev Hiemdel (passed)
O Henry Smith (passed)
Mathais d'Arras (passed)
Martin Luther King Jr (passed)
Downtown-James Brown (passed)
Darren McCarty (passed)
George De FunkMaster (passed)
Kurtis B-Ball Basketball (passed)
Ahmad el Jeffe (passed)
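
The partial passes in the list ("passed on the ...") happen because the pattern is not anchored to the start and end of the input. The Python sketch below reproduces a few of the results and compares them with an anchored variant. The `^` and `$` anchors and the non-capturing group are my additions, not part of the original PHP code.

```python
import re

# The three alternatives from the PHP pattern, unchanged.
ALTERNATIVES = (
    r"[A-Z]{1}[a-z]{1,30}[- ]{0,1}"
    r"|[A-Z]{1}[- \']{1}[A-Z]{0,1}[a-z]{1,30}[- ]{0,1}"
    r"|[a-z]{1,2}[ -\']{1}[A-Z]{1}[a-z]{1,30}"
)

unanchored = re.compile(r"\b(?:" + ALTERNATIVES + r"){2,5}")  # as in the PHP code
anchored = re.compile(r"^(?:" + ALTERNATIVES + r"){2,5}$")    # must cover the whole input

for name in ["Steve Sm1th", "d'Are to Beaware", "Mike O'Neal", "STEVE SMITH"]:
    found = unanchored.search(name)
    matched_part = found.group(0) if found else None
    print(f"{name!r}: matched {matched_part!r}, whole string ok: {bool(anchored.match(name))}")
```

With the anchors in place, "Steve Sm1th" and "d'Are to Beaware" are rejected outright instead of passing on a fragment, while "Mike O'Neal" still passes.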

If you have basic names that are similar to those I used during testing, this code might be for you. Note that there must be between two and five name parts for the above code to work.

If you have any improvements, please let me know. I am just in the early stages (the first few months) of figuring out RegEx.

Thanks and good luck, Steve
