Sql script to find invalid email addresses
A data import was done from an access database and there was no validation on the email address field. Does anyone have an sql script that can return a list of invalid email addresses (missing @, etc).
SELECT * FROM people WHERE email NOT LIKE '%_@__%.__%'
Anything more complex will likely return false negatives and run slower.
Validating e-mail addresses in code is virtually impossible.
EDIT: Related questions
- I've answered a similar question some time ago: TSQL Email Validation (without regex)
- T-SQL: checking for email format
- Regexp recognition of email address hard?
- many other Stack Overflow questions
Here is a quick and easy solution:
CREATE FUNCTION dbo.vaValidEmail(@EMAIL varchar(100))
RETURNS bit as
BEGIN
DECLARE @bitRetVal as Bit
IF (@EMAIL <> '' AND @EMAIL NOT LIKE '_%@__%.__%')
SET @bitRetVal = 0 -- Invalid
ELSE
SET @bitRetVal = 1 -- Valid
RETURN @bitRetVal
END
Then you can find all rows by using the function:
SELECT * FROM users WHERE dbo.vaValidEmail(email) = 0
If you are not happy with creating a function in your database, you can use the LIKE-clause directly in your query:
SELECT * FROM users WHERE email NOT LIKE '_%@__%.__%'
Source
I find this simple T-SQL query useful for returning valid e-mail addresses
SELECT email
FROM People
WHERE email LIKE '%_@__%.__%'
AND PATINDEX('%[^a-z,0-9,@,.,_]%', REPLACE(email, '-', 'a')) = 0
The PATINDEX bit eliminates all e-mail addresses containing characters that are not in the allowed a-z, 0-9, '@', '.', '_' & '-' set of characters.
It can be reversed to do what you want like this:
SELECT email
FROM People
WHERE NOT (email LIKE '%_@__%.__%'
AND PATINDEX('%[^a-z,0-9,@,.,_]%', REPLACE(email, '-', 'a')) = 0)
select
email
from loginuser where
patindex ('%[ &'',":;!+=\/()<>]*%', email) > 0 -- Invalid characters
or patindex ('[@.-_]%', email) > 0 -- Valid but cannot be starting character
or patindex ('%[@.-_]', email) > 0 -- Valid but cannot be ending character
or email not like '%@%.%' -- Must contain at least one @ and one .
or email like '%..%' -- Cannot have two periods in a row
or email like '%@%@%' -- Cannot have two @ anywhere
or email like '%.@%' or email like '%@.%' -- Cant have @ and . next to each other
or email like '%.cm' or email like '%.co' -- Unlikely. Probably typos
or email like '%.or' or email like '%.ne' -- Missing last letter
This worked for me. Had to apply rtrim and ltrim to avoid false positives.
Source: http://sevenwires.blogspot.com/2008/09/sql-how-to-find-invalid-email-in-sql.html
Postgres version:
select user_guid, user_guid email_address, creation_date, email_verified, active
from user_data where
length(substring (email_address from '%[ &'',":;!+=\/()<>]%')) > 0 -- Invalid characters
or length(substring (email_address from '[@.-_]%')) > 0 -- Valid but cannot be starting character
or length(substring (email_address from '%[@.-_]')) > 0 -- Valid but cannot be ending character
or email_address not like '%@%.%' -- Must contain at least one @ and one .
or email_address like '%..%' -- Cannot have two periods in a row
or email_address like '%@%@%' -- Cannot have two @ anywhere
or email_address like '%.@%' or email_address like '%@.%' -- Cant have @ and . next to each other
or email_address like '%.cm' or email_address like '%.co' -- Unlikely. Probably typos
or email_address like '%.or' or email_address like '%.ne' -- Missing last letter
;
MySQL
SELECT * FROM `emails` WHERE lower(`email`)
NOT REGEXP '[-a-z0-9~!$%^&*_=+}{\\\'?]+(\\.[-a-z0-9~!$%^&*_=+}{\\\'?]+)*@([a-z0-9_][-a-z0-9_]*(\\.[-a-z0-9_]+)*\\.(aero|arpa|biz|com|coop|edu|gov|info|int|mil|museum|name|net|org|pro|travel|mobi|[a-z][a-z])|([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}))(:[0-9]{1,5})?'