Using RegEx in SQL Server
I'm looking how to replace/encode text using RegEx based on RegEx settings/params below:
RegEx.IgnoreCase = True
RegEx.Global = True
RegEx.Pattern = "[^a-z\d\s.]+"
I have seen some examples on RegEx, but confused as to how to apply it the same way in SQL Server. Any suggestions would be helpful. Thank you.
You do not need to interact with managed code, as you can use LIKE:
CREATE TABLE #Sample(Field varchar(50), Result varchar(50))
GO
INSERT INTO #Sample (Field, Result) VALUES ('ABC123 ', 'Do not match')
INSERT INTO #Sample (Field, Result) VALUES ('ABC123.', 'Do not match')
INSERT INTO #Sample (Field, Result) VALUES ('ABC123&', 'Match')
SELECT * FROM #Sample WHERE Field LIKE '%[^a-z0-9 .]%'
GO
DROP TABLE #Sample
As your expression ends with +
you can go with '%[^a-z0-9 .][^a-z0-9 .]%'
EDIT:
To make it clear: SQL Server doesn't support regular expressions without managed code. Depending on the situation, the LIKE
operator can be an option, but it lacks the flexibility that regular expressions provides.
Regular Expressions In SQL Server Databases Implementation Use
Regular Expression - Description
. Match any one character
* Match any character
+ Match at least one instance of the expression before
^ Start at beginning of line
$ Search at end of line
< Match only if word starts at this point
> Match only if word stops at this point
\n Match a line break
[] Match any character within the brackets
[^...] Matches any character not listed after the ^
[ABQ]% The string must begin with either the letters A, B, or Q and can be of any length
[AB][CD]% The string must have a length of two or more and which must begin with A or B and have C or D as the second character
[A-Z]% The string can be of any length and must begin with any letter from A to Z
[A-Z0-9]% The string can be of any length and must start with any letter from A to Z or numeral from 0 to 9
[^A-C]% The string can be of any length but cannot begin with the letters A to C
%[A-Z] The string can be of any length and must end with any of the letters from A to Z
%[%$#@]% The string can be of any length and must contain at least one of the special characters enclosed in the bracket
Slightly modified version of Julio's answer.
-- MS SQL using VBScript Regex
-- select dbo.RegexReplace('aa bb cc','($1) ($2) ($3)','([^\s]*)\s*([^\s]*)\s*([^\s]*)')
-- $$ dollar sign, $1 - $9 back references, $& whole match
CREATE FUNCTION [dbo].[RegexReplace]
( -- these match exactly the parameters of RegExp
@searchstring varchar(4000),
@replacestring varchar(4000),
@pattern varchar(4000)
)
RETURNS varchar(4000)
AS
BEGIN
declare @objRegexExp int,
@objErrorObj int,
@strErrorMessage varchar(255),
@res int,
@result varchar(4000)
if( @searchstring is null or len(ltrim(rtrim(@searchstring))) = 0) return null
set @result=''
exec @res=sp_OACreate 'VBScript.RegExp', @objRegexExp out
if( @res <> 0) return '..VBScript did not initialize'
exec @res=sp_OASetProperty @objRegexExp, 'Pattern', @pattern
if( @res <> 0) return '..Pattern property set failed'
exec @res=sp_OASetProperty @objRegexExp, 'IgnoreCase', 0
if( @res <> 0) return '..IgnoreCase option failed'
exec @res=sp_OAMethod @objRegexExp, 'Replace', @result OUT,
@searchstring, @replacestring
if( @res <> 0) return '..Bad search string'
exec @res=sp_OADestroy @objRegexExp
return @result
END
You'll need Ole Automation Procedures turned on in SQL:
exec sp_configure 'show advanced options',1;
go
reconfigure;
go
sp_configure 'Ole Automation Procedures', 1;
go
reconfigure;
go
sp_configure 'show advanced options',0;
go
reconfigure;
go
You will have to build a CLR procedure that provides regex functionality, as this article illustrates.
Their example function uses VB.NET:
Imports System
Imports System.Data.Sql
Imports Microsoft.SqlServer.Server
Imports System.Data.SqlTypes
Imports System.Runtime.InteropServices
Imports System.Text.RegularExpressions
Imports System.Collections 'the IEnumerable interface is here
Namespace SimpleTalk.Phil.Factor
Public Class RegularExpressionFunctions
'RegExIsMatch function
<SqlFunction(IsDeterministic:=True, IsPrecise:=True)> _
Public Shared Function RegExIsMatch( _
ByVal pattern As SqlString, _
ByVal input As SqlString, _
ByVal Options As SqlInt32) As SqlBoolean
If (input.IsNull OrElse pattern.IsNull) Then
Return SqlBoolean.False
End If
Dim RegExOption As New System.Text.RegularExpressions.RegExOptions
RegExOption = Options
Return RegEx.IsMatch(input.Value, pattern.Value, RegExOption)
End Function
End Class '
End Namespace
...and is installed in SQL Server using the following SQL (replacing '%'-delimted variables by their actual equivalents:
sp_configure 'clr enabled', 1
RECONFIGURE WITH OVERRIDE
IF EXISTS ( SELECT 1
FROM sys.objects
WHERE object_id = OBJECT_ID(N'dbo.RegExIsMatch') )
DROP FUNCTION dbo.RegExIsMatch
go
IF EXISTS ( SELECT 1
FROM sys.assemblies asms
WHERE asms.name = N'RegExFunction ' )
DROP ASSEMBLY [RegExFunction]
CREATE ASSEMBLY RegExFunction
FROM '%FILE%'
GO
CREATE FUNCTION RegExIsMatch
(
@Pattern NVARCHAR(4000),
@Input NVARCHAR(MAX),
@Options int
)
RETURNS BIT
AS EXTERNAL NAME
RegExFunction.[SimpleTalk.Phil.Factor.RegularExpressionFunctions].RegExIsMatch
GO
--a few tests
---Is this card a valid credit card?
SELECT dbo.RegExIsMatch ('^(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|6(?:011|5[0-9][0-9])[0-9]{12}|3[47][0-9]{13}|3(?:0[0-5]|[68][0-9])[0-9]{11}|(?:2131|1800|35\d{3})\d{11})$','4241825283987487',1)
--is there a number in this string
SELECT dbo.RegExIsMatch( '\d','there is 1 thing I hate',1)
--Verifies number Returns 1
DECLARE @pattern VARCHAR(255)
SELECT @pattern ='[a-zA-Z0-9]\d{2}[a-zA-Z0-9](-\d{3}){2}[A-Za-z0-9]'
SELECT dbo.RegExIsMatch (@pattern, '1298-673-4192',1),
dbo.RegExIsMatch (@pattern,'A08Z-931-468A',1),
dbo.RegExIsMatch (@pattern,'[A90-123-129X',1),
dbo.RegExIsMatch (@pattern,'12345-KKA-1230',1),
dbo.RegExIsMatch (@pattern,'0919-2893-1256',1)
SELECT * from SOME_TABLE where NAME like '%[^A-Z]%'
Or some other expression instead of A-Z