Remove weird characters (A with hat) from SQL Server varchar column
Solution 1:
You can use the .NET regular expression functions, for example Regex.Replace:
Regex.Replace(s, @"[^\u0000-\u007F]", string.Empty);
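To see what that pattern does outside SQL Server, here is a minimal console sketch. The class name AsciiCleaner, the method StripNonAscii, and the sample string are made up for illustration; only Regex.Replace and the pattern come from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AsciiCleaner
{
    // Hypothetical helper: removes every character outside the ASCII range U+0000..U+007F.
    public static string StripNonAscii(string s)
    {
        return Regex.Replace(s, @"[^\u0000-\u007F]", string.Empty);
    }

    public static void Main()
    {
        // "\u00C2" is the A-with-circumflex character from the question title.
        string dirty = "Hello\u00C2 Kitty";
        Console.WriteLine(StripNonAscii(dirty)); // Hello Kitty
    }
}
```

Note that this removes every non-ASCII character, not just the stray one, so legitimate accented letters would be stripped as well.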
Because SQL Server has traditionally had no built-in support for regular expressions, you need to create a SQL CLR function. More information about .NET integration in SQL Server can be found here:
- String Utility Functions Sample - full working examples
- Stairway to SQLCLR - still in progress
- Introduction to SQL Server CLR Integration - official documentation
In your case:
- Open Visual Studio and create a Class Library project.
- Then rename the class to StackOverflow and paste the following code in its file:

    using Microsoft.SqlServer.Server;
    using System;
    using System.Collections.Generic;
    using System.Data.SqlTypes;
    using System.Linq;
    using System.Text;
    using System.Text.RegularExpressions;
    using System.Threading.Tasks;

    public class StackOverflow
    {
        [SqlFunction(DataAccess = DataAccessKind.None, IsDeterministic = true, Name = "RegexReplace")]
        public static SqlString Replace(SqlString sqlInput, SqlString sqlPattern, SqlString sqlReplacement)
        {
            // NULL arguments are treated as empty strings
            string input = (sqlInput.IsNull) ? string.Empty : sqlInput.Value;
            string pattern = (sqlPattern.IsNull) ? string.Empty : sqlPattern.Value;
            string replacement = (sqlReplacement.IsNull) ? string.Empty : sqlReplacement.Value;
            return new SqlString(Regex.Replace(input, pattern, replacement));
        }
    }
- Now, build the project. Open SQL Server Management Studio, select your database, and replace the path in the following FROM clause to match the location of your StackOverflow.dll:

    CREATE ASSEMBLY [StackOverflow]
    FROM 'C:\Users\gotqn\Desktop\StackOverflow\StackOverflow\bin\Debug\StackOverflow.dll';
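- Finally, create the SQL CLR function. If CLR integration is not already enabled on the instance (sp_configure 'clr enabled'), enable it first, otherwise calls to the function will fail:

    CREATE FUNCTION [dbo].[StackOverflowRegexReplace] (@input NVARCHAR(MAX), @pattern NVARCHAR(MAX), @replacement NVARCHAR(MAX))
    RETURNS NVARCHAR(4000)
    AS EXTERNAL NAME [StackOverflow].[StackOverflow].[Replace]
    GO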
You are now ready to use the RegexReplace .NET function directly in your T-SQL statements:
SELECT [dbo].[StackOverflowRegexReplace] ('Hello Kitty Essential Accessory Kit', '[^\u0000-\u007F]', '')
-- Hello Kitty Essential Accessory Kit
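Before deploying the assembly, it can help to exercise the same logic in a plain console program. The sketch below copies the body of the Replace method from the steps above without the [SqlFunction] attribute, so it needs only System.Data.SqlTypes; the class name RegexReplaceHarness and the sample input are assumptions for illustration.

```csharp
using System;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;

public static class RegexReplaceHarness
{
    // Same logic as the SQL CLR function, minus the [SqlFunction] attribute,
    // so it can run outside SQL Server.
    public static SqlString Replace(SqlString sqlInput, SqlString sqlPattern, SqlString sqlReplacement)
    {
        string input = sqlInput.IsNull ? string.Empty : sqlInput.Value;
        string pattern = sqlPattern.IsNull ? string.Empty : sqlPattern.Value;
        string replacement = sqlReplacement.IsNull ? string.Empty : sqlReplacement.Value;
        return new SqlString(Regex.Replace(input, pattern, replacement));
    }

    public static void Main()
    {
        SqlString pattern = new SqlString(@"[^\u0000-\u007F]");
        SqlString empty = new SqlString("");

        // Hypothetical sample value containing an A-with-circumflex (U+00C2).
        SqlString cleaned = Replace(new SqlString("Hello\u00C2 Kitty"), pattern, empty);
        Console.WriteLine(cleaned.Value); // Hello Kitty

        // A NULL input comes back as an empty string, not as NULL.
        SqlString fromNull = Replace(SqlString.Null, pattern, empty);
        Console.WriteLine(fromNull.IsNull); // False
    }
}
```

The second call shows a side effect worth knowing: because NULL inputs are mapped to empty strings, the T-SQL function returns '' rather than NULL for NULL column values.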
Solution 2:
If you only want to keep letters and digits in the string, this can help.
Here a regex is used to replace every character other than letters and digits.
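The code for this answer did not survive, so the following is only a guess at what it describes: a Regex.Replace call with a negated character class that keeps ASCII letters and digits. The pattern, the class name AlphaNumericFilter, and the method KeepLettersAndDigits are assumptions, not the original author's code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlphaNumericFilter
{
    // Hypothetical helper: drops everything except ASCII letters and digits.
    // Spaces and punctuation are removed too.
    public static string KeepLettersAndDigits(string s)
    {
        return Regex.Replace(s, "[^a-zA-Z0-9]", string.Empty);
    }

    public static void Main()
    {
        Console.WriteLine(KeepLettersAndDigits("Hello\u00C2 Kitty #1")); // HelloKitty1
    }
}
```

Because whitespace is stripped as well, this is stricter than the pattern in Solution 3, which also allows \s.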
Solution 3:
This seems to work:
string input = "Hello Kitty Essential Accessory Kit";
string res = Regex.Replace(input, @"[^a-zA-Z0-9\s]", "");
Console.WriteLine(res); // Hello Kitty Essential Accessory Kit