How to remove all non-alpha numeric characters from a string in MySQL?

Using MySQL 8.0 or higher

Courtesy of michal.jakubeczy's answer below, replacing by Regex is now supported by MySQL:

UPDATE {table} SET {column} = REGEXP_REPLACE({column}, '[^0-9a-zA-Z ]', '')

Using MySQL 5.7 or lower

Regex isn't supported here. I had to create my own function called alphanum which stripped the chars for me:

DROP FUNCTION IF EXISTS alphanum; 
DELIMITER | 
CREATE FUNCTION alphanum( str CHAR(255) ) RETURNS CHAR(255) DETERMINISTIC
BEGIN 
  DECLARE i, len SMALLINT DEFAULT 1; 
  DECLARE ret CHAR(255) DEFAULT ''; 
  DECLARE c CHAR(1);
  IF str IS NOT NULL THEN 
    SET len = CHAR_LENGTH( str ); 
    REPEAT 
      BEGIN 
        SET c = MID( str, i, 1 ); 
        IF c REGEXP '[[:alnum:]]' THEN 
          SET ret=CONCAT(ret,c); 
        END IF; 
        SET i = i + 1; 
      END; 
    UNTIL i > len END REPEAT; 
  ELSE
    SET ret='';
  END IF;
  RETURN ret; 
END | 
DELIMITER ; 

Now I can do:

select 'This works finally!', alphanum('This works finally!');

and I get:

+---------------------+---------------------------------+
| This works finally! | alphanum('This works finally!') |
+---------------------+---------------------------------+
| This works finally! | Thisworksfinally                |
+---------------------+---------------------------------+
1 row in set (0.00 sec)

Hurray!


From a performance point of view, (and on the assumption that you read more than you write)

I think the best way would be to pre calculate and store a stripped version of the column, This way you do the transform less.

You can then put an index on the new column and get the database to do the work for you.


SELECT teststring REGEXP '[[:alnum:]]+';

SELECT * FROM testtable WHERE test REGEXP '[[:alnum:]]+'; 

See: http://dev.mysql.com/doc/refman/5.1/en/regexp.html
Scroll down to the section that says: [:character_class:]

If you want to manipulate strings the fastest way will be to use a str_udf, see:
https://github.com/hholzgra/mysql-udf-regexp


Since MySQL 8.0 you can use regular expression to remove non alphanumeric characters from a string. There is method REGEXP_REPLACE

Here is the code to remove non-alphanumeric characters:

UPDATE {table} SET {column} = REGEXP_REPLACE({column}, '[^0-9a-zA-Z ]', '')