Finding and removing Non-ASCII characters from an Oracle Varchar2

Solution 1:

I think this will do the trick:

SELECT REGEXP_REPLACE(COLUMN, '[^[:print:]]', '')

Solution 2:

If you use the ASCIISTR function to convert the Unicode to literals of the form \nnnn, you can then use REGEXP_REPLACE to strip those literals out, like so...

UPDATE table SET field = REGEXP_REPLACE(ASCIISTR(field), '\\[[:xdigit:]]{4}', '')

...where field and table are your field and table names respectively.

Solution 3:

I wouldn't recommend it for production code, but it makes sense and seems to work:

SELECT REGEXP_REPLACE(COLUMN,'[^' || CHR(1) || '-' || CHR(127) || '],'')

Solution 4:

In a single-byte ASCII-compatible encoding (e.g. Latin-1), ASCII characters are simply bytes in the range 0 to 127. So you can use something like [\x80-\xFF] to detect non-ASCII characters.

Solution 5:

The select may look like the following sample:

select nvalue from table
where length(asciistr(nvalue))!=length(nvalue)  
order by nvalue;