Finding and removing Non-ASCII characters from an Oracle Varchar2
Solution 1:
I think this will do the trick:
SELECT REGEXP_REPLACE(COLUMN, '[^[:print:]]', '')
Solution 2:
If you use the ASCIISTR
function to convert the Unicode to literals of the form \nnnn
, you can then use REGEXP_REPLACE
to strip those literals out, like so...
UPDATE table SET field = REGEXP_REPLACE(ASCIISTR(field), '\\[[:xdigit:]]{4}', '')
...where field and table are your field and table names respectively.
Solution 3:
I wouldn't recommend it for production code, but it makes sense and seems to work:
SELECT REGEXP_REPLACE(COLUMN,'[^' || CHR(1) || '-' || CHR(127) || '],'')
Solution 4:
In a single-byte ASCII-compatible encoding (e.g. Latin-1), ASCII characters are simply bytes in the range 0 to 127. So you can use something like [\x80-\xFF]
to detect non-ASCII characters.
Solution 5:
The select may look like the following sample:
select nvalue from table
where length(asciistr(nvalue))!=length(nvalue)
order by nvalue;