How to remove substring from the end of string using spark sql?
Solution 1:
You can use regexp_replace
function with this regex:
(\s*BRAND\s*)*$ # removes all brand names that comes at the end
For cases where perfume name is the same as brand name then the output of regexp_replace
will be empty string, using when expression you can check if it's empty then use brand name:
SELECT CASE WHEN trim(regexp_replace(perfume, format_string('(\\s*%s\\s*)*$', brand), '')) <> ''
THEN regexp_replace(perfume, format_string('(\\s*%s\\s*)*$', brand), '')
ELSE brand
END AS perfume,
brand
FROM global_temp.gv_web