LISTAGG in Oracle to return distinct values
I am trying to use the LISTAGG
function in Oracle. I would like to get only the distinct values for that column. Is there a way in which I can get only the distinct values without creating a function or a procedure?
col1 col2 Created_by 1 2 Smith 1 2 John 1 3 Ajay 1 4 Ram 1 5 Jack
I need to select col1 and the LISTAGG
of col2 (column 3 is not considered). When I do that, I get something like this as the result of LISTAGG
: [2,2,3,4,5]
I need to remove the duplicate '2' here; I need only the distinct values of col2 against col1.
Solution 1:
19c and later:
select listagg(distinct the_column, ',') within group (order by the_column)
from the_table
18c and earlier:
select listagg(the_column, ',') within group (order by the_column)
from (
select distinct the_column
from the_table
) t
If you need more columns, something like this might be what you are looking for:
select col1, listagg(col2, ',') within group (order by col2)
from (
select col1,
col2,
row_number() over (partition by col1, col2 order by col1) as rn
from foo
order by col1,col2
)
where rn = 1
group by col1;
Solution 2:
From oracle 19C it is built in see here
From 18C and earlier try within group see here
Otherwise use regular expressions
Here's how to solve your issue.
select
regexp_replace(
'2,2,2.1,3,3,3,3,4,4'
,'([^,]+)(,\1)*(,|$)', '\1\3')
from dual
returns
2,2.1,3,4
ANSWER below:
select col1,
regexp_replace(
listagg(
col2 , ',') within group (order by col2) -- sorted
,'([^,]+)(,\1)*(,|$)', '\1\3') )
from tableX
where rn = 1
group by col1;
Note: The above will work in most cases - list should be sorted , you may have to trim all trailing and leading space depending on your data.
If you have a alot of items in a group > 20 or big string sizes you might run into oracle string size limit 'result of string concatenation is too long'.
From oracle 12cR2 you can suppress this error see here. Alternatively put a max number on the members in each group. This will only work if its ok to list only the first members. If you have very long variable strings this may not work. you will have to experiment.
select col1,
case
when count(col2) < 100 then
regexp_replace(
listagg(col2, ',') within group (order by col2)
,'([^,]+)(,\1)*(,|$)', '\1\3')
else
'Too many entries to list...'
end
from sometable
where rn = 1
group by col1;
Another solution (not so simple) to hopefully avoid oracle string size limit - string size is limited to 4000. Thanks to this post here by user3465996
select col1 ,
dbms_xmlgen.convert( -- HTML decode
dbms_lob.substr( -- limit size to 4000 chars
ltrim( -- remove leading commas
REGEXP_REPLACE(REPLACE(
REPLACE(
XMLAGG(
XMLELEMENT("A",col2 )
ORDER BY col2).getClobVal(),
'<A>',','),
'</A>',''),'([^,]+)(,\1)*(,|$)', '\1\3'),
','), -- remove leading XML commas ltrim
4000,1) -- limit to 4000 string size
, 1) -- HTML.decode
as col2
from sometable
where rn = 1
group by col1;
V1 - some test cases - FYI
regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)+', '\1')
-> 2.1,3,4 Fail
regexp_replace('2 ,2 ,2.1,3 ,3 ,4 ,4 ','([^,]+)(,\1)+', '\1')
-> 2 ,2.1,3,4 Success - fixed length items
V2 -items contained within items eg. 2,21
regexp_replace('2.1,1','([^,]+)(,\1)+', '\1')
-> 2.1 Fail
regexp_replace('2 ,2 ,2.1,1 ,3 ,4 ,4 ','(^|,)(.+)(,\2)+', '\1\2')
-> 2 ,2.1,1 ,3 ,4 -- success - NEW regex
regexp_replace('a,b,b,b,b,c','(^|,)(.+)(,\2)+', '\1\2')
-> a,b,b,c fail!
v3 - regex thank Igor! works all cases.
select
regexp_replace('2,2,2.1,3,3,4,4','([^,]+)(,\1)*(,|$)', '\1\3') ,
---> 2,2.1,3,4 works
regexp_replace('2.1,1','([^,]+)(,\1)*(,|$)', '\1\3'),
--> 2.1,1 works
regexp_replace('a,b,b,b,b,c','([^,]+)(,\1)*(,|$)', '\1\3')
---> a,b,c works
from dual
Solution 3:
you can use undocumented wm_concat
function.
select col1, wm_concat(distinct col2) col2_list
from tab1
group by col1;
this function returns clob column, if you want you can use dbms_lob.substr
to convert clob to varchar2.