PL/SQL comma delimited list; remove dups and put in array

Solution 1:

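There is a well-known SQL trick for turning a comma-separated list into rows: a hierarchical query (CONNECT BY) against DUAL that uses REGEXP_SUBSTR to pull out one element per LEVEL. Use that trick, add a DISTINCT keyword, and BULK COLLECT the results into your array (I assume you mean a collection).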

DECLARE
  p_test_string   VARCHAR2 (4000) := 'A,B,C,B,B,D';

  TYPE string_array_type IS TABLE OF VARCHAR2 (4000);

  l_array         string_array_type;
BEGIN
  SELECT DISTINCT REGEXP_SUBSTR (p_test_string,
                        '[^,]+',
                        1,
                        LEVEL)
  BULK   COLLECT INTO l_array
  FROM   DUAL
  CONNECT BY REGEXP_SUBSTR (p_test_string,
                            '[^,]+',
                            1,
                            LEVEL)
               IS NOT NULL
  ORDER BY 1;

  DBMS_OUTPUT.put_line ('l_array.count = ' || l_array.COUNT);
  DBMS_OUTPUT.put_line ('l_array(2) = ' || l_array (2));
END;

Output:

l_array.count = 4
l_array(2) = B
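
One caveat with the pattern above: '[^,]+' only matches non-empty runs of characters, so empty elements (two commas in a row) are skipped, and any spaces around an element are kept as part of it. The following is a minimal sketch of the same query with TRIM added, assuming the input may contain stray spaces; the sample string is made up for illustration.

```sql
DECLARE
  -- Hypothetical input: stray spaces around B and one empty element.
  p_test_string   VARCHAR2 (4000) := 'A, B ,C,B,,D';

  TYPE string_array_type IS TABLE OF VARCHAR2 (4000);

  l_array         string_array_type;
BEGIN
  -- Same row-generating trick as above; TRIM strips the surrounding
  -- spaces so that ' B ' and 'B' collapse into a single value.
  SELECT DISTINCT TRIM (REGEXP_SUBSTR (p_test_string, '[^,]+', 1, LEVEL))
  BULK   COLLECT INTO l_array
  FROM   DUAL
  CONNECT BY REGEXP_SUBSTR (p_test_string, '[^,]+', 1, LEVEL) IS NOT NULL
  ORDER BY 1;

  -- Print every element of the de-duplicated collection.
  FOR i IN 1 .. l_array.COUNT
  LOOP
    DBMS_OUTPUT.put_line ('l_array(' || i || ') = ' || l_array (i));
  END LOOP;
END;
/
```

With this input the collection should hold A, B, C and D. If empty elements matter to you, this regular expression is not a good fit, because it never returns them.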

Solution 2:

There are several ways to split a delimited string. One is a simple PL/SQL function:

CREATE TYPE string_list IS TABLE OF VARCHAR2(4000);
/

CREATE OR REPLACE FUNCTION split_String(
  i_str    IN  VARCHAR2,
  i_delim  IN  VARCHAR2 DEFAULT ','
) RETURN STRING_LIST DETERMINISTIC
AS
  p_result       STRING_LIST := STRING_LIST();
  p_start        NUMBER(5) := 1;
  p_end          NUMBER(5);
  c_len CONSTANT NUMBER(5) := LENGTH( i_str );
  c_ld  CONSTANT NUMBER(5) := LENGTH( i_delim );
BEGIN
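  IF c_len > 0 THEN
    -- Find the first delimiter, then emit one element per delimiter found.
    p_end := INSTR( i_str, i_delim, p_start );
    WHILE p_end > 0 LOOP
      p_result.EXTEND;
      -- The element is the text between the previous delimiter and this one.
      p_result( p_result.COUNT ) := SUBSTR( i_str, p_start, p_end - p_start );
      -- Step over the delimiter, which may be longer than one character.
      p_start := p_end + c_ld;
      p_end := INSTR( i_str, i_delim, p_start );
    END LOOP;
    -- Emit whatever follows the last delimiter (NULL if the string ends with one).
    IF p_start <= c_len + 1 THEN
      p_result.EXTEND;
      p_result( p_result.COUNT ) := SUBSTR( i_str, p_start, c_len - p_start + 1 );
    END IF;
  END IF;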
  RETURN p_result;
END;
/

This is a pure PL/SQL function that uses simple string functions (INSTR and SUBSTR), so it avoids both the more expensive regular expressions and the context switch from PL/SQL into the SQL engine.
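
As a quick check of how the function behaves, here is a minimal sketch that calls it from an anonymous block and prints each element. It assumes the string_list type and split_String function above have already been created; the input string is made up for illustration.

```sql
DECLARE
  l_items string_list;
BEGIN
  -- Hypothetical input containing one empty element between B and C.
  l_items := split_String( 'A,B,,C' );

  -- Brackets make an empty (NULL) element visible in the output.
  FOR i IN 1 .. l_items.COUNT LOOP
    DBMS_OUTPUT.put_line( i || ': [' || l_items( i ) || ']' );
  END LOOP;
END;
/
```

Unlike the '[^,]+' regular expression in Solution 1, this function keeps empty elements: the call above returns four elements, the third of which is NULL.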

There is also a built-in function, SET( collection_value ), that removes duplicates from a nested table collection:

SET( STRING_LIST( 'A', 'B', 'A', 'C', 'B' ) )

Will give the collection:

STRING_LIST( 'A', 'B', 'C' )

So, if you want to split a delimited string and de-duplicate it, you can just do:

SET( split_String( 'A,B,C,A,B,D,C,E' ) )

Which will give you:

STRING_LIST( 'A', 'B', 'C', 'D', 'E' )
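
Putting the two together in PL/SQL, which is what the question asked for, might look like the sketch below. It assumes the string_list type and split_String function above exist; the variable name l_unique is only for illustration.

```sql
DECLARE
  l_unique string_list;
BEGIN
  -- Split the list, then let SET drop the duplicate elements.
  l_unique := SET( split_String( 'A,B,C,A,B,D,C,E' ) );

  DBMS_OUTPUT.put_line( 'l_unique.count = ' || l_unique.COUNT );

  FOR i IN 1 .. l_unique.COUNT LOOP
    DBMS_OUTPUT.put_line( 'l_unique(' || i || ') = ' || l_unique( i ) );
  END LOOP;
END;
/
```

If you need the values in a guaranteed order, sort them explicitly, for example by selecting COLUMN_VALUE from TABLE( ... ) with an ORDER BY, rather than relying on the order SET happens to return.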