Prolog: First duplicate value

I need to find the first duplicate value in a list.

prep(3,[1,3,5,3,5]). Should be true.

prep(5,[1,3,5,3,5]). Should be false.

I thought checking for equality with the current value and the previous list members until I find a duplicate, if it finds one it will test for equality with X but I have no idea how to do that in Prolog!

I appreciate any help! Thanks


Here is a pure version using dif/2 which implements sound inequality. dif/2 is offered by B-Prolog, YAP-Prolog, SICStus-Prolog and SWI-Prolog.

firstdup(E, [E|L]) :-
    member(E, L).
firstdup(E, [N|L]) :-
    non_member(N, L),
    firstdup(E, L).

member(E, [E|_L]).
member(E, [_X|L]) :-
    member(E, L).

non_member(_E, []).
non_member(E, [F|Fs]) :-
    dif(E, F),
    non_member(E, Fs).

The advantages are that it can also be used with more general queries:

?- firstdup(E, [A,B,C]).
E = A, A = B ;
E = A, A = C ;
E = C,
B = C,
dif(A, C) ;
false.

Here, we get three answers: A is the duplicate in the first two answers, but on two different grounds: A might be equal to B or C. In the third answer, B is the duplicate, but it will only be a duplicate if C is different to A.

To understand the definition, read :- as an arrow ← So what is on the right-hand side is what you know and on the left is what you conclude. It is often in the beginning a bit irritating to read predicates in that direction, after all you might be tempted to follow "the thread of execution". But often this thread leads to nowhere – it gets too complex to understand.

The first rule reads:

Provided E is an element of list L we conclude that [E|L] has E as first duplicate.

The second rule reads:

Provided E is the first duplicate of L (don't panic here and say we don't know that ...) and provided that N is not an element of L we conclude that [N|L] has E as first duplicate.

The member/2 predicate is provided in many Prolog systems and non_member(X,L) can be defined as maplist(dif(X),L). Thus one would define firstdup/2 rather as:

firstdup(E, [E|L]) :-
    member(E, L).
firstdup(E, [N|L]) :-
    maplist(dif(N), L),
    firstdup(E, L).

In this answer we improve the logically pure code presented in this earlier answer. Step-by-step:

  1. We combine two predicates memberd/2 and non_member/2 to one, memberd_t/3, which uses an additional argument for reifying list membership into a truth-value (true or false).

    memberd_t/3 is equivalent to memberd/2 + non_member/2, so we could define it like this:

    memberd_t(X,Xs,true) :-
       memberd(X,Xs).
    memberd_t(X,Xs,false) :-
       non_member(X,Xs).
    

    Or, vice versa, we could define memberd/2 and non_member/2 like so:

    memberd(X,Xs) :-
       memberd_t(X,Xs,true).
    
    non_member(X,Xs) :-
       memberd_t(X,Xs,false).
    

    In practise, we use a tuned implementation of memberd_t/3—one with better determinism.

    Let's see that memberd_t/3 in fact covers both memberd/2 and non_member/2!

    ?- memberd_t(X,[1,2,3],T).
      T = true ,     X=1
    ; T = true ,               X=2
    ; T = true ,                         X=3
    ; T = false, dif(X,1), dif(X,2), dif(X,3).
    
  2. Take the following variant of predicate firstdup/2 (defined earlier) as our starting point:

    firstdup(E,[X|Xs]) :-
       (  memberd(X,Xs),
          E=X      
       ;  non_member(X,Xs),
          firstdup(E,Xs)
       ).
    
  3. Let's replace the use of memberd/2 and non_member/2 with memberd_t/3!

    firstdup(E,[X|Xs]) :-
       (  memberd_t(X,Xs,true),
          E=X
       ;  memberd_t(X,Xs,false),
          firstdup(E,Xs)
       ).
    
  4. Let's hoist memberd_t/3!

    firstdup(E,[X|Xs]) :-
       memberd_t(X,Xs,T),
       (  T=true
       -> E=X      
       ;  T=false,
          firstdup(E,Xs)
       ).
    
  5. Above pattern pred_t(OtherArgs,T), (T = true -> Then ; T = false, Else) can be expressed more concisely using if_/3, writing if_(pred_t(OtherArgs),Then,Else).

    firstdup(E,[X|Xs]) :-
       if_(memberd_t(X,Xs),
           E=X,
           firstdup(E,Xs)).
    

Let's run some queries!

?- firstdup(1,[1,2,3,1]).
true.                                % succeeds deterministically

?- firstdup(X,[1,2,3,1]).
X = 1.                               % succeeds deterministically

?- firstdup(X,[A,B,C]).              % succeeds, leaving behind a choicepoint
      A=X ,     B=X                  % ... to preserve the full solution set.
;     A=X , dif(B,X), C=X
; dif(A,X),     B=X , C=X
; false.