Reification of term equality/inequality

Pure Prolog programs that distinguish between the equality and inequality of terms in a clean manner suffer from execution inefficiencies ; even when all terms of relevance are ground.

A recent example on SO is this answer. All answers and all failures are correct in this definition. Consider:

?- Es = [E1,E2], occurrences(E, Es, Fs).
Es = Fs, Fs = [E, E],
E1 = E2, E2 = E ;
Es = [E, E2],
E1 = E,
Fs = [E],
dif(E, E2) ;
Es = [E1, E],
E2 = E,
Fs = [E],
dif(E, E1) ;
Es = [E1, E2],
Fs = [],
dif(E, E1),
dif(E, E2).

While the program is flawless from a declarative viewpoint, its direct execution on current systems like B, SICStus, SWI, YAP is unnecessarily inefficient. For the following goal, a choicepoint is left open for each element of the list.

?- occurrences(a,[a,a,a,a,a],M).
M = [a, a, a, a, a] ;
false.

This can be observed by using a sufficiently large list of as as follows. You might need to adapt the I such that the list can still be represented ; in SWI this would mean that

1mo the I must be small enough to prevent a resource error for the global stack like the following:

?- 24=I,N is 2^I,length(L,N), maplist(=(a),L).
ERROR: Out of global stack

2do the I must be large enough to provoke a resource error for the local stack:

?- 22=I,N is 2^I,length(L,N), maplist(=(a),L), ( Length=ok ; occurrences(a,L,M) ).
I = 22,
N = 4194304,
L = [a, a, a, a, a, a, a, a, a|...],
Length = ok ;
ERROR: Out of local stack

To overcome this problem and still retain the nice declarative properties some comparison predicate is needed.


How should this comparison predicate be defined?

Here is such a possible definition:

equality_reified(X, X, true).
equality_reified(X, Y, false) :-
   dif(X, Y).

Edit: Maybe the argument order should be reversed similar to the ISO built-in compare/3 (link links to draft only).

An efficient implementation of it would handle the fast determinate cases first:

equality_reified(X, Y, R) :- X == Y, !, R = true.
equality_reified(X, Y, R) :- ?=(X, Y), !, R = false. % syntactically different
equality_reified(X, Y, R) :- X \= Y, !, R = false. % semantically different
equality_reified(X, X, true).
equality_reified(X, Y, false) :-
   dif(X, Y).

Edit: it is not clear to me whether or not X \= Y is a suitable guard in the presence of constraints. Without constraints, ?=(X, Y) or X \= Y are the same.


Example

As suggested by @user1638891, here is an example how one might use such a primitive. The original code by mats was:

occurrences_mats(_, [], []).
occurrences_mats(X, [X|Ls], [X|Rest]) :-
   occurrences_mats(X, Ls, Rest).
occurrences_mats(X, [L|Ls], Rest) :-
   dif(X, L),
   occurrences_mats(X, Ls, Rest).

Which can be rewritten to something like:

occurrences(_, [], []).
occurrences(E, [X|Xs], Ys0) :-
   reified_equality(Bool, E, X),
   ( Bool == true -> Ys0 = [X|Ys] ; Ys0 = Ys ),
   % ( Bool = true, Ys0 = [X|Ys] ; Bool = true, Ys0 = Ys ),
   occurrences(E, Xs, Ys).

reified_equality(R, X, Y) :- X == Y, !, R = true.
reified_equality(R, X, Y) :- ?=(X, Y), !, R = false.
reified_equality(true, X, X).
reified_equality(false, X, Y) :-
   dif(X, Y).

Please note that SWI's second-argument indexing is only activated, after you enter a query like occurrences(_,[],_). Also, SWI need the inherently nonmonotonic if-then-else, since it does not index on (;)/2 – disjunction. SICStus does so, but has only first argument indexing. So it leaves one (1) choice-point open (at the end with []).


Well for one thing, the name should be more declarative, like equality_truth/2.


The following code is based on if_/3 and (=)/3 (a.k.a. equal_truth/3), as implemented by @false in Prolog union for A U B U C:

=(X, Y, R) :- X == Y,    !, R = true.
=(X, Y, R) :- ?=(X, Y),  !, R = false. % syntactically different
=(X, Y, R) :- X \= Y,    !, R = false. % semantically different
=(X, Y, R) :- R == true, !, X = Y.
=(X, X, true).
=(X, Y, false) :-
   dif(X, Y).

if_(C_1, Then_0, Else_0) :-
   call(C_1, Truth),
   functor(Truth,_,0),  % safety check
   ( Truth == true -> Then_0 ; Truth == false, Else_0 ).

Compared to occurrences/3, the auxiliary occurrences_aux/3 uses a different argument order that passes the list Es as the first argument, which can enable first-argument indexing:

occurrences_aux([], _, []).
occurrences_aux([X|Xs], E, Ys0) :-
   if_(E = X, Ys0 = [X|Ys], Ys0 = Ys),
   occurrences_aux(Xs, E, Ys).

As pointed out by @migfilg, the goal Fs=[1,2], occurrences_aux(Es,E,Fs) should fail, as it is logically false: occurrences_aux(_,E,Fs) states that all elements in Fs are equal to E. However, on its own, occurrences_aux/3 does not terminate in cases like this.

We can use an auxiliary predicate allEqual_to__lazy/2 to improve termination behaviour:

allEqual_to__lazy(Xs,E) :-
   freeze(Xs, allEqual_to__lazy_aux(Xs,E)).

allEqual_to__lazy_aux([],_).
allEqual_to__lazy_aux([E|Es],E) :-
   allEqual_to__lazy(Es,E).

With all auxiliary predicates in place, let's define occurrences/3:

occurrences(E,Es,Fs) :-
   allEqual_to__lazy(Fs,E),    % enforce redundant equality constraint lazily
   occurrences_aux(Es,E,Fs).   % flip args to enable first argument indexing

Let's have some queries:

?- occurrences(E,Es,Fs).       % first, the most general query
Es = Fs, Fs = []        ;
Es = Fs, Fs = [E]       ;
Es = Fs, Fs = [E,E]     ;
Es = Fs, Fs = [E,E,E]   ;
Es = Fs, Fs = [E,E,E,E] ...    % will never terminate universally, but ...
                               % that's ok: solution set size is infinite

?- Fs = [1,2], occurrences(E,Es,Fs).
false.                         % terminates thanks to allEqual_to__lazy/2

?- occurrences(E,[1,2,3,1,2,3,1],Fs).
Fs = [1,1,1],     E=1                     ;
Fs = [2,2],                 E=2           ;
Fs = [3,3],                           E=3 ;
Fs = [],      dif(E,1), dif(E,2), dif(E,3).

?- occurrences(1,[1,2,3,1,2,3,1],Fs).
Fs = [1,1,1].                  % succeeds deterministically

?- Es = [E1,E2], occurrences(E,Es,Fs).
Es = [E,  E], Fs = [E,E],     E1=E ,     E2=E  ;
Es = [E, E2], Fs = [E],       E1=E , dif(E2,E) ;
Es = [E1, E], Fs = [E],   dif(E1,E),     E2=E  ;
Es = [E1,E2], Fs = [],    dif(E1,E), dif(E2,E).

?- occurrences(1,[E1,1,2,1,E2],Fs).
    E1=1 ,     E2=1 , Fs = [1,1,1,1] ;
    E1=1 , dif(E2,1), Fs = [1,1,1]   ;
dif(E1,1),     E2=1 , Fs = [1,1,1]   ;
dif(E1,1), dif(E2,1), Fs = [1,1].

Edit 2015-04-27

Some more queries for testing if the occurrences/3 universal terminates in certain cases:

?-           occurrences(1,L,[1,2]).
false. 
?- L = [_|_],occurrences(1,L,[1,2]).
false.
?- L = [X|X],occurrences(1,L,[1,2]).
false.
?- L = [L|L],occurrences(1,L,[1,2]).
false.

It seems to be best to call this predicate with the same arguments (=)/3. In this manner, conditions like if_/3 are now much more readable. And to use rather the suffix _t in place of _truth:

memberd_t(_X, [], false).
memberd_t(X, [Y|Ys], Truth) :-
   if_( X = Y, Truth=true, memberd_t(X, Ys, Truth) ).

Which used to be:

memberd_truth(_X, [], false).
memberd_truth(X, [Y|Ys], Truth) :-
   if_( equal_truth(X, Y), Truth=true, memberd_truth(X, Ys, Truth) ).

UPDATE: This answer has been superseded by mine of 18 April. I do not propose that it be deleted because of the comments below.

My previous answer was wrong. The following one runs against the test case in the question and the implementation has the desired feature of avoiding superfluous choice-points. I assume the top predicate mode to be ?,+,? although other modes could easily be implemented.

The program has 4 clauses in all: the list in the 2nd argument is visited and for each member there are two possibilities: it either unifies with the 1st argument of the top predicate or is different from it in which case a dif constraint applies:

occurrences(X, L, Os) :- occs(L, X, Os).

occs([],_,[]).
occs([X|R], X, [X|ROs]) :- occs(R, X, ROs).
occs([X|R], Y, ROs) :- dif(Y, X), occs(R, Y, ROs).

Sample runs, using YAP:

?- occurrences(1,[E1,1,2,1,E2],Fs).
E1 = E2 = 1,
Fs = [1,1,1,1] ? ;
E1 = 1,
Fs = [1,1,1],
dif(1,E2) ? ;
E2 = 1,
Fs = [1,1,1],
dif(1,E1) ? ;
Fs = [1,1],
dif(1,E1),
dif(1,E2) ? ;
no  

?- occurrences(E,[E1,E2],Fs).
E = E1 = E2,
Fs = [E,E] ? ;
E = E1,
Fs = [E],
dif(E,E2) ? ;
E = E2,
Fs = [E],
dif(E,E1) ? ;
Fs = [],
dif(E,E1),
dif(E,E2) ? ;
no