Introducing non-ground vars into a Prolog query for a custom DSL query language

I have written an external DSL with SWI-Prolog that works by parsing text with a DCG, transforms parsed expressions into facts that get asserted into the Prolog process, then exposes a query language to the user using the same DCG grammar to query against the facts.

I am stuck trying to figure out how to translate a ground term resulting from a DCG-based parser into a non-ground term with variables that can be passed into a functor like findall/3 to return a list of query results for the user.

Here is an example of a dataset that could be queried:

congruent(const(aa), const(bb)).
congruent(const(bb), const(cc)).
congruent(const(cc), const(dd)).

and here is an example of a normalized query term from the parser:

congruent(const(aa), wildcard).

So I presumably need some way to transform wildcard from an atom into a free variable that I can pass to findall/3 (or similar). For example:

findall(X, congruent(const(aa), X), Result).

My problem arises when trying to substitute wildcard with a free variable. AFAIK, a functor can't return a completely unbound free variable, so I can't create any way to programmatically descend a complex query term to interpolate free variables wherever a wildcard atom is found, then pass that into a functor like findall/3.

Here is a more complete (albeit contrived) example that illustrates the way I'm trying to introduce variables into a query:

main :-
    query(congruent(const(aa), wildcard), Result),
    format("Result: ~p~n", [Result]).

query(QueryTerm, Result) :-
    % This is what I cannot figure out how to implement...
    xform_wildcards_to_free_vars(QueryTerm, QueryWithVars),

    term_variables(QueryWithVars, FreeVars),
    findall(FreeVars, QueryWithVars, Result).

% This is a contrived fact that transforms only one specific
% type of query, but even this wouldn't work because Wildcard
% would be considered a singleton.
xform_wildcards_to_free_vars(
        congruent(const(X), wildcard),
        congruent(const(X), Wildcard)
).

I have pored over the SWI-Prolog docs trying to find some functor that lets me transform ground terms into non-ground terms but I have been spinning my wheels unable to find anything. What am I missing? Surely this is a common enough use-case of Prolog that it supports this.

EDIT: Example testcase

Guy Coder below asked me to provide a test case, so here it is. It was too long to include in a comment.

:- table congruent/2.

congruent(A, C) :- congruent(A, B),
                   congruent(B, C).

congruent(const(aa), const(bb)).
congruent(const(bb), const(cc)).
congruent(const(cc), const(dd)).
congruent(const(cc), const(ee)).
congruent(const(ee), const(ff)).
congruent(const(yy), const(zz)).

assert_congruences :-
    assertion(congruent(const(aa), const(bb))),
    assertion(congruent(const(bb), const(dd))),
    assertion(congruent(const(bb), const(ee))),
    assertion(congruent(const(bb), const(ff))),
    assertion(not(congruent(const(bb), const(yy)))),
    assertion(not(congruent(const(bb), const(zz)))).

assert_query :-
    query(congruent(const(aa), wildcard), Results),
    assertion(member(const(ff), Results)).

The last line above shows how the hypothetical query/2 functor would return a list of terms which would be returned if findall/3 were used like above with a free variable introduced where wildcard is found.

I do need this to work for complex terms that can have one or more wildcards deeply nested in a term. My examples above just show a transitive congruent/2 predicate, but I have other much more logically complex predicates that may also be used to query.

EDIT 2: Implementation works! Thanks slago and Isabelle!

Ok, I finally got it working! Here is what I came up with:

:- module(repl, []).

:- table congruent/2.

congruent(A, C) :- congruent(A, B),
                   congruent(B, C).

congruent(A, B) :-
    congruences(List),
    nth0(IndexA, List, A),
    nth0(IndexB, List, B),
    IndexA < IndexB.

congruences([const(aa), const(bb), const(cc), const(dd)]).
congruences([const(cc), const(ee), const(ff)]).
congruences([const(yy), const(zz)]).

related(const(aa), eins).
related(const(cc), zwei).
related(const(ff), drei).
related(const(zz), vier).

main :-
    FooQuery = ( congruent(const(aa), wildcard(foo)) ),
    BarQuery = ( related(wildcard(foo), wildcard(bar)) ),

    query_and_log(FooQuery),
    query_and_log(BarQuery),
    query_and_log((FooQuery,BarQuery)).

query_and_log(QueryTerm) :-
    query(QueryTerm, Result),
    format("~nQUERY: ~p~nRESULT:~p~n~n-----~n", [QueryTerm, Result]).

query(QueryTerm, Result) :-
    prepare_query(QueryTerm, NamedVars, QueryWithVars),
    findall(NamedVars, QueryWithVars, Result), !.

prepare_query(Term, NamedVars, TermWithVars) :-
    dict_create(EmptyDict, vars, []),
    xform_wildcards_to_free_vars(Term, EmptyDict-[], VarsDict-[TermWithVars]),
    format("~nPREPARED QUERY VARS: ~p~nPREPARED QUERY TERMS: ~p~n",
           [VarsDict, TermWithVars]),
    dict_pairs(VarsDict, vars, NamedVars).

% Compound terms must be recursively searched for new wildcards while
% reusing any previously created wildcard fresh variables.
xform_wildcards_to_free_vars(Term, PrevVars-PrevTerms, NewVars-NewTerms) :-
    compound(Term),
    Term =.. [TermName|SubTerms],
    TermName \= wildcard,
    foldl(
        xform_wildcards_to_free_vars,
        SubTerms,
        PrevVars-[],
        NewVars-SubTermsWithVars
    ),
    TermWithVars =.. [TermName|SubTermsWithVars],
    append(PrevTerms, [TermWithVars], NewTerms).

% Atomic terms are emitted as-is into NewTerms leaving Vars unchanged
xform_wildcards_to_free_vars(Term, Vars-PrevTerms, Vars-NewTerms) :-
    atomic(Term),
    not(Term =.. [wildcard|_]),
    append(PrevTerms, [Term], NewTerms).

% When Name is already found in Vars, re-use the same variable
xform_wildcards_to_free_vars(wildcard(Name), Vars-PrevTerms, Vars-NewTerms) :-
    get_dict(Name, Vars, ReusedVar),
    append(PrevTerms, [ReusedVar], NewTerms).

% When Name isn't in PrevVars, create a new fresh var in NewVars for it
xform_wildcards_to_free_vars(wildcard(Name), PrevVars-PrevTerms, NewVars-NewTerms) :-
    not(get_dict(Name, PrevVars, _)),
    put_dict(Name, PrevVars, FreshVar, NewVars),
    append(PrevTerms, [FreshVar], NewTerms).

% Re-write wildcard with no args as wildcard('?')
xform_wildcards_to_free_vars(wildcard, V1-T1, V2-T2) :-
    xform_wildcards_to_free_vars(wildcard('?'), V1-T1, V2-T2).

There are a few changes here that are worth mentioning from the previous code examples I have shared:

  • There is now a related/2 predicate with some simple dummy data that's used for a join
  • Wildcards now have atom names to differentiate them from each other
  • Congruences work semantically the same but there is now a congruences/1 predicate. This is a superficial change.
  • xform_wildcards_to_free_vars/3 now folds over a term's sub-terms ("args") with a dict to keep track of previously created fresh vars. Without this, each occurrence of wildcard(foo) would have its own unique fresh var instead of reusing a previous var.

And here is the output from main/0:

?- repl:main.

PREPARED QUERY VARS: vars{foo:_8244}
PREPARED QUERY TERMS: congruent(const(aa),_8244)

QUERY: congruent(const(aa),wildcard(foo))
RESULT:[[foo-const(ff)],[foo-const(dd)],[foo-const(ee)],[foo-const(bb)],[foo-const(cc)]]

-----

PREPARED QUERY VARS: vars{bar:_8616,foo:_8578}
PREPARED QUERY TERMS: related(_8578,_8616)

QUERY: related(wildcard(foo),wildcard(bar))
RESULT:[[bar-eins,foo-const(aa)],[bar-zwei,foo-const(cc)],[bar-drei,foo-const(ff)],[bar-vier,foo-const(zz)]]

-----

PREPARED QUERY VARS: vars{bar:_9236,foo:_9120}
PREPARED QUERY TERMS: congruent(const(aa),_9120),related(_9120,_9236)

QUERY: congruent(const(aa),wildcard(foo)),related(wildcard(foo),wildcard(bar))
RESULT:[[bar-drei,foo-const(ff)],[bar-zwei,foo-const(cc)]]

-----
true.

This was a tricky problem but I'm very relieved it works so well! Slago and Isabelle were very helpful to me finding this approach!


Solution 1:

Another possibility would be to recursively transform a term into a new equivalent term, replacing wildcards with fresh variables.

% wildcards_to_variables(++Term, --NewTerm, --Variables)

  wildcards_to_variables(wildcard, Variable, [Variable]) :- !.
  wildcards_to_variables(Term, Term, []) :- atomic(Term), !.
  wildcards_to_variables(Term, NewTerm, Variables) :-
      compound(Term),
      compound_name_arguments(Term, Name, Args),
      maplist(wildcards_to_variables, Args, NewArgs, Vars),
      compound_name_arguments(NewTerm, Name, NewArgs),
      append(Vars, Variables).

Examples:

?- wildcards_to_variables(congruent(const(aa), wildcard), NewTerm, Variables).
NewTerm = congruent(const(aa), _A),
Variables = [_A].

?- wildcards_to_variables(congruent(const(wildcard), wildcard), NewTerm, Variables).
NewTerm = congruent(const(_A), _B),
Variables = [_A, _B].

?- wildcards_to_variables(and(congruent(const(aa),wildcard), congruent(wildcard,const(wildcard))), NewTerm, Variables).
NewTerm = and(congruent(const(aa), _A), congruent(_B, const(_C))),
Variables = [_A, _B, _C].

Thus, the query predicate query/2 can be defined as:

query(Term, Results) :-
    wildcards_to_variables(Term, NewTerm, Vars),
    findall(Vars, NewTerm, Results).

Solution 2:

Singleton variables are one way to go here. Your solution is actually fine and gives no warning if you mark the singleton variable as such with an underscore:

xform_wildcards_to_free_vars(
        congruent(const(X), wildcard),
        congruent(const(X), _Wildcard)
).

This does "return [a term containing a] a completely unbound free variable". It will give the result you seem to expect (with your original main/0):

?- main.
Result: [[const(dd)],[const(ee)],[const(bb)],[const(cc)],[const(ff)]]
true.

Singleton variables are fairly useless in pure Prolog because they cannot be shared, so even if a singleton variable is bound to something, you cannot inspect that binding later. But your use of the impure term_variables/2 does introduce sharing for a previously unshared variable, so your program works as expected. It's just a bit misleading for the reader that a variable marked as singleton does become shared later.

An alternative would be to expose the variable as an extra argument, if there is only one variable:

xform_wildcards_to_free_vars(
        congruent(const(X), wildcard),
        congruent(const(X), Wildcard),
        Wildcard
).

query(QueryTerm, Result) :-
    xform_wildcards_to_free_vars(QueryTerm, QueryWithVars, Var),
    findall(Var, QueryWithVars, Result).

This does the same as your version, but is purer and arguably cleaner, because the sharing of the Wildcard variable is shown clearly. However, your query transformation predicate will involve more bookkeeping if you can have several different wildcards that should correspond to different variables.

Solution 3:

I think a more general solution would be to change the grammar so that it inserts and collects the necessary variables during the parsing of sentences.

Just to illustrate the point (without too much concern for the quality of grammar and code presented below), you could try something like this:

binrel(Vars-Term) -->                   % a simple binary relation
    entity(X),
    relation(R),
    entity(Y),
    { include(var, [X,Y], Vars),        % include only variables in Vars list
      Term =.. [R, X, Y] }.

entity(_)   --> [wildcard], !.          % use a fresh variable for each wildcard!
entity(C)   --> [C], { C \= wildcard }.
relation(R) --> [R], { R \= wildcard }.

db_tell(Text) :-
    tokenize_atom(Text, Tokens),
    phrase(binrel(_-Term), Tokens),
    functor(Term, Name, Arity),
    dynamic(Name/Arity),
    assertz(Term).

db_ask(Text, Results) :-
    tokenize_atom(Text, Tokens),
    phrase(binrel(Vars-Term), Tokens),
    findall(Vars, Term, Results).

db_create :-
    db_tell('ann likes apple'),
    db_tell('ann likes orange'),
    db_tell('bob likes banana'),
    db_tell('coy likes banana').

Examples:

?- db_create.
true.

?- db_ask('ann likes wildcard', Results).
Results = [[apple], [orange]].

?- db_ask('bob likes wildcard', Results).
Results = [[banana]].

?- db_ask('wildcard likes orange', Results).
Results = [[ann]].

?- db_ask('wildcard likes wildcard', Results).
Results = [[ann, apple], [ann, orange], [bob, banana], [coy, banana]].