Flatten a list in Prolog
I've only been working with Prolog for a couple days. I understand some things but this is really confusing me.
I'm suppose to write a function that takes a list and flattens it.
?- flatten([a,[b,c],[[d],[],[e]]],Xs).
Xs = [a,b,c,d,e]. % expected result
The function takes out the inner structures of the list.
This is what I have so far:
flatten2([],[]).
flatten2([Atom|ListTail],[Atom|RetList]) :-
atom(Atom), flatten2(ListTail,RetList).
flatten2([List|ListTail],RetList) :-
flatten2(List,RetList).
Now, this works when I call:
?- flatten2([a,[b,c],[[d],[],[e]]], R).
R = [a,b,c,d,e]. % works as expected!
But when I call to see if a list that I input is already flattened, is returns false
instead of true
:
?- flatten2([a,[b,c],[[d],[],[e]]], [a,b,c,d,e]).
false. % BAD result!
Why does it work on one hand, but not the other? I feel like I'm missing something very simple.
The definition of flatten2/2
you've given is busted; it actually behaves like this:
?- flatten2([a, [b,c], [[d],[],[e]]], R).
R = [a, b, c] ;
false.
So, given the case where you've already bound R
to [a,b,c,d,e]
, the failure isn't surprising.
Your definition is throwing away the tail of lists (ListTail
) in the 3rd clause - this needs to be tidied up and connected back into the list to return via RetList
. Here is a suggestion:
flatten2([], []) :- !.
flatten2([L|Ls], FlatL) :-
!,
flatten2(L, NewL),
flatten2(Ls, NewLs),
append(NewL, NewLs, FlatL).
flatten2(L, [L]).
This one recursively reduces all lists of lists into either single item lists [x]
, or empty lists []
which it throws away. Then, it accumulates and appends them all into one list again out the output.
Note that, in most Prolog implementations, the empty list []
is an atom and a list, so the call to atom([])
and is_list([])
both evaluate to true; this won't help you throw away empty lists as opposed to character atoms.
You can maintain your lists open-ended, with both a pointer to its start, and an "ending hole ⁄ free pointer" (i.e. logvar) at its end, which you can then instantiate when the end is reached:
flatten2( [], Z, Z):- !. % ---> X
flatten2( [Atom|ListTail], [Atom|X], Z) :- % .
\+is_list(Atom), !, % .
flatten2( ListTail, X, Z). % Y
flatten2( [List|ListTail], X, Z) :- % .
flatten2( List, X, Y), % from X to Y, and then % .
flatten2( ListTail, Y, Z). % from Y to Z % Z --->
You then call it as
flatten2( A, B):- flatten2( A, B, []).
That way there's no need to use reverse
anywhere. This technique is known as "difference lists", but it's much easier just to think about it as "open-ended lists" instead.
update: This is much easier coded using the dcg syntax. Since it is unidirectional (the first argument must be fully ground), why not use cuts after all:
flattn([]) --> [], !.
flattn([A|T]) --> {\+is_list(A)}, [A], !, flattn(T).
flattn([A|T]) --> flattn(A), flattn(T).
Testing:
16 ?- phrase(flattn([a,[b,c],[[d],[],[e]]]), [a, b, c, d, e]).
true.
17 ?- phrase(flattn([a,[b,c],[[d],[],[e]]]), R).
R = [a, b, c, d, e].
18 ?- phrase(flattn([a,[b,X],[[d],[],[e]]]), [a, b, c, d, e]).
X = c.
If the definition were fully declarative, the last one should've succeeded also with X=[c] ; X=[[],c] ; ... ; X=[[c]] ; ...
; alas, it isn't.
(edit2: simplified both versions, thanks to @mat's comments!)
Here's an accumulator based version for completeness:
% flatten/2
flatten(List, Result) :- flatten(List, [], Result).
% auxiliary predicate flatten/3
flatten([], Result, Result).
flatten([Head| Tail], Part, Result) :-
is_list(Head),
!,
flatten(Head, HR),
append(Part, HR, PR),
flatten(Tail, PR, Result).
flatten([Head| Tail], Part, Result) :-
append(Part, [Head], PR),
flatten(Tail, PR, Result).
flatten(X, Part, Result) :-
fail.
Prolog's list notation is syntactic sugar on top of very simple prolog terms. Prolog lists are denoted thus:
The empty list is represented by the atom
[]
. Why? Because that looks like the mathematical notation for an empty list. They could have used an atom likenil
to denote the empty list but they didn't.A non-empty list is represented by the term
.\2
, where the first (leftmost) argument is the head of the list and the second (rightmost) argument is the tail of the list, which is, recursively, itself a list.
Some examples:
-
An empty list:
[]
is represented as the atom it is:[]
-
A list of one element,
[a]
is internally stored as.(a,[])
-
A list of two elements
[a,b]
is internally stored as.(a,.(b,[]))
-
A list of three elements,
[a,b,c]
is internally stored as.(a,.(b,.(c,[])))
Examination of the head of the list is likewise syntactic sugar over the same notation:
[X|Xs]
is identical to.(X,Xs)
[A,B|Xs]
is identical to.(A,.(B,Xs))
[A,B]
is (see above) identical to.(A,.(B,[]))
Myself, I'd write flatten/2
like this:
%------------------------
% public : flatten a list
%------------------------
flatten( X , R ) :-
flatten( X , [] , T ) ,
reverse( T , R )
.
%--------------------------------------------
% private : flatten a list into reverse order
%--------------------------------------------
flatten( [] , R , R ) . % the empty list signals the end of recursion
flatten( [X|Xs] , T , R ) :- % anything else is flattened by
flatten_head( X , T , T1 ) , % - flattening the head, and
flatten( Xs , T1 , R ) % - flattening the tail
. %
%-------------------------------------
% private : flatten the head of a list
%-------------------------------------
flatten_head( X , T , [X|T] ) :- % if the head is a not a list
\+ list(X) , % - simply prepend it to the accumulator.
! . %
flatten_head( X , T , R ) :- % if the head is a list
flatten( X , T , R ) % - recurse down and flatten it.
.
%-----------------------
% what's a list, anyway?
%-----------------------
list( X ) :- var(X) , ! , fail .
list( [] ) .
list( [_|_] ) .
Building on if_//3
and list_truth/2
, we can implement myflatten/2
as follows:
myflatten(Xs,Ys) :-
phrase(myflatten_aux(Xs),Ys).
myflatten_aux([]) --> [].
myflatten_aux([T|Ts]) -->
if_(neither_nil_nor_cons_t(T), [T], myflatten_aux(T)),
myflatten_aux(Ts).
:- use_module(library(dialect/sicstus/block)).
:- block neither_nil_nor_cons(-).
neither_nil_nor_cons(X) :-
\+nil_or_cons(X).
nil_or_cons([]).
nil_or_cons([_|_]).
neither_nil_nor_cons_t(X,Truth) :-
( nonvar(X)
-> ( neither_nil_nor_cons(X) -> Truth = true
; Truth = false
)
; nonvar(Truth)
-> ( Truth == true -> neither_nil_nor_cons(X)
; Truth == false, nil_or_cons(X)
)
; Truth = true, neither_nil_nor_cons(X)
; Truth = false, nil_or_cons(X)
).
Sample queries (taken from other answers, and comments to answers):
?- myflatten([[4],[[5,6],[7,[8],[9,[10,11]]]]], Xs).
Xs = [4, 5, 6, 7, 8, 9, 10, 11].
?- myflatten([1,[8,3],[3,[5,6],2],8], Xs).
Xs = [1, 8, 3, 3, 5, 6, 2, 8].
?- myflatten([a,[b,c],[],[[[d]]]], Xs).
Xs = [a, b, c, d].
?- myflatten([a,[b,c],[[d],[],[e]]], Xs).
Xs = [a, b, c, d, e].
neither_nil_nor_cons_t
and not(nil_or_cons_t)
describe have same solutions, but the solution order differs. Consider:
?- myflatten([A,B,C],Xs), A=a,B=b,C=c.
A = a,
B = b,
C = c,
Xs = [a, b, c] ; % does not terminate universally