Dot notation in scheme

Note: this is cannibalized from an answer to Recursive range in Lisp adds a period?, which is really asking a different question. However, the explanation of how pairs are printed is the same. The rest of the answer is different.

Your question exhibits a bit of misunderstanding, but I think we can clear it up.

The first [argument to cons] could be a symbol, or a list, it doesn't matter. But specially the second element must be a symbol, if it is not, may be a list for instance, then we can't create a pair with built-in cons procedure.

This is not correct. You can call cons with whatever arguments you like, and you will always get back a cons cell whose car is the same as the first argument to cons, and whose cdr is the same as the second argument to cons. That is, the only thing that's important about cons is that it satisfies the equations

(eq? a (car (cons a b))
(eq? b (cdr (cons a b))

So is it possible to create a pair of 2 lists??? Well i'm thinking of a solution is that converting a list to symbol but actually those are 2 completely different thing -> impossible as i understand.

It's quite possible; if you have two lists, e.g., list1 and list2, you can create a pair whose car is list1 and whose cdr is list2 just by calling (cons list1 list2). Now, I think the issue that you're running up against is that you're expecting to see (<value-of-list1> . <value-of-list2>) as the output, and you're seeing some different. To explain why this is, we need to understand how lists are represented in Lisps, and how pairs are printed.

A list in Scheme is either the empty list () (also known as nil in some Lisps), or a cons cell whose car (also known as first) is an element of the list and whose cdr (also known as rest) is either the rest of the list (i.e., another list), or an atom that terminates the list. The conventional terminator is the empty list (); lists terminated by () are said to be "proper lists". Lists terminated by any other atom are called "improper lists". The list (1 2 3 4 5) contains the elements 1, 2, 3, 4, and 5, and is terminated by (). You could construct it by

(cons 1 (cons 2 (cons 3 (cons 4 (cons 5 ())))))

Now, when the system prints a cons cell, the general case is to print it by

(car . cdr)

For instance, the result of (cons 1 2) is printed as

(1 . 2)

Since lists are built of cons cells, you can use this notation for lists too:

'(1 2 3 4 5) ==
'(1 . (2 . (3 . (4 . (5 . ())))))

That's rather clunky, though, so most lisps (all that I know of) have a special case for printing cons cells: if the cdr is a list (either another cons cell, or ()), then don't print the ., and don't print the surrounding parenthesis of the cdr (which it would otherwise have, since it's a list).

Now we can explain why the result of (cons list1 list2) doesn't look like (<value-of-list1> . <value-of-list2>). If you call cons with two lists, you do get back a pair with the expected car and cdr, but it's not printed with the . notation. E.g.,

(cons '(1 2 3) '(a b c))
;=> ((1 2 3) . (a b c))   ; which is typically *printed* as
;=> ((1 2 3) a b c)

But again, the printed representation doesn't really matter though, as long as the following equations hold:

(eq? a (car (cons a b))
(eq? b (cdr (cons a b))

Sure enough:

(car (cons '(1 2 3) '(a b c)))
;=> (1 2 3)

(cdr (cons '(1 2 3) '(a b c)))
;=> (a b c)

In the specific example you're asking about, consider what happens when you call

(cons '(prov:label "entity e0") '(prov:location "London"))

The result is, in fact,

((prov:label "entity e0") . (prov:location "London"))

but, because of the printing rules, this is printed as

((prov:label "entity e0") prov:location "London")

Nonetheless, you can still get the two attributes out by using car and cdr:

(car '((prov:label "entity e0") prov:location "London"))
;=> (prov:label "entity e0")

(cdr '((prov:label "entity e0") prov:location "London"))
;=> (prov:location "London")

and that's all you really need to be able to do later.


The dot notation in Scheme or any other LISP dialect for that matter is used to create a dotted pair of any two values. The values may be either symbols, lists or anything else. It really doesn't matter what value it is. For example:

'(author . aaditmshah) => (author . aaditmshah)
'((a b c) . (d e f))   => ((a b c) d e f)

As you can see if you create a dotted pair of two lists then the first list will be added to the head of the second list. This is because lists in LISP are nested dotted pairs:

'(a . (b . (c . ())))  => '(a b c)

Hence when you write '((a b c) . (d e f)) it's as if you were writing the following:

'((a b c) . (d . (e . (f . ())))) => '((a b c) d e f)

This is perfectly valid. You can still access both the lists individually using car and cdr as you normally would:

(car '((a b c) . (d e f))) => (a b c)
(cdr '((a b c) . (d e f))) => (d e f)

I'm not sure what your compare-attrs function is supposed to do. Your if branch returns a cons cell whereas your else branch returns a boolean. To me this make absolutely no sense. The return type of the function should be consistent.

Perhaps you haven't worded your question correctly because I'm not really sure what your problem is. Let me know if you still have any doubts.


My recollection about data structuring in scheme is that you would prefer to avoid a dotted pair of atoms (ie, (a . b)) as that represents the result of cons'ing two atoms together. If you look in The Little Schemer, page 3, you'll see this:


The Law of Cons

The primitive cons takes two arguments. The second argument to cons must be a list. The result is a list.


if you've read the other answers, you know that this isn't true. So honestly, the thing to do with Scheme data structuring, which is proscribed by SICP, is to define your data structure layout in terms of lists and then make little functions to act as the accessors.

So let's say that you want to have a tuple - you could make it look like (a b) which is totally reasonable. Then you you could write these functions:

(define (make-tuple a b) (list a b))

(define (tuple-first tup) (car tup))
(define (tuple-second tup) (car (cdr tup)))
(define (tuple? tup) (and (list? tup) (eq? 2 (length tup))))

which now gives me accessor functions and a constructor and I'm gold. But that's not to say that this is the only way to do this. You could go right ahead and use pairs too:

(define (make-tuple a b) (cons a b))

(define (tuple-first tup) (car tup))
(define (tuple-second tup) (cdr tup))
(define (tuple? tup) (pair? tup))

So in your code, you can use the constructor here to make the tuple that you want.

In general, your compare-attrs function is odd in that the name doesn't really give us a sense of what you're trying to do. I'd be more inclined to write it like this:

(define (compare-attrs a1 a2)
    (or
        (and (null? a1) (null? a2))
        (and (not (null? a1)) (not (null? a2)))
    ))

(define (join-populated-attrs a1 a2)
    (if (compare-attrs a1 a2) (make-tuple a1 a2) '()))

which still feels funny in that you're accepting two null attributes, but that's probably part of your over all problem domain.

I should also say that if you want the output to appear in a particular way, you probably should write print-tuple as well.