This is one of the programming questions asked during written test from Microsoft. I am giving the question and the answer that I came up with. Thing is my answer although looks comprehensive (at least to me), I feel that the number of lines can be reduced. It was asked in C and I am a Java person but I managed to code it (my answer may contain too many Java like syntaxes)

Ok, here is the question.

You have two lists that are already sorted, you have to merge them and return a new list without any new extra nodes. The returned list should be sorted as well.

The method signature is,

Node* MergeLists(Node* list1, Node* list2);

struct Node{
    int data;
    Node *next;
}

The following is the solution I came up with,

Node* MergeLists(Node* list1, Node* list2){
    Node* mergedList;
    if(list1 == null && list2 ==null){//if both are null, return null
        return null;
    }
    if(list1 == null){//if list1 is null, simply return list2
        return list2;
    }
    if(list2 == null){//if list2 is null, simply return list1
        return list1;
    }
    if(list1.data < list2.data){//initialize mergedList pointer to list1 if list1's data is lesser
        mergedList = list1;
    }else{//initialize mergedList pointer to list2 if list2's data is lesser or equal
        mergedList = list2;
    }
    while(list1!=null && list2!=null){
        if(list1.data < list2.data){
            mergedList->next = list1;
            list1 = list1->next;
        }else{
            mergedList->next = list2;
            list2 = list2->next;
        }
    }
    if(list1 == null){//remaining nodes of list2 appended to mergedList when list1 has reached its end.
        mergedList->next = list2;
    }else{//remaining nodes of list1 appended to mergedList when list2 has reached its end
        mergedList->next = list1;
    }
    return mergedList;
}

I am very confident this can be enhanced. Please help me find what lines are redundantly I've added. Please feel free to criticize my syntax errors and logic.

Thanks!


The most glaring bug is in your loop, you keep overwriting mergedList->next, losing the previously added node. That is, your returned list will never contain more than two nodes, regardless of input ...

It's been a while since I did C, but I would do it as follows:

Node* merge(Node* list1, Node* list2) {
    Node* merged = null;
    Node** tail = &merged;

    while (list1 && list2) {
        if (list1->data < list2->data) {
            *tail = list1;
            list1 = list1->next;
        } else {
            *tail = list2;
            list2 = list2->next;
        }
        tail = &((*tail)->next);
    }
    *tail = list1 ? list1 : list2;
    return merged;
}

Your code is overloaded with if-s inserted for handling "special" cases, which bloats it a lot and makes it difficult to read. This usually happens when you decide to handle special cases "by code" instead of finding a way to handle them "by data". A statement attributed to David Wheeler says "All problems in computer science can be solved by another level of indirection". That "extra level of indirection" usually works very well with lists, helping to significantly reduce clutter created by those ifs.

To illustrate the above, here's what my code would look like

#define SWAP_PTRS(a, b) do { void *t = (a); (a) = (b); (b) = t; } while (0)

Node* MergeLists(Node* list1, Node* list2) 
{
  Node *list = NULL, **pnext = &list;

  if (list2 == NULL)
    return list1;

  while (list1 != NULL)
  {
    if (list1->data > list2->data)
      SWAP_PTRS(list1, list2);

    *pnext = list1;
    pnext = &list1->next;
    list1 = *pnext;
  }

  *pnext = list2;
  return list;
}

Some might argue that the use of an extra level of indirection in pnext pointer actually makes the code more difficult to read. I'd agree that for a newbie the approach might pose some difficulties, but for an experienced programmer this should be easily digestible as an idiom.


My take, with a test case

So far all of the answers have been interesting and well done. It's possible that this one is more like what an interviewer would like to see, featuring DRY/DIE, and TDD. :-)

#include <stdio.h>

typedef struct ns {
    int data;
    struct ns *next;
} Node;

Node l1[] = {
  { 1, &l1[1] },
  { 3, &l1[2] },
  { 5, &l1[3] },
  { 7, &l1[4] },
  { 9, &l1[5] },
  {11, 0 },
};

Node l2[] = {
  { 2, &l2[1] },
  { 4, &l2[2] },
  { 6, 0 },
};

Node* MergeLists(Node* list1, Node* list2) {
  Node *t, **xt;
  for(xt = &t; list1 || list2;) {
    Node **z = list1 == NULL ? &list2 :
               list2 == NULL ? &list1 :
               list1->data < list2->data ? &list1 : &list2;
    *xt = *z;
     xt = &(*z)->next;
    *z  = *xt;
  }
  *xt = NULL;
  return t;
}

int main(void) {
  for(Node *t = MergeLists(l1, l2); t; t = t->next) 
    printf("%d\n", t->data);
  return 0;
}

It doesn't get any more elegant than this:

Node* merge2(Node* n1, Node* n2) {
    n1->next = merge(n1->next, n2);
    return n1;
}

Node* merge(Node* n1, Node* n2) {
    return (n1 == null) ? n2 :
           (n2 == null) ? n1 :
           (n1->data < n2->data) ?
               merge2(n1, n2) :
               merge2(n2, n1);
}

Assuming that you understand recursion, this is as clear as it gets.


I should point out that this is good for an interview answer only (where presumably demonstrating clarity of thought has more impact than simply showing that you know how to write programs). In practice, you wouldn't want to merge this way, since it uses O(n) stack depth, which likely would cause a stack overflow. Also, it's not a tail-recursion, so it's not compiler-optimizable.


Divide et Impera

(i.e. MergeSort)