Why doesn't Dijkstra's algorithm work for negative weight edges?

Can somebody tell me why Dijkstra's algorithm for single source shortest path assumes that the edges must be non-negative.

I am talking about only edges not the negative weight cycles.


Recall that in Dijkstra's algorithm, once a vertex is marked as "closed" (and out of the open set) - the algorithm found the shortest path to it, and will never have to develop this node again - it assumes the path developed to this path is the shortest.

But with negative weights - it might not be true. For example:

       A
      / \
     /   \
    /     \
   5       2
  /         \
  B--(-10)-->C

V={A,B,C} ; E = {(A,C,2), (A,B,5), (B,C,-10)}

Dijkstra from A will first develop C, and will later fail to find A->B->C


EDIT a bit deeper explanation:

Note that this is important, because in each relaxation step, the algorithm assumes the "cost" to the "closed" nodes is indeed minimal, and thus the node that will next be selected is also minimal.

The idea of it is: If we have a vertex in open such that its cost is minimal - by adding any positive number to any vertex - the minimality will never change.
Without the constraint on positive numbers - the above assumption is not true.

Since we do "know" each vertex which was "closed" is minimal - we can safely do the relaxation step - without "looking back". If we do need to "look back" - Bellman-Ford offers a recursive-like (DP) solution of doing so.


Consider the graph shown below with the source as Vertex A. First try running Dijkstra’s algorithm yourself on it.

enter image description here

When I refer to Dijkstra’s algorithm in my explanation I will be talking about the Dijkstra's Algorithm as implemented below,

Dijkstra’s algorithm

So starting out the values (the distance from the source to the vertex) initially assigned to each vertex are,

initialization

We first extract the vertex in Q = [A,B,C] which has smallest value, i.e. A, after which Q = [B, C]. Note A has a directed edge to B and C, also both of them are in Q, therefore we update both of those values,

first iteration

Now we extract C as (2<5), now Q = [B]. Note that C is connected to nothing, so line16 loop doesn't run.

second iteration

Finally we extract B, after which Q is Phi. Note B has a directed edge to C but C isn't present in Q therefore we again don't enter the for loop in line16,

3rd?

So we end up with the distances as

no change guys

Note how this is wrong as the shortest distance from A to C is 5 + -10 = -5, when you go a to b to c.

So for this graph Dijkstra's Algorithm wrongly computes the distance from A to C.

This happens because Dijkstra's Algorithm does not try to find a shorter path to vertices which are already extracted from Q.

What the line16 loop is doing is taking the vertex u and saying "hey looks like we can go to v from source via u, is that (alt or alternative) distance any better than the current dist[v] we got? If so lets update dist[v]"

Note that in line16 they check all neighbors v (i.e. a directed edge exists from u to v), of u which are still in Q. In line14 they remove visited notes from Q. So if x is a visited neighbour of u, the path source to u to x is not even considered as a possible shorter way from source to v.

In our example above, C was a visited neighbour of B, thus the path A to B to C was not considered, leaving the current shortest path A to C unchanged.

This is actually useful if the edge weights are all positive numbers, because then we wouldn't waste our time considering paths that can't be shorter.

So I say that when running this algorithm if x is extracted from Q before y, then its not possible to find a path - not possible which is shorter. Let me explain this with an example,

As y has just been extracted and x had been extracted before itself, then dist[y] > dist[x] because otherwise y would have been extracted before x. (line 13 min distance first)

And as we already assumed that the edge weights are positive, i.e. length(x,y)>0. So the alternative distance (alt) via y is always sure to be greater, i.e. dist[y] + length(x,y)> dist[x]. So the value of dist[x] would not have been updated even if y was considered as a path to x, thus we conclude that it makes sense to only consider neighbors of y which are still in Q (note comment in line16)

But this thing hinges on our assumption of positive edge length, if length(u,v)<0 then depending on how negative that edge is we might replace the dist[x] after the comparison in line18.

So any dist[x] calculation we make will be incorrect if x is removed before all vertices v - such that x is a neighbour of v with negative edge connecting them - is removed.

Because each of those v vertices is the second last vertex on a potential "better" path from source to x, which is discarded by Dijkstra’s algorithm.

So in the example I gave above, the mistake was because C was removed before B was removed. While that C was a neighbour of B with a negative edge!

Just to clarify, B and C are A's neighbours. B has a single neighbour C and C has no neighbours. length(a,b) is the edge length between the vertices a and b.


Dijkstra's algorithm assumes paths can only become 'heavier', so that if you have a path from A to B with a weight of 3, and a path from A to C with a weight of 3, there's no way you can add an edge and get from A to B through C with a weight of less than 3.

This assumption makes the algorithm faster than algorithms that have to take negative weights into account.


Correctness of Dijkstra's algorithm:

We have 2 sets of vertices at any step of the algorithm. Set A consists of the vertices to which we have computed the shortest paths. Set B consists of the remaining vertices.

Inductive Hypothesis: At each step we will assume that all previous iterations are correct.

Inductive Step: When we add a vertex V to the set A and set the distance to be dist[V], we must prove that this distance is optimal. If this is not optimal then there must be some other path to the vertex V that is of shorter length.

Suppose this some other path goes through some vertex X.

Now, since dist[V] <= dist[X] , therefore any other path to V will be atleast dist[V] length, unless the graph has negative edge lengths.

Thus for dijkstra's algorithm to work, the edge weights must be non negative.


Try Dijkstra's algorithm on the following graph, assuming A is the source node and D is the destination, to see what is happening:

Graph

Note that you have to follow strictly the algorithm definition and you should not follow your intuition (which tells you the upper path is shorter).

The main insight here is that the algorithm only looks at all directly connected edges and it takes the smallest of these edge. The algorithm does not look ahead. You can modify this behavior , but then it is not the Dijkstra algorithm anymore.