How to speed up A* algorithm at large spatial scales?

From http://ccl.northwestern.edu/netlogo/models/community/Astardemo, I coded an A* algorithm by using nodes in a network to define least-cost paths. The code seems to work but it is much too slow when I use it at large spatial scales.My landscape has an extent of 1000 patches x 1000 patches with 1 patch = 1 pixel. Even if I reduce it at 400 patches x 400 patches with 1 patch = 1 pixel, it is yet too slow (I can't modify my landscape below 400 patches x 400 patches). Here is the code:

to find-path [ source-node destination-node] 

let search-done? false
let search-path []
let current-node 0
set list-open []
set list-closed []  
let list-links-with-nodes-in-list-closed []
let list-links []

set list-open lput source-node list-open
while [ search-done? != true]
[    
ifelse length list-open != 0
[
  set list-open sort-by [[f] of ?1 < [f] of ?2] list-open 
  set current-node item 0 list-open 
  set list-open remove-item 0 list-open 
  set list-closed lput current-node list-closed
  ask current-node
  [  
    if parent-node != 0[
    set list-links-with-nodes-in-list-closed lput link-with parent-node list-links-with-nodes-in-list-closed 
    ]
    ifelse any? (nodes-on neighbors4) with [ (xcor = [ xcor ] of destination-node) and (ycor = [ycor] of destination-node)]
    [
      set search-done? true 
    ]
    [        
      ask (nodes-on neighbors4) with [ (not member? self list-closed) and (self != parent-node) ]  
      [  
        if not member? self list-open and self != source-node and self != destination-node
        [
          set list-open lput self list-open
          set parent-node current-node
          set list-links sentence (list-links-with-nodes-in-list-closed) (link-with parent-node)
          set g sum (map [ [link-cost] of ? ] list-links)
          set h distance destination-node 
          set f (g + h)
        ]
      ]
    ]
  ]
]

[
  user-message( "A path from the source to the destination does not exist." )
  report []
 ]
]
set search-path lput current-node search-path
let temp first search-path
while [ temp != source-node ]
[
 ask temp
[
  set color red
]
set search-path lput [parent-node] of temp search-path 
set temp [parent-node] of temp 
]
set search-path fput destination-node search-path
set search-path reverse search-path  
print search-path
end

Unfortunately, I don't know how to speed up this code. Is there a solution to calculate rapidly least-cost paths at large spatial scales ?

Thanks very much for your help.


Was curious so I tested mine A* and here is mine result

Maze 1280 x 800 x 32 bit pixels

A* test

  • as you can see it took ~23ms
  • no multithreading (AMD 3.2GHz)
  • C++ 32bit app (BDS2006 Turbo C++ or Borland C++ builder 2006 if you like)
  • the slowest path I found was ~44ms (fill almost whole map)

I think that is fast enough ...

Here is source for mine A* class:

//---------------------------------------------------------------------------
//---------------------------------------------------------------------------
//---------------------------------------------------------------------------
const DWORD A_star_space=0xFFFFFFFF;
const DWORD A_star_wall =0xFFFFFFFE;
//---------------------------------------------------------------------------
class A_star
    {
public:
    // variables
    DWORD **map;        // map[ys][xs]
    int xs,ys;          // map esolution   xs*ys<0xFFFFFFFE !!!
    int *px,*py,ps;     // output points px[ps],py[ps] after compute()

    // internals
    A_star();
    ~A_star();
    void _freemap();                                    // release map memory
    void _freepnt();                                    // release px,py memory

    // inteface
    void resize(int _xs,int _ys);                       // realloc map to new resolution
    void set(Graphics::TBitmap *bmp,DWORD col_wall);    // copy bitmap to map
    void get(Graphics::TBitmap *bmp);                   // draw map to bitmap for debuging
    void compute(int x0,int y0,int x1,int y1);          // compute path from x0,y0 to x1,y1 output to px,py
    };
//---------------------------------------------------------------------------
     A_star::A_star()   { map=NULL; xs=0; ys=0; px=NULL; py=NULL; ps=0; }
     A_star::~A_star()  { _freemap(); _freepnt(); }
void A_star::_freemap() { if (map) delete[] map; map=NULL; xs=0; ys=0; }
void A_star::_freepnt() { if (px) delete[] px; px=NULL; if (py) delete[] py; py=NULL; ps=0; }
//---------------------------------------------------------------------------
void A_star::resize(int _xs,int _ys)
    {
    if ((xs==_xs)&&(ys==_ys)) return;
    _freemap();
    xs=_xs; ys=_ys;
    map=new DWORD*[ys];
    for (int y=0;y<ys;y++)
     map[y]=new DWORD[xs];
    }
//---------------------------------------------------------------------------
void A_star::set(Graphics::TBitmap *bmp,DWORD col_wall)
    {
    int x,y;
    DWORD *p,c;
    resize(bmp->Width,bmp->Height);
    for (y=0;y<ys;y++)
     for (p=(DWORD*)bmp->ScanLine[y],x=0;x<xs;x++)
        {
        c=A_star_space;
        if (p[x]==col_wall) c=A_star_wall;
        map[y][x]=c;
        }
    }
//---------------------------------------------------------------------------
void A_star::get(Graphics::TBitmap *bmp)
    {
    int x,y;
    DWORD *p,c;
    bmp->SetSize(xs,ys);
    for (y=0;y<ys;y++)
     for (p=(DWORD*)bmp->ScanLine[y],x=0;x<xs;x++)
        {
        c=map[y][x];
             if (c==A_star_wall ) c=0x00000000;
        else if (c==A_star_space) c=0x00FFFFFF;
        else                      c=((c>>1)&0x7F)+0x00404040;
        p[x]=c;
        }
    }
//---------------------------------------------------------------------------
void A_star::compute(int x0,int y0,int x1,int y1)
    {
    int x,y,xmin,xmax,ymin,ymax,xx,yy;
    DWORD i,j,e;
    // [clear previous paths]
    for (y=0;y<ys;y++)
     for (x=0;x<xs;x++)
      if (map[y][x]!=A_star_wall)
       map[y][x]=A_star_space;
/*
    // [A* no-optimizatims]
    xmin=x0; xmax=x0; ymin=y0; ymax=y0;
    if (map[y0][x0]==A_star_space)
     for (i=0,j=1,e=1,map[y0][x0]=i;(e)&&(map[y1][x1]==A_star_space);i++,j++)
      for (e=0,y=ymin;y<=ymax;y++)
       for (   x=xmin;x<=xmax;x++)
        if (map[y][x]==i)
        {
        yy=y-1; xx=x; if ((yy>=0)&&(map[yy][xx]==A_star_space)){ map[yy][xx]=j; e=1; if (ymin>yy) ymin=yy; }
        yy=y+1; xx=x; if ((yy<ys)&&(map[yy][xx]==A_star_space)){ map[yy][xx]=j; e=1; if (ymax<yy) ymax=yy; }
        yy=y; xx=x-1; if ((xx>=0)&&(map[yy][xx]==A_star_space)){ map[yy][xx]=j; e=1; if (xmin>xx) xmin=xx; }
        yy=y; xx=x+1; if ((xx<xs)&&(map[yy][xx]==A_star_space)){ map[yy][xx]=j; e=1; if (xmax<xx) xmax=xx; }
        }
*/
    // [A* changed points list]
    // init space for 2 points list
    _freepnt();
    int i0=0,i1=xs*ys,n0=0,n1=0,ii;
    px=new int[i1*2];
    py=new int[i1*2];
    // if start is not on space then stop
    if (map[y0][x0]==A_star_space)
        {
        // init start position to first point list
        px[i0+n0]=x0; py[i0+n0]=y0; n0++; map[y0][x0]=0;
        // search until hit the destination (swap point lists after each iteration and clear the second one)
        for (j=1,e=1;(e)&&(map[y1][x1]==A_star_space);j++,ii=i0,i0=i1,i1=ii,n0=n1,n1=0)
         // test neibours of all points in first list and add valid new points to second one
         for (e=0,ii=i0;ii<i0+n0;ii++)
            {
            x=px[ii]; y=py[ii];
            yy=y-1; xx=x; if ((yy>=0)&&(map[yy][xx]==A_star_space)){ map[yy][xx]=j; e=1; px[i1+n1]=xx; py[i1+n1]=yy; n1++; map[yy][xx]=j; }
            yy=y+1; xx=x; if ((yy<ys)&&(map[yy][xx]==A_star_space)){ map[yy][xx]=j; e=1; px[i1+n1]=xx; py[i1+n1]=yy; n1++; map[yy][xx]=j; }
            yy=y; xx=x-1; if ((xx>=0)&&(map[yy][xx]==A_star_space)){ map[yy][xx]=j; e=1; px[i1+n1]=xx; py[i1+n1]=yy; n1++; map[yy][xx]=j; }
            yy=y; xx=x+1; if ((xx<xs)&&(map[yy][xx]==A_star_space)){ map[yy][xx]=j; e=1; px[i1+n1]=xx; py[i1+n1]=yy; n1++; map[yy][xx]=j; }
            }
        }
    // [reconstruct path]
    _freepnt();
    if (map[y1][x1]==A_star_space) return;
    if (map[y1][x1]==A_star_wall) return;
    ps=map[y1][x1]+1;
    px=new int[ps];
    py=new int[ps];
    for (i=0;i<ps;i++) { px[i]=x0; py[i]=y0; }
    for (x=x1,y=y1,i=ps-1,j=i-1;i>=0;i--,j--)
        {
        px[i]=x;
        py[i]=y;
        if ((y>   0)&&(map[y-1][x]==j)) { y--; continue; }
        if ((y<ys-1)&&(map[y+1][x]==j)) { y++; continue; }
        if ((x>   1)&&(map[y][x-1]==j)) { x--; continue; }
        if ((x<xs-0)&&(map[y][x+1]==j)) { x++; continue; }
        break;
        }
    }
//---------------------------------------------------------------------------
//---------------------------------------------------------------------------
//---------------------------------------------------------------------------

I know it is a bit too much code but it is complete. The important stuff is in member function compute so search for [A* changed points list]. The unoptimized A* (rem-ed) is about 100 times slower.

Code use bitmap from Borland VCL so if you do not have it ignore functions get,set and rewrite them to your input/output gfx style. They just load map from bitmap and draw computed map back to bitmap

Usage:

// init
A_star map;
Graphics::TBitmap *maze=new Graphics::TBitmap;
maze->LoadFromFile("maze.bmp");
maze->HandleType=bmDIB;
maze->PixelFormat=pf32bit;
map.set(maze,0); // walls are 0x00000000 (black)
// this can be called repetitive without another init
map.compute(x0,y0,x1,y1); // map.px[map.ps],map.py[map.ps] holds the path
map.get(maze,0); // this is just for drawing the result map back to bitmap for viewing

for more info about A* see Backtracking in A star


TL;DR: Include in your node list (graph) only the patches (or agents) that are important!

One way to speed things up is to not search over every grid space. A* is a graph search, but seems like most coders just dump every point in the grid into the graph. That's not required. Using a sparse search graph, rather than searching every point on the screen, can speed things up.

Even in a complex maze, you can speed up by only including corners and junctions in the graph. Don't add hallway grids to the open list--seek ahead to find the next corner or junction. This is where pre-processing the screen/grid/map to construct the search graph can save time later.

As you can see in this image from my (rather inefficient) A* model on turtlezero.com, a naive approach creates a lot of extra steps. Any open nodes created in a long straight corridor are wasted:

Sample maze solved with naive a-star

By eliminating these steps from the graph, the solution could be found hundreds of times faster.

Another sparse graph technique is to use a graph that is gradually less dense the further from the walker. That is, make your search space detailed near the walker, and sparse (fewer nodes, less accurate regarding obstacles) away from the walker. This is especially useful where the walker is moving through detailed terrain on a map that is changing or towards a target that is moving and the route has to be recalculated anyway.

For example, in a traffic simulation where roads may become gridlocked, or accidents occur. Likewise, a simulation where one agent is pursuing another agent on a changing landscape. In these cases, only the next few steps need to be exactly plotted. The general route to the destination can be approximate.

One simple way to implement this is to gradually increase the step size of the walker as the path becomes longer. Disregard obstacles or do a quick line-intersection or tangent test. This gives the walker a general idea of where to go.

An improved path can be recalculated with each step, or periodically, or when an obstacle is encountered.

It may only be milliseconds saved, but milliseconds wasted on the soon-to-change end of the path could be better used providing brains for more walkers, or better graphics, or more time with your family.

For an example of a sparse graph of varying density, see chapter 8 of Advanced Java Programming By David Wallace Croft from APress: http://www.apress.com/game-programming/java/9781590591239

He uses a circular graph of increasing sparseness in a demo tank game with an a* algorithm driving the enemy tanks.

Another sparse graph approach is to populate the graph with only way-points of interest. For example, to plot a route across a simple campus of buildings, only entrances, exits, and corners are important. Points along the side of a building or in the open space between are not important, and can be omitted from the search graph. A more detailed map might need more way-points--such as a circle of nodes around a fountain or statue, or where paved paths intersect.

Here's a diagram showing the paths between waypoints.

Sample of building corner waypoints for path-search optimization

This was generated by the campus-buildings-path-graph model by me on turtlezero.com: http://www.turtlezero.com/models/view.php?model=campus-buildings-path-graph

It uses simple netlogo patch queries to find points of interest, like outside and inside corners. I'm sure a somewhat more sophisticated set of queries could deal with things like diagonal walls. But even without such fancy further optimization, the A* search space would be reduced by orders of magnitude.

Unfortunately, since now Java 1.7 won't allow unsigned applets, you can't run the model in the webpage without tweaking your java security settings. Sorry about that. But read the description.


A* is two heuristics; Dijkstra's Algorithm & Greedy Search. Dijkstra's Algorithm searches for the shortest path. The Greedy Search looks for the cheapest path. Dijkstra's algorithm is extraordinarily slow because it doesn't take risks. Multiply the effect of the Greedy Search to take more risks.

For example, if A* = Dijkstra + Greedy, then a faster A* = Dijkstra + 1.1 * Greedy. No matter how much you optimize your memory access or your code, it will not fix a bad approach to solving the problem. Make your A* more greedy and it will focus on finding a solution, rather than a perfect solution.

NOTE:

Greedy Search = distance from end
Dijkstra's Algorithm = distance from start

in standard A*, it will seek perfect solutions until reaching an obstacle. This video shows the different search heuristics in action; notice how fast a greedy search can be (skip to 2:22 for A*, 4:40 for Greedy). I myself had a similar issue when I first began with A* and the modified A* I outline above improved my performance exponentially. Moral of the story; use the right tool for the job.


If you plan to reuse the same map multiple times, some form of pre-processing is usually optimal. Effectively, you work out the shortest distances between some common points, and add them to the graphs as edges, this will typically help a* find a solution more quickly. Although its more difficult to implement.

E.g. you might do this for all motorway routes in a map of the UK, so the search algorithm only has to find a route to a motorway, and from the motorway junctions to its destination.


I can't tell what the actual cause of the observed slowlyness might be. Maybe it's just due to shortcomings in efficiency imposed by the programming language at hand. How did you measure your performance? How can we reproduce it?

Besides that, the heuristic (distance metric) being used has a large influence on the amount of exploration that is done in order to find the optimal path and thus also influences the perceived efficiency of the algorithm.

In theory you have to use an admissible heuristic, that is, one that never overestimates the remaining distance. In practice, depending on the complicatedness of the maze, a conservative choice for a 2d grid maze, like manhattan distance, might significantly underestimate the remaining distance. Therefore a lot of exploration is done in areas of the maze far away from the goal. This leads to a degree of exploration that resembles much rather that of an exhaustive search (e.g., Breadth-First Search), than what one would expect from an informed search algorithm.

This might be something to look into.

Also have a look at my related answer here:

  • https://stackoverflow.com/a/16656993/1025391

There I have compared different heuristics used with the basic A-Star algorithm and visualized the results. You might find it interesting.