Zig-zag scan an N x N array

Solution 1:

Very cool question. Here's an analysis and an algorithm.

A key advantage of this algorithm is that it uses only simple integer calculations; apart from the loop conditions it needs no "if" statements, so if it were compiled, it should execute very quickly even for very large values of n. Because each value depends only on its coordinates, the work can also be easily divided across multiple processors for very large values of n.

Consider an 8x8 grid (here, the input size is technically 64, but for simplicity the formulas below use n = 8) following this zigzag pattern, like so (with 0-indexed row and column axes):

     [ 0] [ 1] [ 2] [ 3] [ 4] [ 5] [ 6] [ 7]
[ 0]   1    3    4   10   11   21   22   36
[ 1]   2    5    9   12   20   23   35   37
[ 2]   6    8   13   19   24   34   38   49
[ 3]   7   14   18   25   33   39   48   50
[ 4]  15   17   26   32   40   47   51   58
[ 5]  16   27   31   41   46   52   57   59
[ 6]  28   30   42   45   53   56   60   63
[ 7]  29   43   44   54   55   61   62   64

First notice that the diagonal from the lower left (0,7) to the upper right (7,0), writing coordinates as (x, y) with x the column and y the row, divides the grid into two nearly-mirrored components:

     [ 0] [ 1] [ 2] [ 3] [ 4] [ 5] [ 6] [ 7]
[ 0]   1    3    4   10   11   21   22   36
[ 1]   2    5    9   12   20   23   35
[ 2]   6    8   13   19   24   34
[ 3]   7   14   18   25   33
[ 4]  15   17   26   32
[ 5]  16   27   31
[ 6]  28   30
[ 7]  29

and

     [ 0] [ 1] [ 2] [ 3] [ 4] [ 5] [ 6] [ 7]
[ 0]                                     36
[ 1]                                35   37
[ 2]                           34   38   49
[ 3]                      33   39   48   50
[ 4]                 32   40   47   51   58
[ 5]            31   41   46   52   57   59
[ 6]       30   42   45   53   56   60   63
[ 7]  29   43   44   54   55   61   62   64

You can see that the bottom-right is just the top-left mirrored through the center of the grid, with each value subtracted from the square plus 1 (65 in this case).

If we can calculate the top-left portion, then the bottom-right portion can easily be calculated by just taking the square plus 1 (n * n + 1) and subtracting the value at the inverse coordinates (value(n - x - 1, n - y - 1)).

As an example, consider an arbitrary pair of coordinates in the bottom-right portion, say (6,3), with a value of 48. Following this formula that would work out to (8 * 8 + 1) - value(8 - 6 - 1, 8 - 3 - 1), simplified to 65 - value(1, 4). Looking at the top-left portion, the value at (1,4) is 17. And 65 - 17 == 48.
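
As a quick sanity check of the mirroring rule, the sketch below hard-codes the 8x8 grid shown above and tests every cell against 65 minus the value at the inverse coordinates. The names $grid and mirrored() are made up for this illustration and are not part of the answer itself.

```php
<?php
// The 8x8 zigzag grid from above, stored as rows: $grid[$y][$x].
$grid = [
    [ 1,  3,  4, 10, 11, 21, 22, 36],
    [ 2,  5,  9, 12, 20, 23, 35, 37],
    [ 6,  8, 13, 19, 24, 34, 38, 49],
    [ 7, 14, 18, 25, 33, 39, 48, 50],
    [15, 17, 26, 32, 40, 47, 51, 58],
    [16, 27, 31, 41, 46, 52, 57, 59],
    [28, 30, 42, 45, 53, 56, 60, 63],
    [29, 43, 44, 54, 55, 61, 62, 64],
];

// Hypothetical helper: the value the mirroring rule predicts for (x, y).
function mirrored(array $grid, int $x, int $y): int
{
    $n = count($grid);
    return ($n * $n + 1) - $grid[$n - $y - 1][$n - $x - 1];
}

// Count the cells whose actual value differs from the predicted one.
$mismatches = 0;
for ($y = 0; $y < 8; $y++) {
    for ($x = 0; $x < 8; $x++) {
        if (mirrored($grid, $x, $y) !== $grid[$y][$x]) {
            $mismatches++;
        }
    }
}

echo mirrored($grid, 6, 3), "\n"; // the worked example: 65 - 17 = 48
echo $mismatches, "\n";           // 0, the rule holds for all 64 cells
```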

But we still need to calculate the top-left portion. Note that this can also be further sub-divided into two overlapping components, one component with the numbers increasing as you head up-right:

     [ 0] [ 1] [ 2] [ 3] [ 4] [ 5] [ 6] [ 7]
[ 0]        3        10        21        36
[ 1]   2         9        20        35
[ 2]        8        19        34
[ 3]   7        18        33
[ 4]       17        32
[ 5]  16        31
[ 6]       30
[ 7]  29

And one component with the numbers increasing as you head down-left:

     [ 0] [ 1] [ 2] [ 3] [ 4] [ 5] [ 6] [ 7]
[ 0]   1         4        11        22     
[ 1]        5        12        23     
[ 2]   6        13        24     
[ 3]       14        25     
[ 4]  15        26     
[ 5]       27     
[ 6]  28    
[ 7]    

The former can also be defined as the numbers where the sum of the coordinates (x + y) is odd, and the latter defined as the numbers where the sum of the coordinates is even.

Now, the key insight is that we are drawing triangles, so, not surprisingly, the triangular numbers play a prominent role. The sequence of triangular numbers is: 1, 3, 6, 10, 15, 21, 28, 36, ...

As you can see, in the odd-sum component, every other triangular number starting with 3 appears in the first row (3, 10, 21, 36), and in the even-sum component, every other triangular number starting with 1 appears in the first column (1, 6, 15, 28).

Specifically, for an odd-sum cell (x,0) in the first row or an even-sum cell (0,y) in the first column, the corresponding triangle number is triangle(x + 1) or triangle(y + 1).

And the rest of the grid can be computed by stepping along each diagonal away from its triangular number and subtracting 1 per step, which is equivalent to subtracting the row number (on odd-sum diagonals) or the column number (on even-sum diagonals).

Note that a diagonal can be formally defined as the set of all cells with a given sum of coordinates. For example, the diagonal with coordinate sum 3 has coordinates (0,3), (1,2), (2,1), and (3,0). So a single number defines each diagonal, and that number is also used to determine the starting triangular number.
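
To make the "one number per diagonal" idea concrete, here is a minimal sketch that lists the cells of a diagonal from its coordinate sum. The function name diagonalCells() is hypothetical; the clamping to the grid edges is my own addition so that it also works for diagonals in the bottom-right portion.

```php
<?php
// Hypothetical helper: all (x, y) cells of an n x n grid with x + y == $sum.
function diagonalCells(int $sum, int $n): array
{
    $cells = [];
    $first = max(0, $sum - ($n - 1)); // keeps y = $sum - x inside the grid
    $last  = min($sum, $n - 1);       // keeps x inside the grid
    for ($x = $first; $x <= $last; $x++) {
        $cells[] = [$x, $sum - $x];
    }
    return $cells;
}

print_r(diagonalCells(3, 8)); // the four cells (0,3), (1,2), (2,1), (3,0)
```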

So from simple inspection, the formula to calculate the odd-sum component is simply:

triangle(x + y + 1) - y

And the formula to calculate the even-sum component is simply:

triangle(x + y + 1) - x

And the well-known formula for triangle numbers is also simple (k is the argument here, not the grid size n):

triangle(k) = (k * (k + 1)) / 2
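
The three formulas can be folded into one function. This is only a sketch: topLeftValue() is a name invented here, and picking between the x and y terms with a 0/1 parity multiplier (rather than an "if") is one possible way to keep it branch-free. It is valid only for cells in the top-left portion, that is, x + y <= n - 1.

```php
<?php
// triangle(k) = k * (k + 1) / 2; intdiv keeps the result an integer.
function triangle(int $k): int
{
    return intdiv($k * ($k + 1), 2);
}

// Value at (x, y) in the top-left portion (x + y <= n - 1).
// $odd is 1 on odd-sum diagonals and 0 on even-sum ones, so the
// subtracted term is y or x respectively, selected by arithmetic alone.
function topLeftValue(int $x, int $y): int
{
    $odd = ($x + $y) % 2;
    return triangle($x + $y + 1) - ($odd * $y + (1 - $odd) * $x);
}

echo topLeftValue(1, 2), "\n"; // odd-sum cell: triangle(4) - 2 = 8
echo topLeftValue(2, 2), "\n"; // even-sum cell: triangle(5) - 2 = 13
```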

So, the algorithm is:

  1. Initialize an n x n array, where n is the square root of the input size.
  2. Calculate the values for the even-summed coordinates of the top-left portion. This can be accomplished by nesting two loops, an outer loop "y going from 0 to n - 1" and an inner loop "x going from y % 2 to n - 1 - y in steps of 2" (bounding x by n - 1 - y keeps x + y <= n - 1, so we only look at the top-left portion, as desired, and by starting at y % 2 and going in steps of 2 we only get the even-summed coordinates). The loop indexes can be plugged into the formula above to get the results: value[x, y] = triangle(x + y + 1) - x.
  3. Calculate the values for the odd-summed coordinates of the top-left portion. This can be accomplished with similar loops except the inner loop would be "x going from (y + 1) % 2 to n - 1 - y in steps of 2", to only get the odd-summed coordinates: value[x, y] = triangle(x + y + 1) - y.
  4. Calculate the values for the bottom-right portion by simple subtraction from n * n + 1 as described in the first part of this post. This can be done with two nested loops counting backwards, "y going from n - 1 down to 1" and "x going from n - 1 down to n - y" (bounding the inner loop on the outer one keeps x + y >= n, so we only get the bottom-right portion): value[x, y] = (n * n + 1) - value[n - x - 1, n - y - 1].
  5. Flatten the grid out into an array (lining up all the rows). Each number in the grid is the 1-based output position of the input element at the same flattened index, so transform the given input (of size n * n) to output by using the numbers generated in the grid as new indices.
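
The steps above can be sketched in PHP roughly as follows. Treat it as an illustration under a few assumptions rather than a reference implementation: the function names zigzagGrid() and zigzagReorder() are invented here, the top-left portion is taken to be the cells with x + y <= n - 1 (so the inner loops stop at n - 1 - y), the grid is stored as $value[$y][$x], and the input length is assumed to be a perfect square.

```php
<?php
// Steps 1-4: build the grid of 1-based zigzag positions, $value[$y][$x].
function zigzagGrid(int $n): array
{
    $value = [];
    // Top-left portion, including the anti-diagonal (x + y <= n - 1).
    for ($y = 0; $y < $n; $y++) {
        for ($x = $y % 2; $x <= $n - 1 - $y; $x += 2) {       // even x + y
            $s = $x + $y + 1;
            $value[$y][$x] = intdiv($s * ($s + 1), 2) - $x;
        }
        for ($x = ($y + 1) % 2; $x <= $n - 1 - $y; $x += 2) { // odd x + y
            $s = $x + $y + 1;
            $value[$y][$x] = intdiv($s * ($s + 1), 2) - $y;
        }
    }
    // Bottom-right portion (x + y >= n), by subtraction from n * n + 1.
    for ($y = $n - 1; $y >= 1; $y--) {
        for ($x = $n - 1; $x >= $n - $y; $x--) {
            $value[$y][$x] = ($n * $n + 1) - $value[$n - $y - 1][$n - $x - 1];
        }
    }
    return $value;
}

// Step 5: move each input element to the position stored in the grid.
function zigzagReorder(array $input): array
{
    $n = (int) round(sqrt(count($input))); // assumes a perfect square
    $grid = zigzagGrid($n);
    $output = array_fill(0, $n * $n, null);
    for ($y = 0; $y < $n; $y++) {
        for ($x = 0; $x < $n; $x++) {
            $output[$grid[$y][$x] - 1] = $input[$y * $n + $x];
        }
    }
    return $output;
}

echo implode(' ', zigzagReorder(range(1, 9))), "\n"; // 1 4 2 3 5 7 8 6 9
```

For a 5 x 5 input of range(1, 25) this should produce the same order as the output listed under Solution 2.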

Solution 2:

Here's mine.

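function waveSort(array $array) {
  // The input must hold a perfect-square number of elements.
  $dimension = (int) round(sqrt(count($array)));
  if($dimension * $dimension != count($array)) {
    throw new InvalidArgumentException('Input length must be a perfect square');
  }

  // Split the flat input into rows.
  $tempArray = array();
  for($i = 0; $i < $dimension; $i++) {
    $tempArray[] = array_slice($array,$i*$dimension,$dimension);
  }

  $returnArray = array();

  // Walk the 2 * $dimension - 1 diagonals; $i is row index + column index.
  for($i = 0; $i < $dimension * 2 -1; $i++) {
    $diagonal = array();

    // $x is the row index here; collect the diagonal from top-right to bottom-left.
    foreach($tempArray as $x => $innerArray) {
      if($i - $x >= 0 && $i - $x < $dimension) {
        $diagonal[] = $innerArray[$i - $x];
      }
    }

    // Odd diagonals are traversed in the opposite direction.
    if($i % 2 == 1) {
      krsort($diagonal);
    }

    // array_merge renumbers the integer keys but keeps the reversed order.
    $returnArray = array_merge($returnArray,$diagonal);

  }

  return $returnArray;

}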

Usage:

<?php
$a = range(1,25);
var_dump(waveSort($a));

Output:

array(25) {
  [0]=>
  int(1)
  [1]=>
  int(6)
  [2]=>
  int(2)
  [3]=>
  int(3)
  [4]=>
  int(7)
  [5]=>
  int(11)
  [6]=>
  int(16)
  [7]=>
  int(12)
  [8]=>
  int(8)
  [9]=>
  int(4)
  [10]=>
  int(5)
  [11]=>
  int(9)
  [12]=>
  int(13)
  [13]=>
  int(17)
  [14]=>
  int(21)
  [15]=>
  int(22)
  [16]=>
  int(18)
  [17]=>
  int(14)
  [18]=>
  int(10)
  [19]=>
  int(15)
  [20]=>
  int(19)
  [21]=>
  int(23)
  [22]=>
  int(24)
  [23]=>
  int(20)
  [24]=>
  int(25)
}

Solution 3:

Although there are already many solutions to this question, this is mine:

The main features that differentiate it from the other solutions:

  • Only a single loop of complexity O(n), where n is the number of input elements
  • Only primitive (integer) temporary variables

The source:

<?php

function zigzag($input)
{
    $output = array();

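    // $inc = -1 moves down-left (row + 1, column - 1); $inc = 1 moves up-right.
    $inc = -1;
    $i = $j = 0; // $i is the current row, $j the current column
    $steps = 0;  // number of elements written so far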

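    // Side length of the matrix as an integer; the input length must be a perfect square.
    $bounds = (int) round(sqrt(sizeof($input)));

    if($bounds * $bounds != sizeof($input))
    {
        die('Matrix must be square');
    }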

    while($steps < sizeof($input))
    {
        if($i >= $bounds) // bottom edge
        {
            $i--;
            $j++;
            $j++;
            $inc = 1;
        }
        if($j >= $bounds) // right edge
        {
            $i++;
            $i++;
            $j--;
            $inc = -1;
        }
        if($j < 0) // left edge
        {
            $j++;
            $inc = 1;
        }
        if($i < 0) // top edge
        {
            $i++;
            $inc = -1;
        }

        $output[] = $input[$bounds * $i + $j];

        $i = $i - $inc;
        $j = $j + $inc;
        $steps++;
    }
    return $output;
}

$a = range(1,25);
var_dump(zigzag($a));

By the way, this sort of algorithm is called a "zig-zag scan" and is used heavily in JPEG and MPEG coding.