How many squares does a line between two points pass through?

Solution 1:

Imagine the subdivided square as an $N\times N$ chessboard of subsquares. Every point on the board belongs to one subsquare with coordinates $(r,c)$, where $r$ (the row index) ranges from $1$ to $N$, and similarly for $c$ (the column index).

Now the number of subsquares that the line segment $\overline{AB}$ between two uniformly selected points $A$ and $B$ touches equals $d(A,B)=|A(r)-B(r)|+|A(c)-B(c)|+1$, where $A(r)$ denotes the row index of $A$, and similarly for the other notation. Because of the randomness, $A$ and $B$ are (with probability $1$) in "general position": the segment never hits a corner of a subsquare, so it always passes through this maximum number of subsquares.

Since the random variables $A(r),B(r),A(c),B(c)$ are independent and uniformly distributed on $\{1,\dots, N\}$, it is not hard to calculate that the expected number is $$\mathbb{E}(d(A,B))=1+{2(N^2-1)\over3N}.$$
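
A quick Monte Carlo sanity check of this expectation (my own Octave sketch; `randi` draws the four uniform indices directly):

% simulate the four independent uniform indices and compare with the formula
N = 10;                          % subsquares per side
M = 1e6;                         % number of simulated point pairs
rc = randi(N, M, 4);             % columns: A(r), B(r), A(c), B(c)
d  = abs(rc(:,1)-rc(:,2)) + abs(rc(:,3)-rc(:,4)) + 1;
[mean(d), 1 + 2*(N^2-1)/(3*N)]   % simulated mean vs. exact value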

Solution 2:

The number of cells touched by the line is (following Byron's answer) $N (|X_1-X_2| + |Y_1-Y_2|) + 1$. Calling $X =|X_1 - X_2|$ and $Y = |Y_1 - Y_2|$ and leaving the factor $N$ aside, for large $N$ this is approximately $d_M=X+Y$, the Manhattan distance.

Hence, I'll attack the following problem: We throw two points at random on the unit square; we want to compute $$E(d_M | d)$$

where $d=\sqrt{X^2+Y^2}$ is the Euclidean distance.

The expected number of touched cells, in this approximation, would be $N E(d_M | d) + 1$.

It's not difficult to see that $X$ (the absolute value of the difference of two uniform random variables) has a triangular density: $f_X(x)=2(1-x)$. The same holds for $Y$, and the two are independent, hence $$f_{X,Y}(x,y)=4(1-x)(1-y)$$ on the unit square.
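
(A quick numeric check of the triangular density, in Octave; the choice of $50$ bins is arbitrary:)

% histogram of |X1-X2| for X1, X2 uniform on (0,1), compared with 2*(1-x)
M = 1e6;
X = abs(rand(M,1) - rand(M,1));
[counts, centers] = hist(X, 50);
density = counts / (M * (centers(2) - centers(1)));   % normalized histogram
max(abs(density - 2*(1 - centers)))                   % small (sampling noise only)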

UPDATE: I later found a much simpler solution, given below; the derivation between the "ignore" markers can be skipped.

--------------- begin ignore -----

Conditioning on a fixed value of $d$ implies that we must restrict to an arc, the intersection of $x^2+y^2=d^2$ and the unit square.

Let's assume first that $d<1$.

Note that $d\le d_M \le \sqrt{2}\, d$.

To compute $P(d_M \le \xi \mid d)$, we must integrate over two pieces of the arc beginning at the axes (the blue arcs in the figure); by symmetry, we compute just one and multiply by two. The first limit point is given by $$x_1=\frac{\xi - \sqrt{2 d^2-\xi^2}}{2}$$

[Figure: the arc $x^2+y^2=d^2$ inside the unit square, with the two blue arc pieces and the limit point $x_1$.]

Further, to change the integration over the arc into an integral over $x$, we note that $ds=\sqrt{1+\frac{x^2}{y^2}}\,dx = \frac{d}{y}\,dx$. So, leaving out the constants (which disappear in the division), the cumulative distribution function (conditioned on $d$) is given by

$$F_{d_M|d}(\xi;d)=P(d_M \le \xi | d) = \frac{2 \int_{0}^{x_1}\left( 1-x\right) \,\left( \frac{1}{\sqrt{{d}^{2}-{x}^{2}}}-1\right) dx}{\int_{0}^{d}\left( 1-x\right) \,\left( \frac{1}{\sqrt{{d}^{2}-{x}^{2}}}-1\right) dx}$$

Finally, we must integrate to get the expected value

$$E(d_M |d) = d+ \int_{d}^{\sqrt{2} d} \left[1- F_{d_M|d}(\xi;d) \right] \; d\xi $$

This is all... but it seems pretty complicated to get in closed form. Let's see what Maxima says:

assume(x1>0);assume(d>0);assume(d>x1);
integrate((1-x)*(1/sqrt(d^2-x^2)-1), x,0,x1);

$$\frac{2\,\mathrm{asin}\left( \frac{x1}{d}\right) +2\,\sqrt{{d}^{2}-{x1}^{2}}+{x1}^{2}-2\,x1-2\,d}{2}$$

ix2(x1,d):=(2*asin(x1/d)+2*sqrt(d^2-x1^2)+x1^2-2*x1-2*d)/2;
x1(dm,d):=(dm-sqrt(2*d^2-dm^2))/2;
assume(dm>d); assume(dm<sqrt(2)*d);
fdist(dm,d):=2*(ix2(x1(dm,d),d))/ix2(d,d);

This gets a little messy. Let's try at least some numerical values:

dd:0.8;
dd + first(quad_qags(1-fdist(dm,dd), dm, dd, sqrt(2)*dd));

1.01798

Let's simulate to check (Matlab/Octave):

N=300000;                                        % number of point pairs
t=rand(N,4);                                     % each row: (x1, y1, x2, y2)
xy = [abs(t(:,1)-t(:,3)),abs(t(:,2)-t(:,4))];    % (|x1-x2|, |y1-y2|)
t=[];                                            % free the memory
d2=xy(:,1).^2+xy(:,2).^2;
d=sqrt(d2);                                      % Euclidean distance
dm=xy(:,1)+xy(:,2);                              % Manhattan distance
step=0.02;
dround=round(d/step)*step;                       % bin the Euclidean distances
%size(dround(dround==0.8))
mean(dm(dround==0.8))                            % estimate of E(d_M | d near 0.8)
>>>ans =  1.0174

Not bad, I'd say. Some other values:

  d   Maxima      Octave (simul)
 0.9  1.15847     1.1569
 0.8  1.01798     1.0174
 0.7  0.88740     0.8863
 0.6  0.75983     0.7579
 0.5  0.63328     0.6331
 0.4  0.50698     0.5063
 0.3  0.38062     0.3808

The case $d>1$ should be solved in a similar way.

--------------- end ignore -----

A much simpler solution:

Let's change to polar coordinates: $x = r \cos(t)$, $y = r \sin(t)$. The joint density is

$$f_{r,t}(r,t) = 4 r (1- r \cos(t))(1- r \sin(t))$$

on the domain $0 \le r \le \sqrt{2}$, with $0 \le t \le \pi/2$ for $r\le 1$ and $\arccos(1/r) \le t \le \arcsin(1/r)$ for $1 < r \le \sqrt{2}$.
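
(As a sanity check, my own Octave sketch below verifies numerically that this density integrates to $1$ over the stated domain; the handles `tlo` and `thi` encode the $t$-limits as functions of $r$:)

% the polar-coordinate density should integrate to 1 over the domain above
f   = @(r,t) 4.*r.*(1 - r.*cos(t)).*(1 - r.*sin(t));
tlo = @(r) (r > 1).*acos(min(1./r, 1));                      % lower t-limit
thi = @(r) (r <= 1)*(pi/2) + (r > 1).*asin(min(1./r, 1));    % upper t-limit
integral2(f, 0, sqrt(2), tlo, thi)                           % should be close to 1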

The conditional density is then

$$f_{t|r}(t|r) = \frac{1}{g(r)} (1- r \cos(t))(1- r \sin(t))$$

where $g(r)$ is the normalization constant (independent of $t$).

Assume first that $r<1$; then

$$g(r) = \int_0^{\pi/2} (1- r \cos(t))(1- r \sin(t)) dt = \frac{{r}^{2}}{2}-2\,r+\frac{\pi }{2}$$
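
(A quick numeric check of $g(r)$ for one value of $r<1$, in Octave; the choice $r=0.6$ is arbitrary:)

% compare the integral defining g(r) with the closed form pi/2 - 2r + r^2/2
r = 0.6;
g = integral(@(t) (1 - r*cos(t)).*(1 - r*sin(t)), 0, pi/2);
[g, pi/2 - 2*r + r^2/2]          % the two values should agree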

Further, $d_M= x + y= r (\cos(t)+\sin(t))$, so the conditional expectation is given by

$$E[d_M | r] = r \frac{1}{g(r)} \int_0^{\pi/2} (1- r \cos(t))(1- r \sin(t)) (\cos(t)+\sin(t)) dt$$

which gives

$$ E[d_M | r] = r \frac{4\,{r}^{2}-3\,\left( \pi +2\right) \,r+12}{3\,{r}^{2}-12\,r+3\,\pi }$$
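
This closed form can be checked against the simulation approach used earlier (my own Octave sketch; the tolerance $0.005$ for conditioning on $d\approx r$ is arbitrary):

% Monte Carlo estimate of E(d_M | d = r) vs. the closed form, for r = 0.8
M  = 2e6;
r  = 0.8;
xy = abs(rand(M,2) - rand(M,2));        % (|X1-X2|, |Y1-Y2|)
d  = sqrt(sum(xy.^2, 2));               % Euclidean distance
dm = sum(xy, 2);                        % Manhattan distance
sel = abs(d - r) < 0.005;               % keep pairs with d close to r
closed = r*(4*r^2 - 3*(pi+2)*r + 12)/(3*r^2 - 12*r + 3*pi);
[mean(dm(sel)), closed]                 % both close to the 1.018 found above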

For $r>1$ it's more complicated. The result is

$$ E[d_M | r] = r \frac{\sqrt{{r}^{2}-1}\,\left( 4\,{r}^{2}+8\right) +\left( 6\,\mathrm{asin}\left( \frac{1}{r}\right) -6\,\mathrm{acos}\left( \frac{1}{r}\right) -6\right) \,{r}^{2}-4}{3\,{r}^{3}-12\,r\,\sqrt{{r}^{2}-1}-\left( 6\,\mathrm{asin}\left( \frac{1}{r}\right) -6\,\mathrm{acos}\left( \frac{1}{r}\right) -6\right) \,r} $$
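
As a consistency check (my own Octave sketch), the two expressions should agree as $r \to 1$:

% evaluate the r<1 and r>1 formulas just below and just above r = 1
Elow  = @(r) r*(4*r^2 - 3*(pi+2)*r + 12) / (3*r^2 - 12*r + 3*pi);
Ehigh = @(r) r*(sqrt(r^2-1)*(4*r^2+8) + (6*asin(1/r) - 6*acos(1/r) - 6)*r^2 - 4) / ...
             (3*r^3 - 12*r*sqrt(r^2-1) - (6*asin(1/r) - 6*acos(1/r) - 6)*r);
[Elow(1 - 1e-6), Ehigh(1 + 1e-6)]   % both close to (10-3*pi)/(3*pi-9), about 1.354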

The figure shows $E[d_M | r] / r$, i.e. the factor by which the expected Manhattan distance exceeds the Euclidean distance (we already knew that this factor must lie in the range $[1,\sqrt{2}]$).

[Figure: $E[d_M \mid r]/r$ as a function of $r$.]


Added: Notice that for small distances ($r \to 0$) the expected number of squares comes out to $\frac{4}{\pi} r N + 1$, which agrees with Ronald's answer: indeed, as $r \to 0$ the formula above gives $E[d_M \mid r] \to \frac{12}{3\pi}\, r = \frac{4}{\pi}\, r$.

Solution 3:

Suppose that you divide your square into $N^2$ sub-squares, where $N\in\mathbb{N}$, and let $A=(x_{1},y_{1})$ and $B=(x_{2},y_{2})$, where $x_{1},x_{2},y_{1},y_{2}$ are real numbers in $[0,1]$. Then the segment $AB$ must pass through at least $N|y_{2}-y_{1}|$ squares in the $y$ (vertical) direction and at least $N|x_{2}-x_{1}|$ squares in the $x$ (horizontal) direction, so we would expect the segment $AB$ to pass through roughly $N|x_{2}-x_{1}|+ N|y_{2}-y_{1}|$ squares.
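
A rough numeric comparison of this estimate with the exact count from Solution 1 (my own Octave sketch; the `ceil`-based cell indices are my own convention):

% Solution 3's estimate vs. the general-position count |dr| + |dc| + 1
N = 10;  M = 1e6;
P = rand(M,4);                                   % (x1, y1, x2, y2)
approx = N*abs(P(:,1)-P(:,3)) + N*abs(P(:,2)-P(:,4));
exact  = abs(ceil(N*P(:,1))-ceil(N*P(:,3))) + abs(ceil(N*P(:,2))-ceil(N*P(:,4))) + 1;
[mean(approx), mean(exact), 1 + 2*(N^2-1)/(3*N)]   % estimate vs. exact mean vs. formula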

Solution 4:

I broadly agree with @Byron Schmuland's answer, which seems correct, but I have some additional suggestions.

For simplicity, I am taking the side of a small square to be $1$, so the distance between the two points is $D$ in these units.

If the question is to place $A$ and $B$ "randomly" at a given distance $D$ apart, this may fall foul of Bertrand's paradox: the notion of "random" placement is not defined clearly enough.

However, it seems natural to place $A$ first and then place $B$ on an arc with centre $A$ and radius $D$. The arc will be restricted depending on how close $A$ is to the edge of the square.

Let's imagine that we choose the point A, and then the angle $\theta$ is chosen randomly. Each direction $\theta$ is equally likely (by assumption here and by symmetry of the square).

Then the number of squares passed through will be $D|\sin(\theta)|+D|\cos(\theta)|+1$ from Byron's work and using a right-angled triangle.

$$E[D|\sin(\theta)|+D|\cos(\theta)|+1]$$ $$= D \cdot E[|\sin(\theta)|+|\cos(\theta)|] + 1$$ $$= 4D/\pi + 1$$

Or, taking $1/N$ to be the side length of a little square (as in the question as stated), this becomes $$4ND/\pi + 1.$$
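
A quick numeric check of the $4/\pi$ factor (my own Octave sketch):

% E[|sin(theta)| + |cos(theta)|] for theta uniform on [0, 2*pi) equals 4/pi
M = 1e6;
theta = 2*pi*rand(M,1);
[mean(abs(sin(theta)) + abs(cos(theta))), 4/pi]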

In the case where $N = 2$ ($2$ little squares per side) and $D = 0.5$ ($D$ is half the side of the big square), we expect to pass through $2.27$ squares. Seems right. Sometimes it will pass through only $1$, but more often $3$.

This will break down badly when $D$ gets large compared to the side of the big square, I imagine ($D>1$?), when the line no longer has the option of being horizontal or vertical.