Number of apples on the tree after n days?

Start with one apple on day one. On every following day:

Every apple grows $k\cdot(c+1)$ new apples, where $c$ is the number of connections to that apple, and $k$ is the number of steps it takes to reach the starting apple from that apple (counting the apple itself as a step).

For example, the first apple takes only one step to itself. The apples growing from it take two steps. The apples growing from those take three steps, etc. (see images below for more detail)

How many apples will be on the tree after $n$ days?


I computed first $25$ terms: (sequence is not in OEIS)

1, 2, 8, 56, 548, 6752, 99908, 1724816, 34031348, 755384672, 18630078308, 505421692976, 14958279256148, 479591526968192, 16559455408832708, 612609633148083536, 24174100149092384948, 1013551337258199761312, 44995053102770888963108, 2108457649886329729936496, 104001928043774583748777748, 5386506619791901055945028032, 292264718383139371373233669508, 16578710198212615619201747731856, 981315128726093566691094046194548

First four days look like this:

enter image description here

How can I find a formula to calculate the number of apples at the $n$th day?


I don't know how to solve this, so I'm counting apples on each layer individually:

Idea was to find a general pattern for computed layers and then sum them all up at the end.
Note, Layer $k$ has all apples that take $k$ steps to the starting apple.

Let number of layer $k$ apples after $n$ days be given by $a_n(k)$.

Then the number of apples on the tree at day $n$ is given by $A_n=\sum_{i=1}^n{a_n(i)}$ .

Here is the computed data for first $6$ layers:

a_n(1) = 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,...
a_n(2) = 0, 1, 3, 7, 15, 31, 63, 127, 255, 511, 1023, 2047, 4095, 8191, 16383, 32767, 65535, 131071, 262143, 524287, 1048575, 2097151, 4194303, 8388607, 16777215,... 
a_n(3) = 0, 0, 4, 24, 100, 360, 1204, 3864, 12100, 37320, 114004, 346104, 1046500, 3155880, 9500404, 28566744, 85831300, 257756040, 773792404, 2322425784, 6969374500, 20912317800, 62745342004, 188252803224, 564791964100,... 
a_n(4) = 0, 0, 0, 24, 240, 1560, 8400, 40824, 186480, 818520, 3498000, 14676024, 60780720, 249401880, 1016542800, 4123173624, 16664094960, 67171367640, 270232006800, 1085570781624, 4356217681200, 17466686971800, 69992221794000, 280345359228024, 1122510953731440,... 
a_n(5) = 0, 0, 0, 0, 192, 2880, 26880, 201600, 1334592, 8164800, 47372160, 264844800, 1441632192, 7694406720, 40467248640, 210468585600, 1085328316992, 5559954344640, 28337142664320, 143847569376000, 727922413132992, 3674461807114560, 18512042531347200, 93120150431088000, 467843515029264192,...
a_n(6) = 0, 0, 0, 0, 0, 1920, 40320, 510720, 5080320, 43827840, 344615040, 2541411840, 17896919040, 121797836160, 807731084160, 5251058991360, 33611039804160, 212519521994880, 1330716675415680, 8267671479137280, 51044504568583680, 313546251542832000, 1918022127328137600, 11693253189282297600, 71090720640004665600,...

Here is the python code.



Update

I corrected the layer data; now I managed to find formulas for first $6$ layers:

$ a_n(1)=1 $
$ a_n(2)=\frac{1}{2}(2^n-2)$
$ a_n(3)=\frac{2}{3}(3^n-3\cdot2^n+3)$
$ a_n(4)=1( 4^{n} - 4\cdot3^n + 6\cdot2^{n} - 4)$
$ a_n(5)=\frac{8}{5}(5^n - 5\cdot4^n + 10\cdot3^n - 10\cdot 2^{n}+5)$
$ a_n(6)=\frac{8}{3} (6^n - 6\cdot5^n + 15\cdot4^n - 20\cdot3^n + 15\cdot2^n - 6 )$

I noticed the pattern here, and observed that these sequences are given by:

$$ a_n(k)=\frac{\lceil 2^{k-2} \rceil}{k} \sum_{i=0}^{k-1} (-1)^i \binom{k}i (k-i)^n$$

And the number of apples on the day $n$ is thus given by:

$$ A(n)= \sum_{k=1}^n\sum_{i=0}^{k-1} (-1)^i \binom{k}i (k-i)^n\frac{\lceil 2^{k-2} \rceil}{k} $$



The fact that the formula was observable from the data is nice, but how do you now mathematically show (prove) this expression holds for all $n$?

Q: How would one solve this (or similar problems) algebraically to arrive at this expression? Without relying on computation and pattern analysis?

What if the growing condition per apple was slightly changed; what are the methods to solve this and similar problems?


Solution 1:

First we need to formalize the definition to have something to work with. We can view your problem as a sequence of oriented graphs, starting with single node (first apple) and then connecting iteratively additional nodes. Now in each day (let's say $n$-th day) we have certain number of apples that are $k$ steps away from the first apple, let's call this number $a_{n,k}$ (so far same as your $a_n(k)$). But we also we have certain number incoming edges for these (the "connections"), let's call this number $i_{n,k}$. Similarly, we have certain number of outgoing edges, say $o_{n,k}$. This way we have defined $3$ sequences for which it is relatively easy to find recurrence relation. Later on we will just simplify these into single sequence.

Let's investigate these $3$ sequences. Since there are no edges on first day, we have $i_{1,k}=o_{1,k}=0$. Also each day we have $0$ incoming edges into $1$-st apple, and so $i_{n,1}=0$. Also notice that on first day we start with just one apple, and so $a_{n,1}=1$ and $a_{1,k}=0$ for $k>1$.

Now since number of edges from/to nodes $k$-steps away grow from each separate day based on how many nodes and edges were there in previous day, and this happens with weight $k-1$ (this corresponds to your "every apples grows a new apple for every connected apple, including itself", so we have $1$ contribution from incoming edge, $1$ from outgoing edge and one from each node/apple itself). So for number of incoming edges we have

$$ i_{n,k} = i_{n-1,k} +(k-1) (i_{n-1,k-1}+o_{n-1,k-1}+a_{n-1,k-1}). $$

Number of outgoing edges from $k$ apples far away grows only based on number of edges and nodes that are already $k$-apples far away, with weight exactly equal $k$, and so

$$ o_{n,k} = o_{n-1,k} +k (i_{n-1,k}+o_{n-1,k}+a_{n-1,k}). $$

As for last sequence $a_{n,k}$ (the actual number of nodes), this one grows the same way as the number of incoming edges, and so

$$ a_{n,k} = a_{n-1,k}+(k-1)(i_{n-1,k-1}+o_{n-1,k-1}+a_{n-1,k-1}) $$ but notice that this sequence is not the same as $i_{n,k}$ because of different initial conditions. Now if we denote $$ c_{n,k} = i_{n,k}+o_{n,k}+a_{n,k} $$ where initial conditions $c_{n,1}=2^{n-1}$ and $c_{1,k}=0$ for $k>1$ can be verified from original sequences, we can write a bit more compactly \begin{align} i_{n,k} &= i_{n-1,k} +(k-1) c_{n-1,k-1}\\ o_{n,k} &= o_{n-1,k} +k c_{n-1,k}\\ a_{n,k} &= a_{n-1,k} +(k-1) c_{n-1,k-1} \end{align} We can do little trick to convert the sequences above to single two variable sequence, simply by summing all $3$ equations we obtain $$ c_{n,k}=(k+1)c_{n-1,k}+2(k-1) c_{n-1,k-1}. $$ Eventually since we are only interested in $a_{n,k}$, we can write \begin{align} a_{n,k} &= a_{n-1,k} +(k-1) c_{n-1,k-1}\\ &= a_{n-2,k} +(k-1) c_{n-2,k-1}+(k-1) c_{n-1,k-1}\\ \ \ &\vdots\\ &= (k-1) c_{1,k-1}+ \dots +(k-1) c_{n-2,k-1}+(k-1) c_{n-1,k-1}\\ &= (k-1) \sum_{i=1}^{n-1}c_{i,k-1} \end{align} and plugging in the $c_{n,k}$ recurrence relation (assuming $k>2$) we get \begin{align} a_{n,k} &= (k-1) \sum_{i=1}^{n-1}c_{i,k-1}\\ &= (k-1) \sum_{i=2}^{n-1}c_{i,k-1}\\ &= (k-1) \sum_{i=2}^{n-1}(kc_{i-1,k-1}+2(k-2)c_{i-1,k-2})\\ &= (k-1)k \sum_{i=2}^{n-1}c_{i-1,k-1}+2(k-1)(k-2)\sum_{i=2}^{n-1}c_{i-1,k-2}\\ &= (k-1)k \sum_{i=1}^{n-2}c_{i,k-1}+2(k-1)(k-2)\sum_{i=1}^{n-2}c_{i,k-2}\\ &= k a_{n-1,k}+2(k-1)a_{n-1,k-1} \end{align} For initial $k=2$ case we have $a_{n,2}=\sum_{i=1}^{n-1}c_{i,1} = \sum_{i=1}^{n-1}2^{i-1} = 2^{n-1}-1$. (By the way this different behavior for $k=2$ corresponds to your $\lceil 2^{k-2} \rceil$ expression). So summarizing what we have so far $$ a_{n,k} = \begin{cases} 1&k=1\\ 2^{n-1}-1&k=2\\ 0 &n=1,k>2\\ ka_{n-1,k}+2(k-1)a_{n-1,k-1} &n>1, k>2 \end{cases} $$ Now it becomes an algebraic problem to either derive the closed form formula, or at least prove that it is equal to the one you conjectured. I will do the latter since it seems much simpler at this point (and there is nothing wrong about figuring out pattern/formula from data and then prove it additionally).

So let's prove that $$ a_{n,k} = a_n(k) = \frac{\lceil 2^{k-2} \rceil}{k} \sum_{i=0}^{k-1} (-1)^i \binom{k}i (k-i)^n. $$ The initial cases are simple, for example for $k=1$: $$ a_n(1) = (-1)^0\binom{1}{0}(1-0)^n = 1. $$ Similarly $$ a_n(2) = \frac{1}{2}\left[(-1)^0\binom{2}{0}(2-0)^n+(-1)^1\binom{2}{1}(2-1)^n\right] = \frac{1}{2} (2^n-2) = 2^{n-1}-1$$ and $$ a_1(k) = \frac{2^{k-2}}{k} \sum_{i=0}^{k-1} (-1)^i \binom{k}i (k-i) = 2^{k-2} \sum_{i=0}^{k-1} (-1)^i \binom{k-1}{i} = 2^{k-2}\left(1+(-1)\right)^{k-1} =0. $$ where we used $\binom{k}i (k-i)=k\binom{k-1}{i}$ and the binomial theorem.

The last case is to prove that for $n>1, k>2$: $$ a_n(k) - ka_{n-1}(k)- 2(k-1)a_{n-1}(k-1) = 0. $$ Let's simplify the left side

\begin{align} LHS&=\frac{2^{k-2}}{k} \sum_{i=0}^{k-1} (-1)^i \binom{k}i (k-i)^n - k \frac{2^{k-2}}{k} \sum_{i=0}^{k-1} (-1)^i \binom{k}i (k-i)^{n-1}\\ \ \ \ &-2(k-1) \frac{2^{k-3}}{k-1} \sum_{i=0}^{k-2} (-1)^i \binom{k-1}i (k-1-i)^{n-1}\\ &=2^{k-2}\left[\frac{1}{k}\sum_{i=0}^{k-1} (-1)^i \binom{k}i (k-i)^n -\sum_{i=0}^{k-1} (-1)^i \binom{k}i (k-i)^{n-1}- \sum_{i=0}^{k-2} (-1)^i \binom{k-1}i (k-1-i)^{n-1} \right]\\ &=2^{k-2}\left[\frac{1}{k}\sum_{i=0}^{k-1} (-1)^i \binom{k}i (k-i)^n -\sum_{i=0}^{k-1} (-1)^i \binom{k}i (k-i)^{n-1}- \sum_{i=1}^{k-1} (-1)^{i-1} \binom{k-1}{i-1} (k-i)^{n-1} \right]\\ &=2^{k-2}\sum_{i=1}^{k-1}\left[(-1)^i \left( \frac{1}{k}\binom{k}i (k-i)^n - \binom{k}i (k-i)^{n-1} + \binom{k-1}{i-1} (k-i)^{n-1} \right)\right]\\ &=2^{k-2}\sum_{i=1}^{k-1}\left[(-1)^i(k-i)^{n-1} \left( \binom{k-1}{i} - \binom{k}i + \binom{k-1}{i-1} \right)\right]\\ &=0. \end{align} where at the end we used $\binom{k}{i}=\binom{k-1}{i}+\binom{k-1}{i-1}$. This finishes the proof and verifies your formula is correct.

There are possibly other ways to approach this, for example some of the things might be proven combinatorially instead (the sums look suspiciously like inclusion-exclusion principle), but this is besides the point. Once we have arrived at recursive relations (or anything precise), we can derive the result algebraically.

This answer might feel unsatisfactory since we did not really arrive at the formula, we have merely proven that it is correct, the derivation itself is still "just" an algebraic issue once we have the recursion and should be in principle possible. One approach could be to formulate bivariate generating function from the recursion formula we found.