Why the "self-referential number" function eventually fixes every point
One sort of obvious observation is that shuffling the digits of the input number $N$ doesn't affect the value of $f(N)$ at all.*
This alone significantly limits the number of possible values $f(N)$ can take. While there are $10^8$ distinct non-negative decimal numbers with up to eight digits (or, equivalently, $10^8$ distinct octuples of decimal digits), the number of distinct inputs ignoring the order of the digits is only ${10+8-1 \choose 8} = 24310$.
Also, on every step of the iteration, the number of values that the $k$ times iterated function $f^{(k)}(N)$ can take becomes more and more restricted. For example, for any $0 \le N < 10^8$:
- The last digit of $f(N)$ must be at least $1$, the rest of its digits can sum to at most $8$, and it can contain at most one digit greater than $4$. (And if it does contain a digits greater than $4$, it cannot also contain all digits from $0$ to $4$, as that would violate the sum condition!) Also, the digits of $f(N)$ cannot all be equal.
- Thus, the last digit of $f^{(2)}(N) = f(f(N))$ must be at least $2$ and at most $5$, and thus its first seven digits must include at least two zeros (and cannot be all zeros).
- Thus, in addition to all of the constraints above, the first digit of $f^{(3)}(N) = f(f(f(N)))$ must be at least $2$ and at most $6$, etc.
In such a fashion, one could presumably proceed to build a chain of logical arguments eventually showing that the only possible value of $f^{(8)}(N)$ is $23110105$.
Instead of doing that, though, I decided to write a simple Python script to enumerate all the possible values of $f^{(k)}(N)$ for each $k$, and in particular to print out the range of possible values of each digit. Its output looks like this:
step 1: 0-8, 0-8, 0-8, 0-8, 0-8, 0-8, 0-8, 1-8 (8943 distinct values)
step 2: 0-7, 0-7, 0-4, 0-3, 0-2, 0-1, 0-1, 2-5 (96 distinct values)
step 3: 2-6, 0-4, 0-2, 0-2, 0-2, 0-1, 0-1, 3-5 (18 distinct values)
step 4: 2-5, 1-4, 0-2, 0-2, 0-2, 0-1, 0-1, 4-5 (9 distinct values)
step 5: 2-3, 1-4, 0-2, 0-2, 0-2, 0-1, 0-0, 4-5 (6 distinct values)
step 6: 2-3, 1-3, 0-2, 0-2, 0-2, 0-1, 0-0, 4-5 (4 distinct values)
step 7: 2-3, 1-3, 1-2, 1-1, 0-1, 0-1, 0-0, 5-5 (2 distinct values)
step 8: 2-2, 3-3, 1-1, 1-1, 0-0, 1-1, 0-0, 5-5 (1 distinct value)
From the output above, we can see that the first two iterations are enough to reduce all $10^8$ possible inputs to just 96 different outputs, and the third iteration reduces those further down to just 18 choices: $23110105$, $24001104$, $31211005$, $32021004$, $32102004$, $33001104$, $40211004$, $41021004$, $41102004$, $41110105$, $42001104$, $42010014$, $50021003$, $50110104$, $50200013$, $51010014$, $51100004$ and $60100003$. The remaining five iterations are then needed to reduce these 18 values down to just one.
For a closer look at what happens during those last five iterations, a slight modification to the Python script lets it print the path that each of these 18 values takes to reach the unique fixed point as a nice Unicode tree:
┌► f(23110105) = 23110105
└─┬─ f(31211005) = 23110105
├─┬─ f(32021004) = 31211005
│ └─┬─ f(33001104) = 32021004
│ ├─── f(50110104) = 33001104
│ └─┬─ f(51010014) = 33001104
│ └─── f(60100003) = 51010014
└─┬─ f(32102004) = 31211005
├─┬─ f(24001104) = 32102004
│ └─┬─ f(41110105) = 24001104
│ ├─── f(50021003) = 41110105
│ └─── f(50200013) = 41110105
├─── f(40211004) = 32102004
├─── f(41021004) = 32102004
├─── f(41102004) = 32102004
├─┬─ f(42001104) = 32102004
│ └─── f(51100004) = 42001104
└─── f(42010014) = 32102004
In this tree, the fixed point $23110105$ is on the first row at the top, marked by the arrow tip. Underneath it is the value $31211005$, which is the only one of the 18 values (other than $23110105$ itself) that yields $23110105$ when $f$ is applied to it. Below that are the values $32021004$ and $32102004$ that both yield $23110105$ when fed through $f$, and below each of those are all the inputs that yield each of them in turn, and so on.
To be honest, though, I'm not convinced that there's any particular insight to be gleaned from this graph. Certainly I don't see any obvious or natural candidates for a monotone property $p$ such that $p(f(N)) \ge p(N)$ (with the inequality being strict unless $N$ is the unique fixed point of $f$), although that of course doesn't rule out the possibility that someone smarter than me might find one.
(Of course, given that iteration of $f$ clearly does converge, we could always artificially construct such a property $p$: for example, we could trivially let $p(N)$ be the highest $k \le 8$ such that $N = f^{(k)}(N')$ for some $0 \le N' < 10^8$. But such an artificial construction would yield no useful insight whatsoever, nor would it make proving the convergence of the iteration any easier.)
So it seems that the permutation invariance mostly explains the rapid initial convergence of the iteration into a small number of possible values, and may also explain the general statistical behavior of the size of the image of $f^{(k)}$ as a function of $k$. What it does not explain is the final convergence to just a single fixed point, as opposed to multiple fixed points or limit cycles.
In fact, I believe that this may be just a coincidence, and that arbitrary minor changes to the definition of $f$ may change the eventual result of the iteration.
To test this hypothesis, I decided to see what would happen if we instead considered the function $g(N) = f(N)-1$. (Recall that the last digit of $f(N)$ is always at least $1$, so $f(N)$ and $g(N)$ only differ in their last digit.)
As it turns out, in this case the iteration converges in nine steps to a limit set of five values:
step 1: 0-8, 0-8, 0-8, 0-8, 0-8, 0-8, 0-8, 0-7 (8943 distinct values)
step 2: 0-8, 0-7, 0-4, 0-3, 0-2, 0-1, 0-1, 0-4 (92 distinct values)
step 3: 2-7, 0-4, 0-3, 0-3, 0-2, 0-1, 0-1, 1-4 (17 distinct values)
step 4: 2-6, 0-4, 0-3, 0-3, 0-2, 0-1, 0-1, 2-4 (13 distinct values)
step 5: 2-5, 0-4, 0-3, 0-3, 0-2, 0-1, 0-1, 2-4 (11 distinct values)
step 6: 2-4, 0-4, 0-3, 0-3, 0-2, 0-1, 0-0, 2-4 (9 distinct values)
step 7: 2-4, 0-4, 0-3, 0-3, 0-2, 0-0, 0-0, 2-4 (7 distinct values)
step 8: 2-4, 0-4, 0-3, 0-2, 0-2, 0-0, 0-0, 2-4 (6 distinct values)
step 9: 2-4, 0-4, 0-3, 0-2, 0-2, 0-0, 0-0, 2-4 (5 distinct values)
These five limit values consist of two fixed points ($23111004$ and $31220003$, the latter having no other ancestors within the range of $g^{(3)}$) and a single cycle of three values ($24002002$, $40301002$ and $41111004$), as show in the tree below (slightly hand-edited from the output of the Python code to show the cycle more clearly):
┌► g(23111004) = 23111004
└─┬─ g(32111004) = 23111004
├─┬─ g(41200103) = 32111004
│ └─┬─ g(50200102) = 41200103
│ └─── g(52000002) = 50200102
└─── g(42100013) = 32111004
┌─┬─ g(24002002) = 40301002
│ └─┬─ g(41111004) = 24002002
└─► └─┬─ g(40301002) = 41111004
└─┬─ g(40220002) = 40301002
└─┬─ g(32030002) = 40220002
└─┬─ g(33010103) = 32030002
├─── g(51010103) = 33010103
└─┬─ g(51100013) = 33010103
└─┬─ g(61000002) = 51100013
└─── g(70000001) = 61000002
─► g(31220003) = 31220003
Given this observation, I'm inclined to say that the fact that the limit set of the original iterated function $f$ consists of a single fixed point is mostly just pure luck, aided by the rapid shrinking of the iterated image due to the permutation invariance of the function.
*) Except for the possible ambiguity regarding whether leading zeros should be counted or not. Above, I'm assuming that they should be counted, and that all inputs to $f$ should effectively be zero-padded to eight digits. In any case, this only affects the first few iterations, since it's easy to show that, whether leading zeros are counted or not, $f^{(2)}(N)$ must contain at least one non-leading zero for all $N$, and therefore $f^{(3)}(N)$ and all further iterates must have eight digits with no leading zeros.
The only result I know that lets you show that a map $f : X \to X$ has a unique fixed point that can be obtained by iterating $f$ is the Banach fixed point theorem, and to apply it here we would have to find a metric with respect to which $f$ is a contraction. This seems plausible but I don't see how to do it yet. The metric could be something like a Hamming distance. An easy observation, for example, is that if $n$ and $m$ differ in one digit then $f(n)$ and $f(m)$ differ in at most three digits, each of which has changed by at most $1$, which is not bad.
On the other hand, we could argue that the map $f$ we're interested in is really just not very structured, so maybe it behaves like a random function $f : X \to X$, and we can try to see what we can say about that. Write $n = |X|$ (here $n = 10^8$ or maybe $10^8 - 1$ depending on whether you allow a first digit of zero).
First, note that by linearity of expectation, the expected number of fixed points of $f$ is just $n$ times the probability that any particular $x \in X$ is a fixed point, which is just $\frac{1}{n}$ since the values of $f$ are chosen uniformly. So:
Claim 1: The expected number of fixed points of $f$ is $1$.
(The same is true for a random permutation. Note that the answer doesn't depend on $n$! This gives us some reason to expect approximately this "unique fixed point" behavior heuristically.)
Second, again by linearity of expectation, the expected size of the image $\text{im}(f)$ is $n$ times the probability that any particular $x \in X$ is in the image. In turn this is $1$ minus the probability that $x$ is not in the image, which is $\left( 1 - \frac{1}{n} \right)^n \approx e^{-1}$. So:
Claim 2: The expected size of $\text{im}(f)$ is $$n \left( 1 - \left( 1 - \frac{1}{n} \right)^n \right) \approx \left(1 - e^{-1} \right) n \approx (0.632 \dots)n.$$
Write $c = 1 - e^{-1}$. Now we can argue very heuristically as follows. If $f$ is a random function then it ought to still behave like a random function after being restricted to its image (actually I doubt this is really true but hopefuly it's true enough); this restriction gives a map $\text{im}(f) \to \text{im}(f)$ which we can iterate, and if Claim 2 still holds up then we get that the expected size of $\text{im}(f^2)$ is about (again, this is very heuristic) $c^2 n$, and more generally that the expected size of $\text{im}(f^k)$ is about $c^k n$. This tells us to expect to hit a fixed point, or at the very least an element of the eventual image $\text{im}(f^{\infty}) = \bigcap_{k \ge 1} \text{im}(f^k)$, which may contain short cycles, after about
$$- \frac{\log n}{\log c} \approx (2.18 \dots) \log n$$
iterations. (All logarithms are natural here.) Here $n = 10^8$ gives that we expect to hit a fixed point, or something like it, after about
$$(2.18 \dots) \log 10^8 \approx 40$$
steps, which is not so bad but not quite $10$ yet. At this point I'm tempted to switch back to making a Banach fixed point theorem argument work but it's getting late and I should sleep! This at least provides some evidence for repeatedly iterating $f$ as a heuristic strategy even if you don't know it's guaranteed to work ahead of time.
Edit: I haven't yet thought very hard about the specific properties of $f$ itself. As a first pass, after one iteration we can replace $X$ by its image $\text{im}(f)$, which is very clearly not all of $X$. As Thomas says, any element of the image has the property that its first seven digits add up to at most $8$, and we can count exactly how many $7$-tuples of digits have this property.
Exercise: The number of non-negative integer solutions to $\displaystyle \sum_{i=0}^{k-1} a_i \le n$ is $\displaystyle {n+k \choose k}$.
So here we get ${15 \choose 7} = 6435$ possibilities for the first seven digits and $9$ for the eighth, giving
$$|\text{im}(f)| \le {15 \choose 7} \cdot 9 = 57915$$
which is much smaller than $10^8$. Using this as the new value of $n$ we now heuristically expect iteration to converge in
$$- \frac{\log 57915}{\log c} + 1 \approx 25$$
steps. Getting there! Probably a similar analysis can be done at least for $\text{im}(f^2)$.
Edit 2: Sorry for the extreme length of the answer! The heuristic argument I suggested above is not quite right. The exponential shrinking of $\text{im}(f^k)$ doesn't happen the way I said; I found the actual answer in the comments to this nCafe post, which is that the expected size of $\text{im}(f^k)$, for fixed $k$ as $n \to \infty$, is asymptotically
$$\mathbb{E}(|\text{im}(f^k)|) \sim (1 - \tau_k n)$$
where $\tau_0 = 0, \tau_{k+1} = \exp(\tau_k - 1)$. The function $x \mapsto \exp(x - 1)$ has unique positive fixed point $x = 1$ but I'd have to put some thought into describing how quickly it converges to that fixed point.
I also learned that the expected number of periodic points of $f$, which equivalently is the expected size of the eventual image $\text{im}(f^{\infty})$, is asymptotically $\sqrt{ \frac{\pi n}{2} }$. So the function $f$ under consideration does not behave like a random function; it has far fewer periodic points!
So the whole discussion of random functions, while fun from my point of view, ended up being a digression. Sorry! In the next edit I'll try to say something more about this specific function $f$.
Not yet a full answer but here are some comments, not well ordered yet.
1. Some Brute Force Looking at all possibilities, $[2,3,1,1,0,1,0,5]$ is the unique fixed point for $f$.
There are no loops, all $10^8$ possible inputs converge to to this value, in at most 8 steps. Here is an histogram of the number of iteration needed
With data : \begin{array}{c||c} \text{Nb of iterations} &\text{ Nb of inputs}\\ \hline 0&1\\ 1&3359\\ 2&1407840\\ 3&4939200\\ 4&17522400\\ 5&40745460\\ 6&25723446\\ 7&7367026\\ 8&2291268\\ \end{array} And $[0, 0, 0, 0, 7, 7, 8, 9]$ is an example of an input needing 8 iterations. Here is the "path" to the fixed point, I was hoping to use this to look for some invariant or monotone variant but I couldn't find any pattern. \begin{array}{c||c} \text{step}&\text{value}\\ \hline 0&[0, 0, 0, 0, 7, 7, 8, 9]\\ 1&[4, 0, 0, 0, 0, 0, 0, 4]\\ 2&[6, 0, 0, 0, 2, 0, 0, 2]\\ 3&[5, 0, 2, 0, 0, 0, 1, 3]\\ 4&[4, 1, 1, 1, 0, 1, 0, 5]\\ 5&[2, 4, 0, 0, 1, 1, 0, 4]\\ 6&[3, 2, 1, 0, 2, 0, 0, 4]\\ 7&[3, 1, 2, 1, 1, 0, 0, 5]\\ 8&[2, 3, 1, 1, 0, 1, 0, 5] \end{array} 2. Some first thoughts Let $N=[a_0,a_1,\ldots,a_6,a_\#]$ be a fixed point for $f$. Note that
- We must have $$\sum_{i=0}^6 a_i \leq 8\qquad (*)$$
- Given that $[1,1,\ldots,1]$ is not a fixed point, $a_\#>1$
- Suppose that $a_\#>5$, then $$\sum_{i=0}^6 a_i \geq 0+1+2+3+4 > 8$$ a contradiction. Therefore $a_\#\leq 5$.
- Suppose that there is some $i\in \{0,\ldots,6\}$ such that $a_i\geq 7$. Then we must have at least 7 times the same number. Given condition $(*)$ this number can only be 1, and implies that $a_\#=1$ a contradiction. Therefore any for any $i$, $a_i\leq 6$.
- This means that inequality $(*)$ is in fact an equality (we count everything), $$\sum_{i=0}^6 a_i = 8\qquad (1)$$
- It's messy but we can also prove that we must have $a_i< 4$ for any $i\in\{0,\ldots,6\}$. I'm trying to see if I can simplify the argument (few cases : if we have one $a_i=4$, then we must have $i=0$ or $i=1$ and they both imply a contradiction, using $a_\#\geq 2$ and $(1)$).