Counting the numbers with certain sum of digits.

The question :

In how many different numbers between $1$ and $100000000$ have the sum of their digits equal to $45$?

I'm thinking about using the stars and bars formula but I'm not sure if it's possible and how.

Thanks in advance !


Solution 1:

You can use inclusion exclusion to get the answer 2706880. First let's reformulate the problem. We need to find the number of non-negative integer solutions to

$$ x_1 \, + \, x_2 \, + \; ... \; + \, x_k = n \;\;\;\;\;\;\;\;\;\; (*)$$

where $x_i < \mu$. If we can solve this, then we can simply plug in $n=45, k=8$ and $\mu=10$ to solve your problem. Each $x_i$ represents the $i$'th digit of a given number between $0$ and $10^k$. Now if we don't have the restriction that $x_i<\mu$, then the solutions are known as weak compositions, and the number of such weak compositions is easily seen to be ${ n+k-1 \choose k-1}$ using stars and bars enter image description here

To handle the constraint $x_i < \mu$ we can use the principle of inclusion-exclusion which goes as follows: Let $S$ be a set of objects, let $A,B,C$ be subsets of $S$ that satisfy properties $a,b,c$, respectively. Suppose we want to find the number of elements that don't satisfy any of the properties $a,b$, or $c$. That is, we want to count $|S - (A \cup B \cup C)|$ or the shaded part in the following diagram.

$\qquad \qquad \qquad \qquad \qquad \qquad \qquad$ enter image description here

We can count this by sieving

$$|S| \; - \; (|A|+|B|+|C|) \; + \; \left( |A \cap B|+|A \cap C|+|B \cap C| \right) \; - \;|A \cap B \cap C| $$

$ \qquad \qquad \qquad \quad $ enter image description here

This is easily generalized for $k$ different properties. Going back to your problem, we let $S$ be the set of all solutions to (*) and let $P_i$ be the subset where $x_i \geq \mu$. Then we want to count

$$\left|S - (P_1 \cup P_2 \, \cup \; ... \; \cup \, P_k \right)| = \left| \, S \, \right| - \sum_i\, \left|P_i\right| \; + \; \sum_{i,j} \left |P_i \cap P_j \right| \; - \;\; ...$$

But what are the summands? Or specifically, what is $|P_i|$? Solutions in $P_i$ are solutions to the new problem

$$ \begin{array} \; & x_1 \, + \, x_2 \, + \; ... \; +&(x_i)_{\geq\mu} &+ \; ... \; + \, x_k &= n \\ \leadsto & x_1 \, + \, x_2 \, + \; ... \; + &(\mu+x_i') &+ \; ... \; + \, x_k &= n \\ \leadsto & x_1 \, + \, x_2 \, + \; ... \; + &(x_i')_{\geq 0} &+ \; ... \; + \, x_k &= n-\mu \end{array}$$

But we already know this to be ${n-\mu+k-1 \choose k-1}$. Following a similar pattern, we notice

$$ \begin{array} ( \left| S \right| &= {n+k-1 \choose k-1} \\ \left| P_i \right| &= {n+k-1-\mu \choose k-1} & \small{\;\;\;\;\; \forall \, i}\\ \left| P_i \cap P_j \right| &= {n+k-1-2\mu \choose k-1} & \small{\;\;\;\;\; \forall i,j}\\ \left| P_i \cap P_j \cap P_\ell \right| &= {n+k-1-3\mu \choose k-1} &\small{\;\;\;\;\; \forall \, i,j,k}\\ &... \end{array} $$

So our desired sum becomes

$$\begin{array} (\displaystyle \left|S - (P_1 \cup P_2 \, \cup \; ... \; \cup \, P_k) \right| & = \left| \, S \, \right| - {k \choose 1} \left|P_i\right| \; + \; {k \choose 2} \left |P_i \cap P_j \right| \; - \;\; ... \\ \\ & = \displaystyle \sum_{i=0}^{\min \{k, \lfloor n/\mu \rfloor \}} (-1)^i {k \choose i} {n+k-1-i\mu \choose k-1} \\ \end{array} $$

We truncate the sum at $min \{k, \lfloor n/\mu \rfloor \} $ because the summands are zero from then onward (by definition). To count the number of integers less than $10^8$ with digit sum $45$, we plug in $n=45, k=8$, and $\mu=10$ and get

$$\sum_{i=0}^4 (-1)^i {8 \choose i} {52-10i \choose 7} = 2706880$$

Solution 2:

Yes, this is similar to a stars-and-bars problem. However, there's the extra constraint that each digit be at most nine. Therefore, your problem is equivalent to this one (with $k=45$, $n=8$ and $c=9$). For the answer in that question given as a sum of $n=8$ terms, Maple gives 2,706,880 numbers between 1 and 100,000,000 whose digits have a sum of 45.

As an alternative noted in the comments, you can approximate this result with the central limit theorem. You can think of each digit as a random variable uniformly distributed between 0 and 9 (with mean $\frac92$ and standard deviation $\frac{\sqrt{33}}2$). Then, by the central limit theorem, the sum $S$ of the digits in an eight-digit number would have a distribution that is approximately normal (with a mean of $8\cdot\frac92$ and standard deviation $\frac{\sqrt{8\cdot33}}2$). Using this normal distribution, $\text{prob}(44.5<S<45.5)\approx 0.0266$, so we get an approximation of $100,000,000\cdot0.0266=2,660,000$ numbers satisfying your condition.