Counting subsets containing three consecutive elements (previously Summation over large values of nCr)
Problem: In how many ways can you select at least $3$ items consecutively out of a set of $n ( 3\leqslant n \leqslant10^{15}$) items. Since the answer could be very large, output it modulo $10^{9}+7$.
Example:
for $n=4 ({abcd})$,
answer = $3 (abc,bcd,abcd)$
I came up with this expression: $$ \sum_{k=0}^{n-3} C^{n-3}_k + (n-3)\sum_{k=0}^{n-4} C^{n-4}_k $$
The values n could take is so large that the above expression will take ages to be computed. I have no idea of how to simplify it.
Also, because this is an algorithmic problem, there's a time constraint of $5$ sec.
How do I compute the answer within the given time constraint?
Added: The corrected problem is a good deal harder. I’ll denote the set $\{1,\dots,n\}$ by $[n]$. Call a subset of $[n]$ good if it contains at least three consecutive integers, and bad otherwise. Let $g(n)$ be the number of good subsets of $[n]$, and let $b(n)=2^n-g(n)$ be the number of bad subsets of $[n]$. Consider a bad subset $A$ of $[n]$: both $A$ and $A\cup\{n+1\}$ are bad subsets of $[n+1]$ unless $n-1,n\in A$. In that case $n-2\notin A$, so $A\setminus\{n-1,n\}$ is a bad subset of $[n-3]$. Every bad subset of $[n+1]$ is either a bad subset of $[n]$ or a set of the form $A\cup\{n+1\}$ for some bad $A\subseteq[n]$ of the form $B\cup\{n-1,n\}$ for some bad $B\subseteq[n-3]$. There are $b(n)$ bad subsets of $[n]$, and there are $b(n)-b(n-3)$ bad subsets of $[n]$ of the form $B\cup\{n-1,n\}$ for some bad $B\subseteq[n-3]$, so we have the recurrence $$b(n+1)=2b(n)-b(n-3)\;.\tag{1}$$
This implies that
$$\begin{align*}g(n+1)&=2^{n+1}-b(n+1)\\ &=2^{n+1}-\Big(2b(n)-b(n-3)\Big)\\ &=2^{n+1}-2\Big(2^n-g(n)\Big)+2^{n-3}-g(n-3)\\ &=2g(n)-g(n-3)+2^{n-3}\;,\tag{2} \end{align*}$$
giving us a reasonably nice recurrence for $g$ as well. It can also be written as $$g(n+1)=2g(n)+b(n-3)\;,$$ showing that $g(n)$ more than doubles at each stage.
For the curious, here are the first few values of $g(n)$ and $b(n)$:
$$\begin{array}{r|cc} n&3&4&5&6&7&8&9\\ \hline g(n)&1&3&8&20&47&107&238\\ b(n)&7&13&24&44&81&149&274 \end{array}$$
Note that they are not given by the expression $$(n-2)\sum_{k=0}^{n-3} \binom{n-3}k - (n-3)=(n-2)2^{n-3}-(n-3)\;,$$ which gives $10$ for $n=5$. (The eight good subsets of $[5]$ are $\{1,2,3\},\{1,2,3,4\},\{1,2,3,5\}$, $\{1,2,3,4,5\},\{2,3,4\},\{2,3,4,5\}$, and $\{3,4,5\}$.)
The sequence of $g(n)$’s is A050231 in OEIS, which gives my recurrence $(2)$ and a linear recurrence,
$$g(n)=3g(n-1)-g(n-2)-g(n-3)-2g(n-4)\tag{3}$$
with initial conditions $g(0)=g(1)=g(2)=0,g(1)=1$. (These initial conditions also work for $(2)$.) It does not give a closed form.
The sequence of $b(n)$’s is also in OEIS; these are the Tribonacci numbers, OEIS A000073, satisfying
$$b(n)=b(n-1)+b(n-2)+b(n-3)\;,$$
with indices shifted so that the initial conditions are $b(0)=1,b(1)=2,b(2)=4$. They have a known closed form, but it’s not very useful:
$$g(n)=\left\lfloor\frac{\gamma(\alpha+\beta+1)^{n+2}}{3^{n+1}(\gamma^2-2\gamma+4)}+\frac12\right\rfloor\;,$$
where
$$\begin{align*} \alpha&=\big(19+3\sqrt{33}\big)^{1/3}\;,\\ \beta&=\big(19-3\sqrt{33}\big)^{1/3}\;,\text{ and}\\ \gamma&=\big(586+102\sqrt{33}\big)^{1/3}\;. \end{align*}$$
It appears to me that you’ll probably have to use one of the recurrences.
Answer to the problem as originally stated:
There are $$1+n+\binom{n}2=1+n+\frac{n(n+1)}2=\frac{(n+1)(n+2)}2$$ subsets containing fewer than $3$ elements, so the number of sets containing at least $3$ elements is $$2^n-\frac{(n+1)(n+2)}2\;,$$ or, more concisely, $\displaystyle2^n-\binom{n+2}2$.
To compute this modulo $10^9+7$, you may be able to make use of the fact that
$$2^{30}=1,073,741,824\equiv 73,741,817\pmod{10^9+7}\;.$$
Better yet, $10^9+7$ is known to be prime, so by Fermat’s little theorem $2^{10^9+6}\equiv 1\pmod{10^9+7}$.
The question can be reduced to
How many bit strings of $0's$ and $1's$ are there so that there are three consecutive $1's$.
Here $1$ means object is selected and $0$ means not selected.
Let $a_n$ denote the number of bit strings of length $n$ that contain three consecutive $1's$.That will be equal to the number of bit strings of length $n-1$ that contain three consecutive $1's$ with a $0$ added to the end plus (because we have not included the case when last bit is $1$) the number of bit strings of length $n-2$ that contain three consecutive $1's$ with a $10$ added to the end plus (because we have not included the case when last two bit is $11$) the number of bit strings of length $n-3$ that contain three consecutive $1's$ with $110$ added to the end plus the number of bit strings of length $n-3$ with $111$ added to the end (in this case we have to consider all the possibilities of remaining $n-3$ bits )
Hence the recurrence relation will be
$$ a_n = a_{n-1} + a_{n-2} + a_{n-3} + 2^{n-3}$$
for $n>3$ and $a_1 = 0, a_2 = 0, a_3 = 1$.
See this to solve this recurrence.
Hint: how many total subsets are there of a set of size $n$? How many of them have $0, 1,$ or $2$ elements? Now you are summing only $3$ items, not $n-3$.