Consider an alphabet $B = \{1, 2, \ldots, K \}$ and let $B^n$ be its $n$-fold Cartesian product.

$y^n \in B^n$ denotes a sequence of length $n$ with $y_i \in B$ for $1 \leq i \leq n$.

$\mathcal{P}(B)$ denotes the set of all probability distributions on $B$.

The Shtarkov sum is defined as

$$S_n = \sum_{y^n \in B^n } \sup_{p \in \mathcal{P}(B)} p^n(y^n)$$ where $p^n(y^n) = \prod \limits_{i=1}^n p(y_i)$
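The sum above can be evaluated exactly for small $n$ and $K$: since the $y_i$ are weighted by a product distribution, $\sup_{p} p^n(y^n)$ is attained at the empirical (maximum-likelihood) distribution $\hat p(k) = n_k/n$, where $n_k$ is the number of occurrences of symbol $k$ in $y^n$. The term for $y^n$ then depends only on the count vector, so $S_n$ collapses to a sum over count vectors weighted by multinomial coefficients. A minimal sketch (function names are my own):

```python
import math

def compositions(n, K):
    """Yield all K-tuples of nonnegative integers summing to n."""
    if K == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, K - 1):
            yield (first,) + rest

def shtarkov_sum(n, K):
    """Exact S_n: for each count vector (n_1, ..., n_K),
    sup_p p^n(y^n) = prod_k (n_k/n)^{n_k}, attained at the
    empirical distribution, times the multinomial coefficient
    counting the sequences with those counts."""
    total = 0.0
    for counts in compositions(n, K):
        log_multinom = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
        log_ml = sum(c * math.log(c / n) for c in counts if c > 0)
        total += math.exp(log_multinom + log_ml)
    return total
```

As a sanity check, $S_1 = K$ (each length-1 sequence has $\sup_p p(y_1) = 1$), and for $K = 2$, $S_2 = 1 + 2 \cdot \tfrac14 + 1 = 2.5$.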

Can anyone provide some references containing approximations of $S_n$? I am actually trying to see whether, for large $n$, something like $$\log S_n \approx \frac{K - 1}{2} \log (n)$$ has been proven somewhere.

Edit: I found the answer. See below


Paper reference: A. Orlitsky, N. P. Santhanam, and J. Zhang, "Universal compression of memoryless sources over unknown alphabets," IEEE Transactions on Information Theory, 2004.

In this paper, the equality between equations $(2)$ and $(4)$ directly implies the following result about $S_n$: $$\log S_n \approx \frac{K - 1}{2} \log (n)$$

More precisely, $$\log S_n = \frac{K - 1}{2} \log \left (\frac{n}{2\pi} \right ) + \log \left( \frac{\Gamma(1/2)^K}{\Gamma(K/2)} \right) + o_K(1), $$

where the $o_K(1)$ term goes to zero as $n \to \infty$, at a rate determined by $K$.
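The expansion can be checked numerically. For $K = 2$ the exact value reduces to a one-dimensional sum over the count $j$ of one symbol, with $\sup_p p^n(y^n) = (j/n)^j ((n-j)/n)^{n-j}$ at the empirical distribution; the gap to the closed-form right-hand side should shrink as $n$ grows. A sketch (function names are my own; note $\log \Gamma(1/2) = \tfrac12 \log \pi$):

```python
import math

def log_shtarkov_binary(n):
    """Exact log S_n for K = 2, summing over the count j of symbol 1.
    sup_p p^n(y^n) = (j/n)^j ((n-j)/n)^(n-j), weighted by C(n, j)."""
    total = 0.0
    for j in range(n + 1):
        log_binom = math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
        log_ml = (j * math.log(j / n) if j else 0.0) \
               + ((n - j) * math.log((n - j) / n) if n - j else 0.0)
        total += math.exp(log_binom + log_ml)
    return math.log(total)

def log_shtarkov_asymptotic(n, K):
    """Right-hand side of the expansion, dropping the o_K(1) term:
    (K-1)/2 * log(n / 2pi) + log(Gamma(1/2)^K / Gamma(K/2))."""
    return (K - 1) / 2 * math.log(n / (2 * math.pi)) \
        + K * math.lgamma(0.5) - math.lgamma(K / 2)
```

For example, comparing `log_shtarkov_binary(n)` against `log_shtarkov_asymptotic(n, 2)` at $n = 500$ and $n = 2000$ shows the gap decreasing toward zero, consistent with the $o_K(1)$ term.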

Brief interpretation:

$\log S_n$ is shown to equal the worst-case redundancy of universal lossless source coding of i.i.d. sources taking values in the alphabet $B$.

Here i.i.d. means independent and identically distributed; such a source is also called a discrete memoryless source in the paper.

Universal means the source distribution is unknown, but the source is known to be i.i.d. over the alphabet $B$.