How do we find the structure of a set with the most equidistribution using entropy?
Edit: I made a simpler example since the last one was difficult
Section 1: Motivation
Suppose $f:A\to\mathbb{R}$, where $A$ is a countably infinite subset of the real numbers.
We can enumerate $A$ as $\left\{a_n\right\}_{n=1}^{\infty}$ and, for a natural number $t$, consider the average
$$\lim\limits_{t\to\infty}\frac{f(a_1)+f(a_2)+\cdots+f(a_t)}{t}$$
which may not exist, or could vary depending on the enumeration. To increase the chances that an average exists, and to reduce its dependence on the enumeration, I did the following:
Split $A$ into a sequence of finite subsets $F_1,F_2,\dots$ such that $F_1\subset F_2\subset\cdots$ and $\bigcup\limits_{n=1}^{\infty}F_n=A$.
The sequence of subsets will be called a "structure of $A$", and a natural example (such as for $A=\mathbb{Z}$) would be $$F_n=\left\{m\in\mathbb{Z}:-n\le m\le n\right\}$$
Following this, I want a structure that has an average and the most equidistribution, both of which I will define below.
Section 2: Definition Of Average From Structure
Suppose $f:A\to\mathbb{R}$, where $A$ is a countably infinite subset of the real numbers. Then the average $\text{avg}\left(\left\{F_n\right\},f\right)$ exists if for every $\epsilon>0$ there exists a sufficiently large integer $N$ such that for all $n\ge N$:
$$\left|\frac{1}{|F_n|}\sum\limits_{x\in F_{n}}f(x)-\text{avg}\left(\left\{F_n\right\},f\right)\right|\le \epsilon$$
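As a concrete illustration (my own, not from the post), here is a minimal Python sketch of the finite averages $\frac{1}{|F_n|}\sum\limits_{x\in F_{n}}f(x)$ along the structure $F_n=\left\{m\in\mathbb{Z}:-n\le m\le n\right\}$ from Section 1, using the hypothetical choice $f(x)=1$ for $x\ge0$ and $f(x)=0$ otherwise; the finite averages settle near $1/2$, so $\text{avg}\left(\left\{F_n\right\},f\right)=1/2$ for this structure.

```python
# Minimal sketch of the definition above. The function f is a hypothetical
# example (the indicator of the nonnegative integers), not from the post.
def finite_average(f, F):
    """Compute (1/|F_n|) * sum of f(x) over x in the finite set F_n."""
    F = list(F)
    return sum(f(x) for x in F) / len(F)

f = lambda x: 1 if x >= 0 else 0

for n in (10, 100, 10000):
    F_n = range(-n, n + 1)            # F_n = {-n, ..., n}
    print(n, finite_average(f, F_n))  # approaches 1/2 as n grows
```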
Depending on the structure we choose, we may get a different average. However, for an intuitive average, we want a structure with the most equidistribution, as defined below:
Section 3: Steps To Determining A Structure With The Most Equidistribution
1. Arrange the values in $F_n$ from least to greatest and take the absolute difference between consecutive elements. Call this $\Delta F_n$. (Note $\Delta F_n$ could be a multiset.)
Ex 1.1: If $A=\mathbb{Z}$ and $F_n=\left\{m\in\mathbb{Z}:-n\le m\le n\right\}$, then $F_2=\left\{-2,-1,0,1,2\right\}$ and $\Delta F_2=\left\{-1-(-2),0-(-1),1-0,2-1\right\}=\left\{1,1,1,1\right\}$. Here the elements repeat, making this a multiset.
2. Divide each element of $\Delta F_n$ by the sum of all its elements, so we get a distribution whose elements sum to $1$. Call this $\Delta F_n/\sum\limits_{x\in\Delta F_n}x$, or "the information probability of the structure".
Ex 2.1: From example 1.1, note $\sum\limits_{x\in\Delta F_2}x=1+1+1+1=4$ and
$\Delta F_2/\sum\limits_{x\in\Delta F_2}x=\frac{1}{4}\cdot \Delta F_2=\left\{1/4,1/4,1/4,1/4\right\}$. Note the elements in this set sum to $1$ and can hence act as a probability distribution (even though the elements are not actual probabilities).
3. As the elements of the information probability always sum to $1$, we can measure how close it is to a discrete uniform distribution using entropy (from information theory): the closer to uniform, the higher the entropy. This can be written as $$E(F_n)=-\sum\limits_{j\in\Delta F_n/\sum\limits_{x\in \Delta F_n}x}j\log j$$
From example 2.1, $E(F_2)$ is the same as $-\sum\limits_{j\in\left\{1/4,1/4,1/4,1/4\right\}}j\log j=-\left(1/4\log \left(1/4\right)+1/4\log \left(1/4\right)+1/4\log\left( 1/4\right)+1/4\log\left(1/4\right)\right)=\log 4\approx .602$ (taking $\log$ base $10$).
The structure whose information probability gives the highest entropy has the greatest equidistribution.
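Assembled into code, Steps 1-3 read as follows; this is a minimal Python sketch under the definitions above, with $\log$ taken base $10$ so that it reproduces the $\approx .602$ figure from example 2.1.

```python
from math import log10

def entropy(F):
    """Entropy E(F_n) of the information probability of a finite set F_n,
    following Steps 1-3 above (log base 10)."""
    xs = sorted(F)                               # Step 1: arrange least to greatest
    gaps = [b - a for a, b in zip(xs, xs[1:])]   # consecutive differences (a multiset)
    total = sum(gaps)                            # Step 2: normalize so entries sum to 1
    probs = [g / total for g in gaps]
    return -sum(p * log10(p) for p in probs)    # Step 3: Shannon entropy

# Worked example: F_2 = {-2, ..., 2} gives log10(4) ~ 0.602, matching Ex 2.1.
print(entropy(range(-2, 3)))
```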
However, this approach comes with problems:
- For $A=\mathbb{Z}$, any two structures built from intervals of consecutive integers (such as the $F_{a,b,n}$ below) have the same entropy whenever they have the same number of elements: every gap equals $1$, so $E(F_n)=\log\left(\left|F_n\right|-1\right)$ depends only on $\left|F_n\right|$.
- In order for an average to exist, some structures must add more than one element as $n$ increases by one. This means that when comparing different structures with a defined average, it is hard to trust which one has the greatest equidistribution unless the structures have the same number of elements.
For $A=\mathbb{Z}$, the structure with the most equidistribution should be $F_n=\left\{m\in\mathbb{Z}:-n\le m \le n\right\}$; however, there are many structures of the form $F_{a,b,n}=\left\{m\in\mathbb{Z}:-an\le m \le bn\right\}$ where $a,b\in\mathbb{N}$. (Note: depending on the proportion of $a$ to $b$, we could get a completely different average, as the sketch below illustrates.)
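As a numerical illustration of that note (with a hypothetical $f$ of my own choosing, the indicator of the nonnegative integers), the average along $F_{a,b,n}$ works out to $b/(a+b)$, so different proportions of $a$ to $b$ really do give different averages:

```python
# Hypothetical f (my choice, not from the post) whose average depends on (a, b).
f = lambda x: 1 if x >= 0 else 0

def interval_average(f, a, b, n):
    """Finite average of f over F_{a,b,n} = {-an, ..., bn}."""
    F = range(-a * n, b * n + 1)
    return sum(f(x) for x in F) / len(F)

n = 10**5
print(interval_average(f, 1, 1, n))  # ~ 1/2  (a = b = 1)
print(interval_average(f, 1, 3, n))  # ~ 3/4  (= b/(a+b))
```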
When I faced a similar problem with $A=\left\{\frac{1}{n}:n\in\mathbb{N}\right\}$, I assumed I could do the following (which we will call Statement A):
Suppose $\mathbb{S}(A)$ represents all structures of $A$ with a defined average for the function $f$ (see Section 2). If $s,n,j\in\mathbb{N}$, I wish to find all $\left\{G_s\right\}\in \mathbb{S}(A)$ such that:
$\forall\left(\left\{F_n\right\}\in\mathbb{S}(A)\right)\exists\left(s\in\mathbb{N}\right):\forall\left(j\ge s\right)\exists\left(n\in\mathbb{N}\right)\left(E(F_n)\le E(G_j)\Rightarrow|G_j|\le|F_n|\right)$.
where $\left\{G_j\right\}$ and $\left\{F_n\right\}$ are equivalent when $\left(\forall n\right)\left(\forall j\right) \left(\bigg[ \left|G_n\right|\le\left|F_j\right| \Rightarrow G_n\subseteq F_j \bigg] \lor \bigg[\left|F_j\right|\le\left|G_n\right|\Rightarrow F_j\subseteq G_n \bigg]\right)$
However, I assume this doesn't work when $A=\mathbb{Z}$ and
$$f(x)=\begin{cases} 2 & x\text{ is even} \\ 1 & x\text{ is odd} \end{cases}$$
since almost all or none of the $F_{a,b,n}$ that give a defined average would satisfy Statement A.
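Here is a quick numerical check (my own, under the definitions above) of why this $f$ is troublesome: every interval structure $F_{a,b,n}$ yields the same average $3/2$, while for intervals the entropy $E(F_{a,b,n})=\log\left((a+b)n\right)$ depends only on the number of elements, so neither quantity can single out $a=b=1$ in Statement A.

```python
# The even/odd function from above: every F_{a,b,n} gives the same average.
f = lambda x: 2 if x % 2 == 0 else 1

def interval_average(f, a, b, n):
    """Finite average of f over F_{a,b,n} = {-an, ..., bn}."""
    F = range(-a * n, b * n + 1)
    return sum(f(x) for x in F) / len(F)

n = 10**5
for a, b in [(1, 1), (1, 2), (3, 5)]:
    # Each average is ~ 3/2; for intervals all gaps equal 1, so the entropy
    # depends only on the number of elements, not on the choice of (a, b).
    print(a, b, interval_average(f, a, b, n))
```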
Questions
- How do we correct Statement A so that the only $\left\{G_s\right\}$ is $F_{n}=\left\{m\in\mathbb{Z}:-n\le m\le n\right\}$?
- For question 1), if this is not possible, is there a simpler way, without using entropy, to calculate the structure with the most equidistribution that gives the same result as my assumption for $G_s$?
- If $f$ has more than one average, how do we determine $G_s$ for other types of $A$ and $f$?
I'd like to make some comments regarding your construction, as well as directions you might be interested in taking or in using to approximate your construction.
To me there seem to be three parts to your construction:
- Sets $A$ over which one can take averages of functions. (~ ergodic theorems)
- A notion of entropy compatible with that in information theory (~ Shannon-McMillan-Breiman theorems)
- Probability distributions on average-admitting $A$'s that maximize entropy (~ thermodynamic formalism/study of measures of maximal entropy as well as strict/unique ergodicity)
(In parens I wrote the possible directions that each item is reminiscent of.)
Regarding $A$:
- It seems to me that implicitly you have a "$1$-dimensionality" assumption on $A$, as you are using some ordering. In general you would need a graph (a set to code nodes and a set to code adjacency relations). (For such matters, books on spectral graph theory, symbolic dynamics, and geometric group theory would be worth looking into.)
- On that note, instead of taking $A$ as a set of numbers, it might be better to think of $A$ as a graph with weights (either on vertices or on edges). With this perspective, given $A$ as a graph, one can think of the space of probability distributions (which could be interpreted in a variety of ways), which would then lead to a more convenient formulation of the problem of entropy maximization. (For this, books on ergodic theory (for "$1$-dimensional" $A$) as well as Lindenstrauss' paper "Pointwise Theorems for Amenable Groups" (https://www.ams.org/journals/era/1999-05-12/S1079-6762-99-00065-7/S1079-6762-99-00065-7.pdf) could be useful. It seems your case is a non-autonomous version of these; still, it might be better to start with more structure and then weaken the hypotheses.)
- A "structure" on $A$ reminds me of Følner sequences (for example, $F_n=\left\{-n,\dots,n\right\}$ is a Følner sequence for $(\mathbb{Z},+)$, since $|(F_n+1)\,\triangle\, F_n|/|F_n|=2/(2n+1)\to 0$). Maybe you are going for something different, but based on my understanding I would guess that there is a global structure on $A$ equivalent/analogous to the existence of Følner sequences (such a global structure would then (possibly) generalize to uncountable sets $A$; in the case of Følner sequences the structure is amenability, or polynomial growth, depending on your perspective).
Regarding $f$:
- I would not expect the existence of the $A$-averages of a function $f$ to restrict $A$ that much. One could instead use function spaces (even then, different function spaces are often used not specifically to guarantee averages, but to control the rates of convergence of the finite set averages to the asymptotic average).
- From another point of view, taking $f=1$ shows that one first needs to settle the averaging structure on $A$ itself.
- Instead of optimizing entropy, you may also be interested in optimizing the $A$-averages, given $f$. This leads to ergodic optimization problems, which seem to be getting more traction recently. (A good reference for this is Jenkinson's paper "Ergodic optimization in dynamical systems" (https://arxiv.org/abs/1712.02307).)
Regarding Entropy:
- Given the remarks in the previous section, I would expect that the entropy of $A$ depends only on (the underlying graph and) the probability distribution.
- Before optimization matters, it might be better to establish an analog of the Shannon-McMillan-Breiman theorem, so that the entropy of this construction is indeed analogous to entropy as people already use it.
- Once an SMB theorem is established, one could try adapting the methods of thermodynamic formalism. In particular, one perhaps could define the analog of pressure, which would involve both the probability distribution and the function $f$. (Bowen's book Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms (https://www.cpht.polytechnique.fr/sites/default/files/Bowen_LN_Math_470_second_ed_v2013.pdf) is the standard reference for these; any other ergodic theory book with a chapter titled (/ title including) "thermodynamic formalism" would also be useful.)
- Alternatively, for the other approach through equidistribution, I would expect that $A$ admits a unique probability distribution (to say the least). (For this, Furstenberg's book Recurrence in Ergodic Theory and Combinatorial Number Theory could be useful.)