Information-theoretic aspects of mathematical systems?

You are asking, if I understand you correctly, how much information is lost when a mapping sends two elements to a single symbol representing their relation. In your example, the division function maps $a,b$ to $\frac ab$, and you choose the symbol $c$ to represent this relation between $a$ and $b$.

Unfortunately, we can say nothing about the loss of information of individual elements, because a single element essentially carries no information. In information theory, we discuss information only in terms of possibilities: possibility means uncertainty, and uncertainty is what information measures. This is how Boltzmann defined entropy as a measure of information, $$S=k_B\ln\Omega$$ where $\Omega$ is the total number of possibilities and $k_B$ is a constant. The point is that rather than measuring the loss of information of a single element, which is only one possibility, we should measure it for the sets the elements belong to. The sets here are, of course, the domain and range of the function.
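
As a standard illustration (my own, not from the question): a fair six-sided die has $\Omega=6$ equally likely outcomes, so $$S=k_B\ln 6\approx 1.79\,k_B.$$ Dropping the physical constant $k_B$, as we do below, gives information measured in nats.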


Following the idea above, we introduce some notation. The mapping $\Phi$ has domain $\mathcal D$ and range $\mathcal R$. The size of a set $A$ is denoted $|A|$ and the information it contains $S_A$. Assume for now that both $\mathcal D$ and $\mathcal R$ are finite. We define the information contained in $\mathcal D$ as the logarithm of the total number of possibilities in $\mathcal D$, namely $$S_{\mathcal D}:=\ln|\mathcal D|$$ Sometimes $\Phi=\Phi(x_1,\ldots,x_n)$ is an $n$-variable function with $x_i\in X_i$; then $\mathcal D=X_1\times\cdots\times X_n$ is the Cartesian product of the individual domains, in which case $|\mathcal D|=|X_1|\cdots|X_n|$.

On the other hand, we do not want to define $S_\mathcal R=\ln|\mathcal R|$, because $\mathcal R$ is not independent of $\mathcal D$: the two are connected by the function $\Phi$. Thus for each $y\in\mathcal R$ we write $n_y=|\Phi^{-1}(y)|$, where $\Phi^{-1}(y)\subseteq\mathcal D$ is the preimage of $y$, and $$p_y=\frac{n_y}{\sum_{y\in\mathcal R}n_y}=\frac{n_y}{|\mathcal D|}$$ The information of $\mathcal R$ with respect to $\Phi$ is then defined as $$S_\mathcal R:=\sum_{y\in\mathcal R}p_y\ln\frac{1}{p_y}$$ Hence the loss of information is $$\Delta S=S_\mathcal D-S_\mathcal R\geq0$$ The inequality holds because the concave function $S_\mathcal R$ attains its maximum value $S_\mathcal D$ exactly when $n_y=1$ for every $y$. In other words, the information is preserved if and only if the function is one-to-one. This definition is consistent with the definition of entropy in information theory and, most importantly, with our intuition about loss of information.
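
As a quick sanity check (a minimal sketch of my own, not part of the original answer; the function names are mine), here is how $\Delta S$ can be computed for any function on a finite domain:

```python
import math
from collections import Counter
from itertools import product

def information_loss(domain, phi):
    """Delta S = S_D - S_R for a function phi on a finite domain."""
    counts = Counter(phi(*x) for x in domain)          # n_y for each y in R
    total = sum(counts.values())                       # |D|
    s_domain = math.log(total)                         # S_D = ln |D|
    s_range = sum((n / total) * math.log(total / n)    # S_R = sum p_y ln(1/p_y)
                  for n in counts.values())
    return s_domain - s_range

# Multiplication on {1,...,6} x {1,...,6} is many-to-one, so Delta S > 0;
# the identity map is one-to-one, so Delta S = 0.
dom = list(product(range(1, 7), repeat=2))
print(information_loss(dom, lambda a, b: a * b))   # about 0.78 nats lost
print(information_loss(dom, lambda a, b: (a, b)))  # 0.0
```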

We can easily generalize this method to the case where both the domain and range are compact, simply by replacing summation with integration and set size with measure.
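
One way to write that out (my sketch, following the same notation): take the uniform measure on $\mathcal D$, let $p(y)$ be the density of its pushforward under $\Phi$, and set $$S_\mathcal D:=\ln\mu(\mathcal D),\qquad S_\mathcal R:=\int_{\mathcal R}p(y)\ln\frac{1}{p(y)}\,\mathrm dy,$$ where $\mu$ is the measure on $\mathcal D$. The finite formulas above are recovered when $\mu$ is the counting measure.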


In other cases this method no longer works, but we still have ways to roughly approximate the loss.

Note that the function $\Phi$ itself defines an equivalence relation $\sim$ (namely $x\sim x'$ if and only if $\Phi(x)=\Phi(x')$), so we can define the quotient space $$\mathcal Q=\mathcal D/\sim$$ and set $$\Delta S=\dim\mathcal D-\dim\mathcal Q$$ This has counterparts in many areas, such as linear algebra, group theory, and topology. Sometimes the equivalence class of the identity element is called the kernel; when the kernel is trivial, the two spaces are isomorphic.
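
For linear maps this dimension count is just rank–nullity, which is easy to check numerically (a sketch of my own; the matrix is a made-up example):

```python
import numpy as np

# Phi(x) = A @ x as a linear map R^3 -> R^2; ker Phi plays the role of ~.
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0]])

dim_domain = A.shape[1]                  # dim D = 3
dim_quotient = np.linalg.matrix_rank(A)  # dim Q = dim(D / ker Phi) = rank A
print(dim_domain - dim_quotient)         # Delta S = dim ker Phi = 1
```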


Finally, to give an explanation for your example $c=\Phi(a,b)=ab$: if $a,b\in\mathbb Q$, we have $\Delta S^\mathbb Q=\dim_\mathbb Q(\mathbb Q\times\mathbb Q)-\dim_\mathbb Q(\mathbb Q)=2-1=1$, while if $a,b\in\mathbb R$, we likewise have $\Delta S^\mathbb R=1$. Although both values equal $1$, the cardinality of a one-dimensional rational space is smaller than that of a real one, so the information lost by rational multiplication is less than that lost by real multiplication.

I hope this helps, even if it may not be exactly what you were looking for.


Your particular example of information being lost in multiplication or division is really the reason one studies ideals in ring theory, normal subgroups in group theory, and kernels in general. But I think prime ideals explain it best.

The prime factorization of a number represents how much information was lost in creating it. A prime number has only one factorization: if a prime is written as a product, one of the factors must be a unit, so we essentially know the factors. The more prime factors an element has, the more ways it can arise as a product, and the more information is lost in creating it.
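
To make that concrete (my own illustration, not from the answer): the ordered pairs $(a,b)$ of positive integers with $ab=n$ correspond to the divisors of $n$, so the divisor count measures the size of the preimage of $n$ under multiplication:

```python
def divisor_count(n):
    """Number of ordered pairs (a, b) of positive integers with a * b = n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)

for n in [7, 12, 360]:
    print(n, divisor_count(n))
# 7 2     -- prime: only 1*7 and 7*1, so the factors are essentially known
# 12 6    -- 2^2 * 3
# 360 24  -- 2^3 * 3^2 * 5: more prime factors, more ways to arise as a product
```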

Mathematicians generalized this to prime ideals: a class of products (called an ideal) has its information loss measured by the number of prime ideals in its decomposition.

Fields have the greatest information loss: every element can arise as a product involving any given nonzero factor (for any nonzero $a$ and any $c$, take $b=c/a$), so a product tells you nothing about its factors.

Ideals can also be used in another way to destroy information. Quotienting by an ideal loses information in proportion to the size of the ideal. Quotienting the integers by the ideal of multiples of 10 tells you the last digit of a number, while quotienting by multiples of 2 only tells you whether a number is odd or even.
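
A one-line check of that contrast (plain Python, my own illustration):

```python
n = 1234567
print(n % 10)  # 7 -- image in Z/10Z: the last digit survives the quotient
print(n % 2)   # 1 -- image in Z/2Z: only the parity survives
```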