Estimating probabilities of a random variable from its moments

The entire sequence of moments $m_k = \mathbb{E}(X^k)$ of a random variable determines the distribution function of $X$ uniquely, provided that $\sum_{k=0}^\infty \frac{m_k}{k!} t^k$ converges for all $t$ in an open neighborhood of $t=0$ (i.e., provided the moment generating function exists near the origin).
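For example, for a standard normal random variable the odd moments vanish and $m_{2k} = (2k-1)!!$, so
$$\sum_{k=0}^\infty \frac{m_k}{k!} t^k = \sum_{k=0}^\infty \frac{(2k-1)!!}{(2k)!} t^{2k} = \sum_{k=0}^\infty \frac{t^{2k}}{2^k k!} = e^{t^2/2},$$
which converges for all $t$, so the normal distribution is uniquely determined by its moments.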

If you have two such sequences that coincide up to order $r$ but differ afterwards, then (assuming each determines a distribution) they correspond to different distributions.

You may, however, approximate the distribution function $F_X(x) = \mathbb{P}(X \leq x)$ given the values of the low-order moments, if some assumptions on the nature of the distribution are made. See method of moments estimation, for example.
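As a minimal method-of-moments sketch (the Gamma family and the simulated sample here are just an assumed illustration): matching the first two moments of a Gamma$(k,\theta)$ gives $\theta = \operatorname{Var}/\text{mean}$ and $k = \text{mean}/\theta$.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.gamma(shape=3.0, scale=2.0, size=100_000)  # hypothetical sample

# Method of moments for Gamma(k, theta):
# m1 = k*theta,  m2 - m1^2 = k*theta^2  =>  theta = var/mean, k = mean/theta
m1, var = x.mean(), x.var()
theta_hat = var / m1
k_hat = m1 / theta_hat
print(k_hat, theta_hat)   # ~3.0, ~2.0
```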

Knowledge of the moments determines an upper bound on the tail of the distribution function. See the Chernoff bound and Chebyshev's inequality.
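As a concrete sketch (the moment values below are hypothetical): applying Markov's inequality to $|X-\mu|^k$ gives $P(|X-\mu| \ge t) \le E|X-\mu|^k / t^k$ for each $k$, with $k=2$ being Chebyshev's inequality.

```python
def markov_tail_bound(central_moments, t):
    """Upper-bound P(|X - mu| >= t) by min_k E|X - mu|^k / t^k,
    given absolute central moments E|X - mu|^k for k = 1, 2, ...
    k = 2 recovers Chebyshev's inequality: sigma^2 / t^2."""
    return min(m / t**k for k, m in enumerate(central_moments, start=1))

# Hypothetical absolute central moments E|X-mu|^k for k = 1..4:
print(markov_tail_bound([0.8, 1.0, 1.6, 3.0], t=3.0))  # 3.0 / 81 ~ 0.037
```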

You may also find the Pearson distribution family, which is determined by the first four moments, useful.


Let the first four moments be $E(X^j) = m_j$, $j=1,\ldots,4$. Suppose $g(x)$ is a polynomial of degree $d$ such that $g(x) \ge I_{x \le a}$, i.e. $g(x) \ge 1$ for $x \le a$ and $g(x) \ge 0$ for all real $x$. Then for any random variable $X$ such that $E[X^d]$ exists,
$$P(X \le a) = E[I_{X \le a}] \le E[g(X)],$$
and $E[g(X)]$ can be calculated from the first $d$ moments of $X$: if $g(x) = \sum_{j=0}^d c_j x^j$, then $E[g(X)] = c_0 + \sum_{j=1}^d c_j E[X^j]$.

Moreover, suppose $g(x) = I_{x\le a}$ at certain points. Then this upper bound is optimal in the sense that it gives the exact value of $P(X \le a)$ for any probability distribution concentrated on those points.

In the case $d=4$: for any $b_1$ and $b_2$ with $b_1 < a < b_2$, there is a unique polynomial $g(x)$ of degree 4 with $g(b_1) = g(a) = 1$ and $g'(b_1) = g(b_2) = g'(b_2) = 0$; this satisfies $g(x) \ge I_{x \le a}$, and the corresponding estimate is tight for distributions concentrated on $\{b_1, a, b_2\}$. For example, with $b_1 = -1$, $a=0$ and $b_2 = 1$, $g(x) = \frac{x^4}{2} + \frac{x^3}{4} - x^2 - \frac{3x}{4} + 1$, leading to the estimate
$$P(X \le 0) \le \frac{1}{2} E[X^4] + \frac{1}{4} E[X^3] - E[X^2] - \frac{3}{4} E[X] + 1.$$
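Here is a short sketch, assuming NumPy, that constructs this degree-4 polynomial for general $b_1 < a < b_2$ by solving the five interpolation constraints as a linear system, then evaluates the resulting bound from given moments:

```python
import numpy as np

def tail_poly_coeffs(b1, a, b2):
    """Solve for the degree-4 polynomial g with
    g(b1) = g(a) = 1 and g'(b1) = g(b2) = g'(b2) = 0."""
    def row_val(x):   # g(x)  = c0 + c1 x + ... + c4 x^4
        return [x**j for j in range(5)]
    def row_der(x):   # g'(x) = c1 + 2 c2 x + 3 c3 x^2 + 4 c4 x^3
        return [j * x**(j - 1) if j > 0 else 0.0 for j in range(5)]
    A = np.array([row_val(b1), row_val(a), row_der(b1),
                  row_val(b2), row_der(b2)], dtype=float)
    rhs = np.array([1.0, 1.0, 0.0, 0.0, 0.0])
    return np.linalg.solve(A, rhs)        # coefficients c0..c4

def moment_bound(coeffs, moments):
    """Bound P(X <= a) by E[g(X)] = c0 + sum_j c_j m_j,
    where moments = [m1, m2, m3, m4]."""
    return coeffs[0] + sum(c * m for c, m in zip(coeffs[1:], moments))

# Reproduce the example b1 = -1, a = 0, b2 = 1:
print(tail_poly_coeffs(-1.0, 0.0, 1.0))   # [1, -0.75, -1, 0.25, 0.5]
# Standard normal moments m1..m4 = 0, 1, 0, 3:
print(moment_bound(tail_poly_coeffs(-1.0, 0.0, 1.0), [0.0, 1.0, 0.0, 3.0]))
```

For a fixed pair $(b_1, b_2)$ the bound can be trivial (as it is for the standard normal above, where it exceeds 1); in practice one would minimize the bound over the choice of $b_1$ and $b_2$.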


This will be an incomplete answer based on things I thought about several years ago, and I can't remember all the details. The first $n$ moments determine the first $n$ cumulants and vice versa. Given the cumulants up to the $(2n-1)$th one, there is a constraint on the set of possible values of the $2n$th cumulant, saying that it is $\ge$ a particular number. If it's less than that number, then there is no probability distribution with that sequence of the first $2n$ cumulants; otherwise there is one. Right at the boundary, it's a discrete distribution that can take only finitely many possible values, and I think every distribution with finite support is realized in that way.

Peter McCullagh's book Tensor Methods in Statistics has some material on this. If I wanted to work out from scratch the answer to the original question above, that's where I'd start thinking about it. What you'd probably want as an answer is inequalities that $\Pr(X\le a)$ would have to satisfy.
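For reference, the moment-to-cumulant conversion mentioned above can be done with the standard recursion $m_n = \sum_{j=1}^{n} \binom{n-1}{j-1}\kappa_j m_{n-j}$ (with $m_0 = 1$); a minimal Python sketch:

```python
from math import comb

def moments_to_cumulants(m):
    """Raw moments [m1, ..., mn] -> cumulants [k1, ..., kn] via the
    recursion m_n = sum_{j=1}^{n} C(n-1, j-1) k_j m_{n-j}."""
    m = [1.0] + list(m)           # prepend m0 = 1
    k = [0.0] * len(m)            # k[0] unused
    for n in range(1, len(m)):
        k[n] = m[n] - sum(comb(n - 1, j - 1) * k[j] * m[n - j]
                          for j in range(1, n))
    return k[1:]

# Standard normal moments (0, 1, 0, 3) -> cumulants (0, 1, 0, 0):
print(moments_to_cumulants([0.0, 1.0, 0.0, 3.0]))
```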

If the distribution is supported on a (not necessarily proper) subset of $[0,1]$, then (if I recall correctly) the way the values of the cumulative distribution function depend on the sequence of all the moments is worked out explicitly somewhere in Feller's famous book. Maybe I'll find it...


As I pointed out in my comments, it's hard to answer this question in full generality, so I'll just point you to a resource online.

But, that said, the magic words are generating functions: probability generating functions and moment generating functions.

The probability generating function $\Phi_X$ exists only for non-negative integer-valued random variables. The moment generating function $M_X$ is related to it [whenever and wherever both exist] by the following: $$M_X(t)=\Phi_X(e^t)$$
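A quick symbolic check of this identity for a Poisson$(\lambda)$ variable (a sketch assuming SymPy), where $\Phi_X(s) = e^{\lambda(s-1)}$ and $M_X(t) = e^{\lambda(e^t-1)}$:

```python
import sympy as sp

t, s, lam = sp.symbols('t s lam', positive=True)
k = sp.symbols('k', integer=True, nonnegative=True)

pmf = sp.exp(-lam) * lam**k / sp.factorial(k)   # Poisson(lam) pmf

# PGF: Phi(s) = E[s^X];  MGF: M(t) = E[e^{tX}]
Phi = sp.summation(s**k * pmf, (k, 0, sp.oo)).simplify()       # exp(lam*(s-1))
M = sp.summation(sp.exp(t*k) * pmf, (k, 0, sp.oo)).simplify()  # exp(lam*(e^t-1))

print(sp.simplify(M - Phi.subs(s, sp.exp(t))))  # 0, i.e. M(t) = Phi(e^t)
```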

Sometimes additional inputs are required, sometimes not. So please go through the material I have pointed you to.

EDITED TO ADD: I'll get a little specific now. If the random variable at hand has finite range, and you have all the moments, then the distribution of $X$ can be recovered (Theorem 10.2, pp. 5, 369 in the typeset). If you have only the first two moments, you'll recover only the mean and variance.
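To make the finite-range remark concrete: if $X$ is known to sit on finitely many known points, the moments pin down the probabilities through a Vandermonde system. A small sketch (the support points and moment values here are hypothetical):

```python
import numpy as np

# X supported on known points x_1..x_n: the probabilities solve a
# Vandermonde system built from m_0 = 1 and the first n-1 moments.
x = np.array([-1.0, 0.0, 1.0])
m = np.array([1.0, 0.1, 0.6])          # m0, m1, m2 (hypothetical values)
V = np.vander(x, increasing=True).T    # rows: x^0, x^1, x^2
p = np.linalg.solve(V, m)
print(p)   # probabilities at -1, 0, 1: [0.25, 0.4, 0.35]
```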

I'd love to hear from you in case you have specific queries. [Just add a comment below; I'll be notified!]