Ill-posedness of inverse problems
Solution 1:
Let $K:X\to Y$ be a compact linear operator between two Hilbert spaces. If we have an equation of the form
$$Kx=y,$$
then we say that the problem is "ill posed" if there is no **continuous dependence of the solution on the data**. In this case, the data is normally some perturbed version of the RHS, i.e. $y_\delta=y+e$, where $\delta\ge\|e\|$ is the noise level; and the solution is $x^\dagger=K^\dagger y$, where $K^\dagger:\operatorname{range}K\oplus(\operatorname{range}K)^\perp\to (\ker K)^\perp$ is the Moore-Penrose pseudo-inverse. Thus, the part in bold is equivalent to saying that $K^\dagger$ is continuous. Note that $K^\dagger$ is continuous if and only if $\operatorname{range}K$ is closed, which, since $K$ is compact, fails precisely when $\dim\operatorname{range}K=\infty$.
In particular, $x^\dagger$ is equal to the least squares solution of minimum norm, i.e. it solves the normal equation $K^\ast Kx^\dagger=K^\ast y$. Then you can easily compute your solution using spectral theory, i.e.
$$ \begin{aligned} x^\dagger&=(K^\ast K)^\dagger K^\ast y \\ &=\int_0^{\|K\|^2+}\frac{1}{\lambda}\,\mathrm{d}E_\lambda K^\ast y, \end{aligned} $$ where $\{E_\lambda\}$ denotes the spectral family of $K^\ast K$. Now, you can clearly see that for small spectral values $\lambda$, one encounters problems. In particular, if $K$ is compact and $\dim \operatorname{range} K=\infty$, then the eigenvalues of $K^\ast K$ accumulate at zero!
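To see this instability numerically, here is a minimal sketch (the kernel, the grid size and the noise level are purely illustrative assumptions): a discretised smoothing operator has rapidly decaying singular values, and the naive pseudo-inverse amplifies even a tiny perturbation of the data.

```python
import numpy as np

n = 200
t = np.linspace(0, 1, n)
K = np.exp(-50 * (t[:, None] - t[None, :])**2) / n   # discretised smoothing kernel:
                                                      # singular values decay rapidly
x_true = np.sin(2 * np.pi * t)                        # "exact" solution
y = K @ x_true                                        # exact data
y_delta = y + 1e-3 * np.random.randn(n)               # perturbed data, delta = ||e||

x_naive = np.linalg.pinv(K) @ y_delta                 # unregularised (pseudo-inverse) solution
print(np.linalg.norm(x_naive - x_true))               # typically enormous: the small
                                                      # spectral values blow up the noise
```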
Therefore, one needs to use regularisation. A regularisation essentially consists of replacing $1/\lambda$ with a parametric spectral filter function $g_\alpha(\lambda)$ such that $g_\alpha(\lambda)\to 1/\lambda$ as $\alpha\to 0$. Note that this is equivalent to replacing $K^\dagger$ by a parametric family of continuous operators $R_\alpha:Y\to X$, called a "regularisation operator".
Regularisation
Truncated singular value decomposition:
If we set $g_\alpha(\lambda)=1_{[\alpha,\infty)}(\lambda)\lambda^{-1}$, then the integral above yields a regularised solution, which we denote by
$$x_\alpha=\int_\alpha^{\|K\|^2+}\frac{1}{\lambda}\,\mathrm{d}E_\lambda K^\ast y\longrightarrow x^\dagger\quad\text{as }\alpha\to 0.$$
Notice that this corresponds to "cutting off the bad eigenvalues".
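A minimal numerical sketch of this filter for a matrix discretisation $K$ (here $\lambda=\sigma^2$, the squared singular values; the helper name is mine, not standard):

```python
import numpy as np

def tsvd_solution(K, y_delta, alpha):
    """Truncated SVD: apply g_alpha(lambda) = 1/lambda only where lambda = sigma**2 >= alpha."""
    U, sigma, Vt = np.linalg.svd(K, full_matrices=False)
    keep = sigma**2 >= alpha                          # cut off the small spectral values
    coeffs = (U.T @ y_delta)[keep] / sigma[keep]      # filtered singular-value expansion
    return Vt[keep].T @ coeffs
```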
Tikhonov regularisation:
Recall the normal equation $K^\ast K x=K^\ast y$. We can regularise it by shifting the operator on the LHS, i.e. $(K^\ast K+\alpha I)x=K^\ast y$. The solution of this shifted equation is then
$$ \begin{aligned} x_\alpha&=(K^\ast K+\alpha I)^{-1}K^\ast y \\ &=\int_0^{\|K\|^2+}\frac{1}{\lambda+\alpha}\,\mathrm{d}E_\lambda K^\ast y \\ &\to x^\dagger\quad\text{as }\alpha\to 0. \end{aligned} $$ In this case $g_\alpha(\lambda)=1/(\lambda+\alpha)$.
Of course, in practice, $y$ is replaced by the noisy data $y_\delta$.
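In matrix form this is a one-liner; here is a sketch with the noisy data in place of $y$ (again, the helper name is mine):

```python
import numpy as np

def tikhonov_solution(K, y_delta, alpha):
    """Tikhonov: solve the shifted normal equation (K^T K + alpha I) x = K^T y_delta."""
    n = K.shape[1]
    return np.linalg.solve(K.T @ K + alpha * np.eye(n), K.T @ y_delta)
```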
There are also many other regularisation methods which I will not list.
Parameter Choice Rules
The question remains: how does one choose the so-called regularisation parameter $\alpha$? Recall that the data is usually perturbed. Thus one must choose $\alpha_\ast:=\alpha(\delta,y_\delta)$ such that if $\{\delta_n\}$ is a sequence of noise levels tending to zero, then $x_{\alpha_\ast,\delta_n}:= R_{\alpha_\ast}y_{\delta_n}\to K^\dagger y=:x^\dagger$ as $n\to\infty$.
A-priori parameter choice rules:
If one has knowledge of the noise level $\delta$, then one can simply choose $\alpha_\ast\sim\delta$. In particular, if $x^\dagger\in\operatorname{range}(K^{\ast}K)^\mu$, for $\mu>0$, which is known in the literature as a source condition (i.e. a certain condition on the smoothness of your solution), then one can even choose
$$\alpha_\ast\sim\delta^{\frac{2}{2\mu+1}},$$ which then yields the optimal convergence rate:
$$\|x_{\alpha_\ast,\delta}-x^\dagger\|\le C\delta^{\frac{2\mu}{2\mu+1}},$$
for Tikhonov regularisation (for $\mu\le 1$; beyond that, Tikhonov regularisation saturates).
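As a sketch, assuming the smoothness index $\mu$ is known and reusing `K`, `y_delta` and the `tikhonov_solution` helper from the sketches above (the constant $c$ and the values of $\mu$, $\delta$ are illustrative assumptions):

```python
# A-priori rule under the source condition; mu, c and delta are illustrative values.
mu, c, delta = 0.5, 1.0, 1e-3
alpha_star = c * delta**(2 / (2 * mu + 1))        # alpha_* ~ delta^{2/(2mu+1)}
x_reg = tikhonov_solution(K, y_delta, alpha_star)
```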
A-posteriori parameter choice rules:
In practice, one does not normally know $\mu$. An a-posteriori parameter choice rule selects the parameter depending on both the noise level and the noisy data. One method, which was suggested by Morozov, is the so-called discrepancy principle. That is, choose
$$\alpha_\ast=\sup\{\alpha:\|Kx_{\alpha,\delta}-y_\delta\|\le\tau\delta\},$$
for a fixed constant $\tau>1$. If the aforementioned source condition is satisfied, then this also yields optimal convergence rates for Tikhonov regularisation.
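A sketch of this rule for Tikhonov regularisation, scanning a decreasing grid of candidate parameters (the grid and $\tau=1.1$ are implementation choices of mine, not part of the rule; `tikhonov_solution` is the helper from above):

```python
import numpy as np

def discrepancy_principle(K, y_delta, delta, tau=1.1):
    """Morozov: take the largest alpha with ||K x_alpha - y_delta|| <= tau * delta."""
    for alpha in np.logspace(2, -12, 80):               # decreasing grid, so the first
        x_alpha = tikhonov_solution(K, y_delta, alpha)  # feasible alpha is the largest one
        if np.linalg.norm(K @ x_alpha - y_delta) <= tau * delta:
            return alpha, x_alpha
    return alpha, x_alpha                                # fall back to the smallest alpha
```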
Heuristic parameter choice rules:
Suppose that one has no knowledge of the noise level (which is often the case in practical problems); then one may opt to use a heuristic parameter choice rule (i.e. select a parameter which depends only on the noisy data). The drawback, due to Bakushinskii's veto, is that a heuristic regularisation method cannot converge in the worst case (i.e. if one takes the supremum of the error over all admissible noise). However, it has been proven that in many situations (i.e. when the noise satisfies certain conditions, e.g. it is sufficiently irregular), heuristic regularisation methods are indeed convergent.
A simple example is the so-called heuristic discrepancy principle:
$$\alpha_\ast=\operatorname{argmin}_{\alpha}\frac{\|Kx_{\alpha,\delta}-y_\delta\|}{\sqrt{\alpha}}$$
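A sketch, again for Tikhonov regularisation over a finite grid of parameters (grid and `tikhonov_solution` helper as in the previous sketches; note that no noise level $\delta$ is used):

```python
import numpy as np

def heuristic_discrepancy(K, y_delta):
    """Heuristic rule: minimise ||K x_alpha - y_delta|| / sqrt(alpha) over a grid of alphas."""
    alphas = np.logspace(2, -12, 80)
    values = [np.linalg.norm(K @ tikhonov_solution(K, y_delta, a) - y_delta) / np.sqrt(a)
              for a in alphas]
    best = int(np.argmin(values))
    return alphas[best], tikhonov_solution(K, y_delta, alphas[best])
```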