Motivation for the proof of the Berry-Esseen theorem
Solution 1:
First of all, it is worth mentioning that we really want to work with characteristic functions (Fourier transforms), since adding independent random variables then corresponds to plain multiplication of their characteristic functions (as opposed to convolution of their distributions).
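As a small numerical illustration of this point, here is a sketch that computes the characteristic function of a normalized sum in two ways: via the product rule, and directly from the convolved distribution. The choice of symmetric $\pm 1$ summands and the function names `cf_by_product` and `cf_by_convolution` are assumptions made for the example only.

```python
import cmath
import math


def cf_by_product(t, n):
    """Characteristic function of S_n = (X_1 + ... + X_n) / sqrt(n) via the product rule.

    Assumption for this sketch: each X_k is +1 or -1 with probability 1/2,
    so its characteristic function is cos(t).
    """
    return math.cos(t / math.sqrt(n)) ** n


def cf_by_convolution(t, n):
    """The same quantity, computed from the n-fold convolution (a shifted binomial law)."""
    total = 0j
    for k in range(n + 1):
        prob = math.comb(n, k) / 2 ** n       # P(exactly k of the X's equal +1)
        value = (2 * k - n) / math.sqrt(n)    # value of S_n in that case
        total += prob * cmath.exp(1j * t * value)
    return total


n = 50
for t in (0.5, 1.0, 2.0):
    product = cf_by_product(t, n)
    convolution = cf_by_convolution(t, n)
    gaussian = math.exp(-t * t / 2)
    # Columns: t, product rule, direct sum over the convolved law, Gaussian limit.
    print(t, product, convolution.real, gaussian)
```

The product version is a one-line formula that is easy to Taylor-expand, while the convolution version needs the full distribution of the sum; the two agree up to rounding.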
Smoothing the cdf $F$ by convolution with a nice function $H$ (which is equivalent to adding an independent random variable with a smooth cdf) gives the following:
1. The resulting cdf $G = F*H$ has a bounded density. This allows us to write, for any smooth cdf $\widetilde G$ with the same mean, $$ G(x) - \widetilde G(x) = \frac{i}{2\pi} \int_{\mathbb{R}}\frac{\psi(t) - \widetilde\psi(t)}{t} e^{-itx} dt,\tag{1} $$ where $\psi$ and $\widetilde\psi$ are the characteristic functions of $G$ and $\widetilde G$.
2. Taking $\widetilde G = \Phi * H$, where $\Phi$ is the standard Gaussian cdf, we get from $(1)$ $$ \big(F - \Phi\big)*H(x) = \frac{i}{2\pi} \int_{\mathbb{R}}\frac{\varphi(t) - e^{-t^2/2}}{t}\omega(t) e^{-itx} dt,\tag{2} $$ where $\varphi$ and $\omega$ are the characteristic functions of $F$ and $H$. Since $\omega$ has compact support, the integral in $(2)$ is over a finite interval. This allows us to Taylor-expand the characteristic function $\varphi$, with error terms that admit nice bounds on this interval.
3. There are now two errors: the smoothing error (from replacing $F$ by $G$) and the normal approximation error, controlled by $(2)$. Adjusting the "window" $T$ (the width of the support of $\omega$) increases one error and decreases the other, so we can choose $T$ to balance the two.
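For reference, the trade-off between the two errors is usually packaged as a single smoothing inequality. One commonly quoted textbook form (the constants are those of that version and are not taken from the answer above) reads, for a cdf $F$ with mean zero and any $T > 0$, $$ \sup_x \big|F(x) - \Phi(x)\big| \le \frac{1}{\pi}\int_{-T}^{T} \left|\frac{\varphi(t) - e^{-t^2/2}}{t}\right| dt + \frac{24\,m}{\pi T}, \qquad m = \sup_x \Phi'(x) = \frac{1}{\sqrt{2\pi}}. $$ The first term is the normal approximation error over the window $[-T, T]$ and grows with $T$; the second is the smoothing error and shrinks like $1/T$. As an illustration, with notation not used above: if $F$ is the cdf of a normalized sum of $n$ i.i.d. centered variables with variance $\sigma^2$ and third absolute moment $\rho$, the Taylor bounds on $\varphi$ are valid for $|t| \le T$ with $T$ of order $\sigma^3\sqrt{n}/\rho$, and then both terms are of order $\rho/(\sigma^3\sqrt{n})$.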
If we took a compactly supported $H$ instead of a compactly supported $\omega$, this would still make $F$ smoother. However, the analysis of the characteristic functions (and, as I have explained, we want to analyze them rather than the cdfs) would not be simplified in the way explained in 2-3 above.
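To make the contrast concrete, here is one classical choice of smoothing kernel (an assumption for illustration; the answer above does not fix a particular $H$). Take $H$ to have the density $$ h_T(x) = \frac{1 - \cos(Tx)}{\pi T x^2}, \qquad\text{whose characteristic function is}\qquad \omega_T(t) = \max\!\left(1 - \frac{|t|}{T},\, 0\right). $$ Here $\omega_T$ vanishes outside $[-T, T]$, which is exactly what truncates the integral in $(2)$, while $h_T$ itself decays only like $1/x^2$ and is not compactly supported. A nonzero function and its Fourier transform cannot both have compact support, so one has to choose, and for this argument compact support on the characteristic-function side is the useful one.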