What is the difference between a function and a distribution?
Solution 1:
A distribution is also a function (mapping), but its inputs are themselves functions rather than numbers. That is what a distribution is in layman's terms. A precise definition is rather complicated (basically, the topology on the space of test functions is difficult to define).
To clarify: when people say that "$\delta$ is not a function", they mean there is no function $\delta:\mathbf{R}\rightarrow\mathbf{R}$ such that $$\int_{-\infty}^\infty \delta(x)f(x)dx=f(0)$$ for all $f\in\mathcal{F}$, where $\mathcal{F}$ is a certain vector space of functions. By definition, $\delta$ is the function $\delta:\mathcal{F}\rightarrow\mathbf{R}$ given by $\delta(f)=f(0)$. So this "is not a function" relies on a narrower interpretation of "function", i.e. that functions are those whose domain is (say) a set of real numbers.
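A minimal numerical sketch of this point, in Python. The helper names `regular_distribution` and `gaussian` are hypothetical, introduced only for illustration, and the integral is approximated with a plain trapezoid rule on $[-1,1]$. It treats $\delta$ as a mapping that eats a function and returns a number, and compares it with the functional $f\mapsto\int g(x)f(x)dx$ built from an ordinary narrow bump $g$:

```python
import math

def delta(f):
    """The Dirac delta as a functional: takes a test function f, returns f(0)."""
    return f(0.0)

def regular_distribution(g, a=-1.0, b=1.0, n=20000):
    """Hypothetical helper: the functional f -> integral of g(x) f(x) over [a, b],
    approximated with the trapezoid rule on n subintervals."""
    h = (b - a) / n

    def T(f):
        total = 0.5 * (g(a) * f(a) + g(b) * f(b))
        for k in range(1, n):
            x = a + k * h
            total += g(x) * f(x)
        return total * h

    return T

def gaussian(eps):
    """Unit-area Gaussian bump of width eps (an ordinary function of a real number)."""
    return lambda x: math.exp(-x * x / (2 * eps * eps)) / (eps * math.sqrt(2 * math.pi))

f = math.cos                                        # the test function
exact = delta(f)                                    # f(0)
narrow = regular_distribution(gaussian(0.01))(f)    # close to f(0)
wide = regular_distribution(gaussian(0.2))(f)       # noticeably further from f(0)
```

The narrower the bump, the closer the integral gets to $f(0)$, but each fixed bump only approximates it; `delta` itself is simply the rule `f -> f(0)` and needs no integrand at all.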
Why is the Laplace transform a function while the Fourier transform is a distribution? I mean, they are both infinite integrals. So what am I missing?
I think you mean the Laplace transform of $1$ and the Fourier transform of $1$, right? Well, the integral defining the Fourier transform of $1$ does not converge!
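To spell that out, assume the conventions $\mathcal{L}[f](s)=\int_0^\infty f(t)e^{-st}dt$ and $\mathcal{F}[f](\omega)=\int_{-\infty}^\infty f(t)e^{-i\omega t}dt$ (other normalizations exist and change the constants). The Laplace integral of $1$ converges on a half-plane: $$\mathcal{L}[1](s)=\int_0^\infty e^{-st}dt=\frac{1}{s},\qquad \operatorname{Re} s>0.$$ The truncated Fourier integral of $1$ is $$\int_{-R}^{R} e^{-i\omega t}dt=\frac{2\sin(\omega R)}{\omega}\quad(\omega\neq 0),$$ which keeps oscillating as $R\to\infty$, and equals $2R$ at $\omega=0$, so there is no pointwise limit. In the sense of distributions, however, one does get a result, namely $\mathcal{F}[1]=2\pi\,\delta(\omega)$ under this convention.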
Solution 2:
Disclaimer: In the following, “function” refers to the classical map from $\mathbb{R}$ to $\mathbb{C}$. Many book authors do the same, and so presumably did yours.
A distribution is a more general concept than a function. Some distributions correspond to functions (although they are still different objects, if you look deep enough), so many authors just use the same notation for those, like $\sin x$. But there are many more distributions which behave like no function could.
(Most strikingly, they may not have values at points of the real axis. You must take $\delta(\omega)$ as nothing but a symbol with known properties: $\delta(0)$ can't be evaluated, not only because the result would not be finite or something, but because there's no such thing as evaluating $\delta$ at what is a singular point to start with.)
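One common way to write out the correspondence between functions and distributions mentioned above, assuming $g$ is locally integrable and $f$ ranges over test functions (the symbol $T_g$ is notation introduced here for illustration): $$T_g(f)=\int_{-\infty}^\infty g(x)f(x)dx.$$ For instance, $g(x)=\sin x$ gives the distribution $T_{\sin}(f)=\int_{-\infty}^\infty \sin(x)f(x)dx$, which is what is meant when $\sin x$ is used as the name of a distribution. By contrast, $\delta(f)=f(0)$ is a perfectly good distribution that is not of the form $T_g$ for any such $g$.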
Your integral is a good example. You can indeed write an integral representation of the Fourier transform, and if you can successfully calculate the integral (plus some boring further assumptions), then the function you obtain can be thought of as the distribution that is the “real” result, in the sense of the first paragraph. But the Fourier transform is also well-defined in many cases where the integral would diverge.
It's more natural to define the Fourier transform in terms of distributions, because it allows many tricks people would be doing anyway, and because restricting oneself to normal functions would mean losing or obscuring many interesting cases from real-world use *) **). For the Laplace transform you get a strong enough theory with functions alone, so there's no need to make things harder by imposing an unnecessarily general formalism that takes its own module to explain.
*) Also because it beautifully reflects many internal properties and symmetries the transform has.
**) You can also define the Fourier transform on the subspace of functions known as $L^2$ (important for quantum mechanics, for example) and stay within that realm. It's a slightly different assumption resulting in a slightly different theory. For example, the constant $1$ is not an $L^2$ function and would not have any Fourier transform in that version.