Some very pedestrian questions about Besov spaces. Just to fix notation:

1. Let $f \in \mathcal{S}'$, the space of tempered distributions.

2. $\Psi, \{ \Phi_n \}_{n \geq 0} \subset \mathcal{S}$ such that their Fourier transforms $\hat{\Psi}, \{ \hat{\Phi}_n \}$ form a partition of unity subordinate to the cover $A_0 = (-1, 1)$, $A_n = \{2^{n-1} < |\xi| < 2^{n+1} \}$.

3. So $f = \Psi * f + \sum_{n \geq 0} \Phi_n * f$ in $\mathcal{S}'$.

4. Say (my impression) $f$ lies in the inhomogeneous Besov space $B^{\alpha}_{p,q}$ if

$$ \| \Psi * f\|_p + (\sum_{n \geq 0} (2^{n \alpha} \|\Phi_n * f\|_p)^q)^{\frac{1}{q}} < \infty. $$

Questions

What exactly do the indices signify in terms of smoothness properties of the function? How does the frequency content of $f$ in the dyadic frequency bands, as summarized by the Besov norm, reflect its regularity properties?

For example, if I want to find $f \in L^2(\mathbb{R})$ that's $\beta$-times differentiable in the Sobolev sense, I can just look for the condition

$$ \| \hat{f}(\xi) \xi^{\beta} \|_2 < \infty $$

What would be a corresponding statement for $B^{\alpha}_{p,q}$ for, say, a piecewise constant function? What if different pieces have different degrees of smoothness?


This answer has a couple of parts.

Part I: A Computation

Consider the simplest case in $\mathbb{R}$. Let $\chi$ be the indicator function of $[-1,1]$. Then $\|\Phi_n * \chi\|_{L^2} \approx 2^{- \frac{n}2}$. So to which Besov spaces $B^{\alpha}_{2,q}$ does $\chi$ belong? Clearly we need $\alpha \leq \frac12$, and when $\alpha = \frac12$ we necessarily have $q = \infty$. For $q < \infty$ the requirement is $$ \sum_{n = 0}^{\infty} 2^{n (\alpha - \frac12) q } < \infty,$$ which is a geometric series that converges precisely when $\alpha < \frac12$. So for $\alpha < \frac12$ we have $\chi \in B^{\alpha}_{2,q}$ for any $q \in [1,\infty]$, and in addition $\chi \in B^{\frac12}_{2,\infty}$.
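The decay rate $2^{-\frac{n}2}$ can be checked numerically. The following is a minimal sketch, not the computation with the actual $\Phi_n$: it assumes a sharp cutoff to the band $2^{n-1} < |\xi| < 2^{n+1}$ in place of the smooth $\hat{\Phi}_n$, and the convention $\hat{\chi}(\xi) = 2 \sin(\xi)/\xi$ with Plancherel in the form $\|g\|_2^2 = \frac{1}{2\pi} \|\hat{g}\|_2^2$. The function names are made up for this illustration.

```python
import numpy as np

def band_l2_norm(n, samples_per_unit=64):
    """L^2 norm of the part of chi = 1_[-1,1] with frequencies 2^(n-1) < |xi| < 2^(n+1).

    Assumption: a sharp frequency cutoff stands in for the smooth Phi_n.
    Convention: chi_hat(xi) = 2 sin(xi) / xi and ||g||_2^2 = (1 / 2pi) ||g_hat||_2^2.
    """
    a, b = 2.0 ** (n - 1), 2.0 ** (n + 1)
    xi = np.linspace(a, b, int((b - a) * samples_per_unit) + 1)
    integrand = (2.0 * np.sin(xi) / xi) ** 2
    # Trapezoid rule on [a, b]; the negative frequencies contribute the same amount.
    half_band = np.sum((integrand[1:] + integrand[:-1]) * np.diff(xi)) / 2.0
    return np.sqrt(2.0 * half_band / (2.0 * np.pi))

def rescaled_band_norms(n_max=12):
    """Values 2^(n/2) * ||P_n chi||_2 for n = 3, ..., n_max."""
    return [2.0 ** (n / 2) * band_l2_norm(n) for n in range(3, n_max + 1)]

if __name__ == "__main__":
    for n, value in zip(range(3, 13), rescaled_band_norms()):
        print(n, value)
```

If the band norms really decay like $2^{-\frac{n}2}$, the printed values should stay bounded above and below as $n$ grows. Replacing $\sin^2$ by its average $\frac12$ suggests that, for the sharp cutoff, they settle near $\sqrt{3/\pi}$; with a smooth $\hat{\Phi}_n$ the constant changes but the rate does not.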

Note also that $B^{\alpha}_{2,2} = H^{\alpha}$, so for your simple example the Besov space description does not tell you any more than the Sobolev regularity, except for the end-point $q = \infty$ which gives you infinitesimally more.

In fact, from the formula for the Besov norm, if the norm $\|\Phi_n*f\|_p$ has a polynomial decay rate $2^{-\beta n}$ then we see that the Sobolev description gives more or less the same regularity behaviour as the Besov one. Where the Besov scale shines is when there is a further logarithmic correction term. Imagine the norm now decays like $2^{-\beta n} g(n)$. In this situation you can win a little bit: the Sobolev estimate will not work with $\beta$ derivatives, since you lose convergence of the geometric series, but it will work with $\beta-\epsilon$ derivatives for any $\epsilon > 0$. In the Besov case you have hope of recovering the regularity up to $\beta$ derivatives, provided the sequence $g(n)$ lies in some $\ell^q$. But generally speaking: that's it. In most practical situations the additional descriptive power of Besov over Sobolev is precisely that: one logarithmic divergence.
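A worked instance may help here. The particular correction factor $(1+n)^{-\frac12}$ is chosen only for illustration, and the low-frequency term $\|\Psi * f\|_2$ is assumed finite. Suppose $\|\Phi_n * f\|_2 \approx 2^{-\beta n} (1+n)^{-\frac12}$. Then

$$ \sum_{n \geq 0} \left( 2^{n \beta} \|\Phi_n * f\|_2 \right)^q \approx \sum_{n \geq 0} (1+n)^{-\frac{q}{2}}, $$

which is finite exactly when $q > 2$. So $f \in B^{\beta}_{2,q}$ for every $q \in (2, \infty]$, but $f \notin B^{\beta}_{2,2} = H^{\beta}$. On the other hand, for any $\epsilon > 0$,

$$ \sum_{n \geq 0} \left( 2^{n (\beta - \epsilon)} \|\Phi_n * f\|_2 \right)^2 \approx \sum_{n \geq 0} 2^{-2 \epsilon n} (1+n)^{-1} < \infty, $$

so $f \in H^{\beta - \epsilon}$. The Besov scale with $q > 2$ keeps the full exponent $\beta$, while the Sobolev scale has to give up an $\epsilon$.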

Part II: Spatially inhomogeneous smoothness

After doing some light reading on this subject [1, 2, 3, 4], what I came to realize is the following:

  1. The studies are mostly within the field of estimation theory.
  2. The stark contrast between spatially inhomogeneous smoothness and spatially homogeneous smoothness is not so much between Sobolev and Besov classes as between Besov classes and Holder classes.
  3. Furthermore, the contrast does not concern membership in the relevant classes so much as whether one can find estimators that reconstruct a function from (noisy) observations.

To explain: suppose we have a function $f$ on $\mathbb{R}$. We know that if $f$ has Holder regularity $\alpha$ for $\alpha \in (0,1)$ this means that locally we have the bound $$ |f(y) - f(x)| \leq C |y - x|^\alpha $$ when $|y-x| < 1$. Note that the regularity $\alpha$ gives a uniform upper-bound to the local scaling behaviour of the function near a point.

Now suppose $f$ satisfies the following conditions:

  • $f(0) = 0 = f(1)$
  • When $|x| < \frac14$ we have $f(x) = |x|^{\alpha}$
  • When $|x - 1| < \frac14$ we have $f(x) = |x-1|^{\beta}$.
  • $\alpha < \beta$.

The problem is that now, because of the singularity at $x = 0$, the function cannot belong to any Holder space better than $C^\alpha$. But measured relative to the $C^\alpha$ semi-norm, the singularity at $x = 1$ is invisible, since $$ \limsup_{r \to 0} \sup_{y\in (1-r,1+r) \setminus \{1\}} \frac{|f(y) - f(1)|}{|y-1|^\alpha} = 0. $$ From this we see that a $C^\alpha$ function can easily have spatially inhomogeneous smoothness. (The same goes for Sobolev and Besov classes: most functions in both Sobolev and Besov spaces have really bad spatial inhomogeneity when it comes to pointwise smoothness; see [5].)
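To spell out the two local computations, using only the definition of $f$ above: near the first singularity, for $0 < |y| < \frac14$ and any exponent $\gamma$,

$$ \frac{|f(y) - f(0)|}{|y|^{\gamma}} = |y|^{\alpha - \gamma}, $$

which is identically $1$ for $\gamma = \alpha$ and blows up as $y \to 0$ for every $\gamma > \alpha$. Near the second singularity, for $0 < |y - 1| < r < \frac14$,

$$ \frac{|f(y) - f(1)|}{|y - 1|^{\alpha}} = |y - 1|^{\beta - \alpha} \leq r^{\beta - \alpha} \to 0 \quad \text{as } r \to 0, $$

because $\beta > \alpha$. So the $C^\alpha$ semi-norm is saturated at $x = 0$ and sees nothing at $x = 1$.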

(An aside: please note that the Besov space $B^s_{\infty,\infty}$ coincides with the Holder space $C^s$ for $s > 0$ not an integer. (When $s$ is a positive integer, $B^s_{\infty,\infty}$ is the slightly larger Zygmund space.) So in a very real sense the Besov spaces include among them both the $L^2$-Sobolev and the Holder spaces, and the constraint that a function lies in a Besov space should be regarded as a relaxation of the constraint that it lies in a Holder space.)

In any case, when people write in the literature that Besov spaces are adapted to spatially inhomogeneous smoothness, what they mean is that, assuming the signal lies in some Besov space (so relaxing the constraint that it lies in a Holder or Sobolev space), one can get very good estimators. One of the justifications for this is that there exist very powerful wavelet-based techniques [6, 7], and Besov (for that matter, also Triebel-Lizorkin) spaces are naturally adapted to wavelet characterisations.

That wavelet techniques are useful for spatially inhomogeneous regularity should not be surprising, as a basic idea behind wavelets is that of microlocalisation: that one decomposes a function simultaneously in physical and Fourier space (of course, the decomposition cannot be simultaneously sharp due to the uncertainty principle).
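For the two-singularity example above, this localisation can be made concrete. The following is a minimal sketch under assumptions of my own: it uses the Haar wavelet (not necessarily the wavelets used in [6, 7]) and only the local model $x^{\gamma}$ on the dyadic interval $[0, 2^{-j}]$ next to a singularity, with the coefficient evaluated from the exact antiderivative. The function names are hypothetical.

```python
import math

def haar_coeff_at_singularity(gamma, j):
    """L^2-normalised Haar coefficient of x**gamma on the interval [0, 2**-j].

    The Haar wavelet here equals +2**(j/2) on the left half of the interval
    and -2**(j/2) on the right half. The integrals are computed exactly from
    the antiderivative x**(gamma + 1) / (gamma + 1).
    """
    h = 2.0 ** (-j)

    def antiderivative(x):
        return x ** (gamma + 1.0) / (gamma + 1.0)

    left = antiderivative(h / 2.0) - antiderivative(0.0)
    right = antiderivative(h) - antiderivative(h / 2.0)
    return 2.0 ** (j / 2.0) * (left - right)

def decay_exponent(gamma, j=10):
    """log2 of the ratio of coefficient sizes at levels j and j + 1."""
    c_coarse = abs(haar_coeff_at_singularity(gamma, j))
    c_fine = abs(haar_coeff_at_singularity(gamma, j + 1))
    return math.log2(c_coarse / c_fine)

if __name__ == "__main__":
    # Hypothetical exponents standing in for alpha < beta.
    for gamma in (0.3, 0.7):
        print(gamma, decay_exponent(gamma))
```

By scaling, the coefficient is a constant multiple of $2^{-j(\gamma + \frac12)}$, so `decay_exponent(gamma)` recovers $\gamma + \frac12$. The coefficients sitting next to $x = 0$ therefore decay at the rate set by $\alpha$ and those next to $x = 1$ at the faster rate set by $\beta$, which is the kind of position-dependent information a single Holder, Sobolev or Besov exponent does not record.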

To finish, let me just say that even the Besov and Sobolev scales only tell you something about "averaged" regularity. By this I mean that they cannot pinpoint where the various degrees of smoothness occur, or even their spatial distribution. (If you have the wavelet decomposition sitting in front of you, on the other hand, you can get some additional information.) As indicated in [5], what you typically get from knowing that a function is in a Besov class is some bounds on how "big" the sets of singularities of various degrees are.