A little-o dilemma or the expectation of the KDE

This question arose out of this answer on Cross Validated, but there is no need to click the link since all the necessary details will be summarized here. The level of probability theory and statistics involved in this question is very basic. It is about calculus if anything.

This question assumes the following definition of little-o for a given function $f(x)$:

$$ f\in o(x) \iff \lim_{x\to x_0} \frac{f(x)}{x} = 0,$$ where $x_0$ is a real number, a complex number or $\pm \infty$.

Background

Suppose $x_1, \ldots, x_n$ are independent and identically distributed observations of a random variable $X$ with unknown distribution function $F$ and probability density function $f\in C^m$, for some fixed $m>1$. Let $k\in C^{m+1}$ be a given fixed function such that \begin{align} k&\geq 0, \\ \mathrm{supp} (k)&=[-1,1], \\ \int_{\mathbb{R}} k(u)\mathrm{d}u&=1, \\ \int_{\mathbb{R}} k(u)u^l\mathrm{d}u&=0 \ \text{for all} \ 1\leq l<m \ \text{and}\\ \int_{\mathbb{R}} k(u)u^m\mathrm{d}u&<\infty . \end{align} Define the so-called kernel density estimator (KDE) $f_n$ of $f$ by $$f_n(t)=\frac{1}{n}\sum_{i=1}^n \frac{1}{h}k\left(\frac{t-x_i}{h}\right),$$ where $h=h(n)$ is the bandwidth. What is the expectation of $f_n$, i.e. $\mathbb{E}[f_n(t)]$?

By linearity of the expectation, identical distribution of $x_1,...,x_n$, the law of the unconscious statistician and the change of variables $u=(t-x)/h$, \begin{align} \mathbb{E}[f_n(t)]&=\frac{1}{n}\sum_{i=1}^n \mathbb{E}\left[\frac{1}{h}k\left(\frac{t-x_i}{h}\right)\right]\\ &=\mathbb{E}\left[\frac{1}{h}k\left(\frac{t-x}{h}\right)\right]\\ &=\int_{\mathbb{R}}\frac{1}{h}k\left(\frac{t-x}{h}\right)f(x)\mathrm{d}x\\ &=\int_{\mathbb{R}}\frac{1}{h}k(u)f(t-hu)h\mathrm{d}u\\ &=\int_{\mathbb{R}}k(u)f(t-hu)\mathrm{d}u. \tag{1} \end{align} From $f\in C^m$, it follows that $$f(t-hu)=\sum_{l=0}^m \frac{f^{(l)}(t)}{l!} (-hu)^l+o((hu)^m).$$ Then from $(1)$ and linearity of integration, \begin{align} \mathbb{E}[f_n(t)]&=\int_{\mathbb{R}}k(u)\left(\sum_{l=0}^m \frac{f^{(l)}(t)}{l!} (-hu)^l+o((hu)^m)\right)\mathrm{d}u \\ &=\sum_{l=0}^m\int_{\mathbb{R}}k(u)\frac{f^{(l)}(t)(-hu)^l}{l!}\mathrm{d}u+\int_{\mathbb{R}}k(u)o((hu)^m)\mathrm{d}u. \tag{2} \end{align} From the given conditions on $k$, the $l=0$ term reads $$\int_{\mathbb{R}} k(u)f(t)\mathrm{d}u=f(t)\int_{\mathbb{R}} k(u) \mathrm{d}u=f(t).$$ The $1\leq l<m$ terms are $$\int_{\mathbb{R}} k(u)\frac{f^{(l)}(t)}{l!} (-hu)^l\mathrm{d}u=\frac{f^{(l)}(t)(-h)^l}{l!}\int_{\mathbb{R}} k(u)u^l\mathrm{d}u=0.$$ Finally, the $l=m$ term is $$ \frac{f^{(m)}(t)(-h)^m}{m!}\int_{\mathbb{R}} k(u)u^m\mathrm{d}u<\infty.$$ According to the above linked answer, it holds that $o((hu)^m) = u^m o(h^m)$ and thus the remainder term in $(2)$ is \begin{equation} \int_\mathbb{R} k(u) o((hu)^m)\mathrm{d}u = o(h^m)\int_\mathbb{R} k(u) u^m\mathrm{d}u = o(h^m). \end{equation}
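The expansion above can be checked numerically. The following is a minimal sketch under assumptions that are not part of the question: $m=2$, a standard normal density standing in for the unknown $f$, and the kernel $k(u)=\tfrac{315}{256}(1-u^2)^4$ on $[-1,1]$, chosen only because it satisfies the stated conditions for $m=2$. All function names are hypothetical. It evaluates $(1)$ by Simpson's rule and divides the difference between $(1)$ and the $l=0$ and $l=m$ terms by $h^m$; if the remainder really is $o(h^m)$, that ratio should shrink as $h\to 0$.

```python
import math


def kernel(u):
    """Illustrative kernel c*(1-u^2)^4 supported on [-1, 1] (an assumption)."""
    return 315.0 / 256.0 * (1.0 - u * u) ** 4 if abs(u) <= 1.0 else 0.0


def density(x):
    """Standard normal density, standing in for the unknown f (an assumption)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def density_second_derivative(x):
    """Second derivative of the standard normal density."""
    return (x * x - 1.0) * density(x)


def integrate(func, a=-1.0, b=1.0, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    width = (b - a) / steps
    total = func(a) + func(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * width)
    return total * width / 3.0


def expected_kde(t, h):
    """Equation (1): integral of k(u) f(t - h u) over the support of k."""
    return integrate(lambda u: kernel(u) * density(t - h * u))


def remainder_ratio(t, h):
    """(E[f_n(t)] - f(t) - f''(t) h^2 mu_2 / 2) / h^2 for the case m = 2."""
    mu2 = integrate(lambda u: kernel(u) * u * u)
    leading = density(t) + 0.5 * density_second_derivative(t) * h * h * mu2
    return (expected_kde(t, h) - leading) / (h * h)


if __name__ == "__main__":
    for h in (0.4, 0.2, 0.1, 0.05):
        print(h, remainder_ratio(0.5, h))
```

This only illustrates the claim for one smooth density and one kernel; it says nothing about why the little-o may be pulled out of the integral in general, which is the point of the question below.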

Question

Why does $o((hu)^m) = u^m o(h^m)$ hold? According to the Taylor expansion and the given definition of little-o, $o((hu)^m)$ means all functions $f$ that satisfy $\lim_{hu\to 0} \frac{f(hu)}{(hu)^m} = 0$. One can pull out factors from the little-o, i.e. $o((hu)^m)=hu\,o((hu)^{m-1})$, but $o((hu)^m) = u^m o(h^m)$ suggests that the variable with respect to which the limit in the definition of little-o is taken has changed.


If we are looking at asymptotics near $0$, then $$ o\!\left(x^m\right)=\left\{f:\lim_{x\to0}\frac{f(x)}{x^m}=0\right\}\tag1 $$ As with many things, the nomenclature is not perfect. For example, the scope of the $x$ inside the braces is contained there; it is not the same $x$ that appears outside the braces. Thus, we don't interpret $$ o\!\left(x^m\right)=\left\{f:\lim_{x^m\to0}\frac{f\!\left(x^m\right)}{x^m}=0\right\}\tag2 $$ This is because the independent variable is $x$, not $x^m$.

When we see $o\!\left((hu)^m\right)$, we need to determine the independent variable(s) from context. In the question, the integration is with respect to $u$, so the independent variable might appear to be $u$. However, the article is looking at letting $h\to0$ in the estimate, while $u$ is integrated over $\mathbb{R}$, so $$ o\!\left((hu)^m\right)=\left\{f:\lim_{h\to0}\frac{f(u,h)}{(hu)^m}=0\right\}\tag3 $$ We might also write it as $u^m\,o\!\left(h^m\right)$, but it could also be written as $(hu)^m\,o(1)$ if we understand that $h$ is the independent variable and that the limit is taken as $h\to0$.


Clarification

As seen in $(3)$ above, by writing $o\!\left((hu)^m\right)$ as a class, we are confounding two uses of little-o: one as a class of functions and another as a function. This is a prelude to confusion.

The class $o\!\left(x^m\right)$ defined in $(1)$ is a standard class of functions used in asymptotic analysis.

However, as used in the question, $o\!\left((hu)^m\right)$ means $f(hu)$ where $f(x)\in o\!\left(x^m\right)$. Then, we can see more easily why it is claimed that $o\!\left((hu)^m\right)=u^mo\!\left(h^m\right)$: $$ \underbrace{f(hu)}_{o\left((hu)^m\right)}=u^m\underbrace{g(h)}_{o\left(h^m\right)}\tag4 $$ where $g(x)\in o\!\left(x^m\right)$.

That is, $o\!\left((hu)^m\right)=u^mo\!\left(h^m\right)$ is a statement about functions, not about classes.
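As a concrete illustration (a sketch only, with a remainder chosen purely for the example and not taken from the question), suppose the function in $(4)$ were $f(x)=x^{m+1}$, which lies in $o\!\left(x^m\right)$ as $x\to0$. Then, for fixed $u\neq0$, $$ f(hu)=(hu)^{m+1}=u^m\cdot\underbrace{u\,h^{m+1}}_{g(h)},\qquad\lim_{h\to0}\frac{g(h)}{h^m}=\lim_{h\to0}uh=0 $$ so $g(h)\in o\!\left(h^m\right)$ as claimed. Note that $g$ carries $u$ as a parameter. In this example $|g(h)|\le|h|^{m+1}$ for all $|u|\le1$, that is, on the support of $k$, so the bound is uniform in $u$ there.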