The result is false in this generality.

Fix some $z\in S$ and define $O\mu=\mu_c+\mu_d(S)\delta_z$, where $\mu_d$ and $\mu_c=\mu-\mu_d$ are the discrete and continuous parts of $\mu$, respectively. Then $O$ satisfies your conditions, but it cannot be given by a kernel as soon as $S$ carries a probability measure $\nu$ with no discrete part.

Why not?

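If $O$ is given by the kernel $K$ then $O\mu=\int \mu(dx) K(x,\cdot)$. Setting $\mu=\delta_x$ shows $K(x,\cdot)=O\delta_x$. Since $\delta_x$ is purely discrete, $O\delta_x=\delta_z$, so $K(x,\cdot)=\delta_z$ for every $x\in S$.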

If $\nu$ has no discrete part, then $O\nu=\nu$ but $\int\nu(dx) K(x,\cdot)=\delta_z\neq \nu.$
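
As a concrete illustration, take $S=[0,1]$ with its Borel sets, $z=0$, and $\nu=\lambda$ Lebesgue measure. These choices are mine and are not part of the question; any atomless $\nu$ would work the same way. Since $\lambda_d=0$ and $\lambda_c=\lambda$,

$$O\lambda=\lambda_c+\lambda_d(S)\,\delta_0=\lambda+0\cdot\delta_0=\lambda,\qquad \int\lambda(dx)\,O\delta_x=\int\lambda(dx)\,\delta_0=\delta_0.$$

Evaluating both sides on $A=\{0\}$ gives $(O\lambda)(\{0\})=0$, whereas the kernel formula would give $\delta_0(\{0\})=1$.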


In order for $O$ to be represented by a kernel you need two things:

  1. $x\mapsto O\delta_x$ should be measurable from $S$ to $M_1(S)$.
  2. $O\mu=\int\mu(dx) O\delta_x$ for every $\mu\in M_1(S)$.
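
To see why these two conditions suffice, here is a sketch, assuming $M_1(S)$ carries the $\sigma$-algebra generated by the evaluation maps $\mu\mapsto\mu(A)$ (my assumption; the question may use a different but compatible convention). Define

$$K(x,A):=(O\delta_x)(A),\qquad\text{so that}\qquad (O\mu)(A)=\int\mu(dx)\,K(x,A)\quad\text{for all measurable }A\subseteq S.$$

Condition 1 makes $x\mapsto K(x,A)$ measurable for each $A$, so $K$ is a probability kernel, and condition 2 is exactly the displayed representation.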

My counterexample above satisfies 1 (the map $x\mapsto O\delta_x=\delta_z$ is constant, hence measurable), but not 2.