Solution 1:

If the distance from $\lambda$ to $\mathcal{W}(A)$ is $d > 0$, then $$ |((A-\lambda I)\phi,\phi)| \ge d\|\phi\|^{2},\;\;\; \phi \in \mathcal{D}(A)\\ \implies d\|\phi\| \le \|(A-\lambda I)\phi\|,\;\;\; \phi \in\mathcal{D}(A). $$ If $(A\phi,\phi)=(\phi,A^{\star}\phi)$ for $\phi$ in a core domain of $A^{\star}$, then you also get $$ d\|\phi\| \le \|(A^{\star}-\overline{\lambda}I)\phi\|,\;\;\;\phi \in \mathcal{D}(A^{\star}). $$ So, in this case $\mathcal{N}(A^{\star}-\overline{\lambda}I)=\{0\}$. Using the adjoint relation, $\mathcal{R}(A-\lambda I)^{\perp}=\mathcal{N}(A^{\star}-\overline{\lambda}I)$, it follows that $(A-\lambda I)$ has dense range and bounded inverse. So, under these assumptions, $\lambda \in \rho(A)$.
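
To spell out the first implication, here is a short worked derivation; it only assumes that $\mathcal{W}(A)$ denotes the set of values $(A\phi,\phi)$ over unit vectors $\phi\in\mathcal{D}(A)$, which is the usual definition but is not restated above. For a unit vector $\phi$ one has $((A-\lambda I)\phi,\phi)=(A\phi,\phi)-\lambda$ with $(A\phi,\phi)\in\mathcal{W}(A)$, so the modulus is at least $d$; rescaling gives the factor $\|\phi\|^{2}$, and the Cauchy-Schwarz inequality does the rest: $$ d\|\phi\|^{2} \le |((A-\lambda I)\phi,\phi)| \le \|(A-\lambda I)\phi\|\,\|\phi\|,\;\;\; \phi\in\mathcal{D}(A). $$ Dividing by $\|\phi\|$ for $\phi \ne 0$ yields the stated lower bound.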

Solution 2:

Disclaimer: This proof is mostly due to T.A.E. - greatest regards to him!


Preparation:
(This is where the real work has to be done.)

Write $\hat{x}$ for a unit vector in $\mathcal{D}(N)$. The distance of $\lambda$ to the numerical range gives us the estimate: $$d(\lambda,\mathcal{W})\leq|\langle \hat{x},(N-\lambda)\hat{x}\rangle|\leq\|(N-\lambda)\hat{x}\|\implies \|(N-\lambda)x\|\geq d(\lambda,\mathcal{W})\|x\|,\quad x\in\mathcal{D}(N).$$ Moreover, since $N$ is normal and densely defined, we have $N^*N=NN^*$, which is a genuinely nontrivial result (see Proposition 4.17 b in [1]). Also, since $N$ is normal it is closed, and since it is densely defined, $\mathcal{D}(NN^*)$ is a core for $N^*$ (see Proposition 4.11 a in [1]). For unit vectors $\hat{c}\in\mathcal{D}(N^*N)=\mathcal{D}(NN^*)$ we thus have: $$d(\lambda,\mathcal{W})\leq|\langle \hat{c},(N-\lambda)\hat{c}\rangle|=|\langle (N^*-\overline{\lambda})\hat{c},\hat{c}\rangle|\leq\|(N^*-\overline{\lambda})\hat{c}\|.$$ Now let $y\in\mathcal{D}(N^*)$. Because $\mathcal{D}(NN^*)$ is a core for $N^*$, there are $c_n\in\mathcal{D}(NN^*)$ with $c_n\to y$ and $N^*c_n\to N^*y$, and therefore: $$\|(N^*-\overline{\lambda})y\|=\lim_n\|(N^*-\overline{\lambda})c_n\|\geq\lim_n d(\lambda,\mathcal{W})\|c_n\|=d(\lambda,\mathcal{W})\|y\|.$$
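
As a sanity check, here is a concrete instance that is not taken from the argument above: assume $N$ is the diagonal operator on $\ell^2$ with $Ne_k=\mu_k e_k$ on the domain $\{x:\sum_k|\mu_k x_k|^2<\infty\}$, where the complex numbers $\mu_k$ and the standard basis vectors $e_k$ are introduced here purely for illustration. This operator is normal with $N^*e_k=\overline{\mu_k}e_k$, and each $\mu_k=\langle e_k,Ne_k\rangle$ lies in $\mathcal{W}$, so $|\mu_k-\lambda|\geq d(\lambda,\mathcal{W})$ for every $k$. Both estimates can then be read off directly: $$\|(N-\lambda)x\|^2=\sum_k|\mu_k-\lambda|^2|x_k|^2\geq d(\lambda,\mathcal{W})^2\|x\|^2,\qquad \|(N^*-\overline{\lambda})x\|^2=\sum_k|\overline{\mu_k}-\overline{\lambda}|^2|x_k|^2=\|(N-\lambda)x\|^2.$$ In this special case no core argument is needed; the general proof needs it because there is no such explicit formula for $N^*$.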

Main Work:
(The proof is now an easy consequence.)

Assume $\lambda\notin\overline{\mathcal{W}}$.

Since $\lambda$ has positive distance to the numerical range, $(N-\lambda)$ is bounded below by the first estimate given in the preparation, and in particular it is injective. Moreover, since $N$ is normal it is closed, and hence so are $(N-\lambda)$ and $(N-\lambda)^{-1}$. Since $(N-\lambda)^{-1}$ is both closed and bounded, its domain, which is the range of $(N-\lambda)$, is closed.
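
The closedness of the range can also be checked by hand with sequences; the following is a sketch of that standard argument, with $x_n$, $y_n$ and $y$ introduced only for this purpose. Suppose $y_n=(N-\lambda)x_n$ with $x_n\in\mathcal{D}(N)$ and $y_n\to y$. The lower bound gives $$\|x_n-x_m\|\leq\frac{1}{d(\lambda,\mathcal{W})}\|(N-\lambda)(x_n-x_m)\|=\frac{1}{d(\lambda,\mathcal{W})}\|y_n-y_m\|\to 0,$$ so $(x_n)$ is a Cauchy sequence and converges to some $x$. Since $(N-\lambda)$ is closed, $x\in\mathcal{D}(N)$ and $(N-\lambda)x=y$, that is, $y\in\mathcal{R}(N-\lambda)$.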

On the other hand, $(N^*-\overline{\lambda})$ is bounded below as well, by the second estimate given in the preparation. Hence its kernel is trivial, and thus we have: $$\overline{\mathcal{R}(N-\lambda)}=\mathcal{N}(N^*-\overline{\lambda})^\perp=\{0\}^\perp=Y$$ That is, the range of $(N-\lambda)$ is dense.
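
For completeness, the relation between range and kernel used here follows from the definition of the adjoint; a brief sketch, with $y$ denoting an arbitrary vector in $Y$: $$y\perp\mathcal{R}(N-\lambda)\iff\langle y,Nx\rangle=\langle\overline{\lambda}y,x\rangle\;\;\forall x\in\mathcal{D}(N)\iff y\in\mathcal{D}(N^*)\text{ and }N^*y=\overline{\lambda}y.$$ Hence $\mathcal{R}(N-\lambda)^\perp=\mathcal{N}(N^*-\overline{\lambda})$, and taking orthogonal complements once more gives the closure of the range.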

Putting everything together, $(N-\lambda)$ is injective, bounded below, and surjective, the latter because its range is both closed and dense.
So $\lambda$ lies in the resolvent set, as was to be shown.
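
As a by-product, the lower bound also gives a norm estimate for the resolvent; this is a short consequence of the argument rather than part of the original claim. For $y\in Y$ put $x=(N-\lambda)^{-1}y$; then $d(\lambda,\mathcal{W})\|x\|\leq\|(N-\lambda)x\|=\|y\|$, so $$\|(N-\lambda)^{-1}\|\leq\frac{1}{d(\lambda,\mathcal{W})}.$$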


Reference: [1]: German version of Weidmann's 'Lineare Operatoren in Hilberträumen'