Why is the gradient of an implicit surface the normal vector (i.e. parallel to the normal line)?

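Fix a point $p$ on the surface $\{g=0\}$ and take a differentiable curve $\gamma=\gamma(t)$ lying on the surface and passing through $p$, say $\gamma(0)=p$. That implies $$ g(\gamma(t))=0 $$ for all $t$. Differentiating this relation with respect to $t$ and applying the chain rule gives $$ \nabla g(\gamma(t))\cdot\dot{\gamma}(t)=0, $$ and setting $t=0$ yields $\nabla g(p)\cdot\dot{\gamma}(0)=0$.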

This argument shows that $\nabla g(p)$ is orthogonal to the velocity vector $\dot{\gamma}(0)$ of every curve on the surface that passes through $p$. By definition, the tangent plane to the surface at $p$ consists of exactly these velocity vectors. Therefore $\nabla g(p)$ is orthogonal to the tangent plane, that is, it points along the normal line (provided $\nabla g(p)\neq 0$).
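
If a numerical sanity check helps, here is a minimal sketch of the argument on one concrete case. Everything in it is an assumption chosen for illustration and not part of the question: the surface is the unit sphere $g(x,y,z)=x^2+y^2+z^2-1$, the point is $p=(2/3,1/3,2/3)$, and the curves are obtained by projecting the straight line $p+tv$ back onto the sphere. The velocity $\dot{\gamma}(0)$ is only approximated by a central difference, so the dot products come out close to zero rather than exactly zero.

```python
import math

def g(x, y, z):
    # Illustrative choice: the unit sphere as the level set g = 0.
    return x * x + y * y + z * z - 1.0

def grad_g(x, y, z):
    # Gradient of g, computed by hand.
    return (2.0 * x, 2.0 * y, 2.0 * z)

def curve(t, p, v):
    # A curve on the sphere with curve(0) = p: move along the line
    # p + t*v, then rescale to unit length.
    q = [pi + t * vi for pi, vi in zip(p, v)]
    norm = math.sqrt(sum(c * c for c in q))
    return tuple(c / norm for c in q)

def velocity(p, v, h=1e-5):
    # Central-difference approximation of the velocity at t = 0.
    a = curve(h, p, v)
    b = curve(-h, p, v)
    return tuple((ai - bi) / (2.0 * h) for ai, bi in zip(a, b))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

p = (2.0 / 3.0, 1.0 / 3.0, 2.0 / 3.0)  # lies on the sphere: 4/9 + 1/9 + 4/9 = 1
directions = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, -2.0, 0.5)]

# One dot product grad g(p) . gamma'(0) per curve through p.
dots = [dot(grad_g(*p), velocity(p, v)) for v in directions]

for v, d in zip(directions, dots):
    print(v, d)
```

Each printed dot product should be negligibly small, whichever direction $v$ is used to build the curve, which is what the orthogonality statement predicts.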