Solution 1:

I'm not really an algebraic geometer, so in my answer I'll stick to the simple situation where $f\colon X\to Y$ is a finite surjective morphism between smooth irreducible varieties over $k$. Both of your examples fall into this category.

If $x\in X$ is a (not necessarily closed) point and $y = f(x)$, then the multiplicity you are probably looking for is the integer I'll denote by $m_f(x)$, which is $$m_f(x):= \dim_{\kappa(y)}\mathcal{O}_{X,x}/\mathfrak{m}_y\mathcal{O}_{X,x} = \dim_{\kappa(y)}\mathcal{O}_{X,x}\otimes_{\mathcal{O}_{Y,y}}\kappa(y),$$ where here you use $f$ to make $\mathcal{O}_{X,x}$ into a $\mathcal{O}_{Y,y}$-module.

Another way of computing this integer is the following. The space $X_y$ you defined previously is a scheme, and its generic points correspond to the preimages of $y$. If $x$ is a generic point of $X_y$, then $\mathcal{O}_{X_y,x}$ is an artinian ring, and one can take its length, which I will denote $v_f(x):= \operatorname{length}\,\mathcal{O}_{X_y,x}$. One can then check that $m_f(x) = v_f(x)\times [\kappa(x):\kappa(y)]$. In both of your examples, $x$ and $y$ were closed points, so (taking $k$ algebraically closed) $\kappa(x) = \kappa(y) = k$, and the factor $[\kappa(x):\kappa(y)]$ is $1$. Thus $m_f(x) = v_f(x)$.

Let's apply these definitions to your examples.

(1) $f(a,b) = (a^2,b)$. Let $x = (p,q)$ and $y = (p^2,q) = f(x)$. Then $\mathcal{O}_{X,x} = k[u,v]_{(u-p,v-q)}$ and $\mathfrak{m}_y\mathcal{O}_{X,x} = (u^2-p^2,v-q)\subset k[u,v]_{(u-p,v-q)}$. If $p\neq 0$ (and the characteristic of $k$ is not $2$, so that $u+p$ is a unit in the local ring at $x$), then $$\mathcal{O}_{X,x}/\mathfrak{m}_y\mathcal{O}_{X,x}\cong k[u,v]_{(u-p,v-q)}/(u^2-p^2,v-q) \cong k,$$ which has dimension 1. If $p = 0$, then $$\mathcal{O}_{X,x}/\mathfrak{m}_y\mathcal{O}_{X,x}\cong k[u,v]_{(u,v-q)}/(u^2,v-q) \cong k[u]/(u^2),$$ which has dimension 2. Thus $m_f(x) = 1$ if $p\neq 0$ and $m_f(x) = 2$ if $p = 0$.

To compute this multiplicity the other way, we start from your observation that $X_y$ has two components so long as $p\neq 0$ and one component (a double point) if $p = 0$. In the first case, $\mathcal{O}_{X_y,x}\cong k[t]/(t\pm p)$ for a preimage $x$ of $y$, which has length $1$, so $v_f(x) = m_f(x) = 1$. If $p = 0$, however, then $\mathcal{O}_{X_y,x}\cong k[t]/(t^2)$, which has length $2$. Thus $v_f(x) = m_f(x) = 2$.

Your second example is worked out similarly. The fact that $x$ and $y$ are or are not divisors doesn't make any real difference.

One can show that in the specific situation I'm considering (finite surjective morphism between smooth irreducible varieties), every (not necessarily closed) point $y\in Y$ has the same number of preimages when counted with the multiplicity $m_f$, and that integer is the degree $d$ of the map $f$. A sketch of the proof of this fact is the following: finite surjective morphisms between smooth irreducible varieties over an algebraically closed field are flat, and hence $f_*\mathcal{O}_X$ is a locally free $\mathcal{O}_Y$-module of rank $d$. The fiber of $f_*\mathcal{O}_X$ at $y\in Y$ is exactly $\bigoplus_{f(x) = y} \mathcal{O}_{X,x}/\mathfrak{m}_y\mathcal{O}_{X,x}$, so that $$d = \sum_{f(x) = y} \dim_{\kappa(y)}\mathcal{O}_{X,x}/\mathfrak{m}_y\mathcal{O}_{X,x} = \sum_{f(x) = y} m_f(x).$$
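As a sanity check on example (1) and on the degree formula, here is a minimal Python sketch. It rests on assumptions of mine: it works over $\mathbb{Q}$ with exact fractions as a stand-in for $k$ (characteristic not $2$), it only treats points $(p,q)$ with rational $p$, and it uses the fact that for this particular map $m_f$ at $(p,q)$ equals the multiplicity of $u = p$ as a root of $u^2 - p^2$. The function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def divide_by_linear(coeffs, r):
    """Synthetic division of a polynomial by (u - r).

    coeffs lists the coefficients from the leading term down to the
    constant term. Returns (quotient_coeffs, remainder).
    """
    acc = []
    carry = Fraction(0)
    for c in coeffs:
        carry = carry * r + c
        acc.append(carry)
    return acc[:-1], acc[-1]


def root_multiplicity(coeffs, r):
    """Largest e such that (u - r)^e divides the given polynomial."""
    r = Fraction(r)
    coeffs = [Fraction(c) for c in coeffs]
    e = 0
    while len(coeffs) > 1:
        quotient, remainder = divide_by_linear(coeffs, r)
        if remainder != 0:
            break
        coeffs = quotient
        e += 1
    return e


def fibre_multiplicities(p):
    """Multiplicities m_f at the preimages of (p^2, q) under f(a, b) = (a^2, b).

    The preimages are (p, q) and (-p, q); they are keyed here by their
    first coordinate. For p = 0 the two preimages coincide.
    """
    p = Fraction(p)
    poly = [1, 0, -p * p]  # u^2 - p^2
    return {r: root_multiplicity(poly, r) for r in {p, -p}}
```

For instance, `fibre_multiplicities(3)` gives multiplicity 1 at each of $u = 3$ and $u = -3$, while `fibre_multiplicities(0)` gives multiplicity 2 at $u = 0$; in both cases the values sum to 2, the degree of $f$.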

Edit: Multiplicities when pulling back divisors. Let $f\colon X\to Y$ be a (surjective) morphism between smooth irreducible varieties. Let $D$ be a prime divisor on $Y$, and let $f^{-1}(D)$ have irreducible decomposition $D_1\cup\cdots\cup D_n$. Let $x\in X$ be a general point of $D_i$ and $y = f(x)$. Since $Y$ is smooth, $D$ has a local defining equation at $y$, that is, $D = \{\varphi = 0\}$ near $y$ for some irreducible function $\varphi\in \mathcal{O}_{Y,y}$. This $\varphi$ pulls back to a function $\varphi\circ f\in \mathcal{O}_{X,x}$ which will define $D_i$ near $x$, at least as a set. However, $\varphi\circ f$ will not necessarily be irreducible: it might be $\varphi\circ f = \psi^{n_i}$ (up to a unit), where $\psi$ is an irreducible local defining equation for $D_i$ at $x$. This integer $n_i$ should be the multiplicity of $D_i$ when pulling back divisors; it doesn't depend on the choice of $x$, so long as $x$ is chosen generic. I think $f^*D = \sum_i n_iD_i$ is how you would pull back $D$. Of course, you can extend $f^*$ linearly to all divisors $D$, not just prime divisors.

When $f\colon X\to Y$ is finite surjective like we had considered before, the integers $n_i$ won't be the multiplicities $m_f$, but should be the multiplicities $v_f$. That is, if $D$ is a prime divisor of $Y$ with generic point $y$, then $f^{-1}(y) = \{x_1,\ldots, x_n\}$ consists of the generic points of the components $D_1,\ldots, D_n$ of $f^{-1}(D)$, and $n_i = v_f(x_i)$. Of course, if both $X$ and $Y$ are curves, then divisors are closed points, and $m_f = v_f$ on closed points, so in the special case of curves the $n_i$ should be given by $m_f$.
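To see the difference between $m_f$ and $v_f$ concretely, here is a sketch for example (1), $f(a,b) = (a^2,b)$. The coordinates $(s,t)$ on $Y$ and the names $D$, $D'$, $x$, $x'$, $y$, $y'$ are my own notation, introduced only for this illustration, so that $f$ corresponds to $s\mapsto u^2$, $t\mapsto v$ on functions.

```latex
% D = {t = 0} in Y, with generic point y; f^{-1}(D) = {v = 0}, with generic point x.
\begin{align*}
\varphi = t,\quad \varphi\circ f = v = \psi^{1}
  &\implies n = 1,\quad f^*D = \{v = 0\},\\
\mathcal{O}_{X,x}/\mathfrak{m}_y\mathcal{O}_{X,x} = k[u,v]_{(v)}/(v) = k(u)
  &\implies v_f(x) = 1,\\
\kappa(y) = k(s) = k(u^2)\subset k(u) = \kappa(x)
  &\implies m_f(x) = 1\times 2 = 2.
\end{align*}

% D' = {s = 0} in Y, with generic point y'; f^{-1}(D') = {u = 0}, with generic point x'.
\begin{align*}
\varphi = s,\quad \varphi\circ f = u^2 = \psi^{2}
  &\implies n = 2,\quad f^*D' = 2\,\{u = 0\},\\
\mathcal{O}_{X,x'}/\mathfrak{m}_{y'}\mathcal{O}_{X,x'} = k[u,v]_{(u)}/(u^2)
  &\implies v_f(x') = 2,\\
\kappa(y') = k(t) = k(v) = \kappa(x')
  &\implies m_f(x') = 2\times 1 = 2.
\end{align*}
```

In both cases the single preimage has $m_f = 2$, the degree of $f$, but only $v_f$ matches the coefficient appearing in the pulled-back divisor.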

Solution 2:

Well, in the first case, I think the simple answer is: Congratulations! You've discovered one of the fundamental reasons to work with schemes, rather than simply with varieties! Non-reduced points and spaces play a fundamental role in many parts of algebraic geometry, e.g. deformation theory and singularity theory to name a couple, and are one of the main reasons for the substantial shift in language and technique a la Grothendieck and company. In the category of varieties, the preimage of $(0,q)$ in your first example really is just $(0,q)$, and somehow we have lost the information of multiplicity present on other nearby fibres.

In your second question, you're right that the definitions agree. Remember that the map $z\mapsto z^n, \mathbb A^1_{\mathbb C}\to\mathbb A^1_{\mathbb C},$ which you mention in your second example, is regular (i.e. algebraic). We can compute the fibres of this map exactly as you have done in your examples, to find that each $z\neq 0$ has a fibre consisting of $n$ distinct closed points, while over $z=0$ all the points collapse to a single non-reduced point (the scheme-theoretic fibre keeps track of the ramification).
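Spelled out, the coordinate ring of the scheme-theoretic fibre over a point $c$ of the target (the letter $c$ is my notation) can be sketched as follows, using the Chinese remainder theorem when the $n$-th roots of $c$ are distinct:

```latex
\[
\mathbb{C}[z]/(z^n - c)\;\cong\;
\begin{cases}
\displaystyle\prod_{\zeta^n = c}\mathbb{C}[z]/(z-\zeta)\;\cong\;\mathbb{C}^n, & c\neq 0,\\[2ex]
\mathbb{C}[z]/(z^n), & c = 0.
\end{cases}
\]
```

Either way the ring has dimension $n$ over $\mathbb{C}$; for $c = 0$ all of that dimension is concentrated at the single point $z = 0$.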

I don't think the fact that in one of the examples the fibres can be thought of as divisors really comes into it. After all, we could consider higher codimensional subspaces as abstract objects in a Chow ring, or some such. We really just want to keep track of the multiplicity, which can be done via the scheme structure.

I've always thought naively about fibres in the category of varieties, so I don't have a great answer re. computing fibres specifically in this category. I imagine that putting the reduced induced closed subscheme structure on the fibre computed in the category of schemes would work in many cases.