Cartesian product of embeddings is embedding

For properness of $f\equiv f_1 \times f_2 : M_1 \times M_2 \to N_1 \times N_2$, i prefer to work directly with the original definition. That is show that for any compact subset $K \subseteq N_1 \times N_2$, the preimage $f^{-1}(K) = (f_1\times f_2)^{-1}(K)$ is compact. Let $K_1 := \pi_1(K)$ and $K_2 :=\pi_2(K)$, with $\pi_i : N_1 \times N_2 \to N_i$ are the canonical projections. By properness of $f_1$ and $f_2$ and the fact that

$$ f^{-1}(K) \subseteq f_1^{-1}(K_1) \times f_2^{-1}(K_2)$$

we conclude that $f^{-1}(K)$ compact so $f$ is a proper map. But i want to know if you can get around using sequentially compactness.

Now for the map $d(f_1 \times f_2)_{(p,q)}$, usually i encounter two ways to dealing with differential of maps with product manifolds as domain (and or codomain). The first one is more geometric and short. The second one is more strict (means that we don't do identifications) and more algebraic in some sense. I often prefer the second one.

$\textbf{First approach}$ Let $(p,q) \in M_1 \times M_2$ and $(v,w) \in T_pM_1 \oplus T_qM_2 = T_{(p,q)}(M_1 \times M_2)$. Choose smooth curves $\gamma(t)$ on $M_1$ and $\eta(t)$ on $M_2$ such that $\gamma(0)=p$ and $\gamma'(0)=v$ and $\eta(0)=q$ and $\eta'(0)=w$. By "identification", we may regard both curves are curves in $M_1 \times M_2$ passing through $(p,q)$, that is the first curve is $(\gamma(t),q) $ and the second is $(p,\eta(t))$. Then we have

\begin{align} df_{(p,q)}(v,w) &= df_{(p,q)}(\gamma'(0) + \eta'(0)) = df_{(p,q)}(\gamma'(0)) + df_{(p,q)}(\eta'(0)) \\ &= \frac{d}{dt}\Big|_{t=0} f\circ (\gamma(t),q) + \frac{d}{dt}\Big|_{t=0} f\circ (p,\eta(t)) \\ &= \Big(df_1(\gamma'(0)),0\Big) + \Big(0,df_1(\eta'(0))\Big) \\ &= \big(df_1(v), df_2(w)\big). \end{align}

$\textbf{Second approach}$ We need a bit of notations. Let $\iota_1 : M_1 \hookrightarrow M_1 \times M_2$ is the embedding $\iota_1(x)= (x,q)$ and $\iota_2 : M_2 \hookrightarrow M_1 \times M_2$ is $\iota_2(x) = (p,x)$. Also let $\pi_1 : N_1 \times N_2 \to N_1$ and $\pi_2 : N_1 \times N_2 \to N_2$ be the canonical projections. We don't do identification $T_{(p,q)}(M_1 \times M_2) = T_pM_1 \oplus T_qM_2$ instead we consider isomorphism between them $\tau : T_pM_1 \oplus T_qM_2 \to T_{(p,q)}(M_1 \times M_2)$ defined as $$ \tau(v,w) = d\iota_1(v)+d\iota_2(w). $$ Similarly, we have isomorphism $\sigma : T_{(f_1(p),f_2(q))}(N_1 \times N_2) \to T_{f_1(p)}N_1 \oplus T_{f_2(q)}N_2$ defined as $$ \sigma (u) = \big(d\pi_1(u), d\pi_2(u)\big). $$ Now what we actually want to compute is the composition map $$ \sigma \circ df_{(p,q)} \circ \tau : T_pM_1 \oplus T_qM_2 \longrightarrow T_{f_1(p)}N_1 \oplus T_{f_2(q)}N_2. $$

By direct computation, for any $v \in T_pM_1$ and $w \in T_qM_2$, \begin{align} (\sigma \circ df_{(p,q)} \circ \tau) (v,w) &= \sigma \circ df_{(p,q)} \Big( d\iota_1(v) + d\iota_2(w)\Big) \\ &=\sigma \Big( d(f \circ \iota_1)_p(v) + d(f \circ \iota_2)_q(w) \Big) \\ &= \Big( d(\pi_1 \circ f \circ \iota_1)_p(v) +d(\pi_1 \circ f \circ \iota_2)_q(w), \\ & \qquad d(\pi_2 \circ f \circ \iota_1)_p(v) +d(\pi_2 \circ f \circ \iota_2)_q(w) \Big) \\ & = \Big( df_1(v) , df_2(w) \Big). \end{align}