Deriving the formula for the multivariate Gaussian distribution

The PDF for the distribution is

$$f(x)=\frac{1}{\sqrt{(2\pi)^n|\boldsymbol\Sigma|}} \exp\left(-\frac{1}{2}({\boldsymbol x}-{\boldsymbol \mu})^T {\boldsymbol\Sigma}^{-1}({\boldsymbol x}-{\boldsymbol \mu}) \right)$$

If I were trying to derive it from scratch, I would start with the univariate Gaussian distribution $$f(x)=\frac{1}{\sqrt{2\pi}\sigma} \exp\left(-\frac{1}{2}\frac{({x}-{\mu})^2}{\sigma^2}\right),$$

and do the following:

  1. ignore the normalisation factor for now,
  2. replace $(x-\mu)^2$ with $({\boldsymbol x}-{\boldsymbol \mu})^T({\boldsymbol x}-{\boldsymbol \mu})$, and
  3. replace $\sigma^2$ with $({\boldsymbol x}-{\boldsymbol \mu})^T \boldsymbol\Sigma ({\boldsymbol x}-{\boldsymbol \mu})$, which should equal the variance along the direction of $({\boldsymbol x}-{\boldsymbol \mu})$ and is therefore exactly what I want.

This would result in the following formula

$$f(x)=\frac{1}{\sqrt{(2\pi)^n|\boldsymbol\Sigma|}} \exp\left(-\frac{1}{2}\frac{({\boldsymbol x}-{\boldsymbol \mu})^T({\boldsymbol x}-{\boldsymbol \mu})}{({\boldsymbol x}-{\boldsymbol \mu})^T \boldsymbol\Sigma ({\boldsymbol x}-{\boldsymbol \mu})}\right),$$

which doesn't seem right.
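A quick numerical sketch (with $\boldsymbol\mu$ and $\boldsymbol\Sigma$ chosen arbitrarily here) makes one symptom concrete: the proposed exponent is unchanged when $\boldsymbol x-\boldsymbol\mu$ is rescaled, so the resulting function is constant along every ray from $\boldsymbol\mu$ and cannot integrate to 1, whereas the exponent in the correct formula grows quadratically with the distance from $\boldsymbol\mu$.

```python
import numpy as np

# Illustrative values (not from any particular problem).
mu = np.array([0.0, 0.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

def proposed_exponent(x):
    """Quadratic form from the conjectured formula."""
    d = x - mu
    return (d @ d) / (d @ Sigma @ d)

def correct_exponent(x):
    """Quadratic form from the actual multivariate Gaussian PDF."""
    d = x - mu
    return d @ np.linalg.inv(Sigma) @ d

d = np.array([1.0, 2.0])
# Scaling d by 10 leaves the proposed exponent unchanged...
print(proposed_exponent(mu + d), proposed_exponent(mu + 10 * d))
# ...but multiplies the correct exponent by 100.
print(correct_exponent(mu + d), correct_exponent(mu + 10 * d))
```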

My two questions are

  • What is the problem with the approach above? Please let me know if it doesn't make sense at all.
  • Could you recommend a good derivation of the multivariate Gaussian? (I couldn't find anything on Stack Exchange or Quora.)

If putting $({\boldsymbol x}-{\boldsymbol \mu})^T \boldsymbol\Sigma ({\boldsymbol x}-{\boldsymbol \mu})$ into the exponent made sense, then the analogous univariate expression, $(x-\mu)\,\sigma^2\,(x-\mu),$ would make sense too, but it doesn't. In the univariate case you have $$ \sigma^2 = \operatorname{E}\left( (X-\mu)^2 \right). $$ In the multivariate case you have $$ \boldsymbol\Sigma = \operatorname{E}\left((\boldsymbol X-\boldsymbol \mu) (\boldsymbol X-\boldsymbol\mu)^T\right), $$ an $n\times n$ matrix, where $\boldsymbol \mu$ is an $n\times 1$ vector.
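The multivariate definition can be checked empirically: the sample average of the outer products $(\boldsymbol X-\boldsymbol\mu)(\boldsymbol X-\boldsymbol\mu)^T$ over many draws should approximate $\boldsymbol\Sigma$. A minimal sketch (with arbitrary illustrative $\boldsymbol\mu$ and $\boldsymbol\Sigma$):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# Draw many samples from N(mu, Sigma); rows of X are individual draws.
X = rng.multivariate_normal(mu, Sigma, size=200_000)

# Average of the outer products (X_i - mu)(X_i - mu)^T.
D = X - mu
Sigma_hat = D.T @ D / len(D)

print(np.round(Sigma_hat, 2))  # close to Sigma
```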

If $X\sim \operatorname{N}(\mu,\sigma^2)$ then $\dfrac{X-\mu} \sigma \sim\operatorname{N}(0,1).$ Similarly if $\boldsymbol X \sim \operatorname{N}(\boldsymbol\mu,\Sigma)$ then $\Sigma^{-1/2}(\boldsymbol X-\boldsymbol\mu) \sim \operatorname{N}(\boldsymbol0,I_n)$ where $I_n$ is the $n\times n$ identity matrix.

But what is $\Sigma^{-1/2}$? It is a consequence of the finite-dimensional version of the spectral theorem that a nonnegative-definite symmetric real matrix has a nonnegative-definite symmetric real square root, and this is it.
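Both claims can be sketched numerically: build $\Sigma^{1/2}$ from the eigendecomposition $\Sigma = Q\operatorname{diag}(\lambda)Q^T$ by taking square roots of the eigenvalues, confirm that it squares back to $\Sigma$, and check that $\Sigma^{-1/2}(\boldsymbol X-\boldsymbol\mu)$ has approximately identity covariance (the values of $\boldsymbol\mu$ and $\Sigma$ below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# Spectral decomposition of the symmetric positive-definite Sigma.
lam, Q = np.linalg.eigh(Sigma)
Sigma_half = Q @ np.diag(np.sqrt(lam)) @ Q.T       # symmetric square root
Sigma_inv_half = Q @ np.diag(lam ** -0.5) @ Q.T    # its inverse

# It really is a square root of Sigma.
assert np.allclose(Sigma_half @ Sigma_half, Sigma)

# Whitening: Sigma^{-1/2}(X - mu) should have identity covariance.
X = rng.multivariate_normal(mu, Sigma, size=200_000)
Z = (X - mu) @ Sigma_inv_half.T   # apply Sigma^{-1/2} to each row
print(np.round(Z.T @ Z / len(Z), 2))  # approximately the 2x2 identity
```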

For $\boldsymbol X\sim\operatorname{N}(\boldsymbol 0, I_n),$ the density is $$ \frac 1 {\sqrt{2\pi}^n} \exp \left( \frac{-1} 2 \boldsymbol x^T \boldsymbol x \right). $$ For $\boldsymbol Y = A\boldsymbol X+\boldsymbol b,$ where $A$ is a $k\times n$ matrix of full rank $k\le n$ (so that $AA^T$ is invertible) and $\boldsymbol b$ is a $k\times 1$ vector, the density is $$ \frac 1 {\sqrt{2\pi}^k} \cdot \frac 1 {\sqrt{\det (AA^T)}} \exp\left( \frac{-1} 2 (\boldsymbol y - \boldsymbol b)^T (A A^T)^{-1} (\boldsymbol y - \boldsymbol b) \right). $$ Notice that $$ \operatorname{var}(\boldsymbol Y) = A\Big( \operatorname{var}(\boldsymbol X) \Big) A^T = A I_n A^T = A A^T. $$ So the multiplication by $(AA^T)^{-1}$ corresponds to the division by $\sigma^2.$
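The identity $\operatorname{var}(\boldsymbol Y) = AA^T$ is easy to verify by simulation; in this sketch $A$ and $\boldsymbol b$ are arbitrary illustrative values with $k = n = 2$, so $AA^T$ is invertible and the density exists:

```python
import numpy as np

rng = np.random.default_rng(2)
A = np.array([[1.0, 0.5],
              [0.0, 2.0]])
b = np.array([3.0, -1.0])

# Rows of X are draws from N(0, I_2); apply Y = A X + b row by row.
X = rng.standard_normal((500_000, 2))
Y = X @ A.T + b

# The sample covariance of Y should approximate A A^T.
cov_Y = np.cov(Y, rowvar=False)
print(np.round(cov_Y, 2))  # close to A @ A.T
```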