Density of diagonalizable matrices
Solution 1:
This follows directly from the Schur decomposition. Let $A = U T U^*$ be nondiagonalizable, where $U$ is unitary and $T$ is upper triangular with entries $t_{ij}$. Choose $\varepsilon_i$, $i = 1,\dots,n$, such that
$$t_{ii} + \varepsilon_i \ne t_{jj} + \varepsilon_j, \quad \text{for all $i \ne j$}$$
and $|\varepsilon_i|$ sufficiently small. I leave it to you to show that this can always be done.
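Define $T' := T + \operatorname{diag}(\varepsilon_1, \dots, \varepsilon_n)$ and $A_\varepsilon := U T' U^*$. Since $T'$ is triangular, the eigenvalues of $A_\varepsilon$ are its diagonal entries $t_{ii} + \varepsilon_i$, which are all distinct, so $A_\varepsilon$ is diagonalizable. Moreover, because $U$ is unitary, $\|A - A_\varepsilon\|_2 = \max_i |\varepsilon_i|$, so $A_\varepsilon$ can be made arbitrarily close to $A$.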
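For what it's worth, here is one possible way to pick the $\varepsilon_i$, sketched in Python. The function name `distinct_diagonal_shifts` and the particular choice of shifts are my own illustration, not the only option: take $\delta$ smaller than both the tolerance and every nonzero gap $|t_{ii} - t_{jj}|$, and set $\varepsilon_i = \delta\, i / n$. Exact fractions are used so that the distinctness check is not clouded by rounding.

```python
from fractions import Fraction

def distinct_diagonal_shifts(diag, eps):
    """Return shifts e_i with |e_i| < eps so that the values diag[i] + e_i are pairwise distinct."""
    n = len(diag)
    # All nonzero gaps between two diagonal entries (empty if all entries coincide).
    gaps = [abs(a - b) for i, a in enumerate(diag) for b in diag[i + 1:] if a != b]
    # delta is below both eps and every nonzero gap, so a shift can never
    # move one entry onto another entry that started out different.
    delta = min([eps] + gaps) / 2
    # The shifts delta/n, 2*delta/n, ..., delta are distinct, which separates equal entries.
    return [delta * (i + 1) / n for i in range(n)]

# Hypothetical diagonal of a triangular T with a triple eigenvalue 2.
diag = [2, 2, 3, 2]
shifts = distinct_diagonal_shifts(diag, Fraction(1, 10))
perturbed = [t + e for t, e in zip(diag, shifts)]
```

Two entries that were equal get different shifts, and two entries that were different are moved by less than half their gap, so they cannot collide.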
Solution 2:
Hint: a sufficient condition for an $n\times n$ matrix to be diagonalizable is for its characteristic polynomial to have $n$ distinct roots. Further, the coefficients of the characteristic polynomial depend continuously on the matrix $A$ (for whatever norm you wish, since all norms are equivalent in finite dimension), and repeated roots of a polynomial can "easily" be split into distinct ones by small perturbations of the coefficients (this is phrased rather hand-wavily, but it can be made rigorous).
Hence, for every $\epsilon > 0$, you can show that there exists a matrix within distance $\epsilon$ of $A$ in the $\ell_2$ norm whose characteristic polynomial has $n$ distinct roots.
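To see the idea on the smallest nontrivial case (a worked example of my own, with $\delta$ an auxiliary parameter satisfying $0 < \delta < \epsilon$), take the Jordan block and perturb a single entry:
$$A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad A_\delta = \begin{pmatrix} 1 & 1 \\ \delta & 1 \end{pmatrix}, \qquad \|A_\delta - A\|_2 = \delta < \epsilon.$$
The characteristic polynomial of $A$ is $(\lambda - 1)^2$, with a double root, while
$$\det(\lambda I - A_\delta) = (\lambda - 1)^2 - \delta$$
has the two distinct roots $\lambda = 1 \pm \sqrt{\delta}$, so $A_\delta$ is diagonalizable.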
Solution 3:
The claim in the OP is only correct if the matrix elements lie in an algebraically closed field. For example, the set of diagonalizable matrices over $\mathbb{R}$ is not a dense subset of the space of all matrices over $\mathbb{R}$ (under the standard vector topology), as proven here.
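A quick way to see this in the $2 \times 2$ case (my own illustration, not taken from the linked proof): for a real matrix $B = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ the characteristic polynomial $\lambda^2 - (a + d)\lambda + (ad - bc)$ has discriminant
$$\Delta(B) = (a + d)^2 - 4(ad - bc) = (a - d)^2 + 4bc.$$
For the rotation matrix
$$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$
this gives $\Delta(A) = -4 < 0$. Since $\Delta$ is continuous in the entries, $\Delta(B) < 0$ for every real $B$ sufficiently close to $A$, so every such $B$ has non-real eigenvalues and is not diagonalizable over $\mathbb{R}$.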