Solution 1:

Without assuming additional properties on $S$ and $T$, such as commutation or self-adjointness (on a Hilbert space) — see points 2. and 3. at the end of the answer — continuity of the spectrum as a map from $\mathcal{L}(X)$ to the compact subsets of $\mathbb{C}$ with the Hausdorff distance $d_H$ doesn't hold, because the spectrum can “collapse” under arbitrarily small perturbations, as illustrated by the following simple example:

Let $X = \ell^p(\mathbb{Z})$ with standard basis $(e_n)_{n \in \mathbb{Z}}$ and $1 \leq p \lt \infty$. Define the operator $S: X \to X$ by $$ S(e_n) = e_{n-1}\text{ if }n\neq 0 \quad\text{and}\quad S(e_0) = 0. $$ It is not difficult to check that $\sigma(S) = \overline{\mathbb{D}} = \{\lambda \in \mathbb{C}\,:\,\lvert \lambda\rvert \leq 1\}$: Indeed, $\lVert S \rVert = 1$, so we certainly have the inclusion $\sigma(S) \subset \overline{\mathbb{D}}$, and if $\lvert\lambda\rvert \lt 1$, the vector $v = \sum_{n=0}^\infty \lambda^n e_n$ is an eigenvector of $S$ with eigenvalue $\lambda$, so $\lambda \in \sigma(S)$. Since $\sigma(S)$ is closed, it follows that $\sigma(S) \supset \overline{\mathbb{D}}$.
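To spell out the eigenvector claim, here is a short check that uses only the definition of $S$. For $\lvert\lambda\rvert \lt 1$ the coefficients $\lambda^n$ decay geometrically, so $v \in X$, and since $S(e_0) = 0$ the $n = 0$ term drops out:

$$ Sv = \sum_{n=0}^\infty \lambda^n S(e_n) = \sum_{n=1}^\infty \lambda^n e_{n-1} = \lambda \sum_{m=0}^\infty \lambda^m e_m = \lambda v. $$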

Let now $C$ be the rank one operator defined by $C(e_0) = e_{-1}$ and $C(e_n) = 0$ for $n \neq 0$. Putting $T_\varepsilon = S + \varepsilon C$, we obtain an invertible operator for every $\varepsilon \neq 0$, and $\lVert T_{\varepsilon} - S\rVert = \lvert\varepsilon\rvert \lVert C \rVert = \lvert\varepsilon\rvert$.
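One way to see the invertibility is to write down the inverse explicitly (a sketch, verified by composing on basis vectors). Since $T_\varepsilon(e_n) = e_{n-1}$ for $n \neq 0$ and $T_\varepsilon(e_0) = \varepsilon e_{-1}$, the operator defined by

$$ T_\varepsilon^{-1}(e_n) = e_{n+1}\text{ if }n\neq -1 \quad\text{and}\quad T_\varepsilon^{-1}(e_{-1}) = \varepsilon^{-1} e_0 $$

is bounded, with norm $\max(1, \lvert\varepsilon\rvert^{-1})$, and composing it with $T_\varepsilon$ in either order returns every $e_n$ to itself.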

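For the spectral radius of $T_\varepsilon$ we have $r(T_{\varepsilon}) =1$, so $\sigma(T_{\varepsilon}) \subset \overline{\mathbb{D}}$. The inverse of $T_{\varepsilon}$ also has spectral radius $r(T_{\varepsilon}^{-1}) = 1$, and since $\sigma(T_{\varepsilon}^{-1}) = \{\lambda^{-1}\,:\,\lambda \in \sigma(T_{\varepsilon})\}$, every $\lambda \in \sigma(T_{\varepsilon})$ also satisfies $\lvert\lambda\rvert \geq 1$. Hence $\sigma(T_{\varepsilon}) \subset \partial\mathbb{D}$.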
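The two spectral radii can be computed from Gelfand's formula $r(A) = \lim_{n\to\infty} \lVert A^n\rVert^{1/n}$; the following is a sketch of that computation. For $n \geq 1$, the power $T_\varepsilon^n$ sends $e_k$ to $e_{k-n}$ multiplied by $\varepsilon$ if $0 \leq k \leq n-1$ and by $1$ otherwise, so $\lVert T_\varepsilon^n \rVert = \max(1,\lvert\varepsilon\rvert)$. Likewise, $T_\varepsilon^{-n}$ sends $e_k$ to $e_{k+n}$ multiplied by $\varepsilon^{-1}$ if $-n \leq k \leq -1$ and by $1$ otherwise, so $\lVert T_\varepsilon^{-n} \rVert = \max(1,\lvert\varepsilon\rvert^{-1})$. Therefore

$$ r(T_\varepsilon) = \lim_{n\to\infty} \max(1,\lvert\varepsilon\rvert)^{1/n} = 1 \quad\text{and}\quad r(T_\varepsilon^{-1}) = \lim_{n\to\infty} \max(1,\lvert\varepsilon\rvert^{-1})^{1/n} = 1. $$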

It follows from $\sigma(S) = \overline{\mathbb{D}}$ and $\sigma(T_\varepsilon) \subset \partial \mathbb{D}$ that the Hausdorff distance between $\sigma(S)$ and $\sigma(T_{\varepsilon})$ is at least $1$, while $\lVert S-T_{\varepsilon}\rVert = \lvert\varepsilon\rvert$ is as small as we wish.
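For concreteness, recall the usual definition of the Hausdorff distance between nonempty compact sets $A, B \subset \mathbb{C}$, where $d(a,B) = \inf_{b \in B}\lvert a - b\rvert$:

$$ d_H(A,B) = \max\Bigl\{\sup_{a \in A} d(a,B),\ \sup_{b \in B} d(b,A)\Bigr\}. $$

The lower bound then comes from testing the single point $0 \in \sigma(S)$ against the nonempty set $\sigma(T_\varepsilon) \subset \partial\mathbb{D}$:

$$ d_H(\sigma(S),\sigma(T_\varepsilon)) \geq d(0,\sigma(T_\varepsilon)) = \inf\{\lvert\lambda\rvert\,:\,\lambda \in \sigma(T_\varepsilon)\} = 1. $$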


Added: On the other hand, in some sense this is the worst that can happen: T. Kato, Perturbation Theory for Linear Operators, Springer Classics in Mathematics, 1995, proves the following in Chapter IV, §3.1 (pp. 208f.):

  1. The spectrum is upper semicontinuous as a function from $\mathcal{L}(X)$ to the compact subsets of $\mathbb{C}$ with respect to the Hausdorff distance (Remark 3.3).

  2. If $S$ and $C$ commute then there is the estimate $d_{H}(\sigma(S),\sigma(S+C)) \leq r(C)$ on the Hausdorff distance of the spectra (Theorem 3.6).

  3. Much later: To address a question posed in a comment: if $S$ and $T$ are self-adjoint operators on a Hilbert space then $d_H(\sigma(S),\sigma(T)) \leq r(S-T) = \lVert S - T \rVert$ holds as well, mainly because the resolvent satisfies $\lVert R_T(\lambda)\rVert = 1/d(\lambda,\sigma(T))$, which allows us to apply a variant of the Neumann series, see Kato, Theorem V.4.10, page 291 and section II.1.3, in particular formula (1.13) on page 67. Far more precise results can be found at the beginning of chapter VIII, especially §1.2.
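The Neumann series argument in point 3 can be sketched as follows, assuming the convention $R_T(\lambda) = (T - \lambda)^{-1}$ for the resolvent. Suppose $\lambda \notin \sigma(T)$ and $\lVert S - T \rVert \lt d(\lambda,\sigma(T))$. Then

$$ S - \lambda = (T - \lambda)\bigl(1 + R_T(\lambda)(S - T)\bigr) \quad\text{with}\quad \lVert R_T(\lambda)(S - T)\rVert \leq \frac{\lVert S - T\rVert}{d(\lambda,\sigma(T))} \lt 1, $$

so the second factor is invertible by the Neumann series and hence $\lambda \notin \sigma(S)$. In other words, every $\lambda \in \sigma(S)$ satisfies $d(\lambda,\sigma(T)) \leq \lVert S - T\rVert$. Exchanging the roles of $S$ and $T$ gives the same bound for points of $\sigma(T)$, and together these yield $d_H(\sigma(S),\sigma(T)) \leq \lVert S - T \rVert$.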

The example above (which I learned from Edi Zehnder) appears as Example 3.8 on page 210.