Why aren’t continuous functions defined the other way around?
The problem with your intuition is that an "open set" is not "a set whose elements are nearby each other". For example, considering the real numbers with the standard topology, the set $(0, \infty)$ contains elements arbitrarily far away from each other, while $\{0\}$ contains elements extremely nearby each other.
A better intuition is: an open set $X$ is a set such that if $x \in X$, then all points that are close to $x$ are also in $X$. This shows why the "forward definition" doesn't work: just because you are taking all points close to some $x$, does not mean that you should map onto all points close to $f(x)$ -- it just means that you should hit only points close to $f(x)$. But what does hitting only points close to $f(x)$ mean? It means that if you take all points $U$ close to $f(x)$, then $f^{-1}(U)$ should include all points close to $x$.
If you try to make the ideas in the previous paragraph precise and formal, you end up with the ordinary definition of continuity.
Edit: From the comments:
I still find your intuition hard to reconcile with the idea of “a discontinuous function rips points apart”.
Let us look at the converse, and take $f$ discontinuous. Informally, this means that there are $x$, $y$, which are close together, such that $f(x)$ and $f(y)$ are not close together. (Of course, to make this formal, you need to take one of $x$ or $y$ to be a sequence or even a net, etc.)
My intuition of an open set says: let $X$ be an open set, then $x \in X$ if and only if $y \in X$. Now let's see if the "forward definition of continuous" lets us prove that $f$ is discontinuous. Let's take any open set $X$. If $x \notin X$ then also $y \notin X$, and this doesn't seem to go anywhere. So let's look at open $X$ with $x, y \in X$. Then $f(X)$ is also open, and therefore $f(x), f(y) \in f(X)$ -- but this is precisely not what we wanted to prove.
Now let's apply my intuition to the "backward definition of continuous". Because $f(x), f(y)$ are far apart, there is an open set containing $f(x)$ but not $f(y)$. Let's call it $Y$. Then we have $x \in f^{-1}(Y)$, but $y \notin f^{-1}(Y)$. Thus $f^{-1}(Y)$ is not an open set, and $f$ is discontinuous.
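As a concrete illustration of the argument above (the specific function is my own choice, not taken from the question), consider the step function on $\mathbb{R}$ given by
$$ f(x) = \begin{cases} 0 & x < 0 \\ 1 & x \ge 0. \end{cases} $$
Take $x = 0$ and $y$ slightly negative, so that $f(x) = 1$ and $f(y) = 0$. The open set $Y = (1/2, 3/2)$ contains $f(x)$ but not $f(y)$, and
$$ f^{-1}(Y) = [0, \infty), $$
which is not open: every interval around $0$ contains negative numbers, all of which lie outside $f^{-1}(Y)$. So the "backward definition" detects the tear at $0$.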
Useful maps are, in context, those which preserve structure. An embedding of rings is a function which maps one ring into another. A map between vector spaces is linear. Maps between topological spaces are continuous.
Why? Well. The structure in a topological space is not the space, or a subset of the space, but rather a set of subsets of the space. And the preimage function $f^{-1}\colon\mathcal P(Y)\to\mathcal P(X)$ respects unions and intersections (whereas the direct image function does not respect intersections).
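The claim about unions and intersections can be checked on a small finite example. The following is only a sketch: the map `f` and the sets `A1`, `A2`, `B1`, `B2` are hypothetical, chosen so that `f` is not injective, which is exactly when direct images can fail to respect intersections.

```python
def image(f, A):
    """Direct image f(A) of a subset A of the domain."""
    return {f[x] for x in A}


def preimage(f, B):
    """Preimage f^{-1}(B) of a subset B of the codomain."""
    return {x for x in f if f[x] in B}


# Hypothetical non-injective map from {1, 2, 3} to {"a", "b"}.
f = {1: "a", 2: "a", 3: "b"}

A1, A2 = {1, 3}, {2, 3}      # subsets of the domain
B1, B2 = {"a", "b"}, {"b"}   # subsets of the codomain

# Preimages commute with both unions and intersections.
pre_union_ok = preimage(f, B1 | B2) == preimage(f, B1) | preimage(f, B2)
pre_inter_ok = preimage(f, B1 & B2) == preimage(f, B1) & preimage(f, B2)

# Direct images commute with unions ...
img_union_ok = image(f, A1 | A2) == image(f, A1) | image(f, A2)

# ... but for intersections only one inclusion holds in general.
img_of_inter = image(f, A1 & A2)            # f(A1 ∩ A2)
inter_of_img = image(f, A1) & image(f, A2)  # f(A1) ∩ f(A2)

print(pre_union_ok, pre_inter_ok, img_union_ok)
print(img_of_inter, inter_of_img, img_of_inter == inter_of_img)
```

Here $f(A_1 \cap A_2)$ is strictly smaller than $f(A_1) \cap f(A_2)$, because the points $1$ and $2$ are distinct but have the same image.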
So in some sense, a continuous function from $X\to Y$ is telling you of a subspace of $Y$ which in some sense can be embedded into $X$.
To say that “$\mathcal U=f^{-1}(\mathcal O)$ is not open” for the open set $\mathcal O$ means that the complement of $\mathcal U$ approaches a point $x\in\mathcal U$.
But since $\mathcal O$ is open, the images of points outside of $\mathcal U$ approaching $x$ cannot approach the image of $x$. Thus $\mathcal O$ witnesses that $x$ has been torn from $\mathcal U^c$ by $f$.
The definition does capture the intuition you mention, but perhaps its "contravariance" is what makes the intuition hard to accept.
What you describe in your last paragraph is called an open mapping. For example, a continuous bijective mapping $f:X \to Y$ (between topological spaces $X,Y$) is a homeomorphism iff $f$ is an open mapping.
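To see that being an open mapping and being continuous are genuinely different conditions, here is a standard example (my own addition, not from the question). The map $g:\mathbb{R} \to \mathbb{R}$, $g(x) = x^2$, is continuous, but it is not an open mapping, since
$$ g\big((-1,1)\big) = [0,1), $$
and $[0,1)$ is not open in $\mathbb{R}$: no interval around $0$ is contained in it.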
Usually, students get introduced to continuity in $\mathbb{R}$ by the $\varepsilon,\delta$-Definition that says that a real function $f: \mathbb{R} \to \mathbb{R}$ is continuous at a point $\xi$ if for every $\varepsilon > 0$ there is a $\delta > 0$ s.t. $$ \vert x - \xi \vert < \delta \Rightarrow \vert f(x)-f(\xi) \vert < \varepsilon $$
However, this definition uses the metric on $\mathbb{R}$. In a topological space $X$, where a metric need not exist, there is an analogous definition of continuity of a function at a given point $\xi$:
$f:X \to X$ is continuous at $\xi \in X$ if for every neighbourhood $V$ of $f(\xi)$ there is a neighbourhood $U$ of $\xi$ such that $$ f(U) \subseteq V $$
which is exactly what the $\varepsilon,\delta$-criterion provides in metric spaces.
Given these definitions, I have always found the general idea behind the concept of continuity to be much clearer.
Of course, continuity at every point is equivalent to continuity in the global sense (preimages of open sets are open). Here is one direction of that equivalence:
Let $f:X \to Y$ be a mapping between topological spaces $X,Y$. Let $f$ be continuous at every $x \in X$ and let $V$ be an open subset of $Y$, i.e. $V \subset Y, V \in \mathcal{T}_Y$.
Since $f$ is continuous at every $x \in X$, for every $x_0 \in f^{-1}(V)$ there is a neighbourhood $U_{x_0}$ of $x_0$ with $f(U_{x_0}) \subset V$ (note that $V$ is a neighbourhood of $f(x_0)$, because it is open and contains $f(x_0)$). By the definition of a neighbourhood, $U_{x_0}$ contains an open set $\Omega_{x_0}$ with $x_0 \in \Omega_{x_0}$, and then $f(\Omega_{x_0}) \subset V$. Hence for every $x_0 \in f^{-1}(V)$ there is an open set $\Omega_{x_0}$ such that $x_0 \in \Omega_{x_0} \subset f^{-1}(f(\Omega_{x_0}))\subset f^{-1}(V)$. Therefore $f^{-1}(V) = \bigcup_{x_0 \in f^{-1}(V)} \Omega_{x_0}$ is a union of open sets, so it is an open subset of $X$, i.e. $f^{-1}(V) \in \mathcal{T}_X$.
In addition to what has already been said, I want to point out that there is no way to define a relation on points in a topological space $X$ that captures "$x$ is near to $y$", since this is a statement depending on scale. If you consider the real numbers $0$ and $0.0001$ near to each other, then just zoom in a lot and realize that maybe they aren't.
Referring to the same notion as Frunobulax did in their answer, you can define a notion of a point $x\in X$ touching a subset $A\subseteq X$. For example in the reals $0$ is intuitively touching $(0,1]$, and that stays true no matter how far you zoom in! In usual topological terms this relation is expressed as "$x$ is in the closure $\overline{A}$ of $A$". And this does indeed describe continuity in a forward fashion: If $x$ is touching $A$, then $f(x)$ should be touching $f(A)$. In other words $$f\Big(\overline A\Big) \subset \overline{f(A)},$$ which is equivalent to the usual definition of continuity.
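On a finite topological space this equivalence can be checked by brute force. The sketch below is not a proof, only a sanity check. It compares the "touching" criterion with the usual preimage definition for every map from the Sierpiński space (points $0, 1$ with open sets $\emptyset$, $\{1\}$, $\{0,1\}$) to itself. All function names are hypothetical helpers written for this illustration.

```python
from itertools import combinations, product


def subsets(points):
    """All subsets of a finite collection of points, as frozensets."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]


def closure(A, points, opens):
    """Points x that 'touch' A: every open set containing x meets A."""
    return {x for x in points
            if all(U & A for U in opens if x in U)}


def continuous_by_preimages(f, opens_X, opens_Y):
    """Usual definition: the preimage of every open set is open."""
    return all(frozenset(x for x in f if f[x] in V) in opens_X
               for V in opens_Y)


def continuous_by_closures(f, points_X, opens_X, points_Y, opens_Y):
    """Forward criterion: f(closure(A)) is a subset of closure(f(A))."""
    for A in subsets(points_X):
        lhs = {f[x] for x in closure(A, points_X, opens_X)}
        rhs = closure({f[a] for a in A}, points_Y, opens_Y)
        if not lhs <= rhs:
            return False
    return True


# Sierpinski space: two points, with {1} open and {0} not open.
points = [0, 1]
opens = {frozenset(), frozenset({1}), frozenset({0, 1})}

# Compare the two criteria on every map from the space to itself.
results = {}
for values in product(points, repeat=len(points)):
    f = dict(zip(points, values))
    results[values] = (
        continuous_by_preimages(f, opens, opens),
        continuous_by_closures(f, points, opens, points, opens),
    )

for values, (by_pre, by_cl) in results.items():
    print(values, by_pre, by_cl)
```

The map that swaps $0$ and $1$ fails both tests: $0$ touches $\{1\}$, but its image $1$ does not touch the image $\{0\}$.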