How did Hecke come up with Hecke-operators?

This isn't going to be a perfect answer, and I'm not sure that it is what motivated Hecke (I haven't read extensively enough to say), but I think a pretty nice way of thinking about Hecke operators on modular forms is as follows.

First, we define a lattice $L \subset \mathbb{C}$ to be a subgroup $\mathbb{Z} \omega_1 +\mathbb{Z} \omega_2$, with $\omega_1, \omega_2$ linearly independent over $\mathbb{R}$. We always order the basis so that $\omega_1/\omega_2$ is in the upper half-plane. Let $S$ be the set of all lattices in $\mathbb{C}$, and let $R$ be the free abelian group (that is, the free $\mathbb{Z}$-module) generated by $S$.

Next, for $k \in \mathbb{Z}$ we define a weight $k$ lattice function to be a function $F: S \rightarrow \mathbb{C}$ satisfying a homogeneity condition: that is $F(\lambda L) = \lambda^{-k} F(L)$ for all $L \in S, \lambda \in \mathbb{C}^*$. Let the set of lattice functions of weight $k$ be $LF_k$.

Then it is not difficult to see that there is a bijection between suitable lattice functions of weight $k$ and modular forms of weight $k$, given by the following pair of maps:

$$ \sigma: f \in M_k(SL_2(\mathbb{Z})) \mapsto (\mathbb{Z} \omega_1 +\mathbb{Z} \omega_2 \mapsto \omega_2^{-k} f(\omega_1/\omega_2)) \in LF_{k}$$ where $\omega_1/\omega_2$ is in the upper-half plane

$$\tau: F \in LF_{k} \mapsto (z \mapsto F(\mathbb{Z}z+\mathbb{Z})) \in M_k(SL_2(\mathbb{Z}))$$

You need to check that these are well-defined (in particular, that $\sigma(f)$ does not depend on the choice of basis $\omega_1, \omega_2$ of the lattice) and that $\sigma, \tau$ are mutually inverse. With this in mind, we make the following two definitions:

For $n \in \mathbb{N}$ let $T(n): R \rightarrow R; L \mapsto \sum_{\substack{L' \subset L \\ [L:L']=n}}L'$ be the $n$-th Hecke operator, extended linearly to all of $R$.
For $n \in \mathbb{N}$ let $R(n): R \rightarrow R; L \mapsto nL$, again extended linearly.

These seem like relatively simple, fairly natural functions to consider on lattices - they are probably the simplest two such functions we could think of on $R$. You may show without too much difficulty the following properties of $T, R$:

If $\gcd(n,m)=1$ then $T(m)T(n)=T(nm)=T(n)T(m)$
If $p$ is prime and $r \geq 1$ then $T(p^r)T(p)=T(p^{r+1})+pR(p)T(p^{r-1})$
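Both relations can be sanity-checked by machine on the lattice $\mathbb{Z}^2$. The following is a minimal sketch, not a proof: it assumes that sublattices are stored as triples `(a, b, d)` meaning the basis $(a,b),(0,d)$ with $0 \leq b < d$ (a canonical upper-triangular form), and that elements of $R$ are stored as `Counter` objects. The helper names `hnf`, `sub`, `T`, `R` and `relation_holds` are my own.

```python
from collections import Counter

def hnf(n):
    """Index-n sublattices of Z^2 as triples (a, b, d): basis (a, b), (0, d)."""
    return [(a, b, n // a) for a in range(1, n + 1) if n % a == 0
            for b in range(n // a)]

def sub(lat, h):
    """Sublattice of `lat` whose coordinates in the basis of `lat` are `h`."""
    a, b, d = h
    a2, b2, d2 = lat
    # Product of the two upper-triangular matrices, then reduce the
    # top-right entry modulo the new bottom-right entry.
    return (a * a2, (a * b2 + b * d2) % (d * d2), d * d2)

def T(n, x):
    """T(n) extended linearly to a formal sum x of lattices."""
    out = Counter()
    for lat, c in x.items():
        for h in hnf(n):
            out[sub(lat, h)] += c
    return out

def R(n, x):
    """R(n): scale every lattice in the formal sum x by n."""
    return Counter({(n * a, n * b, n * d): c for (a, b, d), c in x.items()})

def relation_holds(p, r, x):
    """Check T(p^r) T(p) = T(p^(r+1)) + p R(p) T(p^(r-1)) on x."""
    lhs = T(p ** r, T(p, x))
    rhs = T(p ** (r + 1), x)
    for lat, c in R(p, T(p ** (r - 1), x)).items():
        rhs[lat] += p * c
    return lhs == rhs

Z2 = Counter({(1, 0, 1): 1})  # the lattice Z^2 itself
print(T(2, T(3, Z2)) == T(6, Z2))  # coprime case
print(relation_holds(2, 1, Z2))    # prime-power case
```

For $p=2$, $r=1$ the lattice $2\mathbb{Z}^2$ appears three times in $T(2)T(2)$ but only once in $T(4)$, which is exactly what the correction term $2R(2)$ accounts for.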

Now we can try to make a definition of the Hecke operator of a weight $k$ lattice function $F$ as follows:

$$ T(n)(F) : L \mapsto \sum_{\substack{L' \subset L \\ [L:L']=n}}F(L') $$

However, this doesn't turn out to be quite the right definition - we need some kind of extra scaling factor in front of the sum so that things turn out nicely. So we make the following definition:

$$ T(n)(F) : L \mapsto n^{k-1}\sum_{\substack{L' \subset L \\ [L:L']=n}}F(L') $$

Now, where does this $n^{k-1}$ come from? Well, if we rearrange our sum a bit, the homogeneity of $F$ shows that

$$ T(n)(F) : L \mapsto \frac{1}{n}\sum_{\substack{L \subset L'' \\ [L'':L]=n}}F(L'') $$
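To spell out the rearrangement (a sketch of the step, using only the definitions above): every $L' \subset L$ of index $n$ contains $nL$, so $L'' := n^{-1}L'$ contains $L$, and $[L'':L] = [L':nL] = n^2/n = n$. The map $L' \mapsto n^{-1}L'$ is a bijection between the two index sets, and $F(L') = F(nL'') = n^{-k}F(L'')$, hence

$$ n^{k-1}\sum_{\substack{L' \subset L \\ [L:L']=n}}F(L') = n^{k-1}\sum_{\substack{L \subset L'' \\ [L'':L]=n}}F(nL'') = n^{k-1}\, n^{-k}\sum_{\substack{L \subset L'' \\ [L'':L]=n}}F(L'') = \frac{1}{n}\sum_{\substack{L \subset L'' \\ [L'':L]=n}}F(L'') $$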

So this Hecke operator really is averaging, in some sense, the values of $F$. Going back to the previous description of $T(n)(F)$, if we make use of the bijection afforded to us by $\sigma$, then we can try to describe $T(n)(f)$ by computing $T(n)(\sigma(f))$ as a lattice function and then changing it back to a modular form.

For this, we use that the sublattices $L' \subset L$ of index $n$ are in bijection with integer matrices $\left(\begin{array}{cc} a & b \\ 0 & d \end{array}\right) \in M_2(\mathbb{Z})$ with $ad=n$ and $0 \leq b \leq d-1$. (These matrices have determinant $n$, so they do not lie in $GL_2(\mathbb{Z})$ unless $n=1$.) If $L=\mathbb{Z} \omega_1 +\mathbb{Z} \omega_2$, then the $L'$ corresponding to $\left(\begin{array}{cc} a & b \\ 0 & d \end{array}\right)$ will be $L'=\mathbb{Z} (a\omega_1+b\omega_2) +\mathbb{Z} d\omega_2$.
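This correspondence can be checked by brute force for small $n$ on $L=\mathbb{Z}^2$. The sketch below assumes the standard fact that an index-$n$ sublattice contains $n\mathbb{Z}^2$, so it is determined by its image in $(\mathbb{Z}/n)^2$, which is a subgroup of order $n$. The function names are hypothetical and the search is deliberately naive.

```python
def hnf_sublattices(n):
    """Triples (a, b, d) with a*d = n and 0 <= b <= d-1."""
    return [(a, b, n // a) for a in range(1, n + 1) if n % a == 0
            for b in range(n // a)]

def points_mod_n(a, b, d, n):
    """Image in (Z/n)^2 of the lattice Z(a, b) + Z(0, d)."""
    return frozenset(((x * a) % n, (x * b + y * d) % n)
                     for x in range(n) for y in range(n))

def brute_force(n):
    """All subgroups of (Z/n)^2 of order n, found via pairs of generators."""
    elems = [(x, y) for x in range(n) for y in range(n)]
    found = set()
    for u in elems:
        for v in elems:
            g = frozenset(((s * u[0] + t * v[0]) % n,
                           (s * u[1] + t * v[1]) % n)
                          for s in range(n) for t in range(n))
            if len(g) == n:
                found.add(g)
    return found

for n in range(1, 7):
    from_matrices = {points_mod_n(a, b, d, n)
                     for (a, b, d) in hnf_sublattices(n)}
    # Distinct matrices give distinct sublattices, and none are missed.
    print(n, len(hnf_sublattices(n)), from_matrices == brute_force(n))
```

The number of matrices is $\sum_{d \mid n} d$, so for a prime $p$ there are $p+1$ sublattices of index $p$.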

Putting all of the above together we conclude that $\tau(T(n)(\sigma(f)))$ has the form described by Hecke in his paper. This is the definition we take then for the Hecke operators of modular forms.
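Carried out explicitly, as a sketch using only the definitions above: take $L = \mathbb{Z}z + \mathbb{Z}$, so that $\omega_1 = z$, $\omega_2 = 1$. The sublattice attached to $\left(\begin{array}{cc} a & b \\ 0 & d \end{array}\right)$ is $\mathbb{Z}(az+b)+\mathbb{Z}d$, on which $\sigma(f)$ takes the value $d^{-k}f\left(\frac{az+b}{d}\right)$, so

$$ \tau(T(n)(\sigma(f)))(z) = n^{k-1}\sum_{\substack{ad=n \\ a,d>0}}\ \sum_{b=0}^{d-1}\sigma(f)\big(\mathbb{Z}(az+b)+\mathbb{Z}d\big) = n^{k-1}\sum_{\substack{ad=n \\ a,d>0}}\ \sum_{b=0}^{d-1} d^{-k} f\left(\frac{az+b}{d}\right) $$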

Stein & Ribet give a useful description of all this, with all the proofs.

When we try to study something with a nice rich mathematical structure, morphisms of those structures are often the best way to approach things. Hopefully the above is convincing enough that the Hecke operators are fairly natural morphisms to consider when we speak about modular forms. They have all kinds of special significance, but the one that comes to mind most immediately for me is that all of the Eisenstein series are going to be eigenforms, that is, they are eigenvectors for each Hecke operator. That's a pretty stunning property.
