Application of Borel Cantelli lemma 2
You don't actually need BC here. That's more for sequences of events. Here, you have a fixed event.
Consider sequential block so 16 letters (the number of letters in the phrase, including spaces). In a given block, the probability of seeing the correct phrase is $p := 1/N^{16}$, where $N$ is the number of keys. This is due to your assumptions on uniformity and independence.
Now, what's the probability that you don't see this in the first $k$ blocks. The blocks are disjoint, so the probabilities are independent. Thus, the probability is $(1 - p)^k$. Indeed, you're just asking for $k$ independent events, each with probability $1 - p$ to happen.
Now, clearly $(1 - p)^k \to 0$ as $k \to \infty$, since $p \ne 0$. So this probability becomes smaller than any threshold, if we're allowed to take $k$ larger and larger. More precisely, let $q := \Pr(\text{never see phrase})$. We have shown that $q \le (1 - p)^k$---in fact, this is "never see before 16$k$ letter", but this is a stronger claim, so the inequality is the correct way around. Letting $k \to \infty$, we see that $$ q \le \limsup_{k \to \infty} (1 - p)^k = 0. $$ Thus, $q = 0$, as desired.
Incidentally, you can get tighter bounds on the decay in terms of the number of letters seen using renewal theory. Basically, if the first letter isn't "A", then you don't need to wait for a new block. You have to very carefully set up the locations of the blocks.
Hint: "All the worlds a stage" has $\ell$ symbols. Imagine dividing the infinite sequence of letters typed by the monkey into disjoint blocks of length $\ell$. Let $A_n$ be the event that the $n^{th}$ block is equal to "All the worlds a stage." Since the events $A_n$ are independent, you can apply the second Borel-Cantelli lemma to the sequence $(A_n)_{n\ge 1}$ to conclude that with probability one, infinitely many $A_n$ will occur. Just show that $\sum_{n=1}^\infty P(A_n)=\infty$; it is a very easy sum in this case. In particular, $A_n$ will occur at least once.