Why is it difficult to define an $n$-category?

It's easy to define strict $n$-categories (meaning that composition is strictly associative at all levels and so forth): if you have a definition of strict $(n-1)$-category, you just define strict $n$-categories to be categories enriched over strict $(n-1)$-categories. The first problem is that many natural examples aren't strict, although when $n = 2$ you can hope to strictify them. The more important problem is that when $n \ge 3$ many natural examples aren't strictifiable at all.
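
Spelled out, the inductive definition reads roughly as follows. This is only a sketch: the names $\mathrm{StrCat}_n$ and $\mathcal{V}\text{-}\mathrm{Cat}$ (categories enriched over $\mathcal{V}$) are notation introduced here, and the enrichment is assumed to be with respect to the cartesian product.

$$\mathrm{StrCat}_0 = \mathrm{Set}, \qquad \mathrm{StrCat}_n = \mathrm{StrCat}_{n-1}\text{-}\mathrm{Cat} \quad (n \ge 1).$$

So a strict $1$-category is an ordinary category, and a strict $2$-category has a category $\operatorname{Hom}(x, y)$ for each pair of objects, with composition functors $\operatorname{Hom}(y, z) \times \operatorname{Hom}(x, y) \to \operatorname{Hom}(x, z)$ that are associative and unital on the nose.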

For example, whatever an $n$-category ought to be, every topological space ought to have a fundamental $n$-groupoid whose objects are points, whose morphisms are paths, whose $2$-morphisms are homotopies, and so forth. This is not naturally a strict $n$-groupoid (composition of paths is already not associative, but only associative "up to coherent higher homotopy"), and when $n \ge 3$ it generally isn't strictifiable: it turns out that strictifiability implies that Whitehead brackets vanish, and so already the fundamental $3$-groupoid of $S^2$ is not equivalent to a strict $3$-groupoid.
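
To make the $S^2$ example concrete, here is the standard computation behind it, stated up to sign conventions. Write $\iota \in \pi_2(S^2) \cong \mathbb{Z}$ for the class of the identity map and $\eta \in \pi_3(S^2) \cong \mathbb{Z}$ for the class of the Hopf map (the symbols $\iota$ and $\eta$ are notation chosen for this illustration). The Whitehead bracket $\pi_2(S^2) \times \pi_2(S^2) \to \pi_3(S^2)$ then satisfies

$$[\iota, \iota] = \pm 2\eta \neq 0,$$

so the bracket on $S^2$ does not vanish, which is exactly the obstruction to strictifying its fundamental $3$-groupoid.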

The difficulty of defining $n$-categories comes from figuring out all the coherence data and conditions necessary to define weak $n$-categories so that they capture, at the very least, $n$-groupoids as they arise in algebraic topology (see also the homotopy hypothesis). To get a sense of how hard this is to do "by hand," see, for example, Todd Trimble's definition of a weak $4$-category.

Even with strict $n$-groupoids, the naive definition of a functor between them also fails to capture topological phenomena: for example, every group $G$ gives rise to a strict groupoid $BG$ with one object and automorphisms $G$, and every abelian group $A$ gives rise to a strict $2$-groupoid $B^2 A$ with one object, one morphism, and $2$-automorphisms $A$. The "correct" set of equivalence classes of morphisms $BG \to B^2 A$ is the second cohomology group $H^2(BG, A)$, but the naive description of a functor between strict $2$-groupoids won't get you this. The problem is that functors, too, cannot be required to preserve composition and so forth strictly; they must instead come with their own coherence data and conditions.
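
As a sketch of what this extra coherence data looks like in the example (writing $A$ additively, ignoring unit normalization, and using $\phi$, $\psi$ as notation introduced for this illustration): a strict functor $BG \to B^2 A$ carries no information, since $BG$ has only identity $2$-morphisms and $B^2 A$ has only one morphism. A weak functor $F$ instead comes with a $2$-isomorphism $\phi(g, h) \in A$ comparing $F(g) \circ F(h)$ with $F(gh)$ for each pair $g, h \in G$, and the coherence condition on triples is the $2$-cocycle identity

$$\phi(h, k) - \phi(gh, k) + \phi(g, hk) - \phi(g, h) = 0 \qquad (g, h, k \in G).$$

Two such functors $\phi$ and $\phi'$ are equivalent when they differ by a coboundary, that is, when there is a function $\psi \colon G \to A$ with

$$\phi'(g, h) - \phi(g, h) = \psi(h) - \psi(gh) + \psi(g).$$

Cocycles modulo coboundaries is the usual description of the degree $2$ cohomology of $G$ with coefficients in $A$ (with trivial action), which is how the group $H^2(BG, A)$ appears.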

As Qiaochu said, it's all about the coherence diagrams. The first non-trivial one is the pentagon axiom in the definition of a monoidal category, which also appears in the definition of a (weak) $2$-category. An excellent introduction to higher categories is given in