Method of characteristics for systems of PDE (vs. Lewy's example)
Any partial differential equation can be written as a first order system, so basically what you are asking is how to generalize method of characteristics to general PDEs and systems. As you have already observed, in general it is hopeless. However, there are some things that can be said.
For general PDEs and systems, the notion of characteristic surfaces plays a crucial role, which can be considered as a substitute for characteristic curves. Further, when we study high frequency asymptotics of (or how singularities propagate under) a general linear PDE, we are led to a fully nonlinear first order equation (of Hamilton-Jacobi type), which can be solved by the method of characteristics. The "characteristic curves" that arise are called bicharacteristics, which of course lie in a higher dimensional space (with double the dimension of the space of independent variables). Bicharacteristics refine the notion of characteristic surfaces, and are especially important for hyperbolic and dispersive equations, and there are many numerical methods based on this structure. Basically, what this means is that to solve the wave equation, you start with geometric optics, or to solve the Schrödinger equation, you start with classical mechanics. On the analytic side this idea eventually led to the powerful technique of Fourier integral operators.
Another thing I wanted to mention is that in 1 space dimension, if your (hyperbolic) system is simple enough, locally it would look like a collection of first order scalar equations, and you can use method of characteristics componentwise.