Sturm-Liouville Questions

Thinking about Sturm-Liouville theory a bit, I see I have no real idea what is going on.

The first issue I have is that my book began with the statement that given

$$L[y] = a(x)y'' + b(x)y' + c(x)y = f(x)$$

the problem $L[y] \ = \ f$ can be re-cast in the form $L[y] \ = \ \lambda y$.

Now it could be a typo on their part but I see no justification for the way you can just do that!

More important, though, is the motivation for Sturm-Liouville theory in the first place. The story as I know it is as follows:

Given a linear second order ODE

$$F(x,y,y',y'') = L[y] = a(x)y'' + b(x)y' + c(x)y = f(x)$$

it is an exact equation if it is derivable from a differential equation of one order lower, i.e.

$$F(x,y,y',y'') = \frac{d}{dx}g(x,y,y').$$

The equation is exact iff

$$a''(x) - b'(x) + c(x) = 0. $$
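
A short sketch of where this condition comes from (the grouping below is one way to organise the calculation, not necessarily the book's): pull as much of $L[y]$ as possible into a total derivative,

$$a y'' + b y' + c y \ = \ \frac{d}{dx}\Big[a y' + (b - a')y\Big] \ + \ (a'' - b' + c)\,y.$$

The leftover term vanishes for every $y$ exactly when $a'' - b' + c = 0$, in which case $g(x,y,y') = a(x)y' + (b(x) - a'(x))y$.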

If $F$ is not exact it can be made exact on multiplication by a suitable integrating factor $\alpha(x)$.

The multiplied equation $\alpha(x)L[y] = \alpha(x)f(x)$ is exact iff

$$(\alpha(x)a(x))'' - (\alpha(x)b(x))' + \alpha(x)c(x) = 0 $$

If you expand this out you get the Adjoint operator

$$L^*[\alpha(x)] \ = \ (\alpha(x)a(x))'' \ - \ (\alpha(x)b(x))' \ + \ \alpha(x)c(x) \ = 0 $$

If you expand $L^*$ you see that we can satisfy $L \ = \ L^*$ if $a'(x) \ = \ b(x)$ & $a''(x) \ = \ b'(x)$ which then turns $L[y]$ into something of the form

$$L[y] \ = \ \frac{d}{dx}[a(x)y'] \ + \ c(x)y \ = \ f(x).$$
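
For reference, the expansion behind this claim, written out as a sketch using only the product rule:

$$L^*[\alpha] \ = \ (\alpha a)'' - (\alpha b)' + \alpha c \ = \ a\alpha'' + (2a' - b)\alpha' + (a'' - b' + c)\alpha.$$

Comparing coefficients with $L[\alpha] = a\alpha'' + b\alpha' + c\alpha$ gives $2a' - b = b$ and $a'' - b' + c = c$, i.e. $a' = b$ (the condition $a'' = b'$ then follows by differentiating). With $b = a'$ we have $a y'' + a' y' = \frac{d}{dx}[a y']$.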

Thus we seek an integrating factor $\alpha(x)$ that puts the equation in this form, & the condition under which this holds is that $\alpha(x) \ = \ \frac{1}{a(x)}e^{\int\frac{b(x)}{a(x)}dx}$

Then we're dealing with:

$$\frac{d}{dx}[\alpha(x)a(x)y'] \ + \ \alpha(x)c(x)y \ = \ \alpha(x)f(x)$$
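
As a sanity check on this integrating factor (a sketch; the example coefficients are chosen here for illustration and do not come from the book): requiring $(\alpha a)' = \alpha b$ and writing $p = \alpha a$ gives $p'/p = b/a$, so

$$p(x) \ = \ \alpha(x)a(x) \ = \ e^{\int\frac{b(x)}{a(x)}dx}, \qquad \frac{d}{dx}[p\,y'] \ = \ \alpha a\,y'' + \alpha b\,y'.$$

For instance, with $a = 1$ and $b = -2x$ this gives $\alpha = e^{-x^2}$, and $y'' - 2xy' + c\,y = f$ becomes $\frac{d}{dx}[e^{-x^2}y'] + e^{-x^2}c\,y = e^{-x^2}f$. Note that this $\alpha$ is fixed by the first-order condition $(\alpha a)' = \alpha b$, which makes the operator formally self-adjoint; that is a different requirement from solving $L^*[\alpha] = 0$, which is what exactness of $\alpha L[y]$ would need.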

But again, by what my book said they magically re-cast this problem as

$$\frac{d}{dx}[\alpha(x)a(x)y'] \ + \ \alpha(x)c(x)y \ = \ \lambda \alpha(x) y(x)$$

and then call

$$\frac{d}{dx}[\alpha(x)a(x)y'] \ + \ ( \alpha(x)c(x) \ - \ \lambda \alpha(x) )y(x) \ = \ 0$$

a Sturm-Liouville problem.

My question is: how can I make sense of everything I wrote above, and how can I clean it up & interpret it? At one stage I thought we were turning our 2nd order ODE into the derivative of a first order expression so that we can easily find first integrals; the next moment we're pulling out eigenvalues & finding full solutions. What's going on? I want to be able to look at $a(x)y'' \ + \ b(x)y' \ + \ c(x)y \ = \ f(x)$ & know how & why we're turning this into a Sturm-Liouville problem in a way that makes sense of exactness & integrating factors. Thanks for reading!


Solution 1:

The first issue I have is that my book began with the statement that given

$$ L[y] = a(x)y'' + b(x)y' + c(x)y = f(x) $$ the problem $L[y] = f$ can be re-cast in the form $L[y] = \lambda y$.

Now it could be a typo on their part but I see no justification for the way you can just do that!

You're right: knowing nothing of spectral theory of these operators, it is very difficult to justify why this is so a priori. Here's an attempt at showing you the connection between the two forms.

First, recognize that the two uses of $y$ above refer to different functions; so let's change notations. The ODE is $L[y]=f$, and the eigenvalue problem is $L[u_n]=\lambda_n u_n$. The subscript denotes that there may be a family of such solutions, indexed by a number $n$. I'm skipping some details about continuous spectra of operators, but the ideas I give can generally be extended. For now, assume that a countable sequence of $\lambda_n$ exists that solves the eigenvalue problem, and a corresponding sequence of eigenfunctions $u_n$.

Your book uses the fact that these two problems are related as follows. Suppose the boundary conditions of the ODE are such that the eigenvalue problem has a sequence of solutions $u_n(x)$ for $n=1,2,\ldots$, and that we have found these eigenvalues and eigenfunctions by solving $L[u_n]=\lambda_n u_n$. Further suppose that the function $f$ can be written as an infinite linear combination of these functions:

$$ f(x) = \sum_{n=1}^\infty a_n u_n(x) $$

Without getting into the details (they should be in your book later on), such a sequence $a_n$ can be found in practice for many operators, boundary conditions, and functions $f$. For now, it suffices to think that we picked the "best-fit" sequence of $a_n$ that fits the function $f$. Further, let $y$ be expanded in the same set of functions:

$$ y(x)=\sum_{n=1}^\infty b_n u_n(x) $$

Except this time, we don't know the $b_n$ coefficients, since we want to solve for $y$. Let's plug these into the ODE. For one side, I get

$$ L[y] = \sum_{n=1}^\infty b_n L[u_n(x)] = \sum_{n=1}^\infty b_n \lambda_n u_n(x) $$

Giving $$ L[y] = f\;\;\Rightarrow\;\;\sum_{n=1}^\infty b_n \lambda_n u_n(x) = \sum_{n=1}^\infty a_n u_n(x) $$

Matching term by term (which is justified when the $u_n$ are linearly independent, and assuming no $\lambda_n$ is zero), you can see that the unknown $b$ coefficients must be $b_n=a_n/\lambda_n$. Now plugging this into the solution form gives a concrete answer in terms of known quantities:

$$ y(x) = \sum_{n=1}^\infty \frac{a_n}{\lambda_n} u_n(x) $$

That is the solution to our problem, but note that it is written ONLY in terms of the $u_n$ functions and some coefficients. The $u_n$ functions were chosen to solve $L[u_n]=\lambda_n u_n$, the eigenvalue problem, NOT the original problem. However, you can see that if you know the solution to the eigenvalue problem, you know the solution to the original problem. This is what they mean about "re-casting" the problem.

Since the original problem is more complex, we solve the simpler eigenvalue problem instead, and superpose the solutions to solve the original problem. For operators of this kind, with the right conditions on the $a,b,c$ functions, this is always possible, so an ODE can be "re-cast" in terms of its eigenvalue problem. The eigenfunctions are determined by the operator and the boundary conditions, regardless of the forcing $f(x)$. Once you've found them, you write everything in terms of these eigenfunctions and the problem simplifies nicely.
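
To make the recipe concrete, here is a minimal numerical sketch in Python. Everything specific in it is an assumption chosen for illustration, not something from your book: the operator is $L[y] = y''$ on $[0,\pi]$ with $y(0)=y(\pi)=0$, whose eigenfunctions are $u_n = \sin(nx)$ with $\lambda_n = -n^2$; the forcing is $f(x) = 1$; and the helper names `sine_coefficients` and `solve_by_eigenfunctions` are made up. The coefficients $a_n$ are computed with a simple midpoint rule, and the result is compared with the closed-form solution $y = x(x-\pi)/2$.

```python
import math

def sine_coefficients(f, n_terms, n_quad=2000):
    """Approximate a_n = (2/pi) * integral_0^pi f(x) sin(n x) dx (midpoint rule)."""
    h = math.pi / n_quad
    coeffs = []
    for n in range(1, n_terms + 1):
        total = 0.0
        for k in range(n_quad):
            x = (k + 0.5) * h          # midpoint of the k-th subinterval
            total += f(x) * math.sin(n * x)
        coeffs.append(2.0 / math.pi * total * h)
    return coeffs

def solve_by_eigenfunctions(f, n_terms=50):
    """Return y with y'' = f, y(0) = y(pi) = 0, as a truncated eigenfunction sum."""
    a = sine_coefficients(f, n_terms)
    # Assumed eigenpairs: u_n = sin(n x), lambda_n = -n^2, so b_n = a_n / lambda_n.
    b = [a_n / (-(n * n)) for n, a_n in enumerate(a, start=1)]

    def y(x):
        return sum(b_n * math.sin(n * x) for n, b_n in enumerate(b, start=1))

    return y

# Forcing f(x) = 1; the exact solution of y'' = 1 with these boundary
# conditions is y(x) = x (x - pi) / 2.
y = solve_by_eigenfunctions(lambda x: 1.0)
x0 = math.pi / 2
exact = x0 * (x0 - math.pi) / 2
print("series:", y(x0), "exact:", exact, "abs error:", abs(y(x0) - exact))
```

Note that the forcing $f$ only enters through the coefficients $a_n$; the eigenpairs are computed (here, assumed known) once and reused for any other $f$ with the same operator and boundary conditions.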