What is the intuition behind the uniqueness condition for differential equations, namely that $f$ and $\frac{\partial f}{\partial y}$ are continuous?
Geometrically, we are given a direction field $f(t,y)$, and we seek an integral curve $y(t)$ (possibly over a smaller region) that is tangent to the slope lines defined by $f(t,y)$.
What happens if $f$ is not continuous?
Consider the ODE $\frac{dy}{dt} = f(t,y) = \frac{1}{t+1}$ with initial condition $y(0) = 0$. Here $f$ is not continuous at $t=-1$ (see the vertical line: an integral curve tangent to it can't exist).
However, $f$ is continuous over the interval $t \in [-0.5, 0.5]$. Therefore it is bounded there, with $|f(t,y)| \leq \frac{1}{-0.5+1}=2$. Geometrically, $f(t,y) = \frac{dy}{dt}$ is the slope of any solution $y$ passing through $(0,0)$, so the solution is contained in the gray area.
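As a small numerical sanity check of the bound above, the sketch below samples $[-0.5, 0.5]$ and tests that the solution through $(0,0)$ stays inside the cone $|y| \leq 2|t|$. It assumes the closed form $y(t) = \ln(1+t)$ for that solution and takes the gray area to be this cone; the function names and the grid size are illustrative choices.

```python
import math

def f(t, y):
    # Right-hand side of dy/dt = 1/(t+1); it does not depend on y.
    return 1.0 / (t + 1.0)

def exact_solution(t):
    # Solution through (0, 0): y(t) = ln(1 + t), valid for t > -1.
    return math.log(1.0 + t)

# Uniform grid on the interval [-0.5, 0.5].
ts = [-0.5 + k / 1000 for k in range(1001)]

# Largest slope on the interval; attained at the left endpoint t = -0.5.
max_slope = max(abs(f(t, 0.0)) for t in ts)

# The solution should stay in the cone |y| <= 2|t| spanned by the extreme slopes.
inside_cone = all(abs(exact_solution(t)) <= 2.0 * abs(t) + 1e-12 for t in ts)

print(max_slope, inside_cone)
```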
What happens if $\frac{\partial f}{\partial y}$ is infinite at some point?
Consider the ODE $$f(t,y) = \sqrt{|y|}, \qquad\qquad y(0) = 0$$
Although $f$ is continuous, its derivative $\frac{\partial f}{\partial y}$ is not: it is infinite at $y= 0$. At least two curves pass through $(0,0)$, namely $$y(t) \equiv 0 \qquad \mbox{and} \qquad y(t) = \begin{cases}\frac{t^2}{4} \quad \mbox{if } t\geq 0\\ -\frac{t^2}{4} \quad \mbox{otherwise}\end{cases}$$
Intuitively, the abrupt change of the slope lines of $f(t,y)$ around $y = 0$ (due to $\frac{\partial f}{\partial y}$ being infinite) allows distinct integral curves (which follow different slope lines elsewhere) to merge at $y = 0$. For example, for any $C \geq 0$, even $y(t) = \begin{cases}0 \qquad &\mbox{for}\ t< C\\\frac{(t-C)^2}{4} \qquad &\mbox{for} \ t\geq C \end{cases}$ is a solution, and so are other such combinations.
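The non-uniqueness can be checked numerically. The sketch below compares a central-difference derivative of each candidate with $\sqrt{|y|}$ on a grid; the grid, the step `h`, and the choice $C = 1$ are assumptions made for illustration only.

```python
import math

def f(y):
    # Right-hand side of dy/dt = sqrt(|y|).
    return math.sqrt(abs(y))

def zero_solution(t):
    return 0.0

def parabola_solution(t):
    # y = t^2/4 for t >= 0 and -t^2/4 otherwise.
    return t * t / 4.0 if t >= 0 else -t * t / 4.0

def delayed_solution(t, c=1.0):
    # Stays at 0 until t = c, then leaves along (t - c)^2 / 4.
    return 0.0 if t < c else (t - c) ** 2 / 4.0

def max_residual(y, t_min=-2.0, t_max=3.0, n=500, h=1e-6):
    # Largest |y'(t) - f(y(t))| on a grid, with y' from a central difference.
    worst = 0.0
    for k in range(n + 1):
        t = t_min + (t_max - t_min) * k / n
        dy = (y(t + h) - y(t - h)) / (2.0 * h)
        worst = max(worst, abs(dy - f(y(t))))
    return worst

candidates = (zero_solution, parabola_solution, delayed_solution)
residuals = [max_residual(y) for y in candidates]
print(residuals)
```

All three residuals should be tiny (limited only by the finite-difference step), yet the three functions agree at $t = 0$ and differ at, say, $t = 2$.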
Why is assuming $f$ and $\frac{\partial f}{\partial y}$ continuous (more than) enough?
Fast forward to the main idea
If two curves $y_n, y_m$ are close, we assume the slope lines of $f(t, y_n), f(t, y_m)$ to be proportionally close. This is the key assumption (Lipschitz continuity in $y$). It is implied (locally) when $f$ and $\frac{\partial f}{\partial y}$ are continuous, but it is weaker: it still admits nice nondifferentiable functions like $f(t, y) = |y|$.
If $Ay_n, Ay_m$ are two curves that follow the slope lines $f(t,y_n), f(t, y_m)$, then we can show that they end up strictly closer to each other than $y_n, y_m$ were. That is, $y_n, y_m$ get contracted into $Ay_n, Ay_m$.
Any such contraction mapping $A$ has a unique fixed point $y$ with $Ay = y$, i.e. $y = y_0 + \int_{t_0}^tf(\tau,y)d\tau$, or $\frac{dy}{dt} = f(t,y)$.
How to follow the slope line, successively?
Picard's idea is to find curves $y_i$ tangent to the slope lines defined by $f(\tau,y_{i-1})$ along the previous iterate, i.e. $\frac{d y_i}{dt} = f(t, y_{i-1})$, so that we may define: $$y_i(t) = Ay_{i-1}(t) := y_0 + \int_{t_0}^t f(\tau,y_{i-1}) d\tau$$
Consider the ODE $\frac{dy}{dt}=f(t,y) = y$, with $y_0 = 1$. The successive approximations are easily calculated from the Picard mapping: $y_1 = Ay_0 = 1 + t$, $y_2 = 1 + (t + t^2/2)$, ... i.e. $y_n = \sum_{i=0}^n t^i/i!$ for all $n$, which are the partial sums of the series expansion of $e^t$.
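The iteration for this example can be reproduced exactly with polynomial coefficients. This is a minimal sketch, assuming $t_0 = 0$ and storing a polynomial as a list of coefficients; the helper names `picard_step` and `picard_iterates` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def picard_step(coeffs, y0=Fraction(1)):
    # For dy/dt = y, the Picard map is (Ay)(t) = y0 + integral_0^t y(tau) dtau.
    # A polynomial is stored as [c0, c1, ...], meaning c0 + c1*t + c2*t^2 + ...
    integrated = [c / (k + 1) for k, c in enumerate(coeffs)]
    return [y0] + integrated

def picard_iterates(n):
    # Start from the constant function y_0(t) = 1 and apply the map n times.
    y = [Fraction(1)]
    for _ in range(n):
        y = picard_step(y)
    return y

y4 = picard_iterates(4)
print(y4)
```

After four steps the coefficients are $1, 1, \frac12, \frac16, \frac1{24}$, i.e. $1/k!$ for $k = 0, \dots, 4$.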
Key technical tools: distance and fixed points
We can measure the distance between two functions (continuous on a bounded interval) by their largest pointwise difference, i.e. at the time where the difference is maximal. This is the sup norm, which results in a very well-behaved space of functions: $$d(y_a, y_b) = \| y_a - y_b\|_\infty := \max_{t \in [a,b]} | y_a(t) - y_b(t)|$$
(In the figure, we'd be comparing the functions at $t = 4$)
If a map $A: M \to M$ always contracts the distance between any two points (functions, in this nice complete space), i.e. $d(Ay_a, Ay_b) \leq \lambda d(y_a, y_b)$ for some $0\leq\lambda < 1$, then a unique fixed point $y^* = Ay^*$ exists (the Banach fixed-point theorem).
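To see the theorem at work in the simplest setting, here is a sketch on real numbers rather than functions. The map $x \mapsto \cos x$ is chosen purely for illustration: it sends $[0,1]$ into itself and satisfies $|\cos'(x)| = |\sin x| \leq \sin 1 < 1$ there, so it is a contraction with $\lambda = \sin 1$. The helper `fixed_point` is hypothetical.

```python
import math

def fixed_point(A, x0, tol=1e-12, max_iter=1000):
    # Iterate x <- A(x) until successive iterates differ by less than tol.
    x = x0
    for _ in range(max_iter):
        x_next = A(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not converge")

# Two different starting points in [0, 1] should reach the same fixed point.
from_left = fixed_point(math.cos, 0.0)
from_right = fixed_point(math.cos, 1.0)
print(from_left, from_right)
```

Both runs end at the same number, which is the uniqueness part of the theorem; the Picard map plays the role of `A` when the points are functions.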
In a bit more detail
Since $f$ and $\frac{\partial f}{\partial y}$ are continuous, $\left|\frac{\partial f}{\partial y}\right| \leq K$ is bounded on a closed rectangle around $(t_0, y_0)$, and the mean value theorem then implies $|f(t,y_a) - f(t, y_b)| \leq K |y_a - y_b|$ for any two functions $y_a, y_b$ with values in that rectangle (Lipschitz continuity).
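Hence, for $|t - t_0| \leq \alpha$, $$|Ay_a(t) - Ay_b(t)| \leq \left|\int_{t_0}^t | f(\tau, y_a) - f(\tau,y_b) |\, d\tau\right| \leq K \left|\int_{t_0}^t |y_a - y_b|\, d\tau\right| \leq K \alpha\, d(y_a, y_b)$$ Taking the maximum over $t$ gives $d(Ay_a, Ay_b) \leq K\alpha\, d(y_a, y_b)$, which is a contraction (on a possibly smaller interval) when $K\alpha < 1$.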
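The estimate can be tested numerically. The sketch below assumes $f(t,y) = \sin y$ (so $K = 1$), $t_0 = 0$, $y_0 = 0$, and the one-sided interval $[0, \alpha]$ with $\alpha = 0.5$; the Picard map is discretized with the trapezoid rule, and the two test functions are arbitrary choices.

```python
import math

ALPHA = 0.5   # length of the time interval (assumed for illustration)
K = 1.0       # Lipschitz constant of f(t, y) = sin(y), since |cos(y)| <= 1
N = 1000      # number of grid steps
TS = [ALPHA * k / N for k in range(N + 1)]

def f(t, y):
    return math.sin(y)

def picard(y_vals, y0=0.0):
    # Discretized Picard map: y0 + integral_0^t f(tau, y(tau)) dtau,
    # with the integral accumulated by the trapezoid rule on the grid TS.
    out = [y0]
    for k in range(1, len(TS)):
        h = TS[k] - TS[k - 1]
        area = 0.5 * h * (f(TS[k - 1], y_vals[k - 1]) + f(TS[k], y_vals[k]))
        out.append(out[-1] + area)
    return out

def sup_distance(u, v):
    # Sup-norm distance between two functions sampled on the same grid.
    return max(abs(a - b) for a, b in zip(u, v))

y_a = [t for t in TS]                  # y_a(t) = t
y_b = [math.cos(3 * t) for t in TS]    # y_b(t) = cos(3t)

before = sup_distance(y_a, y_b)
after = sup_distance(picard(y_a), picard(y_b))
print(before, after, after <= K * ALPHA * before)
```

With $K\alpha = 0.5$, the distance after applying the map should be at most half the distance before.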
Therefore, the Picard map $A$ defined above is indeed a contraction, giving the unique fixed point $y = Ay = y_0 + \int_{t_0}^t f(\tau, y)d\tau$.
*Lots of details are omitted: justifying where $Ay_a$ lies, the reduced size of the region, the weaker Lipschitz assumption, among others. For an intuitive but rigorous introduction, see "Ordinary Differential Equations" by V. I. Arnol'd.