If $\frac{dy}{dt}dt$ doesn't cancel, then what do you call it?
Solution 1:
This is closely related to the recent question asking whether $\frac{dy}{dt}$ was a fraction or not, as you note.
As in that question, when Leibniz first came up with the notation for integrals (the $\int$ sign was really a capital $S$, standing for "summa", summation), and he wrote $S_a^b f(x)\,dx$, he was really thinking of this as a sum of products, with $dx$ representing an "infinitesimal change in $x$." He was thinking of dividing the interval $[a,b]$ into an infinite number of "infinitesimally thin" rectangles, each of height $f(x_i)$ (for whichever $x_i$ happened to lie in it), and then adding up the areas of all these infinitesimally thin rectangles. Such a rectangle would have area $f(x_i)\,dx$ (height times base), and adding them together would yield the total area.
If you take this point of view (ignoring for a moment the fact that infinitesimals can't exist in the usual real numbers), then the First Fundamental Theorem of Calculus follows by "simple" arithmetic with a telescoping sum (assuming $\frac{dy}{dt}$ is continuous, say). The integral $$\int_a^b\frac{dy}{dt}\,dt$$ is really the sum of quotients of infinitesimal changes in $y$ by infinitesimal changes in $t$, each multiplied by the infinitesimal change in $t$. If you think of $[a,b]$ as divided into infinitesimally thin subintervals $$a=t_0\lt t_0+dt\lt t_0+2dt\lt\cdots \lt b$$ then on the $k$th subinterval $dy = y(t_0+(k+1)dt) - y(t_0+kdt)$, so that the integral becomes the telescoping sum $$\sum_k \Bigl(y(t_0+(k+1)dt) - y(t_0+kdt)\Bigr) = y(b)-y(a)$$ because all the "middle terms" cancel out. So here, $dt$ really does cancel with the $dt$ in $\frac{dy}{dt}$, the sum telescopes, and the final answer comes out exactly as desired.
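The telescoping cancellation above can be checked with an ordinary finite partition in place of infinitesimals. A minimal numerical sketch, with $y(t) = t^2$ on $[1,3]$ chosen purely as an example:

```python
# Summing the finite differences y(t_{k+1}) - y(t_k) over a partition of [a, b]
# telescopes to y(b) - y(a), no matter how fine the partition is.
# The function y(t) = t**2 and the interval [1, 3] are arbitrary illustrations.

def y(t):
    return t ** 2

a, b, n = 1.0, 3.0, 1000          # partition [1, 3] into 1000 subintervals
dt = (b - a) / n
ts = [a + k * dt for k in range(n + 1)]

# Each term is (dy/dt) * dt written as a plain difference dy on one subinterval.
telescoping_sum = sum(y(ts[k + 1]) - y(ts[k]) for k in range(n))

print(telescoping_sum)            # y(b) - y(a) = 9 - 1 = 8, up to rounding
```

The sum is exactly $y(b)-y(a)$ in exact arithmetic; only floating-point rounding introduces a tiny error.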
But, as with the derivative, there is a legion of logical problems with this way of thinking, not the least of which is that infinitesimals can't really exist in the usual setting of real numbers. So calculus had to be rewritten. There were proposals by Cauchy on how to define integrals, and eventually we had Riemann's way of defining integrals as limits. So that when we write $\int_a^b f(t)\,dt$ we no longer mean a sum of products of the form $f(t)\,dt$, but rather we mean a limit of certain Riemann sums. In this view, the "$dt$" is rather more like the symbol to balance the $\int_a^b$. Think of the integral sign on the left as a "left parenthesis", and the $dt$ on the right as the "right parenthesis" that closes out the expression.
So, just as $\frac{df}{dt}$ no longer literally means "the quotient of an infinitesimal change in $f$ by an infinitesimal change in $t$" but rather "the limit as $h$ goes to $0$ of $(f(t+h)-f(t))/h$," so $\displaystyle \int_a^b f(t)\,dt$ no longer literally means "the sum of infinitesimally thin rectangles of height $f(t)$ from $a$ to $b$", but instead means "the limit of Riemann sums of $f(t)$ over partitions of $[a,b]$ as the mesh goes to $0$."
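That limit can be watched happen numerically: as the partition gets finer, the Riemann sums approach the integral. A small sketch, using $f(t)=\cos t$ on $[0,\pi/2]$ purely as an illustration (the exact value is $\sin(\pi/2)-\sin 0 = 1$):

```python
# Left-endpoint Riemann sums of f(t) = cos(t) over [0, pi/2]: as the mesh
# shrinks (n grows), the sums converge to the integral, here sin(pi/2) - sin(0) = 1.

import math

def riemann_sum(f, a, b, n):
    """Left-endpoint Riemann sum of f over [a, b] with n equal subintervals."""
    dt = (b - a) / n
    return sum(f(a + k * dt) * dt for k in range(n))

a, b = 0.0, math.pi / 2
exact = math.sin(b) - math.sin(a)   # = 1

for n in (10, 100, 1000):
    approx = riemann_sum(math.cos, a, b, n)
    print(n, approx, abs(approx - exact))   # the error shrinks with the mesh
```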
But one of the great advantages of Leibniz notation is that it is very suggestive. The First Fundamental Theorem of Calculus looks very natural in Leibniz notation: $$\int _a^b \frac{df}{dt}\;dt = f(b) - f(a),$$ suggesting that you are "cancelling $dt$", even though, since infinitesimals don't really exist, you are not literally doing that. But good notation is not something to be cast aside, and Leibniz notation, being suggestive, is very good notation, so we keep it because it helps with calculations.
Where did the "$dt$" go? Well, one might ask where the "$)$" goes in the following calculation: $$2\times(3+5) = 16.$$ So... where did the "$)$" go (or where did "$\times$", "$+$," and "$($" all go)? Same place as the "$dt$" went: since it is part of the notation, it "goes away" when we are done with the evaluation.
Note. As with derivatives, with Nonstandard Analysis one can write calculus so that the $dt$ in the integral really represents a quantity you are multiplying by and then adding, so that in nonstandard analysis the First Fundamental Theorem of Calculus really is just the observation that if you divide by $dx$ and multiply by $dx$, then the two cancel out.
Solution 2:
This is where the Fundamental theorem of calculus comes in handy.
It comes in two flavors:
The first flavor (Riemann Setting)
Let $f(x)$ be a continuous function in $x$ on an interval $[a,b]$.
$\displaystyle F(x) = \int_{a_0}^{x} f(y)\, dy$, where $a_0 \in [a,b]$, is called a primitive of $f(x)$.
(The existence of a primitive can be proved since $f(x)$ is continuous.)
Then the Fundamental theorem of calculus states that \begin{align*} F(x) &\in C^1(a,b),\\ \frac{dF(x)}{dx} &= f(x). \end{align*}
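A numerical sketch of this first flavor: approximate the primitive $F$ by a quadrature rule and check that its difference quotient recovers $f$. The choice $f(x) = e^{-x^2}$ and the base point $a_0 = 0$ are arbitrary illustrations, not taken from the text above:

```python
# First flavor of the FTC, numerically: build F(x) = ∫_{a0}^{x} f(y) dy by the
# midpoint rule, then check that a central difference quotient of F recovers f(x).

import math

def f(x):
    return math.exp(-x * x)        # an arbitrary continuous example

def F(x, a0=0.0, n=10_000):
    """Midpoint-rule approximation of the primitive F(x) = ∫_{a0}^{x} f(y) dy."""
    dy = (x - a0) / n
    return sum(f(a0 + (k + 0.5) * dy) * dy for k in range(n))

x, h = 1.0, 1e-4
derivative_of_F = (F(x + h) - F(x - h)) / (2 * h)   # central difference quotient
print(derivative_of_F, f(x))                         # the two agree closely
```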
The second flavor (Riemann Setting)
Let $f(x)$ be a continuous function on an interval $[a,b]$, let its derivative $f'(x)$ exist on $(a,b)$, and let $f'(x)$ be integrable.
Then the Fundamental theorem of calculus states that \begin{align*} \int_{a_0}^{b_0} \frac{df(y)}{dy}\, dy = f(b_0) - f(a_0), \end{align*} where $a_0,b_0 \in [a,b]$.
The second flavor is the one you use in your argument above.
You might be able to relax some conditions that I have stated to obtain the same conclusion.
Again, this is a place where good notation helps: $\displaystyle \frac{dy}{dt}\, dt$ can be "treated" like an ordinary fraction and the $dt$'s can be "cancelled" out, though the actual reasoning behind the cancellation is different.
EDIT
I always believe counterexamples are a great way to study a theorem or a property. Below are examples where you cannot "cancel out the $dt$'s".
In the case of Riemann integration, if you take $y(t)$ to be the Volterra function (thanks to Theo Buehler for pointing that one out), then $y(t)$ is differentiable, i.e. $\frac{dy}{dt}$ exists for all $t$. However, $\frac{dy}{dt}$ is not Riemann integrable, i.e.
$\displaystyle \int_{a}^{b} y'(t)\, dt$ doesn't exist when the integral is interpreted in the Riemann sense.
It is, however, Lebesgue integrable.
In the case of Lebesgue integration, there exist continuous functions $f(x)$ which are differentiable almost everywhere, i.e. $f'(x)$ exists almost everywhere, and whose derivative $f'(x)$ is Lebesgue integrable, but for which the Lebesgue integral of $f'(x)$ is not equal to the change in $f(x)$, i.e.
$\displaystyle \int_{a}^{b} f'(x)\, dx \neq f(b) - f(a)$, where the integral is interpreted in the Lebesgue sense.
The famous example here is the Cantor function $C(x)$. It is continuous everywhere and has zero derivative almost everywhere. Hence, the Lebesgue integral $\displaystyle \int_{0}^{1} C'(x)\, dx$ over the interval $[0,1]$ is $0$. However, $C(1)-C(0) = 1$.
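The Cantor function is easy to compute: write $x$ in base $3$, truncate after the first digit $1$, replace every remaining digit $2$ with $1$, and read the result in base $2$. A small sketch of this construction, showing that $C(1)-C(0)=1$ even though $C$ is flat (zero derivative) on every removed middle-third interval:

```python
# Cantor ("devil's staircase") function via ternary digits: truncate after the
# first ternary digit 1, map digits 2 -> 1, read the result in binary.

def cantor(x, depth=50):
    """Cantor function C(x) on [0, 1], accurate to about 2**-depth."""
    if x >= 1.0:
        return 1.0
    result, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)
        x -= digit
        if digit == 1:
            return result + scale   # landed in a removed middle third: C is constant there
        result += scale * (digit // 2)
        scale /= 2.0
    return result

print(cantor(1.0) - cantor(0.0))    # 1.0: the total change is not 0
print(cantor(0.4), cantor(0.6))     # both 0.5: C is flat on (1/3, 2/3)
```

So the "cancellation" $\int_0^1 C'(x)\,dx = C(1)-C(0)$ visibly fails: the left side is $0$ while the right side is $1$.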
Solution 3:
It makes more sense to look at it in terms of partials:
$$\int_{a}^{b}\dfrac{\partial y}{\partial x}\,{\rm d}x={\Delta y}_{{x\rightarrow a\ldots b}}$$
with ${\Delta y}$ indicating the change in $y$ as $x$ varies from $a$ to $b$. If more variables are involved (say $y(x,z,w)$), then $z$ and $w$ are assumed to be held constant during the variation in $x$.
It is kind of the inverse of the chain rule.
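A numerical sketch of this multivariable version, holding the other variables fixed while $x$ varies. The function $y(x,z,w) = x^2 z + w x$ and all parameter values below are arbitrary illustrations:

```python
# Integrating the partial derivative ∂y/∂x in x, with z and w held constant,
# recovers y(b, z, w) - y(a, z, w).

def y(x, z, w):
    return x * x * z + w * x

def dy_dx(x, z, w):
    return 2 * x * z + w           # partial derivative of y with respect to x

def integrate(g, a, b, n=100_000):
    """Midpoint-rule approximation of ∫_a^b g(x) dx."""
    dx = (b - a) / n
    return sum(g(a + (k + 0.5) * dx) * dx for k in range(n))

a, b, z, w = 0.0, 2.0, 3.0, 1.5    # z and w are frozen during the integration
lhs = integrate(lambda x: dy_dx(x, z, w), a, b)
rhs = y(b, z, w) - y(a, z, w)
print(lhs, rhs)                    # both 15.0, up to rounding
```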