Basic Confusion on Push-Forward of a Measure
Solution 1:
First of all it should be pointed out that push-forward is an operation applied to measures, rather than densities. If $\rho $ is a density (measurable, nonnegative function) on $\mathbb{R}^n$, then the associated measure $\mu _\rho $ is defined for every measurable set $E\subseteq \mathbb{R}^n$ by $$ \mu _\rho (E) = \int_E\rho (x) dx. $$
Calling the push-forward measure of $\mu _\rho $ under $f$ by the name $\nu $, we have by definition $$ \nu (E) = \mu _\rho (f^{-1}(E)) = \int_{f^{-1}(E)}\rho (x) dx = \int_E\rho (f^{-1}(y))|J(f^{-1})(y)| dy, $$ where the last equality is the change of variables formula (valid when $f$ is a diffeomorphism) and $J(f^{-1})$ denotes the Jacobian determinant of $f^{-1}$. In other words, $\nu $ is the measure given by the density function $$ \tau (y)=\rho (f^{-1}(y))|J(f^{-1})(y)|, $$ so it is absolutely continuous with respect to Lebesgue measure.
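As a sanity check on the formula for $\tau $, here is a minimal numerical sketch in one dimension. The specific choices are assumptions made for illustration only: $\rho $ is the standard normal density, $f=\sinh $ (a diffeomorphism of $\mathbb{R}$ with inverse $\operatorname{asinh}$), and the integrals are approximated by a plain midpoint rule. The function names are hypothetical.

```python
import math


def rho(x):
    """Assumed example density: standard normal on R."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def f_inv(y):
    """Inverse of the assumed map f(x) = sinh(x)."""
    return math.asinh(y)


def f_inv_prime(y):
    """Derivative of asinh, i.e. the 1-D Jacobian of f^{-1}."""
    return 1.0 / math.sqrt(1.0 + y * y)


def tau(y):
    """Push-forward density: rho(f^{-1}(y)) * |J(f^{-1})(y)|."""
    return rho(f_inv(y)) * abs(f_inv_prime(y))


def midpoint_integral(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def pushforward_of_interval(a, b):
    """nu([a, b]) = mu_rho(f^{-1}([a, b])), integrated on the x side.

    Since f is increasing, f^{-1}([a, b]) = [f_inv(a), f_inv(b)].
    """
    return midpoint_integral(rho, f_inv(a), f_inv(b))


if __name__ == "__main__":
    a, b = -1.0, 2.0
    print(pushforward_of_interval(a, b), midpoint_integral(tau, a, b))
```

The two printed numbers should agree up to quadrature error: the first integrates $\rho $ over the preimage $f^{-1}([a,b])$, the second integrates $\tau $ over $[a,b]$ itself.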
Needless to say, all of this requires $f$ to be smooth with a smooth inverse (a diffeomorphism)! Otherwise strange things can happen: there are homeomorphisms of $\mathbb{R}$ which send a set of positive Lebesgue measure onto the Cantor set. The push-forward of Lebesgue measure (density 1) through such a function assigns positive measure to the Cantor set. Since the Cantor set has Lebesgue measure zero, this push-forward is not absolutely continuous.
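To make this pathology concrete, here is a sketch based on one standard construction. This is an assumption: the answer does not say which homeomorphism it has in mind. Let $c$ be the Cantor function and $g(x)=(x+c(x))/2$ on $[0,1]$. Then $g$ is a homeomorphism of $[0,1]$ (extend it by the identity to get one of $\mathbb{R}$), and $f=g^{-1}$ sends $g(C)$ onto the Cantor set $C$. The code computes, in exact rational arithmetic, how much length $g$ leaves for the image of $C$ after the first few generations of removed middle thirds. The helper names are hypothetical.

```python
from fractions import Fraction


def cantor(x, depth=60):
    """Cantor function c(x) for a Fraction x in [0, 1].

    Reads ternary digits of x; exact for endpoints of removed intervals,
    truncated after `depth` digits otherwise.
    """
    if x == 1:
        return Fraction(1)
    result = Fraction(0)
    scale = Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)  # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:
            # x lies in (the closure of) a removed middle third: c is constant there
            return result + scale
        result += scale * (digit // 2)  # ternary digit 2 becomes binary digit 1
        scale /= 2
    return result


def g(x):
    """Assumed homeomorphism of [0, 1]: g(x) = (x + c(x)) / 2."""
    return (x + cantor(x)) / 2


def gaps(levels):
    """Open middle thirds removed in the first `levels` construction steps."""
    kept = [(Fraction(0), Fraction(1))]
    removed = []
    for _ in range(levels):
        next_kept = []
        for a, b in kept:
            third = (b - a) / 3
            removed.append((a + third, b - third))
            next_kept += [(a, a + third), (b - third, b)]
        kept = next_kept
    return removed


def image_measure_bound(levels):
    """Upper bound for the Lebesgue measure of g(C).

    Subtracts from 1 the total length of the images of the removed gaps.
    """
    return 1 - sum(g(b) - g(a) for a, b in gaps(levels))


if __name__ == "__main__":
    for n in (1, 4, 8):
        print(n, float(image_measure_bound(n)))
```

Each removed gap of length $\ell $ is mapped by $g$ to an interval of length $\ell /2$, because $c$ is constant on the gap. The gaps have total length 1, so their images have total length $1/2$ and $g(C)$ has measure $1/2$. Consequently the push-forward of Lebesgue measure on $[0,1]$ under $f=g^{-1}$ gives the Cantor set mass $1/2$.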
Solution 2:
When you take a probability measure with a density w.r.t. Lebesgue measure and push it forward, you get a new probability measure, but this push-forward measure need not have a density. For the first claim, look at the theorem stated on the Wikipedia page you cite, in the special case where the function $g$ is the constant $1$. For the second claim, let $H:[0,1]\to[0,1]$ be any continuous, strictly increasing function with $H(0)=0$ and $H(1)=1$ that is singular, meaning $H'(x)=0$ for almost every $x$, and let $f$ be its inverse function. (A strictly increasing function cannot be nowhere differentiable, since monotone functions are differentiable almost everywhere; singularity is the property that matters here.) If $X$ has uniform distribution on $[0,1]$ then $P(f(X)\le x)=P(X\le H(x))=H(x)$, so $f(X)$ has $H$ as its cumulative distribution function. It has no density function: a density $h$ would satisfy $h=H'=0$ almost everywhere, contradicting $\int_0^1 h(x) dx=1$.
Here is one class of examples of such $H$; they are discussed in Billingsley's Ergodic Theory book. Let $B_i$ be i.i.d. random bits, with $P(B_i=0)=1-p, P(B_i=1)=p$, where $0<p<1$ and $p\ne 1/2$ (for $p=1/2$ the sum below is uniform on $[0,1]$ and $H(x)=x$). Then $H(x)=P(\sum_{n>0} B_n 2^{-n}\le x)$ has the required properties. (It is easy to plot the graph of $H$: you know $H(1/2)=1-p$, because up to an event of probability zero the sum is at most $1/2$ exactly when $B_1=0$, and you can fractally interpolate. Billingsley gives such a plot on p.37.)
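The fractal interpolation can be sketched as follows. Conditioning on the first bit gives the self-similarity relations $H(x)=(1-p)H(2x)$ for $x\le 1/2$ and $H(x)=(1-p)+pH(2x-1)$ for $x\ge 1/2$, which determine $H$ at every dyadic rational. This is a minimal sketch, not Billingsley's own construction; the function names and the choice to evaluate only at points $k/2^m$ are assumptions.

```python
def bernoulli_cdf(k, m, p):
    """H(k / 2**m), where H is the CDF of sum_n B_n 2**-n with P(B_n = 1) = p.

    Walks through the binary digits of x = k / 2**m. At each digit equal to 1,
    the sum falls below x if its earlier bits match those of x and its current
    bit is 0.
    """
    if not 0 <= k <= 2 ** m:
        raise ValueError("k / 2**m must lie in [0, 1]")
    if k == 2 ** m:
        return 1.0
    total = 0.0
    weight = 1.0  # probability that the bits seen so far match those of x
    for i in range(m - 1, -1, -1):
        if (k >> i) & 1:
            total += weight * (1.0 - p)
            weight *= p
        else:
            weight *= 1.0 - p
    return total


def graph_points(m, p):
    """Points (x, H(x)) on the grid x = k / 2**m, suitable for plotting."""
    return [(k / 2 ** m, bernoulli_cdf(k, m, p)) for k in range(2 ** m + 1)]


if __name__ == "__main__":
    for x, y in graph_points(3, 0.3):
        print(x, y)
```

Feeding `graph_points(m, p)` with a moderately large `m` into any plotting tool reproduces the shape of the graph. With $p=1/2$ the points lie on the diagonal, which is why that value is excluded.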