Finding the maximum likelihood estimator

$$f_X(x) = \frac12e^{-|x-\theta|}, -\infty<x<\infty$$

is a special case of the Laplace distribution given as follows:

$$f_X(x|\mu,\sigma)=\frac{1}{\sqrt{2}\sigma}e^{-\frac{\sqrt{2}|x-\mu|}{\sigma}},x\in\mathbb{R}$$

for $\sigma=\sqrt{2}$ and $\mu:=\theta$. To be more general, let's consider the Laplace distribution with parameters $(\mu,\sigma)$.

Consider the likelihood function for $N$ data samples:

$$L(\mu,\sigma;x)=\prod_{t=1}^N \frac{1}{\sqrt{2}\sigma}e^{-\frac{\sqrt{2}|x_t-\mu|}{\sigma}}=(\sqrt{2}\sigma)^{-N}e^{\frac{-\sqrt{2}}{\sigma}\sum_{t=1}^N |x_t-\mu|}$$

Take the log-likelihood function $l(\mu,\sigma;x)=\ln L(\mu,\sigma;x)$, which gives

$$l(\mu,\sigma;x)=-N\ln (\sqrt{2}\sigma)-\frac{\sqrt{2}}{\sigma}\sum_{t=1}^N |x_t-\mu|$$

Take the derivative with respect to the parameter $\mu$:

$$\frac{\partial l}{\partial \mu}=-\frac{\sqrt{2}}{\sigma}\sum_{t=1}^N \frac{\partial|x_t-\mu|}{\partial\mu}=\frac{\sqrt{2}}{\sigma}\sum_{t=1}^N\mbox{sgn}(x_t-\mu)$$

This uses the identity (valid for $x\neq 0$)

$$\frac{\partial |x|}{\partial x}=\frac{\partial \sqrt{x^2}}{\partial x}=x(x^2)^{-1/2}=\frac{x}{|x|}=\mbox{sgn}(x)$$

together with the chain rule, which contributes a factor of $-1$ because $x_t-\mu$ is decreasing in $\mu$. To maximize the likelihood function we need to solve

$$\frac{\sqrt{2}}{\sigma}\sum_{t=1}^N\mbox{sgn}(x_t-\mu)=0 \quad\quad (1)$$

for which there are two cases: $N$ is odd or $N$ is even.
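To make the sign-sum in ($1$) concrete, here is a minimal Python sketch that evaluates the derivative of the log-likelihood at a few values of $\mu$. The helper names `sign` and `score_mu` and the sample `xs` are made up for illustration; the default `sigma` corresponds to the special case $\sigma=\sqrt{2}$.

```python
import math

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def score_mu(mu, xs, sigma=math.sqrt(2)):
    """Derivative of the log-likelihood with respect to mu:
    (sqrt(2) / sigma) * sum of sgn(x_t - mu)."""
    return math.sqrt(2) / sigma * sum(sign(x - mu) for x in xs)

# Hypothetical sample with N = 5; its median is 1.0.
xs = [2.0, -1.0, 0.5, 3.5, 1.0]

# The derivative is positive to the left of the median, zero at it,
# and negative to the right of it.
for mu in (0.0, 1.0, 2.5):
    print(mu, score_mu(mu, xs))
```

The sign change around the sample median is what makes the log-likelihood increase up to the median and decrease afterwards.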

If $N$ is odd and we choose $\hat{\mu}=\mbox{median}(x_1,\ldots ,x_N)$, then (for distinct observations) there are $\frac{N-1}{2}$ cases where $x_t<\hat{\mu}$, another $\frac{N-1}{2}$ cases where $x_t>\hat{\mu}$, and one observation equal to $\hat{\mu}$, which contributes $\mbox{sgn}(0)=0$. The sum therefore vanishes, so $\hat{\mu}$ satisfies ($1$) and is the maximum likelihood estimator for the parameter $\mu$.

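If $N$ is even, we cannot simply choose one $x_t$ which will satisfy ($1$). However, we can still maximize the likelihood by ranking the observations as $x_1\leq x_2\leq \ldots\leq x_N$ and then choosing either $x_{N/2}$ or $x_{N/2+1}$. In fact, any value in the interval $[x_{N/2},x_{N/2+1}]$ works: the log-likelihood is piecewise linear in $\mu$ and constant on that interval, so the conventional median, the midpoint of the two middle observations, is also a maximizer.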
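The even case can be checked numerically. The following Python sketch, using a made-up sample of size $N=4$ and a hypothetical helper `log_likelihood`, compares the log-likelihood at points between the two middle order statistics with points just outside them.

```python
import math

def log_likelihood(mu, xs, sigma=math.sqrt(2)):
    """Laplace log-likelihood: -N ln(sqrt(2) sigma) - (sqrt(2)/sigma) sum |x_t - mu|."""
    n = len(xs)
    abs_dev = sum(abs(x - mu) for x in xs)
    return -n * math.log(math.sqrt(2) * sigma) - math.sqrt(2) / sigma * abs_dev

# Hypothetical sample with N = 4, sorted so the middle two values are easy to pick.
xs = sorted([4.0, 1.0, 3.0, 7.0])
lo, hi = xs[len(xs) // 2 - 1], xs[len(xs) // 2]  # the two middle order statistics

# Candidates between (and including) the two middle observations.
inside = [log_likelihood(m, xs) for m in (lo, 3.25, (lo + hi) / 2, hi)]
# Candidates just outside that interval.
outside = [log_likelihood(m, xs) for m in (lo - 0.5, hi + 0.5)]

print("inside :", inside)
print("outside:", outside)
```

For this sample the values inside the interval agree with each other, and both outside values are strictly smaller.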

In summary, $\hat{\mu}=\mbox{median}(x_1,\ldots ,x_N)$ is a maximum likelihood estimator for any $N$.


The estimator of $\theta$ is the median of $x_1,\ldots,x_n$. This is standard because maximizing the likelihood is equivalent to minimizing the sum of absolute deviations $\sum_{i=1}^n |x_i-\theta|$.
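As a rough sanity check of the absolute-deviation view, this Python sketch compares `statistics.median` with a brute-force grid search over candidate values of $\theta$. The sample, the grid, and the helper `sum_abs_dev` are assumptions made for illustration only.

```python
import statistics

def sum_abs_dev(theta, xs):
    """Sum of absolute deviations of the sample from theta."""
    return sum(abs(x - theta) for x in xs)

# Hypothetical sample with an odd number of observations.
xs = [2.0, -1.0, 0.5, 3.5, 1.0]

# Brute-force search over a grid of candidates from -2.00 to 5.00 in steps of 0.01.
grid = [i / 100 for i in range(-200, 501)]
best = min(grid, key=lambda t: sum_abs_dev(t, xs))

med = statistics.median(xs)
print("grid minimizer:", best)
print("sample median :", med)
```

A grid search only approximates the minimizer to within the grid spacing, so agreement with the median is expected up to that resolution.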