Normalisation using Softmax: what advantage does the exponential provide?
Let $D=\{x_i\}_{i=1}^n$ be a dataset.
Standard normalization generally works by subtracting out the mean: $$ \mu = \frac{1}{n}\sum_{i=1}^n x_i $$ and dividing out the standard deviation: $$\sigma = \sqrt{\frac{1}{n-1}\sum_{i=1}^n (x_i-\mu)^2} $$ so that the new dataset is given by: $$ \tilde{x}_i = \frac{x_i - \mu}{\sigma} $$ This is nice, because if you assume $x_i\sim\mathcal{N}(\mu,\sigma^2)$, then $\tilde{x}_i\sim\mathcal{N}(0,1)$.
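For concreteness, here is a minimal NumPy sketch of this standardization (the sample array `x` and the function name `standardize` are just illustrative, not from any particular library):

```python
import numpy as np

def standardize(x):
    """Z-score normalization: subtract the mean, divide by the
    sample standard deviation, matching the formulas above."""
    mu = x.mean()
    sigma = x.std(ddof=1)  # ddof=1 gives the 1/(n-1) estimator
    return (x - mu) / sigma

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
x_tilde = standardize(x)
print(x_tilde.mean(), x_tilde.std(ddof=1))  # ~0.0 and 1.0
```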
Another, unrelated normalization is min-max scaling: $$ y_i = \frac{x_i - \min_j x_j}{\max_j x_j - \min_j x_j} $$ which is nice because it forces every value into the interval $[0,1]$.
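A corresponding sketch of min-max scaling, under the same illustrative conventions as above:

```python
import numpy as np

def min_max(x):
    """Min-max normalization: map the smallest value to 0 and
    the largest to 1, as in the formula above."""
    return (x - x.min()) / (x.max() - x.min())

x = np.array([-3.0, 0.0, 1.0, 5.0])
print(min_max(x))  # [0.    0.375 0.5   1.   ]
```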
Then there is softmax: $$ s_i = \exp(x_i)\left[ \sum_{j=1}^n \exp(x_j) \right]^{-1} $$ Here is one nice property: $$ \sum_{i=1}^n s_i = 1 $$ which means the vector $(s_1,\dots,s_n)$ can be interpreted as a probability distribution. (This is how the softmax function appears in logistic regression.) Clearly we also always have $s_i > 0$; the fact that the output is positive regardless of the sign of $x_i$ can be useful. Finally, because exponentiation amplifies differences between the inputs, softmax can give the largest entries far greater separation from the rest than a linear transformation would.
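A short sketch that illustrates all three properties at once: positivity, summing to one, and the extra separation relative to a linear rescaling. Subtracting $\max_j x_j$ before exponentiating is a standard numerical-stability trick; it cancels in the ratio, so it does not change $s_i$.

```python
import numpy as np

def softmax(x):
    """Softmax: exponentiate, then divide by the sum of exponentials.
    Subtracting max(x) first avoids overflow for large inputs
    without changing the result (it cancels in the ratio)."""
    e = np.exp(x - x.max())
    return e / e.sum()

x = np.array([1.0, 2.0, 5.0])
s = softmax(x)
print(s)        # [0.0171 0.0466 0.9362] -- all entries positive
print(s.sum())  # 1.0, so s is a valid probability distribution

# Compare with a linear rescaling of the same (positive) inputs:
# softmax gives the largest entry a much bigger share, which is
# the "greater separation" mentioned above.
print(x / x.sum())  # [0.125 0.25  0.625]
```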