Best practice for the notation of conditions in cases: 'if' vs 'for' vs ','
Solution 1:
I believe there is no formal difference between the three notations, so picking one over the others is probably a matter of personal taste. Even so, the version
$$\delta_{ij} = \begin{cases} 0 &\text{for } i \neq j, \\ 1 &\text{for } i=j. \end{cases}$$
seems the least natural one.
For the first option ('if'), instead of
$$\delta_{ij} = \begin{cases} 0 &\text{if } i \neq j \\ 1 &\text{if } i=j \end{cases}$$
I would argue that it is nicer to see something along the lines of
$$\delta_{ij} = \begin{cases} 0 &\text{if } i \neq j \\ 1 &\text{otherwise} \end{cases}$$
even though it is quite clear that the "otherwise" happens precisely when $i = j$. The point is, when the last branch applies to everything else that hasn't been explicitly stated, the "otherwise" fits nicely.
Even better, as @Arthur pointed out in the comments, is to use the "otherwise" for the more general case. A very clean way to write it is therefore
$$\delta_{ij} = \begin{cases} 1 &\text{if } i = j \\ 0 &\text{otherwise} \end{cases}$$
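For anyone typesetting this form, a minimal LaTeX sketch follows. It assumes the `amsmath` package, which provides both the `cases` environment and `\text`; the document class and the layout of the source are illustrative choices only.

```latex
\documentclass{article}
\usepackage{amsmath} % provides the cases environment and \text

\begin{document}

% Kronecker delta: the specific case first, the general case as "otherwise"
\[
  \delta_{ij} =
  \begin{cases}
    1 & \text{if } i = j \\
    0 & \text{otherwise}
  \end{cases}
\]

\end{document}
```

Putting the trailing space inside `\text{if }` keeps the word from running into the condition that follows it.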
Solution 2:
I agree that there is no formal difference and that any of them is acceptable provided you are consistent. I also think that punctuation creates unnecessary clutter.
However, it is important to keep your audience and your objective in mind.
Sometimes a blend of spoken language and mathematical notation is acceptable, or even preferable, as a more natural and less formal way to communicate.
In other instances, especially when the author has already tried to explain a convoluted concept in words and has possibly failed, purely symbolic notation supplements the prose and offers an independent route to the same idea. It is also important to expose the reader to proper, commonplace notation.
As someone who writes in multiple languages, I favor purely symbolic notation. However, I also understand that when my papers are submitted, they are randomly distributed to markers from six different continents. I therefore always explain my notation in words, in such a way that the words and the symbols clarify each other without cluttering one another; in essence, I get the “best of both worlds.”
I’d also like to draw your attention to another concept: piecewise functions. When dealing with these, I favor this format:
$$y = \begin{cases} ax^2+bx & x\in(-\infty,0] \\ mx+c & x\in(0,10) \end{cases}$$
Notice the benefit of specifying the domain compactly and of indicating the independent variable.
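For comparison, here is a sketch of the same function with the conditions spelled out as inequalities and an explicit "if", using the same symbols $a$, $b$, $m$, $c$ and the same two branches as above:

$$y = \begin{cases} ax^2+bx & \text{if } x \le 0 \\ mx+c & \text{if } 0 < x < 10 \end{cases}$$

Both versions define the same function. The interval version states the domain of each branch as a set, while the inequality version reads closer to a spoken sentence.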
(Side note: I wrote my extended essay for my IB Diploma on how the language of set theory has shaped probability theory in its jargon, concepts, definitions, notation and other areas. It is fascinating to blend different approaches to knowledge this way!)