Color an $n\times n$ square with $n$ colors
Solution 1:
The exact number of latin squares of order n is known only up to n = 11, and these counts are reflected in the OEIS and Wikipedia entries already linked in the Comments.
For counting "diagonal" latin squares I could not find any ready online reference, so I wrote a quick Prolog program that provides solutions for the smaller orders.
Note that for the sake of counting we may assume a particular first row of the latin square, so that if the first row is $(1,2,...,n)$ it is in partially reduced form (reduced form having also the first column in standard order).
For n = 1 there is the one (trivial) case.
For n = 2,3 one quickly finds there are no diagonal latin squares.
For n = 4 there are two cases in partially reduced form. Therefore the total count is 2*4! = 48.
For n = 5 there are eight cases in partially reduced form, for a total count of 8*5! = 960.
My code is not optimized for speed but is clear, so the next step is to refactor and impose constraints in a more efficient manner to make some higher orders feasible.
Added:
For n = 6 there are 128 cases in partially reduced form, for a total count of 128*6! = 92160. This result was confirmed by a computation using a different normalization, see discussion below.
For n = 7 my preliminary result is a total count of 171200*7! = 862,848,000 solutions.
Using the partially reduced form (first row in standard order) the n = 6 case was enumerated in 6.3 seconds. Analyzing the time consumed in building stages of the search space suggested that normalizing the main diagonal (in standard order) would be better. I posted a Question here, Counting some special derangements, motivated by how this constrains the choice of anti-diagonals. Another feature of this approach is cutting the enumeration in half via the action of matrix transpose.
These efficiencies reduced the computation time for n = 6 to 1.5 seconds. Despite their use in the n = 7 case, the increased size of the search space led to a running time of 1776 seconds, nearly half an hour.
Such a three-order of magnitude increase in running time suggests that the n = 8 case will not be easily solved on a single CPU, even if the code were compiled instead of interpreted. [Update: For the n = 8 case there are 7450228224*8! = 300,393,201,991,680 diagonal latin squares. This was computed on a single CPU through a combination of speeding up the code that searches on a single main diagonal/anti-diagonal assignment and reducing the number of such assignments by equivalences. The code now runs such a single assignment in 1-2 hours. Given normalization of the main diagonal to the identity sequence $1,2,..,8$, there are 4,752 possibilities (µ-derangements) for the anti-diagonal. A first reduction to 630 equivalences classes is made by using orbits under a dihedral group action. A further reduction to 114 enlarged equivalence classes results from taking "orbits of orbits" under a group isomorphic to $(\mathbb{Z}/2\mathbb{Z})^4$ acting by conjugation. Finally it appears we actually need only count 18 distinct cases due to equivalences of a more contingent nature than those group symmetries. Similar results should apply to solving n = 9, though searching each given µ-derangement will take more time.]
While seeking citations for counts of diagonal latin squares I came across this interesting 2009 paper by Zhang and Ma, Counting solutions for the N-queens and Latin-square problems by Monte Carlo simulations, which is freely available through NIH's Pub Med Central. Perhaps their technique could be adapted for estimating counts of diagonal latin squares of "large" orders.
Solution 2:
In section 7 of this survey paper, Douglas Stones writes:
Comtet said that even estimating $L_n$ when $n → ∞$ “seems to be an extremely difficult combinatorial problem.” However, van Lint and Wilson showed that $$\frac{1}{n} L_n^{1/n^2} → \exp(−2).$$ ... This is not a particularly satisfying result ...
... there is still a long way to go to achieve a ‘clean’ asymptotic solution for the number of Latin squares.