If $L$ is regular, prove that $\sqrt{L}=\left\{ w : ww\in L\right\}$ is regular

Let $\mathcal{A} = (Q, \Sigma, \delta, q_0, F)$ be a DFA that recognizes $L$. Define $\mathcal{A'} = (Q', \Sigma, \delta', q_0', F')$ as follows:

  • $Q'$: The set of all functions $f : Q \to Q$.
  • $\delta'(f, a) = g$ where $g(q) = f(\delta(q, a))$.
  • $q_0'$ is the identity function.
  • $F'$ is the set of all functions $f$ so that $f^2(q_0) = f(f(q_0)) \in F$ .

Let $w \in \sqrt{L}$. This means that $w^2 = ww \in L$. We want to show that $\delta'(q_0', w) \in L'$.

Define $h(q) = \delta(q, w)$. Since $w^2 \in L$, it follows that $h^2(q_0) \in F$. Hence $h \in F'$.

What's left is to show that $\delta'(q_0', w) = h$, which can be done via induction on the length of $w$.

Finally, we should show that $F'$ only accepts elements from $\sqrt{L}$ (and not more). I'll let you work this one out. It follows from the definition of $\mathcal{A'}$.


The shortest way to prove this result is to use the fact that a language is regular iff it is recognized by a finite monoid. A language $L$ of $A^*$ is recognized by a finite monoid $M$ if there is a surjective monoid morphism $f:A^* \to M$ and a subset $P$ of $M$ such that $f^{−1}(P) = L$.

Now let $Q = \{ x \in M \mid x^2 \in P\}$. Then \begin{align} f^{−1}(Q) &= \{ u \in A^* \mid f(u) \in Q \} = \{ u \in A^* \mid f(u)^2 \in P \}\\ &= \{ u \in A^* \mid f(u^2) \in P \} = \{ u \in A^* \mid u^2 \in L \}\\ &= \sqrt{L} \end{align} Thus $\sqrt{L}$ is regular. This method actually shows that if $M$ recognizes $L$, then it also recognizes $\sqrt{L}$. This is useful to prove further properties: for instance if $L$ is star-free, so is $\sqrt{L}$.

References

[1] J.-É. Pin and J. Sakarovitch, Some operations and transductions that preserve rationality, in 6th GI Conference, Berlin, (1983), 277-288, LNCS 145, Springer.

[2] J.-É. Pin and J. Sakarovitch, Une application de la représentation matricielle des transductions, Theoret. Comp. Sci. 35 (1985), 271-293. (in French)


It is often simpler to construct a nondeterministic automaton than a deterministic one -- but having a deterministic automaton gives us more to work with. If we already know that the existence of an NFA or DFA are both equivalent to being regular, we can get the best of both worlds by assuming that we have a DFA $A_L$ for $L$, but just wanting to construct an NFA for $\sqrt L$.

The first thing our new automaton will do when it starts to read a string $w$ is to guess which state $A_L$ will be in after having read $w$ once. We can do this by having multiple parallel $\epsilon$ transitions out of an initial state, each leading to its own independent sub-automaton that recognizes "strings $w$ such that $w$ can move $A_L$ from the initial state to the guessed intermediate state, and $w$ can also move $A_L$ from the intermediate state to an accepting state.

The language each of these sub-automata must recognize is the intersection of two regular languages, so it has an automation we can use.

More concretely, the sub-automaton can have a state for each pair of states $(p,q)$ from $A_L$, simultaneously simulating two copies of $A_L$ recognizing the input string from different starting states. There are transitions $(p_1,q_1) \overset{a}\to (p_2,q_2)$ whenever $A_L$ has transtions $p_1\overset a\to p_2$ and $q_1\overset a\to q_2$, and $(p,q)$ accepts if $p$ is the guessed intermediate state and $q$ is accepting in $A_L$.