Catalan numbers: bijection between applications of a binary operator and Dyck words.

The Wikipedia article on Catalan numbers lists various combinatorial objects that are described by them. I posit that there might be bijections between these various combinatorial objects. For some of them (like Dyck paths, correctly matched parentheses and paths from the bottom left to top right of a $2n \times 2n$ grid) they are quite obvious.

I was then trying to find a bijection between the number of Dyck words and the number of ways of associating $n$ applications of a binary operator to $n+1$ items (third one on the list). I attempted to do this for a simple case ($n=3$ which is the example provided in the Wikipedia article). However, couldn't find one after multiple hours. Is it reasonable to expect such a bijection will exist? If so, how do we go about finding it?


EDIT: In addition to the very nice answer by @Marc, the following page also helped me see the bijection: http://math.sfsu.edu/federico/Clase/EC/Homework/3.3.Jorge.pdf

"Let $P$ be the Dyck path and $f(P)$ be the binary tree. If you go up in the Dyck path, create a left child. Otherwise, go up one vertex until creating a new right child is possible and create one."


Here is one of my attempts:

Number of Dyck words with length $2 \times 3$ is $\frac{6 \choose 3}{4} = 5$. They are:

hhhttt; hhthtt; hhttht; hthhtt; hththt

And the number of applications of a binary operator among $3+1=4$ factors is:

((ab)c)d; (a(bc))d; (ab)(cd); a((bc)d); a(b(cd))

Both combinatorial objects have been arranged in a way that there is some kind of ordering between them. For the Dyck words for example, if h equals +1 and t equals -1, then the order is lexicographical from the left to the right of the cumulative score along the sequence.

Now, the first and last characters of the Dyck words are always h and t respectively. So, we can ignore them. We are left with:

hhtt; htht; htth; thht; thth

I tried to start from the left of the sequence abcd and if I see 'h', merge the character with the one to its right. This approach didn't produce a valid mapping from the third Dyck word to the third binary operator precedence order.


Solution 1:

Yes this one is well known. Often for finding a bijection it suffices to compare the reasons why the two interpretations satisfy the fundamental recurrence $C_{n+1}=\sum_{i=0}^nC_i\,C_{n-i}$ for Catalan numbers. For $n+1$-leaf binary trees, which is what combining $n+1$ atoms using $n$ applications of a binary operator amounts to, this is quite clear: for $C_{n+1}$ one is looking at trees with $n+1$ internal nodes one of which is the root; if it has a left subtree with $i$ internal nodes its right subtree has $n-i$ internal nodes, and the value of $i$ together with the choice of such subtrees determines the tree. For $2n$-step Dyck paths it is slightly less natural since an asymmetric choice is necessary to decompose a path in sub-paths. Still there is a fairly obvious way to do this: the first step of a $2(n+1)$ step path is always an up-step (opening parenthesis), and there is a unique matching down-step (closing parenthesis) where we first re-descend to level$~0$. Then there are $2i$ steps in between them that form a Dyck path, and $2(n-i)$ steps left after the matching down-step, for some $0\leq i\leq n$. Repeating this decomposition constructs a binary tree associated to a Dyck path. For instance with $n=3$ and the path corresponding to $(())()$ one finds $i=1$ (the first descent to level$~0$ is at step$~4$) and the unique binary tree with $4$ leaves and the root in the middle: $(a*b)*(c*d)$. Note that the left-right symmetry is not preserved. Here are all $5$ cases with $n=3$: $$ \matrix{(~)~(~)~(~) & (~)~(~(~)~)~&(~(~)~)~(~) & (~(~)~(~)~)&(~(~(~)~)~)\\ a*(b*(c*d))&a*((b*c)*d) &(a*b)*(c*d) & (a*(b*c))*d & ((a*b)*c)*d} $$

Here is a maybe somewhat easier to see way to state this correspondence. First write the binary tree expression using symbols '[', '$*$', ']' together with atoms, so that each '$*$' has a pair '[' and ']' that surrounds its own operands, so for instance $a*((b*c)*d)$ becomes $[~a*[~[~b*c~]*d~]~]$ (note the outer brackets). Then substitute '(' for '[', and ')' for '$*$', and drop both the atoms and the symbols ']', to get a balanced list of parentheses, in the example $(.)((.)..)...$ where the dots mark positions where symbols were dropped.