Computing the expected size of the largest connected component in a "hitomezashi graph" (described in the question body)

A while ago there was a Numberphile video about a certain graph you can build based on Hitomezashi Sashiko, a kind of decorative mending.

Intuitively, we place walls (originally stitches) in alternating positions along each column/row, and these walls separate the integer lattice into components. $s_1$ governs the positions of the vertical walls and $s_2$ governs the positions of the horizontal walls. A $0$ tells us to start "at the edge" and a $1$ tells us to start "off the edge". For example, choosing $s_1 = 01101$ and $s_2 = 10110$ gives the following stitch pattern.

the stitch pattern

If we think of each unit square as a vertex, with adjacent cells connected whenever there isn't a wall separating them, we get a graph, $H(s_1,s_2)$:

the graph overlaid on the stitching

If it's still unclear how the walls work, the first few minutes of the linked Numberphile video make it clear. For those looking for a precise description: formally, given two binary strings $s_1$ and $s_2$ (say of lengths $n_1$ and $n_2$), we build the graph $H(s_1, s_2)$ as follows:

  • $[n_1] \times [n_2]$ is the vertex set ($0$ indexed)
  • $(x,y) \sim (x+1,y)$ whenever $y \not \equiv s_1[x+1] \pmod{2}$
  • $(x,y) \sim (x,y+1)$ whenever $x \not \equiv s_2[y+1] \pmod{2}$

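The rules above translate directly into code. Here is a short Python sketch (my own reading of the definition, not the demo's implementation) that builds the adjacency implicitly and finds the component sizes by BFS; the function name `hitomezashi_components` is just an illustrative choice:

```python
from collections import deque

def hitomezashi_components(s1, s2):
    """Component sizes of H(s1, s2), per the two edge rules above."""
    n1, n2 = len(s1), len(s2)

    def neighbors(x, y):
        # (x,y) ~ (x+1,y) whenever y is NOT congruent to s1[x+1] mod 2
        if x + 1 < n1 and y % 2 != int(s1[x + 1]):
            yield (x + 1, y)
        if x - 1 >= 0 and y % 2 != int(s1[x]):      # same rule, seen from the right
            yield (x - 1, y)
        # (x,y) ~ (x,y+1) whenever x is NOT congruent to s2[y+1] mod 2
        if y + 1 < n2 and x % 2 != int(s2[y + 1]):
            yield (x, y + 1)
        if y - 1 >= 0 and x % 2 != int(s2[y]):      # same rule, seen from above
            yield (x, y - 1)

    seen, sizes = set(), []
    for start in ((x, y) for x in range(n1) for y in range(n2)):
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            v = queue.popleft()
            size += 1
            for w in neighbors(*v):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        sizes.append(size)
    return sorted(sizes, reverse=True)

sizes = hitomezashi_components("01101", "10110")
print(sizes, sum(sizes))  # the sizes partition the 5 x 5 = 25 cells
```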
I also have a demo on my blog where you can input binary strings and it will output a picture of the graph.


Now, if we do this with longer binary strings, we get some quite intricate pictures:

a larger example

and there are some natural questions to ask. I've put a fair amount of my thoughts about these problems in a different blog post, but here is the one I'm primarily interested in:

Say we (uniformly) randomly choose two binary strings of a fixed length $n$. What is the expected size of the largest connected component of $H(s_1, s_2)$ as a function of $n$?

I'm not much of a probabilistic combinatorialist, so I pretty quickly exhausted my personal bag of tricks for attacking these kinds of problems. But this feels like something that somebody knows how to answer (and ideally, something somebody would enjoy answering ^_^).

I recognize this is probably hard to answer, so I'm open to partial progress. In particular, I wrote some sage code to get some data, and here's a graph of the average maximum region size (across a few hundred samples) as a function of $n$:
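For anyone who wants to reproduce this kind of data without sage, here is a self-contained Python sketch of the experiment (union-find over the grid cells; the function names and the sample count are illustrative, not the code that produced the plot below):

```python
import random
from collections import Counter

def max_component(s1, s2):
    """Size of the largest component of H(s1, s2), via union-find on the cells."""
    n1, n2 = len(s1), len(s2)
    parent = list(range(n1 * n2))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for x in range(n1):
        for y in range(n2):
            v = x * n2 + y
            # (x,y) ~ (x+1,y): no vertical wall between the two cells
            if x + 1 < n1 and y % 2 != int(s1[x + 1]):
                parent[find(v)] = find(v + n2)
            # (x,y) ~ (x,y+1): no horizontal wall between the two cells
            if y + 1 < n2 and x % 2 != int(s2[y + 1]):
                parent[find(v)] = find(v + 1)
    return max(Counter(find(v) for v in range(n1 * n2)).values())

def average_max(n, samples=200, seed=0):
    """Average largest-component size over uniformly random length-n string pairs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(samples):
        s1 = "".join(rng.choice("01") for _ in range(n))
        s2 = "".join(rng.choice("01") for _ in range(n))
        total += max_component(s1, s2)
    return total / samples

print(average_max(40, samples=100))
```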

the data plus a curve of best fit

The blue curve is the power law of best fit, which turns out to be $\approx 1.95\, n^{1.38}$
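For concreteness, fitting $C n^{\alpha}$ is just a linear fit in log-log coordinates. A quick sketch of how one recovers the exponent (the data points below are placeholders lying exactly on the reported fit, not my actual simulation output):

```python
import numpy as np

# Placeholder (n, average max component) pairs standing in for simulation data
ns = np.array([50, 100, 200, 400, 800])
avg_max = 1.95 * ns ** 1.38  # synthetic: exactly on the reported curve

# log(avg) = log(C) + alpha * log(n), so a degree-1 polyfit gives the exponent
alpha, logC = np.polyfit(np.log(ns), np.log(avg_max), 1)
print(round(alpha, 2), round(np.exp(logC), 2))  # prints 1.38 2.0 (i.e. 1.95 rounded)
```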

This brings us to some (hopefully easier) problems:

Can we show that the expected size of the largest component is $o(n^2)$? What about $o(n^{1.5})$? Is it possible to pin down the exponent exactly? Can we get lower bounds too?

You can find more of my thoughts, as well as some code for simulating these things yourself if you're interested, in my blog post here. I'm open to hearing any thoughts that people have about this, because I really have no idea how to proceed.


Edit (Jan 5):

Based on the new data from Daniel Mathias, it seems a good conjecture is $\frac{8}{3} n^{4/3}$. I've added a $50$ rep bounty as thanks, and afterwards I'll add a $100$ rep bounty for a proof of this conjecture (or some substantial progress). I need to wait $24$ hours before I can post that second bounty, though.


Thanks in advance! ^_^


Solution 1:

As promised, I have generated a more substantial amount of data to provide a more accurate plot.

the new plot

The blue line is the average maximum region size across 10,000 samples, for $n = 10k$ with $5 \le k \le 100$.
The red dotted line is $1.95n^{1.38}$ and the green dotted line (yes, it's there) is $\frac83 n^{4/3}$.

C source code: pastebin
As-is, the code runs only 1000 samples at each size. Run-time $\approx4\frac12$ minutes.


Edit:

More data from an extended run. The averages gradually fall short of the estimate. A new estimate of $2.9n^{1.32}$ is shown in dotted yellow over the blue data line.

the extended-run plot