Glue types in cubical type theory
Recently, I've been reading about Homotopy Type Theory.
I haven't yet finished the book (in fact, I'm reading "Introduction to Homotopy Type Theory" by Egbert Rijke), but from the Univalence Axiom, I've got the impression that some terms may not be reducible to primitives (which turned out to be "lack of canonicity").
Then I've read about Cubical Type Theory. That was, however, quite confusing experience. Due to the fact that it's a novel "theory" (or rather model, maybe?), there doesn't seem to be an accessible, well-organized text like the HoTT book. The main obstacle for me is the distinction of Path, Id and Glue types. I can sort of wrap my head around Path and Id, but I can't come up with a way to think about Glue types.
Is it a type of equivalences between two types? In all the texts, they get mentioned quite early with reference to "composition in the universe" (whatever that means; composition of paths? functions?), but then they immediately appear as a function $\textrm{Glue} : \textrm{Type} \rightarrow \textrm{ListOfSomething}$.
Example from A hands-on introduction to cubicaltt:
ua (A B : U) (e : equiv A B) : Path U A B =
<i> Glue B [ (i = 0) -> (A,e)
, (i = 1) -> (B,idEquiv B) ]
My questions are:
- What is the intuition behind Glue types? Do they correspond to "completing a square" of Paths?
- What does the Glue constructor mean? Specifically, what are the "almost lambdas" in the list? What is the type of that list?
Thanks for answer!
Solution 1:
I'm not really an expert, but I can at least explain some of the purpose of $\mathsf{Glue}$ with respect to at least the example you're showing (or something similar). I suppose first I'll answer part of question 2. Those aren't lambda expressions, and it isn't a list. It is syntax for cases on the variable $i$. It is specifying a value of $Σ_{X:U}X \simeq B$ for when $i$ is either 0 or 1.
Now I think it might help for a digression into higher inductive types. If you have the definition (using cubical Agda syntax, since it's what I know) of the circle:
data S¹ : Type₀ where
base : S¹
loop : base ≡ base
the way this works cubically is that base : S¹
, and also for any dimension variable loop i : S¹
, but at the endpoints we get computation, or judgmental equalities (however you want to think about them):
loop i0 = base
loop i1 = base
So, lots of weird higher constructors work out in pretty straight forward ways by considering them as normal constructors that take dimension arguments and reduce at the endpoints.
Now, I think one way of thinking about univalence is that it is something like a higher inductive definition of the universe. Some of the parts might be:
data U where
Π : (a : U) (b : El a → U) → U
...
ua : ∀{a b} → El a ≃ El b → a ≡ b
So ua eqv i : U
, and it reduces to a
and b
at the endpoints, but in 'the middle' it is kind of its own type. There's more to univalence than that (and your quoted ua
), but hopefully that bit of it makes some sense.
Now, this is exactly how $\mathsf{Glue}$ behaves. The difference is that for this example it involves three types. For instance, cubical Agda accepts the following definition (I'm using some features that let me avoid writing all the implicit arguments):
Glued : A ≃ C → B ≃ C → A ≡ B
Glued {A = A} {C = C} {B = B} eq₁ eq₂ i
= Glue C λ{ (i = i0) → A , eq₁ ; (i = i1) → B , eq₂ }
So, as mentioned, in this case, Glue C
takes a specification by cases on i
of values of Σ[ X ∈ Type ℓ ] X ≃ C
, and it computes to the first projection of those at the specified endpoints. In 'the middle' it just doesn't reduce.
Glue
is more general than just this case, though. For instance, you can decide to remove one of the cases, and if you removed i1
, it simply wouldn't reduce (I think), even though that is a concrete value for the dimension. You can also do cases over multiple dimension variables if you want (though I haven't seen that used with Glue
).
I suppose another question is why we need three types and two equivalences. I don't know that I can give a completely satisfying explanation, but in both this and the previous sense, I think it mirrors the composition structure elsewhere in the current cubical theories. We have A
and B
because those correspond to the definition of a type by cases on i
, and C
is a type that is well defined for the 'whole range' of i
. The equivalences show how A
and B
agree with C
at the endpoints, and the Glue
lifts C
's complete structure up to a completed structure with endpoints A
and B
. So for instance, cubical Agda also accepts this:
iGlued : (C : I → Type ℓ) → A ≃ C i0 → B ≃ C i1 → A ≡ B
iGlued {A = A} {B = B} C eq₁ eq₂ i
= Glue (C i) λ{ (i = i0) → A , eq₁ ; (i = i1) → B , eq₂ }
So, I think they do indeed correspond to completing a square. Specifically, this paper has a square like this (top of page 14):
$$ \require{AMScd} \begin{CD} A @>>> B \\ @Veq(i0)VV @VVeq(i1)V \\ C(i0) @>>C> C(i1) \end{CD} $$
Now, the rest of this is kind of speculation on my part. Part of my previous explanation was subtly wrong with respect to e.g. Coquand's papers. I said that Glue C
takes a definition by cases of a Σ type involving the universe. This is the case in Agda, but there you cannot really talk about things being types separately from the (unique) universe they inhabit. In most of the cubical papers, though, the definition by cases for $\mathsf{Glue}$ is just a primitive construction, which has types together with equivalences in each case (similar, but not identical to the cases for composition).
Why is this done? Well, I think it was known quite a while ago that univalence is in some sense not just about a universe, but the types in that universe. A sub-universe of a univalent universe must be univalent (if I'm not mistaken) because equivalences between types in the sub-universe induce paths between them and we can thus transport between those types.
So, I think what cubical type theories are doing with $\mathsf{Glue}$ is making the entire universe of discourse univalent. Paths between types are no longer tied to any universe type, they are just judgments $Γ ⊢ A\ \mathsf{type}$ where $Γ$ contains dimension variables. $\mathsf{Glue}$ adds the ability for equivalences to induce these ambient path types, like a higher constructor of the $\mathsf{type}$ judgment, the logical theory itself. Like a higher constructor, it reduces at the specified endpoints (the cases you specify).
Then, since the universe type $U$ is necessarily a sub-universe of the implicit universe of discourse, it is also automatically univalent, via $\mathsf{Glue}$. And this presumably works even if you have more than one universe. Of course, $\mathsf{Glue}$ is integral to getting univalence for the universe, but I think the point is that in cubical type theory, equivalences would induce paths even if there were no universe (or, in a theory with a universe, equivalences induce paths even for types that aren't members of the universe, like the universe itself).
This might be bad if you want your theory to be able to talk about non-univalent universes as well, but I would guess you can do that by having a separate judgment for the non-univalent universe of discourse.
That got a bit longer than I expected, but hopefully it cleared at least some things up.
Edit: I thought of an idea for an explanation that might satisfy Mike's background slightly more.
Suppose we investigate my idea above about a higher inductive (a la Tarski) universe definition a bit more thoroughly. I previously showed that $\mathsf{ua}$ is like a path constructor for the universe codes which takes an equivalence. However, the other part of a universe is a decoding family $\mathsf{El}$.
So, what does $\mathsf{El}$ do with a path created by $\mathsf{ua}$? You might guess that it produces a value of a path type, but this actually doesn't make any sense, because there is no type of paths between types (there could be, a la observational type theory, but there isn't in the cubical type theory papers I've seen). As mentioned earlier, there are just $\mathsf{type}$ judgments in a context with dimensions. So, the judgments we can give look like:
$$ Γ ⊢ \mathsf{El} (\mathsf{ua}\ a\ b\ e\ i)\ \mathsf{type} \\ Γ ⊢ \mathsf{El}(\mathsf{ua}\ a\ b\ e\ \mathsf{i0}) = \mathsf{El}(a) \\ Γ ⊢ \mathsf{El}(\mathsf{ua}\ a\ b\ e\ \mathsf{i1}) = \mathsf{El}(b) $$
So, $\mathsf{El}$ decodes $\mathsf{ua}$ terms to types that reduce at the endpoints and probably don't reduce in 'the middle'. But, this is $\mathsf{Glue}$. We will additionally expect there to be rules about how $\mathsf{El}(\mathsf{ua}\ a\ b\ e\ i)$ interacts with values of $\mathsf{El}(a)$ and $\mathsf{El}(b)$ and things like $\mathsf{transp}$, and these correspond to stuff about $\mathsf{glue}$ and the like.
The difference is that these $\mathsf{El}$-based glue types only exist for types that are the decoding of values of $\mathsf{U}$. But arguably this is a strange state of affairs. Why should we be able to transport between equivalent types only because they are in the universe? The original univalence axiom must be stated that way (I believe), because Martin-löf type theory has no identity type between types, and has nothing like the cubical dimensional typing judgment. But it seems like a mistake to see that as meaning that univalence is fundamentally about the universe, and not about the types within it.
Of course, you could want a theory where univalence works for some types and not for others. But it seems to me like you'd want to make that decision actively, not as an accident of what the theory you're extending allows you to say.
Incidentally, here is an Agda file showing an inductive-recursive sub-universe of the built-in universe illustrating this idea. I don't know if anyone has thought about whether Agda's I-R makes sense with higher types, but it's good enough for fooling with ideas.