» Publishers, Monetize your RSS feeds with FeedShow: More infos (Show/Hide Ads)
Representation theorists make good use of the “category algebra” construction. This is a way of turning a linear category (one whose hom-sets are vector spaces) into an associative algebra. In this post, I’ll describe what the category algebra is and why it seems to be important.
I’ll also ask two basic questions about the category algebra construction. I hope someone can tell me the answers.
First I’ll describe the category algebra construction. To make life easier for all of us, I’ll always use the word category to mean what category theorists would call a “category enriched in vector spaces”: one in which the hom-sets are vector spaces and composition is bilinear. Similarly, functors will be assumed to preserve this linear structure.
The construction takes as input a category $\mathbf{C}$ (assumed to have just a set of objects, not a proper class) and produces as output an associative algebra $\Sigma \mathbf{C}$, called its category algebra. As a vector space,
$\Sigma\mathbf{C} = \bigoplus_{c, d \in \mathbf{C}} \mathbf{C}(c, d)$
— the direct sum or coproduct of all the hom-sets of $\mathbf{C}$ (which, remember, are vector spaces). To define the multiplication on $\Sigma\mathbf{C}$, it’s enough to define the product $g \cdot f$ whenever $g$ and $f$ are maps in $\mathbf{C}$, and we do this by putting
$g \cdot f = \begin{cases} g \circ f &\text{if } domain(g) = codomain(f)\\ 0 &\text{otherwise.} \end{cases}$
In other words, multiplication is composition where that makes sense, and zero elsewhere.
(The notation $\Sigma\mathbf{C}$ for the category algebra is something I just made up. Is there standard notation?)
So far I’ve followed the time-honoured tradition of not bothering to say whether “algebras” are supposed to have a multiplicative identity. But actually, the issue of multiplicative identities is crucial to understanding category algebras.
Let me briefly try to explain why. The first observation is that if the category $\mathbf{C}$ has only finitely many objects, then the algebra $\Sigma\mathbf{C}$ does have a multiplicative identity. It’s $\sum_{c \in \mathbf{C}} 1_c$. If $\mathbf{C}$ has infinitely many objects then this sum makes no sense, so $\Sigma \mathbf{C}$ usually doesn’t have a multiplicative identity.
I’ll assume from now that $\mathbf{C}$ has only finitely many objects, so that $\Sigma\mathbf{C}$ is a unital algebra.
We’ll come back to the significance of identities in $\mathbf{C}$ and in $\Sigma\mathbf{C}$, but for now, let’s just observe:
- Taking the category algebra has the effect of concentrating all the identities of $\mathbf{C}$ into a single identity for $\Sigma\mathbf{C}$, with the individual identities $1_c$ of $\mathbf{C}$ being merely idempotents in $\Sigma\mathbf{C}$.
(The sum of all these idempotents is the multiplicative identity of $\Sigma\mathbf{C}$.) Alternatively, looking at it from the point of view of the algebra $\Sigma\mathbf{C}$:
- The multiplicative identity of $\Sigma\mathbf{C}$ is smeared all across the category $\mathbf{C}$, with one summand of the multiplicative identity $\sum_{c \in \mathbf{C}} 1_c$ attached to each object of $\mathbf{C}$.
Time for some examples.
Let $S$ be a finite preordered set — that is, a finite set equipped with a reflexive, transitive binary relation $\leq$. We can construct a category $\mathbf{C}$ from it as follows. Abstractly: view $S$ as an ordinary, unenriched category, then let $\mathbf{C}$ be the free linear category on it. Concretely, the objects of $\mathbf{C}$ are the elements of $S$, the hom-set $\mathbf{C}(s, t)$ is the ground field $k$ if $s \leq t$ and zero otherwise, and composition (where it’s nontrivial) is multiplication of scalars.
Now, the category algebra $\Sigma\mathbf{C}$ is a subalgebra of the algebra $M_S(k)$ of all $S \times S$ matrices over $k$. It consists of just those matrices $P$ satisfying the condition that $P(s, t)$ is only allowed to be nonzero when $s \leq t$.
A special case of the last example: let $S = \{1, \ldots, n\}$, with the obvious ordering. Then $\mathbf{C}$ is the category that you’d usually draw as $\bullet \to \bullet \to \quad \cdots \quad \to \bullet$ and $\Sigma\mathbf{C}$ is the algebra of $n \times n$ upper-triangular matrices.
Another special case: let $S = \{1, \ldots, n\}$ with the discrete ordering: $s \leq t \iff s = t$. Then $\mathbf{C}$ is the discrete category on $n$ objects — the disjoint union of $n$ copies of the ground field $k$ — and $\Sigma\mathbf{C}$ is the algebra of $n \times n$ diagonal matrices. Equivalently, $\Sigma\mathbf{C}$ is the $n$-fold product $k^n$.
A final special case: let $S = \{1, \ldots, n\}$ with the other trivial ordering: $s \leq t$ for all $s, t$. Then $\mathbf{C}$ is the codiscrete category on $n$ objects (so that all objects are isomorphic and all hom-sets are $k$), and $\Sigma\mathbf{C}$ is the full matrix algebra $M_n(k)$.
Why are category algebras important? I’m not sure I fully know, but here’s a fundamental fact:
A category and its category algebra are Morita equivalent.
What this means is that for any category $\mathbf{C}$, there’s an equivalence of categories
$[\mathbf{C}, \mathbf{Vect}] \simeq {\Sigma\mathbf{C}}\text{-}\mathbf{Mod}$
where the left-hand side is the category of functors $\mathbf{C} \to \mathbf{Vect}$. If you regard the algebra $\Sigma\mathbf{C}$ as a one-object category, then the right-hand side is the category of functors $\Sigma\mathbf{C} \to \mathbf{Vect}$.
So as far as linear representations are concerned, $\mathbf{C}$ and $\Sigma\mathbf{C}$ are the same thing.
How can we prove this equivalence? It’s one of those follow-your-nose proofs… but in following your nose, you discover the pivotal role of the identities.
In one direction, it’s straightforward: given a functor $F: \mathbf{C} \to \mathbf{Vect}$, put $X = \bigoplus_{c \in \mathbf{C}} F(c)$; then $X$ is a $\Sigma\mathbf{C}$-module in what is, if you think about it, an obvious way.
The other direction isn’t quite so obvious. Starting with a $\Sigma\mathbf{C}$-module $X$, how can we manufacture a functor $F: \mathbf{C} \to \mathbf{Vect}$? Given $X$ and an object $c \in \mathbf{C}$, we have to cook up a vector space $F(c)$. The key here is that, for each $c \in \mathbf{C}$, the element $1_c$ of $\Sigma\mathbf{C}$ is idempotent. It follows that $1_c \cdot - : X \to X$ is idempotent. The image of this map (which is also its set of fixed points) is a vector space; and that’s what we take $F(c)$ to be.
The rest of the details of this equivalence are easy enough, and I won’t bother you with them. Instead, I’ll show you one consequence of the equivalence, then ask you two questions.
First, here’s the consequence. It begins with the observation that equivalent categories don’t usually have isomorphic category algebras. Indeed, suppose we have two equivalent categories, $\mathbf{C}$ and $\mathbf{D}$. Then they’re certainly Morita equivalent. So, by using the result above, we get a chain of equivalences:
$\Sigma\mathbf{C}\text{-}\mathbf{Mod} \simeq [\mathbf{C}, \mathbf{Vect}] \simeq [\mathbf{D}, \mathbf{Vect}] \simeq \Sigma\mathbf{D}\text{-}\mathbf{Mod}$
The end result is that the categories $\Sigma\mathbf{C}$-modules and $\Sigma\mathbf{D}$-modules are equivalent. And, since $\Sigma\mathbf{C}$ and $\Sigma\mathbf{D}$ are not usually isomorphic, this isn’t quite trivial.
The most famous example is due to Morita. Say $\mathbf{C}$ is the codiscrete category on $n \geq 1$ objects (so that all hom-sets are the ground field $k$), and $\mathbf{D}$ is the codiscrete category on a single object. Then, as we saw earlier, $\Sigma\mathbf{C}$ is the full matrix algebra $M_n(k)$, while $\Sigma\mathbf{C} = M_1(k) = k$. But $\mathbf{C}$ and $\mathbf{D}$ are equivalent categories (since all objects of $\mathbf{C}$ are isomorphic), so
$M_n(k)\text{-}\mathbf{Mod} \simeq \mathbf{Vect}$
for any $n \geq 1$.
Now here are my two questions.
First question Is there a good categorical explanation of the category algebra construction?
The first observation is that the construction isn’t even functorial, or at least, not in the obvious way. A functor $F: \mathbf{C} \to \mathbf{D}$ does induce a linear map $\Sigma\mathbf{C} \to \Sigma\mathbf{D}$, but it doesn’t usually preserve multiplication. For instance, consider the obvious functor from the discrete category $\mathbf{C}$ on two objects to the discrete category $\mathbf{D}$ on one object. The induced linear map $k^2 \to k$ is addition, which is not a homomorphism of algebras.
Second question Does the category algebra construction suggest that we should study representations of categories rather than representations of algebras?
I need to explain the thinking behind this. The Morita equivalence between a category $\mathbf{C}$ and its category algebra $\Sigma\mathbf{C}$ tells us that from a representation-theoretic viewpoint, it doesn’t much matter which we use. However, if $\mathbf{C}$ has infinitely many objects then $\Sigma\mathbf{C}$ is usually not a unital algebra, and one may view a non-unital algebra as a rather deficient sort of thing. In that case, the thinking goes, it’s better to stick with the original category than pass to the category algebra.
Another way to put it: whenever you see a non-unital algebra (especially an infinite-dimensional one), ask yourself whether it’s the category algebra of some category with infinitely many objects. If it is, you might be better off working with the category rather than the algebra.
I picked up this point of view from a couple of different conversations with algebraists, but I’m not sure I’ve properly understood it. Let me test it out on a couple of examples. One of them kind of “works”, in the sense of corroborating this viewpoint. The other appears not to work at all.
Example Let $L^1(\mathbb{T})$ be the set of complex-valued integrable functions on the circle $\mathbb{T} = \mathbb{R}/\mathbb{Z}$. It’s a $\mathbb{C}$-algebra under addition and convolution.
This algebra has no multiplicative identity. If it did have one, it would be the Dirac delta function — that is, a function $\delta$ such that $\int_\mathbb{T} f \cdot \delta = f(0)$ for all integrable $f$. But, of course, no such delta function exists. This is what gives Fourier analysis its richness.
So we have before us an infinite-dimensional, non-unital algebra. The viewpoint described above tells us to look for a category of which it’s the category algebra. How can we do this?
Well, if $L^1(\mathbb{T}) = \Sigma\mathbf{C}$ for some category $\mathbf{C}$, then each object of $\mathbf{C}$ gives rise to an idempotent in $L^1(\mathbb{T})$. So we start by looking for the idempotents in $L^1(\mathbb{T})$. Since multiplication in $L^1(\mathbb{T})$ is convolution, this means looking for functions $f: \mathbb{T} \to \mathbb{C}$ such that $f \ast f = f.$ Since taking Fourier coefficients turns convolution into multiplication, this implies that each Fourier coefficient of $f$ is a multiplicative idempotent, that is, $0$ or $1$. Let’s write $e_k: x \mapsto e^{2\pi i k x}$ for the $k$th character of the circle ($k \in \mathbb{Z}$). Then $f = \sum_{k \in S} e_k$ for some $S \subseteq \mathbb{Z}$.
My thinking gets a bit fuzzy around here, but I think one can follow the argument through to show that the objects of $\mathbf{C}$ must be the integers (or if you prefer, the characters of $\mathbb{T}$), and that all the hom-sets are zero apart from an endomorphism ring $\mathbb{C}$ on each object. In other words, $\mathbf{C}$ is the discrete category on $\mathbb{Z}$.
The category algebra of this discrete category $\mathbf{C}$ consists of those double sequences $(c_k)_{k \in \mathbb{Z}}$ that are zero in all but finitely many places. Alternatively, you can think of this as the algebra of all trigonometric polynomials (that is, finite linear combinations of characters $e_k$).
This is not, of course, our original algebra $L^1(\mathbb{T})$. Most integrable functions on $\mathbb{T}$ are not trigonometric polynomials. So, you might say that the viewpoint advocated above has failed. However, perhaps it’s achieved some kind of moral victory. Although not every integrable function is a trigonometric polynomial, the whole theory of Fourier series tells us how, under various hypotheses and in various senses, arbitrary integrable functions can be expressed as limits of trigonometric polynomials. So perhaps it’s a category algebra in some suitably analytic sense.
Here’s another sign that this is a good point of view. When $\mathbf{C}$ is a category with infinitely many objects, we want to say that the identity of $\Sigma\mathbf{C}$ is $\sum_{c \in \mathbf{C}} 1_c$, except that this sum, being infinite, doesn’t exist. In the case of our particular $\mathbf{C}$, this sum is $\sum_{k \in \mathbb{Z}} e_k$, the sum of all the characters of $\mathbb{T}$. On the other hand, if the identity for convolution — the Dirac delta function — existed, then all its Fourier coefficients would be $1$. So this is exactly the nonexistent sum that the Dirac delta wants to be.
Non-example Here’s another commonly-encountered non-unital algebra, also arising in a soft-analytic context. But this one doesn’t seem to support the viewpoint advocated at all.
Gelfand duality tells us that the commutative not-necessarily-unital $C^\ast$-algebras are dual to the locally compact Hausdorff spaces, with a space $X$ corresponding to the $C^\ast$-algebra $C_0(X)$ of continuous functions $X \to \mathbb{C}$ that vanish at infinity. (The algebra operations are pointwise.) This restricts to a duality between unital commutative $C^\ast$-algebras and compact Hausdorff spaces.
In particular, if $X$ is a Hausdorff space that is locally compact but not compact, then $C_0(X)$ is a commutative algebra without a multiplicative identity. Concretely, the multiplicative identity of $C_0(X)$ would have to be the function with constant value $1$, but this does not vanish at infinity.
Is $C_0(X)$ a category algebra? Apparently not. Again, if it was one, each object of the category would give rise to an idempotent in $C_0(X)$. But in general, $C_0(X)$ has no nontrivial idempotents. For the algebra structure on $C_0(X)$ is pointwise multiplication, so an idempotent in $C_0(X)$ is just a function taking only the values $0$ and $1$; but assuming $X$ is connected, this forces the function to have constant value zero.
Perhaps this failure is somehow due to my ignoring the extra structure on $C_0(X)$. I’ve been treating it as a mere associative algebra, not a $C^\ast$-algebra. But I’m not convinced… can anyone help?
What I’d most like is for someone to explain a bit more the viewpoint that non-unital algebras are often categories in disguise. And some compelling examples would be even better.
A month ago, the newsletter of the London Mathematical Society published an opinion piece of mine, Should mathematicians cooperate with GCHQ?. It has just published an opposing opinion by Richard Pinch, GCHQ’s Strategic Advisor for Mathematics Research and formerly a number theorist at Cambridge.
Pinch’s reply is short and curiously insubstantial. First he makes a couple of general assertions in opposition to what I wrote. But unlike my piece, which linked heavily to sources, he provides no evidence for his assertions. Nor does he dispute any of the specific facts stated in my article. Then he quotes a politician and the director of GCHQ saying that they believe GCHQ operates with integrity. And that’s it.
So it’s almost too flimsy to be worth answering. However, it’s probably worth rebutting even insubstantial arguments when they come from people in positions of influence. Here’s my rebuttal.
Richard Pinch writes:
Dr Leinster’s opinion piece makes a range of allegations of unethical and unlawful conduct against GCHQ.
Whether GCHQ’s conduct is unlawful is not something I’m qualified to judge, and I didn’t: I wrote that it was “accused of law-breaking on an industrial scale”. Some of those doing the accusing are very well-qualified to do so. E.g. here’s the opinion of a Queen’s Counsel (a high rank of lawyer in the British system) specializing in public law:
- Huge swath of GCHQ mass surveillance is illegal, says top lawyer. The Guardian, 28 January 2014.
Then there’s European law:
- NSA and GCHQ activities appear illegal, says EU parliamentary inquiry. The Guardian, 9 January 2014.
And then there’s GCHQ’s own opinion:
- Leaked memos reveal GCHQ efforts to keep mass surveillance secret. The Guardian, 25 October 2013.
This article describes GCHQ internal memos showing how it feared legal challenge in the European courts if the existence of its mass surveillance programmes became known. So even GCHQ was well aware that its methods were legally precarious, at the very least.
All these articles were linked to in my original piece.
Pinch continues:
The allegations are so widely drawn that it is impossible for GCHQ to recognise them as a description of its activities.
Snowden’s leaks provide detailed documentary evidence for my claims. Neither GCHQ nor the NSA has challenged their authenticity. For every allegation I made in my article, I linked to either the documents or journalism based on them. The leaked documents are available for anyone to read.
Pinch provides no evidence of any kind in his article. Nor does he deny any specific assertion that I made.
Continuing with Pinch’s article:
GCHQ, along with the other intelligence agencies of the UK, is subject to some of the most rigorous legislative and oversight arrangements in the world.
Compare the statement of one of GCHQ’s own lawyers:
“We have a light oversight regime compared with the US”.
(The legal loopholes that allow GCHQ to spy on the world. The Guardian, 21 June 2013.)
How rigorous is “light oversight […] compared with the US”? Well, the secret court that regulates the NSA (and to which the NSA has been legally found to have lied repeatedly) rejects just 1 in 3000 of the NSA’s surveillance requests. And GCHQ claims an oversight regime that’s even lighter.
It’s not just this one GCHQ lawyer who says that GCHQ is more weakly regulated than the NSA:
in the documents GCHQ describes Britain’s surveillance laws and regulatory regime as a “selling point” for the Americans.
(Exclusive: NSA pays £100m in secret funding for GCHQ. The Guardian, 1 August 2013.) Update: See also this comment below.
(Incidentally, it’s not clear whether GCHQ gets away with whole-population surveillance by being so weakly controlled that it can break the law with impunity, or by not needing to break the law because the law’s so weak. As I said, I’m not qualified to judge what’s legal, and actually, legality isn’t of primary interest to me — as we all know, laws can be arbitrary or wrong.)
Pinch continues:
These ensure that all the work of the agencies is carried out in accordance with a strict legal and policy framework so that their activities are at all times legal, authorised, necessary and proportionate.
Whether it’s legal, I’ve already discussed. As for “policy framework”: sure, presumably the vast surveillance programmes being run by GCHQ do fit into some internal policy framework, but it’s no kind of democratic policy. Obviously there was no public discussion, but far more radically, even a senior Member of Parliament on the UK National Security Council claims not to have known:
- Cabinet was told nothing about GCHQ spying programmes, says Chris Huhne. The Guardian, 6 October 2013.
The rest of Pinch’s piece consists of pro-GCHQ quotes from the British Foreign Secretary and the Director of GCHQ. I could say pro-GCHQ things too; like just about everyone, I believe that some of what GCHQ does is worthwhile and justified. But that’s just opinion.
It’s the facts revealed by the Snowden papers that are so shocking. And when it comes to the facts, Pinch has disputed no factual statement about GCHQ made in my article, nor has he given us any reason to disbelieve the evidence before our eyes.
I’ve got a full-page opinion piece in this week’s New Scientist, on why mathematicians should refuse to cooperate with agencies of mass surveillance. If you’re in the US, UK or Australia, it’s the print edition that came out yesterday.
The substance is much the same as my piece for the London Mathematical Society Newsletter, but it’s longer, and it’s adapted for a US readership too.
I don’t currently have much to add to the article or what I wrote about mathematicians and the secret services previously. But I do have some observations to make about the process of writing for New Scientist.
This was my first time writing for a magazine. The article received substantial edits from at least three editors; you can compare it with the version I originally submitted. I have mixed feelings about this process.
On the one hand, it’s great to have the input of experienced magazine journalists, and I can definitely see ways that they improved what I wrote. On the other hand — and despite the editors I dealt with being reasonable, helpful, and pleasant — I found the process pretty frustrating. I think that’s because of where the control lies.
What doesn’t happen is that you submit your piece, the editors read it and give you their critiques, and then you amend your article accordingly. What does happen is that you submit something, the editors change it how they like, and if you don’t like any of their changes, you have to argue for why it should be changed back. This process may be iterated several times, perhaps with different editors with different opinions. Rationally, I know that the article goes out not only under my name but also under the magazine’s, but by the end of the process, I did have the depressing feeling that the article wasn’t entirely mine.
(Small example: there were three words that I disliked and repeatedly removed from the editor’s edits: “moral”, “snoop” and “spook”. The editors I dealt with directly respected my wish to avoid them, after I’d made the case. But in the online version, the headline and the standfirst — which I neither wrote nor saw before publication — managed to use two out of those three words.)
Anyway, it was a new experience.
Comments are open. As ever, if you’re leaving comments on the political aspects, please keep them focused on the relationship between mathematicians and the secret services.
Update Here’s a list of the various press articles that followed on from my original article:
- Mathematician Spies, Slate, 27 April 2014. (Reprint of New Scientist article)
- Mathematicians: refuse to work for the NSA!, Boing Boing, 27 April 2014
- Mathematicians Push Back Against The NSA, Slashdot, 27 April 2014
- Un mathématicien appelle ses collègues à ne plus travailler pour la NSA, Mediapart, 28 April 2014 (free version here)
- Mathematiker ruft zum Geheimdienst-Boykott auf, Zeit Online, 28 April 2014
- Mathematiker-Aufruf: Arbeitet nicht für die Geheimdienste!, Spiegel Online, 28 April 2014.
Congratulations to Mike for being a part of a research team who will receive $7. 5 million to carry on the good work of the IAS Univalent Foundations program. Homotopy Type Theory: Unified Foundations of Mathematics and Computation will run for five years, organised by Steve Awodey at CMU. (Technical portion of the grant proposal is here.)
I hope they’re allowed to use some of that funding to spill into physics a little.
Guest post by Sam van Gool
Monads provide a categorical setting for studying sets with additional structure. Similarly, 2-monads provide a 2-categorical setting for studying categories with additional structure. While there is really only one natural notion of algebra morphism in the context of monads, there are several choices of algebra morphism in the context of 2-monads. The interplay between these different kinds of morphisms is the main focus of the paper that I discuss in this post:
- [BKP] Two-dimensional monad theory, R. Blackwell, G. M. Kelly and A. J. Power, J. Pure and Appl. Algebra 59 (1989), pp. 1-41.
I will give an overview of the results and methods used in this paper. Also, especially towards the end of my post, I will also indicate some points that I think could still be clarified further by formulating some questions, which will hopefully lead to fruitful discussions below.
This post forms the 9th instalment of the series of posts written by participants of the Kan Extension Seminar, of which I’m very glad to be a part. In preparing the post I have greatly benefited from discussions with the other participants in the seminar, and of course with the seminar’s organizer, Emily Riehl. I am very grateful for the enthusiasm, encouragement and guidance that you all offered.
2-monads, their algebras, and their morphisms
Two-dimensional universal algebra goes beyond the $\mathbf{Cat}$-enriched setting in that it allows for non-strict morphisms. Consider the following (very) simple example.
Example. For a category $A$, let $T A$ be the category $A$ provided freely with a terminal object. This assignment can be extended to a 2-monad $T$ on $\mathbf{Cat}$. Then:
- an algebra for $T$ is (entirely determined by giving) a pair $(A,t_A)$ where $A$ is a category and $t_A$ is a designated terminal object in $A$;
- a strict morphism $(A,t_A) \to (B,t_B)$ is a functor $f$ for which $f(t_A) = t_B$;
- a pseudo morphism is a functor $f$ such that $f(t_A)$ is isomorphic to $t_B$;
- a colax morphism is just any functor from $A$ to $B$, with no additional requirement on the terminal object.
If you didn’t know them already, you will probably have guessed the general definitions of strict, pseudo and lax morphisms by now, as well as the definition of 2-cells between them. Note that, in this post, all 2-monads and algebras for them will be strict, as in [BKP].
For any 2-monad $T$, we thus get the following inclusions of 2-categories:
$T\text{-}\mathrm{Alg}_s \to T\text{-}\mathrm{Alg}_p \to T\text{-}\mathrm{Alg}_l.$
(In [BKP], the category $T\text{-}\mathrm{Alg}_p$ is denoted by $T\text{-}\mathrm{Alg}$.) Roughly the first half of the paper [BKP] is devoted to the construction of left adjoints (in the 2-categorical sense) to these inclusion functors.
Note that $T\text{-}\mathrm{Alg}_s$ is simply the Eilenberg-Moore $\mathcal{V}$-category of the $\mathcal{V}$-enriched monad $T$ in the case where $\mathcal{V} = \mathbf{Cat}$, in the sense of the second paper that we read in this seminar. The categories $T\text{-}\mathrm{Alg}_p$ and $T\text{-}\mathrm{Alg}_l$, on the other hand, are special to the $\mathbf{Cat}$-enriched setting.
Limits in $T$-Alg$_p$
The category $T\text{-}\mathrm{Alg}_s$ has all 2-limits that the base 2-category $\mathcal{K}$ has. For $T\text{-}\mathrm{Alg}_p$, the situation is more subtle.
Example (c’t’d). In the example where $T A$ is $A$ provided freely with a terminal object, let $1 = \{\ast\}$ be the terminal category and $I$ the category with two objects $0$, $1$ and a unique isomorphism between them. There are two pseudo-morphisms $(1,\ast) \to (I,0)$, one sending $\ast$ to $0$, the other sending $\ast$ to $1$. However, if $C \to 1$ is any functor which equalizes these two morphisms, then $C$ is empty, and so it does not admit a $T$-algebra structure. Thus, the category $T\text{-}\mathrm{Alg}_p$ does not admit equalizers in general.
Assuming that the $2$-category $\mathcal{K}$ is complete, it is however possible to construct the following limits in $T\text{-}Alg_p$:
- Products,
- Inserters and iso-inserters,
- Equifiers,
and they are created by the forgetful functor $T\text{-}Alg_p \to \mathcal{K}$. As we saw in last week’s post, these PIE-limits allow for the construction of many other limits. In particular, from the results discussed last week, we see that $T\text{-}\mathrm{Alg}_p$ also has inverters and co-tensors, and hence also lax and pseudo limits.
It is also worth noting that each of the results on existence of limits “restricts to strict” (for lack of a better name), by which I mean that, for each of these limits, there exists a limiting cone such that the algebra 1-cells in the limiting cone:
are strict, and
detect strictness.
For example, for any parallel pair $f, g : B \to C$ in $T\text{-}\mathrm{Alg}_p$ there is an inserter $p : A \to B$ such that (1) $p$ is strict, and (2) if $ph$ is strict for some algebra morphism $h : D \to A$, then $h$ is strict.
The pseudomorphism classifier
Example (c’t’d). In the example where $T A$ is $A$ provided freely with a terminal object, note that pseudo-morphisms can be mimicked using strict morphisms: for any algebra $(A,t_A)$, consider the algebra $(A',t_{A'})$, defined by adding one new object $t_{A'}$ and an isomorphism $t_{A} \cong t_{A'}$ to $A$. It is then clear that, for any algebra $(B,t_B)$, strict morphisms $(A',t_{A'}) \to (B,t_B)$ correspond to pseudo-morphisms $(A,t_A) \to (B,t_B)$. In fact, this correspondence is an isomorphism between the categories of morphisms and natural transformations between them.
The following theorem, which is arguably at the heart of the paper [BKP], says that the above phenomenon in fact occurs for any reasonably well-behaved 2-monad.
Theorem. Let $T$ be an accessible 2-monad on a 2-category $\mathcal{K}$ that is complete and cocomplete. Then the inclusion 2-functor $T\text{-}\mathrm{Alg}_s \to T\text{-}\mathrm{Alg}_p$ has a left adjoint.
Proof (Sketch). The proof of the theorem consists of three steps:
- A general fact: in order to find a left adjoint to a 2-functor $G : \mathcal{K} \to \mathcal{L}$, it suffices to find a left adjoint to its underlying ordinary functor $G_o$, provided that $\mathcal{K}$ has cotensors with the walking arrow category $2$ and $G$ preserves them.
- Using (1), one shows that there exists a left adjoint, $()^o$, to the inclusion functor $T\text{-}\mathrm{Alg}_s \to T/{\mathcal{K}},$ where $T/{\mathcal{K}}$ is the comma 2-category.
- The hardest part: pseudo-morphisms out of a $T$-algebra $(A,a)$ can be mimicked by $T/\mathcal{K}$-morphisms out of a certain object $(C,c,Z)$ of $T/{\mathcal{K}}$.
Now, composing (2) and (3), one associates to any $T$-algebra $(A,a)$ the $T$-algebra $(C,c,Z)^o$ and observes that this gives an ordinary (1-categorical) left adjoint to $T\text{-}\mathrm{Alg}_s \to T\text{-}\mathrm{Alg}_p$. Then, by (1) and the fact that cotensors exist in $T\text{-}\mathrm{Alg}_p$, it is also a 2-categorical left adjoint. $\qed$
The image under the left adjoint of an algebra $A$, seen as an object of $T\text{-}\mathrm{Alg}_p$, is denoted by $A'$ and called the pseudo-morphism classifier of $A$. Under the conditions of the Theorem, there is also a lax morphism classifier.
There are more conceptual proofs of these facts, using the concept of codescent objects; see, for example, this paper (which will be discussed in these series in a month or so) and Section 4 of the 2-categories companion by Stephen Lack. The latter paper, by the way, has been an indispensable source for me in preparing this post, and those who are familiar with it will probably recognize its influence throughout the post.
Flexibility
We denote by the letters $p$ and $q$ the unit and co-unit of the adjunction
$J : T\text{-}\mathrm{Alg}_s \leftrightarrows T\text{-}\mathrm{Alg}_p : ()'$
from the above theorem. For any algebra $A$ in $T\text{-}\mathrm{Alg}_p$, the morphism $q_A : A' \to A$ is in fact always a surjective equivalence in the 2-category $T\text{-}\mathrm{Alg}_p$, but in general $q_A$ does not even need to be an equivalence in $T\text{-}\mathrm{Alg}_s$, as we will see shortly. If $q_A : A' \to A$ is an equivalence in $T\text{-}\mathrm{Alg}_s$, then $A$ is called semi-flexible, and $A$ is called flexible if $q_A$ is a surjective equivalence in $T\text{-}\mathrm{Alg}_s$. The flexible objects are the cofibrant objects in a model structure on $T\text{-}\mathrm{Alg}_s$ lifted from the model structure on $\mathcal{K}$, and the pseudomorphism classifier $A'$ is then a special cofibrant replacement of $A$ (see Section 7.3 of the 2-categories companion for more details about this).
Several equivalent characterizations of flexibility and semi-flexibility are given in Theorems 4.4 and 4.7, respectively, of [BKP]. One useful equivalent way to say that a $T$-algebra $A$ is semi-flexible is that every pseudo-morphism out of $A$ is isomorphic to a strict morphism out of $A$. With this definition, we can see that not every $T$-algebra is semi-flexible:
Example. Let $T$ be the 2-monad on $\mathbf{Cat}$ whose algebras are small categories with assigned finite limits. Let $A$ be the terminal category, with finite limits assigned in the only possible way. Let $B$ be any category with assigned finite limits in which $t_B$ is the assigned terminal object and the assigned product $t_B \times t_B$ is not equal to $t_B$ (the two objects will of course be isomorphic). Then the functor $A \to B$ which sends the unique object of $A$ to $t_B$ is a pseudo-morphism, but it is clearly not isomorphic to any strict morphism.
The following example shows that flexibility and semi-flexibility are really different concepts.
Example. Categories whose objects are functors can also often be represented as the $T$-algebras for an appropriate monad $T$ on an appropriate base 2-category $\mathcal{K}$. For instance, there is a 2-monad $T$ on $\mathbf{Cat} \times \mathbf{Cat}$, given on objects by $T(X,Y) := (X,X+Y)$, such that $T$-algebras are functors, a pseudomorphism from $f : A \to B$ to $g : C \to D$ is a diagram of the form $\begin{matrix} A & \overset{f}{\to} & B \\ u \downarrow & \overset{\alpha}{\cong} & \downarrow v \\ C & \overset{g}{\to} & D \end{matrix}$ and such a pseudomorphism is strict exactly when $\alpha$ is the identity. Now, letting $1$ denote the terminal category, it is easy to describe the pseudomorphism classifier of the $T$-algebra $a : A \to 1$: this is the inclusion functor $j : A \to \overline{A}$, where $\overline{A}$ is the indiscrete category on objects $\mathrm{ob}(A) + \{\ast\}$ (As a simple but nice exercise, you may check that, indeed, any pseudomorphism out of the algebra $a : A \to 1$ corresponds uniquely to a strict morphism out of the algebra $j : A \to \overline{A}$.) Now, letting $I$ again denote the category with two objects $0$, $1$ and a unique isomorphism between them, one may check that the algebra $I \to 1$ is equivalent in $T\text{-}\mathrm{Alg}_s$ to the algebra $1 \to 1$, which is flexible, and therefore $I \to 1$ is semi-flexible. However, $I \to 1$ is not flexible. (See example 4.11 in [BKP]).
Biadjunctions and bicolimits in $T\text{-}\mathrm{Alg}_p$
So far, we have only considered limits, which one would expect to exist in a category of algebras. On the other hand, we wouldn’t generally expect colimits to exist in a category of algebras, but as it turns out, in the last section of [BKP], the authors prove that:
- the category $T\text{-}\mathrm{Alg}_p$ admits bicolimits, and
- any strict map of 2-monads $\theta : S \to T$ induces a map $T\text{-}\mathrm{Alg}_p \to S\text{-}\mathrm{Alg}_p$ that has a left biadjoint.
Both of these results are consequences of the following more technical fact:
Theorem. If $G : T\text{-}\mathrm{Alg}_p \to \mathcal{L}$ is a 2-functor so that the composite 2-functor
$\begin{matrix} T-Alg_s & \overset{J}{\to} & T-Alg_p & \overset{G}{\to} & L \end{matrix}$
has a left adjoint $H$, then $H$ maps into flexible algebras, and $J \circ H$ is left biadjoint to $G$.
From the above theorem and the relation between biadjoints and bicolimits that we discussed last week, bicolimits can now be constructed in $T\text{-}\mathrm{Alg}_p$, as claimed in (1) above. To prove (2), one first notices that 2-functor $\theta^* : T\text{-}\mathrm{Alg}_s \to S\text{-}\mathrm{Alg}_s$ extends to a 2-functor $\theta^\# : T\text{-}\mathrm{Alg}_p \to S\text{-}\mathrm{Alg}_p$ making the diagram
$\begin{matrix} T\text{-}\mathrm{Alg}_s & \overset{\theta^*}{\to} & S\text{-}\mathrm{Alg}_s \\ J\downarrow & \quad & \downarrow J \\ T\text{-}\mathrm{Alg}_p & \overset{\theta^\sharp}{\to} & S\text{-}\mathrm{Alg}_p \end{matrix}$
commute. One may then apply the Theorem in the case $G = \theta^\#$.
More examples of 2-monads
Above I motivated the concepts and theorems in [BKP] with some simple examples of 2-monads. The last section of [BKP] contains many more examples. About the general method for constructing such examples, the authors make the following interesting comment.
“In practice one is seldom presented with a 2-monad and invited to consider its algebras; more commonly one contemplates some structure borne by a category (…) and one concludes in certain cases that the structure is given by an action of a 2-monad (…)”
With this comment in mind, one may now construct 2-monads whose algebras are monoidal categories, symmetric monoidal closed categories (here the 2-monad is over the 2-category $\mathbf{Cat}_g$, where the 2-cells are only taken to be natural isomorphisms), and even finitary 2-monads themselves (they are the algebras for a certain 2-monad $R$ on the functor 2-category $[\mathbf{Cat}_f,\mathbf{Cat}]$, where $\mathbf{Cat}_f$ is the full subcategory of $\mathbf{Cat}$ consisting of the finitely presentable objects. This perspective was exploited in a later paper by Kelly and Power on presentations of 2-monads.
A final point of interest is that one may distinguish a special kind of 2-monad $T$, namely those for which the $T$-algebra structure on an object $A$ is unique if it exists. Such 2-monads $T$ define a property of rather than a structure on the objects of the base 2-category $\mathcal{K}$, and may thus be called property-like (as they are in this later paper by Kelly and Lack). As the authors of [BKP] remark, it “may well be a hard problem” how to distinguish the property-like 2-monads $T$ from, say, a presentation for them. A particular class of 2-monads which are ‘property-defining’ are the lax-idempotent 2-monads (which also go by the names “quasi-idempotent” and “Kock-Zöberlein” 2-monads).
Questions
Let me finish with a (non-exhaustive) list of questions that may be interesting to discuss below.
Can the fact that limits in $T\text{-}\mathrm{Alg}_p$ can be chosen with a fair amount of “strictness” be understood using this account of lax / pseudo limits for morphisms between $T$-algebras using $\mathcal{F}$-enrichment?
The flexible algebras are exactly the strict retracts of pseudomorphism classifiers. The latter are “free algebras”, in some sense (at least in the sense that they are the images of a left adjoint). This suggests that one could think of the concept ‘flexible algebra’ as a 2-categorical version of the familiar concept ‘projective algebra’ in the 1-categorical setting. Is this a good intuition, and if so, can it be (or has it already been) made more precise?
In order to better understand the concept of flexible algebra and the biadjunctions in the later part of [BKP], it would probably be useful to study different examples of 2-monads, and in particular, answer the following questions in such examples:
(a) is there a concrete construction of the pseudomorphism classifier?
(b) which algebras are (semi-)flexible?
(c) (for a strict map between 2-monads) what does the biadjunction do?
Here’s a classic theorem about finite products theories, also known as algebraic theories or Lawvere theories. You can find it in Toposes, Triples and Theories, but it must go way back to Lawvere’s thesis. In my work with Nina Otter on phylogenetic trees, we need a slight generalization of it… and if true, this generalization must already be known. So I really just want a suitable reference!
Theorem. Suppose $C$ is a category with finite products that is single-sorted: every object is a finite product of copies of a single object $x \in C$. Let $Mod(C)$ be the category of models of $C$: that is, product-preserving functors
$\phi : C \to Set$
and natural transformations between these. Let
$U : Mod(C) \to Set$
be the functor sending any model to its underlying set:
$U(\phi) = \phi(x)$
Then $U$ has a left adjoint
$F : Set \to Mod(C)$
and $Mod(C)$ is equivalent to the category of algebras of the monad
$U F : Set \to Set$
As a result, any model $\phi$ can be written as a coequalizer
$F U F U(\phi) \stackrel{\longrightarrow}{\longrightarrow} F U(\phi) \longrightarrow \phi$
where the arrows are built from the counit of the adjunction
$\epsilon : F U \to 1_{Mod(C)}$
in the obvious ways: $F U (\epsilon_{F U (\phi)})$, $\epsilon_{F U F U(\phi)}$ and $\epsilon_\phi$.
The generalization we need is quite mild, I hope. First, we need to consider multi-typed theories. Second, we need to consider models in $Top$ rather than $Set$. So, this is what we want:
Conjecture. Suppose $C$ is a category with finite products that is $\Lambda$-sorted: every object is a finite product of copies of certain objects $x_\lambda$, where $\lambda$ ranges over some index set $\Lambda$. Let $Mod(C)$ be the category of topological models of $C$: that is, product-preserving functors
$\phi : C \to Top$
and natural transformations between these. Let
$U : Mod(C) \to Top^\Lambda$
be the functor sending any model to its underlying spaces:
$U(\phi) = (\phi(x_\lambda))_{\lambda \in \Lambda}$
Then $U$ has a left adjoint
$F : Top^\Lambda \to Mod(C)$
and $Mod(C)$ is equivalent to the category of algebras of the monad
$U F : Top^\Lambda \to Top^\Lambda$
As a result, any model $\phi$ can be written as a coequalizer
$F U F U(\phi) \stackrel{\longrightarrow}{\longrightarrow} F U(\phi) \longrightarrow \phi$
where the arrows are built from the counit of the adjunction in the obvious ways.
Comments
1) There shouldn’t be anything terribly special about $Top$ here: any sufficiently nice category should work… but all I need is $Top$. What counts as ‘sufficiently nice’? If we need to replace $Top$ ‘convenient category of topological spaces’, to make it cartesian closed, that’s fine with me—just let me know!
2) I’m sure this conjecture, if true, follows from some super-duper-generalization. I don’t mind hearing about such generalizations, and I suspect some of you will be unable to resist mentioning them… so okay, go ahead, impress me…
… but remember: all I need is this puny little result!
In fact, all I really need is a puny special case of this puny result. If it works, it will be a small, cute application of algebraic theories to biology.
The Legendre–Fenchel transform, or Fenchel transform, or convex conjugation, is, in its naivest form, a duality between convex functions on a vector space and convex functions on the dual space. It is of central importance in convex optimization theory and in physics it is used to switch between Hamiltonian and Lagrangian perspectives.
Suppose that $V$ is a real vector space and that $f\colon V\to [-\infty ,+\infty ]$ is a function then the Fenchel transform is the function $f^{\ast }\colon V^{#}\to [-\infty ,+\infty ]$ defined on the dual vector space $V^{#}$ by $f^{\ast }(k)\coloneqq \sup _{x\in V}\big \{ \langle k,x\rangle -f(x)\big \} .$
If you’re a regular reader then you will be unsurprised when I say that I want to show how it naturally arises from enriched category theory constructions. I’ll show that in the next post. In this post I’ll give a little introduction to the Legendre–Fenchel transform.
There is probably no best way to introduce the Legendre–Fenchel transform. The only treatment that I knew for many years was in Arnold’s book Mathematical Methods of Classical Mechanics, but I have recently come across the convex optimization literature and would highly recommend Tourchette’s The Fenchel Transform in a Nutshell — my treatment here is heavily influenced by this paper. I will talk mainly about the one-dimensional case as I think that gives a lot of the intuition.
We will start, as Legendre did, with the special case of a strictly convex differentiable function $f\colon \mathbb{R}\to \mathbb{R}$; for instance, the function $x^{2}+1/2$ pictured on the left hand side above. The derviative of $f$ is strictly increasing and so the function $f$ can be parametrized by the derivative $k =d f/d x$ instead of the parameter $x$. Indeed we can write the parameter $x$ in terms of the slope $k$, $x=x(k)$. The Legendre-Fenchel transform $f^{*}$ can then be defined to satisfy $\langle k,x \rangle = f(x) +f^{\ast }(k),$ where the angle brackets mean the pairing between a vector space and its dual. In this one-dimensional case, where $x$ and $k$ are thought of as real numbers, we just have $\langle k,x \rangle = k x$.
As $x$ is a function of $k$ we can rewrite this as $f^{\ast }(k)\coloneqq \langle k,x(k) \rangle -f(x(k)).$ So the Legendre-Fenchel transform encodes the function is a different way. By differentiating this equation you can see that the $d f^{\ast }/d k=x(k)$, thus we have interchanged the abcissa (the horizontal co-ordinate) and the slope. So if $f$ has derivative $k_{0}$ at $x_{0}$ then $f^{\ast }$ has derivative $x_{0}$ at $k_{0}$. This is illustrated in the above picture.
I believe this is what Legendre did and then that what Fenchel did was to generalize this to non-differentiable functions.
For non-differentiable functions, we can’t talk about tangent lines and derivatives, but instead can talk about supporting lines. A supporting line is one which touches the graph at at least one point and never goes above the graph. (The fact that we’re singling out lines not going above the graph means that we have convex functions in mind.)
For instance, at the point $(x_{0},f(x_{0}))$ the graph pictured below has no tangent line, but has supporting lines with gradient from $k_{1}$ to $k_{2}$. A convex function will have at least one supporting line at each point.
It transpires that the right way to generalize the transform to this non-differentiable case is to define it as follows: $f^{\ast }(k)\coloneqq \sup _{x\in \mathbb{R}}\big \{ \langle k,x\rangle -f(x)\big \} .$ In this case, if $f$ has a supporting line of slope $k_{0}$ at $x_{0}$ then $f^{\ast }$ has a supporting line of slope $x_{0}$ at $k_{0}$. In the picture above, at $x_{0}$, the function $f$ has supporting lines with slope from $k_{1}$ to $k_{2}$: correspondingly, the function $f^{\ast }$ has supporting lines with slope $x_{0}$ all the way from $k_{1}$ to $k_{2}$.
If we allow the function $f$ to be not strictly convex then the transform will not alway be finite. For example, if $f(x)\coloneqq ax+b$ then we have $f^{\ast }(a)=-b$ and $f^{\ast }(k)=+\infty$ for $k\ne a$. So we will allow functions taking values in the extended real numbers: $\overline{\mathbb{R}}\coloneqq [-\infty ,+\infty ]$.
We can use the above definition to get the transform of any function $f\colon \mathbb{R}\to \overline{\mathbb{R}}$, whether convex or not, but the resulting transform $f^{\ast }$ is always convex. (When there are infinite values involved we can also say that $f^{\ast }$ is lower semi-continuous, but I’ll absorb that into my definition of convex for functions taking infinite values.)
Everything we’ve done for one-dimensional $\mathbb{R}$ easily generalizes to any finite dimensional real vector space $V$, where we should say ‘supporting hyperplane’ instead of ‘supporting line’. From that we get a transform between sets of functions $(\text {--})^{\ast }\colon \mathrm{Fun}(V,\overline{\mathbb{R}})\to \mathrm{Fun}(V^{#},\overline{\mathbb{R}}),$ where $V^{#}$ is the vector space dual of $V$. Similarly, we have a reverse transform going the other way, which is traditionally also denoted with a star $(\text {--})^{\ast }\colon \mathrm{Fun}(V^{#},\overline{\mathbb{R}})\to \mathrm{Fun}(V,\overline{\mathbb{R}}),$ for $g\colon V^{#}\to \overline{\mathbb{R}}$ we define $g^{\ast }(x)\coloneqq \sup _{k\in V^{#}}\big \{ \langle k,x\rangle -g(k)\big \} .$
This pair of transforms have some rather nice properties, for instance, they are order reversing. We can put a partial order on any set of functions to $\overline{\mathbb{R}}$ by defining $f_{1}\ge f_{2}$ if $f_{1}(x)\ge f_{2}(x)$ for all $x$. Then $f_{1}\ge f_{2} \quad \Rightarrow \quad f_{2}^{\ast }\ge f_{1}^{\ast }.$ Also for any function $f$ we have $f^{\ast }=f^{\ast \ast \ast }$ which implies that the operator $f\mapsto f^{\ast \ast }$ is idempotent: $f^{\ast \ast }=f^{\ast \ast \ast \ast }.$ This means that $f\mapsto f^{\ast \ast }$ is a closure operation. What it actually does is take the convex envelope of $f$, which is the largest convex function less than or equal to $f$. Here’s an example.
This gives that if $f$ is already a convex function then $f^{\ast \ast }=f$. And as a consequence the Legendre–Fenchel transform and the reverse transform restrict to an order reversing bijection between convex functions on $V$ and convex functions on its dual $V^{#}$. $\mathrm{Cvx}(V,\overline{\mathbb{R}})\cong \mathrm{Cvx}(V^{#},\overline{\mathbb{R}}).$
There are many other things that can be said about the transform, such as Fenchel duality and the role it plays in optimization, but I don’t understand such things to my own satisfaction yet.
Next time I’ll explain how most of the above structure drops out of the nucleus construction in enriched category theory.
Guest post by Christina Vasilakopoulou
In the eighth installment of the Kan Extension Seminar, we discuss the paper “Elementary Observations on 2-Categorical Limits” by G.M. Kelly, published in 1989. Even though Kelly’s classic book Basic Concepts of Enriched Category Theory, which contains the abstract theory related to indexed (or weighted) limits for arbitrary $\mathcal{V}$-categories, was available since 1982, the existence of the present article is well-justifiable.
On the one hand, it constitutes an independent account of the fundamental case $\mathcal{V}$=$\mathbf{Cat}$, thus it motivates and exemplifies the more general framework through a more gentle, yet meaningful exposition of 2-categorical limits. The explicit construction of specific notable finite limits such as inserters, equifiers etc. promotes the comprehension of the definitions, via a hands-on description. Moreover, these finite limits and particular results concerning 2-categories rather than general enriched categories, such as the construction of the cotensor as a PIE limit, are central for the theory of 2-categories. Lastly, by introducing indexed lax and pseudo limits along with Street’s bilimits, and providing appropriate lax/ pseudo/ bicategorical completeness results, the paper serves also as an indespensable reference for the later “2-Dimensional Monad Theory” by Blackwell, Kelly and Power.
I would like to take this opportunity to thank Emily as well as all the other participants of the Kan Extension Seminar. This has been a unique experience of constant motivation and inspiration for me!
Basic Machinery
Presently, our base of enrichment is the cartesian monoidal closed category $\mathbf{Cat}$ of (small) categories, with the usual adjunction $-\times\mathcal{A}\dashv[\mathcal{A},-]$. The very definition of an indexed limit requires a good command of the basic $\mathbf{Cat}$-categorical notions, as seen for example in “Review of the Elements of 2-categories” by Kelly and Street. In particular, a 2-natural transformation $\alpha:G\Rightarrow H$ between 2-functors consists of components which not only satisfy the usual naturality condition, but also the 2-naturality one expressing compatibility with 2-cells. Moreover, a modification between 2-natural transformations $m:\alpha\Rrightarrow\beta$ has components families of 2-cells $m_A:\alpha_A\Rightarrow\beta_A:GA\to HA$ compatible with the mapped 1-cells of the domain 2-category, i.e. $m_B\cdot Gf=Hf\cdot m_A$ (where $\cdot$ is whiskering).
A 2-functor $F:\mathcal{K}\to\mathbf{Cat}$ is called representable, when there exists a 2-natural isomorphism $\alpha:\mathcal{K}(K,-)\xrightarrow{\quad\sim\quad}F.$ The components of this isomorphism are $\alpha_A:\mathcal{K}(K,A)\cong FA$ in $\mathbf{Cat}$, and the unit of the representation is the corresponding `element’ $\mathbf{1}\to FK$ via Yoneda.
For a general complete symmetric monoidal closed category $\mathcal{V}$, the usual functor category $[\mathcal{A},\mathcal{B}]$ for two $\mathcal{V}$-categories is endowed with the structure of a $\mathcal{V}$-category itself, with hom-objects ends $[\mathcal{A},\mathcal{B}](T,S)=\int_{A\in\mathcal{A}} \mathcal{B}(TA,SA)$ (which exist at least when $\mathcal{A}$ is small). In our context of $\mathcal{V}$=$\mathbf{Cat}$ it is not necessary to employ ends and coends at all, and the hom-category $[\mathcal{K},\mathcal{L}](G,H)$ of the functor 2-category is evidently the category of 2-natural transformations and modifications. However, we note that computations via (co)ends simplify and are essential for constructions and (co)completeness results for enrichment in general monoidal categories.
The definition of weighted limits for 2-categories
To briefly motivate the definition of a weighted limit, recall that an ordinary limit of a ($\mathbf{Set}$-) functor $G:\mathcal{P}\to\mathcal{C}$ is characterized by an isomorphism $\mathcal{C}(C,limG)\cong[\mathcal{P},\mathcal{C}](\Delta C, G)$ natural in $C$, where $\Delta C:\mathcal{P}\to\mathcal{C}$ is the constant functor on the object $C$. In other words, the limit is the representing object of the presheaf $[\mathcal{P},\mathcal{C}](\Delta -,G):\mathcal{C}^\op\to\mathbf{Set}.$ Since components of a natural transformation $\Delta C\Rightarrow G$ (i.e. cones) can be viewed as components of a natural $\Delta\mathbf{1}\Rightarrow\mathcal{C}(C,G-):\mathcal{C}\to\mathbf{Set}$, the above defining isomorphism can be written as $\mathcal{C}(C,\mathrm{lim}G)\cong[\mathcal{P},\mathbf{Set}](\Delta\mathbf{1},\mathcal{C}(C,G-)).$ In this form, ordinary limits can be easily seen as particular examples of conical indexed limits for $\mathcal{V}$=$\mathbf{Set}$, and we are able to generalize the concept of a limit by replacing the functor $\Delta\mathbf{1}$ by an arbitrary functor (weight) $\mathcal{C}\to\mathbf{Set}$.
We may thus think of a 2-functor $F:\mathcal{P}\to\mathbf{Cat}$ as a (small) indexing type or weight, and a 2-functor $G:\mathcal{P}\to\mathcal{K}$ as a diagram in $\mathcal{K}$ of shape $\mathcal{P}$: $\begin{matrix} & \mathbf{Cat}\quad \\ {}^{weight} \nearrow_{F} & \\ \mathcal{P} & \overset{G}\underset{diagram}{\rightarrow} & \mathcal{K}. \\ \end{matrix}$ The 2-functor $G$ gives rise to a 2-functor $\int_p [Fp,\mathcal{K}(-,Gp)]=[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(-,G)):\; \mathcal{K}^\op\longrightarrow\mathbf{Cat}$ which maps a 0-cell $A$ to the category $[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(A,G))$. A representation of this contravariant 2-functor is an object $\{F,K\}\in\mathcal{K}$ along with 2-natural isomorphism $\mathcal{K}(-,\{F,G\})\xrightarrow{\;\sim\;}[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(-,G))$ with components isomorphisms between categories $\mathcal{K}(A,\{F,G\})\cong[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(A,G-)).$ The unit of this representation is $\mathbf{1}\to[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(\{F,G\},G))$ which corresponds uniquely to a 2-natural transformation $\xi:F\Rightarrow\mathcal{K}(\{F,G\},G)$.
Via this 2-natural isomorphism, the object $\{F,G\}$ in $\mathcal{K}$ satisfies a universal property which can be expressed in two levels:
The 1-dimensional aspect of the universal property states that every natural transformation $\rho$ factorizes as $\begin{matrix} F \xrightarrow{\rho} & \mathcal{K}(A,G) \\ {}_\xi \searrow & \uparr_{\mathcal{K}(h,1)} \\ & \mathcal{K}(\{F,G\},G) \\ \end{matrix}$ for a unique 1-cell $h:A\to\{F,G\}$ in $\mathcal{K}$, where the vertical arrow is just pre-composition with $h$.
The 2-dimensional aspect of the universal property states that every modification $\theta:\rho\Rrightarrow\rho'$ factorizes as $\mathcal{K}(\alpha,1)\cdot \xi$ for a unique 2-cell $\alpha:h\Rightarrow h'$ in $\mathcal{K}$.
The fact that the 2-dimensional aspect (which asserts an isomorphism of categories) does not in general follow from the 1-dimensional aspect (which asserts a bijection between the hom-sets of the underlying categories) is a recurrent issue of the paper. In fact, things would be different if the underlying category functor $\mathcal{V}(I,-)=(\;)_0:\mathcal{V}\text{-}\mathbf{Cat}\to\mathbf{Cat}$ were conservative, in which case the 2-dimensional universal property would always imply the 1-dimensional one. Certainly though, this is not the case for $\mathcal{V}$=$\mathbf{Cat}$: the respective functor discards all the 2-cells and is not even faithful. However, if we know that a weighted limit exists, then the first level of the universal property suffices to detect it up to isomorphism.
Completeness of 2-categories
A 2-category $\mathcal{K}$ is complete when all limits $\{F,G\}$ exist. The defining 2-natural isomorphism extends the mapping $(F,G)\mapsto\{F,G\}$ into a functor of two variables (the weighted limit functor) $\{-,-\}:[\mathcal{P},\mathbf{Cat}]^{op}\times[\mathcal{P},\mathcal{K}]\longrightarrow \mathcal{K}$ as the left parametrized adjoint (actually its opposite) of the functor $\mathcal{K}(-,?):\mathcal{K}^{op}\times[\mathcal{P},\mathcal{K}]\to[\mathcal{P},\mathbf{Cat}]$ mapping an object $A$ and a functor $G$ to $\mathcal{K}(A,G-)$. A colimit in $\mathcal{K}$ is a limit in $\mathcal{K}^op$, and the weighted colimit functor is $-\ast-:[\mathcal{P}^op,\mathbf{Cat}]\times[\mathcal{P},\mathcal{K}]\longrightarrow\mathcal{K}.$ Apart from the evident duality, we observe that often colimits are harder to compute than limits. This may partially be due to the fact that $\{F,G\}$ is determined by the representable $\mathcal{K}(-,\{F,G\})$ which gives generalized elements of $\{F,G\}$, whereas the description of $\mathcal{K}(F\ast G,-)$ gives us arrows out of $F\ast G$. For example, limits in $\mathbf{Cat}$ are easy to compute via $[\mathcal{A},\{F,G\}]\cong[\mathcal{P},\mathbf{Cat}](F,[\mathcal{A},G-])\cong[\mathcal{A},[\mathcal{P},\mathbf{Cat}](F,G)]$ and in particular, taking $\mathcal{A}=\mathbf{1}$ gives us the objects of the category $\{F,G\}$ and $\mathcal{A}=\mathbf{2}$ gives us the morphisms. On the contrary, colimits in $\mathbf{Cat}$ are not straightforward (except than their property $F\ast G\cong G\ast F$).
Notice that like ordinary limits are defined, via representability, in terms of limits in $\mathbf{Set}$, we can define weighted limits in terms of limits of representables in $\mathbf{Cat}$: $\mathcal{K}(A,\{F,G\})\cong\{F,\mathcal{K}(A,G-)\},\quad\mathcal{K}(F\ast ,G,A)\cong\{F,\mathcal{K}(G-,A)\}.$ On the other hand, if the weights are representables, via Yoneda lemma we get $\{\mathcal{P}(P,-),G\}\cong GP, \qquad \mathcal{P}(-,P)\ast G\cong GP.$
The main result for general $\mathcal{V}$-completeness in Kelly’s book says that a $\mathcal{V}$-enriched category is complete if and only if it admits all conical limits (equivalently, products and equalizers) and cotensor products. Explicitly, conical limits are those with weight the constant $\mathcal{V}$-functor $\Delta I$, whereas cotensors are those where the domain enriched category $\mathcal{P}$ is the unit category $\mathbf{1}$, hence the weight and the diagram are determined by objects in $\mathcal{V}$ and $\mathcal{K}$ respectively. Once again, for $\mathcal{V}$=$\mathbf{Cat}$ an elementary description of both limits is possible.
Notice that when a 2-category admits tensor products of the form $\mathbf{2}\ast A$, then the 2-dimensional universal property follows from the 1-dimensional for every limit, because of conservativity of the functor $\mathbf{Cat}_0(\mathbf{2},-)$ and the definition of tensors. Moreover, the former also implies that the category $\mathbf{2}$ is a strong generator in $\mathbf{Cat}$, hence the existence of only the cotensor $\{\mathbf{2},B\}$ along with conical limits in a 2-category $\mathcal{K}$ is enough to deduce 2-completeness.
$\mathbf{Cat}$ itself has cotensor and tensor products, given by $\{\mathcal{A},\mathcal{B}\}=[\mathcal{A},\mathcal{B}]$ and $\mathcal{A}\ast\mathcal{B}=\mathcal{A}\times\mathcal{B}$. It is ultimately also cocomplete, all colimits being constructed from tensors and ordinary colimits in $\mathbf{Cat}_0$ (which give the conical limits in $\mathbf{Cat}$ by the existence of the cotensor $[\mathbf{2},B]$).
If we were to make use of ends and coends, the explicit construction of an arbitrary 2-(co)limit in $\mathcal{K}$ as the (co)equalizer of a pair of arrows between (co)products of (co)tensors coincides with $\{F,G\}=\int_K \{FK,GK\}, \qquad F\ast G=\int^K FK\ast GK.$ Such an approach simplifies the proofs of many useful properties of limits and colimits, such as $\{F,\{G,H\}\}\cong\{F\ast G,H\},\;\;(G\ast F)\ast H\cong F\ast(G\ast H)$ for appropriate 2-functors.
Famous finite 2-limits
The paper provides the description of some important classes of limits in 2-categories, essentially by exhibiting the unit of the defining representation for each particular case. A table which summarizes the main examples included is the following:
Let’s briefly go through the explicit construction of an inserter in a 2-category $\mathcal{K}$. The weight and diagram shape are as in the first line of the above table, and denote by $B\overset{f}\underset{g}{\rightrightarrows}C$ the image of the diagram in $\mathcal{K}$. The standard technique is to identify the form of objects and morphisms of the functor 2-category $[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(A,G-))$, and then state both aspects of the universal property.
An object is a 2-natural transformation $\alpha:F\Rightarrow\mathcal{K}(A,G-)$ with components $\alpha_\bullet:1\to\mathcal{K}(A,B)$ and $\alpha_\star:\mathbf{2}\to\mathcal{K}(A,C)$ satisfying the usual naturality condition (2-naturality follows trivially, since $\mathcal{P}$ only has the identity 2-cell). This amounts to the following data:
an 1-cell $A\xrightarrow{\alpha_\bullet}B$, i.e. the object in $\mathcal{K}(A,B)$ determined by the functor $\alpha_\bullet$;
a 2-cell ${\alpha_\star0}\overset{\alpha_\star}{\Rightarrow}{\alpha_\star1}$, i.e. the morphism in $\mathcal{K}(A,C)$ determined by the functor $\alpha_\star$;
properties, which make the 1-cells $\alpha_\star0,\alpha_\star1$ factorize as $\alpha_\star0=A\xrightarrow{\alpha_\bullet}B\xrightarrow{f}C$ and $\alpha_\star1=A\xrightarrow{\alpha_\bullet}B\xrightarrow{g}C$.
We can encode the above data by a diagram $\begin{matrix} & B & \\ {}^{\alpha_\bullet} \nearrow && {\searrow}^f \\ A\; & \Downarrow{\alpha_\star}& \quad C. \\ {}_{\alpha_\bullet} \searrow && \nearrow_g \\ & B & \\ \end{matrix}$ Now a morphism is a modification $m:\alpha\Rrightarrow\beta$ between two objects as above. This has components
$m_\bullet:\alpha_\bullet\Rightarrow\beta_\bullet$ in $\mathcal{K}(A,B)$;
$m_\star:\alpha_\star\Rightarrow\beta_\star$ given by 2-cells $m_\star^0:\alpha_\star0\Rightarrow{\beta_\star0}$ and $m_\star^1:\alpha_\star1\Rightarrow\beta_\star1$ in $\mathcal{K}(A,C)$ satisfying naturality $m^1_\star\circ\alpha_\star=\beta_\star\circ m^0_\star$.
The modification condition $m^0_\star=f\cdot m_\bullet$ and $m^1_\star=g\cdot m_\bullet$ gives the components of $m_\star$ as whiskered composites of $m_\bullet$. We can thus express such a modification as a 2-cell $m_\bullet$ satisfying $gm_\bullet\circ\alpha_\star=fm_\bullet\circ\beta_\star$ (graphically expressed by pasting $m_\bullet$ accordingly to the sides of $\alpha_\star,\beta_\star$).
This encoding simplifies the statement of the universal property for $\{F,G\}$, as the object of in $\mathcal{K}$ through which any natural transformation and modification uniquely factorize in an appropriate way (in fact, through the unit $\xi$). A very similar process can be followed for the identification of the other classes of limits. As an illustration, let’s consider some of these limits in the 2-category $\mathbf{Cat}$.
The inserter of two functors $F,G:\mathcal{B}\to\mathcal{C}$ is a category $\mathcal{A}$ with objects pairs $(B,h)$ where $B\in\mathcal{B}$ and $h:FB\to GB$ in $\mathcal{C}$. A morphism $(B,h)\to(B',h')$ is an arrow $f:B\to B'$ in $\mathcal{B}$ such that the following diagram commutes: $\begin{matrix} FB & \overset{Ff}{\longrightarrow} & FB' \\ {}_h\downarrow && \downarrow_{h'} \\ FB & \underset{Gh}{\longrightarrow} & GB'. \\ \end{matrix}$ The functor $\alpha_\bullet=P:\mathcal{A}\to\mathcal{B}$ is just the forgetful functor, and the natural transformation is given by $(\alpha_\star)_{(B,h)}=h$.
The comma-object of two functors $F,G$ is precisely the comma category. If the functors have also the same domain, their inserter is a subcategory of the comma category.
The equifier of two natural transformations $\phi^1,\phi^2:F\Rightarrow G:\mathcal{B}\to\mathcal{C}$ is the full subcategory $\mathcal{A}$ of $\mathcal{B}$ over all objects $B$ such that $\phi^1_B=\phi^2_B$ in $\mathcal{C}$.
There is a variety of constructions of new classes of limits from given ones, coming down to the existence of endo-identifiers, inverters, iso-inserters, comma-objects, iso-comma-objects, lax/ oplax/pseudo limits of arrows and the cotensors $\{\mathbf{2},K\}$, $\{\mathbf{I},K\}$ out of inserters, equifiers and binary products in the 2-category $\mathcal{K}$. Along with the substantial construction of arbitrary cotensors out of these three classes, P(roducts)I(nserters)E(quifiers) limits are established as essential tools, also relatively to categories of algebras for 2-monads. Notice that equalizers are `too tight’ to fit in certain 2-categories of importance such as $\mathbf{Lex}$.
Weaker notions of limits in 2-categories
The concept of a weighted 2-limit strongly depends on the specific structure of the 2-category $[\mathcal{P},\mathbf{Cat}]$ of 2-functors, 2-natural transformations and modifications, for the 2-categories $\mathcal{P}$ and $\mathbf{Cat}$. If we alter this structure by considering lax natural transformations or pseudonatural transformations, we obtain the definition of the lax limit $\{F,G\}_l$ and pseudo limit $\{F,G\}_p$ as the representing objects for the 2-functors $\begin{matrix} Lax[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(-,G)):\mathcal{K}^\op\to\mathbf{Cat} \\ Psd[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(-,G)):\mathcal{K}^\op\to\mathbf{Cat}. \end{matrix}$ Notice that the functor categories $Lax[\mathcal{P},\mathcal{L}]$ and $Psd[\mathcal{P},\mathcal{L}]$ are 2-categories whenever $\mathcal{L}$ is a 2-category, hence the defining isomorphisms are again between categories as before.
An important remark is that any lax or pseudo limit in $\mathcal{K}$ can be in fact expressed as a `strict’ weighted 2-limit. This is done by replacing the original weight with its image under the left adjoint of the incusion functors $[\mathcal{P},\mathbf{Cat}]\hookrightarrow Lax[\mathcal{P},\mathbf{Cat}]$, $[\mathcal{P},\mathbf{Cat}]\hookrightarrow Psd[\mathcal{P},\mathbf{Cat}]$. The opposite does not hold: for example, inserters and equifiers are neither lax not pseudo limits.
We can relax the notion of limits in 2-categories even further, and define the bilimit $\{F,G\}_b$ of 2-functors $F$ and $G$ as the representing object up to equivalence: $\mathcal{K}(A,\{F,G\}_b)\simeq Psd[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(A,G)).$ This is of course a particular case of general bilimits in bicategories, for which $\mathcal{P}$ and $\mathcal{K}$ are requested to be bicategories and $F$ and $G$ homomorphism of bicategories. The above equivalence of categories expresses a birepresentation of the homomorphism $Hom[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(-,G)):\mathcal{K}^op\to\mathbf{Cat}$.
Evidently, bilimits (firstly introduced by Ross Street) may exist even when pseudo limits do not, since they require an equivalence rather than isomorphism of hom-categories. The following two results sum up the conditions ensuring whether a 2-category has all lax, pseudo and bilimits.
A 2-category with products, inserters and equifiers has all lax and pseudo limits (whereas it may not have all strict 2-limits).
A 2-category with biproducts, biequalizers and bicotensors is bicategorically complete. Equivalently, it admits all bilimits if and only if for all 2-functors $F:\mathcal{P}\to\mathbf{Cat}$, $G:\mathcal{P}\to\mathcal{K}$ from a small ordinary category $\mathcal{P}$, the above mentioned birepresentation exists.
Street’s construction of an arbitrary bilimit requires a descent object of a 3-truncated bicosimplicial object in $\mathcal{K}$. An appropriate modification of the arguments exhibits lax and pseudo limits as PIE limits.
These weaker forms of limits in 2-categories are fundamental for the theory of 2-categories and bicategories. Many important constructions such as the Eilenberg-Moore object as well as the Grothendieck construction on a fibration, arise as lax/oplax limits. They are also crucial in 2-monad theory, for example when studying categories of (strict) algebras with non-strict (pseudo or even lax/oplax) morphisms, which are more common in nature.
A new Spanish language mathematical magazine has been launched: universo.math. Hispanophones should check out the first issue! There are some very interesting looking articles which cover areas from art through politics to research-level mathematics.
The editor-in-chief is my mathematical brother Jacob Mostovoy and he wants it to be a mix of Mathematical Intellingencer, Notices of the AMS and the New Yorker, together with less orthodox ingredients; the aim is to keep the quality high.
Besides Jacob, the contributors to the first issue that I recognise include Alberto Verjovsky, Ernesto Lupercio and Edward Witten, so universo.math seems to be off to a high quality start.
Guest post by Bruce Bartlett
The following is the greatest math talk I’ve ever watched!
- Etienne Ghys (with pictures and videos by Jos Leys), Knots and Dynamics, ICM Madrid 2006. [See below the fold for some links.]
I wasn’t actually at the ICM; I watched the online version a few years ago, and the story has haunted me ever since. Simon and I have been playing around with some of this stuff, so let me share some of my enthusiasm for it!
The story I want to tell here is how, via modular flow of lattices in the plane, certain matrices in $\SL(2,\mathbb{Z})$ give rise to knots in the 3-sphere less a trefoil knot. Despite possibly sounding quite scary, this can be easily explained in an elementary yet elegant fashion.
As promised above, here are some links related to Ghys’ ICM talk.
- Video — watch this now!
- Slides — the (ungarbled) slides for the talk
- Web version with animations — a great summary
- Article — the accompanying article with details
- More pictures — from Jos Leys’s webpage
- More animations — from Jos Leys’s webpage
- Article by Dana Lyons — a great summary with historical context and interviews
I’m going to focus on the last third of the talk — the modular flow on the space of lattices. That’s what produced the beautiful picture above (credit for this and other similar pics below goes to Jos Leys; the animation is Simon’s.)
Lattices in the plane
For us, a lattice is a discrete subgroup of $\mathbb{C}$. There are three types: the zero lattice, the degenerate lattices, and the nondegenerate lattices:
Given a lattice $L$ and an integer $n \geq 4$ we can calculate a number — the Eisenstein series of the lattice: $G_{n}(L) = \sum _{\omega \in L, \omega \neq 0} \frac{1}{\omega ^{n}}.$ We need $n \geq 3$ for this sum to converge. For, roughly speaking, we can rearrange it as a sum over $r$ of the lattice points on the boundary of a square of radius $r$. The number of lattice points on this boundary scales with $r$, so we end up computing something like $\sum _{r \geq 0} \frac{r}{r^{n}}$ and so we need $n \geq 3$ to make the sum converge.
Note that $G_{n}(L)$ = 0 for $n$ odd since every term $\omega$ is cancelled by the opposite term $-\omega$. So, the first two nontrivial Eisenstein series are $G_{4}$ and $G_{6}$. We can use them to put `Eisenstein coordinates’ on the space of lattices.
Theorem: The map $\begin{aligned} \{ \text{lattices} \} &\rightarrow \mathbb{C}^{2} \\ L & \mapsto (G_{4} (L), \, G_{6}(L)) \end{aligned}$ is a bijection.
The nicest proof is in Serre’s A Course in Arithmetic, p. 89. It is a beautiful application of the Cauchy residue theorem, using the fact that $G_{4}$ and $G_{6}$ define modular forms on the upper half plane $H$. (Usually, number theorists set up their lattices so that they have basis vectors $1$ and $\tau$ where $\tau \in H$. But I want to avoid this ‘upper half plane’ picture as far as possible, since it breaks symmetry and mystifies the geometry. The whole point of the Ghys picture is that not breaking the symmetry reveals a beautiful hidden geometry! Of course, sometimes you need the ‘upper half plane’ picture, like in the proof of the above result.)
Lemma: The degenerate lattices are the ones satisfying $20 G_{4}^{3} - 49G_{6}^{2} = 0$.
Let’s prove one direction of this lemma — that the degenerate lattices do indeed satisfy this equation. To see this, we need to perform a computation. Let’s calculate $G_{4}$ and $G_{6}$ of the lattice $\mathbb{Z} \subset \mathbb{C}$. Well, $G_{4}(\mathbb{Z}) = \sum _{n \neq 0} \frac{1}{n^{4}} = 2 \zeta (4) = 2 \frac{\pi ^{4}}{90}$ where we have cheated and looked up the answer on Wikipedia! Similarly, $G_{6}(\mathbb{Z}) = 2 \frac{\pi ^{6}}{945}$.
So we see that $20 G_{4}(\mathbb{Z})^{3} - 49 G_{6}(\mathbb{Z})^{2} = 0$. Now, every degenerate lattice is of the form $t \mathbb{Z}$ where $t \in \mathbb{C}$. Also, if we transform the lattice via $L \mapsto t L$, then $G_{4} \mapsto t^{-4} G_{4}$ and $G_{6} \mapsto t^{-6} G_{6}$. So the equation remains true for all the degenerate lattices, and we are done.
Corollary: The space of nondegenerate lattices in the plane of unit area is homeomorphic to the complement of the trefoil in $S^{3}$.
The point is that given a lattice $L$ of unit area, we can scale it $L \mapsto \lambda L$, $\lambda \in \mathbb{R}^{+}$ until $(G_{4}(L), G_{6}(L))$ lies on the 3-sphere $S^{3} = \{ (z,w) : |z|^{2} + |w|^{2} = 1\} \subset \mathbb{C}^{2}$. And the equation $20 z^{3} - 49 w^{2} = 0$ intersected with $S^{3}$ cuts out a trefoil knot… because it is “something cubed plus something squared equals zero”. And the lemma above says that the nondegenerate lattices are precisely the ones which do not satisfy this equation, i.e. they represent the complement of this trefoil.
Since we have not divided out by rotations, but only by scaling, we have arrived at a 3-dimensional picture which is very different to the 2-dimensional moduli space (upper half-plane divided by $\SL(2,\mathbb{Z})$) picture familiar to a number theorist.
The modular flow
There is an intriguing flow on the space of lattices of unit area, called the modular flow. Think of $L$ as sitting in $\mathbb{R}^{2}$, and then act on $\mathbb{R}^{2}$ via the transformation $\left ( \begin{array}{cc} e^{t} & 0 \\ 0 & e^{-t} \end{array} \right ),$ dragging the lattice $L$ along for the ride. (This isn’t just some formula we pulled out the blue — geometrically this is the ‘geodesic flow on the unit tangent bundle of the modular orbifold’.)
We are looking for periodic orbits of this flow.
“Impossible!” you say. “The points of the lattice go off to infinity!” Indeed they do… but disregard the individual points. The lattice itself can ‘click’ back into its original position:
How are we to find such periodic orbits? Start with an integer matrix $A = \left ( \begin{array}{cc} a & b \\ c & d \end{array}\right ) \in \SL(2, \mathbb{Z})$ and assume $A$ is hyperbolic, which simply means $|a + d| \geq 2$. Under these conditions, we can diagonalize $A$ over the reals, so we can find a real matrix $P$ such that $P A P^{-1} = \pm \left ( \begin{array}{cc} e^{t} & 0 \\ 0 & e^{-t} \end{array} \right )$ for some $t \in \mathbb{R}$. Now set $L \coloneqq P(\mathbb{Z}^{2})$. We claim that $L$ is a periodic orbit of period $t$. Indeed: $\begin{aligned} L_{t} &= \left ( \begin{array}{cc} e^{t} & 0 \\ 0 & e^{-t} \end{array} \right ) P (\mathbb{Z}^{2}) \\ &= \pm PA (\mathbb{Z}^{2}) \\ &= \pm P (\mathbb{Z}^{2}) \\ &= L. \end{aligned}$ We have just proved one direction of the following.
Theorem: The periodic orbits of the modular flow are in bijection with the conjugacy classes of hyperbolic elements in $\SL(2, \mathbb{Z})$.
These periodic orbits produce fascinating knots in the complement of the trefoil! In fact, they link with the trefoil (the locus of degenerate lattices) in fascinating ways. Here are two examples, starting with different matrices $A \in \SL(2, \mathbb{Z})$.
The trefoil is the fixed orange curve, while the periodic orbits are the red and green curves respectively.
Ghys proved the following two remarkable facts about these modular knots.
- The linking number of a modular knot with the trefoil of degenerate lattices equals the Rademacher function of the corresponding matrix in $\SL(2, \mathbb{Z})$ (the change in phase of the Dedekind eta function).
- The knots occuring in the modular flow are the same as those occuring in the Lorenz equations!
Who would have thought that lattices in the plane could tell the weather!!
I must say I have thought about many aspects of these closed geodesics, but it had never crossed my mind to ask which knots are produced. – Peter Sarnak
Nina Otter is a master’s student in mathematics at ETH Zürich who has just gotten into the PhD program at Oxford. She and I are writing a paper on operads and the tree of life.
Anyone who knows about operads knows that they’re related to trees. But I’m hoping someone has proved some precise theorems about this relationship, so that we don’t have to.
By operad I’ll always mean a symmetric topological operad. Such a thing has an underlying ‘symmetric collection’, which in turn has an underlying ‘collection’. A collection is just a sequence of topological spaces $O_n$ for $n \ge 0$. In a symmetric collection, each space $O_n$ has an action of the symmetric group $S_n$.
I’m hoping that someone has already proved something like this:
Conjecture 1. The free operad on the terminal symmetric collection has, as its space of $n$-ary operations, the set of rooted trees having some of their leaves labelled $\{1, \dots, n\}$.
Conjecture 2. The free operad on the terminal collection has, as its space of $n$-ary operations, the set of planar rooted trees having some of their leaves labelled $\{1, \dots, n\}$.
Calling them ‘conjectures’ makes it sound like they might be hard — but they’re not supposed to be hard. If they’re right, they should be easy and in some sense well-known! But there are various slightly different concepts of ‘rooted tree’ and ‘rooted planar tree’, and we have to get the details right to make these conjectures true. For example, a graph theorist might draw a rooted planar tree like this:
while an operad theorist might draw it like this:
If the conjectures are right, we can use them to define the concepts of ‘rooted tree’ and ‘rooted planar tree’, thus side-stepping these details. And having purely operad-theoretic definitions of ‘tree’ and ‘rooted tree’ would make it a lot easier to use these concepts in operad theory. That’s what I want to do, ultimately. But proving these conjectures, and of course providing the precise definitions of ‘rooted tree’ and ‘rooted planar tree’ that make them true, would still be very nice.
And it would be even nicer if someone had already done this. So please provide references… and/or correct mistakes in the following stuff!
Rooted Trees
Definition. For any natural number $n = 0, 1, 2, \dots$, an $n$-tree is a quadruple $T=(V,E,s,t)$ where:
- $V$ is a finite set whose elements are called internal vertices;
- $E$ is a finite non-empty set whose elements are called edges;
- $s: E \to V\sqcup \{1,\dots, n\}$ and $t: E \to V \sqcup\{0\}$ are maps sending any edge to its source and target, respectively.
Given $u,v\in V \sqcup\{0\} \sqcup \{1,\dots, n\}$, we write $u \stackrel{e}{\longrightarrow} v$ if $e\in E$ is an edge such that $s(e)=u$ and $t(e)=v$.
This data is required to satisfy the following conditions:
- $s : E \to V\sqcup \{1,\dots, n\}$ is a bijection;
- there exists exactly one $e\in E$ such that $t(e)=0$;
- for any $v \in V\sqcup\{1,\dots , n\}$ there exists a directed edge path from $v$ to $0$: that is, a sequence of edges $e_0, \dots, e_n$ and vertices $v_1 , \dots , v_n$ such that $v \stackrel{e_0}{\longrightarrow} v_1 , \; v_1 \stackrel{e_1}{\longrightarrow} v_2, \; \dots, \; v_n \stackrel{e_n}{\longrightarrow} 0 .$
So the idea is that our tree has $V \sqcup\{0\} \sqcup \{1,\dots, n\}$ as its set of vertices. There could be lots of leaves, but we’ve labelled some of them by numbers $1, \dots, n$. In our pictures, the source of each edge is at its top, while the target is at bottom.
There is a root, called $0$, but also a ‘pre-root’: the unique vertex with an edge going from it to $0$. I’m not sure I like this last bit, and we might be able to eliminate this redundancy, but it’s built into the operad theorist’s picture here:
It might be a purely esthetic issue. Like everything else, it gets a bit more scary when we consider degenerate special cases.
I’m hoping there’s an operad $Tree$ whose set of $n$-ary operations, $Tree_n$, consists of isomorphism classes of $n$-trees as defined above. I’m hoping someone has already proved this. And I hope someone has characterized this operad $Tree$ in a more algebraic way, as follows.
Definition. A symmetric collection $C$ consists of topological spaces $\{C_n\}_{n \ge 0}$ together with a continuous action of the symmetric group $S_n$ on each space $C_n$. A morphism of symmetric collections $f : C \to C'$ consists of an $S_n$-equivariant continuous map $f_n : C_n \to C'_n$ for each $n \ge 0$. Symmetric collections and morphisms between them form a category ${STop}$.
(More concisely, if we denote the groupoid of sets of the $\{1, \dots, n\}$ and bijections between these as $S$, then $STop$ is the category of functors from $S$ to $Top$.)
There is a forgetful functor from operads to symmetric collections
$U : Op \to STop$
with a left adjoint
$F: STop \to Op$
assigning to each symmetric collection the operad freely generated by it.
Definition. Let $Comm$ be the terminal operad: that is, the operad, unique up to isomorphism, such that $Comm_n$ is a 1-element set for each $n \ge 0$.
The algebras of $\Comm$ are commutative topological monoids.
Conjecture 1. There is a unique isomorphism of operads
$\phi : F (U (Comm)) \stackrel{\sim}{\longrightarrow} Tree$
that for each $n \ge 0$ sends the unique $n$-ary operation in $Comm$ to the corolla with $n$ leaves: that is, the isomorphism class of $n$-trees with no internal vertices.
(When I say “the unique $n$-ary operation in $Comm$”, but treating it as an operation in $F(U(Comm))$, I’m using the fact that the unique operation in $\Comm_n$ gives an element in $U(\Comm)_n$, and thus an operation in $F(U(\Comm))_n$.)
Planar Rooted Trees
And there should be a similar result relating planar rooted trees to collections without symmetric group actions!
Definition. A planar $n$-tree is an $n$-tree in which each internal vertex $v$ is equipped with a linear order on the set of its children, i.e. the set $t^{-1}(v)$.
I’m hoping someone has constructed an operad $PTree$ whose set of $n$-ary operations, $PTree_n$, consists of isomorphism classes of planar $n$-trees. And I hope someone has characterized this operad $PTree$ as follows:
Definition. A collection $C$ consists of topological spaces $\{C_n\}_{n \ge 0}$. A morphism of collections $f :C \to C'$ consists of a continuous map $f_n : C_n \to C'_n$ for each $n \ge 0$. Collections and morphisms between them form a category $\mathbb{N}Top$.
(If we denote the category with natural numbers as objects and only identity morphisms between as $\mathbb{N}$, then $\mathbb{N}Top$ is the category of functors from $\mathbb{N}$ to $Top$.)
There is a forgetful functor
$\Upsilon : Op \to \mathbb{N}Top$
with a left adjoint
$\Phi : \mathbb{N}Top \to Op$
assigning to each collection the operad freely generated by it. This left adjoint is the composite
$\mathbb{N}Top \stackrel{G}{\longrightarrow} STop \stackrel{F}{\longrightarrow} Op$
where the first functor freely creates a symmetric collection $G(C)$ from a collection $C$ by setting $G(C)_n = S_n \times C_n$, and the second freely generates an operad from a symmetric collection, as described above.
Conjecture 2. There is a unique isomorphism of operads
$\psi : \Phi(\Upsilon (Comm)) \stackrel{\sim}{\longrightarrow} PTree$
that for each $n \ge 0$ sends the unique $n$-ary operation in $Comm$ to the corolla with $n$ leaves ordered so that $1 < \cdots < n$.
Have you seen a proof of this stuff???
Guest post by Sean Moss
In this post I shall discuss the paper “On a Topological Topos” by Peter Johnstone. The basic problem is that algebraic topology needs a “convenient category of spaces” in which to work: the category $\mathcal{T}$ of topological spaces has few good categorical properties beyond having all small limits and colimits. Ideally we would like a subcategory, containing most spaces of interest, which is at least cartesian closed, so that there is a useful notion of function space for any pair of objects. A popular choice for a “convenient category” is the full subcategory of $\mathcal{T}$ consisting of compactly-generated spaces. Another approach is to weaken the notion of topological space, i.e. to embed $\mathcal{T}$ into a larger category, hopefully with better categorical properties.
A topos is a category with enough good properties (including cartesian closedness) that it acts like the category of sets. Thus a topos acts like a mathematical universe with ‘sets’, ‘functions’ and its own internal logic for manipulating them. It is exciting to think that if a “convenient topos of spaces” could be found, then its logical aspects could be applied to the study of its objects. The set-like nature of toposes might make it seem unlikely that this can happen. For instance, every topos is balanced, but the category of topological spaces is famously not. However, sheaves (the objects in Grothendieck toposes) originate from geometry and already behave somewhat like generalized spaces.
I shall begin by elaborating on this observation about Grothendieck toposes, and briefly examine some previous attempts at a “topological topos”. I shall explore the idea of replacing open sets with convergent sequences and see how this leads us to Johnstone’s topos. Finally I shall describe how Johnstone’s topos is a good setting for homotopy theory.
I would to thank Emily Riehl for organizing the Kan Extension Seminar and for much useful feedback. Thanks go also to the other seminar participants for being a constant source of interesting perspectives, and to Peter Johnstone, for his advice and for writing this incredible paper.
Are there toposes of spaces?
We shall need to be flexible about what we mean by “space”. For the rest of this post I shall try to use the term “topological space” in a strict technical sense (a set of points plus specified open sets), whereas “space” will be a nebulous concept. The idea is that spaces have existence regardless of having been implemented as a topological space or not, and may naturally have more (or perhaps less) structure. Topology merely forms one setting for their rigorous study. In topology we can only detect the topological properties of spaces. For example, $\mathbb{R}$ and $(0,1)$ are isomorphic as topological spaces, but they are far from being the same space: consider how different their implementations as, say, metric spaces are. Some spaces are naturally considered as having algebraic or smooth structure. The type of question one wishes to ask about a space will bear upon the type of object as which it should be implemented.
An extremely important class of toposes consists of the Grothendieck toposes, which are categories of sheaves on a site. A site is a small category together with a Grothendieck coverage (also known as a Grothendieck topology). Informally, the Grothendieck coverage tells us how some objects can be “covered” by maps coming out of other objects. In the special case where the site is a topological space, the objects are open sets and the coverage tells us that an open set is covered by the inclusions of any family of open sets whose union is all of that open set. A sheaf on a site is then a contravariant $\mathrm{Set}$-valued functor on the underlying category (a presheaf) which satisfies a “unique patching” condition with respect to each covering sieve.
In the following two senses, a Grothendieck topos always behaves like a category of spaces:
(A) One way to describe the properties of a space is to consider the maps into that space. This is the idea behind the homotopy groups, where we consider (homotopy classes of) maps from the $n$-sphere into a space. Given a small category $\mathcal{C}$, each object is determined by knowing all the arrows into it and how these arrows “restrict” along other arrows, i.e. precisely the data of the representable presheaf. A non-representable presheaf can be viewed as a generalized object of $\mathcal{C}$, which is testable by the ‘classical’ objects of $\mathcal{C}$: it is described entirely by what the maps into it from objects of $\mathcal{C}$ ought to be. If we have in mind that $\mathcal{C}$ is some category of spaces, with some sense in which some spaces are covered by families of maps out of other spaces, (i.e. we have a Grothendieck coverage), then we should be able to patch maps into these generalized spaces together. So the topos of sheaves on this site is a setting in which we may be able to implement certain spaces, if we wish to study their properties testable by objects of $\mathcal{C}$.
(B) The category of presheaves on a small category $\mathcal{C}$ is its free cocompletion. Intuitively, it is the category of objects obtained by formally gluing together objects of $\mathcal{C}$. The use of the word “gluing” is itself a spatial metaphor. CW-complexes are built out of gluing together cells - simplicial sets are instructions for carrying out this gluing. Manifolds are built from gluing together open subsets of Euclidean space. Purely formal ‘gluing’ is not quite sufficient: the Yoneda embedding of $\mathcal{C}$ into its presheaves typically does not preserve any colimits already in $\mathcal{C}$. But if $\mathcal{C}$ is a category of spaces, its objects are not neutral with respect to each other: there may be a suitable Grothendieck coverage on $\mathcal{C}$ which tells us how some objects can cover others. The topos of sheaves is then the category of objects obtained by formally gluing objects of $\mathcal{C}$ in a way that respects these coverings. This is strongly connected with the preservation of colimits by the embedding of $\mathcal{C}$ into the sheaves. Colimits in the presheaf topos are constructed pointwise; to get the sheaf colimit one applies the reflection into the category of sheaves (“sheafification”) to the presheaf colimit. The more covers imposed on $\mathcal{C}$, the more work is done by the sheafification, so the closer we end up to the original colimit.
Are there toposes in topology?
It is far from clear that we can choose a site for which the space-like behaviour of sheaves accords with the usual topological intuition. If we want to use a topological topos for homotopy theory, then ideally it should contain objects that we can recognize as the CW-complexes, and we should be able to construct them via more or less the usual colimits.
Attempt 1: The “gros topos” of Giraud
The idea is to take sheaves on the ‘site’ of topological spaces, where covers are given by families of open inclusions of subspaces whose union is the whole space. We do not automatically get a topos unless the site is small, so instead take some small, full subcategory $\mathcal{C}$ of $\mathcal{T}$, which is closed under open subspaces. The gros topos is the topos of sheaves for this site.
The Yoneda embedding exhibits $\mathcal{C}$ as a subcategory, and in fact we can ‘embed’ $\mathcal{T}$ via the functor $X \mapsto \hom_\mathcal{T}(-,X)$, this will be full and faithful on a fairly large subcategory. By (B) one may like to consider the gros topos as the category of spaces glued together from objects of $\mathcal{C}$. This turns out not to be useful, since the site does not have enough covers for colimits to agree with those in $\mathcal{T}$. Moreover the site is so large that calculations are difficult.
Attempt 2: Lawvere’s topos
We use observation (A). Motivated by the use of paths in homotopy theory, we take $M$ to be the full subcategory of $\mathcal{T}$ whose only object is the closed unit interval $I$. So $M$ is the monoid of continuous endomorphisms of $I$. Lawvere’s topos $\mathcal{L}$ is the topos of sheaves on $M$ with respect to the canonical Grothendieck coverage (the largest Grothendieck coverage on $M$ for which $\hom_M(-,I)$ is a sheaf).
Then an object $X$ of $\mathcal{L}$ is a set $X(I)$ of paths, together with, for any continuous $\gamma\colon I \to I$, a reparametrization map $X(\gamma) \colon X(I) \to X(I)$, where this assignment is functorial. The points of such a space are given by natural transformations $1 \to X$, i.e. ‘constant paths’ or paths which are fixed by every reparametrization. We can see which point a path visits at time $t$ by reparametrizing that path by the constant map $I \to I$ with value $t$. A word of caution: a given object in $\mathcal{L}$ may have distinct paths which agree on points for all time.
This site is much easier to calculate with than the gros site (once we have a handle on the canonical coverage). Again there is a functor $P \colon \mathcal{T} \to \mathcal{L}$ given by $X \mapsto \hom_\mathcal{T}(I,X)$, which is full and faithful on a fairly large subcategory (including CW-complexes). However, it is still the case that the site could do with more covers: the functor $P$ does not preserve all the colimits used to build up CW-complexes. By observation (B), an object of $\mathcal{L}$ is obtained by gluing together copies of the unit interval $I$, so it is possible to construct the circle $S^1$ out of copies of $I$, but we cannot do this in the usual way. The coequalizer of $I$ by its endpoints in $\mathcal{L}$ is not $S^1$, but a “signet-ring”: it is a circle with a ‘lump’, through which a path can cross only if waits there for non-zero time. We cannot solve this problem by adding in more covers, because the coverage is already canonical (adding in more covers will evict the representable $\hom_\mathcal{T}(-,I)$ from the topos).
The key idea in Johnstone’s topos is to replace paths with convergent sequences. Given a topological space $X$, a convergent sequence in $X$ is a function from $a \colon \mathbb{N}\cup\{\infty\} \to X$ such that whenever $U \subseteq X$ is an open set containing $a_\infty$, then there exists an $N$ such that $a_n \in U$ for all $n \gt N$. The convergent sequences are precisely the continuous maps out of $\mathbb{N}\cup\{\infty\}$ when we give it the topology that makes it the one-point compactification of the discrete space $\mathbb{N}$ - we denote this topological space by $\mathbb{N}^+$.
Convergent sequences as primitive
It is a basic theorem in general topology that, given a function $f\colon X \to Y$ between topological spaces, if it is continuous then it preserves convergent sequences. The converse is not true for general topological spaces, but it is true whenever $Y$ is a sequential space. Given a topological space $X$, A set $U \subseteq X$ is sequentially open if for any convergent sequence $(a_n)$, with $a_\infty \in U$, $(a_n)$ is eventually in $U$. (Clearly any open subset is sequentially open.) A topological space is then said to be sequential if all of its sequentially open sets are open. The sequential spaces include all first-countable spaces and in fact they can be characterized as the topological quotients of metrizable spaces, so they certainly include all CW-complexes.
The notion of convergent sequence is arguably more intuitive than that of open set. For example, each convergent sequence gives you concrete data about the nearness of some family of points to another point, whereas open sets only give you such data when the topology (or at least a neighbourhood basis) is considered as a whole. It would be compelling to define a continuous function as one that preserves convergent sequences. This motivates the study of subsequential spaces.
A subsequential space consists of a set $X$ (of points) and family of “convergent sequences”: a specified subset of the set of functions $\mathbb{N}\cup\{\infty\} \to X$, such that:
- for every point $x \in X$, the constant sequence $(x)$ converges to $x$;
- if $(x_n)$ converges to $x$, then so does every subsequence of $(x_n)$;
- if $(x_n)$ is a sequence and $x$ is a point such that every subsequence of $(x_n)$ contains a (sub)subsequence converging to $x$, then $(x_n)$ converges to $x$.
The third axiom is the general form of intertwining two or more sequences with the same limit or changing a finite initial segment of a sequence. Note that there is no ‘Hausdorff‘-style condition on the convergent sequences: a sequence may converge to more than one limit. A continuous map between subsequential spaces $X \to Y$ is a function from the points of $X$ to the points of $Y$ that preserves convergence of sequences.
The axioms above are all true of the set of convergent sequences which arise from a topology on a set. In fact, this process gives a full and faithful embedding of sequential spaces into subsequential spaces. Thus sequential spaces live inside both topological and subsequential spaces. They are coreflective in the former and reflective in the latter: given a topological space $X$, its sequentially open sets constitute a new (finer) topology; given a subsequential space $Y$, we can consider the “sequentially open” sets with respect to its convergent sequences, and then take all convergent sequences in the resulting (sequential) topological space. Observe that the sense of the adjunction in each case comes from the fact that we either throw in more open sets - so there is a natural map $(X)_\text{seq} \to X$, or throw in more convergent sequences - so there is a natural map $Y \to (Y)_\text{seq}$.
In the following I shall denote the category of sequential spaces by $\mathcal{F}$ and that of subsequential spaces by $\mathcal{F}'$.
Johnstone’s topos
Let $\Sigma$ be the full subcategory of $\mathcal{T}$ on the objects $1$ (the singleton space) and $\mathbb{N}^+$ (the one-point compactification of the discrete space of natural numbers). The arrows in this category can be described without topology as well: as functions, the maps $\mathbb{N}^+ \to \mathbb{N}^+$ are the eventually constant ones and the ones that “tend to infinity”.
Given an infinite subset $T \subseteq \mathbb{N}$, let $f_T$ denote the unique order-preserving injection $\mathbb{N}^+ \to \mathbb{N}^+$ whose image is $T \cup \{\infty\}$. One can check that there is a Grothendieck coverage $J$ on $\Sigma$ where $1$ is covered by only the maximal sieve, and where $\mathbb{N}^+$ is covered by any sieve $R$ such that:
- $R$ contains all of the points $n \colon 1 \to \mathbb{N}^+$, $n \in \mathbb{N}^+$.
- For any infinite subset $T \subseteq \mathbb{N}$ there exists an infinite subset $T' \subseteq T$ such that $f_{T'} \in R$.
The topos $\mathcal{E}$ is then defined to be $\mathrm{Sh}(\Sigma,J)$.
The objects in our topos are a slight generalization of subsequential space. If $X \in \mathcal{E}$, then $X(1)$ is its set of points, and $X(\mathbb{N}^+)$ is its set of convergent sequences. Each point $n \colon 1 \to \mathbb{N}^+$ induces a ‘projection map’ $X(n)\colon X(\mathbb{N}^+) \to X(1)$, giving you the point of the sequence at time $n$. The unique map $\mathbb{N}^+ \to 1$ induces a map $X(1) \to X(\mathbb{N}^+)$, which sends each point to a canonical choice of constant sequence. Note that there may be more than one convergent sequence with the same points, thus it may be helpful to think of $X(\mathbb{N}^+)$ as the set of proofs of convergence for sequences.
Clearly we can embed $\mathcal{F}'$, the subsequential spaces, into $\mathcal{E}$: the points are the same, and the convergence proofs are just the convergent sequences. The first two axioms are satisfied because of the equations that hold in $\Sigma$. The third axiom is encoded into the coverage. Conversely, any object $X$ of $\mathcal{E}$ for which the projection maps $X(n)\colon X(\mathbb{N}^+) \to X(1)$, $n \in \mathbb{N}^+$ are jointly injective is isomorphic to one coming from a subsequential space. There is a functor $H \colon \mathcal{T} \to \mathcal{E}$ sending $X \mapsto \hom_\mathcal{T}(-,X)$, and it is indeed sheaf-valued since it is equal to the composite of the coreflection $\mathcal{T} \to \mathcal{F}$ with the inclusions $\mathcal{F} \to \mathcal{F}' \to \mathcal{E}$. In fact, the Grothendieck coverage defining $\mathcal{E}$ is canonical, so it is the largest for which this functor is well-defined.
We can use observation (B) to think of $\mathcal{E}$ as all spaces constructed from gluing sequences together. It is just about possible that we could have motivated the construction of $\mathcal{E}$ this way: classically, any sequential space $X$ is the quotient in $\mathcal{T}$ of a metrizable space, which may be taken to be a disjoint union of copies of $\mathbb{N}^+$ - one for every convergent sequence in $X$. Compare this with the canonical representation of a presheaf as a colimit of representables (one for each of its elements).
Colimits
It turns out that $\mathcal{F}'$ is the subcategory of $\neg\neg$-separated objects in $\mathcal{E}$, hence it is a reflective subcategory. $\mathcal{F}$ is reflective in $\mathcal{F}'$, hence it is also reflective in $\mathcal{E}$. In particular, all limits in $\mathcal{F}$ are preserved by the inclusion into $\mathcal{E}$. Take some caution, however, since products do not agree with those in $\mathcal{T}$: one has to take the sequential coreflection of the topological product. This is only a minor issue; having to modify the product arises in other “convenient categories” such as compactly-generated spaces.
The colimits in $\mathcal{F}$ do agree with those in $\mathcal{T}$ because it is a coreflective subcategory. Surprisingly, the inclusion $\mathcal{F} \to \mathcal{E}$ preserves many of these colimits.
Theorem Let $X$ be a sequential space, and $\{U_\alpha \mid \alpha \in A\}$ an open cover of $X$. Then the obvious colimit diagram in $\mathcal{F}$: $\begin{matrix} U_\alpha\cap U_\beta & \rightarrow & U_\alpha & & \\ & \searrow & & \searrow & \\ U_\beta \cap U_\gamma & \rightarrow & U_\beta & \rightarrow & X \\ \vdots & \searrow & & \nearrow & \\ & & U_\gamma & & \\ & & \vdots & & \end{matrix}$ is preserved by the embedding $\mathcal{F} \to \mathcal{E}$.
Proof The recipe for this sort of theorem is: take the colimit in presheaves, show that the comparison map is monic, then show that it is $J$-dense, for then it will exhibit $X$ as the colimit upon reflecting into the topos $\mathcal{E}$. The colimit $L$ in presheaves is calculated “objectwise”, so $L$ has the same points of $X$, but only those convergent sequences which are entirely within some $U_\alpha$ (hence the comparison map $L \to X$ is monic). To sheafify, we need to add in all those sequences $x \in X(\mathbb{N}^+)$ which are locally in $L$, i.e. for which the sieve $\{f \colon ? \to \mathbb{N}^+ \mid X(f)(x) \in L(?) \}$ in $\Sigma$ is $J$-covering. For any $x \in X(\mathbb{N}^+)$, this sieve clearly contains all the points $1 \to \mathbb{N}^+$. But $x$ must also be eventually within one of the $U_\alpha$, so the second condition for the covering sieves is also satisfied. $\square$
There are several other colimit preservation results one can talk about (with similar proofs to the above). The amazing consequence of these is that the colimits used to construct CW-complexes are all preserved by the embedding $\mathcal{F} \to \mathcal{E}$. Thus classical homotopy theory embeds into $\mathcal{E}$ and we have successfully found a topos of spaces which agrees with the classical theory.
Geometric realization
Let $\Delta$ be the category of non-zero finite ordinals and order-preserving maps. Then objects of the presheaf category $[\Delta^\mathrm{op},\mathrm{Set}]$ are known as simplicial sets.
Theorem $[\Delta^\mathrm{op},\mathrm{Set}]$ is the classifying topos for intervals in $\mathrm{Set}$-toposes.
The closed unit interval $[0,1]$ is sequential and is in fact an interval (a totally ordered object with distinct top and bottom elements). Thus it corresponds to a geometric morphism $\mathcal{E} \to [\Delta^\mathrm{op},\mathrm{Set}]$ (an adjunction $(f^\star \dashv f_\star)$ with $f^\star$ left-exact).
Theorem If $S \in [\Delta^\mathrm{op},\mathrm{Set}]$ is a simplicial set, then $f^\star(S)$ is its geometric realization, considered as a sequential space and hence as an object of $\mathcal{E}$. If $X \in \mathcal{E}$ is a sequential space, then $f_\star(E)$ is its singular complex.
The usual geometric realization is not left-exact if considered to take values in $\mathcal{T}$, one must choose a “convenient subcategory” first, and then there is some work to do in proving it. Here the left-exactness just arises out of the general theory of geometric morphisms. Should we wish to do so, the above method allows us to replace $[0,1]$ with any other object that the internal logic of $\mathcal{E}$ sees as an interval to get a different realization of simplicial sets.
The above is far from a complete survey of “On a Topological Topos”, which contains several more results of interest relating to $\mathcal{E}$ and captures the elegance of using the site $\Sigma$ for calculation - I thoroughly recommend taking a look if you know some topos theory. We have seen enough though to understand that for many spaces the sequential properties align with the topological properties. Unfortunately, $\mathcal{E}$ is yet to receive the attention it deserves.
Guest post by Nils Carqueville and Daniel Murfet
(My university probably isn’t alone in encouraging mathematicians and computer scientists to embrace the idea of “big data”, or in more sober terminology, “data science”. Here, Nils Carqueville and Daniel Murfet introduce their really excellent article on big data in whole-population surveillance. —TL)
In recent years Big Data has become an increasingly relevant topic in the economic sector, for intelligence agencies, and for the sciences.
A particularly far-reaching development made possible by Big Data is that of unprecedented mass surveillance. As Alexander Beilinson, Stefan Forcey, Tom Leinster and others have pointed out, the role that mathematicians and computer scientists play here is central. With this in mind last January we wrote an essay on
stressing some of those aspects of the matter that we think deserve more attention or additional elaboration. We hope this to be a useful contribution to the necessary discussion on modern mass surveillance, and we thank Tom for his efforts in this direction, and for allowing us to post here.
Standard logic involving the truth values ‘true’ and ‘false’ can make it difficult to model some of the fuzziness we use in everyday speech. If you’d bought a bike yesterday then today it would be truthful to say “This bike is new”, but it wouldn’t be truthful so say it in 20 years’ time. However, between now and then there won’t be a specific day on which the statement “This bike is new” suddenly switches from being true to being false. How can you model this situation?
One approach to modelling this situation is with fuzzy logic where you allow your truth values to be things other than just true and false. For instance, you can take the interval $[0,1]$ as the set of truth values with $0$ representing false and $1$ representing true. So the truth degree of the statement “This bike is new” would vary, being $1$ today and decreasing to something very close to $0$ in 20 years’ time.
This post is an attempt by me to understand this fuzzy logic in the context of enriched category theory, in particular, using $[0,1]$ as a monoidal category to enrich over. We will see that categories enriched over $[0,1]$ can be interpreted as fuzzy posets or fuzzy preorders.
This was going to be a comment on Tom Avery’s Kan Extension Seminar post on Metric Spaces, Generalized Logic, and Closed Categories but grew too big!
I have been trying to understand a bit of fuzzy logic because it feeds into fuzzy concept analysis which I’m also trying to understand. As ever, I understand the category theory better than the logic. In terms of references, I’ve picked up bits and pieces from Belohlavek’s Concept lattices and order in fuzzy logic and Hájek’s Metamathematics of fuzzy logic.
Truth degrees in $[0,1]$: fuzzy logic
As mentioned above you can replace your set of truth values $\{ \mathrm{false},\mathrm{true}\}$ by the interval $[0,1]$, with $0$ representing completely false and $1$ representing completely true. Thus an element of $[0,1]$ represents a ‘degree’ of truth. The idea is that this represents the degree of truthfulness of statements like “George is old” or “The film is good”: it is not supposed to represent than the probabitlity of the truth of a statement like “I will receive an email from a student tomorrow”. We are modelling vagueness and not probabilty. Also, we are supposed to be modeling ojectivity so the truth degree of the statement “George is old” will depend on the age of George and not on anyone’s opinion of how old he seems to be.
Being category theorists, we might try to fit this into Lawvere’s framework of generalized logic, where we think of our truth values as forming a closed monoidal category. We can make $[0,1]$ it into a poset — and hence a category — by using the order $\le$, this will be our notion of entailment. So if $P$ and $Q$ are statements then $P\vdash Q$ precisely when the truth degree of $P$ is less than or equal to the truth degree of $Q$. However, I’m not entirely sure how to interpret that.
The observant amongst you will have noticed that this poset $[0,1]$ is isomorphic to the poset we use for metric spaces $([0,\infty ],\ge )$, via the maps $x\mapsto \exp (-x)$ and $a\mapsto -\ln (a)$. So an alternative interpretation of these truth values will be as ‘proximities’ — proximity approximately $0$ meaning not at all close, and proximity approximately $1$ meaning really, really close. [Switching from distances in $[0,\infty ]$ to proximities in $[0,1]$ is an important step in calculating magnitudes of metric spaces, but I won’t say any more about that.]
Next we want some notion of ‘conjunction’ $\otimes$ and ‘implication’ $\Rightarrow$. In category theory terms we want a closed monoidal structure on the category $[0,1]$. There are at least three such structures that are well studied.
The product structure: Here $a\otimes b \coloneqq a\cdot b$ and $a\Rightarrow b \coloneqq 1$ if $a\le b$ and $b/a$ otherwise. Via the exponential and logarithm maps this corresponds to the usual closed monoidal structure on $[0,\infty ]$ of addition and truncated subtraction.
The Gödel structure: Here $a\otimes b \coloneqq \min (a,b)$ and $a\Rightarrow b \coloneqq 1$ if $a\le b$ and $b$ otherwise. Via the exponential and logarithm maps this corresponds to the closed monoidal structure on $[0,\infty ]$ which gives rise to ultrametric spaces.
The Łukasiewicz structure: Here $a\otimes b \coloneqq \max (a+b-1,0)$ and $a\Rightarrow b\coloneqq \min (1-a+b,1)$. Via the exponential and logarithm maps this corresponds to the monoidal structure $x\otimes y\coloneqq -\ln (\max (e^{-x}+e^{-y}-1,0))$ and that doesn’t look at all familiar!
If the truth degrees of two statements are $P$ and $Q$ then the truth degree of both statements together should be given by the conjunction $P\otimes Q$, but I don’t know examples of real world models with these three different logics. Anybody? I would be very interested in hearing thoughts and ideas on this.
In all of the above three cases the closed monoidal category $\{ \mathrm{false},\mathrm{true}\}$ of classical truth values embeds closed monoidally.
Enriching over $[0,1]$: fuzzy preorders
We can now think about what categories enriched in $([0,1],\le )$ are, as generalizations of preorders. Such an enriched category consists of a set $C$ and to each pair of elements $c,c'\in C$ we associate a truth degree $C(c,c')\in [0,1]$, we want to think of this as the degree to which $c$ is ordered before $c'$. This is what we will call a fuzzy preorder. It will satisfy fuzzy transitivity: $C(c',c'')\otimes C(c,c')\le C(c,c'')$. How this is actually interpreted will depend on which product $\otimes$ we took.
Here’s an example of a fuzzy preorder. Suppose that Bart is aged 9, Marge is aged 39 and Homer is aged 40. Is it true that Homer is (at least) as old as Marge? Yes. Is it true that Marge is (at least) as old as Homer? Well, nearly. Is it true that Bart is (at least) as old as Marge? Well, not really at all. We can formalise this into a fuzzy preorder $\succeq$ “is at least as old as” where $\begin{aligned} (\text {Homer}\succeq \text {Marge})&=1 \\ (\text {Marge}\succeq \text {Homer})&=\text {a tiny bit less than}\, \, 1\\ (\text {Bart}\succeq \text {Marge})&=\text {much less than}\, \, 1. \end{aligned}$ Algebraically we could define $A\succeq B\coloneqq \begin{cases} 1 &\text {if}\, \, \mathrm{age}(A)\ge \mathrm{age}(B)\\ \mathrm{age}(A)/\mathrm{age}(B)&\text {if}\, \, \mathrm{age}(A)\lt \mathrm{age}(B) . \end{cases}$ This gives a fuzzy preorder on the set $(\text {Bart}, \text {Marge}, \text {Homer})$ with respect the product monoidal structure $\cdot$ on $[0,1]$, or in other words, it gives a category enriched over the monoidal category $([0,1],\cdot ,1)$.
There should be some more convincing examples, but I haven’t found any yet.
An exercise at this point is to look at the dictionary in my post on Galois correspondences and enriched categories and extend it by another column for categories enriched over $[0,1]$.
Term is nearly over, which for me means the end of the 4th year Fourier Analysis course I’ve been teaching for the last couple of years.
I was fortunate enough to take over the course from Jim Wright, a genuine expert on the subject, and I inherited a great set of notes from him. But I felt the need to make the course my own, so I’ve been writing my own notes, which I’ve just finished: notes here, plus accompanying problem sheets. They’re mostly about convergence of Fourier series, with a delicious dessert of Fourier analysis on finite abelian groups.
But what I wanted to write about here — and get your opinions on — was not Fourier analysis, but some questions of teaching. This year, I’ve been (in the jargon) “flipping the classroom”, or at least partially flipping it (which reminds me of that mysterious substance, partially inverted sugar syrup, that you sometimes see on ingredients lists). I’d like to hear about other people’s similar experiences.
The allotted class time is two 50-minute “lectures” a week, plus one “workshop” (Edinburgh lingo for tutorial or exercise class) a fortnight. Here’s the routine for lectures:
At least a week before a given lecture, I make Latexed notes for that lecture available online.
The students are committed to spending at least an hour before each lecture reading over the notes.
I spend the first half of each lecture giving an overview of that portion of the notes, on the firm assumption that the students have done the prescribed amount of reading beforehand. Thus, I can spend time talking about the big picture rather than the mechanics, and I can concentrate on parts that seem likely to cause most difficulty rather than having to go through everything.
The second half of the lecture is interactive. We do different activities each time, e.g.:
- solving exercises (individually, in small groups, or all together)
- question and answer sessions
- definitions quizzes
- working on mathematical writing
- identifying hard parts of the course and having students explain them to each other.
In a completely flipped classroom, all the time would be taken up with interactive work. The first half of each class isn’t too far from a traditional lecture, except for the data-projected notes and the prior reading by the students. That’s why I said it’s only partially flipped.
It’s the first time I’ve done things quite like this, so this year has been pretty experimental. In particular, before every lecture, I’ve had to come up with an idea for the second half of the class, and evidently some have been more successful than others. I’m also running out of ideas! It’s a good thing it’s the end of term.
Have you ever taught in a similar way? If so, what worked well, and what didn’t? Can you share some ideas for classroom activities?
I’ve just submitted a piece for the new Opinions section of the monthly LMS Newsletter: Should mathematicians cooperate with GCHQ? (Update: now available (p.34).) The LMS is the London Mathematical Society, which is the UK’s national mathematical society. My piece should appear in the April edition of the newsletter, and you can read it below.
Here’s the story. Since November, I’ve been corresponding with people at the LMS, trying to find out what connections there are between it and GCHQ. Getting the answer took nearly three months and a fair bit of pushing. In the process, I made some criticisms of the LMS’s total silence over the GCHQ/NSA scandal:
GCHQ is a major employer of mathematicians in the UK. The NSA is said to be the largest employer of mathematicians in the world. If there had been a major scandal at the heart of the largest publishing houses in the world, unfolding constantly over the last eight months, wouldn’t you expect it to feature prominently in every issue of the Society of Publishers’ newsletter?
To its credit, the LMS responded by inviting me to write an inaugural piece for a new Opinions section of the newsletter. Here it is.
I had a 500-word limit, so I omitted a lot. Here are the facts on the LMS’s links with GCHQ, as stated to me by the LMS President Terry Lyons:Should mathematicians cooperate with GCHQ?
Tom Leinster
One of the UK’s largest employers of mathematicians has been embroiled in a major international scandal for the last nine months, stands accused of law-breaking on an industrial scale, and is now the object of widespread outrage. How has the mathematical community responded? Largely by ignoring it.
GCHQ and its partners have been systematically monitoring as much of our lives as they possibly can, including our emails, phone calls, text messages, bank transactions, web browsing, Skype calls, and physical location. The goal: “collect it all”. They tap internet trunk cables, bug charities and political leaders, disrupt lawful activist groups, and conduct economic espionage, all under the banner of national security.
Perhaps most pertinently to mathematicians, the NSA (GCHQ’s major partner and partial funder) has deliberately undermined internet encryption, inserting a secret back door into a standard elliptic curve algorithm. This can be exploited by anyone sufficiently skilled and malicious — not only the NSA/GCHQ. (See Thomas Hales’s piece in February’s Notices of the AMS.) We may never know what else mathematicians have been complicit in; GCHQ’s policy is not to comment on intelligence matters, which is to say, anything it does.
Indifference to mass surveillance rests partly on misconceptions such as “it’s only metadata”. This is certainly false; for instance, GCHQ has used webcams to collect images, many sexually intimate, of millions of ordinary citizens. It is also misguided, even according to the NSA’s former legal counsel: “metadata absolutely tells you everything about somebody’s life”.
Some claim to be unbothered by the recording of their daily activities, confident that no one will examine their records. They may be right. But even if you feel that way, do you want the secret services to possess such a powerful tool for chilling dissent, activism, and even journalism? Do you trust an organization operating in secret, and subject to only “light oversight” (a GCHQ lawyer’s words), never to abuse that power?
Mathematicians seldom have to face ethical questions. But now we must decide: cooperate with GCHQ or not? It has been suggested that mathematicians today are in the same position as nuclear physicists in the 1940s. However, the physicists knew they were building a bomb, whereas mathematicians working for GCHQ may have little idea how their work will be used. Colleagues who have helped GCHQ in the past, trusting that they were contributing to legitimate national security, may justifiably feel betrayed.
At a bare minimum, we as a community should talk about it. Sasha Beilinson has proposed that working for the NSA/GCHQ should be made “socially unacceptable”. Not everyone will agree, but it reminds us that we have both individual choice and collective power. Individuals can withdraw their labour from GCHQ. Heads of department can refuse staff leave to work for GCHQ. The LMS can refuse GCHQ’s money.
At a bare minimum, let us acknowledge that the choices are ours to make. We are human beings before we are mathematicians, and if we do not like what GCHQ is doing, we do not have to cooperate.
The Society has an indirect relationship with GCHQ via a funding agreement with the Heilbronn Institute, in which the Institute will give up to £20,000 per year to the Society. This is approximately 0.7% of our total income. This is a recently made agreement and the funding will contribute directly to the LMS-CMI Research Schools, providing valuable intensive training for early career mathematicians. GCHQ is not involved in the choice of topics covered by the Research Schools.
So, GCHQ’s financial support for the LMS is small enough that declining it would not make a major financial impact.
I hope the LMS will make a public statement clarifying its relationship with GCHQ. I see no argument against transparency.
Another significant factor (which Lyons alludes to above and is already a matter of public record) is that GCHQ is a funder of the Heilbronn Institute, which is a collaboration between GCHQ and the University of Bristol. I don’t know that the LMS is involved with Heilbronn beyond what’s mentioned above, but Heilbronn does seem to provide an important channel through which (some!) British mathematicians support the secret services.
Finally, I want to make clear that although I think there are some problems with the LMS as an institution, I don’t blame the people running it, many of whom are taking time out of extremely busy schedules for the most altruistic reasons. As I wrote to one of them:
I’m genuinely in awe of the amount that you […] give to the mathematical community, both in terms of your selflessness and your energy. I don’t know how you do it. Anything critical I have to say is said with that admiration as the backdrop, and I hope I’d never say anything of the form “do more!”, because to ask that would be ridiculous.
Rules for commenting here I’ve now written several posts on this and related subjects (1, 2, 3, 4). Every time, I’ve deleted some off-topic comments — including some I’ve enjoyed and agreed with heartily. Please keep comments on-topic. In case there’s any doubt, the topic is the relationship between mathematicians and the secret services. Comments that stray too far from this will be deleted.
Guest post by Alexander Campbell
We want to develop category theory in a general 2-category, in order to both generalise and clarify our understanding of category theory. The key to this endeavour is to express the basic notions of the theory of categories in a natural 2-categorical language. In this way we are continuing a theme present in previous posts from the Kan Extension Seminar, wherein monads and adjunctions were given a 2-categorical setting, and by analogy, in our very first paper, whose purpose was to express basic notions of the theory of sets in a natural categorical language. In this post we consider a concept very central and special to category theory: the Yoneda lemma.
So what’s the Yoneda Lemma again?
The Yoneda lemma says that for any object $a$ of a category $A$, the diagram $\begin{matrix} 1 & \overset{a}{\rightarrow} & A \\ {}_{\ast} \searrow & \overset{\iota}{\Rightarrow} & \swarrow_{A(a,-)} \\ & Set \\ \end{matrix}$ is a left extension.
In this post I will give a motivation for the notion of Yoneda structure, as defined in the paper Yoneda Structures on 2-Categories of Ross Street and Bob Walters. But before we begin I would like to take this opportunity to thank Emily for inviting me to join the Kan Extension Seminar and for her support and encouragement throughout the course. This has been and continues to be a singularly valuable experience in my first year as a category theorist.
Liftings and extensions
After the various compositions available in a 2-category (nicely unified as the operation of pasting), the most basic notions of two-dimensional algebra are liftings and extensions. A diagram $\begin{matrix} & A \\ {}^h \swarrow & \overset{\alpha}{\Leftarrow} & \searrow^{g} \\ C & \overset{f}{\rightarrow} & B \\ \end{matrix}$ in a 2-category $\mathcal{K}$ is said to be a left lifting if pasting along $h$ effects a bijection of 2-cells $\frac{h \Rightarrow k}{g \Rightarrow f k}.$
There is a similar elementary definition of left extension, but we can simply say that a left extension is a left lifting in $\mathcal{K}^{op}$ (reverse 1-cells). Hence these are just dual versions of a single notion. Left liftings in $\mathcal{K}^{co}$ and $\mathcal{K}^{coop}$ are right liftings and right extensions respectively, but these will not concern us here.
An elementary result of 2-categorical algebra is the pasting lemma. This tells us that given a diagram $\begin{matrix} & & A \\ {}^h \swarrow & \overset{\beta}{\Leftarrow} & \downarrow^{g} & \overset{\alpha}{\Leftarrow} & \searrow^{f} \\ D & \underset{l}{\rightarrow} & C & \underset{k}{\rightarrow} & B \\ \end{matrix}$ in which $\alpha$ is a left lifting, $\beta$ is a left lifting if and only if the composite is a left lifting. By duality, we immediately get a corresponding result for left extensions.
The Yoneda lemma and Yoneda structures
A Yoneda structure on a 2-category consists of two pieces of data satisfying three axioms. We will see that the data is what is necessary to naturally express the Yoneda lemma in the 2-category, and that the axioms are expressions of familiar results of category theory.
In a general 2-category, following the philosophy of generalised elements, it is natural to consider arrows $X \longrightarrow A$ with arbitrary domain (not just restricted to the terminal object 1) as objects of $A$. In order to state the Yoneda lemma for such generalised objects, we must replace $\ast \colon 1 \longrightarrow \text{Set}$ with some arrow $y_X \colon X \longrightarrow P X$. Here $P X$ could be loosely thought of as “sets freely varying over $X$”; in CAT, it is the category of presheaves over $X$ and $y_X$ is the Yoneda embedding.
But now we must address the issue of size. Firstly, in CAT, the Yoneda embedding $y_X$ exists only for locally small $X$. Secondly, in the wish for a snappy statement of the Yoneda lemma, I left an important condition unmentioned: it is not true for a general category $A$ that the hom-functor $A(a,-)$ exists for every object $a$. (It would be true if $A$ were locally small, but we do not want CAT to consist only of locally small categories; for if $X$ is not small, then $P X$ is not locally small.) Hence we must restrict those elements for which we state the Yoneda lemma.
We can now specify the data of a Yoneda structure on a 2-category. The first piece of data is a class of admissible arrows in the 2-category; the only property we require of this class is that it be a right ideal, meaning that if $g$ is admissible, then so is $g f$ for every such composable $f$. We say that an object $A$ is admissible if the identity arrow $1_A \colon A \longrightarrow A$ is admissible; in the Yoneda structure on CAT, the admissible objects are the locally small categories. Hence we see that there is nothing mysterious about size; it is just part of the structure. The other piece of a Yoneda structure is the Yoneda arrows: an admissible arrow $y_A \colon A \longrightarrow P A$ for each admissible object $A$.
With this structure we can naturally state the Yoneda lemma in a 2-category. We take this as our first axiom for a Yoneda structure.
Axiom 1. For $X$, $a$ both admissible, the left extension $\begin{matrix} X & \overset{a}{\rightarrow} & A \\ {}_{y_X} \searrow & \overset{\chi^a}{\Rightarrow} & \swarrow_{A(a,1)} \\ & P X \\ \end{matrix}$ exists.
The universal property of this extension is indicated by $\frac{A(a,1) \Rightarrow f}{y_X \Rightarrow f a}.$
So we have used the Yoneda lemma as a definition of the hom-functors $A(a,1)$; the axiom asserts their existence. We can further define the hom-functors on 1-cells and 2-cells in both variables: in the first variable by the universal property of extension, and in the second variable by composition. In particular, for (suitably admissible) $f \colon A \longrightarrow B$, we can define $P f \colon P B \longrightarrow P A$ as $P f = (P B)(B(1,f),1)$, with a similar definition on 2-cells. This all just follows from Axiom 1, hence is part of any Yoneda structure.
All further use of the admissibility notion is just to ensure the existence of the data of Axiom 1. So I will make minimal explicit mention of it from here; it can be easily filled in as necessary.
Universal arrows and absolute liftings
The second axiom of a Yoneda structure is expressed in terms of liftings. I will now exhibit the presence of liftings in category theory.
The first formalisation of the notion of universal property that Mac Lane gives in Categories for the Working Mathematician is that of universal arrow (I will deal with the ‘left’ sense here). Given a functor $f \colon A \longrightarrow B$ and an object $b$ of $B$, he defines what it means for a “pair” consisting of an object $a$ of $A$ and an arrow $\theta \colon b \longrightarrow fa$ to be a universal arrow from $b$ to $f$. This definition says precisely that $\begin{matrix} & 1 \\ {}^a \swarrow & \overset{\theta}{\Leftarrow} & \searrow^{b} \\ A & \underset{f}{\rightarrow} & B \\ \end{matrix}$ is a left lifting diagram.
In a general 2-category, the restriction of our consideration to such ‘global objects’ is untenable. So how do we express the universal property of a ‘generalised arrow’ from a ‘generalised object’ $b \colon X \longrightarrow B$ to $f$? It is not enough to say that the corresponding diagram $\begin{matrix} & X \\ {}^a \swarrow & \overset{\theta}{\Leftarrow} & \searrow^{b} \\ A & \underset{f}{\rightarrow} & B \\ \end{matrix}$ is a left lifting diagram; we must consider the object $b$ at all its stages of development. Hence we define a 2-cell $\theta$ to be a universal arrow from $b$ to $f$ when for every earlier stage $Y \longrightarrow X$, the diagram $\begin{matrix} & Y \\ & \downarrow \\ & X \\ {}^a \swarrow & \overset{\theta}{\Leftarrow} & \searrow^{b} \\ A & \underset{f}{\rightarrow} & B \\ \end{matrix}$ is a left lifting. That is, we require our corresponding diagram to be an absolute left lifting (meaning that the property of being a left lifting is preserved when composed with any arrow into $X$).
Hence absolute liftings give a natural 2-categorical expression of the notion of universal arrow. This slogan can help us to interpret some basic facts of the algebra of 2-categories as they relate to category theory. For instance, the result that a 2-cell $\begin{matrix} & A \\ {}^f \swarrow & \overset{\eta}{\Leftarrow} & \searrow^{1} \\ B & \underset{g}{\rightarrow} & A \\ \end{matrix}$ is the unit of an adjunction $f \dashv g$ if and only if it is an absolute left lifting corresponds to the familiar result of basic category theory that states that a functor $g \colon B \longrightarrow A$ is a right adjoint if and only if there is a universal arrow to $g$ from every object of $A$.
Universal elements and representability
Now, observe that the 2-cell in the ordinary Yoneda lemma, in addition to being a left extension, is trivially a left lifting; for this just means that there is a bijection between arrows $a \longrightarrow b$ and elements of the set $A(a,b)$. In familiar terms, this says that $\iota$ is a universal element of the functor $A(a,-)$, and by the above discussion we can immediately incorporate this into our setting as the second axiom for a Yoneda structure.
Axiom 2. The 2-cell $\chi^a$ of Axiom 1 is an absolute left lifting.
The universal property of this absolute left lifting is indicated by $\frac{a x \Rightarrow b}{X(1,x) \Rightarrow A(a,b)}.$
The combination of these two universal properties of the 2-cells $\chi^a$ is the ignition that allows us to begin to develop category theory in our 2-category. Together they give a bijection of 2-cells $\pi$ and $\eta$ as indicated by the bijections $\frac{\frac{B(s,1) \Rightarrow C(j,t)}{y_A \Rightarrow C(j,t s)}}{j \Rightarrow t s}$ and as displayed in the diagram $\begin{matrix} & A \\ {}^s \swarrow & \overset{\chi^s}{\Leftarrow} & \searrow^{y_A} \\ B & \overset{B(s,1)}{\rightarrow} & P A \\ {}_t \searrow & \Downarrow^{\pi} & \nearrow_{C(j,1)} \\ & C \\ \end{matrix} \qquad = \qquad \begin{matrix} & & A \\ {}^s \swarrow & \overset{\eta}{\Leftarrow} & \downarrow^{j} & \overset{\chi^j}{\Leftarrow} & \searrow^{y_A} \\ B & \underset{t}{\rightarrow} & C & \underset{C(j,1)}{\rightarrow} & P A \\ \end{matrix}$ From Axiom 2 and the pasting lemma, we get that the LHS is an absolute left lifting (with base $C(j,t) = C(j,1)t$) if and only if $\eta$ is an absolute left lifting. This is exactly the realisation in our setting of the familiar correspondence between universal elements of $C(j,t)$ and universal arrows from $j$ to $t$. In particular, if $\pi$ is an isomorphism, then this condition holds, and we have a universal element of $C(j,t)$. Note that we can understand $\pi$ being an isomorphism as meaning that $C(j,t)$ is representable.
Now, recall the representability condition of ordinary category theory that says that a functor $f \colon B \longrightarrow \text{Set}$ is representable if and only if it has a universal element. We have seen that one half of this implication follows from the first two axioms of a Yoneda structure. We could take the other half as an axiom, being a basic result of category theory that we wish to incorporate into our setting.
Axiom 3*. If a 2-cell $\sigma \colon A(a,1) \Rightarrow f \colon A \longrightarrow P X$ yields an absolute left lifting diagram when pasted onto $\chi^a$, then $\sigma$ is an isomorphism.
Understood in our interpretation, this is indeed the other implication of the representability condition.
However this axiom is too strong to capture all the examples we want! While it does hold in ordinary category theory and in 2-categories of internal categories and variable (=indexed=parametrized) categories, it is does not hold in general in enriched category theory. For in enriched category theory, representability of a functor is expressed as an isomorphism in the base of enrichment, and the bijection of sets of the universal element condition is not enough to capture this.
Hence we do not take Axiom 3* as an axiom for Yoneda structures. Instead we take a couple of special cases. By Axiom 1 for $y_A$ and $gf$ in place of $a$, we have the bijections of 2-cells $\frac{P1_A \Rightarrow 1_{P A}}{y_A \Rightarrow y_A} \qquad \frac{C(g f,1) \Rightarrow P f.C(g,1)}{y_A \Rightarrow P f.C(g,1).g f}.$ Let $\iota_A$ correspond to $1_{y_A}$ in the first bijection, and $\theta_{f,g}$ correspond to the 2-cell $\begin{matrix} A & \overset{f}{\rightarrow} & B & \overset{g}{\rightarrow} & C \\ {}^{y_A} \downarrow & \overset{\chi^{B(1,f)}}{\Rightarrow} & {}^{y_B} \downarrow & \overset{\chi^g}{\Rightarrow} & \swarrow_{C(g,1)} \\ P A & \underset{P f}{\leftarrow} & P B \\ \end{matrix}$ in the second. These corresponding 2-cells are an identity and a pasting of two absolute left liftings respectively. Hence (by the pasting lemma in the second case) they are both absolute left liftings, and Axiom 3* would imply that they are isomorphisms. We take these special cases as our third and final axiom.
Axiom 3. The 2-cells $\iota_A$ and $\theta_{f,g}$ are isomorphisms.
In summary, a Yoneda structure on a 2-category consists of admissible arrows and Yoneda arrows satisfying Axioms 1, 2, and 3.
Formal category theory
Now that we have established the notion of a Yoneda structure on a 2-category, we can begin to develop category theory within such a 2-category. What is remarkable is that everything follows formally and naturally from the Yoneda structure and from the elementary two-dimesional algebra of pasting, liftings, extensions and adjunctions. In their paper, Street and Walters develop results concerning adjunctions, weighted colimits, EM objects and Kleisli objects for monads, and they introduce the notion of totality. I will now briefly survey some of these results; I refer the reader to the paper for details.
They show that the 2-categorical definition of an adjunction $f\dashv g$ in terms of its unit and counit is equivalent to the condition $B(f,1) \cong A(1,g)$, which expresses the classical hom-set definition of an adjunction. This is an isomorphism of modules, and indeed, one benefit of a Yoneda structure is that it equips our 2-category with a notion of modules (a.k.a. profunctors): modules from $A$ to $B$ are arrows $A \longrightarrow P B$.
It is well known that a satisfactory theory of colimits in enriched category theory requires the consideration of colimits weighted by presheaves. However the natural generality of the theory calls for weighting by modules (presheaves on $A$ are modules from $I$ to $A$). Given a module $j\colon M \longrightarrow P A$ and an arrow $s\colon A \longrightarrow C$ (suitably admissible), we define the colimit $\text{colim}(j,s) \colon M \longrightarrow C$ of $s$ weighted by $j$ by the formula $C(\text{colim}(j,s),1) \cong (P A)(j,C(s,1))$. From this formula we can derive many familiar results such as the fact that left adjoints preserve colimits, associativity of colimits in the weight, and the Yoneda isomorphisms. Also we can introduce the notion of pointwise extension in terms of colimits and show that they are indeed extensions. Thus all the results of the calculus of colimits, which in enriched category theory depend on a “complicated machinery” involving enriched extranatural transformations, coends, enriched functor categories etc., in fact follow easily and formally in the context of the elementary notion of a 2-category with a Yoneda structure.
Street and Walters show that a certain result from Street’s earlier paper, the characterisation of the Eilenberg-Moore category of a monad in CAT as certain sheaves on the Kleisli category of the monad, holds for any Yoneda structure. They also show that in the presence of more properties of the 2-category (particularly a bo-ff factorisation of its arrows), we can make sense of the idea of the Kleisli object as a “full subcategory” of the Eilenberg-Moore object.
This paper introduced the notion of totality. An arrow $s \colon A \longrightarrow C$ is said to be total when $A$ and $s$ are admissible and the module $C(s,1) \colon C \longrightarrow P A$ has an admissible left adjoint. It follows immediately from the developed theory that this left adjoint $z$ must be given by $z j = \text{colim}(j,s)$ and be the pointwise left extension of $s$ along the Yoneda arrow $y_A$ (note that these assertions require all the admissibility assumptions in the definition). Such adjunctions are absolutely fundamental to the use of category theory; they sometimes go by the name of nerve and realization.
An object $A$ is said to be total when the identity arrow $1_A$ is total; this is the same as saying the Yoneda arrow $y_A \colon A \longrightarrow P A$ has a left adjoint. This can be thought of as a sort of strong cocompleteness property. In particular we get a “very satisfactory” adjoint functor theorem: if $A$ is total, an arrow $f \colon A \longrightarrow B$ has a right adjoint if and only if $B(f,1)$ is admissible and $f$ preserves the colimit $\text{colim}(B(f,1),1)$.
Examples
Important motivating examples of 2-categories with Yoneda structures are 2-categories of internal categories (see the papers of Street and Weber) and enriched categories, in particular CAT is the 2-category of categories internal to SET. Given categories $\mathcal{E}$ and $\mathcal{V}$ suitably nice and approriate for internalisation and enrichment respectively, we get Yoneda structures on the 2-categories Cat$(\mathcal{E}$) and $\mathcal{V}$-Cat roughly as follows. For such a category $\mathcal{C}$, the size structure arises from an object $S$ of $\mathcal{C}$ which is in some way understood as a “full subcategory” of $\mathcal{C}$; for internal categories, this can be characterised as an internal full subcategory, or equivalently, a classifying discrete opfibration. Then the objects $P A$ arise as exponentials $S^{A^{op}}$, and the Yoneda arrows are exponential adjoints of hom-functors.
The 2-category LEX of finitely complete categories and finitely continuous functors inherits a Yoneda structure from CAT. This example is of interest because the total objects are (almost) Grothendieck toposes.
Another example, not treated in this paper but in this later paper of Street, are 2-categories of variable categories. These are of the form $\mathcal{K}$ = Hom($\mathcal{E}^{op}$,CAT), consisting of pseudofunctors, pseudonatural transformations, and modifications. I mention this example here for the interesting way in which the size structure arises. An arrow $f \colon A \longrightarrow B$ in $\mathcal{K}$ is admissible if for every $a\colon U \longrightarrow A$ and $b \colon V \longrightarrow B$ where $U$ and $V$ are representables on objects of $\mathcal{E}$, the comma object $f a/b$ is representable.
Guest post by Dimitri Zaganidis
First of all, I would like to thank Emily for organizing the Kan extension seminar. It is a pleasure to be part of it. I want also to thank my advisor Kathryn Hess and my office mate Martina Rovelli for their revisions.
In the fifth installment of the Kan Extension Seminar we read the paper “Review of the Elements of 2-categories” by G.M Kelly and Ross Street. This article was published in the Proceedings of the Sydney Category Theory Seminar, and its purpose is to “serve as a common introduction to the authors’ paper in this volume”.
The article has three main parts, the first of them being definitions in elementary terms of double categories and 2-categories, together with the notion of pasting. In a second chapter, they review adjunctions in 2-categories with a nice expression of the naturality of the bijection given by mates using double categories. The last part of the article introduces monads in 2-categories, and specializing to 2-monads towards the end.
Double categories and 2-categories
The article starts with the definition of a double category as a category object in the (not locally small) category of categories $\mathbf{CAT}$. (I think that there might be some set theoretic issues with such a category, but you can add small everywhere if you want to stay safe.)
The authors then switch to a description of such an object in terms of objects, horizontal arrows, vertical arrows, and squares, with various compositions and units. I will explain a bit how to go from one description to the other.
A category object is constituted of a category of objects, a category of morphisms, target and source functors, identity functor and a composition.
The category of objects is the category whose morphisms are “the objects” and whose morphisms are the vertical arrows. The category of morphisms is the category whose objects are the horizontal morphisms and whose morphisms are the squares, with vertical composition.
Since the functors $\mathrm{Obj}, \mathrm{Mor}: \mathbf{CAT} \longrightarrow \mathbf{SET}$ preserve pullbacks, by applying them to a double category seen as a category object, we get actual categories. Applying $\mathrm{Obj}$ to the double category, we get the category whose objects are “the objects” and whose morphisms are the horizontal arrows. Applying $\mathrm{Mor}$, we get the category whose objects are the vertical morphisms and whose morphisms are the squares, but this time with horizontal composition.
An interesting thing to notice is that the symmetry of the explicit description of a double category is much more apparent than the symmetry of its description as a category object.
One can define a $2$-category as a double category with a discrete category of objects, or as a $\mathbf{CAT}$-enriched category, exactly as one can define a simplicially enriched small category as either a category enriched over $\mathbf{sSet}$ or as a category object in $\mathbf{sSet}$ with a discrete simplicial set of objects.
The second viewpoint on 2-categories leads to definitions of 2-functors and 2-natural transformations and also to modifications, once one makes clear what enrichment a category of 2-functors inherits.
It is also worthwhile mentioning that the pasting operation makes computations easier to make, because they are more visual. The proof of proposition 2.1 of this paper is a good illustration of this.
The basic example of a 2-category is $\mathbf{CAT}$ itself, with natural transformations as 2-cells (squares).
As category theory describes set-like constructions, 2-category theory describes category-like constructions. You can usually build up categories with as objects sets with extra structure. In the same way, small V-categories, V-functors, and V-natural transformations form a 2-category.
My first motivation to learn about 2-categories was the 2-category of quasi-categories defined by Joyal and which has been studied by Emily Riehl and Dominic Verity in the article The 2-category theory of quasi-categories in particular the category-like constructions one can make with quasi-categories, such as adjunctions and limits.
Adjunctions and mates in 2-categories
It is not a surprise that 2-categories are the right framework in which to define adjunctions. To build the general definition from the usual one, you just need to replace categories by objects in a 2-category, functors by 1-cells of the 2-category, and natural transformations by its 2-cells.
Adjunctions in a 2-category $\mathcal{C}$ compose (as in $\mathbf{CAT}$), and one can form two, a priori distinct double categories of adjunctions. Both of them will have the objects of $\mathcal{C}$ as objects and the horizontal morphisms being the morphisms of $\mathcal{C}$, while their vertical morphisms are the adjunctions (going in the same direction as the right adjoint, by convention). The two double categories differ on the squares. Given adjunctions $f \dashv u$ and $f' \dashv u'$ together with 1-cells $a:A \longrightarrow A'$ (between the domains of $u$ and $u'$) and $b:B \longrightarrow B'$ (between the codomains of $u$ and $u'$), the squares of the first double category are 2-cells $b u \Rightarrow u'a$ while the squares of the second are 2-cells $f'b \Rightarrow a f$.
Now, the bijective correspondence between these kind of 2-cells given by mates induces an isomorphism of double categories. This means in particular that the horizontal (or vertical) composite of mates is equal to the mate of the corresponding composite.
This is a very beautiful way to express the naturality of the mate correspondence, and it provides a one-line proof of the fact that two 1-cells that are left adjoints to a same 1-cell are naturally isomorphic.
Monads in 2-categories
2-categories are also the right framework to define monads. A monad in a 2-category $\mathcal{C}$ and on an object $B$ is a 1-cell $t:B \longrightarrow B$ together with 2-cells $\mu: t^2 \Rightarrow t$ and $\eta: 1_B \Rightarrow t$, verifying the usual equations $\mu \circ (t\mu)= \mu \circ (\mu t)$ and $\mu \circ(t\eta) = 1_B = \mu \circ(\eta t)$. Since 2-functors preserve both horizontal and vertical compositions, for all objects $X$ of $\mathcal{C}$, $t$ induces a monad on $\mathcal{C}(X,B)$, given by post-composition $(t_{\ast},\mu_{\ast},\eta_{\ast})$. The authors call * an action of $t$ on $s:X \longrightarrow B$* a $t_\ast$ algebra structure on $s$.
The definition of monad morphism in this article is quite surprising for someone who has read the previous article by Ross Street, The formal theory of monads, which we read for the second meeting of the Kan extension seminar.
In Ross Street’s original paper, a monad morphism $(B,t,\mu, \eta) \longrightarrow (B',t',\mu', \eta')$ is a 1-cell $f: B \longrightarrow B'$ together with a $2$-cell $\phi: t'f \Rightarrow f t$ verifying certain conditions.
In this paper, morphisms of monads are defined only for monads on the same object, letting the $1$-cell part of a monad transformation of the previous article be the identity. This leads the authors to reverse the direction of the morphism, since the $2$-cell seems to go in the reverse direction of the $1$-cell!
One might think that fixing $f=1$ is needed by the result which explains that there is a bijection between monad morphisms $t \Rightarrow t'$ and actions of $t$ on $t'$ making $t'$ a “$(t,t')$-bimodule”. In fact, in the case where $f$ is not necessarily the identity, there is a bijection between 2-cells $\phi:t f \Rightarrow f t'$ such that $(f,\phi)$ is a monad functor and actions of $t$ on $ft'$ making $ft'$ a “$(t,t')$-bimodule”. A statement of the same kind can be also made for monad functor transformations (in the sense of the formal theory of monads). A 2-cell $\sigma : f \Rightarrow f'$ is a monad functor transformation $(f,\phi) \longrightarrow (f', \phi')$ if and only if $\sigma t': f t' \Rightarrow f' t'$ is a morphism of “$(t,t')$-bimodules”.
A 2-category admits the construction of algebras if for every monad $(B,t,\mu, \eta)$, the 2-functor $X \mapsto \mathcal {C}(X,B)^{(t_\ast, \mu_\ast, \eta_\ast)}$ is representable. The representing object is called the object of $t$-algebras. By Yoneda, the free-forgetful adjunction can be made internal in this case.
The terminology is justified, because in the $2$-category $\mathbf{CAT}$, it specializes to the usual notions of the category of $t$-algebras and the corresponding free-forgetful adjunction.
A monad in $\mathcal{C}$ is the same as a 2-functor $\mathbf{Mnd} \longrightarrow \mathcal{C}$, where $\mathbf{Mnd}$ is the 2-category with one object and $\Delta_+$, the algebraist’s simplicial category as monoidal hom-category (with ordinal sum). Since moreover, $\mathcal {C}(X,B)^{(t_\ast, \mu_\ast, \eta_\ast)} \cong [\mathbf{Mnd}, \CAT]( \Delta_{+\infty}, \mathcal{C}(X,-)),$ (where $\Delta_{+\infty}$ is the subcategory of maps of $\Delta$ preserving maxima, which is acted on by $\Delta_+$ via ordinal sum) one can see that the object of t-algebras can be expressed as a weighted limit.
As a consequence, it is not surprising that a 2-category admits the construction of algebras under some completeness assumptions.
Doctrines
In the last part of the article, the authors review the notion of a doctrine, which is a 2-monad in 2-$\mathbf{CAT}$, i.e., a 2-functor $D: \mathcal {C} \longrightarrow \mathcal{C}$, where $\mathcal{C}$ is a 2-category, and 2-natural transformations $m$ and $j$, which are respectively the multiplication and the unit, verifying the usual identities. The fact that it is both a monad on a 2-category and in another one can be a bit disturbing at first.
If $(D,m,j)$ is a doctrine over a 2-category $\mathcal{C}$, then its algebras will be objects $X$ of $\mathcal{C}$ together with an action $DX \longrightarrow X$, exactly as in the case of algebras over a usual monad.
Already with morphisms, we can take advantage of the fact that a 2-category $\mathcal{C}$ has 2-cells, and define $D$-morphisms to be lax in the sense that the diagram $\begin{matrix} DX & \longrightarrow & DY \\ \downarrow & & \downarrow \\ X & \longrightarrow & Y \end{matrix}$ is not supposed to be commutative, but is rather filled by a 2-cell with some coherence properties.
As one might expect, we can actually form a 2-category of such $D$-algebras by adding 2-cells, using again the $2$-cells existing in $\mathcal{C}$.
If we keep only the $D$-morphisms that are strict, we obtain the object of algebras (which should be a $2$-category) that we discussed before.
One example of a doctrine is $\Delta_+ \times - : \mathbf{CAT} \longrightarrow \mathbf{CAT}$ together with the multiplication induced by the ordinal sum, and unit given on $\mathcal {D}$ by the functor $\mathcal{D} \longrightarrow \Delta_+ \times \mathcal{D}$ that sends $d$ to $(\emptyset,d)$.
The algebras for this doctrine will be categories equipped with a monad acting on them, while the $D$-morphisms are transformations of monads, and the $D$-2-cells are exactly the monad functor transformations of Street’s article.
Here, since we have two different 2-categories of algebras (with strict $D$-morphisms or with all of them), one can wonder if monad morphisms $D \longrightarrow D'$ will induce $2$-functors $D'$-$\mathbf{Alg} \longrightarrow D$-$\mathbf{Alg}$ on the level of these $2$-categories.
This is indeed the case, and one can actually go even one step further and define monad modifications, using the fact that 2-$\mathbf{CAT}$ is in fact a 3-category! These modifications between two given monad morphisms are in fact in bijective correspondence with the 2-natural transformations between the $2$-functors induced by these monad morphisms on the level of algebras (with lax D-morphisms). Note that they are not the same as monad morphisms transformations of Street’s article.
This bijection is nice because it implies that you can compare 2-categories of algebras by only looking at the doctrines: if they are equivalent, so are the 2-categories of algebras.
The fact that this bijection does not hold when we restrict only to strict morphism was really surprising to me, but I guess this is the price to pay to use the 3-category structure.
During the last days of April, the Kan extension seminar will be reading the article “Two dimensional monad theory”, by Blackwell, Kelly and Powell. We will then have more to say about these 2-monads!
Leila Schneps is trying to raise $6,000 for what sounds like a good cause: translating a biography of Grothendieck into English:
As of this moment she’s raised $350… including $100 of her own money.
Leila Schneps writes:
Two volumes of a projected 3-volume biography of A. Grothendieck have been completed by Winfried SCHARLAU, and are available in German on Amazon.com through “Books on Demand” as
“Wer ist Alexander Grothendieck? Teil 1: Anarchy”
and
“Wer ist Alexander Grothendieck? Teil 3: Spiritualität”.
Grothendieck is aware of Scharlau’s work and even met him during the preparation of the part concerning the lives of his parents, although he subsequently ruptured relations with Scharlau as he has with everyone else.In 2013, an English translation of volume 1 by my sister, Melissa Schneps became available on Amazon as
“Who is Alexander Grothendieck? Part 1, Anarchy”
Melissa’s translation of volume 1 was funded by Jim Carlson of the Clay Institute for Mathematics.The new director of the Clay Institute has not accepted to fund translation of volume 3. The book covers Grothendieck’s life from his rupture with the mathematical establishment in 1970 to his disappearance in 1991 and contains a wealth of information from documents and interviews.
Due to multiple expressions of interest in an English translation, we have decided to use this method to raise funding for the translation of volume 3. Once the project is underway, we propose to create a web page where chapters of the volume can be made accessible to donors as they appear. Once completed, the new volume will be available on Amazon through “Books on Demand” like the others.
If you are interested in Alexander Grothendieck and his astonishing life, please help to make this happen.
In response Kevin Lin asked the obvious question: “What about volume 2?” I don’t know the answer! Do you?
Guest post by Nick Gurski
I have been thinking about various sorts of operads with my PhD student Alex Corner, and have become interested in the following very concrete question: what are examples of operads in the category of finite groups under the cartesian product? I don’t know any really interesting examples, but maybe you do! After the break I will explain why I got interested in this question, and tell you about some examples that I do know.
Alex and I started off thinking about various sorts of things you might do with operads in $\mathbf{Cat}$, and were eventually forced into what we currently call an action operad. This is an operad $G$ whose job it is to act on the objects of other operads. The key examples to keep in mind are the terminal operad (each set is just a singleton), the symmetric operad (the $n$th set is the $n$th symmetric group), and the braid operad (the $n$th set is the $n$th braid group). The technical definition involves an operad $G$, a group structure on each set $G(n)$, a map of operads $\pi:G \rightarrow \Sigma$ to the symmetric operad which is levelwise a group homomorphism, and a final condition (when it makes sense) relating operadic composition, $\mu$, with group multiplication:
$\mu(g; f_1, \ldots, f_n) \cdot \mu(g'; f_1', \ldots, f_n') = \mu (g g'; f_{\pi(g')(1)}f_{1}', \ldots, f_{\pi(g')(n)}f_{n}').$
These ideas have cropped up before, for example in Nathalie Wahl’s thesis or this preprint of Wenbin Zhang. Once you have this definition, you can define operads which are equivariant with respect to $G$: you have an operad $P$, a group action of $G(n)$ on $P(n)$ for each $n$, and some equivariance conditions that generalize the equivariance conditions for a symmetric operad.
This isn’t the only thing you can do with an action operad, you can also think about the 2-monad on $\mathbf{Cat}$ whose algebras are strict monoidal categories where $G(n)$ acts naturally on $n$-fold tensor products. If you do this with the symmetric operad, you get symmetric strict monoidal categories (or permutative categories, if you are a topologist); if you do this with the (ribbon) braid operad, you get (ribbon) braided strict monoidal categories; and if you do this with the action operad of all terminal groups, you get back plain old strict monoidal categories. The only other “naturally-occurring” example of an action operad that I know of is the operad of $n$-fruit cactus groups, $J_{n}$. These groups come up in the representation theory of quantum groups, particularly the theory of crystals, and the monoidal structure you get out here is something Drinfeld called a coboundary category. I can give you a generators-and-relations definition of these groups, at which point I would have completely exhausted my understanding of this operad. The best reference that I know of is the paper Crystals and coboundary categories by Henriques and Kamnitzer.
What does this have to do with my question about operads of finite groups? Well, as it turns out, the structure map for an action operad $\pi:G \rightarrow \Sigma$ only has two options: it can be surjective, or it can be the zero map (i.e., everything maps to the identity permutation). Furthermore, that condition I wrote down relating group multiplication and operadic composition says that giving an action operad with $\pi$ the zero map is equivalent to giving an operad in which the operadic composition maps all preserve group multiplication. Alex and I already showed that the operadic composition of all identity elements is the identity element in the target, in other words an action operad with $\pi$ the zero map is just an operad in the category of groups using the cartesian product.
You can take kernels of maps between action operads, so in particular given any action operad $G$ you can take the kernel of $\pi:G \rightarrow \Sigma$; this gives you an operad in groups. For the examples above, you get finite groups when $G$ is the terminal operad or $G = \Sigma$ (as the terminal operad is obviously the kernel in the case of $G = \Sigma$), but for the rest of the examples you get an operad in the category of groups, but most of those groups are infinite. The groups involved are the so-called pure versions: pure braids, pure ribbon braids, and pure $n$-fruit cacti. One can then think of an action operad with surjective map $\pi$ as being an extension of the symmetric operad by an operad in the category of groups (the “pure” version), and that action operad is finite if and only if the operad in the category of groups is one containing only finite groups. We can translate this back into thinking about monoidal categories by then noting if we have some notion of strict monoidal category in which $n$th tensor powers come equipped with a natural action of a finite group $G(n)$ for all $n$, then we must be able to dig up an operad in the category of finite groups.
Now let’s talk concrete examples: what operads do I know in the category of finite groups? Well, there is obviously the terminal operad, but I can go very slightly further in that I can tell you how to construct some new ones. Here are two methods you can use to construct operads of finite groups.
- Let $A$ be a finite abelian group. Then there is an operad $\underline{A}$ where $\underline{A}(n) = A^{n}$ (the power here is a cartesian power). The operadic composition map $A^{n} \times A^{k_{1}} \times \cdots \times A^{k_{n}} \rightarrow A^{\Sigma k_{i}}$ takes the vector $(a_{1}, \ldots, a_{n})$ in the first coordinate and duplicates the $i$th coordinate $k_{i}$ times, then adds the result to the vector you get by just concatenating the $n$ vectors in the $A^{k_{i}}$. Here $A$ must be abelian as it appears as $\underline{A}(1)$, and $G(1)$ must be abelian for any action operad by an Eckmann-Hilton argument.
- Now let $G$ be any finite group, but in fact you will see that we don’t use anything interesting about the group structure here. There is an operad $G^{c2}$ with $G^{c2}(n) = G^{\binom{n}{2}}$ given the pointwise group structure. You should think of this as the set of functions from $\{ (i,j): 1 \leq i \lt j \leq n \}$ to $G$. Operad composition is a map $G^{\binom{n}{2}} \times G^{\binom{k_{1}}{2}} \times \cdots \times G^{\binom{k_{n}}{2}} \rightarrow G^{\binom{\Sigma k_{i}}{2}}.$ If we are given $1 \leq a \lt b \leq \sum k_{i}$, we must give back an element of $G$. If there is some $r$ such that $\sum_{i=1}^{r-1} k_{i} \leq a \lt b \lt \sum_{i=1}^{r} k_{i},$ then we use the function coming from $G^{c2}(r)$ evaluated on $1 \leq a +1 - \sum_{i=1}^{r-1} k_{i} \lt b+1- \sum_{i=1}^{r-1} k_{i} \leq k_{r}.$ If not, then there exist $r \lt s$ such that $a$ lies in the $r$th “interval” as we had before and $b$ lies in the $s$th interval, so then you use the function coming from $G^{c2}(n)$ evaluated on $r \lt s$.
I am reasonably content with the first of these constructions, I understand how to do the second but don’t really know where it comes from, and I have some ideas about how one could try to insert the finite group of their choice as $G(0)$ but haven’t checked the details, and know basically nothing else. Furthermore, I don’t know how these interact with each other, or how you can form extensions of $\Sigma$ with them outside of some obvious constructions.
Those are just the two straightforward approaches that I know of to construct operads in the category of finite groups. You can also try to construct these operads from operads of topological spaces or simplicial sets, but once again I don’t know of an example that produces finite things (apart from ones giving the groups above). Do you know any others?