• Shortcuts : 'n' next unread feed - 'p' previous unread feed • Styles : 1 2
aA :  -   +

Date: Wednesday, 12 Feb 2014 12:24

One of my dreams these days is to get people to apply modern math to ecology and biology, to help us design technologies that work with nature instead of against it. I call this dream ‘green mathematics’. But this will take some time to reach, since living systems are subtle, and most mathematicians are more familiar with physics.

So, I’ve been warming up by studying the mathematics of chemistry, evolutionary game theory, electrical engineering, control theory and information theory. There are a lot of ideas in common to all these fields, but making them clear requires some category theory. I call this project ‘network theory’. I’m giving some talks about it at Oxford.

(This diagram is written in Systems Biology Graphical Notation.)

Here’s the plan:

#### Network Theory

Nature and the world of human technology are full of networks. People like to draw diagrams of networks: flow charts, electrical circuit diagrams, signal flow diagrams, Bayesian networks, Feynman diagrams and the like. Mathematically minded people know that in principle these diagrams fit into a common framework: category theory. But we are still far from a unified theory of networks. After an overview, we will look at three portions of the jigsaw puzzle in three separate talks:

I. Electrical circuits and signal flow diagrams.

II. Stochastic Petri nets, chemical reaction networks and Feynman diagrams.

III. Bayesian networks, information and entropy.

All these talks will be in Lecture Theatre B of the Computer Science Department—you can see a map here, but the entrance is on Keble Road. Here are the times:

• Friday 21 February 2014, 2 pm: Network Theory: Overview.

• Tuesday 25 February, 3:30 pm: Network Theory I: electrical circuits and signal flow diagrams.

• Tuesday 4 March, 3:30 pm: Network Theory II: stochastic Petri nets, chemical reaction networks and Feynman diagrams.

• Tuesday 11 March, 3:30 pm: Network Theory III: Bayesian networks, information and entropy.

I thank Samson Abramsky, Bob Coecke and Jamie Vicary of the Computer Science Department for inviting me, and Ulrike Tillmann and Minhyong Kim of the Mathematical Institute for helping me get set up. I also thank all the people who helped do the work I’ll be talking about, most notably Jacob Biamonte, Jason Erbele, Brendan Fong, Tobias Fritz, Tom Leinster, Tu Pham, and Franciscus Rebro.

Ulrike Tillmann has also kindly invited me to give a topology seminar:

#### Operads and the Tree of Life

Trees are not just combinatorial structures: they are also biological structures, both in the obvious way but also in the study of evolution. Starting from DNA samples from living species, biologists use increasingly sophisticated mathematical techniques to reconstruct the most likely “phylogenetic tree” describing how these species evolved from earlier ones. In their work on this subject, they have encountered an interesting example of an operad, which is obtained by applying a variant of the Boardmann–Vogt “W construction” to the operad for commutative monoids. The operations in this operad are labelled trees of a certain sort, and it plays a universal role in the study of stochastic processes that involve branching. It also shows up in tropical algebra. This talk is based on work in progress with Nina Otter.

I’m not sure exactly where this will take place, but probably somewhere in the Mathematical Institute, shown on this map. Here’s the time:

• Monday 24 February, 3:30 pm, Operads and the Tree of Life.

If you’re nearby, I hope you can come to some of these talks — and say hi!

(This diagram was drawn by Darwin.)

Author: "john (baez@math.ucr.edu)" Tags: "Conference"

Date: Wednesday, 12 Feb 2014 12:01

A friend, John Wigglesworth, has asked me to announce the following conference. Perhaps he’ll let us know if pluralist homotopy type theorists are welcome ;).

I wonder if any progress has been made on my MO ‘multiverse’ question.

CFP: Symposium on the Foundations of Mathematics, Kurt Gödel Research Center, University of Vienna.

Set theory is taken to serve as a foundation for mathematics. But it is well-known that there are set-theoretic statements that cannot be settled by the standard axioms of set theory. The Zermelo-Fraenkel axioms, with the Axiom of Choice (ZFC), are incomplete. The primary goal of this symposium is to explore the different approaches that one can take to the phenomenon of incompleteness.

One option is to maintain the traditional “universe” view and hold that there is a single, objective, determinate domain of sets. Accordingly, there is a single correct conception of set, and mathematical statements have a determinate meaning and truth-value according to this conception. We should therefore seek new axioms of set theory to extend the ZFC axioms and minimize incompleteness. It is then crucial to determine what justifies some new axioms over others.

Alternatively, one can argue that there are multiple conceptions of set, depending on how one settles particular undecided statements. These different conceptions give rise to parallel set-theoretic universes, collectively known as the “multiverse”. What mathematical statements are true can then shift from one universe to the next. From within the multiverse view, however, one could argue that some universes are more preferable than others.

These different approaches to incompleteness have wider consequences for the concepts of meaning and truth in mathematics and beyond. The conference will address these foundational issues at the intersection of philosophy and mathematics. The primary goal of the conference is to showcase contemporary philosophical research on different approaches to the incompleteness phenomenon.

To accomplish this, the conference has the following general aims and objectives:

(1) To bring to a wider philosophical audience the different approaches that one can take to the set-theoretic foundations of mathematics. (2) To elucidate the pressing issues of meaning and truth that turn on these different approaches. (3) To address philosophical questions concerning the need for a foundation of mathematics, and whether or not set theory can provide the necessary foundation

Date and Venue: 7-8 July 2014 - Kurt Gödel Research Center, Vienna

Confirmed Speakers: Sy-David Friedman (Kurt Gödel Research Center for Mathematical Logic), Hannes Leitgeb (Munich Center for Mathematical Philosophy)

Call for Papers: We welcome submissions from scholars (in particular, young scholars, i.e. early career researchers or post-graduate students) on any area of the foundations of mathematics (broadly construed). Particularly desired are submissions that address the role of set theory in the foundations of mathematics, or the foundations of set theory (universe/multiverse dichotomy, new axioms, etc.) and related ontological and epistemological issues. Applicants should prepare an extended abstract (maximum 1500 words) for blind review, and send it to sotfom [at] gmail [dot] com. The successful applicants will be invited to give a talk at the conference and will be refunded the cost of accommodation in Vienna for two days (7-8 July).

Notification of Acceptance: 30 April 2014

Scientific Committee: Philip Welch (University of Bristol), Sy-David Friedman (Kurt Gödel Research Center), Ian Rumfitt (University of Birmigham), John Wigglesworth (London School of Economics), Claudio Ternullo (Kurt Gödel Research Center), Neil Barton (Birkbeck College), Chris Scambler (Birkbeck College), Jonathan Payne (Institute of Philosophy), Andrea Sereni (Università Vita-Salute S. Raffaele), Giorgio Venturi (Université de Paris VII, “Denis Diderot” - Scuola Normale Superiore)

Organisers: Sy-David Friedman (Kurt Gödel Research Center), John Wigglesworth (London School of Economics), Claudio Ternullo (Kurt Gödel Research Center), Neil Barton (Birkbeck College), Carolin Antos (Kurt Gödel Research Center)

Conference Website: sotfom [dot] wordpress [dot] com

Claudio Ternullo (ternulc7 [at] univie [dot] ac [dot] at)

Neil Barton (bartonna [at] gmail [dot] com)

John Wigglesworth (jmwigglesworth [at] gmail [dot] com)

Author: "david (d.corfield@kent.ac.uk)" Tags: "Conference"

Date: Tuesday, 11 Feb 2014 00:08

Stefan Forcey just alerted me to a long and reflective piece in the Chronicle of Higher Education about the relationship between the NSA and American academics — mathematicians in particular.

Now many academics are trying to be heard from the outside, arguing that the NSA’s spying tactics are proving counterproductive and that university researchers have a duty to stop assisting them. […]

In the months since Edward J. Snowden fled the United States with tens of thousands of electronic documents describing NSA practices, mathematicians are realizing that they are in the same position as nuclear physicists in the middle of the last century, and business students in more recent times — suddenly needing to figure out the ethics behind what they do, said Edward Frenkel, a professor of mathematics at the University of California at Berkeley.

The Chronicle article cites not only various sources that have been mentioned here at the Café before (e.g. Beilinson’s letter and Stefan’s Notices article), but also something that hasn’t:

Their appeals were followed on January 24 by an open letter from a group of 50 researchers warning of long-term damage to society and to the nation’s technological enterprise from the NSA’s reported tactic of intentionally weakening computer-security standards so it can carry out spy operations.

“Every country, including our own, must give intelligence and law-enforcement authorities the means to pursue terrorists and criminals,” the researchers wrote, “but we can do so without fundamentally undermining the security that enables commerce, entertainment, personal communication, and other aspects of 21st-century life.”

Shunned as NSA advisers, academics question their ties to the agency, The Chronicle of Higher Education, 10 February 2014.

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Science and Politics"

Date: Monday, 10 Feb 2014 10:31

This is the fourth post in a series on categorical ideas related to formal concept analysis (the other posts are linked below). I want to bring together the ideas of the previous posts and tell you what relations and Galois correspdonences have to do with profunctors and nuclei. To do that I have to tell you what happens when you think of posets as categories enriched over the category of truth values.

Here is a picture of a poset, with $x\le y$ if you can climb up the arrows from $x$ to $y$.

The picture can also be interpreted as representing a thin category, that is a category with at most one morphism between each pair of objects.

Most people who know what a category is know that a thin category is the same thing as a poset; with an arrow $x\to y$ corresponding to the relation $x\le y$. However, despite being well-known, this fact is also not strictly true. You have to be wary of the antisymmetry axiom.

In a poset the antisymmetry axiom asserts that if $a\le b$ and $b\le a$ then $a=b$, but this translates in a thin category to having morphisms $a\to b$ and $b\to a$ implying that $a=b$ and that is not true in general, all you can say is that $a$ is isomorphic to $b$.

A poset-without-antisymmetry is called a preorder, and the true fact is that thin categories correspond to preorders (and posets correspond to skeletal, thin categories).

Less well-known than the above not-entirely-true fact, at least according to my unscientific straw poll of mathematicians, is that a category enriched over truth values is the same thing as a preorder. Although this seems, superficially, to be the same as the previous fact, it does have some deeper connotations due to the depths of enriched category theory. It is some of these depths that I want to start to explore here.

This follows on from three previous posts, but doesn’t require you to have read the other ones — unless you want to appreciate the punchline!

1. In The Nucleus of a Profunctor: Some Categorified Linear Algebra I gave a construction which starts with an enriched profunctor between two enriched categories and gives rise to an adjoint equivalence between certain presheaves and copresheaves on the two enriched categories. This is the nucleus construction.

2. In Formal Concept Analysis I gave a construction which starts with a relation between two sets and gives rise to a Galois correspondence between certain sets of subsets of the two sets.

3. In Classical Dualities and Formal Concept Analysis I gave examples of this construction popping up all over mathematics.

In this post I’ll try to convince you that the second construction is just an example of the first. I’ll do this by explaining what familiar (?!) concepts from ordinary category theory become when you work in categories enriched over truth values. This is summarized in the following dictionary.

\array {\arrayopts{ \colalign{left left} \rowlines{solid} } \\ \text{category }&\text{ preorder}\\ \text{functor }&\text{ order preserving map}\\ \text{set of natural transformations}&\text{domination relation}\\ \text{internal hom object in Set }&\text{ logical implication in Truth}\\ \text{presheaf }&\text{ downward closed subset (downset)}\\ \text{copresheaf }&\text{ upward closed subset (upset)}\\ \text{category of presheaves }&\text{ downsets ordered by inclusion}\\ \text{category of opcopresheaves }&\text{ upsets ordered by containment}\\ \text{ category of presheaves on a set }&\text{ powerset of a set ordered by inclusion}\\ \text{ category of opcopresheaves on a set } & \text {powerset of a set ordered by containment}\\ \text{adjunction }&\text{ Galois connection}\\ \text{profunctor }&\text{ relation}\\ \text{nucleus of a profunctor }&\text{ Galois correspondence from a relation}\\  &   }

In the rest of this post (which has got quite long) I will just explain how the translation arises. It is basic enriched category theory, so if you already understand the above dictionary you can stop reading now!

Next time I hope to explain when you apply the nucleus construction when you enrich over things more like the real numbers.

### The enriching category $\mathrm{Truth}$

The category of truth values, $\mathrm{Truth}$, has two objects: $\mathrm{true}$ and $\mathrm{false}$. The morphisms correspond to ‘logical entailment’ which is what working mathematicians would usually refer to as ‘implication’ but logicians reserve ‘implication’ for logical operation, we’ll see this below. Entailment is written $\vdash$ (pronounced ‘entails’), so we have $\mathrm{false}\vdash \mathrm{true}$ corresponds to the only non-identity morphism in $\mathrm{Truth}$. So the category can be simply pictured as follows. $\mathrm{false}\longrightarrow \mathrm{true}$ We can make this category of truth values into a monoidal category by using logical ‘and’ as the monoidal product; I’ll write this as $\wedge$.

### Categories

A category enriched over $\mathrm{Truth}$, $\mathcal{C}$, consists of a set $\text {ob}\mathcal{C}$ and for each $c,c'\in \mathcal{C}$ we have the ‘hom-object’ $\mathcal{C}(c,c')\in \{ \mathrm{true}, \mathrm{false}\}$. We should think of $\mathcal{C}(c,c')$ as being the truth of a relation $c\le c'$. We need composition morphisms in this category, so for each triple, $c,c',c''\in \text {ob}\mathcal{C}$ we need a morphism in truth values $\mathcal{C}(c',c'')\otimes \mathcal{C}(c,c')\to \mathcal{C}(c,c'');$ in this context it means we have an entailment $\mathcal{C}(c',c'')\wedge \mathcal{C}(c,c')\vdash \mathcal{C}(c,c''),$ or, in other words, if $c\le c'$ and $c'\le c''$ then $c\le c''$. So we have a transitive relation. We also need an identity for each $c\in \mathcal{C}$, generally speaking this means we need a morphism (where $1$ is the monoidal unit) $1\to \mathcal{C}(c,c);$ in this context it means $\mathrm{true}\vdash \mathcal{C}(c,c),$ or, in other words, $c\le c$. So the relation is also reflexive. Thus we have a preorder.

Conversely, a preorder gives rise to a $\mathrm{Truth}$ category.

Slogan: $\mathrm{Truth}$-categories correspond to preorders.

### Functors

There is a notion of an enriched functor which we can look at in this context. If we have two $\mathrm{Truth}$-categories $\mathcal{C}$ and $\mathcal{D}$ then a $\mathrm{Truth}$-functor consists of a function $F\colon \text {ob}\mathcal{C}\to \text {ob}\mathcal{D}$ and, for each pair $c,c'\in \text {ob}\mathcal{C}$, a morphism in $\mathrm{Truth}$ $\mathcal{C}(c,c')\to \mathcal{D}(F(c),F(c'));$ using square brackets to mean ‘the truth value of’ this is the same as $[c\le c']\vdash [F(c)\le F(c')],$ or, in other words, $F$ preserves the order.

Slogan: $\mathrm{Truth}$-functors correspond to order preserving maps.

### Natural transformation objects

Enriched category theory is richer if we enrich over a complete closed, symmetric monoidal category. If the category is complete then we means that we can make the collection of enriched functors between two enriched categories into an enriched category. If we have $F,G\colon \mathcal{C}\to \mathcal{D}$ then we have a hom-object $[\mathcal{C},\mathcal{D}](F,G)$ and this is given by an end $[\mathcal{C},\mathcal{D}](F,G)=\int _{c} \mathcal{D}(F(c),G(c)).$ In the context of enriching over $\mathrm{Truth}$, this end is just a big ‘and’, or a ‘for all’, if you prefer. $[\mathcal{C},\mathcal{D}](F,G)=\bigwedge _{c} \mathcal{D}(F(c),G(c)).$ Rewriting this is terms of truth values of the relations we get that this means $[F\le G] \coloneqq \bigwedge _{c} [F(c)\le G(c)].$ In other words, given preorders $\mathcal{C}$ and $\mathcal{D}$, there is a canonical preorder on the set of order preserving functions $\mathcal{C}\to \mathcal{D}$, and this is $F\le G$ if and only if $F(c)\le G(c)$ for all $c$. We can say that $F$ is dominated by $G$ if $F\le G$.

Slogan: The $\mathrm{Truth}$-natural transformation object corresponds to the domination relation.

### The closed structure

If the enriching category is closed then it can be considered as a category enriched over itself. There is a small problem here with type checking and we should be more precise. If $\mathcal{V}$ is closed then there is an internal hom functor $[-,-]\colon \mathcal{V}^{\mathrm{op}}\times \mathcal{V}\to \mathcal{V}$. We form a $\mathcal{V}$-category $\overline{\mathcal{V}}$ with the same objects as $\mathcal{V}$ but with the hom-object given by the internal hom $\overline{\mathcal{V}}(v,w):=[v,w]$. People often use the same notation for $\mathcal{V}$ and $\overline{\mathcal{V}}$; this is potentially confusing, but so you should do this with some care.

The category of truth values, $\mathrm{Truth}$, is closed and the internal hom is given by ‘logical implication’: $[a,b]:=a\Rightarrow b$. Here implication is understood in the logical operation sense and not the deductive sense (that’s what we are using entails for). So, using brackets purely for clarity, \begin{aligned} (\mathrm{true}\Rightarrow \mathrm{true}) &=\mathrm{true};& (\mathrm{true}\Rightarrow \mathrm{false})&=\mathrm{false};\\ (\mathrm{false}\Rightarrow \mathrm{true})&=\mathrm{true};& (\mathrm{false}\Rightarrow \mathrm{false})&=\mathrm{true}. \end{aligned}

Slogan: Internal hom for $\mathrm{Truth}$ is implication.

We know that $\mathrm{Truth}$-categories are the same as preorders, so this gives a natural preorder on the set $\{ \mathrm{true},\mathrm{false}\}$.

Slogan: The $\mathrm{Truth}$-category structure on $\mathrm{Truth}$ corresponds to the preorder $\mathrm{false}\le \mathrm{true}$.

### Presheaves and copresheaves

As we are considering a closed enriching category $\mathcal{V}$, we can think of $\mathcal{V}$ as a $\mathcal{V}$-category and consider $\mathcal{V}$-functors into $\mathcal{V}$. This is like considering scalar valued functions on a vector space.

A presheaf on $\mathcal{C}$ is a $\mathcal{V}$-functor $P\colon \mathcal{C}^{\mathrm{op}}\to \mathcal{V}$ and a copresheaf is a $\mathcal{V}$-functor $Q\colon \mathcal{C}\to \mathcal{V}$. In the context of preorders and truth values, a presheaf is an order-reversing function $P\colon \mathcal{C}\to \{ \mathrm{true},\mathrm{false}\}$ and a copresheaf is an order-preserving function $Q\colon \mathcal{C}\to \{ \mathrm{true},\mathrm{false}\}$.

Knowing a function to $\{ \mathrm{true},\mathrm{false}\}$ is the same as knowing the preimage of $\mathrm{true}$, so we can identify a function $P\colon \mathcal{C}\to \{ \mathrm{true},\mathrm{false}\}$ with the subset $\tilde{P}=P^{-1}(\mathrm{true})\subseteq \mathcal{C}$. The order reversing condition $\text {if }\, c\le c' \, \text { then } \, P(c')\le P(c)$ translates into $\text {if }\, c\le c'\, \text { and }c'\in \tilde{P}\, \text { then }\, c\in \tilde{P}.$ So $\tilde{P}$ is a downward closed subset of the preorder $\mathcal{C}$ and all downward closed subsets arise in this way.

Similarly if $Q\colon \mathcal{C}\to \mathrm{Truth}$ is a copresheaf then it corresponds to an upward closed subset $\tilde{Q}=Q^{-1}(\mathrm{true})$.

Because presheaves are $\mathrm{Truth}$-functors, we canonically have the domination relation between them. Given presheaves $P,P'\colon \mathcal{C}^{\mathrm{op}}\to \mathrm{Truth}$ then $P\le P'\quad \text {if and only if}\quad P(c)\le P'(c)\text { for all }c\in \mathcal{C},$ which when translated to downward closed sets becomes $P\le P'\quad \text {if and only if}\quad \tilde{P}\subseteq \tilde{P'}.$

Slogan: The poset of presheaves on a preorder can be identified with the set of downward closed subsets equipped with the subset ordering.

Similarly, the domination relation on copresheaves corresponds to inclusion of the associated upward closed subsets. However, it is usually the opposite of the category of copresheaves that crops up, so this has the opposite relation.

Slogan: The poset of opcopresheaves on a preorder can be identified with the set of upward closed subsets equipped with the superset ordering.

Sets can be thought of as discrete posets, that is to say, where $c\le c'$ if and only if $c=c$. In that case all subsets are both upward closed and downward closed. In this case, then, the set of copresheaves and the set of presheaves can both be identified with the powerset of the original set.

Slogan: The poset of presheaves on a set is the powerset with the subset ordering; the poset of opcopresheaves on a set is the powerset with the superset ordering

An enriched adjunction consists of a pair of enriched functors $F\colon \mathcal{C}\leftrightarrows \mathcal{D}\colon G$ together with an isomorphism in $\mathcal{V}$, natural in $c\in \mathcal{C}$ and $d\in \mathcal{D}$ $\mathcal{D}(F(c),d)\cong \mathcal{C}(c,G(d)).$ This means that when we enrich over the category of truth values we get a Truth-adjunction being a pair of order-preserving maps between posets $F\colon \mathcal{C}\leftrightarrows \mathcal{D}\colon G$ with the condition that $F(c)\le d \, \text { if and only if }\, c\le G(d).$ In other words we have the following slogan.

Slogan: A Truth-adjunction is precisely a Galois connection between preorders.

Then the adjoint equivalence induced on the fixed sets is precisely the Galois correspondence induced on the fixed set of the Galois connection.

### Profunctors

A profunctor between $\mathcal{V}$-categories $\mathcal{C}$ and $D$ is an a $\mathcal{V}$-functor $I\colon \mathcal{C}^{\mathrm{op}}\otimes \mathcal{D}\to \mathcal{V}.$ We should remind ourselves what the definition of the tensor product of $\mathcal{V}$-cateogries is. The set of objects of $\mathcal{A}\otimes \mathcal{B}$ is the set of ordered pairs $\text {ob}\mathcal{A}\times \text {ob}\mathcal{B}$ and the hom-sets are given by $\mathcal{A}\otimes \mathcal{B}((a,b),(a',b')):=\mathcal{A}(a,a')\otimes \mathcal{B}(b,b').$ It is perhaps worth noting that in order to define composition we will need that $\mathcal{V}$ has some extra structure such as being symmetric or braided.

When enriching over truth values, this means that for preorders $\mathcal{A}$ and $\mathcal{B}$ the preorder on $\mathcal{A}\times \mathcal{B}$ is given by $(a,b)\le (a',b')\quad \text { if and only if }\quad a\le a' \, \, \text {and}\, \, b\le b'.$ A profunctor in the $\mathrm{Truth}$ case can then be identified with a subset of $\mathcal{C}\times \mathcal{D}$ which is upward closed. Writing $c\preceq d$ for $I(c,d)$, we have: $\text {if}\, \, c\preceq d\, \, \text {and}\, \, (c,d)\le (c',d')\, \, \text {then}\, \, c'\preceq d'.$ This can make more sense if we expand it out: $\text {if}\, \, c'\le c\preceq d \le d' \, \, \text {then}\, \, c'\preceq d'.$ Slogan: Profunctors between preorders correspond to relations extending the preorders

### Nuclei

In the first post in this sequence I showed that a profunctor $I$ gives rise to an adjunction between presheaves and opcopresheaves, and thus to an adjoint equivalence between the fixed sets of the adjunction.

In the third post I showed how a relation between sets gives rise to a Galois connection between the powersets and thus to a Galois correspondence between the sets of closed subsets. I also gave several examples of correspondences around different areas of maths that arise in this way.

I will leave it to the interested reader to show that the Galois correspondence construction is what you get when you apply the profunctor nucleus construction to the situation of discrete $\mathrm{Truth}$-categories. It should drop out from the dictionary.

Slogan: The nucleus of a profunctor between discrete categories corresponds to the Galois correspondence between the closed subsets coming from a relation between two sets.

That’s not a very snappy slogan to end with, I grant you.

### Next time

In the next post I will show how costructing the nucleus of a profunctor when you enrich over something like the real numbers instead of truth values gives rise to some dualities in other areas of maths.

Author: "willerton (S.Willerton@sheffield.ac.uk)" Tags: "Categories"

Date: Monday, 10 Feb 2014 10:26

Guest post by Fosco Loregian

The aim of this note is to give a short account of

Freyd, P. J., & Kelly, G. M. (1972). Categories of continuous functors, I. JPAA, 2(3), 169-191.

as part of the Kan Extension Seminar series of lectures. I warmly thank all the participants and the organizer Emily Riehl for giving me this way of escape to the woeful solitude a “baby” category theorist (like I am preparing to become) suffers here in Italy. It is an amazing and overwhelming experience, I can’t even estimate the amount of things I already learned after these three lectures. There are two other people without whom this wouldn’t have been possible: my current advisor, D. Fiorenza, who patiently helped me to polish the exposition you are about to read, and my friend Paolo, for which the words “I don’t want to learn this” are meaningless.

That said, let’s begin with the real discussion.

Freyd and Kelly’s paper was the first to raise and solve in a very elegant way some fundamental questions in elementary Category Theory, the so-called Orthogonal subcategory problem, and Continuous functor problem.

### Orthogonality between arrows

Definition. An object $B$ in a category $\mathbf{A}$ is said to be orthogonal to an arrow $k\colon A\to X$ (we say $k\perp B$) if the hom-functor $\mathbf{A}(-,B)$ sends $k$ to a bijection $\mathbf{A}(X,B)\to \mathbf{A}(A,B)$ between sets.

If $\mathbf{A}$ has a terminal object, then $k\perp B$ if and only if the terminal arrow $B\to 1$ has the so-called (unique) right lifting property against $k$: this means that for any choice of $f$ in the diagram $\begin{array}{ccc} A &\stackrel{f}{\longrightarrow}& B \\ {}^k \downarrow && \downarrow \\ X & \longrightarrow & 1 \end{array}$ there exists a unique arrow $a\colon X\to B$ making the upper triangle commute. Obviously, there is a dual notion of left lifting property.

### Statement of the problem: the CFP and the OSP.

Now, a classical issue in elementary Category Theory is the so-called “orthogonal subcategory problem”:

Orthogonal Subcategory Problem (OSP). Given a class $\mathcal{H}$ of arrows in a category $\mathbf{A}$, when is the full subcategory ${\mathcal{H}}^\perp$ of all objects orthogonal to $\mathcal{H}$ a reflective subcategory (i.e., when there exists a left adjoint to the inclusion ${\mathcal{H}}^\perp\hookrightarrow \mathbf{A}$)? $\blacksquare$

There are lots of “protoypical” examples of the OSP in Algebra and Geometry: think for example to the case of sheaves of sets (on a given Grothendieck site) as a reflective subcategory among presheaves: the sheaf condition can be easily stated in terms of an orthogonality request: a presheaf $F$ on a site $(\mathbf{C},J)$ is a $J$-sheaf if and only if $i_C\perp F \quad \forall C\in\mathbf{C}$ for every covering sieve $i_C\colon S\to y_{\mathbf{C}}(C)=\mathbf{C}(-,C)$.

In fact this is not a case, since following a general tenet “(at least some) localizations are determined by an orthogonality class” (see for example the definition of $\mathcal{S}$-local object and the $n$Lab page about the OSP).

Freyd and Kelly were the first to point out that a solution for the OSP turns out to solve another fundamental question, which falls under the name of “continuous functor problem”:

Continuous Functor Problem (CFP). Given a category $\mathbf{C}$ and a class of diagrams (say $\Gamma$) in it, when is the category of all functors $\mathbf{C}\to \mathbf{D}$ which preserve limits of all $\Gamma$-shaped diagrams reflective in the category of functors $\mathbf{C}\to \mathbf{D}$? $\blacksquare$

(Important) Remark. Before going on, we must spend a word on the notion of continuity: Freyd and Kelly published an erratum shortly after the paper, to correct the “stupid mistake of supposing that the limit of a constant diagram is the constant itself”; counterexamples to this statement abound, and in fact it can be easily shown that the limit of the constant functor $\Delta(C)\colon \mathbf{J}\to\mathbf{C}$ is (whenever this copower exists) precisely $C^{\pi_0(\mathbf{J})}$, where $\pi_0(\mathbf{J})$ is the set of connected components of the category $\mathbf{J}$.

Once this is fixed, notice that the CFP arises in an extremely elementary way: for example,

1. An additive functor $F$ between abelian categories is left exact if and only if it commutes with finite limits, and
2. The above sheaf condition can be easily restated in the good old familiar continuity request on coverings of objects $C\in\mathbf{C}$.

This should give you evidence that the two problems are not unrelated:

Proposition. Given a class of diagrams $\Gamma$ in a small complete category $\mathbf{C}$, we get a family of natural transformations $\mathcal{G}(\Gamma)=\Big\{ m_\gamma \colon \colim \mathbf{C}(\gamma,-) \to \mathbf{C}\big( \lim\; \gamma,- \big) \Big\}_{\gamma\in\Gamma}$ and a functor $F\colon \mathbf{C}\to Set$ is $\Gamma$-continuous if and only if it is orthogonal to each arrow in $\mathcal{G}(\Gamma)$.

(This result is not in its full generality: see [FK], Prop. 1.3.1.)

Proof. The following diagram commutes, $\begin{array}{ccc} Nat (colim\; \mathbf{C}(\gamma,-),F) &\leftarrow & Nat (\mathbf{C}(lim \;\gamma,-),F)\\ \wr\!| && \wr\!| \\ lim\; Nat (\mathbf{C}(\gamma,-),F) && F(lim\; \gamma)\\ \wr| && \downarrow \\ lim\; F\gamma &=& lim\; F\gamma \end{array}$ and one of the two arrows is an isomorphism if and only if the other is. $\blacksquare$

### OSP $\Rightarrow$ CFP: Strategy of the proof.

The strategy adopted by Freyd and Kelly to solve the OSP, is to find sufficient conditions on $\mathcal{H}$ so that Freyd’s Adjoint Functor Theorem applies to the inclusion $\mathcal{H}^\perp\hookrightarrow \mathbf{A}$ (in particular, since it can be shown that $\mathcal{H}^\perp$ is always complete, this boils down to find a solution set for $\mathcal{H}^\perp$ to apply Freyd Adjoint Functor Theorem).

These conditions are of 1+3 different types:

1. Cocompleteness;
2. The presence of a proper factorization system;
3. The presence of a generator;
4. A (global) boundedness condition (or equivalently, on the generator in the previous point).

### Factorization systems

Notation. We denote $llp(\mathcal{H})$ (resp, $rlp(\mathcal{H})$) the (possibly large) class of all arrows left (resp, right) orthogonal to each arrow of the class $\mathcal{H}$.

In the previous notation, $k\perp B \iff k \in llp(B \to 1) \iff (B\to 1) \in rlp(k)$.

Definition. A prefactorization system on a category $\mathbf{A}$ consists of two classes of arrows $\mathbb{F}=(\mathcal{E},\mathcal{M})$ such that $\mathcal{E} = llp(\mathcal{M})$ and $\mathcal{M} = rlp(\mathcal{E})$.

A prefactorization system $\mathbb{F}$ on $\mathbf{A}$ is said proper if $\mathcal{E}\subset Epi$ and $\mathcal{M}\subset Mono$.

A factorization system (OFS, or simply FS) on a category $\mathbf{A}$ corresponds to the modern notion of orthogonal factorization system: a (proper) factorization on the category $\mathbf{A}$ is precisely a (proper) prefactorization $\mathbb{F}=(\mathcal{E},\mathcal{M})$ such that each $f\colon X\to Y$ can be written as a composition $X\stackrel{e}{\to}W\stackrel{m}{\to}Y$ with $e\in \mathcal{E}, m\in\mathcal{M}$.

Examples. 0. Any category $\mathbf{C}$ has two trivial factorization systems, namely $( Mor_\mathbf{C} , Iso_\mathbf{C} )$ and $( Iso_\mathbf{C} , Mor_\mathbf{C} )$, where $Iso_\mathbf{C}$ denotes the class of all isomorphisms, and $Mor_\mathbf{C}$ the class of all arrows in $\mathbf{C}$; 1. The category $Set$ has a factorization system $\mathbb{F}=(Epi,Mono)$ where $Epi$ denotes the class of surjective maps, and $Mono$ the class of injective maps. More generally, the category of models of any algebraic theory (monoids, (abelian) groups, …) has a proper FS $(Epi^\ast, Mono)$, where $Epi^\ast$ is the class of extremal epimorphisms (which may or may not coincide with plain epimorphisms); and for abelian categories, (elementary) toposes…

### Generators

Definition. If $\mathbf{A}$ is a category with a proper factorization system $\mathbb{F}$, we say that a family of objects $\{q_i\colon B_i\to C\}_{i\in I}$ lies in $\mathcal{E}$ if there exists a unique $t\colon C\to X$ solving (all at once) the lifting problems $\begin{array}{ccc} B_i & \stackrel{f_i}{\longrightarrow} & X \\ {}^{q_i} \downarrow && \downarrow^m\\ C & \longrightarrow & Y \end{array}$ (one for each $i\in I$). If $\mathbf{A}$ has sufficiently large coproducts, this condition is obviously equivalent to ask that the arrow $\left(\bar q\colon \amalg_{i\in I} B_i\to C\right)\in\mathcal{E}$.

Definition. A generator in a category with a proper factorization system $\mathbb{F}=(\mathcal{E}, \mathcal{M})$ consists of a small full subcategory $\mathbf{G}\subseteq\mathbf{A}$ such that for any $A\in\mathbf{A}$ the family $\{G\to A\}_{G\in\mathbf{G}}$ lies in $\mathcal{E}$ in the former sense.

Remark. Mild completeness assumptions on $\mathbf{A}$ entail that

1. A generator separates objects, i.e. if $f\neq g$ then there exists an object $G\in\mathbf{G}$ and an arrow $G\to A$ such that $f k\neq g k$.
2. A small dense subcategory of $\mathbf{A}$ is a generator;
3. Any finitely complete category with a generator is well-powered.

For extremal FSs (in which the left/right class coincides with that of extremal epi/mono) the converse of 1,2 is also true, so as to recover the notion of generator as a “separator for objects”.

### Boundedness

Notation. In this section $\mathbf{A}$ admits all limits and colimits whenever needed

Definition(s). An ordered set $J$ is said to be $\sigma$-directed (for a regular cardinal $\sigma$) if every subset of $J$ with less than $\sigma$ elements has an upper bound in $J$. A $\sigma$-directed family $\{C_j\to B\}_{j\in J}$ of subobjects of $B\in\mathbf{A}$ consists of a functor $J\to Sub_\mathbf{A}(B)$ from a $\sigma$-directed set to the posetal class of subobjects of $B$. The colimit of such a functor, denoted $\bigcup_{j\in J} C_j$ is called the $\sigma$-directed union of the family.

With these conventions, we say that an object $A\in\mathbf{A}$ is bounded by a regular cardinal $\sigma$ (called the bound of $A$) if every arrow $A\to \bigcup_{j\in J} C_j$ to a $\sigma$-directed union factors through one of the $C_j$. The category $\mathbf{A}$ is bounded if each $A\in\mathbf{A}$ is bounded by a regular cardinal $\sigma_A$ (possibly depending on $A$).

Example. In $\mathbf{A}= Set$ a set of cardinality $\le \sigma$ is $\sigma$-bounded.

Remark. $\sigma$-boundedness is obviously linked to local $\sigma$-presentability: [PK]’s locally $\sigma$-presentable categories are precisely those categories $\mathbf{A}$ which

• admit a generator $\mathbf{G}$ each of which object is $\sigma$-presentable.

Examples. Examples of such structures/properties on categories abound:

1. Any abelian, $AB(5)$, bicomplete and bi-well-powered category $\mathbf{A}$, is bounded;
2. Given a regular cardinal $\sigma$, locally $\sigma$-presentable categories are $\sigma$-bounded, and admit a generator with respect to the proper FS $(Epi^\ast, Mono)$: sets, small categories, presheaf toposes and Grothendieck abelian categories all fall under this example. Less obviously, the converse implications is false: exhibiting a $\sigma$-bounded category with a generator which is not locally $\sigma$-presentable requires to accept the inexistence of measurable cardinals (see [FK], Example 5.2.3).

### Solution of the OSP

Theorem (OS theorem). If $\mathbf{A}$ is complete, cocomplete, bounded and co-well-powered with a proper FS $\mathbb{F}=(\mathcal{E},\mathcal{M})$, and $\mathcal{H}$ is a class of arrows whose elements are “almost all” in $\mathcal{E}$, i.e. $\mathcal{H}=\mathcal{S}\cup \overline{\mathcal{E}}$ (we call these classes quasi-small with respect to $\mathcal{E}$), where

• $\mathcal{S}$ is a set;
• $\overline{\mathcal{E}}$ is possibly large but contained in $\mathcal{E}$.

Then $\mathcal{H}^\perp$ is a reflective subcategory. $\blacksquare$

Proof. [FK] performs a clever transfinite induction to generate a solution set for any object $A\in\mathbf{A}$: if $k\colon M\to N$ is the typical arrow in $\mathcal{S}$, we define

• (zero step) $\S_{0, A} = Quot_{\mathbf{A}}(A)$;
• (successor step) $\S_{\alpha+1, A} = \bigcup_L Quot_{\mathbf{A}}(L)$, where $L\in \Big\{ C\amalg \coprod_{M\to C}N\mid C\in \S_\alpha,\; (M\to N)\in \mathcal{S} \Big\}$
• (limit step) $\S_{\lambda, A} = \bigcup_W Quot_{\mathbf{A}}(W)$, where $W\in \Big\{ \coprod_{\alpha\lt \lambda} C_\alpha\mid C_\alpha\in \S_\alpha \Big\}$.

This is where boundedness comes into play: if $\sigma$ is the cardinal bounding $A$, then the induction stops at $\sigma$: $\S_{\sigma, A}\cap \mathcal{H}^\perp$ is the desired solution set for $A\in\mathbf{A}$, namely every arrow $f\colon A\to B$ whose codomain lies in $\mathcal{H}^\perp$ factors through some $X\in \S_{\sigma, A}\cap \mathcal{H}^\perp$.

### Solution of the CFP

The procedure we adopted to reduce the CFP to the OSP (building $\mathcal{ G}(\Gamma)$) doesn’t take care of any size issue: to repair this deficiency we exploit the following

Lemma. Let $\mathbf{A}$ be cocomplete, endowed with a proper factorization system $\mathbb{F}$ and a generator $\mathbf{G}$. For any class $\Theta$ of natural transformations in $Fun(\mathbf{C}, Set)$ we denote $\begin{array}{rl} \mathcal{H} &= \Big\{ \beta\otimes A\mid \beta\in\Theta,\; A\in\mathbf{A} \Big\} \\ \mathcal{H}_1 &= \Big\{ \beta\otimes G\mid \beta\in\Theta,\; G\in\mathbf{G} \Big\} \end{array}$ Then there exists a class $\mathcal{W}$ contained in $\mathcal{E}$ such that $\mathcal{H}^\perp = (\mathcal{H}_1\cup \mathcal{W})^\perp$.

The particular shape of $\mathcal{H}$ is due to the procedure used in [FK] to reduce the CFP to the OSP. The $\otimes$ operation is a copower, in the obvious sense: given $\beta\colon \mathbf{C}\to Set$, $\beta\otimes A\colon F\otimes A\to G\otimes A$, where $F\otimes A\colon C\mapsto F C\otimes A = \coprod_{c\in F C}A$.

The key point of this result is that the class $\mathcal{H}_1$ is small (obviously) whenever $\Theta$ is, so we can conclude applying the OS theorem:

Theorem (CF theorem). Let $\mathbf{C}$ be a small category, and $\mathbf{D}$ a bicomplete, bounded, co-wellpowered category with a generator and a proper factorization $\mathbb{F}=(\mathcal{E},\mathcal{M})$. Let $\Gamma$ be a class of cylinders whose elements are almost all cones (this means that the collection of diagrams which are not cones is a set). Then the subcategory of $\Gamma$-continuous functors is reflective in $Fun(\mathbf{C},\mathbf{D})$. $\blacksquare$

The rough idea behind this result is the following: $\mathcal{H}^\perp$ can be written as $(\mathcal{H}_1\cup\Omega)^\perp$, and $\mathcal{H}_1$ itself can be split as a union $\mathcal{H}_1^M \cup \mathcal{H}_1^{E}$, where the two sub-classes consist of the $\mathcal {M}$-arrows and the $\mathcal{E}$-arrows of the various $h\in\mathcal{H}$. The assumptions made on $\Gamma$ and the presence of a generator on $\mathbf{A}$ entail that $\mathcal{H}_1^M$ is a set, so we can conclude.

### The state of the art.

1. [FK]’s solution of the OSP can be generalized: [AHS] show that $\mathcal{H}^\perp$ is reflective in a category $\mathbf{A}$ with a proper FS $\mathbb{F}=(\mathcal{E},\mathcal{M})$ whenever the class $\mathcal{H}$ is quasi-presentable, namely it can be written as a union $\mathcal{H}_0\cup \mathcal{H}_e$, where $\mathcal{H}_e\subset \mathcal{E}$ and $\mathcal{H}_0$ is presentable (in a suitable sense).

2. The same paper offers a fairly deep point of view about the “weak analogue” of the OSP, which can be regarded as a generalization of the Small Object Argument (SOA) in Homotopical Algebra; if we build the class $\mathcal{H}^\square$ of arrows having a non-unique lifting property against each $h\in\mathcal{H}$, then we can only hope in a weak reflection, where the unit of the adjunction is only weakly universal. In a setting where “things are defined up to homotopy” this can still be enough, provided that we ensure the reflection maps satisfy some additional properties. The additional property requested in the SOA is that the weak reflection maps belong to the cellular closure of $\mathcal{H}$, i.e. they can be obtained as a transfinite composition of pushouts of maps in $\mathcal{H}$.

3. The theory of factorization systems is deeply intertwined with the SOA, too: in [SOA] R. Garner defines an “algebraic” Small Object Argument, exploiting a description of OFS and WFS as suitable pairs $(comonad, monad)$ over the category $\mathbf{A}^{\Delta^1}$. In this respect I think that the best person which can give us sensible references for this is our boss, since she wrote this paper.

[LPAC] Jiří Adámek, Jiří Rosický, Locally presentable and accessible categories, LMS Lecture Notes Series 189, Cambridge University Press, (1994).

[FK] Freyd, P. J., & Kelly, G. M. (1972). Categories of continuous functors, I. JPAA, 2(3), 169-191.

[SOA] R. Garner, Understanding the small object argument, Applied Categorical Structures 17 (2009), no. 3, pages 247-285.

[PK] P. Gabriel and F. Ulmer, Lokal präsentierbare Kategorien, Springer LNM 221, 1971.

[AHS] J. Adamek, M. Hebert, L. Sousa, The Orthogonal Subcategory Problem and the Small Object Argument, Applied Categorical Structures 17, 211-246.

Author: "riehl (eriehl@math.harvard.edu)" Tags: "Categories"

Date: Sunday, 02 Feb 2014 18:15

Last time, I described an emerging pattern in algebra and topology, exemplified by the cohomology of the space of configurations of ordered points in the plane. In language I’ll introduce below, cohomology with rational coefficients defines a uniformly representation stable sequence, the fundamental data of which is an $S_n$-representation for each $n$.

In this post, I want to tell you in more detail about the representation stable sequences introduced in a paper by Tom Church and Benson Farb. I’ll also discuss the second generation approach to representation theory, which centers of the functor category of what they call FI-modules. Jordan Ellenberg, the third author of this paper, has blogged about this also. The first post in his series can be found here. In contrast with Part I, my exposition here will be more explicitly directed at the categorically-minded reader.

## Representation stable sequences

Let $k$ be a commutative ring and let $S_n$ denote the symmetric group on $n$ letters. Write $V = (V_n)_{n \geq 0}$ for a collection of $S_n$-representations $V_n$. Equivalently, $V$ defines a functor $\oplus S_a \to \mathbf{Mod}_k$ whose domain is the disjoint union of the symmetric groups $S_n$, regarded as one-object groupoids (or, if you prefer, the category of finite sets and isomorphisms). This data is sometimes called a symmetric sequence or a species. The symmetric groups come with injective group homomorphisms

$S_1 \hookrightarrow S_2 \hookrightarrow S_3 \hookrightarrow \cdots$

where the image of $S_n \hookrightarrow S_{n+1}$ consists of those permutations that fix the element $n+1$. The “mapping telescope” $\mathbf{Tel}$ of this sequence is the category defined by gluing $\oplus S_n$ to the poset $\omega$ along the discrete category $\mathbb{N}$.

Definition. A consistent sequence of representations is a functor $V\colon \mathbf{Tel} \to \mathbf{Mod}_k$.

Explicitly, a consistent sequence is a symmetric sequence $V$ together with $S_n$-equivariant maps (natural transformations) $V_n \to \text{res}(V_{n+1})$ from $V_n$ to the restriction of the $S_{n+1}$-representation $V_{n+1}$ along $S_n\hookrightarrow S_{n+1}$. Here’s another way to think about this. Define a category $\mathbf{Rep}$ over $\omega$ whose objects in the fiber over $n$ are $S_n$-representations $V_n$. Hom-sets in Rep are defined by

$\text{Hom}_{\mathbf{Rep}}(V_m, V_n) = \text{Hom}_{S_m}(V_m, \text{res} V_{n}) = \text{Hom}_{S_{n}}(\text{ind}V_m, V_{n})$

if $m \leq n$ and empty otherwise. Here $\text{ind} \colon \mathbf{Mod}_k^{S_n} \to \mathbf{Mod}_k^{S_{m}}$ is left adjoint to the restriction along the composite inclusion $S_n\hookrightarrow S_m$. It’s precisely the left Kan extension. In representation theory, this functor is called induction, which explains the notation I’m using here. Now a consistent sequence is exactly a section $V$ to the projection functor $\mathbf{Rep} \to \omega$, i.e., a countable sequence of composable morphism in $\mathbf{Rep}$, with exactly one object in the fiber over each $n \in \omega$.

Definition. A consistent sequence of representations $V$ is a representation stable sequence if

(i) All but finitely many of the maps in $V \colon \omega \to \mathbf{Rep}$ are monomorphisms.

(ii) All but finitely many of the maps in $V \colon \omega \to \mathbf{Rep}$ are epimorphisms.

(iii) For each partition $\lambda$, the coefficients $c_{n,\lambda}$ of the irreducible representation $V(\lambda)$ in the decomposition $V_n \cong \oplus_{\lambda} V(\lambda)^{c_{n,\lambda}}$ is independent of $n$ for $n$ sufficiently large.

By definition, a morphism $V_m \to V_{n}$ in $\mathbf{Rep}$ is represented by either of the two adjunct arrows $V_m \to \text{res}V_{n}$ in $\mathbf{Mod}_k^{S_m}$ or $\text{ind}V_m \to V_{n}$ in $\mathbf{Mod}_k^{S_{n}}$. The map $V_m \to V_{n}$ is a monomorphism if and only if $V_n \to \text{res}V_{n}$ is injective and an epimorphism if and only if $\text{ind}V_m \to V_{n}$ is surjective.

The sequence is uniformly representation stable if the stable range in condition (iii) can be chosen independently of $\lambda$.

For example, consider the space $\text{Conf}_n(\mathbb{C})$ of configurations of $n$ labeled points in the plane. “Forgetting the last point” defines a continuous map $\text{Conf}_{n+1}(\mathbb{C}) \to \text{Conf}_n(\mathbb{C})$, inducing a map $H^i(\text{Conf}_n(\mathbb{C});\mathbb{Q}) \to H^i(\text{Conf}_{n+1}(\mathbb{C}),\mathbb{Q})$ that is $S_n$-equivariant in the sense described above. So $i$-th cohomology of configuration spaces defines a consistent sequence of representations. The theorem mentioned at the end of Part I now takes the following form:

Theorem. The consistent sequence of representations $H^i(\text{Conf}_n(\mathbb{C});\mathbb{Q})$ is uniformly representation stable with stable range $n \geq 4i$.

Later, this theorem was generalized to configurations of ordered points in any connected, orientable, manifold of finite-type (meaning that the rational cohomology consists of finite dimensional vector spaces).

## FI-modules

The second-generation account of representation stability begins with the following observation. The functor $\text{Conf}_{(-)}(\mathbb{C}) \colon \mathbf{Tel}^{op} \to \mathbf{Top}$ factors through a quotient of the category $\mathbf{Tel}$ that turns out to be much easier to describe. The consistent sequence of configuration spaces was defined using maps that “forget the last point.” But of course, we could have forgotten any of the points or any number of points. Moreover we could have chosen new labels for the points that we do not forget. These “forgetful” maps assemble into a functor $\mathbf{FI}^{op} \to \mathbf{Top}$, where FI is the category of finite sets and injections. (When notationally convenient, we’ll consider $\mathbf{FI}$ to be skeletal.)

Definition. An FI-module is a functor $\mathbf{FI} \to \mathbf{Mod}_k$. The category FI-Mod is the category of FI-modules and natural transformations.

A consistent sequence is an FI-module if and only if $V \colon \mathbf{Tel}(S) \to \mathbf{Mod}_k$ factors through the functor $\mathbf{Tel} \to \mathbf{FI}$ induced by the obvious cone under the pushout diagram defining the category $\mathbf{Tel}$. This is the case just when, for each $v$ in the image of the canonical map $V_m \to V_n$ ($m \lt n$), and for any $\sigma, \tau \in S_n = hom(n,n)$ with common restriction along $m \hookrightarrow n$, we have $\sigma \cdot v= \tau \cdot v$.

A consistent sequence that is not an FI-module is the sequence of regular representations

$k \to k[S^1] \to k[S^2] \to k[S^3] \to \cdots$

in which $k[S^n]$ is the free $k$-module module with basis $S_n$.

## Representable FI-modules and free FI-modules

For each $n$, there is a free FI-module that Church, Ellenberg, and Farb denote by $M(a)$ defined to be the composite of the functor represented by $a$ and the free $k$-module functor $k[-] \colon \mathbf{Set}\to \mathbf{Mod}_k$. So $M(a)_m = k[hom(a,m)]$. By the Yoneda lemma and the free-forgetful adjunction, a map of $FI$-modules $M(a) \to V$ corresponds to a vector $v \in V_a$.

There are more general notions of free FI-modules defined by forming the left Kan extension of an $S_a$-representation or of a symmetric sequence. (We might regard an $S_a$-representation as a symmetric sequence in which all of the other representations are 0.)

Write $\oplus S_a$-Rep for the functor category $\mathbf{Mod}_k^{\oplus S_a}$. Note that $\oplus S_a$ is the maximal subgroupoid of $\mathbf{FI}$. Left Kan extension defines the free-FI-module functor

$M(-)\colon \oplus S_a\text{-}\mathbf{Rep} \to \mathbf{FI}\text{-}\mathbf{Mod}$

which is left adjoint to restriction: the functor that sends an FI-module to its underlying symmetric sequence. Because the category $\mathbf{Mod}_k$ is cocomplete, we obtain an explicit formula for $M(-)$ as a special case of the formula for pointwise left Kan extensions:

$M(W)_n = \oplus_a \left( M(a)_n \otimes_{k[S_a]} W_a \right).$

If $k$ is a field (so that $W_a$ is free), this formula tells us that

$\text{dim} M(W)_n = \sum_a \left(\begin{array}{c} n \\ a \end{array}\right) \text{dim} W_a.$

## Categorical properties of FI-$\mathbf{Mod}$

FI-$\mathbf{Mod}$ is a category of functors taking values in a complete and cocomplete abelian category. Hence FI-$\mathbf{Mod}$ is also abelian and has these limits and colimits (and kernels and cokernels), defined pointwise. The tensor product over the commutative ring $k$ also induces a number of monoidal structures that we decline to discuss here. Another monoidal structure is induced by Day convolution from the disjoint union on FI.

The abelian category structure of FI-modules will have computational implications.

Definition. An FI-module $V$ is finitely generated if there exists a finite sequence of integers $m_i$ and an epimorphism $\oplus_i M(m_i) \to V$ in FI-Mod.

The map $\oplus_i M(m_i) \to V$ identifies a sequence of elements $v_i \in V_{m_i}$ that span $V$ in the sense that $V$ is the largest submodule containing each of the $v_i$. We have the elementary result:

Proposition. Consider a short exact sequence $U \to V \to W$ of FI-modules. If $V$ is finitely generated, then so is $W$. If both $U$ and $W$ are finitely generated, then so is $V$.

If $k$ is a Noetherian ring containing $\mathbb{Q}$, then FI-$\mathbf{Mod}$ is Noetherian, meaning that any submodule of a finitely generated FI-module is finitely generated. This is harder to prove.

## FI-modules and representation stability

The reason we care about finite generation is that in good cases it is equivalent to uniform representation stability.

Theorem. An FI-module over a field of characteristic zero is finitely generated if and only if its underlying consistent sequence is uniformly representation stable and each vector space is finite dimensional.

Moreover, it is computationally advantageous to show that a sequence of FI-modules is finitely generated. For instance, we have the following theorem, which is a special case of a more general result about character polynomials. (These are polynomials whose variables $X_i$ are the class functions $X_i \colon S_n \to \mathbb{N}$ that carry a permutation to the number of $i$-cycles in its cycle decomposition.)

Theorem. If $V$ is a finitely generated FI-module over a field of characteristic zero, then there is a polynomial $P$ with rational coefficients so that $dim V_n = P(n)$ for all $n$ sufficiently large.

## FI$\sharp$-modules

In some examples, FI-modules come with even more structure. Let us return our attention to the configuration spaces $\text{Conf}_n(M)$ but now assume that the manifold $M$ has non-empty boundary. This allows us to define maps $\text{Conf}_m(M) \to \text{Conf}_{n}M$ for $m$ less than $n$, which introduce new points in a small neighborhood of the boundary. Composition of maps of this form isn’t associative on the nose, but it is well-defined up to homotopy, which means that the cohomology $H^i(\text{Conf}_n(M);k)$ for $i$ fixed has both the structure of an FI-module and of a co-FI-module (a contravariant functor $\mathbf{FI}^{op} \to \mathbf{Mod}_k$).

Church, Ellenberg, and Farb call symmetric sequences of this form FI$\sharp$-modules, which are $\mathbf{Mod}_k$-valued functors whose domain is the category $\mathbf{FI}\sharp$ of finite sets are partially-defined injections. It’s convenient to think of $\mathbf{FI}\sharp$ as the category of (isomorphism classes of) spans in $\mathbf{FI}$. The span

$m \leftarrowtail s \rightarrowtail n$

specifies a subset of $m$ (the domain of the partial function) and a bijection (mediated by $s$) with its image, a subset of $n$. We call $|s|$ the rank of this map. There are natural bijective-on-objects inclusions $\mathbf{FI} \to \mathbf{FI}\sharp$ and $\mathbf{FI}^\op \to \mathbf{FI}\sharp$ whose images consists of those spans with one component the identity. In this way we see that an FI$\sharp$-module has both an underlying FI-module and co-FI-module structure.

The representable FI-module $M(a)$ can be extended to an FI$\sharp$-module, which we also denote by $M(a)$, in a unique way. Recall that $M(a)_s = 0$ for all $s \lt a$. Note that in $\mathbf{FI}\sharp$, the span $m \leftarrowtail s \rightarrowtail n$ factors through $s$ in the evident way (as a map in $\mathbf{FI}^\op$ followed by a map in $\mathbf{FI}$). So an immediate consequence of functoriality is that any map in $\mathbf{FI}\sharp$ of rank less than $a$ must define the zero map on the FI$\sharp$-module $M(a)$.

To define the linear map

$M(a)_m = k[\mathrm{hom}_{\mathbf{FI}}(a,m)] \to k[\mathrm{hom}_{\mathbf{FI}}(a,n)]$

corresponding to $m \leftarrowtail s \rightarrowtail n$, it of course suffices to specify the image for each basis element $a \rightarrowtail m$. To do so, form the composite span $a \leftarrowtail t \rightarrowtail n$. Either $a =t$, in which case this map lies in the subcategory $\mathbf{FI}\subset \mathbf{FI}\sharp$, and we define the image of $a \rightarrowtail m$ to be the basis vector $a \rightarrowtail n$. If not, this composite has rank less than $a$ and we must define its image to be zero.

Note the FI$\sharp$-module $M(a)$ is not the left Kan extension of the FI-module $M(a)$ along $\mathbf{FI}\hookrightarrow \mathbf{FI}\sharp$: the left Kan extension would be the representable FI$\sharp$-module. (Recall that Kan extensions are only guaranteed to be extensions in the case where the map being extended along is full and faithful.) This confused me for a while. A big difference between $\mathbf{FI}$ and $\mathbf{FI}\sharp$ is that objects in the latter category have non-invertible endomorphisms. Indeed, each subset $s \subset a$ defines an idempotent

$a \leftarrowtail s \rightarrowtail a$

whose splitting is the object $s$; note the category $\mathbf{FI}\sharp$ is idempotent complete.

Now allowing $a$ to vary, the $M(a)$ define a symmetric sequence of FI$\sharp$-modules (with $S_a$ acting on $M(a)$ by precomposition). So we can define a functor $M(-)$ from symmetric sequences to FI$\sharp$-modules by functor tensor product:

$M(W) = \oplus_a \left( M(a) \otimes_{k[S_a]} W_a \right).$

This functor induces a classification of FI$\sharp$-modules.

Theorem. $M(-) \colon \oplus S_a\text{-}\mathbf{Rep} \to \mathbf{FI}\sharp\text{-}\mathbf{Mod}$ defines an equivalence of categories.

A stronger version of this theorem was proven by Teimuraz Pirashvili in Dold-Kan type theorem for Gamma-groups.

If $k$ is a field, a corollary of this result and the dimension calculation above is that the dimension of any finitely generated FI$\sharp$-module $V$ is given by a single polynomial in $n$ for all $n$. In fact, over a field finite generation of an FI$\sharp$-module $V$ is equivalent to the dimensions dim $V_n$ being bounded. This, together with the Noetherianness mentioned above, provides useful tools for proving that FI-modules are finitely generated.

Author: "riehl (eriehl@math.harvard.edu)" Tags: "Algebra"

Date: Sunday, 02 Feb 2014 03:32

At the Joint Mathematics Meetings in Baltimore, I saw Benson Farb deliver a joint invited address on representation stability, the above eponymous “emerging pattern.” He began this work in 2010 with Tom Church, then a PhD student at the University of Chicago. I’ve been a fan for a few years now, and Benson’s beautiful talk has inspired me to write a brief summary.

This post is divided in two parts. In Part I, I’ll tell you about the talk, which was largely accessible to anyone with an undergraduate math background. In Part II, I’ll say a bit about the technical details and write about more recent developments, joint also with Jordan Ellenberg, in which some categorical ideas enable a simplified conceptual understanding of the patterns that frequently arise in practice.

## Configuration spaces and polynomials

The topological spaces we’ll consider are configuration spaces. Let $M$ be any topological space (typically a manifold) and write $\text{Conf}_n(M)$ for the space of ordered tuples of $n$ distinct points in $M$. The space $\text{Conf}_n(M)$ has a natural free action by $S_n$, the symmetric group on $n$ elements, that simply permutes the labels. The space of orbits, (the colimit of this action) is denoted by $\text{Conf}_n(M)/S_n$ and is the space of unordered tuples of $n$ distinct points in $M$.

Configuration spaces are basic mathematical objects. For example

$\text{Conf}_n(\mathbb{C})/S_n = \text{Poly}_n(\mathbb{C})$

is the space of monic, degree $n$, square-free polynomials. Here, the tuple of unordered points corresponds to the roots of the polynomial. The square-free condition guarantees that the roots are distinct. The spaces $\text{Poly}_n(\mathbb{C})$ are algebraic varieties. For example:

$\text{Poly}_2(\mathbb{C}) = \{ z^2 + b z + c \mid b^2 - 4 c \neq 0\}$ $\text{Poly}_3(\mathbb{C}) = \{ z^3 + b z^2 + c z + d \mid b^2 c^2 - 4 c^3 - 4 b^3 d - 27 d^2 + 18 b c d \neq 0\}.$

There are many reasons to care about the topology of these spaces. Even basic questions, such as whether $\text{Conf}_n(M)$ is connected, can have non-trivial applications to robotics. (Aside: I hereby propose “configuration” as the collective noun for a group of robots.) Another point of interest is that a loop in the space $\text{Conf}_n(\mathbb{C})/S_n$ defines a braid with $n$-strands. Indeed, the fundamental group $\pi_1(\text{Conf}_n(\mathbb{C})/S_n)$ of endpoint-preserving homotopy classes of loops is the braid group on $n$-strands.

It will be important in what follows for us to learn how to visualize this — the visualization is easiest if you replace $\mathbb{C}$ with a small disk. Picture $n$ distinct points in the plane, perhaps each in a different color, but as we’re visualizing the space of unordered configurations, you don’t have to remember which point has which color. We’ll visualize a path in this space as a “movie” in which we can watch the points move around. Each “frame” is a point in the configuration space. If we stack all the frames on top of each other in a direction orthogonal to the disk, the ambient space is a cylinder. As the colored points move around, they trace strings from the top of the cylinder to the bottom. Because the points never collide, the strings never intersect. Hence, a loop defines a braid. A loop in the space of ordered configurations is called a pure braid, meaning that each strand starts and ends in the same relative position.

The fundamental problem is to understand the topology of the spaces of configurations of ordered and unordered points. A standard way to try to do this is to compute the cohomology of the space. It’s okay if you don’t know exactly what this means. The point is that to any topological space $X$ one can define a sequence of rational vector spaces $H^i(X; \mathbb{Q})$ for each $i \geq 0$. These algebraic invariants capture a significant portion of the topological information about the space. For fixed $i$, our aim is to reduce infinitely many computations to finitely many by investigating how these vector spaces depend on $n$.

## Case study: first cohomology

Let us consider this problem for $i=1$. Elements of $H^1(\text{Conf}_n(\mathbb{C});\mathbb{Q})$ are represented by homomorphisms $\phi\colon\pi_1(\text{Conf}_n(\mathbb{C})) \to \mathbb{Q}$. To define such a $\phi$, we must assign an algebraic invariant (a rational number) to each loop in $\text{Conf}_n(\mathbb{C})$ in a natural way; here this means that the invariant must be additive with respect to concatenation of loops, so that the map $\phi$ is a homomorphism.

One idea is to use the winding number. For each pair $1 \leq i,j \leq n$ with $i \neq j$, define $\omega_{ij} \colon \pi_1(\text{Conf}_n(\mathbb{C})) \to \mathbb{Q}$ by

$\omega_{ij}(\alpha) =$ winding number of $j$ around $i$ while performing $\alpha$.

Note that $\omega_{ij}=\omega_{ji}$. Benson described a beautiful intuition for this observation that he learned from his advisor, William Thurston, who always encouraged him to project himself into the space. Imagine $i$ and $j$ are dance partners holding hands. As they turn around the dance floor, $i$ performs one clockwise rotation each time $j$ does so and conversely.

Proposition. $\{\omega_{ij} \mid 1 \leq i \lt j \leq n \}$ is a linearly independent in $H^1(\text{Conf}_n(\mathbb{C}))$.

Linear independence is implied by the existence of loops that produce a non-trivial winding number for some pair $(i,j)$ while fixing the remaining points. Moreover:

Theorem (Arnol’d 1968). $H^1(\text{Conf}_n(\mathbb{C});\mathbb{Q})$ is spanned by $\{\omega_{ij} \mid 1 \leq i \lt j \leq n \}$. Hence $H^1(\text{Conf}_n(\mathbb{C});\mathbb{Q}) \cong \mathbb{Q}^{\left(\begin{array}{c} n \\ 2 \end{array}\right)}$.

This is exactly the sort of closed form we might hope for. This leads to the next question:

Question: What is $H^1(\text{Conf}_n(\mathbb{C})/S_n; \mathbb{Q}) = H^1(\text{Poly}_n(\mathbb{C});\mathbb{Q})$?

As before, we seek a numerical invariant for loops of unordered configurations. One such invariant is the total winding number

$\alpha \mapsto$ total winding number of $\alpha = \sum_{i \neq j} \omega_{ij}(\alpha).$

It turns out this is the only invariant.

Corollary. $H^1(\text{Poly}_n) \cong \mathbb{Q} = \mathbb{Q}\{$total winding number$\}$.

Proof: $H^1(\text{Poly}_n) = H^1(\text{Conf}(\mathbb{C})/S_n)$ = $S_n$-fixed vectors in $H^1(\text{Conf}_n(\mathbb{C}))$ = $\mathbb{Q}$-span of the total winding number. $\square$

## Homological stability

More generally:

Theorem (Arnol’d 1968, F Cohen 1972). For any $i \geq 0$, $H^i(\text{Poly}_n)$ does not depend on $n$ for $n \geq i$.

This theorem implies that the spaces $\text{Poly}_n$ satisfy homological stability. While configurations of unordered points satisfy homological stability, configurations of ordered points do not. In fact, for each $i$

$\dim H^i(\text{Conf}_n(\mathbb{C})) \to \infty, n \to \infty.$

The reason that homological stability always fails for the spaces of ordered configurations has to do with representation theory. Because the symmetric group $S_n$ acts on $\text{Conf}_n(\mathbb{C})$, $S_n$ also acts on the vector space $H^i(\text{Conf}_n(\mathbb{C}))$.

A fact from representation theory is that (except for $n=4$) any vector space that admits a non-trivial $S_n$ action must have dimension at least $n-1$. (Technically this is only true if I can also exclude alternating representations. Exercise: prove that $H^1(\text{Conf}_n(\mathbb{C}))$ does not contain any alternating representations.) In particular

$\dim H^i(\text{Conf}_n(\mathbb{C})) \to \infty, n \to \infty$

and homological stability always fails. But somehow homological stability should hold morally. For example, up to permuting indices, there’s only one $\omega_{ij} \in H^1(\text{Conf}_n(\mathbb{C}))$. A more precise statement is that each $H^1(\text{Conf}_n(\mathbb{C}))$ is the same representation of $n$. From the basis described in Arnol’d’s theorem above, it is just the representation where $S_n$ acts in the expected way on unordered pairs of distinct elements of $n$.

## Representation theory of the symmetric groups

Here is a quick primer on representation theory. A representation is a $\mathbb{Q}$-vector space with a linear group action, or, equivalently, a functor from the one-object groupoid whose morphisms correspond to group elements to the category $\mathbf{Vec}_{\mathbb{Q}}$. A representation is irreducible if it has no invariant subspaces. By Maschke’s theorem, every representation decomposes into a direct sum of irreducibles. Thus, the goals of representation theory are to:

classify all the irreducible representations of a given group

understand how to decompose a given representation into irreducibles

Here, we’re interested in representations of symmetric groups, for which we have the following beautiful theorem.

Theorem (Young 1900). Irreducible representations of $S_n$ correspond to partitions of $n$.

For example:

The trivial representation of $S_n$ corresponds to the partition $n$.

The standard representation of $S_n$ (the orthogonal complement of the fixed vector in the permutation representation of $S_n$ on $\mathbb{Q}^n$) corresponds to the partition $(n-1)+1$.

We want notation for irreducible representations that is independent of $n$. For this, we denote the irreducible representation of $S_n$ corresponding to the partition

$(n-5) + 3 + 1 + 1$

of $n$ by $V(3,1,1)$. So $V(0)$ is the trivial representation, and $V(1)$ is the standard representation.

## Representation stability?

Maschke’s theorem says that

$H^i(\text{Conf}_n(\mathbb{C})) = \oplus_{\text{partitions} \lambda} V(\lambda)^{c_{n,\lambda}}$ for some indices $c_{n,\lambda}$. The main problem is to find the $c_{n,\lambda}$.

Example. $H^1(\text{Conf}_n(\mathbb{C})) = V(0) \oplus V(1) \oplus V(2)$ for all $n \geq 3$. Here the trivial representation $V(0)$ is the collection of fixed vectors, the span of the total winding number. The subspace $V(0) \oplus V(1)$ is the permutation representation of some collection of basis vectors $\{u_1,\ldots, u_n\}$ of $H^1(\text{Conf}_n(\mathbb{C}))$. These basis vectors have geometric meaning: $u_i(\alpha)$ is the total winding of the loop $\alpha$ around the point $i$.

This example implies that first cohomology of ordered configuration spaces is representation stable: the index $c_{n,\lambda}$ of the irreducible representation $\lambda$ is independent of $n$ for $n$ sufficiently large. Tom and Benson wondered whether this phenomenon would also occur for the higher cohomology groups. They recruited David Hemmer to compute $H^2(\text{Conf}_n(\mathbb{C}))$ for low $n$. Unfortunately, his calculations, aided by some computer algebra package whose name I didn’t catch, did not stabilize. But thinking this theory was too beautiful not to be true, Benson pressed him to try again, and it turned out that there was a bug in the software, relating to a failure to account for the outer automorphism of $S_6$. Once this was corrected, Hemmer calculated that the second cohomology groups appear to stabilize for $n \geq 8$.

This lead to the following result:

Theorem (Church-Farb 2010). Fix $i \geq 0$. Then $H^i(\text{Conf}_n(\mathbb{C})) = \oplus_{\text{partitions} \lambda} V(\lambda)^{c_{n,\lambda}}$ and $c_{n,\lambda}$ does not depend on $n$ for $n \geq 4i$.

## Applications

Now it turns out that the stable representations are quite complicated. See their paper. But nevertheless, this stability phenomenon has led to useful computations with numerous practical applications. I’ll mention just one: to combinatorial statistics for polynomials over a finite field $\mathbb{F}_q$. More details can be found in this paper, joint with Jordan Ellenberg.

By the Grothendieck-Lefschetz theorem, the multiplicity of a given irreducible representation in $H^i(\text{Conf}_n(\mathbb{C}))$ corresponds to a point count in the variety $\text{Poly}_n(\mathbb{F}_q)$ weighted by the character of the representation, and this correspondence is linear. Thus each index encodes some “statistic” for monic, square-free polynomials over $\mathbb{F}_q$. Representation stability implies that these statistics are asymptotically stable.

For example:

The multiplicity of the trivial representation $V(0)$ corresponds to the number of degree $n$ square-free polynomials, which is $q^n - q^{n-1}$.

The multiplicity of the standard representation $V(1)$ corresponds to the expected number of linear factors, which is $1 - \frac{1}{q} + \frac{1}{q^2} - \cdots \pm \frac{1}{q^{n-2}}$.

Author: "riehl (eriehl@math.harvard.edu)" Tags: "Algebra"

Date: Friday, 31 Jan 2014 17:39

Guest post by Mark Meckes

A recent paper of Mark’s proved the very substantial result that Minkowski dimension (one of the most important types of fractal dimension) can be derived from magnitude (an invariant ultimately coming from category theory). More exactly, he showed that the Minkowski dimension of a compact subset of $\mathbb{R}^n$ is exactly equal to its “magnitude dimension” (Cor 7.4). Here, Mark explains not this specific result, but the overall framework that makes such theorems possible. —TL

I’d like to report on some recentish progress in understanding the magnitude of compact metric spaces. To set the stage I’ll recap some background, which will repeat a number of things which have already appeared in posts by Tom and Simon. For those who want to be reminded of more, Tom has provided a reading list.

## Background, part I: finite spaces

The magnitude of a finite metric space $(A, d)$ is defined as follows. First, define the matrix $\zeta_A = \bigl[e^{-d(a,b)} \bigr]_{a,b \in A} \in \mathbb{R}^{A \times A}$. A weighting for $A$ is a vector $w \in \mathbb{R}^A$ such that for each $a \in A$,

(1)$(\zeta_A w)_a = \sum_{b\in A} e^{-d(a,b)} w_b = 1.$

If $w$ is a weighting for $A$, then the magnitude of $A$ is defined to be

(2)$| A | = \sum_{a \in A} w_a.$

In general, there are well-definedness issues with this definition as written. For the purposes of this post I’ll brush them aside by noting that if $\zeta_A$ happens to be a positive definite matrix, and hence invertible, then $A$ possesses a unique weighting. When this is the case, we call $A$ a positive definite metric space, or PDMS. For example, it turns out that all finite subsets of $\mathbb{R}^n$ (with the usual Euclidean metric) are positive definite metric spaces. PDMSs have the additional nice properties that their magnitude is always at least 1 (if they’re nonempty), and that magnitude is increasing with respect to inclusion.

## Background, part II: compact spaces

Given an invariant of finite metric spaces, it’s natural to try to extend it to some class of infinite spaces, say compact spaces. There are two obvious general approaches: approximate a given compact space by finite spaces, or adapt the definition to apply to infinite spaces directly.

Let’s first consider approximation. We’d like to say that if $A$ is a given compact metric space, we can find a sequence of finite spaces $A_k$ which approximate $A$ in some sense and whose magnitudes $| A_k |$ converge to some limit, which we could then take as the definition of $| A \vert$. There are well-definedness issues here, in particular whether different approximating sequences would give different values of $| A |$. The usual way to handle such an issue, if you can, is by appealing to continuity: if magnitude were a continuous function of a finite metric space (with respect to an appropriate topology), then we could define magnitude on the family of compact spaces as its unique continuous extension.

Fortunately, there is a natural topology on the family of isometry classes of compact metric spaces. Unfortunately, magnitude is not a continuous function of a finite metric space. However, if we restrict to finite PDMSs, it turns out that magnitude is lower semicontinuous, or lsc for short. Like a continuous function, an lsc function has a canonical extension from a domain to its closure—not a unique lsc extension, but a minimal one. So we can naturally define the magnitude of compact PDMSs (a compact space is a PDMS if each of its finite subspaces is) as the minimal lsc extension of the magnitude of finite PDMSs.

Concretely, this means that $| A |$ is well-defined as the limit of $| A_k |$, whenever $A_k$ is a sequence of finite PDMSs which approximate $A$ and for which $| A_k |$ is increasing. Since magnitude is increasing with respect to inclusion for finite PDMSs, the last condition can be guaranteed by taking an increasing sequence $A_k$. (Notice that this minimal lsc extension of magnitude a priori takes values in $[1, \infty]$. I don’t know whether every compact PDMS has finite magnitude. Tom proved that every compact subset of $\mathbb{R}^n$ has finite magnitude.)

Using roughly this approach (although before the semicontinuity result was proved), Tom and Simon showed, among other things, that the magnitude of an interval of length $\ell$ is $1 + \ell/2$. So it became clear that magnitude knows something about classical geometric invariants.

The approximation approach leads to some nice concrete results, and is justified by semicontinuity. But it’s rather unsatisfactory in at least two ways. On the one hand, it would simply be nicer from an aesthetic point of view to give a direct definition of the magnitude of a compact space, which extends the original definition; it might also shed new light on the original definition. On the other hand, such a direct definition would involve some notion of weighting for a compact space, which would provide a new tool for analyzing magnitude.

Later, Simon gave a definition of magnitude for certain compact spaces which extends the original definition. Given a compact metric space $(A, d)$, a weight measure for $A$ is a measure $\mu$ such that for each $a \in A$,

(3)$\int_A e^{-d(a,b)} d\mu(b) = 1.$

If $\mu$ is a weight measure for $A$, then we define

(4)$| A | = \mu(A).$

It turns out that this approach agrees with the approximation approach. That is, if $A$ is a compact PDMS and has a weight measure, then this definition of $| A |$ agrees with the earlier one. On the other hand, the domains of definition are different: not every compact PDMS has a weight measure, and many spaces which are not positive definite do possess weight measures.

Using this definition, Simon studied magnitudes of homogeneous Riemannian manifolds. Again, he found that magnitude knows about classical geometric invariants. However, the fact that many (probably most, in some sense) spaces don’t have weight measures makes the measure-based definition of magnitude unsatisfactory as well.

## Weightings for compact PDMSs

More recently, I found a definition of magnitude for a compact PDMS which directly extends the original definition for a finite space. To begin with, we take inspiration from Simon’s definition and think of the weighting $w$ of a finite space as the measure (more precisely, as a signed measure, since its components need not be positive)

(5)$\sum_{a \in A} w_a \delta_a,$

where $\delta_a$ is a point mass at $a$. We next take inspiration from the finite approximation definition and say that the weighting of a compact space $A$ should be a limit of weightings of finite spaces. That is, it should be a limit of some sequence of finitely supported signed measures on $A$. To make sense of this, we need to put some topology on the space $F M(A)$ of finitely supported signed measures. Since it’s a vector space, we’d probably like to make it a topological vector space. The nicest topological vector spaces are inner product spaces, so it makes sense to try to find an inner product on $F M(A)$ which will work nicely with magnitude.

If $A$ is a PDMS, then such an inner product is staring us in the face (although it took many months before I noticed that): if $B \subseteq A$ is finite, then

(6)$\Bigl\langle \sum_{a \in B} v_a \delta_a, \sum_{b \in B} w_b \delta_b \Bigr\rangle_{\mathcal{W}} := \sum_{a, b \in B} v_a e^{-d(a,b)} w_b.$

Define $\mathcal{W}_A$ to be the Hilbert space completion of $F M(A)$ with this inner product. A weighting for $A$ will be an element of $\mathcal{W}_A$ which satisfies an appropriate analogue of the original condition $\zeta_A w = 1$.

When $v = \sum_{b \in B} v_b \delta_b \in F M(A)$, for some finite $B \subseteq A$, we can straightforwardly define the function

(7)$Z v(a) = \sum_{b \in A} e^{-d(a,b)} v_b$

for $a \in A$. It’s not hard to show that $Z v$ is a bounded continuous function, and that $\| Z v \|_\infty \le \| v \|_{\mathcal{W}}$, so the linear map $Z:F M(A) \to C(A)$ has a unique continuous linear extension $Z:\mathcal{W}_A \to C(A)$. We can now define a weighting of $A$ to be a $w \in \mathcal{W}_A$ such that $Z w(a) = 1$ for each $a \in A$.

The map $Z$ is injective, so $A$ has at most one weighting. Moreover, $A$ has a weighting if and only if there is a uniform bound on the magnitudes of its finite subsets. In other words, $A$ possesses a weighting iff the magnitude $| A |$, as defined by finite approximations, is finite. (This also means that I don’t know whether every compact PDMS has a weighting, but I do know that every compact subset of $\mathbb{R}^n$ does.)

Now that we know what a weighting for $A$ is, we’re ready to define the magnitude of $A$. This is slightly less straightforward, since if $w \in \mathcal{W}_A$ is not in $F M(A)$, then there is no meaning to $w_a$. To get around this, we notice that the original definition for a finite space can be rewritten as

(8)$| A | = \sum_{a \in A} w_a \cdot 1 = \sum_{a \in A} w_a \sum_{b \in A} e^{-d(a,b)} w_b = \sum_{a,b \in A} w_a e^{-d(a,b)} w_b,$

and the latter sum fits right into our Hilbert space framework. So, if $A$ is a compact PDMS with weighting $w \in \mathcal{W}_A$, then we define the magnitude of $A$ to be

(9)$| A | = \langle w, w \rangle_{\mathcal{W}} = \| w \|_{\mathcal{W}}^2.$

This fits with all the previous definitions: if $A$ is a finite PDMS, this simply repeats the original definition in a more complicated way; if $A$ is a compact PDMS, this gives the same value of magnitude as finite approximation; and if $A$ is a compact PDMS with a weight measure $\mu$, then $\mu$ is a weighting for $A$ and $\mu(A) = \| \mu \|_{\mathcal{W}}^2$.

So what does this give us that we didn’t have before? As hoped for, it’s a definition which works for a broad class of spaces, and is (to me, at least) more elegant than approximation by finite spaces. It may shed a little bit of light on the original definition of magnitude for finite spaces—it suggests that one should think of the magnitude not, as originally written, as the sum of the entries of the weighting $w$, but as the double sum

(10)$\sum_{a,b \in A} w_a e^{-d(a,b)} w_b.$

In other words, it suggests that the matrix $\zeta_A$ should be thought of, not as defining a linear map, but as defining a bilinear form. (The same vague, speculative remark applies more generally to the definition of the magnitude of an enriched category.)

What about the hope that the weighting $w$ of $A$ would turn out to be a useful tool? Before addressing that, let’s first specialize to subsets of $\mathbb{R}^n$.

## Weightings and magnitude in Euclidean space

If $A \subseteq \mathbb{R}^n$, then the linear operator $Z$ is a convolution operator, which suggests using a bunch of Fourier mumbo-jumbo. If you do that, it turns out that $\mathcal{W}_A$ is a subspace of the Sobolev space (or more properly, Bessel potential space) $H^{-(n+1)/2}$, and the norm is, up to a constant depending on $n$, the standard Sobolev norm on $H^{-(n+1)/2}$. This isn’t the most familiar, classical sort of thing—it’s a Sobolev space, but with a negative, fractional (half the time) exponent—but it is a known and well-understood thing. Its elements are distributions, not functions, but a lot of machinery exists to work with such things.

Even better, $Z$ maps $\mathcal{W}_A$ into the Sobolev space $H^{(n+1)/2}$. This is a Sobolev space whose elements are functions. (Real, honest, continuous functions—not even almost-every-equal equivalence classes!) If $w$ is the weighting of $A$, then $h = Z w \in H^{(n+1)/2}$ is, by definition, the unique function in $Z(\mathcal{W}_A)$ such that $h(a) = 1$ for all $a \in A$. It’s not hard to show that $| A | = c_n \| h \|^2_{H^{(n+1)/2}}$, for an explicit dimensional constant $c_n$. With a bit more work, one can show that

(11)$| A | = \inf \left\{c_n \| f \|_{H^{(n+1)/2}}^2 \,:\, f \in H^{(n+1)/2},\,\, f \equiv 1 \, on \, A \right\},$

and the infimum is uniquely achieved when $f = h$.

This reformulation of magnitude in terms of a variational problem for functions turns out to be extremely useful. It opens the door to using tools from Hilbert space geometry, Fourier analysis, potential theory, and PDEs to study magnitude. But this post has gotten to be rather long, so I’ll leave it to you to read the paper (and hopefully future papers and posts) to see how!

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Geometry"

Date: Thursday, 30 Jan 2014 11:45

There’s a meeting in Erlangen coming up!

Structures on Tensor Categories and Topological Field Theories, Department Mathematik, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, March 4-6, 2014, organized by Catherine Meusburger and Christoph Schweigert.

This is a three day learning event on recent developments in (extended) topological field theories. The focus is the interplay of structures on tensor categories and topological field theories, including structures related to non-semisimple categories.

The idea is not to have yet another workshop or conference but to study these topics in depth, with a schedule more like the one of an Oberwolfach workshop, with a lecture series and plenty of time for discussions. It will include a lecture series by Christopher Schommer-Pries on topological field theories and the underlying categorical structures.

Speakers include:

• Bruce Bartlett, Oxford University
• Alain Bruguières, Université Montpellier II
• Thomas Kerler, Ohio State University
• Michael Mueger*, Radboud Universiteit Nijmegen
• Christopher Schommer-Pries, Max Planck Institute for Mathematics, Bonn
• Noah Snyder*, Columbia University
• Alexis Virelizier*, Université Lille I

* to be confirmed.

Author: "john (baez@math.ucr.edu)" Tags: "Topological Field Theory"

Date: Monday, 27 Jan 2014 19:52

Guest post by Eduard Balzin

The Kan extension seminar continues, and with it we now come to the paper by Ross Street. Published in 1972, this paper is one of the first instances where the notion of a monad was made relative to an arbitrary 2-category. A lot of aspects, such as algebras over a monad, Eilenberg-Moore and Kleisli categories, the relation with adjunctions, were generalised accordingly. Other topics involve representability, duality, and the example of $\mathbf{Cat}$. I will therefore try to talk about this paper in detail, and add something of my own.

Myself, in the course of my (higher)-categorical education, I have not learned in full about monads. I do not mean definitions or main theorems, but rather the answer to the question why monads. While preparing for the seminar, I’ve managed to give a series of answers to this (my own) question, and I hope that the discussion will take me (and everyone else) even further.

Let me sincerely thank Emily Riehl for organising the seminar. I truly hope it will be a new castle on the landscape of categorical life. I am also quite grateful for the participants of the seminar, who have greatly contributed during the online discussion and in their readers discussion. It all makes me happy.

I. What follows below is my summary and exposition of Street’s paper. At times, I add words of mine. The notation is not exactly of Street, but follows him closely for reading convenience. In addition, I rearranged some of the paragraphs in my exposition.

At the time this paper has written, monads had been around for quite a while and had achieved some recognition due to their usefulness in homological algebra and homotopy theory. Indeed, this is not the first paper where a monad internal to an arbitrary (weak) 2-category appears. Apparently the first mention of such general monads was made by Jean Bénabou in 1967. Formal Theory of Monads also restricts its attention to 2-categories instead of bicategories, strict 2-adjunctions instead of coherent ones. This brings more transparency to the theory but lessens the ability to apply the results of this paper directly.

II. A monad in a 2-category $K$ is a monoid object $S$ inside $K(X,X)$ for some $X \in K$. Thus we have ‘multiplication’ 2-cells $\mu: S S \to S$ and ‘unit’ $\eta: 1_X \to S$ such that if we draw associativity and unit diagrams, we witness commutativity of those. A monad morphism $(U,\phi):(X,S) \to (Y,T)$ consists of a 1-cell $U: X \to Y$ together with a 2-cell $\phi: T U \to U S$. It should be required that $\phi$ preserves units and associativity. A monad functor transformation $(U, \phi) \to (V, \psi)$ is a 2-cell $U \to V$ suitably commuting with 2-cells. For each $K$, this defines a 2-category $\mathbf{Mnd}(K)$, the construction is actually functorial in $K$.

Maybe some readers are not willing to take this definition for granted. Then I remind them of the following situation in $\mathbf{Cat}$. For a monad $S: C \to C$, a $C$-algebra is an object $x \in C$ equipped with a map $S x \to x$, commuting suitably with $\mu$ and $\eta$. The map of monads $(U,\phi):(C,S) \to (D,T)$ will send an $S$-algebra $x$ to a $T$-algebra.

A 2-category $K$ admits construction of algebras if the inclusion 2-functor $Inc:K \to \mathbf{Mnd}(K)$, sending $X$ to $(X,1_X)$, has a right adjoint (in the strict 2-categorical sense) $Alg:(X,S) \mapsto X^S$. ($Inc$ always has a left adjoint $Und:(X,S) \mapsto X$.) The counit of 2-adjunction gives rise to a monad morphism $(U^S, \chi) : (X^S,1) \to (X,S)$ through which each morphism of monads $(Y,1) \to (X,S)$ factors. The adjunct of $(S, \mu) \colon (X,1) \to (X,S)$ defines a map $J^S:X \to X^S$ that is left adjoint to $U^S$ and such that $S = U^S J^S$. This gives us a theorem which says that when $Alg$ exists, each monad is generated by an adjunction.

Also, for each adjunction $J:X \leftrightharpoons A:U$, there is a comparison 1-morphism $A \to X^{UJ}$. $A$ is monadic when this is an equivalence (Street asks for an isomorphism, though).

III. The paper also studies representability questions. In $\mathbf{Cat}$, we know that $C^S$ has the form of the category of algebras for a monad mentioned above. By direct consideration, it turns out, that there is an isomorphism (of categories — things are strict in this paper!) $K(Y, X^S) \cong K(Y,X)^{K(Y,S)}$ where $K(Y,S):K(Y,X) \to K(Y,X)$ is the monad induced by Yoneda embedding. With this, we have two things to mention:

(a) $X^S$ represents a functor $K^{op} \to Cat$. It turns out that $X^S$ can be obtained as a weighted limit. This is observed but not proved in the paper, but was actually done afterwards.

(b) Many things about monads in $K$ can be deduced by passing to $\mathbf{Cat}$-presheaves. For instance, $N:A \to X^S$ is an equivalence iff for each $Y \in K$ the functor $K(Y,N):K(Y,A) \to K(Y, X^S) \cong K(Y,X)^{K(Y,S)}$ is an equivalence. So $A$ is monadic iff $K(Y,A)$ is for each $Y$.

IV. For a given object $X$ of $K$, we can form the comma category $\mathbf{Mnd}(K)/X$ (we use the underlying functor) of monads over $X$. One can ask, given a monad $(A,R)$ and $E: A \to X$, if there is a pushforward monad $E_{\ast} R$ on $X$ which is universal (any monad morphism $(A,R) \to (X,S)$ factors through the canonical one $(A,R) \to (X,E_{\ast} R)$). Those $(A,R,E)$ for which this pushforward exists are called tractable.

For instance, we can try taking $E_\ast R$ to be the right Kan extension of $E R: A \to X$ along $E: A \to X$, and this (if exists) will give us a desired universal pushforward. When $R$ is unity, this is precisely what is called codensity monad. Observe that when $E$ has a left adjoint $J$, we can always take $E_{\ast} {1_A} = E J$, and this will satisfy the desired universal property. Thus codensity monads are monads one would get if $E:A \to X$ actually had a left adjoint.

Let us restrict our attention to the category $\mathbf{Mnd}(X)$ which is the same as the inverse image of the underlying functor $\mathbf{Mnd}(K) \to K$ over $X$. The construction of algebras can be actually seen as a ‘semantics’ functor $Sem: \mathbf{Mnd}(X) \to \mathbf{Mnd}(K)/X$, $(X,S) \mapsto (X^S,1,E^S)$, and it will land in the full subcategory $\mathbf{Tract}(X)$ of tractable objects. For each tractable object, forming the universal monad gives us the ‘structure’ functor $Str: \mathbf{Tract}(X) \to \mathbf{Mnd}(X)$, $(A,R,E) \mapsto (X,E_{\ast} R)$. $Sem$ is right adjoint to $Str$, and the counit of this adjunction is moreover an isomorphism.

V. For a 2-category $K$, there are its duals $K^{op}$, $K^{co}$, and $K^{coop}$. There is a big section of the paper which studies questions involving interpretations of monads and the construction of algebras in these dual 2-categories.

In $K^{op}$ we again get monads (and so if $S$ is generated by an adjunction in $K^{op}$, same is true in $K$), but the functors between them are different. Precisely, if we consider $\mathbf{Mnd}(K^{op})^{op}$ (so that the direction of 1-morphisms is the same as in $K$), we get that a 1-morphism $(X,S) \to (Y,T)$ here is a monad opfunctor in $K$: a 1-cell $U: X \to Y$ with a 2-cell $\phi: U S \to S T$, opposite direction. The category of algebras in this case is denoted $X_S$, and comes with canonical $J_S: X \to X_S$. That’s the generalisation of Kleisli category.

Opfunctors often appear if one considers adjunctions: for two monads (X,S) and (Y,T) and an adjunction $J:Y \rightleftharpoons X:U$, supplying the right adjoint $U$ with a 2-cell to make it into a monad functor is the same as supplying $J$ in a 2-cell to make it into a monad opfunctor.

Monads in $K^{co}$ and $K^{coop}$ by contrast are comonads in $K$. Of more interest for consideration (for reasons explained below) is the 2-category $\mathbf{Mnd}(K^{coop})^{coop}$, whose objects are comonads and morphisms are comonad opfunctors (the 2-cells are going in the same direction as 2-cells for monad functors). Again, when we have the construction of algebras for $K^{coop}$, there is a universal object $X_S$ for each comonad $S$, with the opfunctor $X \to X_S$.

If there is an adjunction $J:X \rightleftharpoons A:U$, then we get a monad on $X$ but a comonad on $A$. When we do the $X^S$ construction for a monad, the fact that $S=E^S J^S$ gives us a comonad $J^S E^S$ on $X^S$. This allows to define a 2-functor $ALG: \mathbf{Mnd}(K) \to \mathbf{Mnd}(K^{coop})^{coop}$, which sends $(X,S)$ to $(X^S, J^S E^S)$ (provided that $K$ admits the construction of algebras). Dually, there is $COALG: \mathbf{Mnd}(K^{coop})^{coop} \to \mathbf{Mnd}(K)$ (if, again, $K^{coop}$ is nice enough). A big calculation reveals that this is an adjunction: $COALG:\mathbf{Mnd}(K^{coop})^{coop} \rightleftharpoons \mathbf{Mnd}(K):ALG$ which can be thought of lifting one of the construction of algebras adjunctions $CoAlg:\mathbf{Mnd}(K^{coop})^{coop} \rightleftharpoons \mathbf{Mnd}(K):Inc$ (here $CoAlg$ is just a fancy name for the ‘coop’ of $Alg$ on $K^{coop}$). Street hypothesises if one can use this lifting idea together with ‘2-triangle’ theorems to actually prove that if $K^{coop}$ admits algebras, then so does $K$. One could hope that a version of 2-lifting theorem would obtain the right adjoint $ALG$ as above from knowing the right adjoint $Inc$ to $CoAlg$, and then apply the (dual of the) underlying functor $\mathbf{Mnd}(K^{coop})^{coop} \to K$ after $ALG$, as to show that $K$ admits algebras for monads. That would be something special, however.

The other result mentioned in this section is when $K$ has a duality involution $(-)^{op}:K \to K^{co}$. In that case, $K$ admits construction of algebras whenever $K^{op}$ does, and $Alg(X,S) \cong Alg(X^{op},S^{op})^{op}$.

VI. A distributive law consists of two monads $(X,S)$ and $(X,T)$ together with a 2-cell $\lambda: ST \to TS$ so that $\lambda$ makes $T$ into a monad functor $(X,S) \to (X,S)$ with $\eta_T$ and $\mu_T$ becoming monad morphism transformations (equivalently, $S$ becomes a monad opfunctor on $(X,T)$). An example here is given by

$Mon: Set \to Set$, the monoid monad, and

$Ab: Set \to Set$, the abelian group monad

(the usual distribution of $+$ and $\times$ gives a map $Ab Mon \to Mon \Ab$). By writing out what $\mathbf{Mnd}(\mathbf{Mnd}(K))$ means, one can witness that distributive laws in $K$ are the objects of $\mathbf{Mnd}(\mathbf{Mnd}(K))$. Moreover, using the $\lambda$’s coming with each distributive law, one can define a natural transformation $\mathbf{Mnd}\mathbf{Mnd} \to \mathbf{Mnd}$, which together with $Inc: Id \to \mathbf{Mnd}$ makes $\mathbf{Mnd}$ into a (3-)monad on $2$-$\mathbf{Cat}$.

In addition, one can also consider the ‘opfunctor’ monad $\mathbf{Mnd}^{op} = (-)^{op} \mathbf{Mnd} (-)^{op}$. Taking compositions like $\mathbf{Mnd}^{op} \mathbf{Mnd}^{op}$, $\mathbf{Mnd} \mathbf{Mnd}^{op}$ or $\mathbf{Mnd}^{op} \mathbf{Mnd}$ again gives distributive laws (as objects), but different kinds of morphisms between them. One can ask if the resulting monads $\mathbf{Mnd}$ and $\mathbf{Mnd}^{op}$ are related by a distributive law; however one can actually check that the law is a canonical isomorphism:

$\mathbf{Mnd}^{op} \mathbf{Mnd} \cong \mathbf{Mnd} \mathbf{Mnd}^{op}.$

VII. Essentially, the sole example of this paper is $\mathbf{Cat}$. Using the fact that, for a monad $(C,S) \in \mathbf{Mnd}^{op}(\mathbf{Cat})$, the category of algebras $C_S$ is in fact what is known as Kleisli category, which certainly exists (its objects are the same as of $C$ and morphisms are maps of the form $x \to S y$), we conclude that $\mathbf{Cat},\mathbf{Cat}^{op},\mathbf{Cat}^{co},$ and $\mathbf{Cat}^{coop}$ all admit the construction of algebras. A really nice result which follows is the statement that for a monad $S: C \to C$, $C^S \cong C \times_{Pr(C)} Pr(C_S)$ where $Pr(-)=Fun((-)^{op},Set)$ and $Pr(C_S) \to Pr(C)$ is induced by the canonical functor $J_S: C \to C_S$. So we see that algebras for $S$ is a full subcategory of presheaves on Kleisli category which are representable when restricted to $C$.

Finally, for $V$-$\mathbf{Cat}$, it is stated that the formal theory is applicable to this 2-category once a couple of properties are satisfied. Unfortunately, not much more is said about the enriched categories case in detail.

VIII. And that was it, (some of) the formal theory of monads! I must say that the 2-categorical content of that paper is really interesting, and one could ask if there is a way to proceed in the same manner in a higher $(\infty,2)$-categorical setting. Another thing is that, if one wishes to discuss monads in full, it is (as I presume) quite desirable to look beyond the formal theory for as many examples as possible. This is why I conclude my exposé with some references I know to the subjects that look interesting. Everyone is welcome to make her or his own list!

There is a follow-up of “The formal theory of monads”, called “The formal theory of monads, II”, in which the idea of $X^S$ being a weighted limit is taken seriously, to the extent that the authors consider, for a 2-category $K$, its completion $EM(K)$ with respect to these limits. The 1-interior of $EM(K)$ turns out to be the same as the one of $\mathbf{Mnd}(K)$, and moreover one can use $EM(K)$ to work with monads in a very useful way using mostly the universal property of this completion. The authors also study wreaths — the objects of $EM(EM(K))$, and in addition the paper is quite filled with examples for the notions presented.

The result everyone knows today as the Monadicity Theorem has a lot of applications. One of them which I like is the result that, given an (elementary) topos $T$, the ‘maps into the classifier’ functor $A \mapsto \Omega^A$ from $T^{op} \to T$ is monadic. Because of this, $T$ has all finite colimits. On the other hand, when I talk to some (derived) algebraic geometers, they tell me that from their experience the only place where monadicity theorem comes close to be used fully and faithfully is Tannaka duality, and even there it can be avoided as was done in some first expositions. This might be subjective; however my search of any details on the question revealed a paper named The Formal Theory of Tannaka Duality, the author of which claims that it is inspired by Ross Street’s writeup! One might want to have a look for the detail.

Monads are strongly connected with algebraic theories (the latter term can actually be used here in a general sense). What is less known, probably, is that one can use monads themselves to describe some ‘ring-like’ objects. What I mention here is the theory of generalised commutative rings (which are some special monads on $Set$) by Durov, which appeared in his thesis, and can include such things as the field of one element as an example. One can then do a lot of algebraic geometry with this monads, up to K-theory and intersection stuff.

Monads do appear in higher categories. When working with model categories, one can work with weak monadic projections to treat Bousfield localisation (e.g. see Section 11.2 of Simpson’s book on Segal categories). Another thing is that it is often the case that one runs into a question if the category of algebras for a monad (or even a comonad) admits a model structure. For monads, one can see this, and for comonads, this and probably the recent one(!) . Finally, our dear host of this seminar has co-authored a paper which treats homotopy coherent adjunctions and monadicity theorem in quasi-categories, in a way much simpler than originally done by Jacob Lurie in his DAG II (the approach of Lurie still has some interesting flavour, coming from the treatment of algebraic structures via fibrations).

Author: "riehl (eriehl@math.harvard.edu)" Tags: "Categories"

Date: Wednesday, 22 Jan 2014 12:54

I’ve just arXived a new paper about a new invariant, The magnitude of a graph. Much of the development of this invariant has taken place at this blog, with two previous posts and crucial contributions from David Speyer and Simon Willerton. Your comments, critical or otherwise, would be very welcome.

Magnitude seems to be orthogonal to other graph invariants. You can’t derive from it such classic invariants as the Tutte polynomial or the chromatic number, nor the girth nor the clique number nor even the number of connected components. And conversely, you can’t derive the magnitude from these or any other well-known graph invariants. Apparently, it captures information of a different kind.

Its most appealing characteristic is that it behaves like cardinality. It’s multiplicative with respect to cartesian product, additive with respect to disjoint union, and obeys a restricted version of the inclusion-exclusion principle. And it also behaves a bit like the Tutte polynomial. For instance, it’s often invariant under Whitney twists — though not always, as I’ll explain.

That it behaves like cardinality is no surprise, given its origins.

Steve Schanuel taught us that the Euler characteristic of a space should be thought of as analogous to the cardinality of a set. There is a notion of the Euler characteristic of a category, closely related to other forms of Euler characteristic. This notion can be generalized to enriched categories, then specialized to categories enriched in the poset $(\mathbb{N}, \geq)$, seen as a monoidal category via addition. Any graph gives rise to an $\mathbb{N}$-enriched category, the objects being the vertices and the homs being distances in the graph. So, we get an invariant of graphs — and that’s what’s called magnitude.

That’s a highly condensed account, and you can find out more by following the links, but let me skip all that and tell you the end result, which is completely concrete.

The magnitude $\# G$ of a graph $G$ is a rational function over $\mathbb{Q}$ — the ratio of two polynomials with integer coefficients. It can also be expressed as a power series over $\mathbb{Z}$. For example, the graph

has magnitude

$\frac{5 + 5q - 4q^2}{(1 + q)(1 + 2q)} = 5 - 10q + 16q^2 - 28q^3 + 52q^4 - 100q^5 + \ldots.$

Here $q$ is just a formal variable.

The definition goes like this. Take a graph $G$. Let $Z$ be the square matrix whose rows and columns are indexed by the vertices of $G$, and whose entries are given by

$Z(x, y) = q^{d(x, y)}$

for each pair of vertices $x$ and $y$. Here $d(x, y)$ is the distance between $x$ and $y$ in the graph (and by convention, $q^\infty = 0$). It turns out that $Z$ is invertible over both the field $\mathbb{Q}(q)$ of rational functions and the ring $\mathbb{Z}[q]$ of power series. The magnitude of $G$ is the sum of all the entries of its inverse.

I know of no direct graph-theoretic motivation for this definition. Frankly, it looks odd. The real motivation is that it’s part of a large family of related invariants, including not only cardinality and various types of Euler characteristic, but also metric dimension, maximum entropy, and, conjecturally, geometric measures like volume, surface area and perimeter. From that point of view, it would be odd if graph magnitude wasn’t important.

The paper has two main theorems. First, magnitude obeys an inclusion-exclusion principle: when $G$ and $H$ are subgraphs of some larger graph, and certain further conditions hold,

$\# (G \cup H) = \#G + \#H - \#(G \cap H)$

(Theorem 4.9). I was frustrated by having to impose those “further conditions”, because it meant that magnitude didn’t behave entirely like cardinality. But then I figured out that no nontrivial graph invariant behaves entirely like cardinality (Lemma 4.1).

The second main theorem was conjectured by Simon. Take a graph with two distinct vertices distinguished, and another graph with two distinct vertices distinguished. Glue them together by pairing up the distinguished vertices. There are two different ways you can do this, depending on how you do the pairing, and the two resulting graphs are said to differ by a Whitney twist.

David Speyer pointed out that (as all graph theorists know) the Tutte polynomial is invariant under Whitney twists, so if magnitude is a specialization of the Tutte polynomial, it had better be invariant too. Then David got his computer to randomly generate a couple of 18-vertex graphs differing by a Whitney twist, and lo and behold, their magnitudes were equal. That magnitude was a rational function of degree 23 whose coefficients were integers of up to 5 decimal digits, so it was highly unlikely that this happened by chance.

So, when Simon found an example showing that magnitude isn’t invariant under Whitney twists, it came as a considerable surprise. What’s going on?

Well, Simon immediately conjectured that you get invariance only when the two points of identification are joined by an edge in one of the original two graphs. That turns out to be true — and that’s the second theorem.

The apparently implausible coincidence in David’s experiment can therefore be explained as follows. He generated two 10-vertex graphs $G$ and $H$, putting an edge between each pair of vertices with probability $0.4$. He then joined them together in two ways, giving two 18-vertex graphs $X$ and $Y$ differing by a Whitney twist. The second theorem applies if $X$ (or equivalently $Y$) has an edge between the two points of identification, and the probability that this happens is $1 - (1 - 0.4)^2 = 0.64$. So, it was more likely than not that he’d get what he got.

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Combinatorics"

Date: Tuesday, 21 Jan 2014 18:59

If someone told you that in order to flourish rather than languish, you have to have at least 2.9013 times as many positive as negative emotions, and that they know this because of the Lorenz equations, you’d be instantly suspicious, right? It sounds like run-of-the-mill pseudoscience, but it was published in American Psychologist, official organ of the American Psychological Association and one of the very top journals in the field. Not only that, the lead author on this two-author paper, Barbara Fredrickson, is one of the editors of this journal.

The Guardian Observer carried a great piece yesterday telling how it was debunked by — it gets better — a master’s student, Nick Brown. He teamed up with the legendary Alan Sokal and the psychologist Harris Friedman, and they wrote this highly enjoyable account of exactly how incompetent the original paper was. American Psychologist, to its credit, is publishing it.

Sample from the original paper:

An interesting observation that highlights the usefulness of fluid dynamics concepts to describe human interaction arises from the fact that Lorenz chose the Rayleigh number as a critical control parameter in his model. […] Low performance teams could be characterized as being stuck in a viscous atmosphere highly resistant to flow

Sample from Brown et al.’s response:

They appear to assert that the predictive use of differential equations abstracted from a domain of the natural sciences to describe human interactions can be justified on the basis of the linguistic similarity between elements of the technical vocabulary of that scientific domain and the adjectives used metaphorically by a particular observer to describe those human interactions. If true, this would have remarkable implications for the social sciences. One could describe a team’s interactions as “sparky” and confidently predict that their emotions would be subject to the same laws that govern the dielectric breakdown of air under the influence of an electric field. Alternatively, the interactions of a team of researchers whose journal articles are characterized by “smoke and mirrors” could be modeled using the physics of airborne particulate combustion residues, combined in some way with classical optics.

The Observer article is here, and Brown et al.’s article (“The complex dynamics of wishful thinking”) is here.

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Pseudoscience"

Date: Monday, 20 Jan 2014 17:06

For the last couple years, people interested in quantum gravity have been arguing about the ‘firewall problem’. It’s a thought experiment involving black holes that claims to demonstrate an inconsistency in some widely held assumptions about how quantum mechanics and general relativity fit together. Everyone is either scratching their heads over it, struggling to find a way out… or grumbling that the problem isn’t real, and everyone else has gone crazy.

While the firewall problem has roots going back earlier, the paper that got everyone interested came out in July 2012:

They called it the ‘firewall’ problem because one way out is to assume that when you fall into a black hole you hit a ‘wall of fire’ — a region of hot radiation — as you cross the event horizon. That sounds crazy: the equivalence principle says you shouldn’t feel anything special as you fall through the horizon, at least if the black hole is big enough. But they claimed this crazy-sounding solution was the most conservative way out!

In June of the following year, two extremely reputable physicists, in their attempts to avoid this conclusion, came up with an idea that sounds even crazier: every pair of entangled quantum particles is connected by a wormhole!

They called their proposal EPR = ER, which is a joke referring to two papers Einstein wrote in 1935. EPR stands for ‘Einstein–Podolsky–Rosen’, the authors of a famous paper on quantum entanglement, a spooky way that distant objects can be correlated. ER stands for ‘Einstein–Rosen’, the authors of a famous paper on wormholes, solutions of general relativity in which distant regions of space can be connected by a kind of tunnel, or handle.

I haven’t been thinking much about quantum gravity lately, but the recent convulsions in my old favorite subject caught my attention. In particular, I’d long been fond of the idea that when we finally understand quantum gravity, the distinction between quantum mechanics and our theory of spacetime will evaporate. There’s a lot of evidence coming from category theory, which reveals analogies between the two:

Now, quantum entanglement is a sneaky way for two distant particles to be correlated. A wormhole is a sneaky way for two distant particles to be connected. Could this be hinting at yet another analogy between quantum mechanics and our theory of spacetime? Perhaps one that deserves a category-theoretic treatment?

Last summer, when Jamie Vicary and I were both at the Centre for Quantum Technologies, we figured out something interesting about this. We wrote a paper about it, which is done now:

I’d like to explain a bit of it here, because it uses 2-categories in a cute way.

### Disclaimers

First, I want to emphasize that Jamie and I are not making any claims about the firewall problem. In particular, we’re not claiming that EPR = ER is a way to solve the firewall problem.

Second, we’re not claiming that in our universe, every quantum-entangled pair of particles is connected by a wormhole.

Third, I’m not going to explain the firewall problem here. I don’t feel I have a deep understanding of it — and in fact, I wouldn’t be shocked if it gently dissolves when people think about it carefully enough. So, if you want to get a feel for this problem, but don’t feel quite up to the technical literature, I recommend starting with these blog articles:

I’ve listed them in roughly increasing order of how much physics you need to know to enjoy them, but they’re all good in different ways. The quantum information theorist John Preskill also wrote a nice intro to Maldacena and Susskind’s work:

### Wormholes and pair creation

Okay, so let me tell you what I’m actually going to tell you about. Jamie and I did some calculations that work in a simplified setup where spacetime is just 3-dimensional, and nothing about spacetime matters except its topology. This is called a 3-dimensional topological quantum field theory, or 3d TQFT for short.

3d TQFTs are good for describing quantum gravity in a fictitious world where spacetime is 3-dimensional and time is no different from space. At a more practical level, they’re also good for describing thin films of certain special substances at very low temperatures! Some people want to use these films to make topological quantum computers.

Here’s the interesting thing: in this setup, the creation of a particle-antiparticle pair can be reinterpreted as the formation of a wormhole. The particle and its antiparticle are the opposite ends of the wormhole! See, we can interpret this picture two ways:

In one way of interpreting this picture, we’re looking at a chunk of spacetime with a curved tube removed. Reading it from bottom to top, space starts out as a disk and ends up as a disk with two holes. So, a particle-antiparticle pair is being created!

In the other way of interpreting this picture, we think of that curved thing as a wormhole: a handle connecting two distant part of space. Reading from bottom to top, space starts out as a disk and ends up as a disk with a handle attached. So, a wormhole is being formed!

If this seems confusing, don’t worry: I’ll explain it more precisely later.

The idea that particle-antiparticle pairs could be wormhole ends is an old one, going back to the paper by Einstein and Rosen. It’s especially cute for electrically charged particles. When it seems the electric field is diverging, coming out of a particle, maybe it’s really just flowing through a wormhole and coming out one end… and flowing in the other end, which would thus look like a particle of the opposite charge.

The idea has been kicking around for a long time. John Wheeler was especially fond of it: he called it ‘charge without charge’. But the discovery of 3d TQFTs gave us a context where we can make this idea precise and calculate with it.

This lets us relate quantum entanglement of particle-antiparticle pairs to wormholes. Indeed, we can compute what happens when a wormhole forms in a 3d TQFT, and use that to compute the state of the resulting particle-antiparticle pair.

The particle and antiparticle act entangled! They even act like they’re ‘completely’ entangled: every piece of information about the particle at one end of the wormhole is encoded in information about its antiparticle at the other end.

This entanglement is, however, a kind of sham. The particle and its antiparticle act entangled, but they have no choice about this, since they are not really separate things. They are just two views of the same thing: the wormhole. True entanglement arises when two independent things are correlated in a certain way. It doesn’t really count as entanglement when something is correlated with itself.

The fact that the entanglement is ‘fake’ was crucial for Maldacena and Susskind, who used it to get around a key aspect of the firewall problem: the so-called ‘monogamy of entanglement’. Just as it’s against the rules to be married to two people, a quantum system can’t be entangled with two others. But ‘fake’ entanglement isn’t monogamous.

Jamie Vicary and I make the concept of fake entanglement precise in our paper, but I don’t feel like talking about that now. Instead, I want to say more about how the creation of a particle-antiparticle pair can be reinterpreted as the formation of a wormhole. This is where 2-categories come in…

### Wormholes and pair creation: the 2-categorical view

In our simplified setup where spacetime is just 3-dimensional, space is just 2-dimensional, and the formation of a wormhole looks like this:

This process goes from a disk to a disk with a handle attached. But the disk itself can be seen as a process going from the empty set to a circle… and so does the disk with a handle attached. A ‘process between processes’ is a 2-morphism in a 2-category. So, the formation of a wormhole is actually a 2-morphism, like this:

Here $\emptyset$ is the empty set, $S^1$ is the circle, and $D^2$ is the disk. $Y \circ H$ is the disk with a handle attached… but why am I calling it that?

Here’s why. The disk with a handle attached can be chopped into two pieces: a handle, and a disk with two holes:

The handle goes from the empty set to the disjoint union of two circles:

$H : \emptyset \to S^1 \sqcup S^1$

The disk with two holes goes like this:

$Y : S^1 \sqcup S^1 \to S^1$

So, the disk with a handle is the composite

$Y \circ H : \emptyset \to S^1$

Okay, so we’ve sliced and diced the process of wormhole formation and seen how it’s a 2-morphism in a 2-category. What about the process of creating a particle-antiparticle pair? What does that look like?

It looks like this:

We read this from bottom to top. Space starts out as a disk. As time passes, a particle-antiparticle pair forms. When they’ve formed, space is a disk with two holes. (You see, in 3d TQFTs a particle can be treated as a hole cut out of space, or ‘puncture’.)

Here’s a 2-categorical diagram for this process of pair creation:

At the start, space is a disk — and as before, we think of this disk as going from the empty set to the circle:

$D^2 : \emptyset \to S^1$

As time passes, space turns into a disk with two holes, namely

$Y: S^1 \sqcup S^1 \to S^1$

As this happens, the circle doesn’t change, but the empty set turns into two circles, thanks to the handle

$H : \emptyset \to S^1 \sqcup S^1$

Now here’s the punchline. Take the square diagram:

Compose the left and top arrows, and compose the bottom and right ones. The bottom and right ones compose to give a disk:

$(S^1 \times I) \circ D^2 = D^2$

so we get a diagram we’ve already seen:

This is the diagram for wormhole formation! So, pair creation is just wormhole formation in disguise.

### 2-categorical remarks

Finally, here are some remarks that may be too categorical, and even too 2-categorical, for most readers to enjoy.

The ‘bigon’ here is how we draw a morphism in a 2-category:

But the square here is how we draw a morphism in a double category:

There’s a standard trick for getting double categories from 2-categories, due to Ehresmann, and that’s what we’re using here. It’s fairly common for people to study TQFTs using 2-categories. Indeed, Jamie is writing a huge multi-part paper on this with Bruce Bartlett, Chris Douglas and Chris Schommer-Pries, which will be really exciting when it comes out, assuming hell hasn’t frozen over yet. It’s less common for people to study TQFTs using double categories, but Jeffrey Morton has:

The advantage of double categories is that they let us study cobordisms-between-cobordisms where the boundary of space changes with time… as in pair creation. Since Jeffrey was my grad student, I was primed to think about this issue.

Here’s another thing, which Jamie and I were reluctant to include in our paper. You can get a 3d TQFT $Z$ from a modular tensor category $\mathbf{C}$, and then

$Z(S^1) = \mathbf{C}$

Since a hole in space acts like a particle, and cutting out a hole in space leaves a circle as boundary, $\mathbf{C}$ plays the role of the ‘2-Hilbert space of particle types’. $\mathbf{C}$ has a basis of simple objects which are the various types of particles.

The 2-Hilbert space for a pair of particles is then

$Z (S^1 \sqcup S^1) = \mathbf{C} \boxtimes \mathbf{C}$

where $\boxtimes$ is the tensor product of 2-Hilbert spaces. So when we apply our TQFT to a handle

$H : \emptyset \to S^1 \sqcup S^1$

we get

$Z(H) : \mathrm{Hilb} \to \mathbf{C} \boxtimes \mathbf{C}$

and this functor is determined by its value on $\mathbb{C} \in \mathrm{Hilb}$, which is

$\bigoplus_\sigma \sigma \boxtimes \sigma^* \in \mathbf{C} \boxtimes \mathbf{C}$

where $\sigma$ ranges over the simple objects of $\mathbf{C}$, and $\sigma^*$ is the dual of $\sigma$.

Translating back into physics: the two ends of the wormholes act like particles. These particles can be of any type whatsoever, but if one end happens to be a particle of type $\sigma$, the other end has to be the corresponding antiparticle, $\sigma^*$. So, you could call $\bigoplus_\sigma \sigma \boxtimes \sigma^*$ the space of states of a particle-antiparticle pair.

And, this space of states is a categorified version of a completely entangled state! In ordinary quantum mechanics, we could have a Hilbert space $\mathcal{H}$ with a basis $e_i$, and a completely entangled state would be something like

$\sum_i e_i \otimes e_i \in \mathcal{H} \otimes \mathcal{H}$

(up to normalization). Now we’ve got a 2-Hilbert space $\mathbf{C}$ and

$\bigoplus_\sigma \sigma \boxtimes \sigma^* \in \mathbf{C} \boxtimes \mathbf{C}$

It’s the same idea, one level up! We could call this ‘2-entanglement’.

Even better, we have a kind of microcosm principle thing going on. In higher category theory, little things often naturally live in bigger things that are categorified versions of themselves. For example, monoids naturally live in monoidal categories. There are many other examples, too. This is called the ‘microcosm principle’.

In the situation here, we’ve already seen the space of states of the particle-antiparticle pair is 2-entangled. But the formation of a wormhole creates an element of this space of states: a particle-antiparticle pair in a particular state. And this state is itself entangled!

Jamie explain this in more detail in our paper, starting at equation (4). But we didn’t mention that the entanglement of the state of the particle-antiparticle pair is ‘riding’ the 2-entanglement of the space of states of this pair. You have to really like higher categories to care about an observation like this.

Author: "john (baez@math.ucr.edu)" Tags: "Topological Field Theory"

Date: Tuesday, 14 Jan 2014 07:44

William Lawvere’s Elementary Theory of the Category of Sets (ETCS) was one of the first attempts at using category theory as a foundation of mathematics and formulating set theory in category theoretic language. In this post I will outline the content of Lawvere’s paper [Lawvere 1965] and put it in a historical context; suggestions for further reading are at the end. Before I begin, I’d like to thank Emily Riehl, Steve Awodey and the other Kan Extension Seminar participants for their helpful feedback, suggestions, and reading responses to the paper; and, of course, William Lawvere for writing it!

### Some history

ETCS came about at a boomtime for category theory and the foundations of mathematics. It was the early 1960s; category theory had found its feet and had started being used extensively throughout mathematics, notably in algebraic geometry with the work of Alexander Grothendieck. Around the same time, Paul Cohen developed the technique of forcing, which has since become central to modern set theory. Lawvere had recently written his Ph.D. thesis [Lawvere 1963], which was the first step in his programme of using the category of categories as a foundation. It was therefore fitting that he should want to try his hand at doing set theory inside category theory.

Lawvere’s attempt did not go unopposed, even by the likes of Samuel Eilenberg and Saunders Mac Lane:

His attempt to explain this idea to Eilenberg did not succeed; […] Sammy asked me to listen to Lawvere’s idea. I did listen, and at the end I told him “Bill, you can’t do that. Elements are absolutely essential to set theory.” [Mac Lane 1988]

Unfazed, Lawvere pressed on and in 1964 he published the ETCS paper, which was expanded and reprinted in 1965. What follows is a very brief sketch of its contents; more depth and technical details can be found in the paper itself.

### Axioms of ETCS

Before listing the axioms, we should fix some set theoretic notation with category theoretic meaning: given an object $A$ in a category with a terminal object $1$, if $a \colon 1 \to A$ then we write $a \in A$ and say $a$ is an element of $A$. If $s \colon S \to A$ is monic then we say $s$ is a subset of $A$, and write $s \subseteq A$. We then write $a \in s$ if $a$ factors through $s$.

The axioms of ETCS consist of the usual axioms for a category $\mathcal{C}$ together with the following eight axioms:

1. $\mathcal{C}$ has an initial object $0$, a terminal object $1$, products $A \times B$ and coproducts $A+B$ of all pairs $A,B$, and equalisers and coequalisers of all pairs $f,g : A \rightrightarrows B$. (Thus $\mathcal{C}$ has all finite limits and colimits.)
2. Every pair of objects $A,B$ has an exponential $B^A$.
3. $\mathcal{C}$ has a natural numbers object $\mathbb{N}$, with zero morphism $\mathtt{zero} : 1 \to \mathbb{N}$ and successor morphism $\mathtt{suc} : \mathbb{N} \to \mathbb{N}$.
4. $\mathcal{C}$ is well-pointed, i.e. given any pair of maps $f,g : A \rightrightarrows B$ with $f \ne g$, there is an element $a \in A$ such that $f \circ a \ne g \circ a$.
5. Axiom of Choice: If $f : A \to B$ and there is some $a \in A$, then there exists a quasi-inverse $g : B \to A$, which satisfies $f \circ g \circ f = f$.
6. If $A$ has no elements then $A$ is an initial object.
7. If $a \in A + B$ then $a$ factors through one of the coproduct inclusions.
8. $\mathcal{C}$ has an object with more than one element.

With modern technology we can simplify this list considerably: a model of ETCS is a well-pointed topos with a natural numbers object satisfying the axiom of choice.

In Lawvere’s paper, the axioms are introduced as they are needed, rather than all as one list. As such, it becomes clear which properties held by the category of sets are also held by other categories. Moreover, Lawvere comments on how the axioms interact with each other. For example, since the category $\mathbf{Pos}$ of posets and order-preserving maps satisfies each of the axioms except for #5, the axiom of choice is independent of the other axioms of ETCS.

### Mathematics with ETCS

Lawvere demonstrated the usefulness of ETCS as a set theory by proving six mathematical theorems.

Theorem 1 (Primitive recursion). Given a pair of morphisms $f_0 : A \to B, \quad u : \mathbb{N} \times A \times B \to B$ there is a morphism $f : \mathbb{N} \times A \to B$ satisfying, for each $n \in \mathbb{N}$ and $a \in A$, $f \circ \langle \mathtt{zero}, a\rangle = f_0 \circ a, \quad f \circ \langle \mathtt{suc} \circ n, a \rangle = u \circ \langle n, a, f \circ \langle n, a \rangle \rangle$

The primitive recursion theorem requires only axioms 1-3, and uniqueness of $f$ is implied by axiom 4. Thus primitive recursion can be done in categories such as the functor category $[\mathbb{C}, \mathbf{Set}]$ for any small category $\mathbb{C}$, and in the arrow category $\mathbf{Set}^{{\rightarrow}}$.

Theorem 2 (Peano’s postulates).

• The morphism $\mathtt{succ} : \mathbb{N} \to \mathbb{N}$ is monic
• If $n = \mathtt{succ} \circ m$ for some $m \in \mathbb{N}$ then $n \ne \mathtt{zero}$.
• If $s \subseteq \mathbb{N}$ and $n \in s \Rightarrow \mathtt{succ} \circ n \in s$ then $s = \mathrm{id}_{\mathbb{N}}$.

The third of these allows us to perform mathematical induction in any model of ETCS.

Theorem 3 (Image factorization). Every morphism can be factored as an epi followed by a mono.

Theorem 4 (Unions). Let $2 = 1+1$ and let $\top : 1 \to 2$ be one of the inclusion maps. Given any $\alpha : I \to 2^A$ there is a subset $u : {\bigcup_{\alpha}} \to A$ such that, for all $a \in A$, $a \in u$ if and only if $\mathtt{ev}_{2^A} \circ \langle \alpha \circ j, a \rangle = \top$ for some $j \in I$.

Intuitively, $\bigcup_{\alpha}$ is the union of the subsets of $A$ picked out by the indexing morphism $\alpha$; that is, ${\bigcup_{\alpha}} = \bigcup_{j \in I} \alpha(j)$

Theorem 5 (Complements). Let $A$ be an object. Given any subset $s : S \to A$ there is a subset $s' : S' \to A$, such that $A \cong S+S'$ with $s$ and $s'$ as the coproduct injections.

Theorem 6 (Quotients). Let $A$ be an object and $f = \langle f_0, f_1 \rangle : R \to A \times A$ be an equivalence relation. Let $\pi_0, \pi_1: A \times A \to A$ be the product projections. Then $f$ is the equaliser of $q \circ \pi_0$ and $q \circ \pi_1$, where $q : A \to Q$ is the coequaliser of $f_0$ with $f_1$.

Intuitively, $Q$ is the quotient of $A$ by $R$, then the theorem is stating that $R$ is precisely the set of pairs $\langle a,b \rangle \in A \times A$ such that $[a]_R = [b]_R \in Q$.

### Metamathematics

The paper closes with a theorem justifying that the axioms of ETCS really do characterise categories which are sufficiently ‘like’ the category of sets. Lawvere refers to this likeness as the invariant form of the category of sets.

(Meta)theorem. If $\mathcal{C}$ is a locally small, complete model of ETCS, then $\mathcal{C}$ is equivalent to $\mathbf{Set}$.

The equivalence $\mathcal{C} \overset{T}{\underset{\check T}{\rightleftarrows}} \mathbf{Set}$ is given by $TA = \text{Hom}_{\mathbf{Set}}(1,A) = \{ \text{elements of } A \}$ $\check{T}X = \sum_{x \in X} 1$

### What happened next?

The appeal of ETCS was not felt by the mathematical community at the time. Indeed, according to Marquis and Reyes:

The category of sets was not taken as a foundational framework. It was not studied and explored. […] It seems that one of the most common reactions at the time was that ETCS was merely a translation in categorical terms of the standard axioms of set theory [Marquis and Reyes 2011]

Even Lawvere acknowledged the shortcomings of his work:

However, it is the author’s feeling that when one wishes to go substantially beyond what can be done in the theory presented here, a much more satisfactory foundation for practical purposes will be provided by a theory of the category of categories.

However, the impact of ETCS stems not from its (non-)use as a foundation, but from the research that it led to. It was one of the first attempts to give a categorical account of logic, and it led directly to the definition of an elementary topos, which generalised the notion of a Grothendieck topos and bridged the gap between geometry and logic via category theory. For a number of reasons it was toposes, rather than models of ETCS, which became the central object of study in categorical logic.

• Lawvere, F.W. Functorial Semantics of Algebraic Theories, Ph.D. thesis, Columbia University (1963).
• Lawvere, F.W. An elementary theory of the category of sets, Proceedings of the National Academy of Science of the USA 52, 1506-1511 (1965), reprinted as Lawvere, F.W., McLarty, C. An elementary theory of the category of sets (long version) with commentary, Reprints in Theory and Applications of Categories, No. 11 (2005), pp. 1-35.
• Mac Lane, S. Concepts and categories in perspective. In A Century of Mathematics in America, I, vol. I, pages 323-365. American Mathematical Society, Providence (1988).
• Marquis, J.-P. and Reyes, G. The History of Categorical Logic: 1963-1977. In Handbook of the history of logic, vol. 6, pp. 689-800. Elsevier (2011).
Author: "riehl (eriehl@math.harvard.edu)" Tags: "Categories"

Date: Monday, 13 Jan 2014 08:15

Going to the Joint Mathematics Meetings in Baltimore this week? Then drop in at Booth 330, which will be occupied by the Electronic Frontier Foundation.

The EFF have been doing fantastic work for over 20 years, keeping the internet the kind of place you most likely want it to be: ensuring your freedom of speech, protecting your privacy, and defending the core principles of the internet against the controlling ambitions of both corporations and governments. Although they’re only a small nonprofit organization, with a comparatively minuscule budget, they’ve had a string of legal victories against huge players. They deserve your support!

But what are they doing at the Joint Mathematics Meetings? It was the idea of Thomas Hales. Hales is famous for, among other things, proving the Kepler sphere-packing conjecture. (He also wrote a very nice introduction to motivic measure, mentioned here in passing a couple of years ago.)

Hales anticipated that the NSA would be recruiting mathematicians with particular fervour this year: in order to recruit, they’ll need to overcome the outrage caused by the recent revelations of mass, population-level, surveillance. They’ll want to persuade mathematicians that what they’re doing is good for society. And the EFF will be there to tell mathematicians that there may be better channels for their talents.

In an earlier post, Bas Spitters immediately put his finger on the point where mathematics most directly touches the NSA scandal: the undermining of internet encryption. What you might not have immediately picked up from casually reading the newspapers is that among the variety of techniques used by the secret services to get round encryption, they managed to insert a back door into a cryptographic protocol based on elliptic curves.

The definitive mathematical account of this is Hales’s piece in the February 2014 issue of the Notices of the AMS. There are many other accounts from different perspectives. But rather than dump a large number of links on you, let me highlight one by the EFF.

The EFF piece quotes from an internal NSA document, which lists as one of their items of budgetary spending:

Insert vulnerabilities into commercial encryption systems, IT systems, networks …

And the EFF article makes a crucial point:

By weakening encryption, the NSA allows others to more easily break it. By installing backdoors and other vulnerabilities in systems, the NSA exposes them to other malicious hackers—whether they are foreign governments or criminals. As security expert Bruce Schneier explained, “It’s sheer folly to believe that only the NSA can exploit the vulnerabilities they create.”

In other words, even if (for some reason) you trust the NSA with everyone’s data, their undermining of internet encryption makes the world a more dangerous place.

Maybe you’re saying: what’s done is done. The outlook is gloomy, but what can we do now?

One immediate priority is to stop the situation becoming normalized. Those who wish to destroy online privacy will want to make the NSA’s actions seem like an unexceptional part of protecting national security. The NSA will, I imagine, be trying to persuade mathematicians in Baltimore this week that the whole fuss is really rather overblown; that perhaps a few checks and balances need adjusting, but fundamentally what it’s doing is good. The opposite argument needs to be made, and I imagine the EFF will be making it.

But more positive actions are possible. Mathematicians involved in cryptography can speak up! They can say “I do not want to contribute to mass surveillance”, just as physicists and engineers have refused to contribute to the building of nuclear weapons, and doctors have refused to participate in torture. We can withdraw our labour. We have that choice.

And mathematicians have a role to play in building new tools that allow genuine privacy. In the wake of the Snowden revelations, there’s been a big push to develop encrypted channels of communication that are secure against government snooping. Mathematicians can help.

It’s important to realize that as far as we know, the NSA has made no decisive mathematical cryptographic advance of which the rest of the world is ignorant. Snowden said in June:

Encryption works. Properly implemented strong crypto systems are one of the few things that you can rely on. Unfortunately, endpoint security is so terrifically weak that NSA can frequently find ways around it.

The NSA have used social, legal, physical and other means to weaken encryption, but the underlying mathematics has not been challenged. Nonetheless, mathematical expertise will be needed to build new, better, tools.

I’d love to go along to the EFF booth this week and help out. Unfortunately, I’m on the wrong side of the Atlantic. But if you’re going to be there and you want to help out, I believe they’d appreciate it. Just drop an email to Yan Zhu, who will be the official EFF representative there, or talk to her in person when you arrive.

And if you can’t be there, but reading this has made you want to help in some other way, why not join the EFF?

Postscript  For more on the NSA’s weakening of internet encryption, here are some further links. I already linked to one post by the Johns Hopkins computer scientist Matthew Green; here’s a more technical companion post. Less technically, there’s a great piece by IT security legend Bruce Schneier, and of course there’s any number of articles for a general readership.

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Science and Politics"

Date: Sunday, 12 Jan 2014 07:34

The categorical notion of an end is something that several people have requested Catster videos for and Yemon Choi was recently asking if Tom had covered it in his new-born book. Given that I’ve got ends in my head at the moment for two different reasons, I thought I’d write a post on how I think about them.

I feel that seeing an integral sign like $\int _{c\in \mathcal{C}} T(c,c)$ can cause people’s eyes to glaze over, never mind them getting confused as to whether that represents an end or a coend. So I will endeavour to avoid integral signs apart from right at the end.

My experience is that coends roam more freely in the wild than ends do, but I will focus on ends in this post. One reason that people are interested in ends is that natural transformation objects in enriched category theory are expressed as ends, but I will stay away from the enriched setting here.

Having said what I won’t do, maybe I should say what I will do. I will mainly concentrate on a few examples to demonstrate ends as universal wedges.

### Functors

The input data for an end is a functor of the form $T\colon \mathcal{C}^{\mathrm{op}}\times \mathcal{C}\to \mathcal{D}$. The only functors that I know people take the ends of are those of the form $T(c,c')= [F(c),G(c')]$ where $F,G\colon \mathcal{C}\to \mathcal{E}$ are functors and $[-,-]\colon \mathcal{E}^{\mathrm{op}}\times \mathcal{E}\to \mathcal{D}$ is some kind of hom functor, i.e. either the usual hom (so $\mathcal{D}=\text {Set}$) or an internal hom (so $\mathcal{D}=\mathcal{E}$).

Here are some typical examples I’ve come across.

1. $\text {Hom}\colon \mathcal{C}^{\mathrm{op}}\times \mathcal{C}\to \text {Set}$

2. $\text {Hom}_{\mathcal{E}}(F(-),G(-)) \colon \mathcal{C}^{\mathrm{op}}\times \mathcal{C}\to \text {Set}$, where $F,G\colon \mathcal{C}\to \mathcal{E}$ are functors.

3. $\text {Lin}(U(-,)U(-)) \colon \text {Rep}(A)^{\mathrm{op}}\times \text {Rep}(A)\to \text {Vect}$, where

• $A$ is finite-dimensional algebra over $\mathbb{C}$,

• $\text {Rep}(A)$ is the category of finite-dimensional, complex representations of $A$,

• $\text {Vect}$ is the category of complex vector spaces,

• $U\colon \text {Rep}(A)\to \text {Vect}$ is the forgetful functor,

• $\text {Lin}$ is the internal hom in $\text {Vect}$, i.e.  $\text {Lin}(V,W)$ is the vector space of linear maps from $V$ to $W$.

4. $[-,-]\colon \text {Rep}(G)^{\mathrm{op}}\times \text {Rep}(G)\to \text {Rep}(G)$

• where $G$ is a finite group

• $[V,W]$ is the internal hom of $G$-representations $V$ and $W$, so it is the vector space of linear maps with the $G$-action on $f\colon V\to W$ given by $(g\cdot f)(v)=g\cdot f(g^{-1}\cdot v)$.

The main point of difference between 3 and 4 is that for an arbitrary algebra $A$, the category of representations does not have an internal hom, but for a finite group $G$ it does. Other examples like 4 would involve representation categories of Hopf algebras or quantum groups.

### Wedges

An end is a universal wedge, so I’d better say what a wedge is. A wedge for a functor $T\colon \mathcal{C}\times \mathcal{C}^{\mathrm{op}}\to \mathcal{D}$ is an object $e\in \text {Ob}(\mathcal{D})$ with a morphism $\omega _{c}\colon e\to T(c,c)$ for every object $c\in \text {Ob}(\mathcal{C})$. These morphisms have to satisfy a naturality condition which says that for every morphism $f\colon c\to c'$ in $\mathcal{C}$, the two obvious maps $e\to T(c,c')$ you can make are the same, i.e. the following diagram commutes. $\begin{matrix} e & \overset{\omega _{c}}{\longrightarrow }& T(c,c) \\ \omega _{c'} \downarrow & &\downarrow T(1,f)\\ T(c',c') &\underset{T(f,1)}{\longrightarrow } & T(c,c') \end{matrix}$ We can denote such a wedge by writing $e\stackrel{\cdot }{\longrightarrow }T$.

Let’s have a look a what wedges are in the examples given above.

1. For $T=\text {Hom}$, a wedge is a set $e$ and a function $\omega _{c}\colon e\to \text {Hom}(c,c)$ for each $c\in \mathcal{C}$. This means that for each $n\in e$ we get a family of morphisms $\omega _{c}(n)\colon c\to c$. The naturality condition ensures that these are the components of a natural transformation of the identity functor $\text {Id}_{\mathcal{C}}\to \text {Id}_{\mathcal{C}}$. So a wedge is a set $e$ with a function $e\to \text {Nat}(\text {Id}_{\mathcal{C}},\text {Id}_{\mathcal{C}})$.

2. For $T=\text {Hom}(F(-),G(-))$, by a similar argument to that above, a wedge is a set $e$ with a function $e\to \text {Nat}(F,G)$ to the set of natural transformations.

3. In this case, where we have a $\mathbb{C}$-algebra $A$, an end is a vector space $e$ with a morphism $e\to \text {Lin}(U(V),U(V))$ for every representation $V$ of $A$, in other words a linear map $e\otimes U(V)\to U(V)$. The naturality condition says that these linear ‘action maps’ must commute with all $A$-intertwining maps.

4. In the case of the internal hom of representations of a finite group, a wedge consists of a representation $e\in \text {Rep}(G)$ together with a natural ‘action’ on every $G$-representation: $e\otimes V\to V$. These action maps are intertwiners and must commute with all other intertwiners.

If $T(c,c')$ is of the form $[F(c),F(c')]$ as in 1, 3 and 4 above, then we have morphisms $e\to [F(c),F(c)]$ so in some sense $e$ is acting on $F(c)$ for every $c$.

### Ends

An end is a universal wedge. An end for $T$ consists of a set $E$ with a morphism $\Omega _{c}\colon E\to T(c,c)$ for each $c\in \mathcal{C}$, satisfying the wedge naturality conditions, such that if $e\stackrel{\cdot }{\longrightarrow }T$ is another wedge for $T$ then there is unique map $e\to E$ such that the components of $e$ factor through $E$ as $e\to E\to T(c,c)$.

An end is often written as an integral $\int _{c\in \mathcal{C}}T(c,c)$. Coends, which I’m not talking about here, are written with the limits at the top of the integral sign: in Sheffield we have the mnemonic

“The end of the walking stick is at the bottom.”

I have mixed feelings about this notation. I think that people can be intimidated by it, also people get the impression that it is supposed to be something to do with integration which confuses them (or maybe that’s just me!)

We can look back at our examples and identify the ends.

1. For $T=\text {Hom}$, from what is written above, it should be clear that the end is the set of natural transformations of the identity $\text {Nat}(\text {Id}_{\mathcal{C}},\text {Id}_{\mathcal{C}})$. This set is sometimes called the Hochschild cohomology of the category.

2. For $T=\text {Hom}(F(-),G(-))$, similar to the example above, the end is $\text {Nat}(F,G)$ the set of natural transformations from $F$ to $G$.

3. In the case of the representation category of an algebra $A$ we find that the end is actually (the underlying vector space of) the algebra $A$ itself. It is pretty obvious that the tautological action on $A$-representations $A\otimes U(V)\to U(V)$ gives rise to a wedge. It is less obvious that it is the universal wedge.

4. This is the example of the internal hom for the representations of a finite group $G$. Whilst we have the tautological action $\mathbb{C}G\otimes U(V)\to U(V)$ for every representation $V$, we can make these into intertwining maps by taking the group algebra with the adjoint action $\mathbb{C}G^{\mathrm{ad}}$. The resulting morphisms $\mathbb{C}G^{\mathrm{ad}}\otimes V\to V$ in $\text {Rep}(G)$ make the group algebra $\mathbb{C}G^{\mathrm{ad}}$ into a wedge for the internal hom $[-,-]$. In fact it is an end for the internal hom. This is related to the fact that a group algebra is semisimple and that there is an isomorphism of algebras in the representation category $\mathbb{C}G^{\mathrm{ad}}\cong \bigoplus _{W} [W,W]$ where sum runs over (a set of representatives of the equivalence classes of) the irreducible representations of $G$.

Example 3 is the prototypical example of Tannakian reconstruction. We start with the representation category $\text {Rep}(A)$ and the ‘fibre functor’ $U:\text {Rep}(A)\to \text {Vect}$, then reconstruct $A$ from there. See the nlab page on Tanaka duality for more details.

Example 4 is a kind of ‘internal reconstruction’. More generally, for a Hopf algebra $H$ the end of the internal hom is a version of $H$ inside its category of representations. I believe that this idea is due to Shahn Majid (see Chapter 9 of Foundations of Quantum Group Theory).

### Ends as Algebras (or Monoids, if you prefer)

In Examples 1, 3 and 4, the end could actually be given more structure than that of being just a set, a vector space or a representation. In all three examples the end is actually an algebra (or monoid, if you prefer) in the appropriate category. In fact, in Example 4 the end has a Hopf algebra structure, but I won’t go into that here.

Suppose the functor $T\colon \mathcal{C}^{\mathrm{op}}\otimes \mathcal{C}\to \mathcal{D}$ is of the form $[F(-),F(-)]$ for some functor $F\colon \mathcal{C}\to \mathcal{E}$ and for $[,]$ either internal or external hom then using the composition of the hom, for each $c\in \mathcal{C}$ we have the composite $E\otimes E\to [F(c),F(c)]\otimes [F(c),F(c)]\to [F(c),F(c)]$ and you can check that this gives a wedge $E\otimes E\stackrel{\cdot }{\longrightarrow }T$. Thus by the universal nature of the end we get a canonical map $\mu \colon E\otimes E\to E.$ In a similar vein we have an identity morphism for each $c\in \mathcal{C}$ $1\to [F(c),F(c)],$ these give rise to a wedge $1\stackrel{\cdot }{\longrightarrow }T$ so by the universality of $E$ we get a canonical map $\eta \colon 1\to E.$ You can verify that $\mu$ and $\eta$ make the end $E$ into an algebra object in $\mathcal{D}$, and the components of the end $E\to [F(c),F(c)]$ are algebra homomorphisms.

### Further examples

Just to finish we can have a look at how taking different ends on a similar category give different but related answers. So let’s look at Examples 1, 3 and 4 for the category $\text {Rep}(G)$ of representations of a finite group.

• If we use the ordinary hom then we get the centre of the group algebra. $\int _{V\in \text {Rep}(G)} \text {Hom}(V,V) = Z(\mathbb{C}G)\in \text {Set}$

• If we use the internal hom in $\text {Vect}$ we get the group algebra. $\int _{V\in \text {Rep}(G)}\text {Lin}(U(V,)U(V))=\mathbb{C}G\in \text {Vect}$

• If we use the internal hom then we get the group algebra with the adjoint action. $\int _{V\in \text {Rep}(G)} [V,V]=\mathbb{C}G^{\mathrm{ad}}\in \text {Rep}(G)$

There ends my introduction to ends.

Author: "willerton (S.Willerton@sheffield.ac.uk)" Tags: "Categories"

Date: Wednesday, 08 Jan 2014 02:45

I’ve posted before about the involvement of mathematicians and computer scientists in the secret services — who, we now know, routinely record your and my daily activities. And in case you’ve somehow missed this whole scandal, you could do a lot worse than starting with this piece by Bruce Schneier in The Atlantic.

Many academics oppose mass surveillance. This probably includes many mathematicians who have helped the NSA and GCHQ in the past. If you’re one of them, you can add your name to a public declaration: Academics Against Mass Surveillance. All you need to do is this:

email us at info (at) academicsagainstsurveillance.net with your name, academic function and university in the subject line.

Given their pivotal role, I hope that many more mathematicians will want to add their names. (There may be some in the pipeline, along with my own: apparently, the organizers can’t keep up with the volume of emails coming in, citing an “overwhelming response”.)

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Science and Politics"

Date: Wednesday, 25 Dec 2013 15:47

I’m using the quiet of Christmas to finish writing a book, Basic Category Theory. It’s nothing revolutionary: just a short introduction to the subject, based on courses I’ve taught. But the process of book-writing is demanding and maddening enough that I wanted to take a moment to reflect on why that is — and why you hear authors complain so much.

Put another way, I’m taking a break from the tedium of writing a book to write about the tedium of writing a book. I hope it’s not tedious.

I want to try to articulate why writing a book is so much more painful than writing a paper. This isn’t something I’ve thought through; I’m typing this off the top of my head. But I’ll see if I can gather some reasons.

First, let me confess that I’ve been surprised by just how demanding it’s been. I’ve written one book before, and that was an extremely stressful experience. But this one is a venture of a completely different kind: it’s half the length, it’s a textbook rather than research (and therefore not nearly so personal), I already had what I thought was a nearly-final draft when I approached the publishers this time, and I’d been polishing the notes up, on and off, for the previous twelve years. What’s more, I’m older and, I hope, more able to cope with stress: just as carpenters get calloused hands that make them insensitive to small abrasions, I like to imagine that academics get calloused minds that allow them not to be bothered by small stresses and strains.

So, I went into this aware of the potential stress. I think I successfully removed just about all of it. But what I hadn’t bargained for is that when you remove all the stress, what’s left underneath is… boredom!

Let me qualify that. When you’re writing just about anything, there’s an intensely satisfying period when it’s all coming together. That’s great. But after that, towards the end — and that’s the stage I’m at now — there’s an awful lot of grind.

Let’s start with the obvious. A paper is long if it’s 50 pages; a book is short if it’s 200. But the crucial thing is that the pain does not scale linearly.

For instance, you have to check that you’ve used notation consistently throughout. The time it takes to check this for each piece of notation is proportional to the length of the book — but so too is the number of pieces of notation. So, the time needed to check consistency of notation is proportional to the square of the length.

You also want to make sure you haven’t repeated yourself. (In an earlier draft, I told/reminded the reader what the discrete topology was three separate times.) This amounts to checking that line $i$ is not too similar to line $j$ for all lines $i$ and $j$, which again means that the time you need is proportional to the square of the length.

What’s more, a book feels different from a paper. Books tend to get more publicity, and people engage with them in a different way. My experience is that if I tell a mathematical friend that I’m writing a book, they’re pretty interested; but if I solemnly informed them that I was writing a paper, they’d look at me like I’d told them I had two legs. (I do.) So when you’re writing a book, you know that what you’re doing is likely to come under more scrutiny. This will bring out all the perfectionism in you.

In my case, this was a big effect. I’ve taught master’s-level category theory courses several times, and I had a polished set of Latexed notes for them. I thought it wouldn’t take much effort to turn them into a book. What I hadn’t realized is the extent to which I was speaking to my students — not particular individuals, but the generic Glasgow master’s student that the notes were addressed to. To adapt them for an unknown anyone-in-the-world reader, I’ve needed to examine and undo a lot of assumptions, and that’s taken a lot of work.

On top of all this, there are things you need to do for a book that you don’t need to do for a paper. One of them is indexing, which has to be up there among the most boring tasks in academic life. I actually have a book on indexing (excerpted from the Chicago Manual of Style), which I bought when I was indexing my last book, in Chicago. It’s 65 pages long — and yes, it has an index.

Anyway, I’m happy to say that I’ve very nearly finished. My deadline is 31 December. Although these deadlines seem to be almost infinitely elastic, I intend to meet this one. After that, there will be a whole lot of to-ing and fro-ing with Cambridge University Press, who are publishing it, and I hope it will be on the shelves some time in the middle of 2014.

Merry Christmas!

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Scientific Communication"

Date: Saturday, 21 Dec 2013 13:22

Limits commute with limits, and colimits commute with colimits, but limits and colimits don’t usually commute with each other — with some notable exceptions. The most famous of these is that in the category of sets, finite limits commute with filtered colimits.

Various other cases of limit-colimit commutation are known. There’s an nLab page listing some. But it seems that quite an easy case has been overlooked.

It came to light earlier this week, when I was visiting Cambridge. Peter Johnstone told me that he’d found a family of new limit-colimit commutations in the category of sets, I asked whether his result could be simplified in a certain way (to involve groups only), and we both realized that it could not only be simplified, but also generalized.

Here it is. Let $G$ and $H$ be finite groups whose orders are coprime. View them as one-object categories. Then $G$-colimits commute with $H$-limits in the category of sets.

Now here’s the result stated more precisely: first in category-theoretic terms, then purely group-theoretically.

Let $I$ and $J$ be small categories, and let

$D: I \times J \to \mathbf{Set}$

be a functor. There’s a canonical map of sets

$\lambda: colim_{j \in J} lim_{i \in I} D(i, j) \to lim_{i \in I} colim_{j \in J} D(i, j),$

and the question is whether $\lambda$ is a bijection. If the answer is yes for all $D$, we say that limits over $I$ commute with colimits over $J$ in the category of sets. The statement is that when $I$ and $J$ are the one-object categories corresponding to finite groups with coprime orders, they do commute.

Here it is again, purely group-theoretically. To translate, we’re going to need the facts that when a group is viewed as a one-object category, a functor from that category into $\mathbf{Set}$ is the same thing as a left action of the group, the limit of such a functor is the set of fixed points of the action, and the colimit is the set of orbits.

Let $H$ and $G$ be groups, and let $X$ be a set equipped with both a left $H$-action and a left $G$-action in such a way that the actions commute: $g h x = h g x$ for all $h$, $g$ and $x$. Equivalently, $X$ has a left action by $G \times H$.

The set $Fix_H(X)$ of $H$-fixed points has a $G$-action, and we can take the set $Fix_H(X)/G$ of orbits. On the other hand, the set $X/G$ of $G$-orbits on $X$ has an $H$-action, and we can take the set $Fix_H(X/G)$ of fixed points. There’s a canonical map of sets

$\lambda: Fix_H(X)/G \to Fix_H(X/G).$

It’s straightforward to show that $\lambda$ is always injective. It’s not always surjective. But the fact is that it’s surjective (and therefore bijective) if $G$ and $H$ are finite with coprime orders. So then,

$Fix_H(X)/G \cong Fix_H(X/G).$

The proof is so short that I might as well include it. We have to show that $\lambda$ is surjective. Let $\xi \in Fix_H(X/G)$. Then $\xi = G x$ for some $x \in X$, and we know that $G x$ is a fixed point of $H$. It’s enough to show that $x$ itself is a fixed point of $H$: for then the element of $Fix_H(X)/G$ represented by $x$ is mapped by $\lambda$ to the element of $Fix_H(X/G)$ represented by $x$, which is $\xi$.

So, let $h \in H$. We must show that $h x = x$. Since $G x$ is a fixed point of $H$, we know that $h x = g x$ for some $g \in G$. Since the $G$- and $H$-actions on $X$ commute, $h^n x = g^n x$ for all integers $n$. But $\left|G\right|$ and $\left|H\right|$ are coprime, so we can choose an $n$ such that

$n \equiv 1  (mod \left|H\right|), \qquad n \equiv 0  (mod \left|G\right|).$

Then $h^n = h$ and $g^n = 1$, so $h x = x$, as required.

Although category theorists seem to have overlooked this result, I thought it might be well-known in group theory, so I asked on MathOverflow. No one has yet provided a reference, and the view of the expert group theorist Derek Holt was that “the proof is sufficiently straightforward as not to require a reference”. But what may not have been apparent to group theorists is its categorical significance.

You might ask whether this result can be generalized. In other words, for which pairs of groups do limits over one commute with colimits over the other in $\mathbf{Set}$? Although I didn’t ask for it, Will Sawin provided a complete answer — a necessary and sufficient condition. Here it is:

$H$-limits commute with $G$-colimits in $\mathbf{Set}$ if and only if no nontrivial quotient of $H$ is isomorphic to a subquotient of $G$.

Of course, this immediately implies the coprime-order case for finite groups. But it applies to other pairs of finite groups too (e.g. when $H$ is simple and $G$ has smaller order), as well as to some pairs where one or both groups are infinite.

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Categories"

Date: Sunday, 15 Dec 2013 23:59

Here’s a new feature of the Café, thanks to our benevolent host Jacques Distler. If you ever want to see how someone has created some mathematical expression on this blog, there’s an easy way to do it.

With Firefox, you simply double-click on the expression. Try it: $A \times B^A \to B$ or $x_{m n}$ or $\Biggl( \begin{matrix} 1 & 2 \\ 3 & 4 \end{matrix} \Biggr).$ A window should pop up showing the TeX source.

With other browsers, I’m not so sure. Try double-clicking. If that doesn’t work, then, according to Jacques’s instructions, you “bring up the MathJax context-menu for the formula, and choose Show Math As $\to$ Annotation $\to$ TeX”. I don’t know how one brings up this menu. Does anyone else know? (Update: right-click in Chrome, Explorer and Opera, and control-click in Safari. Thanks to those who responded.)

Once you’ve made the TeX source appear, you can cut and paste to your heart’s content. Of course, most users here are fluent in LaTeX. But like most math-oriented websites, we use a variant of TeX that’s a little different from standard LaTeX, so this should turn out to be a helpful feature.

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Blogging"

Next page
» You can also retrieve older items : Read