• Shortcuts : 'n' next unread feed - 'p' previous unread feed • Styles : 1 2

» Publishers, Monetize your RSS feeds with FeedShow:  More infos  (Show/Hide Ads)

Date: Sunday, 20 Apr 2014 17:35

The Legendre–Fenchel transform, or Fenchel transform, or convex conjugation, is, in its naivest form, a duality between convex functions on a vector space and convex functions on the dual space. It is of central importance in convex optimization theory and in physics it is used to switch between Hamiltonian and Lagrangian perspectives.


Suppose that VV is a real vector space and that f:V[,+] f\colon V\to [-\infty ,+\infty ] is a function then the Fenchel transform is the function f *:V #[,+] f^{\ast }\colon V^{#}\to [-\infty ,+\infty ] defined on the dual vector space V #V^{#} by f *(k)sup xV{k,xf(x)}. f^{\ast }(k)\coloneqq \sup _{x\in V}\big \{ \langle k,x\rangle -f(x)\big \} .

If you’re a regular reader then you will be unsurprised when I say that I want to show how it naturally arises from enriched category theory constructions. I’ll show that in the next post. In this post I’ll give a little introduction to the Legendre–Fenchel transform.

There is probably no best way to introduce the Legendre–Fenchel transform. The only treatment that I knew for many years was in Arnold’s book Mathematical Methods of Classical Mechanics, but I have recently come across the convex optimization literature and would highly recommend Tourchette’s The Fenchel Transform in a Nutshell — my treatment here is heavily influenced by this paper. I will talk mainly about the one-dimensional case as I think that gives a lot of the intuition.

We will start, as Legendre did, with the special case of a strictly convex differentiable function f:f\colon \mathbb{R}\to \mathbb{R}; for instance, the function x 2+1/2x^{2}+1/2 pictured on the left hand side above. The derviative of ff is strictly increasing and so the function ff can be parametrized by the derivative k=df/dxk =d f/d x instead of the parameter xx. Indeed we can write the parameter xx in terms of the slope kk, x=x(k)x=x(k). The Legendre-Fenchel transform f *f^{*} can then be defined to satisfy k,x=f(x)+f *(k), \langle k,x \rangle = f(x) +f^{\ast }(k), where the angle brackets mean the pairing between a vector space and its dual. In this one-dimensional case, where xx and kk are thought of as real numbers, we just have k,x=kx\langle k,x \rangle = k x.

As xx is a function of kk we can rewrite this as f *(k)k,x(k)f(x(k)). f^{\ast }(k)\coloneqq \langle k,x(k) \rangle -f(x(k)). So the Legendre-Fenchel transform encodes the function is a different way. By differentiating this equation you can see that the df */dk=x(k)d f^{\ast }/d k=x(k), thus we have interchanged the abcissa (the horizontal co-ordinate) and the slope. So if ff has derivative k 0k_{0} at x 0x_{0} then f *f^{\ast } has derivative x 0x_{0} at k 0k_{0}. This is illustrated in the above picture.

I believe this is what Legendre did and then that what Fenchel did was to generalize this to non-differentiable functions.

For non-differentiable functions, we can’t talk about tangent lines and derivatives, but instead can talk about supporting lines. A supporting line is one which touches the graph at at least one point and never goes above the graph. (The fact that we’re singling out lines not going above the graph means that we have convex functions in mind.)

For instance, at the point (x 0,f(x 0))(x_{0},f(x_{0})) the graph pictured below has no tangent line, but has supporting lines with gradient from k 1k_{1} to k 2k_{2}. A convex function will have at least one supporting line at each point.


It transpires that the right way to generalize the transform to this non-differentiable case is to define it as follows: f *(k)sup x{k,xf(x)}. f^{\ast }(k)\coloneqq \sup _{x\in \mathbb{R}}\big \{ \langle k,x\rangle -f(x)\big \} . In this case, if ff has a supporting line of slope k 0k_{0} at x 0x_{0} then f *f^{\ast } has a supporting line of slope x 0x_{0} at k 0k_{0}. In the picture above, at x 0x_{0}, the function ff has supporting lines with slope from k 1k_{1} to k 2k_{2}: correspondingly, the function f *f^{\ast } has supporting lines with slope x 0x_{0} all the way from k 1k_{1} to k 2k_{2}.

If we allow the function ff to be not strictly convex then the transform will not alway be finite. For example, if f(x)ax+bf(x)\coloneqq ax+b then we have f *(a)=bf^{\ast }(a)=-b and f *(k)=+f^{\ast }(k)=+\infty for kak\ne a. So we will allow functions taking values in the extended real numbers: ¯[,+]\overline{\mathbb{R}}\coloneqq [-\infty ,+\infty ].

We can use the above definition to get the transform of any function f:¯f\colon \mathbb{R}\to \overline{\mathbb{R}}, whether convex or not, but the resulting transform f *f^{\ast } is always convex. (When there are infinite values involved we can also say that f *f^{\ast } is lower semi-continuous, but I’ll absorb that into my definition of convex for functions taking infinite values.)

Everything we’ve done for one-dimensional \mathbb{R} easily generalizes to any finite dimensional real vector space VV, where we should say ‘supporting hyperplane’ instead of ‘supporting line’. From that we get a transform between sets of functions (--) *:Fun(V,¯)Fun(V #,¯), (\text {--})^{\ast }\colon \mathrm{Fun}(V,\overline{\mathbb{R}})\to \mathrm{Fun}(V^{#},\overline{\mathbb{R}}), where V #V^{#} is the vector space dual of VV. Similarly, we have a reverse transform going the other way, which is traditionally also denoted with a star (--) *:Fun(V #,¯)Fun(V,¯), (\text {--})^{\ast }\colon \mathrm{Fun}(V^{#},\overline{\mathbb{R}})\to \mathrm{Fun}(V,\overline{\mathbb{R}}), for g:V #¯g\colon V^{#}\to \overline{\mathbb{R}} we define g *(x)sup kV #{k,xg(k)}. g^{\ast }(x)\coloneqq \sup _{k\in V^{#}}\big \{ \langle k,x\rangle -g(k)\big \} .

This pair of transforms have some rather nice properties, for instance, they are order reversing. We can put a partial order on any set of functions to ¯\overline{\mathbb{R}} by defining f 1f 2f_{1}\ge f_{2} if f 1(x)f 2(x)f_{1}(x)\ge f_{2}(x) for all xx. Then f 1f 2f 2 *f 1 *. f_{1}\ge f_{2} \quad \Rightarrow \quad f_{2}^{\ast }\ge f_{1}^{\ast }. Also for any function ff we have f *=f *** f^{\ast }=f^{\ast \ast \ast } which implies that the operator ff **f\mapsto f^{\ast \ast } is idempotent: f **=f ****. f^{\ast \ast }=f^{\ast \ast \ast \ast }. This means that ff **f\mapsto f^{\ast \ast } is a closure operation. What it actually does is take the convex envelope of ff, which is the largest convex function less than or equal to ff. Here’s an example.


This gives that if ff is already a convex function then f **=ff^{\ast \ast }=f. And as a consequence the Legendre–Fenchel transform and the reverse transform restrict to an order reversing bijection between convex functions on VV and convex functions on its dual V #V^{#}. Cvx(V,¯)Cvx(V #,¯). \mathrm{Cvx}(V,\overline{\mathbb{R}})\cong \mathrm{Cvx}(V^{#},\overline{\mathbb{R}}).

There are many other things that can be said about the transform, such as Fenchel duality and the role it plays in optimization, but I don’t understand such things to my own satisfaction yet.

Next time I’ll explain how most of the above structure drops out of the nucleus construction in enriched category theory.

Author: "willerton (S.Willerton@sheffield.ac.uk)" Tags: "Analysis"
Send by mail Print  Save  Delicious 
Date: Sunday, 20 Apr 2014 17:14

Here’s a classic theorem about finite products theories, also known as algebraic theories or Lawvere theories. You can find it in Toposes, Triples and Theories, but it must go way back to Lawvere’s thesis. In my work with Nina Otter on phylogenetic trees, we need a slight generalization of it… and if true, this generalization must already be known. So I really just want a suitable reference!

Theorem. Suppose CC is a category with finite products that is single-sorted: every object is a finite product of copies of a single object xCx \in C. Let Mod(C)Mod(C) be the category of models of CC: that is, product-preserving functors

ϕ:CSet \phi : C \to Set

and natural transformations between these. Let

U:Mod(C)Set U : Mod(C) \to Set

be the functor sending any model to its underlying set:

U(ϕ)=ϕ(x) U(\phi) = \phi(x)

Then UU has a left adjoint

F:SetMod(C) F : Set \to Mod(C)

and Mod(C)Mod(C) is equivalent to the category of algebras of the monad

UF:SetSet U F : Set \to Set

As a result, any model ϕ\phi can be written as a coequalizer

FUFU(ϕ)FU(ϕ)ϕ F U F U(\phi) \stackrel{\longrightarrow}{\longrightarrow} F U(\phi) \longrightarrow \phi

where the arrows are built from the counit of the adjunction

ϵ:FU1 Mod(C) \epsilon : F U \to 1_{Mod(C)}

in the obvious ways: FU(ϵ FU(ϕ)) F U (\epsilon_{F U (\phi)}), ϵ FUFU(ϕ)\epsilon_{F U F U(\phi)} and ϵ ϕ\epsilon_\phi.

The generalization we need is quite mild, I hope. First, we need to consider multi-typed theories. Second, we need to consider models in TopTop rather than SetSet. So, this is what we want:

Conjecture. Suppose CC is a category with finite products that is nn-sorted: every object is a finite product of copies of certain objects x λx_\lambda, where λ\lambda ranges over some index set Λ\Lambda. Let Mod(C)Mod(C) be the category of topological models of CC: that is, product-preserving functors

ϕ:CTop \phi : C \to Top

and natural transformations between these. Let

U:Mod(C)Top Λ U : Mod(C) \to Top^\Lambda

be the functor sending any model to its underlying spaces:

U(ϕ)=(ϕ(x λ)) λΛ U(\phi) = (\phi(x_\lambda))_{\lambda \in \Lambda}

Then UU has a left adjoint

F:Top ΛMod(C) F : Top^\Lambda \to Mod(C)

and Mod(C)Mod(C) is equivalent to the category of algebras of the monad

UF:Top ΛTop Λ U F : Top^\Lambda \to Top^\Lambda

As a result, any model ϕ\phi can be written as a coequalizer

FUFU(ϕ)FU(ϕ)ϕ F U F U(\phi) \stackrel{\longrightarrow}{\longrightarrow} F U(\phi) \longrightarrow \phi

where the arrows are built from the counit of the adjunction in the obvious ways.


1) There shouldn’t be anything terribly special about TopTop here: any sufficiently nice category should work… but all I need is TopTop. What counts as ‘sufficiently nice’? If we need to replace TopTop ‘convenient category of topological spaces’, to make it cartesian closed, that’s fine with me—just let me know!

2) I’m sure this conjecture, if true, follows from some super-duper-generalization. I don’t mind hearing about such generalizations, and I suspect some of you will be unable to resist mentioning them… so okay, go ahead, impress me…

… but remember: all I need is this puny little result!

In fact, all I really need is a puny special case of this puny result. If it works, it will be a small, cute application of algebraic theories to biology.

Author: "john (baez@math.ucr.edu)" Tags: "Categories"
Send by mail Print  Save  Delicious 
Date: Friday, 18 Apr 2014 19:56

Guest post by Christina Vasilakopoulou

In the eighth installment of the Kan Extension Seminar, we discuss the paper “Elementary Observations on 2-Categorical Limits” by G.M. Kelly, published in 1989. Even though Kelly’s classic book Basic Concepts of Enriched Category Theory, which contains the abstract theory related to indexed (or weighted) limits for arbitrary 𝒱\mathcal{V}-categories, was available since 1982, the existence of the present article is well-justifiable.

On the one hand, it constitutes an independent account of the fundamental case 𝒱\mathcal{V}=Cat\mathbf{Cat}, thus it motivates and exemplifies the more general framework through a more gentle, yet meaningful exposition of 2-categorical limits. The explicit construction of specific notable finite limits such as inserters, equifiers etc. promotes the comprehension of the definitions, via a hands-on description. Moreover, these finite limits and particular results concerning 2-categories rather than general enriched categories, such as the construction of the cotensor as a PIE limit, are central for the theory of 2-categories. Lastly, by introducing indexed lax and pseudo limits along with Street’s bilimits, and providing appropriate lax/ pseudo/ bicategorical completeness results, the paper serves also as an indespensable reference for the later “2-Dimensional Monad Theory” by Blackwell, Kelly and Power.

I would like to take this opportunity to thank Emily as well as all the other participants of the Kan Extension Seminar. This has been a unique experience of constant motivation and inspiration for me!

Basic Machinery

Presently, our base of enrichment is the cartesian monoidal closed category Cat\mathbf{Cat} of (small) categories, with the usual adjunction ×𝒜[𝒜,]-\times\mathcal{A}\dashv[\mathcal{A},-]. The very definition of an indexed limit requires a good command of the basic Cat\mathbf{Cat}-categorical notions, as seen for example in “Review of the Elements of 2-categories” by Kelly and Street. In particular, a 2-natural transformation α:GH\alpha:G\Rightarrow H between 2-functors consists of components which not only satisfy the usual naturality condition, but also the 2-naturality one expressing compatibility with 2-cells. Moreover, a modification between 2-natural transformations m:αβm:\alpha\Rrightarrow\beta has components families of 2-cells m A:α Aβ A:GAHAm_A:\alpha_A\Rightarrow\beta_A:GA\to HA compatible with the mapped 1-cells of the domain 2-category, i.e. m BGf=Hfm Am_B\cdot Gf=Hf\cdot m_A (where \cdot is whiskering).

A 2-functor F:𝒦CatF:\mathcal{K}\to\mathbf{Cat} is called representable, when there exists a 2-natural isomorphism α:𝒦(K,)F. \alpha:\mathcal{K}(K,-)\xrightarrow{\quad\sim\quad}F. The components of this isomorphism are α A:𝒦(K,A)FA\alpha_A:\mathcal{K}(K,A)\cong FA in Cat\mathbf{Cat}, and the unit of the representation is the corresponding `element’ 1FK\mathbf{1}\to FK via Yoneda.

For a general complete symmetric monoidal closed category 𝒱\mathcal{V}, the usual functor category [𝒜,][\mathcal{A},\mathcal{B}] for two 𝒱\mathcal{V}-categories is endowed with the structure of a 𝒱\mathcal{V}-category itself, with hom-objects ends [𝒜,](T,S)= A𝒜(TA,SA) [\mathcal{A},\mathcal{B}](T,S)=\int_{A\in\mathcal{A}} \mathcal{B}(TA,SA) (which exist at least when 𝒜\mathcal{A} is small). In our context of 𝒱\mathcal{V}=Cat\mathbf{Cat} it is not necessary to employ ends and coends at all, and the hom-category [𝒦,](G,H)[\mathcal{K},\mathcal{L}](G,H) of the functor 2-category is evidently the category of 2-natural transformations and modifications. However, we note that computations via (co)ends simplify and are essential for constructions and (co)completeness results for enrichment in general monoidal categories.

The definition of weighted limits for 2-categories

To briefly motivate the definition of a weighted limit, recall that an ordinary limit of a (Set\mathbf{Set}-) functor G:𝒫𝒞G:\mathcal{P}\to\mathcal{C} is characterized by an isomorphism 𝒞(C,limG)[𝒫,𝒞](ΔC,G) \mathcal{C}(C,limG)\cong[\mathcal{P},\mathcal{C}](\Delta C, G) natural in CC, where ΔC:𝒫𝒞\Delta C:\mathcal{P}\to\mathcal{C} is the constant functor on the object CC. In other words, the limit is the representing object of the presheaf [𝒫,𝒞](Δ,G):𝒞 opSet. [\mathcal{P},\mathcal{C}](\Delta -,G):\mathcal{C}^\op\to\mathbf{Set}. Since components of a natural transformation ΔCG\Delta C\Rightarrow G (i.e. cones) can be viewed as components of a natural Δ1𝒞(C,G):𝒞Set\Delta\mathbf{1}\Rightarrow\mathcal{C}(C,G-):\mathcal{C}\to\mathbf{Set}, the above defining isomorphism can be written as 𝒞(C,limG)[𝒫,Set](Δ1,𝒞(C,G)). \mathcal{C}(C,\mathrm{lim}G)\cong[\mathcal{P},\mathbf{Set}](\Delta\mathbf{1},\mathcal{C}(C,G-)). In this form, ordinary limits can be easily seen as particular examples of conical indexed limits for 𝒱\mathcal{V}=Set\mathbf{Set}, and we are able to generalize the concept of a limit by replacing the functor Δ1\Delta\mathbf{1} by an arbitrary functor (weight) 𝒞Set\mathcal{C}\to\mathbf{Set}.

We may thus think of a 2-functor F:𝒫CatF:\mathcal{P}\to\mathbf{Cat} as a (small) indexing type or weight, and a 2-functor G:𝒫𝒦G:\mathcal{P}\to\mathcal{K} as a diagram in 𝒦\mathcal{K} of shape 𝒫\mathcal{P}: Cat weight F 𝒫 diagramG 𝒦. \begin{matrix} & \mathbf{Cat}\quad \\ {}^{weight} \nearrow_{F} & \\ \mathcal{P} & \overset{G}\underset{diagram}{\rightarrow} & \mathcal{K}. \\ \end{matrix} The 2-functor GG gives rise to a 2-functor p[Fp,𝒦(,Gp)]=[𝒫,Cat](F,𝒦(,G)):𝒦 opCat \int_p [Fp,\mathcal{K}(-,Gp)]=[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(-,G)):\; \mathcal{K}^\op\longrightarrow\mathbf{Cat} which maps a 0-cell AA to the category [𝒫,Cat](F,𝒦(A,G))[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(A,G)). A representation of this contravariant 2-functor is an object {F,K}𝒦\{F,K\}\in\mathcal{K} along with 2-natural isomorphism 𝒦(,{F,G})[𝒫,Cat](F,𝒦(,G))\mathcal{K}(-,\{F,G\})\xrightarrow{\;\sim\;}[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(-,G)) with components isomorphisms between categories 𝒦(A,{F,G})[𝒫,Cat](F,𝒦(A,G)). \mathcal{K}(A,\{F,G\})\cong[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(A,G-)). The unit of this representation is 1[𝒫,Cat](F,𝒦({F,G},G))\mathbf{1}\to[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(\{F,G\},G)) which corresponds uniquely to a 2-natural transformation ξ:F𝒦({F,G},G)\xi:F\Rightarrow\mathcal{K}(\{F,G\},G).

Via this 2-natural isomorphism, the object {F,G}\{F,G\} in 𝒦\mathcal{K} satisfies a universal property which can be expressed in two levels:

  • The 1-dimensional aspect of the universal property states that every natural transformation ρ\rho factorizes as Fρ 𝒦(A,G) ξ 𝒦(h,1) 𝒦({F,G},G) \begin{matrix} F \xrightarrow{\rho} & \mathcal{K}(A,G) \\ {}_\xi \searrow & \uparr_{\mathcal{K}(h,1)} \\ & \mathcal{K}(\{F,G\},G) \\ \end{matrix} for a unique 1-cell h:A{F,G}h:A\to\{F,G\} in 𝒦\mathcal{K}, where the vertical arrow is just pre-composition with hh.

  • The 2-dimensional aspect of the universal property states that every modification θ:ρρ\theta:\rho\Rrightarrow\rho' factorizes as 𝒦(α,1)ξ\mathcal{K}(\alpha,1)\cdot \xi for a unique 2-cell α:hh\alpha:h\Rightarrow h' in 𝒦\mathcal{K}.

The fact that the 2-dimensional aspect (which asserts an isomorphism of categories) does not in general follow from the 1-dimensional aspect (which asserts a bijection between the hom-sets of the underlying categories) is a recurrent issue of the paper. In fact, things would be different if the underlying category functor 𝒱(I,)=() 0:𝒱-CatCat \mathcal{V}(I,-)=(\;)_0:\mathcal{V}\text{-}\mathbf{Cat}\to\mathbf{Cat} were conservative, in which case the 2-dimensional universal property would always imply the 1-dimensional one. Certainly though, this is not the case for 𝒱\mathcal{V}=Cat\mathbf{Cat}: the respective functor discards all the 2-cells and is not even faithful. However, if we know that a weighted limit exists, then the first level of the universal property suffices to detect it up to isomorphism.

Completeness of 2-categories

A 2-category 𝒦\mathcal{K} is complete when all limits {F,G}\{F,G\} exist. The defining 2-natural isomorphism extends the mapping (F,G){F,G}(F,G)\mapsto\{F,G\} into a functor of two variables (the weighted limit functor) {,}:[𝒫,Cat] op×[𝒫,𝒦]𝒦 \{-,-\}:[\mathcal{P},\mathbf{Cat}]^{op}\times[\mathcal{P},\mathcal{K}]\longrightarrow \mathcal{K} as the left parametrized adjoint (actually its opposite) of the functor 𝒦(,?):𝒦 op×[𝒫,𝒦][𝒫,Cat] \mathcal{K}(-,?):\mathcal{K}^{op}\times[\mathcal{P},\mathcal{K}]\to[\mathcal{P},\mathbf{Cat}] mapping an object AA and a functor GG to 𝒦(A,G)\mathcal{K}(A,G-). A colimit in 𝒦\mathcal{K} is a limit in 𝒦 op\mathcal{K}^op, and the weighted colimit functor is *:[𝒫 op,Cat]×[𝒫,𝒦]𝒦. -\ast-:[\mathcal{P}^op,\mathbf{Cat}]\times[\mathcal{P},\mathcal{K}]\longrightarrow\mathcal{K}. Apart from the evident duality, we observe that often colimits are harder to compute than limits. This may partially be due to the fact that {F,G}\{F,G\} is determined by the representable 𝒦(,{F,G})\mathcal{K}(-,\{F,G\}) which gives generalized elements of {F,G}\{F,G\}, whereas the description of 𝒦(F*G,)\mathcal{K}(F\ast G,-) gives us arrows out of F*GF\ast G. For example, limits in Cat\mathbf{Cat} are easy to compute via [𝒜,{F,G}][𝒫,Cat](F,[𝒜,G])[𝒜,[𝒫,Cat](F,G)] [\mathcal{A},\{F,G\}]\cong[\mathcal{P},\mathbf{Cat}](F,[\mathcal{A},G-])\cong[\mathcal{A},[\mathcal{P},\mathbf{Cat}](F,G)] and in particular, taking 𝒜=1\mathcal{A}=\mathbf{1} gives us the objects of the category {F,G}\{F,G\} and 𝒜=2\mathcal{A}=\mathbf{2} gives us the morphisms. On the contrary, colimits in Cat\mathbf{Cat} are not straightforward (except than their property F*GG*FF\ast G\cong G\ast F).

Notice that like ordinary limits are defined, via representability, in terms of limits in Set\mathbf{Set}, we can define weighted limits in terms of limits of representables in Cat\mathbf{Cat}: 𝒦(A,{F,G}){F,𝒦(A,G)},𝒦(F*,G,A){F,𝒦(G,A)}. \mathcal{K}(A,\{F,G\})\cong\{F,\mathcal{K}(A,G-)\},\quad\mathcal{K}(F\ast ,G,A)\cong\{F,\mathcal{K}(G-,A)\}. On the other hand, if the weights are representables, via Yoneda lemma we get {𝒫(P,),G}GP,𝒫(,P)*GGP. \{\mathcal{P}(P,-),G\}\cong GP, \qquad \mathcal{P}(-,P)\ast G\cong GP.

The main result for general 𝒱\mathcal{V}-completeness in Kelly’s book says that a 𝒱\mathcal{V}-enriched category is complete if and only if it admits all conical limits (equivalently, products and equalizers) and cotensor products. Explicitly, conical limits are those with weight the constant 𝒱\mathcal{V}-functor ΔI\Delta I, whereas cotensors are those where the domain enriched category 𝒫\mathcal{P} is the unit category 1 \mathbf{1}, hence the weight and the diagram are determined by objects in 𝒱\mathcal{V} and 𝒦\mathcal{K} respectively. Once again, for 𝒱\mathcal{V}=Cat\mathbf{Cat} an elementary description of both limits is possible.

Notice that when a 2-category admits tensor products of the form 2*A\mathbf{2}\ast A, then the 2-dimensional universal property follows from the 1-dimensional for every limit, because of conservativity of the functor Cat 0(2,)\mathbf{Cat}_0(\mathbf{2},-) and the definition of tensors. Moreover, the former also implies that the category 2\mathbf{2} is a strong generator in Cat\mathbf{Cat}, hence the existence of only the cotensor {2,B}\{\mathbf{2},B\} along with conical limits in a 2-category 𝒦\mathcal{K} is enough to deduce 2-completeness.

Cat\mathbf{Cat} itself has cotensor and tensor products, given by {𝒜,}=[𝒜,]\{\mathcal{A},\mathcal{B}\}=[\mathcal{A},\mathcal{B}] and 𝒜*=𝒜×\mathcal{A}\ast\mathcal{B}=\mathcal{A}\times\mathcal{B}. It is ultimately also cocomplete, all colimits being constructed from tensors and ordinary colimits in Cat 0\mathbf{Cat}_0 (which give the conical limits in Cat\mathbf{Cat} by the existence of the cotensor [2,B][\mathbf{2},B]).

If we were to make use of ends and coends, the explicit construction of an arbitrary 2-(co)limit in 𝒦\mathcal{K} as the (co)equalizer of a pair of arrows between (co)products of (co)tensors coincides with {F,G}= K{FK,GK},F*G= KFK*GK. \{F,G\}=\int_K \{FK,GK\}, \qquad F\ast G=\int^K FK\ast GK. Such an approach simplifies the proofs of many useful properties of limits and colimits, such as {F,{G,H}}{F*G,H},(G*F)*HF*(G*H) \{F,\{G,H\}\}\cong\{F\ast G,H\},\;\;(G\ast F)\ast H\cong F\ast(G\ast H) for appropriate 2-functors.

Famous finite 2-limits

The paper provides the description of some important classes of limits in 2-categories, essentially by exhibiting the unit of the defining representation for each particular case. A table which summarizes the main examples included is the following:


Let’s briefly go through the explicit construction of an inserter in a 2-category 𝒦\mathcal{K}. The weight and diagram shape are as in the first line of the above table, and denote by BgfCB\overset{f}\underset{g}{\rightrightarrows}C the image of the diagram in 𝒦\mathcal{K}. The standard technique is to identify the form of objects and morphisms of the functor 2-category [𝒫,Cat](F,𝒦(A,G))[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(A,G-)), and then state both aspects of the universal property.

An object is a 2-natural transformation α:F𝒦(A,G)\alpha:F\Rightarrow\mathcal{K}(A,G-) with components α :1𝒦(A,B)\alpha_\bullet:1\to\mathcal{K}(A,B) and α :2𝒦(A,C)\alpha_\star:\mathbf{2}\to\mathcal{K}(A,C) satisfying the usual naturality condition (2-naturality follows trivially, since 𝒫\mathcal{P} only has the identity 2-cell). This amounts to the following data:

  • an 1-cell Aα BA\xrightarrow{\alpha_\bullet}B, i.e. the object in 𝒦(A,B)\mathcal{K}(A,B) determined by the functor α \alpha_\bullet;

  • a 2-cell α 0α α 1{\alpha_\star0}\overset{\alpha_\star}{\Rightarrow}{\alpha_\star1}, i.e. the morphism in 𝒦(A,C)\mathcal{K}(A,C) determined by the functor α \alpha_\star;

  • properties, which make the 1-cells α 0,α 1\alpha_\star0,\alpha_\star1 factorize as α 0=Aα BfC\alpha_\star0=A\xrightarrow{\alpha_\bullet}B\xrightarrow{f}C and α 1=Aα BgC\alpha_\star1=A\xrightarrow{\alpha_\bullet}B\xrightarrow{g}C.

We can encode the above data by a diagram B α f A α C. α g B \begin{matrix} & B & \\ {}^{\alpha_\bullet} \nearrow && {\searrow}^f \\ A\; & \Downarrow{\alpha_\star}& \quad C. \\ {}_{\alpha_\bullet} \searrow && \nearrow_g \\ & B & \\ \end{matrix} Now a morphism is a modification m:αβm:\alpha\Rrightarrow\beta between two objects as above. This has components

  • m :α β m_\bullet:\alpha_\bullet\Rightarrow\beta_\bullet in 𝒦(A,B)\mathcal{K}(A,B);

  • m :α β m_\star:\alpha_\star\Rightarrow\beta_\star given by 2-cells m 0:α 0β 0m_\star^0:\alpha_\star0\Rightarrow{\beta_\star0} and m 1:α 1β 1m_\star^1:\alpha_\star1\Rightarrow\beta_\star1 in 𝒦(A,C)\mathcal{K}(A,C) satisfying naturality m 1α =β m 0m^1_\star\circ\alpha_\star=\beta_\star\circ m^0_\star.

The modification condition m 0=fm m^0_\star=f\cdot m_\bullet and m 1=gm m^1_\star=g\cdot m_\bullet gives the components of m m_\star as whiskered composites of m m_\bullet. We can thus express such a modification as a 2-cell m m_\bullet satisfying gm α =fm β gm_\bullet\circ\alpha_\star=fm_\bullet\circ\beta_\star (graphically expressed by pasting m m_\bullet accordingly to the sides of α ,β \alpha_\star,\beta_\star).

This encoding simplifies the statement of the universal property for {F,G}\{F,G\}, as the object of in 𝒦\mathcal{K} through which any natural transformation and modification uniquely factorize in an appropriate way (in fact, through the unit ξ\xi). A very similar process can be followed for the identification of the other classes of limits. As an illustration, let’s consider some of these limits in the 2-category Cat\mathbf{Cat}.

  • The inserter of two functors F,G:𝒞F,G:\mathcal{B}\to\mathcal{C} is a category 𝒜\mathcal{A} with objects pairs (B,h)(B,h) where BB\in\mathcal{B} and h:FBGBh:FB\to GB in 𝒞\mathcal{C}. A morphism (B,h)(B,h)(B,h)\to(B',h') is an arrow f:BBf:B\to B' in \mathcal{B} such that the following diagram commutes: FB Ff FB h h FB Gh GB. \begin{matrix} FB & \overset{Ff}{\longrightarrow} & FB' \\ {}_h\downarrow && \downarrow_{h'} \\ FB & \underset{Gh}{\longrightarrow} & GB'. \\ \end{matrix} The functor α =P:𝒜\alpha_\bullet=P:\mathcal{A}\to\mathcal{B} is just the forgetful functor, and the natural transformation is given by (α ) (B,h)=h(\alpha_\star)_{(B,h)}=h.

  • The comma-object of two functors F,GF,G is precisely the comma category. If the functors have also the same domain, their inserter is a subcategory of the comma category.

  • The equifier of two natural transformations ϕ 1,ϕ 2:FG:𝒞\phi^1,\phi^2:F\Rightarrow G:\mathcal{B}\to\mathcal{C} is the full subcategory 𝒜\mathcal{A} of \mathcal{B} over all objects BB such that ϕ B 1=ϕ B 2\phi^1_B=\phi^2_B in 𝒞\mathcal{C}.

There is a variety of constructions of new classes of limits from given ones, coming down to the existence of endo-identifiers, inverters, iso-inserters, comma-objects, iso-comma-objects, lax/ oplax/pseudo limits of arrows and the cotensors {2,K}\{\mathbf{2},K\}, {I,K}\{\mathbf{I},K\} out of inserters, equifiers and binary products in the 2-category 𝒦\mathcal{K}. Along with the substantial construction of arbitrary cotensors out of these three classes, P(roducts)I(nserters)E(quifiers) limits are established as essential tools, also relatively to categories of algebras for 2-monads. Notice that equalizers are `too tight’ to fit in certain 2-categories of importance such as Lex\mathbf{Lex}.

Weaker notions of limits in 2-categories

The concept of a weighted 2-limit strongly depends on the specific structure of the 2-category [𝒫,Cat][\mathcal{P},\mathbf{Cat}] of 2-functors, 2-natural transformations and modifications, for the 2-categories 𝒫\mathcal{P} and Cat\mathbf{Cat}. If we alter this structure by considering lax natural transformations or pseudonatural transformations, we obtain the definition of the lax limit {F,G} l\{F,G\}_l and pseudo limit {F,G} p\{F,G\}_p as the representing objects for the 2-functors Lax[𝒫,Cat](F,𝒦(,G)):𝒦 opCat Psd[𝒫,Cat](F,𝒦(,G)):𝒦 opCat. \begin{matrix} Lax[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(-,G)):\mathcal{K}^\op\to\mathbf{Cat} \\ Psd[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(-,G)):\mathcal{K}^\op\to\mathbf{Cat}. \end{matrix} Notice that the functor categories Lax[𝒫,]Lax[\mathcal{P},\mathcal{L}] and Psd[𝒫,]Psd[\mathcal{P},\mathcal{L}] are 2-categories whenever \mathcal{L} is a 2-category, hence the defining isomorphisms are again between categories as before.

An important remark is that any lax or pseudo limit in 𝒦\mathcal{K} can be in fact expressed as a `strict’ weighted 2-limit. This is done by replacing the original weight with its image under the left adjoint of the incusion functors [𝒫,Cat]Lax[𝒫,Cat][\mathcal{P},\mathbf{Cat}]\hookrightarrow Lax[\mathcal{P},\mathbf{Cat}], [𝒫,Cat]Psd[𝒫,Cat][\mathcal{P},\mathbf{Cat}]\hookrightarrow Psd[\mathcal{P},\mathbf{Cat}]. The opposite does not hold: for example, inserters and equifiers are neither lax not pseudo limits.

We can relax the notion of limits in 2-categories even further, and define the bilimit {F,G} b\{F,G\}_b of 2-functors FF and GG as the representing object up to equivalence: 𝒦(A,{F,G} b)Psd[𝒫,Cat](F,𝒦(A,G)). \mathcal{K}(A,\{F,G\}_b)\simeq Psd[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(A,G)). This is of course a particular case of general bilimits in bicategories, for which 𝒫\mathcal{P} and 𝒦\mathcal{K} are requested to be bicategories and FF and GG homomorphism of bicategories. The above equivalence of categories expresses a birepresentation of the homomorphism Hom[𝒫,Cat](F,𝒦(,G)):𝒦 opCatHom[\mathcal{P},\mathbf{Cat}](F,\mathcal{K}(-,G)):\mathcal{K}^op\to\mathbf{Cat}.

Evidently, bilimits (firstly introduced by Ross Street) may exist even when pseudo limits do not, since they require an equivalence rather than isomorphism of hom-categories. The following two results sum up the conditions ensuring whether a 2-category has all lax, pseudo and bilimits.

  • A 2-category with products, inserters and equifiers has all lax and pseudo limits (whereas it may not have all strict 2-limits).

  • A 2-category with biproducts, biequalizers and bicotensors is bicategorically complete. Equivalently, it admits all bilimits if and only if for all 2-functors F:𝒫CatF:\mathcal{P}\to\mathbf{Cat}, G:𝒫𝒦G:\mathcal{P}\to\mathcal{K} from a small ordinary category 𝒫\mathcal{P}, the above mentioned birepresentation exists.

Street’s construction of an arbitrary bilimit requires a descent object of a 3-truncated bicosimplicial object in 𝒦\mathcal{K}. An appropriate modification of the arguments exhibits lax and pseudo limits as PIE limits.

These weaker forms of limits in 2-categories are fundamental for the theory of 2-categories and bicategories. Many important constructions such as the Eilenberg-Moore object as well as the Grothendieck construction on a fibration, arise as lax/oplax limits. They are also crucial in 2-monad theory, for example when studying categories of (strict) algebras with non-strict (pseudo or even lax/oplax) morphisms, which are more common in nature.

Author: "riehl (eriehl@math.harvard.edu)" Tags: "Categories"
Send by mail Print  Save  Delicious 
Date: Monday, 14 Apr 2014 17:16

A new Spanish language mathematical magazine has been launched: universo.math. Hispanophones should check out the first issue! There are some very interesting looking articles which cover areas from art through politics to research-level mathematics.

The editor-in-chief is my mathematical brother Jacob Mostovoy and he wants it to be a mix of Mathematical Intellingencer, Notices of the AMS and the New Yorker, together with less orthodox ingredients; the aim is to keep the quality high.

Besides Jacob, the contributors to the first issue that I recognise include Alberto Verjovsky, Ernesto Lupercio and Edward Witten, so universo.math seems to be off to a high quality start.

Author: "willerton (S.Willerton@sheffield.ac.uk)" Tags: "Scientific Communication"
Send by mail Print  Save  Delicious 
Date: Thursday, 10 Apr 2014 08:54

Guest post by Bruce Bartlett

The following is the greatest math talk I’ve ever watched!

  • Etienne Ghys (with pictures and videos by Jos Leys), Knots and Dynamics, ICM Madrid 2006. [See below the fold for some links.]

Etienne Ghys A modular knot

I wasn’t actually at the ICM; I watched the online version a few years ago, and the story has haunted me ever since. Simon and I have been playing around with some of this stuff, so let me share some of my enthusiasm for it!

The story I want to tell here is how, via modular flow of lattices in the plane, certain matrices in SL(2,)\SL(2,\mathbb{Z}) give rise to knots in the 3-sphere less a trefoil knot. Despite possibly sounding quite scary, this can be easily explained in an elementary yet elegant fashion.

As promised above, here are some links related to Ghys’ ICM talk.

I’m going to focus on the last third of the talk — the modular flow on the space of lattices. That’s what produced the beautiful picture above (credit for this and other similar pics below goes to Jos Leys; the animation is Simon’s.)

Lattices in the plane

For us, a lattice is a discrete subgroup of \mathbb{C}. There are three types: the zero lattice, the degenerate lattices, and the nondegenerate lattices:


Given a lattice LL and an integer n4n \geq 4 we can calculate a number — the Eisenstein series of the lattice: G n(L)= ωL,ω01ω n. G_{n}(L) = \sum _{\omega \in L, \omega \neq 0} \frac{1}{\omega ^{n}}. We need n3n \geq 3 for this sum to converge. For, roughly speaking, we can rearrange it as a sum over rr of the lattice points on the boundary of a square of radius rr. The number of lattice points on this boundary scales with rr, so we end up computing something like r0rr n\sum _{r \geq 0} \frac{r}{r^{n}} and so we need n3n \geq 3 to make the sum converge.

Note that G n(L)G_{n}(L) = 0 for nn odd since every term ω\omega is cancelled by the opposite term ω-\omega . So, the first two nontrivial Eisenstein series are G 4G_{4} and G 6G_{6}. We can use them to put `Eisenstein coordinates’ on the space of lattices.

Theorem: The map {lattices} 2 L (G 4(L),G 6(L)) \begin{aligned} \{ \text{lattices} \} &\rightarrow \mathbb{C}^{2} \\ L & \mapsto (G_{4} (L), \, G_{6}(L)) \end{aligned} is a bijection.

The nicest proof is in Serre’s A Course in Arithmetic, p. 89. It is a beautiful application of the Cauchy residue theorem, using the fact that G 4G_{4} and G 6G_{6} define modular forms on the upper half plane HH. (Usually, number theorists set up their lattices so that they have basis vectors 11 and τ\tau where τH\tau \in H. But I want to avoid this ‘upper half plane’ picture as far as possible, since it breaks symmetry and mystifies the geometry. The whole point of the Ghys picture is that not breaking the symmetry reveals a beautiful hidden geometry! Of course, sometimes you need the ‘upper half plane’ picture, like in the proof of the above result.)

Lemma: The degenerate lattices are the ones satisfying 20G 4 349G 6 2=020 G_{4}^{3} - 49G_{6}^{2} = 0.

Let’s prove one direction of this lemma — that the degenerate lattices do indeed satisfy this equation. To see this, we need to perform a computation. Let’s calculate G 4G_{4} and G 6G_{6} of the lattice \mathbb{Z} \subset \mathbb{C}. Well, G 4()= n01n 4=2ζ(4)=2π 490 G_{4}(\mathbb{Z}) = \sum _{n \neq 0} \frac{1}{n^{4}} = 2 \zeta (4) = 2 \frac{\pi ^{4}}{90} where we have cheated and looked up the answer on Wikipedia! Similarly, G 6()=2π 6945G_{6}(\mathbb{Z}) = 2 \frac{\pi ^{6}}{945}.

So we see that 20G 4() 349G 6() 2=020 G_{4}(\mathbb{Z})^{3} - 49 G_{6}(\mathbb{Z})^{2} = 0. Now, every degenerate lattice is of the form tt \mathbb{Z} where tt \in \mathbb{C}. Also, if we transform the lattice via LtLL \mapsto t L, then G 4t 4G 4G_{4} \mapsto t^{-4} G_{4} and G 6t 6G 6G_{6} \mapsto t^{-6} G_{6}. So the equation remains true for all the degenerate lattices, and we are done.

Corollary: The space of nondegenerate lattices in the plane of unit area is homeomorphic to the complement of the trefoil in S 3S^{3}.

The point is that given a lattice LL of unit area, we can scale it LλLL \mapsto \lambda L, λ +\lambda \in \mathbb{R}^{+} until (G 4(L),G 6(L))(G_{4}(L), G_{6}(L)) lies on the 3-sphere S 3={(z,w):|z| 2+|w| 2=1} 2S^{3} = \{ (z,w) : |z|^{2} + |w|^{2} = 1\} \subset \mathbb{C}^{2}. And the equation 20z 349w 2=020 z^{3} - 49 w^{2} = 0 intersected with S 3S^{3} cuts out a trefoil knot… because it is “something cubed plus something squared equals zero”. And the lemma above says that the nondegenerate lattices are precisely the ones which do not satisfy this equation, i.e. they represent the complement of this trefoil.

Since we have not divided out by rotations, but only by scaling, we have arrived at a 3-dimensional picture which is very different to the 2-dimensional moduli space (upper half-plane divided by SL(2,)\SL(2,\mathbb{Z})) picture familiar to a number theorist.

The modular flow

There is an intriguing flow on the space of lattices of unit area, called the modular flow. Think of LL as sitting in 2\mathbb{R}^{2}, and then act on 2\mathbb{R}^{2} via the transformation (e t 0 0 e t), \left ( \begin{array}{cc} e^{t} & 0 \\ 0 & e^{-t} \end{array} \right ), dragging the lattice LL along for the ride. (This isn’t just some formula we pulled out the blue — geometrically this is the ‘geodesic flow on the unit tangent bundle of the modular orbifold’.)

We are looking for periodic orbits of this flow.

“Impossible!” you say. “The points of the lattice go off to infinity!” Indeed they do… but disregard the individual points. The lattice itself can ‘click’ back into its original position:


How are we to find such periodic orbits? Start with an integer matrix A=(a b c d)SL(2,) A = \left ( \begin{array}{cc} a & b \\ c & d \end{array}\right ) \in \SL(2, \mathbb{Z}) and assume AA is hyperbolic, which simply means |a+d|2|a + d| \geq 2. Under these conditions, we can diagonalize AA over the reals, so we can find a real matrix PP such that PAP 1=±(e t 0 0 e t) P A P^{-1} = \pm \left ( \begin{array}{cc} e^{t} & 0 \\ 0 & e^{-t} \end{array} \right ) for some tt \in \mathbb{R}. Now set LP( 2)L \coloneqq P(\mathbb{Z}^{2}). We claim that LL is a periodic orbit of period tt. Indeed: L t =(e t 0 0 e t)P( 2) =±PA( 2) =±P( 2) =L. \begin{aligned} L_{t} &= \left ( \begin{array}{cc} e^{t} & 0 \\ 0 & e^{-t} \end{array} \right ) P (\mathbb{Z}^{2}) \\ &= \pm PA (\mathbb{Z}^{2}) \\ &= \pm P (\mathbb{Z}^{2}) \\ &= L. \end{aligned} We have just proved one direction of the following.

Theorem: The periodic orbits of the modular flow are in bijection with the conjugacy classes of hyperbolic elements in SL(2,)\SL(2, \mathbb{Z}).

These periodic orbits produce fascinating knots in the complement of the trefoil! In fact, they link with the trefoil (the locus of degenerate lattices) in fascinating ways. Here are two examples, starting with different matrices ASL(2,)A \in \SL(2, \mathbb{Z}).


The trefoil is the fixed orange curve, while the periodic orbits are the red and green curves respectively.

Ghys proved the following two remarkable facts about these modular knots.

  • The linking number of a modular knot with the trefoil of degenerate lattices equals the Rademacher function of the corresponding matrix in SL(2,)\SL(2, \mathbb{Z}) (the change in phase of the Dedekind eta function).
  • The knots occuring in the modular flow are the same as those occuring in the Lorenz equations!

Who would have thought that lattices in the plane could tell the weather!!

I must say I have thought about many aspects of these closed geodesics, but it had never crossed my mind to ask which knots are produced. – Peter Sarnak

Author: "willerton (S.Willerton@sheffield.ac.uk)" Tags: "Dynamical systems"
Send by mail Print  Save  Delicious 
Date: Monday, 07 Apr 2014 04:50

Guest post by Sean Moss

In this post I shall discuss the paper “On a Topological Topos” by Peter Johnstone. The basic problem is that algebraic topology needs a “convenient category of spaces” in which to work: the category 𝒯\mathcal{T} of topological spaces has few good categorical properties beyond having all small limits and colimits. Ideally we would like a subcategory, containing most spaces of interest, which is at least cartesian closed, so that there is a useful notion of function space for any pair of objects. A popular choice for a “convenient category” is the full subcategory of 𝒯\mathcal{T} consisting of compactly-generated spaces. Another approach is to weaken the notion of topological space, i.e. to embed 𝒯\mathcal{T} into a larger category, hopefully with better categorical properties.

A topos is a category with enough good properties (including cartesian closedness) that it acts like the category of sets. Thus a topos acts like a mathematical universe with ‘sets’, ‘functions’ and its own internal logic for manipulating them. It is exciting to think that if a “convenient topos of spaces” could be found, then its logical aspects could be applied to the study of its objects. The set-like nature of toposes might make it seem unlikely that this can happen. For instance, every topos is balanced, but the category of topological spaces is famously not. However, sheaves (the objects in Grothendieck toposes) originate from geometry and already behave somewhat like generalized spaces.

I shall begin by elaborating on this observation about Grothendieck toposes, and briefly examine some previous attempts at a “topological topos”. I shall explore the idea of replacing open sets with convergent sequences and see how this leads us to Johnstone’s topos. Finally I shall describe how Johnstone’s topos is a good setting for homotopy theory.

I would to thank Emily Riehl for organizing the Kan Extension Seminar and for much useful feedback. Thanks go also to the other seminar participants for being a constant source of interesting perspectives, and to Peter Johnstone, for his advice and for writing this incredible paper.

Are there toposes of spaces?

We shall need to be flexible about what we mean by “space”. For the rest of this post I shall try to use the term “topological space” in a strict technical sense (a set of points plus specified open sets), whereas “space” will be a nebulous concept. The idea is that spaces have existence regardless of having been implemented as a topological space or not, and may naturally have more (or perhaps less) structure. Topology merely forms one setting for their rigorous study. In topology we can only detect the topological properties of spaces. For example, \mathbb{R} and (0,1)(0,1) are isomorphic as topological spaces, but they are far from being the same space: consider how different their implementations as, say, metric spaces are. Some spaces are naturally considered as having algebraic or smooth structure. The type of question one wishes to ask about a space will bear upon the type of object as which it should be implemented.

An extremely important class of toposes consists of the Grothendieck toposes, which are categories of sheaves on a site. A site is a small category together with a Grothendieck coverage (also known as a Grothendieck topology). Informally, the Grothendieck coverage tells us how some objects can be “covered” by maps coming out of other objects. In the special case where the site is a topological space, the objects are open sets and the coverage tells us that an open set is covered by the inclusions of any family of open sets whose union is all of that open set. A sheaf on a site is then a contravariant Set\mathrm{Set}-valued functor on the underlying category (a presheaf) which satisfies a “unique patching” condition with respect to each covering sieve.

In the following two senses, a Grothendieck topos always behaves like a category of spaces:

(A) One way to describe the properties of a space is to consider the maps into that space. This is the idea behind the homotopy groups, where we consider (homotopy classes of) maps from the nn-sphere into a space. Given a small category 𝒞\mathcal{C}, each object is determined by knowing all the arrows into it and how these arrows “restrict” along other arrows, i.e. precisely the data of the representable presheaf. A non-representable presheaf can be viewed as a generalized object of 𝒞\mathcal{C}, which is testable by the ‘classical’ objects of 𝒞\mathcal{C}: it is described entirely by what the maps into it from objects of 𝒞\mathcal{C} ought to be. If we have in mind that 𝒞\mathcal{C} is some category of spaces, with some sense in which some spaces are covered by families of maps out of other spaces, (i.e. we have a Grothendieck coverage), then we should be able to patch maps into these generalized spaces together. So the topos of sheaves on this site is a setting in which we may be able to implement certain spaces, if we wish to study their properties testable by objects of 𝒞\mathcal{C}.

(B) The category of presheaves on a small category 𝒞\mathcal{C} is its free cocompletion. Intuitively, it is the category of objects obtained by formally gluing together objects of 𝒞\mathcal{C}. The use of the word “gluing” is itself a spatial metaphor. CW-complexes are built out of gluing together cells - simplicial sets are instructions for carrying out this gluing. Manifolds are built from gluing together open subsets of Euclidean space. Purely formal ‘gluing’ is not quite sufficient: the Yoneda embedding of 𝒞\mathcal{C} into its presheaves typically does not preserve any colimits already in 𝒞\mathcal{C}. But if 𝒞\mathcal{C} is a category of spaces, its objects are not neutral with respect to each other: there may be a suitable Grothendieck coverage on 𝒞\mathcal{C} which tells us how some objects can cover others. The topos of sheaves is then the category of objects obtained by formally gluing objects of 𝒞\mathcal{C} in a way that respects these coverings. This is strongly connected with the preservation of colimits by the embedding of 𝒞\mathcal{C} into the sheaves. Colimits in the presheaf topos are constructed pointwise; to get the sheaf colimit one applies the reflection into the category of sheaves (“sheafification”) to the presheaf colimit. The more covers imposed on 𝒞\mathcal{C}, the more work is done by the sheafification, so the closer we end up to the original colimit.

Are there toposes in topology?

It is far from clear that we can choose a site for which the space-like behaviour of sheaves accords with the usual topological intuition. If we want to use a topological topos for homotopy theory, then ideally it should contain objects that we can recognize as the CW-complexes, and we should be able to construct them via more or less the usual colimits.

Attempt 1: The “gros topos” of Giraud

The idea is to take sheaves on the ‘site’ of topological spaces, where covers are given by families of open inclusions of subspaces whose union is the whole space. We do not automatically get a topos unless the site is small, so instead take some small, full subcategory 𝒞\mathcal{C} of 𝒯\mathcal{T}, which is closed under open subspaces. The gros topos is the topos of sheaves for this site.

The Yoneda embedding exhibits 𝒞\mathcal{C} as a subcategory, and in fact we can ‘embed’ 𝒯\mathcal{T} via the functor Xhom 𝒯(,X)X \mapsto \hom_\mathcal{T}(-,X), this will be full and faithful on a fairly large subcategory. By (B) one may like to consider the gros topos as the category of spaces glued together from objects of 𝒞\mathcal{C}. This turns out not to be useful, since the site does not have enough covers for colimits to agree with those in 𝒯\mathcal{T}. Moreover the site is so large that calculations are difficult.

Attempt 2: Lawvere’s topos

We use observation (A). Motivated by the use of paths in homotopy theory, we take MM to be the full subcategory of 𝒯\mathcal{T} whose only object is the closed unit interval II. So MM is the monoid of continuous endomorphisms of II. Lawvere’s topos \mathcal{L} is the topos of sheaves on MM with respect to the canonical Grothendieck coverage (the largest Grothendieck coverage on MM for which hom M(,I)\hom_M(-,I) is a sheaf).

Then an object XX of \mathcal{L} is a set X(I)X(I) of paths, together with, for any continuous γ:II\gamma\colon I \to I, a reparametrization map X(γ):X(I)X(I)X(\gamma) \colon X(I) \to X(I), where this assignment is functorial. The points of such a space are given by natural transformations 1X1 \to X, i.e. ‘constant paths’ or paths which are fixed by every reparametrization. We can see which point a path visits at time tt by reparametrizing that path by the constant map III \to I with value tt. A word of caution: a given object in \mathcal{L} may have distinct paths which agree on points for all time.

This site is much easier to calculate with than the gros site (once we have a handle on the canonical coverage). Again there is a functor P:𝒯P \colon \mathcal{T} \to \mathcal{L} given by Xhom 𝒯(I,X)X \mapsto \hom_\mathcal{T}(I,X), which is full and faithful on a fairly large subcategory (including CW-complexes). However, it is still the case that the site could do with more covers: the functor PP does not preserve all the colimits used to build up CW-complexes. By observation (B), an object of \mathcal{L} is obtained by gluing together copies of the unit interval II, so it is possible to construct the circle S 1S^1 out of copies of II, but we cannot do this in the usual way. The coequalizer of II by its endpoints in \mathcal{L} is not S 1S^1, but a “signet-ring”: it is a circle with a ‘lump’, through which a path can cross only if waits there for non-zero time. We cannot solve this problem by adding in more covers, because the coverage is already canonical (adding in more covers will evict the representable hom 𝒯(,I)\hom_\mathcal{T}(-,I) from the topos).

The key idea in Johnstone’s topos is to replace paths with convergent sequences. Given a topological space XX, a convergent sequence in XX is a function from a:{}Xa \colon \mathbb{N}\cup\{\infty\} \to X such that whenever UXU \subseteq X is an open set containing a a_\infty, then there exists an NN such that a nUa_n \in U for all n>Nn \gt N. The convergent sequences are precisely the continuous maps out of {}\mathbb{N}\cup\{\infty\} when we give it the topology that makes it the one-point compactification of the discrete space \mathbb{N} - we denote this topological space by +\mathbb{N}^+.

Convergent sequences as primitive

It is a basic theorem in general topology that, given a function f:XYf\colon X \to Y between topological spaces, if it is continuous then it preserves convergent sequences. The converse is not true for general topological spaces, but it is true whenever YY is a sequential space. Given a topological space XX, A set UXU \subseteq X is sequentially open if for any convergent sequence (a n)(a_n), with a Ua_\infty \in U, (a n)(a_n) is eventually in UU. (Clearly any open subset is sequentially open.) A topological space is then said to be sequential if all of its sequentially open sets are open. The sequential spaces include all first-countable spaces and in fact they can be characterized as the topological quotients of metrizable spaces, so they certainly include all CW-complexes.

The notion of convergent sequence is arguably more intuitive than that of open set. For example, each convergent sequence gives you concrete data about the nearness of some family of points to another point, whereas open sets only give you such data when the topology (or at least a neighbourhood basis) is considered as a whole. It would be compelling to define a continuous function as one that preserves convergent sequences. This motivates the study of subsequential spaces.

A subsequential space consists of a set XX (of points) and family of “convergent sequences”: a specified subset of the set of functions {}X\mathbb{N}\cup\{\infty\} \to X, such that:

  1. for every point xXx \in X, the constant sequence (x)(x) converges to xx;
  2. if (x n)(x_n) converges to xx, then so does every subsequence of (x n)(x_n);
  3. if (x n)(x_n) is a sequence and xx is a point such that every subsequence of (x n)(x_n) contains a (sub)subsequence converging to xx, then (x n)(x_n) converges to xx.

The third axiom is the general form of intertwining two or more sequences with the same limit or changing a finite initial segment of a sequence. Note that there is no ‘Hausdorff‘-style condition on the convergent sequences: a sequence may converge to more than one limit. A continuous map between subsequential spaces XYX \to Y is a function from the points of XX to the points of YY that preserves convergence of sequences.

The axioms above are all true of the set of convergent sequences which arise from a topology on a set. In fact, this process gives a full and faithful embedding of sequential spaces into subsequential spaces. Thus sequential spaces live inside both topological and subsequential spaces. They are coreflective in the former and reflective in the latter: given a topological space XX, its sequentially open sets constitute a new (finer) topology; given a subsequential space YY, we can consider the “sequentially open” sets with respect to its convergent sequences, and then take all convergent sequences in the resulting (sequential) topological space. Observe that the sense of the adjunction in each case comes from the fact that we either throw in more open sets - so there is a natural map (X) seqX(X)_\text{seq} \to X, or throw in more convergent sequences - so there is a natural map Y(Y) seqY \to (Y)_\text{seq}.

In the following I shall denote the category of sequential spaces by \mathcal{F} and that of subsequential spaces by \mathcal{F}'.

Johnstone’s topos

Let Σ\Sigma be the full subcategory of 𝒯\mathcal{T} on the objects 11 (the singleton space) and +\mathbb{N}^+ (the one-point compactification of the discrete space of natural numbers). The arrows in this category can be described without topology as well: as functions, the maps + +\mathbb{N}^+ \to \mathbb{N}^+ are the eventually constant ones and the ones that “tend to infinity”.

Given an infinite subset TT \subseteq \mathbb{N}, let f Tf_T denote the unique order-preserving injection + +\mathbb{N}^+ \to \mathbb{N}^+ whose image is T{}T \cup \{\infty\}. One can check that there is a Grothendieck coverage JJ on Σ\Sigma where 11 is covered by only the maximal sieve, and where +\mathbb{N}^+ is covered by any sieve RR such that:

  1. RR contains all of the points n:1 +n \colon 1 \to \mathbb{N}^+, n +n \in \mathbb{N}^+.
  2. For any infinite subset TT \subseteq \mathbb{N} there exists an infinite subset TTT' \subseteq T such that f TRf_{T'} \in R.

The topos \mathcal{E} is then defined to be Sh(Σ,J)\mathrm{Sh}(\Sigma,J).

The objects in our topos are a slight generalization of subsequential space. If XX \in \mathcal{E}, then X(1)X(1) is its set of points, and X( +)X(\mathbb{N}^+) is its set of convergent sequences. Each point n:1 +n \colon 1 \to \mathbb{N}^+ induces a ‘projection map’ X(n):X( +)X(1)X(n)\colon X(\mathbb{N}^+) \to X(1), giving you the point of the sequence at time nn. The unique map +1\mathbb{N}^+ \to 1 induces a map X(1)X( +)X(1) \to X(\mathbb{N}^+), which sends each point to a canonical choice of constant sequence. Note that there may be more than one convergent sequence with the same points, thus it may be helpful to think of X( +)X(\mathbb{N}^+) as the set of proofs of convergence for sequences.

Clearly we can embed \mathcal{F}', the subsequential spaces, into \mathcal{E}: the points are the same, and the convergence proofs are just the convergent sequences. The first two axioms are satisfied because of the equations that hold in Σ\Sigma. The third axiom is encoded into the coverage. Conversely, any object XX of \mathcal{E} for which the projection maps X(n):X( +)X(1)X(n)\colon X(\mathbb{N}^+) \to X(1), n +n \in \mathbb{N}^+ are jointly injective is isomorphic to one coming from a subsequential space. There is a functor H:𝒯H \colon \mathcal{T} \to \mathcal{E} sending Xhom 𝒯(,X)X \mapsto \hom_\mathcal{T}(-,X), and it is indeed sheaf-valued since it is equal to the composite of the coreflection 𝒯\mathcal{T} \to \mathcal{F} with the inclusions \mathcal{F} \to \mathcal{F}' \to \mathcal{E}. In fact, the Grothendieck coverage defining \mathcal{E} is canonical, so it is the largest for which this functor is well-defined.

We can use observation (B) to think of \mathcal{E} as all spaces constructed from gluing sequences together. It is just about possible that we could have motivated the construction of \mathcal{E} this way: classically, any sequential space XX is the quotient in 𝒯\mathcal{T} of a metrizable space, which may be taken to be a disjoint union of copies of +\mathbb{N}^+ - one for every convergent sequence in XX. Compare this with the canonical representation of a presheaf as a colimit of representables (one for each of its elements).


It turns out that \mathcal{F}' is the subcategory of ¬¬\neg\neg-separated objects in \mathcal{E}, hence it is a reflective subcategory. \mathcal{F} is reflective in \mathcal{F}', hence it is also reflective in \mathcal{E}. In particular, all limits in \mathcal{F} are preserved by the inclusion into \mathcal{E}. Take some caution, however, since products do not agree with those in 𝒯\mathcal{T}: one has to take the sequential coreflection of the topological product. This is only a minor issue; having to modify the product arises in other “convenient categories” such as compactly-generated spaces.

The colimits in \mathcal{F} do agree with those in 𝒯\mathcal{T} because it is a coreflective subcategory. Surprisingly, the inclusion \mathcal{F} \to \mathcal{E} preserves many of these colimits.

Theorem Let XX be a sequential space, and {U ααA}\{U_\alpha \mid \alpha \in A\} an open cover of XX. Then the obvious colimit diagram in \mathcal{F}: U αU β U α U βU γ U β X U γ \begin{matrix} U_\alpha\cap U_\beta & \rightarrow & U_\alpha & & \\ & \searrow & & \searrow & \\ U_\beta \cap U_\gamma & \rightarrow & U_\beta & \rightarrow & X \\ \vdots & \searrow & & \nearrow & \\ & & U_\gamma & & \\ & & \vdots & & \end{matrix} is preserved by the embedding \mathcal{F} \to \mathcal{E}.

Proof The recipe for this sort of theorem is: take the colimit in presheaves, show that the comparison map is monic, then show that it is JJ-dense, for then it will exhibit XX as the colimit upon reflecting into the topos \mathcal{E}. The colimit LL in presheaves is calculated “objectwise”, so LL has the same points of XX, but only those convergent sequences which are entirely within some U αU_\alpha (hence the comparison map LXL \to X is monic). To sheafify, we need to add in all those sequences xX( +)x \in X(\mathbb{N}^+) which are locally in LL, i.e. for which the sieve {f:? +X(f)(x)L(?)} \{f \colon ? \to \mathbb{N}^+ \mid X(f)(x) \in L(?) \} in Σ\Sigma is JJ-covering. For any xX( +)x \in X(\mathbb{N}^+), this sieve clearly contains all the points 1 +1 \to \mathbb{N}^+. But xx must also be eventually within one of the U αU_\alpha, so the second condition for the covering sieves is also satisfied. \square

There are several other colimit preservation results one can talk about (with similar proofs to the above). The amazing consequence of these is that the colimits used to construct CW-complexes are all preserved by the embedding \mathcal{F} \to \mathcal{E}. Thus classical homotopy theory embeds into \mathcal{E} and we have successfully found a topos of spaces which agrees with the classical theory.

Geometric realization

Let Δ\Delta be the category of non-zero finite ordinals and order-preserving maps. Then objects of the presheaf category [Δ op,Set][\Delta^\mathrm{op},\mathrm{Set}] are known as simplicial sets.

Theorem [Δ op,Set][\Delta^\mathrm{op},\mathrm{Set}] is the classifying topos for intervals in Set\mathrm{Set}-toposes.

The closed unit interval [0,1][0,1] is sequential and is in fact an interval (a totally ordered object with distinct top and bottom elements). Thus it corresponds to a geometric morphism [Δ op,Set]\mathcal{E} \to [\Delta^\mathrm{op},\mathrm{Set}] (an adjunction (f f )(f^\star \dashv f_\star) with f f^\star left-exact).

Theorem If S[Δ op,Set]S \in [\Delta^\mathrm{op},\mathrm{Set}] is a simplicial set, then f (S)f^\star(S) is its geometric realization, considered as a sequential space and hence as an object of \mathcal{E}. If XX \in \mathcal{E} is a sequential space, then f (E)f_\star(E) is its singular complex.

The usual geometric realization is not left-exact if considered to take values in 𝒯\mathcal{T}, one must choose a “convenient subcategory” first, and then there is some work to do in proving it. Here the left-exactness just arises out of the general theory of geometric morphisms. Should we wish to do so, the above method allows us to replace [0,1][0,1] with any other object that the internal logic of \mathcal{E} sees as an interval to get a different realization of simplicial sets.

The above is far from a complete survey of “On a Topological Topos”, which contains several more results of interest relating to \mathcal{E} and captures the elegance of using the site Σ\Sigma for calculation - I thoroughly recommend taking a look if you know some topos theory. We have seen enough though to understand that for many spaces the sequential properties align with the topological properties. Unfortunately, \mathcal{E} is yet to receive the attention it deserves.

Author: "riehl (eriehl@math.harvard.edu)" Tags: "Categories"
Send by mail Print  Save  Delicious 
Date: Tuesday, 01 Apr 2014 12:58

Guest post by Nils Carqueville and Daniel Murfet

(My university probably isn’t alone in encouraging mathematicians and computer scientists to embrace the idea of “big data”, or in more sober terminology, “data science”. Here, Nils Carqueville and Daniel Murfet introduce their really excellent article on big data in whole-population surveillance. —TL)

In recent years Big Data has become an increasingly relevant topic in the economic sector, for intelligence agencies, and for the sciences.

A particularly far-reaching development made possible by Big Data is that of unprecedented mass surveillance. As Alexander Beilinson, Stefan Forcey, Tom Leinster and others have pointed out, the role that mathematicians and computer scientists play here is central. With this in mind last January we wrote an essay on

Big Data Power

stressing some of those aspects of the matter that we think deserve more attention or additional elaboration. We hope this to be a useful contribution to the necessary discussion on modern mass surveillance, and we thank Tom for his efforts in this direction, and for allowing us to post here.

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Science and Politics"
Send by mail Print  Save  Delicious 
Date: Monday, 31 Mar 2014 11:16

EDIT: New Submission Deadline: 15 April 2014, Notification of Acceptance: 30 April 2014

A friend, John Wigglesworth, has asked me to announce the following conference. Perhaps he’ll let us know if pluralist homotopy type theorists are welcome ;).

I wonder if any progress has been made on my MO ‘multiverse’ question.

CFP: Symposium on the Foundations of Mathematics, Kurt Gödel Research Center, University of Vienna.

Set theory is taken to serve as a foundation for mathematics. But it is well-known that there are set-theoretic statements that cannot be settled by the standard axioms of set theory. The Zermelo-Fraenkel axioms, with the Axiom of Choice (ZFC), are incomplete. The primary goal of this symposium is to explore the different approaches that one can take to the phenomenon of incompleteness.

One option is to maintain the traditional “universe” view and hold that there is a single, objective, determinate domain of sets. Accordingly, there is a single correct conception of set, and mathematical statements have a determinate meaning and truth-value according to this conception. We should therefore seek new axioms of set theory to extend the ZFC axioms and minimize incompleteness. It is then crucial to determine what justifies some new axioms over others.

Alternatively, one can argue that there are multiple conceptions of set, depending on how one settles particular undecided statements. These different conceptions give rise to parallel set-theoretic universes, collectively known as the “multiverse”. What mathematical statements are true can then shift from one universe to the next. From within the multiverse view, however, one could argue that some universes are more preferable than others.

These different approaches to incompleteness have wider consequences for the concepts of meaning and truth in mathematics and beyond. The conference will address these foundational issues at the intersection of philosophy and mathematics. The primary goal of the conference is to showcase contemporary philosophical research on different approaches to the incompleteness phenomenon.

To accomplish this, the conference has the following general aims and objectives:

(1) To bring to a wider philosophical audience the different approaches that one can take to the set-theoretic foundations of mathematics. (2) To elucidate the pressing issues of meaning and truth that turn on these different approaches. (3) To address philosophical questions concerning the need for a foundation of mathematics, and whether or not set theory can provide the necessary foundation

Date and Venue: 7-8 July 2014 - Kurt Gödel Research Center, Vienna

Confirmed Speakers: Sy-David Friedman (Kurt Gödel Research Center for Mathematical Logic), Hannes Leitgeb (Munich Center for Mathematical Philosophy)

Call for Papers: We welcome submissions from scholars (in particular, young scholars, i.e. early career researchers or post-graduate students) on any area of the foundations of mathematics (broadly construed). Particularly desired are submissions that address the role of set theory in the foundations of mathematics, or the foundations of set theory (universe/multiverse dichotomy, new axioms, etc.) and related ontological and epistemological issues. Applicants should prepare an extended abstract (maximum 1500 words) for blind review, and send it to sotfom [at] gmail [dot] com. The successful applicants will be invited to give a talk at the conference and will be refunded the cost of accommodation in Vienna for two days (7-8 July).

Submission Deadline: 31 March 2014

Notification of Acceptance: 30 April 2014

Scientific Committee: Philip Welch (University of Bristol), Sy-David Friedman (Kurt Gödel Research Center), Ian Rumfitt (University of Birmigham), John Wigglesworth (London School of Economics), Claudio Ternullo (Kurt Gödel Research Center), Neil Barton (Birkbeck College), Chris Scambler (Birkbeck College), Jonathan Payne (Institute of Philosophy), Andrea Sereni (Università Vita-Salute S. Raffaele), Giorgio Venturi (Université de Paris VII, “Denis Diderot” - Scuola Normale Superiore)

Organisers: Sy-David Friedman (Kurt Gödel Research Center), John Wigglesworth (London School of Economics), Claudio Ternullo (Kurt Gödel Research Center), Neil Barton (Birkbeck College), Carolin Antos (Kurt Gödel Research Center)

Conference Website: sotfom [dot] wordpress [dot] com

Further Inquiries: please contact

Claudio Ternullo (ternulc7 [at] univie [dot] ac [dot] at)

Neil Barton (bartonna [at] gmail [dot] com)

John Wigglesworth (jmwigglesworth [at] gmail [dot] com)

Author: "david (d.corfield@kent.ac.uk)" Tags: "Conference"
Send by mail Print  Save  Delicious 
Date: Monday, 31 Mar 2014 10:59

Nina Otter is a master’s student in mathematics at ETH Zürich who has just gotten into the PhD program at Oxford. She and I are writing a paper on operads and the tree of life.

Anyone who knows about operads knows that they’re related to trees. But I’m hoping someone has proved some precise theorems about this relationship, so that we don’t have to.

By operad I’ll always mean a symmetric topological operad. Such a thing has an underlying ‘symmetric collection’, which in turn has an underlying ‘collection’. A collection is just a sequence of topological spaces O nO_n for n0n \ge 0. In a symmetric collection, each space O nO_n has an action of the symmetric group S nS_n.

I’m hoping that someone has already proved something like this:

Conjecture 1. The free operad on the terminal symmetric collection has, as its space of nn-ary operations, the set of rooted trees having some of their leaves labelled {1,,n}\{1, \dots, n\}.

Conjecture 2. The free operad on the terminal collection has, as its space of nn-ary operations, the set of planar rooted trees having some of their leaves labelled {1,,n}\{1, \dots, n\}.

Calling them ‘conjectures’ makes it sound like they might be hard — but they’re not supposed to be hard. If they’re right, they should be easy and in some sense well-known! But there are various slightly different concepts of ‘rooted tree’ and ‘rooted planar tree’, and we have to get the details right to make these conjectures true. For example, a graph theorist might draw a rooted planar tree like this:

while an operad theorist might draw it like this:

If the conjectures are right, we can use them to define the concepts of ‘rooted tree’ and ‘rooted planar tree’, thus side-stepping these details. And having purely operad-theoretic definitions of ‘tree’ and ‘rooted tree’ would make it a lot easier to use these concepts in operad theory. That’s what I want to do, ultimately. But proving these conjectures, and of course providing the precise definitions of ‘rooted tree’ and ‘rooted planar tree’ that make them true, would still be very nice.

And it would be even nicer if someone had already done this. So please provide references… and/or correct mistakes in the following stuff!

Rooted Trees

Definition. For any natural number n=0,1,2,n = 0, 1, 2, \dots, an nn-tree is a quadruple T=(V,E,s,t)T=(V,E,s,t) where:

  • VV is a finite set whose elements are called internal vertices;
  • EE is a finite non-empty set whose elements are called edges;
  • s:EV{1,,n}s: E \to V\sqcup \{1,\dots, n\} and t:EV{0}t: E \to V \sqcup\{0\} are maps sending any edge to its source and target, respectively.

Given u,vV{0}{1,,n}u,v\in V \sqcup\{0\} \sqcup \{1,\dots, n\}, we write uevu \stackrel{e}{\longrightarrow} v if eEe\in E is an edge such that s(e)=us(e)=u and t(e)=vt(e)=v.

This data is required to satisfy the following conditions:

  • s:EV{1,,n}s : E \to V\sqcup \{1,\dots, n\} is a bijection;
  • there exists exactly one eEe\in E such that t(e)=0t(e)=0;
  • for any vV{1,,n}v \in V\sqcup\{1,\dots , n\} there exists a directed edge path from vv to 00: that is, a sequence of edges e 0,,e ne_0, \dots, e_n and vertices v 1,,v nv_1 , \dots , v_n such that ve 0v 1,v 1e 1v 2,,v ne n0. v \stackrel{e_0}{\longrightarrow} v_1 , \; v_1 \stackrel{e_1}{\longrightarrow} v_2, \; \dots, \; v_n \stackrel{e_n}{\longrightarrow} 0 .

So the idea is that our tree has V{0}{1,,n} V \sqcup\{0\} \sqcup \{1,\dots, n\} as its set of vertices. There could be lots of leaves, but we’ve labelled some of them by numbers 1,,n1, \dots, n. In our pictures, the source of each edge is at its top, while the target is at bottom.

There is a root, called 00, but also a ‘pre-root’: the unique vertex with an edge going from it to 00. I’m not sure I like this last bit, and we might be able to eliminate this redundancy, but it’s built into the operad theorist’s picture here:

It might be a purely esthetic issue. Like everything else, it gets a bit more scary when we consider degenerate special cases.

I’m hoping there’s an operad TreeTree whose set of nn-ary operations, Tree nTree_n, consists of isomorphism classes of nn-trees as defined above. I’m hoping someone has already proved this. And I hope someone has characterized this operad TreeTree in a more algebraic way, as follows.

Definition. A symmetric collection CC consists of topological spaces {C n} n0\{C_n\}_{n \ge 0} together with a continuous action of the symmetric group S nS_n on each space C nC_n. A morphism of symmetric collections f:CCf : C \to C' consists of an S nS_n-equivariant continuous map f n:C nC nf_n : C_n \to C'_n for each n0n \ge 0. Symmetric collections and morphisms between them form a category STop{STop}.

(More concisely, if we denote the groupoid of sets of the {1,,n}\{1, \dots, n\} and bijections between these as SS, then STopSTop is the category of functors from SS to TopTop.)

There is a forgetful functor from operads to symmetric collections

U:OpSTop U : Op \to STop

with a left adjoint

F:STopOp F: STop \to Op

assigning to each symmetric collection the operad freely generated by it.

Definition. Let CommComm be the terminal operad: that is, the operad, unique up to isomorphism, such that Comm nComm_n is a 1-element set for each n0n \ge 0.

The algebras of Comm\Comm are commutative topological monoids.

Conjecture 1. There is a unique isomorphism of operads

ϕ:F(U(Comm))Tree \phi : F (U (Comm)) \stackrel{\sim}{\longrightarrow} Tree

that for each n0n \ge 0 sends the unique nn-ary operation in CommComm to the corolla with nn leaves: that is, the isomorphism class of nn-trees with no internal vertices.

(When I say “the unique nn-ary operation in CommComm”, but treating it as an operation in F(U(Comm))F(U(Comm)), I’m using the fact that the unique operation in Comm n\Comm_n gives an element in U(Comm) nU(\Comm)_n, and thus an operation in F(U(Comm)) nF(U(\Comm))_n.)

Planar Rooted Trees

And there should be a similar result relating planar rooted trees to collections without symmetric group actions!

Definition. A planar nn-tree is an nn-tree in which each internal vertex vv is equipped with a linear order on the set of its children, i.e. the set t 1(v)t^{-1}(v).

I’m hoping someone has constructed an operad PTreePTree whose set of nn-ary operations, PTree nPTree_n, consists of isomorphism classes of planar nn-trees. And I hope someone has characterized this operad PTreePTree as follows:

Definition. A collection CC consists of topological spaces {C n} n0\{C_n\}_{n \ge 0}. A morphism of collections f:CCf :C \to C' consists of a continuous map f n:C nC nf_n : C_n \to C'_n for each n0n \ge 0. Collections and morphisms between them form a category Top\mathbb{N}Top.

(If we denote the category with natural numbers as objects and only identity morphisms between as \mathbb{N}, then Top\mathbb{N}Top is the category of functors from \mathbb{N} to TopTop.)

There is a forgetful functor

ϒ:OpTop \Upsilon : Op \to \mathbb{N}Top

with a left adjoint

Φ:TopOp \Phi : \mathbb{N}Top \to Op

assigning to each collection the operad freely generated by it. This left adjoint is the composite

TopGSTopFOp \mathbb{N}Top \stackrel{G}{\longrightarrow} STop \stackrel{F}{\longrightarrow} Op

where the first functor freely creates a symmetric collection G(C)G(C) from a collection CC by setting G(C) n=S n×C nG(C)_n = S_n \times C_n, and the second freely generates an operad from a symmetric collection, as described above.

Conjecture 2. There is a unique isomorphism of operads

ψ:Φ(ϒ(Comm))PTree \psi : \Phi(\Upsilon (Comm)) \stackrel{\sim}{\longrightarrow} PTree

that for each n0n \ge 0 sends the unique nn-ary operation in CommComm to the corolla with nn leaves ordered so that 1<<n1 &lt; \cdots &lt; n.

Have you seen a proof of this stuff???

Author: "john (baez@math.ucr.edu)"
Send by mail Print  Save  Delicious 
Date: Monday, 31 Mar 2014 07:54

Standard logic involving the truth values ‘true’ and ‘false’ can make it difficult to model some of the fuzziness we use in everyday speech. If you’d bought a bike yesterday then today it would be truthful to say “This bike is new”, but it wouldn’t be truthful so say it in 20 years’ time. However, between now and then there won’t be a specific day on which the statement “This bike is new” suddenly switches from being true to being false. How can you model this situation?

An old bike

One approach to modelling this situation is with fuzzy logic where you allow your truth values to be things other than just true and false. For instance, you can take the interval [0,1][0,1] as the set of truth values with 00 representing false and 11 representing true. So the truth degree of the statement “This bike is new” would vary, being 11 today and decreasing to something very close to 00 in 20 years’ time.

This post is an attempt by me to understand this fuzzy logic in the context of enriched category theory, in particular, using [0,1][0,1] as a monoidal category to enrich over. We will see that categories enriched over [0,1][0,1] can be interpreted as fuzzy posets or fuzzy preorders.

This was going to be a comment on Tom Avery’s Kan Extension Seminar post on Metric Spaces, Generalized Logic, and Closed Categories but grew too big!

I have been trying to understand a bit of fuzzy logic because it feeds into fuzzy concept analysis which I’m also trying to understand. As ever, I understand the category theory better than the logic. In terms of references, I’ve picked up bits and pieces from Belohlavek’s Concept lattices and order in fuzzy logic and Hájek’s Metamathematics of fuzzy logic.

Truth degrees in [0,1][0,1]: fuzzy logic

As mentioned above you can replace your set of truth values {false,true}\{ \mathrm{false},\mathrm{true}\} by the interval [0,1][0,1], with 00 representing completely false and 11 representing completely true. Thus an element of [0,1][0,1] represents a ‘degree’ of truth. The idea is that this represents the degree of truthfulness of statements like “George is old” or “The film is good”: it is not supposed to represent than the probabitlity of the truth of a statement like “I will receive an email from a student tomorrow”. We are modelling vagueness and not probabilty. Also, we are supposed to be modeling ojectivity so the truth degree of the statement “George is old” will depend on the age of George and not on anyone’s opinion of how old he seems to be.

Being category theorists, we might try to fit this into Lawvere’s framework of generalized logic, where we think of our truth values as forming a closed monoidal category. We can make [0,1][0,1] it into a poset — and hence a category — by using the order \le , this will be our notion of entailment. So if PP and QQ are statements then PQP\vdash Q precisely when the truth degree of PP is less than or equal to the truth degree of QQ. However, I’m not entirely sure how to interpret that.

The observant amongst you will have noticed that this poset [0,1][0,1] is isomorphic to the poset we use for metric spaces ([0,],)([0,\infty ],\ge ), via the maps xexp(x)x\mapsto \exp (-x) and aln(a)a\mapsto -\ln (a). So an alternative interpretation of these truth values will be as ‘proximities’ — proximity approximately 00 meaning not at all close, and proximity approximately 11 meaning really, really close. [Switching from distances in [0,][0,\infty ] to proximities in [0,1][0,1] is an important step in calculating magnitudes of metric spaces, but I won’t say any more about that.]

Next we want some notion of ‘conjunction’ \otimes and ‘implication’ \Rightarrow . In category theory terms we want a closed monoidal structure on the category [0,1][0,1]. There are at least three such structures that are well studied.

The product structure: Here ababa\otimes b \coloneqq a\cdot b and ab1a\Rightarrow b \coloneqq 1 if aba\le b and b/ab/a otherwise. Via the exponential and logarithm maps this corresponds to the usual closed monoidal structure on [0,][0,\infty ] of addition and truncated subtraction.

The Gödel structure: Here abmin(a,b)a\otimes b \coloneqq \min (a,b) and ab1a\Rightarrow b \coloneqq 1 if aba\le b and bb otherwise. Via the exponential and logarithm maps this corresponds to the closed monoidal structure on [0,][0,\infty ] which gives rise to ultrametric spaces.

The Łukasiewicz structure: Here abmax(a+b1,0)a\otimes b \coloneqq \max (a+b-1,0) and abmin(1a+b,1)a\Rightarrow b\coloneqq \min (1-a+b,1). Via the exponential and logarithm maps this corresponds to the monoidal structure xyln(max(e x+e y1,0))x\otimes y\coloneqq -\ln (\max (e^{-x}+e^{-y}-1,0)) and that doesn’t look at all familiar!

If the truth degrees of two statements are PP and QQ then the truth degree of both statements together should be given by the conjunction PQP\otimes Q, but I don’t know examples of real world models with these three different logics. Anybody? I would be very interested in hearing thoughts and ideas on this.

In all of the above three cases the closed monoidal category {false,true}\{ \mathrm{false},\mathrm{true}\} of classical truth values embeds closed monoidally.

Enriching over [0,1][0,1]: fuzzy preorders

We can now think about what categories enriched in ([0,1],)([0,1],\le ) are, as generalizations of preorders. Such an enriched category consists of a set CC and to each pair of elements c,cCc,c'\in C we associate a truth degree C(c,c)[0,1]C(c,c')\in [0,1], we want to think of this as the degree to which cc is ordered before cc'. This is what we will call a fuzzy preorder. It will satisfy fuzzy transitivity: C(c,c)C(c,c)C(c,c)C(c',c'')\otimes C(c,c')\le C(c,c''). How this is actually interpreted will depend on which product \otimes we took.

Here’s an example of a fuzzy preorder. Suppose that Bart is aged 9, Marge is aged 39 and Homer is aged 40. Is it true that Homer is (at least) as old as Marge? Yes. Is it true that Marge is (at least) as old as Homer? Well, nearly. Is it true that Bart is (at least) as old as Marge? Well, not really at all. We can formalise this into a fuzzy preorder \succeq “is at least as old as” where (HomerMarge) =1 (MargeHomer) =a tiny bit less than1 (BartMarge) =much less than1. \begin{aligned} (\text {Homer}\succeq \text {Marge})&=1 \\ (\text {Marge}\succeq \text {Homer})&=\text {a tiny bit less than}\, \, 1\\ (\text {Bart}\succeq \text {Marge})&=\text {much less than}\, \, 1. \end{aligned} Algebraically we could define AB{1 ifage(A)age(B) age(A)/age(B) ifage(A)<age(B). A\succeq B\coloneqq \begin{cases} 1 &\text {if}\, \, \mathrm{age}(A)\ge \mathrm{age}(B)\\ \mathrm{age}(A)/\mathrm{age}(B)&\text {if}\, \, \mathrm{age}(A)\lt \mathrm{age}(B) . \end{cases} This gives a fuzzy preorder on the set (Bart,Marge,Homer)(\text {Bart}, \text {Marge}, \text {Homer}) with respect the product monoidal structure \cdot on [0,1][0,1], or in other words, it gives a category enriched over the monoidal category ([0,1],,1)([0,1],\cdot ,1).

There should be some more convincing examples, but I haven’t found any yet.

An exercise at this point is to look at the dictionary in my post on Galois correspondences and enriched categories and extend it by another column for categories enriched over [0,1][0,1].

Author: "willerton (S.Willerton@sheffield.ac.uk)" Tags: "Logic"
Send by mail Print  Save  Delicious 
Date: Sunday, 30 Mar 2014 13:13

Term is nearly over, which for me means the end of the 4th year Fourier Analysis course I’ve been teaching for the last couple of years.

I was fortunate enough to take over the course from Jim Wright, a genuine expert on the subject, and I inherited a great set of notes from him. But I felt the need to make the course my own, so I’ve been writing my own notes, which I’ve just finished: notes here, plus accompanying problem sheets. They’re mostly about convergence of Fourier series, with a delicious dessert of Fourier analysis on finite abelian groups.

But what I wanted to write about here — and get your opinions on — was not Fourier analysis, but some questions of teaching. This year, I’ve been (in the jargon) “flipping the classroom”, or at least partially flipping it (which reminds me of that mysterious substance, partially inverted sugar syrup, that you sometimes see on ingredients lists). I’d like to hear about other people’s similar experiences.

The allotted class time is two 50-minute “lectures” a week, plus one “workshop” (Edinburgh lingo for tutorial or exercise class) a fortnight. Here’s the routine for lectures:

  • At least a week before a given lecture, I make Latexed notes for that lecture available online.

  • The students are committed to spending at least an hour before each lecture reading over the notes.

  • I spend the first half of each lecture giving an overview of that portion of the notes, on the firm assumption that the students have done the prescribed amount of reading beforehand. Thus, I can spend time talking about the big picture rather than the mechanics, and I can concentrate on parts that seem likely to cause most difficulty rather than having to go through everything.

  • The second half of the lecture is interactive. We do different activities each time, e.g.:

    • solving exercises (individually, in small groups, or all together)
    • question and answer sessions
    • definitions quizzes
    • working on mathematical writing
    • identifying hard parts of the course and having students explain them to each other.

In a completely flipped classroom, all the time would be taken up with interactive work. The first half of each class isn’t too far from a traditional lecture, except for the data-projected notes and the prior reading by the students. That’s why I said it’s only partially flipped.

It’s the first time I’ve done things quite like this, so this year has been pretty experimental. In particular, before every lecture, I’ve had to come up with an idea for the second half of the class, and evidently some have been more successful than others. I’m also running out of ideas! It’s a good thing it’s the end of term.

Have you ever taught in a similar way? If so, what worked well, and what didn’t? Can you share some ideas for classroom activities?

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Education"
Send by mail Print  Save  Delicious 
Date: Wednesday, 26 Mar 2014 17:39

I’ve just submitted a piece for the new Opinions section of the monthly LMS Newsletter: Should mathematicians cooperate with GCHQ? (Update: now available (p.34).) The LMS is the London Mathematical Society, which is the UK’s national mathematical society. My piece should appear in the April edition of the newsletter, and you can read it below.

Here’s the story. Since November, I’ve been corresponding with people at the LMS, trying to find out what connections there are between it and GCHQ. Getting the answer took nearly three months and a fair bit of pushing. In the process, I made some criticisms of the LMS’s total silence over the GCHQ/NSA scandal:

GCHQ is a major employer of mathematicians in the UK. The NSA is said to be the largest employer of mathematicians in the world. If there had been a major scandal at the heart of the largest publishing houses in the world, unfolding constantly over the last eight months, wouldn’t you expect it to feature prominently in every issue of the Society of Publishers’ newsletter?

To its credit, the LMS responded by inviting me to write an inaugural piece for a new Opinions section of the newsletter. Here it is.

Should mathematicians cooperate with GCHQ?

Tom Leinster

One of the UK’s largest employers of mathematicians has been embroiled in a major international scandal for the last nine months, stands accused of law-breaking on an industrial scale, and is now the object of widespread outrage. How has the mathematical community responded? Largely by ignoring it.

GCHQ and its partners have been systematically monitoring as much of our lives as they possibly can, including our emails, phone calls, text messages, bank transactions, web browsing, Skype calls, and physical location. The goal: “collect it all”. They tap internet trunk cables, bug charities and political leaders, disrupt lawful activist groups, and conduct economic espionage, all under the banner of national security.

Perhaps most pertinently to mathematicians, the NSA (GCHQ’s major partner and partial funder) has deliberately undermined internet encryption, inserting a secret back door into a standard elliptic curve algorithm. This can be exploited by anyone sufficiently skilled and malicious — not only the NSA/GCHQ. (See Thomas Hales’s piece in February’s Notices of the AMS.) We may never know what else mathematicians have been complicit in; GCHQ’s policy is not to comment on intelligence matters, which is to say, anything it does.

Indifference to mass surveillance rests partly on misconceptions such as “it’s only metadata”. This is certainly false; for instance, GCHQ has used webcams to collect images, many sexually intimate, of millions of ordinary citizens. It is also misguided, even according to the NSA’s former legal counsel: “metadata absolutely tells you everything about somebody’s life”.

Some claim to be unbothered by the recording of their daily activities, confident that no one will examine their records. They may be right. But even if you feel that way, do you want the secret services to possess such a powerful tool for chilling dissent, activism, and even journalism? Do you trust an organization operating in secret, and subject to only “light oversight” (a GCHQ lawyer’s words), never to abuse that power?

Mathematicians seldom have to face ethical questions. But now we must decide: cooperate with GCHQ or not? It has been suggested that mathematicians today are in the same position as nuclear physicists in the 1940s. However, the physicists knew they were building a bomb, whereas mathematicians working for GCHQ may have little idea how their work will be used. Colleagues who have helped GCHQ in the past, trusting that they were contributing to legitimate national security, may justifiably feel betrayed.

At a bare minimum, we as a community should talk about it. Sasha Beilinson has proposed that working for the NSA/GCHQ should be made “socially unacceptable”. Not everyone will agree, but it reminds us that we have both individual choice and collective power. Individuals can withdraw their labour from GCHQ. Heads of department can refuse staff leave to work for GCHQ. The LMS can refuse GCHQ’s money.

At a bare minimum, let us acknowledge that the choices are ours to make. We are human beings before we are mathematicians, and if we do not like what GCHQ is doing, we do not have to cooperate.

I had a 500-word limit, so I omitted a lot. Here are the facts on the LMS’s links with GCHQ, as stated to me by the LMS President Terry Lyons:

The Society has an indirect relationship with GCHQ via a funding agreement with the Heilbronn Institute, in which the Institute will give up to £20,000 per year to the Society. This is approximately 0.7% of our total income. This is a recently made agreement and the funding will contribute directly to the LMS-CMI Research Schools, providing valuable intensive training for early career mathematicians. GCHQ is not involved in the choice of topics covered by the Research Schools.

So, GCHQ’s financial support for the LMS is small enough that declining it would not make a major financial impact.

I hope the LMS will make a public statement clarifying its relationship with GCHQ. I see no argument against transparency.

Another significant factor (which Lyons alludes to above and is already a matter of public record) is that GCHQ is a funder of the Heilbronn Institute, which is a collaboration between GCHQ and the University of Bristol. I don’t know that the LMS is involved with Heilbronn beyond what’s mentioned above, but Heilbronn does seem to provide an important channel through which (some!) British mathematicians support the secret services.

Finally, I want to make clear that although I think there are some problems with the LMS as an institution, I don’t blame the people running it, many of whom are taking time out of extremely busy schedules for the most altruistic reasons. As I wrote to one of them:

I’m genuinely in awe of the amount that you […] give to the mathematical community, both in terms of your selflessness and your energy. I don’t know how you do it. Anything critical I have to say is said with that admiration as the backdrop, and I hope I’d never say anything of the form “do more!”, because to ask that would be ridiculous.

Rules for commenting here  I’ve now written several posts on this and related subjects (1, 2, 3, 4). Every time, I’ve deleted some off-topic comments — including some I’ve enjoyed and agreed with heartily. Please keep comments on-topic. In case there’s any doubt, the topic is the relationship between mathematicians and the secret services. Comments that stray too far from this will be deleted.

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Science and Politics"
Send by mail Print  Save  Delicious 
Date: Tuesday, 25 Mar 2014 14:34

Guest post by Alexander Campbell

We want to develop category theory in a general 2-category, in order to both generalise and clarify our understanding of category theory. The key to this endeavour is to express the basic notions of the theory of categories in a natural 2-categorical language. In this way we are continuing a theme present in previous posts from the Kan Extension Seminar, wherein monads and adjunctions were given a 2-categorical setting, and by analogy, in our very first paper, whose purpose was to express basic notions of the theory of sets in a natural categorical language. In this post we consider a concept very central and special to category theory: the Yoneda lemma.

So what’s the Yoneda Lemma again?

The Yoneda lemma says that for any object aa of a category AA, the diagram 1 a A * ι A(a,) Set \begin{matrix} 1 & \overset{a}{\rightarrow} & A \\ {}_{\ast} \searrow & \overset{\iota}{\Rightarrow} & \swarrow_{A(a,-)} \\ & Set \\ \end{matrix} is a left extension.

In this post I will give a motivation for the notion of Yoneda structure, as defined in the paper Yoneda Structures on 2-Categories of Ross Street and Bob Walters. But before we begin I would like to take this opportunity to thank Emily for inviting me to join the Kan Extension Seminar and for her support and encouragement throughout the course. This has been and continues to be a singularly valuable experience in my first year as a category theorist.

Liftings and extensions

After the various compositions available in a 2-category (nicely unified as the operation of pasting), the most basic notions of two-dimensional algebra are liftings and extensions. A diagram A h α g C f B \begin{matrix} & A \\ {}^h \swarrow & \overset{\alpha}{\Leftarrow} & \searrow^{g} \\ C & \overset{f}{\rightarrow} & B \\ \end{matrix} in a 2-category 𝒦\mathcal{K} is said to be a left lifting if pasting along hh effects a bijection of 2-cells hkgfk. \frac{h \Rightarrow k}{g \Rightarrow f k}.

There is a similar elementary definition of left extension, but we can simply say that a left extension is a left lifting in 𝒦 op\mathcal{K}^{op} (reverse 1-cells). Hence these are just dual versions of a single notion. Left liftings in 𝒦 co\mathcal{K}^{co} and 𝒦 coop\mathcal{K}^{coop} are right liftings and right extensions respectively, but these will not concern us here.

An elementary result of 2-categorical algebra is the pasting lemma. This tells us that given a diagram A h β g α f D l C k B \begin{matrix} & & A \\ {}^h \swarrow & \overset{\beta}{\Leftarrow} & \downarrow^{g} & \overset{\alpha}{\Leftarrow} & \searrow^{f} \\ D & \underset{l}{\rightarrow} & C & \underset{k}{\rightarrow} & B \\ \end{matrix} in which α\alpha is a left lifting, β\beta is a left lifting if and only if the composite is a left lifting. By duality, we immediately get a corresponding result for left extensions.

The Yoneda lemma and Yoneda structures

A Yoneda structure on a 2-category consists of two pieces of data satisfying three axioms. We will see that the data is what is necessary to naturally express the Yoneda lemma in the 2-category, and that the axioms are expressions of familiar results of category theory.

In a general 2-category, following the philosophy of generalised elements, it is natural to consider arrows XAX \longrightarrow A with arbitrary domain (not just restricted to the terminal object 1) as objects of AA. In order to state the Yoneda lemma for such generalised objects, we must replace *:1Set\ast \colon 1 \longrightarrow \text{Set} with some arrow y X:XPXy_X \colon X \longrightarrow P X. Here PXP X could be loosely thought of as “sets freely varying over XX”; in CAT, it is the category of presheaves over XX and y Xy_X is the Yoneda embedding.

But now we must address the issue of size. Firstly, in CAT, the Yoneda embedding y Xy_X exists only for locally small XX. Secondly, in the wish for a snappy statement of the Yoneda lemma, I left an important condition unmentioned: it is not true for a general category AA that the hom-functor A(a,)A(a,-) exists for every object aa. (It would be true if AA were locally small, but we do not want CAT to consist only of locally small categories; for if XX is not small, then PXP X is not locally small.) Hence we must restrict those elements for which we state the Yoneda lemma.

We can now specify the data of a Yoneda structure on a 2-category. The first piece of data is a class of admissible arrows in the 2-category; the only property we require of this class is that it be a right ideal, meaning that if gg is admissible, then so is gfg f for every such composable ff. We say that an object AA is admissible if the identity arrow 1 A:AA1_A \colon A \longrightarrow A is admissible; in the Yoneda structure on CAT, the admissible objects are the locally small categories. Hence we see that there is nothing mysterious about size; it is just part of the structure. The other piece of a Yoneda structure is the Yoneda arrows: an admissible arrow y A:APAy_A \colon A \longrightarrow P A for each admissible object AA.

With this structure we can naturally state the Yoneda lemma in a 2-category. We take this as our first axiom for a Yoneda structure.

Axiom 1. For XX, aa both admissible, the left extension X a A y X χ a A(a,1) PX \begin{matrix} X & \overset{a}{\rightarrow} & A \\ {}_{y_X} \searrow & \overset{\chi^a}{\Rightarrow} & \swarrow_{A(a,1)} \\ & P X \\ \end{matrix} exists.

The universal property of this extension is indicated by A(a,1)fy Xfa. \frac{A(a,1) \Rightarrow f}{y_X \Rightarrow f a}.

So we have used the Yoneda lemma as a definition of the hom-functors A(a,1)A(a,1); the axiom asserts their existence. We can further define the hom-functors on 1-cells and 2-cells in both variables: in the first variable by the universal property of extension, and in the second variable by composition. In particular, for (suitably admissible) f:ABf \colon A \longrightarrow B, we can define Pf:PBPAP f \colon P B \longrightarrow P A as Pf=(PB)(B(1,f),1)P f = (P B)(B(1,f),1), with a similar definition on 2-cells. This all just follows from Axiom 1, hence is part of any Yoneda structure.

All further use of the admissibility notion is just to ensure the existence of the data of Axiom 1. So I will make minimal explicit mention of it from here; it can be easily filled in as necessary.

Universal arrows and absolute liftings

The second axiom of a Yoneda structure is expressed in terms of liftings. I will now exhibit the presence of liftings in category theory.

The first formalisation of the notion of universal property that Mac Lane gives in Categories for the Working Mathematician is that of universal arrow (I will deal with the ‘left’ sense here). Given a functor f:ABf \colon A \longrightarrow B and an object bb of BB, he defines what it means for a “pair” consisting of an object aa of AA and an arrow θ:bfa\theta \colon b \longrightarrow fa to be a universal arrow from bb to ff. This definition says precisely that 1 a θ b A f B \begin{matrix} & 1 \\ {}^a \swarrow & \overset{\theta}{\Leftarrow} & \searrow^{b} \\ A & \underset{f}{\rightarrow} & B \\ \end{matrix} is a left lifting diagram.

In a general 2-category, the restriction of our consideration to such ‘global objects’ is untenable. So how do we express the universal property of a ‘generalised arrow’ from a ‘generalised object’ b:XBb \colon X \longrightarrow B to ff? It is not enough to say that the corresponding diagram X a θ b A f B \begin{matrix} & X \\ {}^a \swarrow & \overset{\theta}{\Leftarrow} & \searrow^{b} \\ A & \underset{f}{\rightarrow} & B \\ \end{matrix} is a left lifting diagram; we must consider the object bb at all its stages of development. Hence we define a 2-cell θ\theta to be a universal arrow from bb to ff when for every earlier stage YXY \longrightarrow X, the diagram Y X a θ b A f B \begin{matrix} & Y \\ & \downarrow \\ & X \\ {}^a \swarrow & \overset{\theta}{\Leftarrow} & \searrow^{b} \\ A & \underset{f}{\rightarrow} & B \\ \end{matrix} is a left lifting. That is, we require our corresponding diagram to be an absolute left lifting (meaning that the property of being a left lifting is preserved when composed with any arrow into XX).

Hence absolute liftings give a natural 2-categorical expression of the notion of universal arrow. This slogan can help us to interpret some basic facts of the algebra of 2-categories as they relate to category theory. For instance, the result that a 2-cell A f η 1 B g A \begin{matrix} & A \\ {}^f \swarrow & \overset{\eta}{\Leftarrow} & \searrow^{1} \\ B & \underset{g}{\rightarrow} & A \\ \end{matrix} is the unit of an adjunction fgf \dashv g if and only if it is an absolute left lifting corresponds to the familiar result of basic category theory that states that a functor g:BAg \colon B \longrightarrow A is a right adjoint if and only if there is a universal arrow to gg from every object of AA.

Universal elements and representability

Now, observe that the 2-cell in the ordinary Yoneda lemma, in addition to being a left extension, is trivially a left lifting; for this just means that there is a bijection between arrows aba \longrightarrow b and elements of the set A(a,b)A(a,b). In familiar terms, this says that ι\iota is a universal element of the functor A(a,)A(a,-), and by the above discussion we can immediately incorporate this into our setting as the second axiom for a Yoneda structure.

Axiom 2. The 2-cell χ a\chi^a of Axiom 1 is an absolute left lifting.

The universal property of this absolute left lifting is indicated by axbX(1,x)A(a,b). \frac{a x \Rightarrow b}{X(1,x) \Rightarrow A(a,b)}.

The combination of these two universal properties of the 2-cells χ a\chi^a is the ignition that allows us to begin to develop category theory in our 2-category. Together they give a bijection of 2-cells π\pi and η\eta as indicated by the bijections B(s,1)C(j,t)y AC(j,ts)jts \frac{\frac{B(s,1) \Rightarrow C(j,t)}{y_A \Rightarrow C(j,t s)}}{j \Rightarrow t s} and as displayed in the diagram A s χ s y A B B(s,1) PA t π C(j,1) C = A s η j χ j y A B t C C(j,1) PA \begin{matrix} & A \\ {}^s \swarrow & \overset{\chi^s}{\Leftarrow} & \searrow^{y_A} \\ B & \overset{B(s,1)}{\rightarrow} & P A \\ {}_t \searrow & \Downarrow^{\pi} & \nearrow_{C(j,1)} \\ & C \\ \end{matrix} \qquad = \qquad \begin{matrix} & & A \\ {}^s \swarrow & \overset{\eta}{\Leftarrow} & \downarrow^{j} & \overset{\chi^j}{\Leftarrow} & \searrow^{y_A} \\ B & \underset{t}{\rightarrow} & C & \underset{C(j,1)}{\rightarrow} & P A \\ \end{matrix} From Axiom 2 and the pasting lemma, we get that the LHS is an absolute left lifting (with base C(j,t)=C(j,1)tC(j,t) = C(j,1)t) if and only if η\eta is an absolute left lifting. This is exactly the realisation in our setting of the familiar correspondence between universal elements of C(j,t)C(j,t) and universal arrows from jj to tt. In particular, if π\pi is an isomorphism, then this condition holds, and we have a universal element of C(j,t)C(j,t). Note that we can understand π\pi being an isomorphism as meaning that C(j,t)C(j,t) is representable.

Now, recall the representability condition of ordinary category theory that says that a functor f:BSetf \colon B \longrightarrow \text{Set} is representable if and only if it has a universal element. We have seen that one half of this implication follows from the first two axioms of a Yoneda structure. We could take the other half as an axiom, being a basic result of category theory that we wish to incorporate into our setting.

Axiom 3*. If a 2-cell σ:A(a,1)f:APX\sigma \colon A(a,1) \Rightarrow f \colon A \longrightarrow P X yields an absolute left lifting diagram when pasted onto χ a\chi^a, then σ\sigma is an isomorphism.

Understood in our interpretation, this is indeed the other implication of the representability condition.

However this axiom is too strong to capture all the examples we want! While it does hold in ordinary category theory and in 2-categories of internal categories and variable (=indexed=parametrized) categories, it is does not hold in general in enriched category theory. For in enriched category theory, representability of a functor is expressed as an isomorphism in the base of enrichment, and the bijection of sets of the universal element condition is not enough to capture this.

Hence we do not take Axiom 3* as an axiom for Yoneda structures. Instead we take a couple of special cases. By Axiom 1 for y Ay_A and gfgf in place of aa, we have the bijections of 2-cells P1 A1 PAy Ay AC(gf,1)Pf.C(g,1)y APf.C(g,1).gf. \frac{P1_A \Rightarrow 1_{P A}}{y_A \Rightarrow y_A} \qquad \frac{C(g f,1) \Rightarrow P f.C(g,1)}{y_A \Rightarrow P f.C(g,1).g f}. Let ι A\iota_A correspond to 1 y A1_{y_A} in the first bijection, and θ f,g\theta_{f,g} correspond to the 2-cell A f B g C y A χ B(1,f) y B χ g C(g,1) PA Pf PB \begin{matrix} A & \overset{f}{\rightarrow} & B & \overset{g}{\rightarrow} & C \\ {}^{y_A} \downarrow & \overset{\chi^{B(1,f)}}{\Rightarrow} & {}^{y_B} \downarrow & \overset{\chi^g}{\Rightarrow} & \swarrow_{C(g,1)} \\ P A & \underset{P f}{\leftarrow} & P B \\ \end{matrix} in the second. These corresponding 2-cells are an identity and a pasting of two absolute left liftings respectively. Hence (by the pasting lemma in the second case) they are both absolute left liftings, and Axiom 3* would imply that they are isomorphisms. We take these special cases as our third and final axiom.

Axiom 3. The 2-cells ι A\iota_A and θ f,g\theta_{f,g} are isomorphisms.

In summary, a Yoneda structure on a 2-category consists of admissible arrows and Yoneda arrows satisfying Axioms 1, 2, and 3.

Formal category theory

Now that we have established the notion of a Yoneda structure on a 2-category, we can begin to develop category theory within such a 2-category. What is remarkable is that everything follows formally and naturally from the Yoneda structure and from the elementary two-dimesional algebra of pasting, liftings, extensions and adjunctions. In their paper, Street and Walters develop results concerning adjunctions, weighted colimits, EM objects and Kleisli objects for monads, and they introduce the notion of totality. I will now briefly survey some of these results; I refer the reader to the paper for details.

They show that the 2-categorical definition of an adjunction fgf\dashv g in terms of its unit and counit is equivalent to the condition B(f,1)A(1,g)B(f,1) \cong A(1,g), which expresses the classical hom-set definition of an adjunction. This is an isomorphism of modules, and indeed, one benefit of a Yoneda structure is that it equips our 2-category with a notion of modules (a.k.a. profunctors): modules from AA to BB are arrows APBA \longrightarrow P B.

It is well known that a satisfactory theory of colimits in enriched category theory requires the consideration of colimits weighted by presheaves. However the natural generality of the theory calls for weighting by modules (presheaves on AA are modules from II to AA). Given a module j:MPAj\colon M \longrightarrow P A and an arrow s:ACs\colon A \longrightarrow C (suitably admissible), we define the colimit colim(j,s):MC\text{colim}(j,s) \colon M \longrightarrow C of ss weighted by jj by the formula C(colim(j,s),1)(PA)(j,C(s,1))C(\text{colim}(j,s),1) \cong (P A)(j,C(s,1)). From this formula we can derive many familiar results such as the fact that left adjoints preserve colimits, associativity of colimits in the weight, and the Yoneda isomorphisms. Also we can introduce the notion of pointwise extension in terms of colimits and show that they are indeed extensions. Thus all the results of the calculus of colimits, which in enriched category theory depend on a “complicated machinery” involving enriched extranatural transformations, coends, enriched functor categories etc., in fact follow easily and formally in the context of the elementary notion of a 2-category with a Yoneda structure.

Street and Walters show that a certain result from Street’s earlier paper, the characterisation of the Eilenberg-Moore category of a monad in CAT as certain sheaves on the Kleisli category of the monad, holds for any Yoneda structure. They also show that in the presence of more properties of the 2-category (particularly a bo-ff factorisation of its arrows), we can make sense of the idea of the Kleisli object as a “full subcategory” of the Eilenberg-Moore object.

This paper introduced the notion of totality. An arrow s:ACs \colon A \longrightarrow C is said to be total when AA and ss are admissible and the module C(s,1):CPAC(s,1) \colon C \longrightarrow P A has an admissible left adjoint. It follows immediately from the developed theory that this left adjoint zz must be given by zj=colim(j,s)z j = \text{colim}(j,s) and be the pointwise left extension of ss along the Yoneda arrow y Ay_A (note that these assertions require all the admissibility assumptions in the definition). Such adjunctions are absolutely fundamental to the use of category theory; they sometimes go by the name of nerve and realization.

An object AA is said to be total when the identity arrow 1 A1_A is total; this is the same as saying the Yoneda arrow y A:APAy_A \colon A \longrightarrow P A has a left adjoint. This can be thought of as a sort of strong cocompleteness property. In particular we get a “very satisfactory” adjoint functor theorem: if AA is total, an arrow f:ABf \colon A \longrightarrow B has a right adjoint if and only if B(f,1)B(f,1) is admissible and ff preserves the colimit colim(B(f,1),1)\text{colim}(B(f,1),1).


Important motivating examples of 2-categories with Yoneda structures are 2-categories of internal categories (see the papers of Street and Weber) and enriched categories, in particular CAT is the 2-category of categories internal to SET. Given categories \mathcal{E} and 𝒱\mathcal{V} suitably nice and approriate for internalisation and enrichment respectively, we get Yoneda structures on the 2-categories Cat((\mathcal{E}) and 𝒱\mathcal{V}-Cat roughly as follows. For such a category 𝒞\mathcal{C}, the size structure arises from an object SS of 𝒞\mathcal{C} which is in some way understood as a “full subcategory” of 𝒞\mathcal{C}; for internal categories, this can be characterised as an internal full subcategory, or equivalently, a classifying discrete opfibration. Then the objects PAP A arise as exponentials S A opS^{A^{op}}, and the Yoneda arrows are exponential adjoints of hom-functors.

The 2-category LEX of finitely complete categories and finitely continuous functors inherits a Yoneda structure from CAT. This example is of interest because the total objects are (almost) Grothendieck toposes.

Another example, not treated in this paper but in this later paper of Street, are 2-categories of variable categories. These are of the form 𝒦\mathcal{K} = Hom( op\mathcal{E}^{op},CAT), consisting of pseudofunctors, pseudonatural transformations, and modifications. I mention this example here for the interesting way in which the size structure arises. An arrow f:ABf \colon A \longrightarrow B in 𝒦\mathcal{K} is admissible if for every a:UAa\colon U \longrightarrow A and b:VBb \colon V \longrightarrow B where UU and VV are representables on objects of \mathcal{E}, the comma object fa/bf a/b is representable.

Author: "riehl (eriehl@math.harvard.edu)" Tags: "Categories"
Send by mail Print  Save  Delicious 
Date: Tuesday, 18 Mar 2014 17:52

Guest post by Dimitri Zaganidis

First of all, I would like to thank Emily for organizing the Kan extension seminar. It is a pleasure to be part of it. I want also to thank my advisor Kathryn Hess and my office mate Martina Rovelli for their revisions.

In the fifth installment of the Kan Extension Seminar we read the paper “Review of the Elements of 2-categories” by G.M Kelly and Ross Street. This article was published in the Proceedings of the Sydney Category Theory Seminar, and its purpose is to “serve as a common introduction to the authors’ paper in this volume”.

The article has three main parts, the first of them being definitions in elementary terms of double categories and 2-categories, together with the notion of pasting. In a second chapter, they review adjunctions in 2-categories with a nice expression of the naturality of the bijection given by mates using double categories. The last part of the article introduces monads in 2-categories, and specializing to 2-monads towards the end.

Double categories and 2-categories

The article starts with the definition of a double category as a category object in the (not locally small) category of categories CAT\mathbf{CAT}. (I think that there might be some set theoretic issues with such a category, but you can add small everywhere if you want to stay safe.)

The authors then switch to a description of such an object in terms of objects, horizontal arrows, vertical arrows, and squares, with various compositions and units. I will explain a bit how to go from one description to the other.

A category object is constituted of a category of objects, a category of morphisms, target and source functors, identity functor and a composition.

The category of objects is the category whose morphisms are “the objects” and whose morphisms are the vertical arrows. The category of morphisms is the category whose objects are the horizontal morphisms and whose morphisms are the squares, with vertical composition.

Since the functors Obj,Mor:CATSET\mathrm{Obj}, \mathrm{Mor}: \mathbf{CAT} \longrightarrow \mathbf{SET} preserve pullbacks, by applying them to a double category seen as a category object, we get actual categories. Applying Obj\mathrm{Obj} to the double category, we get the category whose objects are “the objects” and whose morphisms are the horizontal arrows. Applying Mor\mathrm{Mor}, we get the category whose objects are the vertical morphisms and whose morphisms are the squares, but this time with horizontal composition.

An interesting thing to notice is that the symmetry of the explicit description of a double category is much more apparent than the symmetry of its description as a category object.

One can define a 22-category as a double category with a discrete category of objects, or as a CAT\mathbf{CAT}-enriched category, exactly as one can define a simplicially enriched small category as either a category enriched over sSet\mathbf{sSet} or as a category object in sSet\mathbf{sSet} with a discrete simplicial set of objects.

The second viewpoint on 2-categories leads to definitions of 2-functors and 2-natural transformations and also to modifications, once one makes clear what enrichment a category of 2-functors inherits.

It is also worthwhile mentioning that the pasting operation makes computations easier to make, because they are more visual. The proof of proposition 2.1 of this paper is a good illustration of this.

The basic example of a 2-category is CAT\mathbf{CAT} itself, with natural transformations as 2-cells (squares).

As category theory describes set-like constructions, 2-category theory describes category-like constructions. You can usually build up categories with as objects sets with extra structure. In the same way, small V-categories, V-functors, and V-natural transformations form a 2-category.

My first motivation to learn about 2-categories was the 2-category of quasi-categories defined by Joyal and which has been studied by Emily Riehl and Dominic Verity in the article The 2-category theory of quasi-categories in particular the category-like constructions one can make with quasi-categories, such as adjunctions and limits.

Adjunctions and mates in 2-categories

It is not a surprise that 2-categories are the right framework in which to define adjunctions. To build the general definition from the usual one, you just need to replace categories by objects in a 2-category, functors by 1-cells of the 2-category, and natural transformations by its 2-cells.

Adjunctions in a 2-category 𝒞\mathcal{C} compose (as in CAT\mathbf{CAT}), and one can form two, a priori distinct double categories of adjunctions. Both of them will have the objects of 𝒞\mathcal{C} as objects and the horizontal morphisms being the morphisms of 𝒞\mathcal{C}, while their vertical morphisms are the adjunctions (going in the same direction as the right adjoint, by convention). The two double categories differ on the squares. Given adjunctions fu f \dashv u and fuf' \dashv u' together with 1-cells a:AAa:A \longrightarrow A' (between the domains of uu and uu') and b:BBb:B \longrightarrow B' (between the codomains of uu and uu'), the squares of the first double category are 2-cells buuab u \Rightarrow u'a while the squares of the second are 2-cells fbaff'b \Rightarrow a f.

Now, the bijective correspondence between these kind of 2-cells given by mates induces an isomorphism of double categories. This means in particular that the horizontal (or vertical) composite of mates is equal to the mate of the corresponding composite.

This is a very beautiful way to express the naturality of the mate correspondence, and it provides a one-line proof of the fact that two 1-cells that are left adjoints to a same 1-cell are naturally isomorphic.

Monads in 2-categories

2-categories are also the right framework to define monads. A monad in a 2-category 𝒞\mathcal{C} and on an object BB is a 1-cell t:BBt:B \longrightarrow B together with 2-cells μ:t 2t\mu: t^2 \Rightarrow t and η:1 Bt\eta: 1_B \Rightarrow t, verifying the usual equations μ(tμ)=μ(μt)\mu \circ (t\mu)= \mu \circ (\mu t) and μ(tη)=1 B=μ(ηt)\mu \circ(t\eta) = 1_B = \mu \circ(\eta t). Since 2-functors preserve both horizontal and vertical compositions, for all objects XX of 𝒞\mathcal{C}, tt induces a monad on 𝒞(X,B)\mathcal{C}(X,B), given by post-composition (t *,μ *,η *)(t_{\ast},\mu_{\ast},\eta_{\ast}). The authors call * an action of tt on s:XBs:X \longrightarrow B* a t *t_\ast algebra structure on ss.

The definition of monad morphism in this article is quite surprising for someone who has read the previous article by Ross Street, The formal theory of monads, which we read for the second meeting of the Kan extension seminar.

In Ross Street’s original paper, a monad morphism (B,t,μ,η)(B,t,μ,η)(B,t,\mu, \eta) \longrightarrow (B',t',\mu', \eta') is a 1-cell f:BBf: B \longrightarrow B' together with a 22-cell ϕ:tfft\phi: t'f \Rightarrow f t verifying certain conditions.

In this paper, morphisms of monads are defined only for monads on the same object, letting the 11-cell part of a monad transformation of the previous article be the identity. This leads the authors to reverse the direction of the morphism, since the 22-cell seems to go in the reverse direction of the 11-cell!

One might think that fixing f=1f=1 is needed by the result which explains that there is a bijection between monad morphisms ttt \Rightarrow t' and actions of tt on tt' making tt' a “(t,t)(t,t')-bimodule”. In fact, in the case where ff is not necessarily the identity, there is a bijection between 2-cells ϕ:tfft\phi:t f \Rightarrow f t' such that (f,ϕ)(f,\phi) is a monad functor and actions of tt on ftft' making ftft' a “(t,t)(t,t')-bimodule”. A statement of the same kind can be also made for monad functor transformations (in the sense of the formal theory of monads). A 2-cell σ:ff\sigma : f \Rightarrow f' is a monad functor transformation (f,ϕ)(f,ϕ)(f,\phi) \longrightarrow (f', \phi') if and only if σt:ftft\sigma t': f t' \Rightarrow f' t' is a morphism of “(t,t)(t,t')-bimodules”.

A 2-category admits the construction of algebras if for every monad (B,t,μ,η)(B,t,\mu, \eta), the 2-functor X𝒞(X,B) (t *,μ *,η *)X \mapsto \mathcal {C}(X,B)^{(t_\ast, \mu_\ast, \eta_\ast)} is representable. The representing object is called the object of tt-algebras. By Yoneda, the free-forgetful adjunction can be made internal in this case.

The terminology is justified, because in the 22-category CAT\mathbf{CAT}, it specializes to the usual notions of the category of tt-algebras and the corresponding free-forgetful adjunction.

A monad in 𝒞\mathcal{C} is the same as a 2-functor Mnd𝒞\mathbf{Mnd} \longrightarrow \mathcal{C}, where Mnd\mathbf{Mnd} is the 2-category with one object and Δ +\Delta_+, the algebraist’s simplicial category as monoidal hom-category (with ordinal sum). Since moreover, 𝒞(X,B) (t *,μ *,η *)[Mnd,CAT](Δ +,𝒞(X,)),\mathcal {C}(X,B)^{(t_\ast, \mu_\ast, \eta_\ast)} \cong [\mathbf{Mnd}, \CAT]( \Delta_{+\infty}, \mathcal{C}(X,-)), (where Δ +\Delta_{+\infty} is the subcategory of maps of Δ\Delta preserving maxima, which is acted on by Δ +\Delta_+ via ordinal sum) one can see that the object of t-algebras can be expressed as a weighted limit.

As a consequence, it is not surprising that a 2-category admits the construction of algebras under some completeness assumptions.


In the last part of the article, the authors review the notion of a doctrine, which is a 2-monad in 2-CAT\mathbf{CAT}, i.e., a 2-functor D:𝒞𝒞D: \mathcal {C} \longrightarrow \mathcal{C}, where 𝒞\mathcal{C} is a 2-category, and 2-natural transformations mm and jj, which are respectively the multiplication and the unit, verifying the usual identities. The fact that it is both a monad on a 2-category and in another one can be a bit disturbing at first.

If (D,m,j)(D,m,j) is a doctrine over a 2-category 𝒞\mathcal{C}, then its algebras will be objects XX of 𝒞\mathcal{C} together with an action DXXDX \longrightarrow X, exactly as in the case of algebras over a usual monad.

Already with morphisms, we can take advantage of the fact that a 2-category 𝒞\mathcal{C} has 2-cells, and define DD-morphisms to be lax in the sense that the diagram DX DY X Y \begin{matrix} DX & \longrightarrow & DY \\ \downarrow & & \downarrow \\ X & \longrightarrow & Y \end{matrix} is not supposed to be commutative, but is rather filled by a 2-cell with some coherence properties.

As one might expect, we can actually form a 2-category of such DD-algebras by adding 2-cells, using again the 22-cells existing in 𝒞\mathcal{C}.

If we keep only the DD-morphisms that are strict, we obtain the object of algebras (which should be a 22-category) that we discussed before.

One example of a doctrine is Δ +×:CATCAT\Delta_+ \times - : \mathbf{CAT} \longrightarrow \mathbf{CAT} together with the multiplication induced by the ordinal sum, and unit given on 𝒟\mathcal {D} by the functor 𝒟Δ +×𝒟\mathcal{D} \longrightarrow \Delta_+ \times \mathcal{D} that sends dd to (,d)(\emptyset,d).

The algebras for this doctrine will be categories equipped with a monad acting on them, while the DD-morphisms are transformations of monads, and the DD-2-cells are exactly the monad functor transformations of Street’s article.

Here, since we have two different 2-categories of algebras (with strict DD-morphisms or with all of them), one can wonder if monad morphisms DDD \longrightarrow D' will induce 22-functors DD'-AlgD\mathbf{Alg} \longrightarrow D-Alg\mathbf{Alg} on the level of these 22-categories.

This is indeed the case, and one can actually go even one step further and define monad modifications, using the fact that 2-CAT\mathbf{CAT} is in fact a 3-category! These modifications between two given monad morphisms are in fact in bijective correspondence with the 2-natural transformations between the 22-functors induced by these monad morphisms on the level of algebras (with lax D-morphisms). Note that they are not the same as monad morphisms transformations of Street’s article.

This bijection is nice because it implies that you can compare 2-categories of algebras by only looking at the doctrines: if they are equivalent, so are the 2-categories of algebras.

The fact that this bijection does not hold when we restrict only to strict morphism was really surprising to me, but I guess this is the price to pay to use the 3-category structure.

During the last days of April, the Kan extension seminar will be reading the article “Two dimensional monad theory”, by Blackwell, Kelly and Powell. We will then have more to say about these 2-monads!

Author: "riehl (eriehl@math.harvard.edu)" Tags: "Categories"
Send by mail Print  Save  Delicious 
Date: Tuesday, 18 Mar 2014 15:18

Leila Schneps is trying to raise $6,000 for what sounds like a good cause: translating a biography of Grothendieck into English:

As of this moment she’s raised $350… including $100 of her own money.

Leila Schneps writes:

Two volumes of a projected 3-volume biography of A. Grothendieck have been completed by Winfried SCHARLAU, and are available in German on Amazon.com through “Books on Demand” as

“Wer ist Alexander Grothendieck? Teil 1: Anarchy”


“Wer ist Alexander Grothendieck? Teil 3: Spiritualität”.

Grothendieck is aware of Scharlau’s work and even met him during the preparation of the part concerning the lives of his parents, although he subsequently ruptured relations with Scharlau as he has with everyone else.

In 2013, an English translation of volume 1 by my sister, Melissa Schneps became available on Amazon as

“Who is Alexander Grothendieck? Part 1, Anarchy”

Melissa’s translation of volume 1 was funded by Jim Carlson of the Clay Institute for Mathematics.

The new director of the Clay Institute has not accepted to fund translation of volume 3. The book covers Grothendieck’s life from his rupture with the mathematical establishment in 1970 to his disappearance in 1991 and contains a wealth of information from documents and interviews.

Due to multiple expressions of interest in an English translation, we have decided to use this method to raise funding for the translation of volume 3. Once the project is underway, we propose to create a web page where chapters of the volume can be made accessible to donors as they appear. Once completed, the new volume will be available on Amazon through “Books on Demand” like the others.

If you are interested in Alexander Grothendieck and his astonishing life, please help to make this happen.

In response Kevin Lin asked the obvious question: “What about volume 2?” I don’t know the answer! Do you?

Author: "john (baez@math.ucr.edu)" Tags: "Science Funding"
Send by mail Print  Save  Delicious 
Date: Sunday, 16 Mar 2014 21:14

One of my dreams these days is to get people to apply modern math to ecology and biology, to help us design technologies that work with nature instead of against it. I call this dream ‘green mathematics’. But this will take some time to reach, since living systems are subtle, and most mathematicians are more familiar with physics.

So, I’ve been warming up by studying the mathematics of chemistry, evolutionary game theory, electrical engineering, control theory and information theory. There are a lot of ideas in common to all these fields, but making them clear requires some category theory. I call this project ‘network theory’. I’m giving some talks about it at Oxford.

(This diagram is written in Systems Biology Graphical Notation.)

Here’s the plan:

Network Theory

Nature and the world of human technology are full of networks. People like to draw diagrams of networks: flow charts, electrical circuit diagrams, signal-flow graphs, Bayesian networks, Feynman diagrams and the like. Mathematically minded people know that in principle these diagrams fit into a common framework: category theory. But we are still far from a unified theory of networks. After an overview, we will look at three portions of the jigsaw puzzle in three separate talks:

I. Electrical circuits and signal-flow graphs.

II. Stochastic Petri nets, chemical reaction networks and Feynman diagrams.

III. Bayesian networks, information and entropy.

All these talks will be in Lecture Theatre B of the Computer Science Department—you can see a map here, but the entrance is on Keble Road. Here are the times:

• Friday 21 February 2014, 2 pm: Network Theory: overview. Also available on YouTube.

• Tuesday 25 February, 3:30 pm: Network Theory I: electrical circuits and signal-flow graphs. Also available on YouTube.

• Tuesday 4 March, 3:30 pm: Network Theory II: stochastic Petri nets, chemical reaction networks and Feynman diagrams. Also available on YouTube.

• Tuesday 11 March, 3:30 pm: Network Theory III: Bayesian networks, information and entropy. Also available on YouTube.

I thank Samson Abramsky, Bob Coecke and Jamie Vicary of the Computer Science Department for inviting me, and Ulrike Tillmann and Minhyong Kim of the Mathematical Institute for helping me get set up. I also thank all the people who helped do the work I’ll be talking about, most notably Jacob Biamonte, Jason Erbele, Brendan Fong, Tobias Fritz, Tom Leinster, Tu Pham, and Franciscus Rebro.

Ulrike Tillmann has also kindly invited me to give a topology seminar:

Operads and the Tree of Life

Trees are not just combinatorial structures: they are also biological structures, both in the obvious way but also in the study of evolution. Starting from DNA samples from living species, biologists use increasingly sophisticated mathematical techniques to reconstruct the most likely “phylogenetic tree” describing how these species evolved from earlier ones. In their work on this subject, they have encountered an interesting example of an operad, which is obtained by applying a variant of the Boardmann–Vogt “W construction” to the operad for commutative monoids. The operations in this operad are labelled trees of a certain sort, and it plays a universal role in the study of stochastic processes that involve branching. It also shows up in tropical algebra. This talk is based on work in progress with Nina Otter.

I’m not sure exactly where this will take place, but probably somewhere in the Mathematical Institute, shown on this map. Here’s the time:

• Monday 24 February, 3:30 pm, Operads and the Tree of Life.

If you’re nearby, I hope you can come to some of these talks — and say hi!


(This diagram was drawn by Darwin.)

Author: "john (baez@math.ucr.edu)" Tags: "Conference"
Send by mail Print  Save  Delicious 
Date: Wednesday, 05 Mar 2014 17:06

Guest post by Nick Gurski

I have been thinking about various sorts of operads with my PhD student Alex Corner, and have become interested in the following very concrete question: what are examples of operads in the category of finite groups under the cartesian product? I don’t know any really interesting examples, but maybe you do! After the break I will explain why I got interested in this question, and tell you about some examples that I do know.

Alex and I started off thinking about various sorts of things you might do with operads in Cat\mathbf{Cat}, and were eventually forced into what we currently call an action operad. This is an operad GG whose job it is to act on the objects of other operads. The key examples to keep in mind are the terminal operad (each set is just a singleton), the symmetric operad (the nnth set is the nnth symmetric group), and the braid operad (the nnth set is the nnth braid group). The technical definition involves an operad GG, a group structure on each set G(n)G(n), a map of operads π:GΣ\pi:G \rightarrow \Sigma to the symmetric operad which is levelwise a group homomorphism, and a final condition (when it makes sense) relating operadic composition, μ\mu, with group multiplication:

μ(g;f 1,,f n)μ(g;f 1,,f n)=μ(gg;f π(g)(1)f 1,,f π(g)(n)f n). \mu(g; f_1, \ldots, f_n) \cdot \mu(g'; f_1', \ldots, f_n') = \mu (g g'; f_{\pi(g')(1)}f_{1}', \ldots, f_{\pi(g')(n)}f_{n}').

These ideas have cropped up before, for example in Nathalie Wahl’s thesis or this preprint of Wenbin Zhang. Once you have this definition, you can define operads which are equivariant with respect to GG: you have an operad PP, a group action of G(n)G(n) on P(n)P(n) for each nn, and some equivariance conditions that generalize the equivariance conditions for a symmetric operad.

This isn’t the only thing you can do with an action operad, you can also think about the 2-monad on Cat\mathbf{Cat} whose algebras are strict monoidal categories where G(n)G(n) acts naturally on nn-fold tensor products. If you do this with the symmetric operad, you get symmetric strict monoidal categories (or permutative categories, if you are a topologist); if you do this with the (ribbon) braid operad, you get (ribbon) braided strict monoidal categories; and if you do this with the action operad of all terminal groups, you get back plain old strict monoidal categories. The only other “naturally-occurring” example of an action operad that I know of is the operad of nn-fruit cactus groups, J nJ_{n}. These groups come up in the representation theory of quantum groups, particularly the theory of crystals, and the monoidal structure you get out here is something Drinfeld called a coboundary category. I can give you a generators-and-relations definition of these groups, at which point I would have completely exhausted my understanding of this operad. The best reference that I know of is the paper Crystals and coboundary categories by Henriques and Kamnitzer.

What does this have to do with my question about operads of finite groups? Well, as it turns out, the structure map for an action operad π:GΣ\pi:G \rightarrow \Sigma only has two options: it can be surjective, or it can be the zero map (i.e., everything maps to the identity permutation). Furthermore, that condition I wrote down relating group multiplication and operadic composition says that giving an action operad with π\pi the zero map is equivalent to giving an operad in which the operadic composition maps all preserve group multiplication. Alex and I already showed that the operadic composition of all identity elements is the identity element in the target, in other words an action operad with π\pi the zero map is just an operad in the category of groups using the cartesian product.

You can take kernels of maps between action operads, so in particular given any action operad GG you can take the kernel of π:GΣ\pi:G \rightarrow \Sigma; this gives you an operad in groups. For the examples above, you get finite groups when GG is the terminal operad or G=ΣG = \Sigma (as the terminal operad is obviously the kernel in the case of G=ΣG = \Sigma), but for the rest of the examples you get an operad in the category of groups, but most of those groups are infinite. The groups involved are the so-called pure versions: pure braids, pure ribbon braids, and pure nn-fruit cacti. One can then think of an action operad with surjective map π\pi as being an extension of the symmetric operad by an operad in the category of groups (the “pure” version), and that action operad is finite if and only if the operad in the category of groups is one containing only finite groups. We can translate this back into thinking about monoidal categories by then noting if we have some notion of strict monoidal category in which nnth tensor powers come equipped with a natural action of a finite group G(n)G(n) for all nn, then we must be able to dig up an operad in the category of finite groups.

Now let’s talk concrete examples: what operads do I know in the category of finite groups? Well, there is obviously the terminal operad, but I can go very slightly further in that I can tell you how to construct some new ones. Here are two methods you can use to construct operads of finite groups.

  • Let AA be a finite abelian group. Then there is an operad A̲\underline{A} where A̲(n)=A n\underline{A}(n) = A^{n} (the power here is a cartesian power). The operadic composition map A n×A k 1××A k nA Σk i A^{n} \times A^{k_{1}} \times \cdots \times A^{k_{n}} \rightarrow A^{\Sigma k_{i}} takes the vector (a 1,,a n)(a_{1}, \ldots, a_{n}) in the first coordinate and duplicates the iith coordinate k ik_{i} times, then adds the result to the vector you get by just concatenating the nn vectors in the A k iA^{k_{i}}. Here AA must be abelian as it appears as A̲(1)\underline{A}(1), and G(1)G(1) must be abelian for any action operad by an Eckmann-Hilton argument.
  • Now let GG be any finite group, but in fact you will see that we don’t use anything interesting about the group structure here. There is an operad G c2G^{c2} with G c2(n)=G (n2) G^{c2}(n) = G^{\binom{n}{2}} given the pointwise group structure. You should think of this as the set of functions from {(i,j):1i<jn}\{ (i,j): 1 \leq i \lt j \leq n \} to GG. Operad composition is a map G (n2)×G (k 12)××G (k n2)G (Σk i2). G^{\binom{n}{2}} \times G^{\binom{k_{1}}{2}} \times \cdots \times G^{\binom{k_{n}}{2}} \rightarrow G^{\binom{\Sigma k_{i}}{2}}. If we are given 1a<bk i1 \leq a \lt b \leq \sum k_{i}, we must give back an element of GG. If there is some rr such that i=1 r1k ia<b< i=1 rk i, \sum_{i=1}^{r-1} k_{i} \leq a \lt b \lt \sum_{i=1}^{r} k_{i}, then we use the function coming from G c2(r)G^{c2}(r) evaluated on 1a+1 i=1 r1k i<b+1 i=1 r1k ik r. 1 \leq a +1 - \sum_{i=1}^{r-1} k_{i} \lt b+1- \sum_{i=1}^{r-1} k_{i} \leq k_{r}. If not, then there exist r<sr \lt s such that aa lies in the rrth “interval” as we had before and bb lies in the ssth interval, so then you use the function coming from G c2(n)G^{c2}(n) evaluated on r<sr \lt s.

I am reasonably content with the first of these constructions, I understand how to do the second but don’t really know where it comes from, and I have some ideas about how one could try to insert the finite group of their choice as G(0)G(0) but haven’t checked the details, and know basically nothing else. Furthermore, I don’t know how these interact with each other, or how you can form extensions of Σ\Sigma with them outside of some obvious constructions.

Those are just the two straightforward approaches that I know of to construct operads in the category of finite groups. You can also try to construct these operads from operads of topological spaces or simplicial sets, but once again I don’t know of an example that produces finite things (apart from ones giving the groups above). Do you know any others?

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Categories"
Send by mail Print  Save  Delicious 
Date: Friday, 21 Feb 2014 00:52

Guest post by Tom Avery

Before getting started, I’d like to thank Emily for organizing the seminar, as well as all the other participants. It’s been a lot of fun so far! I’d also like to thank my supervisor Tom Leinster for some very helpful suggestions when writing this post.

In the fourth instalment of the Kan Extension Seminar we’re looking at Lawvere’s paper “Metric spaces, generalized logic, and closed categories”. This is the paper that introduced the surprising description of metric spaces as categories enriched over a certain monoidal category \mathbb{R}. A lot of people find this very striking when they first see it, and it helps to drive home the point that enriched categories are not just ordinary categories with some extra structure on the hom-sets; in fact the hom-sets don’t have to be sets at all!

Lawvere also intended the paper to serve as an accessible introduction to enriched category theory, so it begins fairly gently with some basic definitions. For the purposes of this post however, I’ll assume the reader has at least seen the definitions of symmetric monoidal closed categories, 𝒱\mathcal{V}-categories, and 𝒱\mathcal{V}-functors. If not, everything you need can be found on the nlab.

Lawvere begins by describing some of the philosophy behind the paper. He argues that rather than being abstract nonsense that only appears at the highest level of mathematics, category theory can be found even in the most elementary concepts. He mentions the well known examples of posets and groups as categories, and the aim of the paper is to describe another example of an elementary mathematical structure arising from categorical constructions, namely metric spaces.

We’ll also look at posets as enriched categories that are closely related to metric spaces, and this is where the “generalized logic” of the title comes in. The idea is that objects in a monoidal category 𝒱\mathcal{V} will be truth values for our generalized logic, morphisms in 𝒱\mathcal{V} will be entailments between them, certain functors correspond to logical operations, and adjointness relations between these functors correspond to rules of inference.

We will only consider categories enriched in complete and cocomplete symmetric monoidal closed categories (which I’ll just call “monoidal categories”), and we are particularly interested in two examples. The first is 𝟚\mathbb{2}, which has two objects, false\mathrm{false} and true\mathrm{true}, and three morphisms, which we write as entailments: falsefalse\mathrm{false} \vdash \mathrm{false}, falsetrue\mathrm{false} \vdash \mathrm{true} and truetrue\mathrm{true} \vdash \mathrm{true}. The tensor product is given by conjunction, the unit is true\mathrm{true} and the internal hom is implication.

The hom-tensor adjunction gives a correspondence between judgements auva \wedge u \vdash v and u(av)u \vdash (a \Rightarrow v), which is just the deduction theorem from logic. Categories enriched in 𝟚\mathbb{2} are simply posets (well, really preorders); we interpret the hom-object X(x,y)X(x,y) as the truth value of the statement xyx \leq y, and the composition and identities give us

(xy)(yz)xzandtruexx. (x \leq y) \wedge (y \leq z) \vdash x \leq z \quad \text{and} \quad \mathrm{true} \vdash x \leq x.

The category \mathbb{R} has as objects non-negative reals together with \infty, and has a single morphism xyx \to y precisely when xyx \geq y. The tensor product is addition with 00 as the unit, and this forces the internal hom to be truncated subtraction:

[x,y]=max(yx,0), [x,y] = \mathrm{max} (y-x, 0),

but we’ll just write this as yxy-x. Thus the fact that x+yzx+y \geq z if and only if yzxy \geq z-x can be thought of as somehow generalizing the deduction theorem. The composition and identities in an \mathbb{R}-category XX give us

X(x,y)+X(y,z)X(x,z)and0X(x,x),X(x,y) + X(y,z) \geq X(x,z) \quad \text{and} \quad 0 \geq X(x,x),

which are familiar as axioms of metric spaces, and this is how we will refer to \mathbb{R}-categories. This is a weakening of the usual metric space axioms in three ways:

  • We don’t require that X(x,y)=0X(x,y) = 0 only when x=yx=y. This condition amounts to requiring that isomorphic objects are equal, which is undesirable from a categorical point of view; this is also the reason for dropping antisymmetry of posets. There are naturally occurring examples that don’t satisfy the non-degeneracy condition; perhaps most notably, it makes sense to regard the distance between two measurable functions that are equal almost everywhere as being zero.

  • We allow X(x,y)=X(x,y) = \infty. Allowing infinite distances is also quite natural, as it’s easy to imagine spaces where it’s impossible to get from one point to another. For example, this is the most sensible way to think of distances in a discrete space.

  • We don’t require X(x,y)=X(y,x)X(x,y) = X(y,x). Allowing non-symmetry is a more significant shift in what we mean by a metric space, but even so there are natural examples, for example the Hausdorff metric on subsets of a metric space. It’s best to think of the hom-values in a generalized metric space not as representing distances, but representing the least time or effort it takes to get from one point or state to another. When you look at it this way, there’s no reason to assume symmetry (e.g. going up and down a hill are not equally difficult).

There is a monoidal functor 𝟚\mathbb{2} \to \mathbb{R} sending false\mathrm{false} \mapsto \infty and true0\mathrm{true} \mapsto 0, and this induces an inclusion of the category of posets into the category of metric spaces. There is also a monoidal functor 𝟚Set\mathbb{2} \to \mathrm{Set}, so metric spaces and categories can be seen as generalizing posets in different directions.

Functor categories and Yoneda

A 𝟚\mathbb{2}-functor between posets is precisely an order preserving map, and a \mathbb{R}-functor between metric spaces is a Lipschitz map with Lipschitz constant 1, i.e. a function f:XYf \colon X \to Y such that Y(fx,fx)X(x,x)Y(f x,f x') \leq X(x,x') for all xx and xx' in XX. For an arbitrary monoidal category 𝒱\mathcal{V} and 𝒱\mathcal{V}-categories XX and YY the functor 𝒱\mathcal{V}-category [X,Y][X,Y] has as objects the 𝒱\mathcal{V}-functors XYX \to Y, and the hom objects are defined by the end

[X,Y](f,g)= xY(fx,gx).[X,Y](f,g) = \int _x Y(f x,g x).

When 𝒱=Set\mathcal{V} = \mathrm{Set}, this just gives the ordinary end representation of the set of natural transformations. Since 𝟚\mathbb{2} and \mathbb{R} are both posets, ends reduce to products, which are conjunctions in 𝟚\mathbb{2} and suprema in \mathbb{R}. So for posets, the object of natural transformations between order preserving maps f,g:XYf,g \colon X \to Y is the truth value of the statement x(fxgx)\forall x (f x \leq g x). In other words, order preserving maps are ordered by domination. For metric spaces the hom object [X,Y](f,g)[X,Y](f,g) is given by sup xY(fx,gx)\sup_x Y(f x,g x). So the space of Lipschitz-1 maps between metric spaces is endowed with the sup metric.

The closed structure of the monoidal category 𝒱\mathcal{V} makes 𝒱\mathcal{V} itself into a 𝒱\mathcal{V}-category. For 𝟚\mathbb{2} this gives the obvious interpretation of 𝟚\mathbb{2} as a poset. However, the metric on \mathbb{R} is given by truncated subtraction, rather than the usual |yx||y-x|. The self-enrichment of 𝒱\mathcal{V} means we can define the presheaf 𝒱\mathcal{V}-category [X op,𝒱][X^{\mathrm{op}}, \mathcal{V}], and that allows us to define the Yoneda embedding X[X op,𝒱]X \to [X^{\mathrm{op}},\mathcal{V}], which sends an object xx to the representable presheaf X(,x)X(-,x). Just as in ordinary category theory, this embedding is full and faithful, which in the case of posets gives:

Proposition. Any poset XX can be embedded into the poset of downwards closed subsets of XX (ordered by inclusion), by sending an element xx to the set of all things x\leq x.

Here we have used the correspondence between downwards closed subsets and order preserving maps X op𝟚X^{\mathrm{op}} \to \mathbb{2}. For metric spaces we have:

Proposition. Any metric space can be isometrically embedded into the space of Lipschitz-1 maps X opX^{\mathrm{op}} \to \mathbb{R}.

Adequacy and density

We say that a 𝒱\mathcal{V}-functor i:XYi \colon X \to Y is adequate if the composite

Y[Y op,𝒱][X op,𝒱] Y \to [Y^{\mathrm{op}},\mathcal{V}] \to [X^{\mathrm{op}}, \mathcal{V}]

of the Yoneda embedding with restriction along ii is full and faithful. We say that ii is dense if we have an isomorphism of bimodules i i 1 Xi_{\star} \circ i^{\star} \cong 1_X. I haven’t yet said what bimodules are, but the important thing is that in the case of metric spaces this becomes the condition that

Y(y 1,y 2)=inf x[Y(y 1,ix)+Y(ix,y 2)]. Y(y_1,y_2) = inf_x [Y(y_1,i x) + Y(i x, y_2)].

This terminology is slightly unfortunate, because what Lawvere calls an adequate functor is nowadays called a dense functor, and Lawvere’s notion of density doesn’t have a modern name as far as I’m aware. A functor is dense iff it’s image is dense in the usual metric space sense. Lawvere shows that a dense functor is adequate, at least in the case of metric spaces. A metric space is separable if it admits a dense map from the discrete space on the natural numbers, and this result give us

Proposition. Any separable metric space can be isometrically embedded in the space of sequences of reals endowed with the sup metric.

This space of sequences is not quite l l_{\infty} as usually defined because of the non-standard metric on \mathbb{R}.

The comprehension scheme

A(n ordinary) functor p:EBp \colon E \to B is called a discrete opfibration if every morphism p(e)bp(e) \to b is the image under pp of a unique morphism with domain ee. There is a correspondence between discrete opfibrations on BB and functors BSetB \to \mathrm{Set}, and this correspondence is obtained by restricting a certain adjunction Cat/B[B,Set]\mathrm{Cat}/B \rightleftarrows [B, \mathrm{Set}] to an equivalence. This adjunction can be defined for arbitrary 𝒱\mathcal{V}, provided that the unit of 𝒱\mathcal{V} is in fact terminal (this gives a way of defining projections from a tensor product to each of the factors, which is not generally possible). I won’t go into the details of how it’s defined, but I’ll describe what it gives for 𝟚\mathbb{2} and \mathbb{R}.

Given an order-preserving map p:EBp \colon E \to B between posets, the left adjoint of this adjunction sends pp to ϕ p:B𝟚\phi_p \colon B \to \mathbb{2}, defined by setting ϕ p(b)\phi_p(b) to be the truth value of the statement e(bp(e))\exists e(b \leq p(e)). The right adjoint sends ϕ:B𝒱\phi \colon B \to \mathcal{V} to π:{B|ϕ}B\pi \colon \{B|\phi\} \to B, where {B|ϕ}\{B|\phi\} is the (upwards closed) subset of BB on which ϕ\phi takes the value true\mathrm{true}, and π\pi is the inclusion. If we take BB to be a discrete poset (i.e. a set) then ϕ\phi is just a unary predicate on BB, and

{B|ϕ}={bB|ϕ(b)=true}.\{B|\phi \} = \{b \in B | \phi (b) = \mathrm{true} \}.

Lawvere calls the right adjoint of the adjunction the comprehension scheme, and this example shows why.

For metric spaces, the left adjoint sends a Lipschitz-1 map p:EBp \colon E \to B to the function ϕ p:B\phi_p \colon B \to \mathbb{R} defined by ϕ p(b)=inf eB(p(e),b)\phi_p (b) = \inf_e B(p (e), b), i.e. the distance from the image of pp. The right adjoint sends ϕ:B\phi \colon B \to \mathbb{R} to the inclusion of its vanishing set.

It’s interesting to note that for Set\mathrm{Set} and 𝟚\mathbb{2}, this adjunction restricts to an equivalence between a subcategory of 𝒱-Cat/B\mathcal{V} \text{-} \mathrm{Cat} / B (the discrete opfibrations, and inclusions of upwards closed subsets respectively) and the whole functor category [B,𝒱][B, \mathcal{V}]. For \mathbb{R} however, only those functions BB \to \mathbb{R} that give the distance from a closed subset are invariant under the adjunction, and these correspond to the inclusion of this closed set. Thus purely categorical considerations naturally give rise to the closed sets in BB as a distinguished class of subsets.

Bimodules and Kan extensions

For 𝒱\mathcal{V}-categories XX and YY, a bimodule ϕ:XY\phi \colon X \to Y consists of a 𝒱\mathcal{V}-functor ϕ:Y opX𝒱\phi \colon Y^{\mathrm{op}} \otimes X \to \mathcal{V} (the placement of the op\mathrm{op} is not completely agreed upon, but I’ll stick with Lawvere’s convention). We can think of a bimodule as a 𝒱\mathcal{V}-category structure on the disjoint union of XX and YY, with ϕ(y,x)\phi(y,x) as the hom object between an object of YY and an object of XX, and with all the hom objects in the other direction being the initial object of 𝒱\mathcal{V}. The 𝒱\mathcal{V}-functoriality of ϕ\phi means that we have a “composition” operation Y(y,y)ϕ(y,x)ϕ(y,x)Y(y',y)\otimes \phi(y,x) \to \phi(y',x), and dually, and that this composition is associative.

There are two special types of bimodule that are particularly important, and they are both induced by 𝒱\mathcal{V}-functors. Given a 𝒱\mathcal{V}-functor f:XYf \colon X \to Y, we define bimodules f :XYf_{\star} \colon X \to Y and f :YXf^{\star} \colon Y \to X by

f (y,x)=Y(y,fx)andf (x,y)=Y(fx,y).f_{\star} (y,x) = Y(y, f x) \quad \text{and} \quad f^{\star} (x,y) = Y(f x,y).

A morphism between two bimodules XYX \rightrightarrows Y is just a 𝒱\mathcal{V}-natural transformation between the corresponding 𝒱\mathcal{V}-functors Y opX𝒱Y^{\mathrm{op}} \otimes X \to \mathcal{V}. Given bimodules ϕ:XY\phi \colon X \to Y and ψ:YZ\psi \colon Y \to Z, we can define the composite bimodule ψϕ:XZ\psi \circ \phi \colon X \to Z by the coend

ψϕ(z,x)= yψ(z,y)ϕ(y,x), \psi \circ \phi (z,x) = \int^y \psi (z,y) \otimes \phi (y, x),

and this composition is associative (up to isomorphism). The bimodule 1 X:XX1_X \colon X \to X given by X(,):X opX𝒱X(-,-) \colon X^{\mathrm{op}} \otimes X \to \mathcal{V} serves as an identity for this composition, and hence there is bicategory 𝒱-Mod\mathcal{V} \text{-} \mathrm{Mod} of 𝒱\mathcal{V}-categories, bimodules and natural transformations.

Let kk be the 𝒱\mathcal{V}-category with a single object and whose only hom object is the unit object of 𝒱\mathcal{V}. Then a bimodule kkk \to k is just an object of 𝒱\mathcal{V}, and composition is just the tensor product in 𝒱\mathcal{V}. So bimodule composition generalizes the tensor product, and it’s natural to ask whether the internal hom of 𝒱\mathcal{V} also generalizes. Since general bimodule composition is not commutative, there are two composition operations (on the left and the right) that could potentially have right adjoints, and in fact they both do, and these are defined by end formulae. If we restrict our attention to modules of the form f f_{\star} for a 𝒱\mathcal{V}-functor ff, the operation ββf \beta \mapsto \beta \circ f_{\star} also has a left adjoint, which is given by ϕϕf \phi \mapsto \phi \circ f^{\star}, and can also be written explicitly as a coend.

Let ff be a 𝒱\mathcal{V}-functor XYX \to Y. Bimodules YkY \to k are just 𝒱\mathcal{V}-functors ψ:Y𝒱\psi \colon Y \to \mathcal{V}, and the bimodule composite ψf \psi \circ f_{\star} is the bimodule XkX \to k corresponding to the 𝒱\mathcal{V}-functor ψf:X𝒱\psi f \colon X \to \mathcal{V}. Thus the left and right adjoint to precomposition with f f_{\star} in this case give left and right Kan extensions along ff, and the formulae for these adjoints reduce to the familiar coend and end formulae for left and right Kan extensions:

Lan fϕ(y)= xY(fx,y)ϕ(x)andRan fϕ(y)= x[Y(y,fx),ϕ(x)] \mathrm{Lan}_f \phi (y) = \int^x Y(f x, y) \otimes \phi (x) \quad \text{and} \quad \mathrm{Ran}_f \phi (y) = \int _x [ Y(y, f x), \phi (x) ]

When ff is full and faithful, the Kan extensions really are extensions, in other words if you extend and then restrict, you get back what you started with. So specializing all this to the case 𝒱=\mathcal{V} = \mathbb{R} gives:

Proposition. Let XX be a subspace of a metric space YY, and let ϕ\phi be a Lipschitz-1 map from XX to \mathbb{R}. Then ϕ\phi has both maximal and minimal Lipschitz-1 extensions to the whole of YY, given by

Lan fϕ(y)=inf x[ϕ(x)+Y(fx,y)]andRan fϕ(y)=sup x[ϕ(x)Y(y,fx)].\mathrm{Lan}_f \phi (y) = \inf_x [ \phi(x) + Y(f x,y) ] \quad \text{and} \quad \mathrm{Ran}_f \phi (y) = \sup_x [ \phi(x) - Y(y,f x)].

A particularly interesting example is given when XX is the space of simple, non-negative functions on a probability space (i.e. positive linear combinations of indicator functions of measurable sets), and ff is the inclusion into the space of all measurable functions (with the sup metric). If you take ϕ(x)\phi (x) to be the integral of the simple function xx, then the two Kan extensions of ϕ\phi give the upper and lower integrals of a measurable function yy. In particular, yy is integrable if and only if Lan fϕ(y)=Ran fϕ(y)\mathrm{Lan}_f \phi (y) = \mathrm{Ran}_f \phi (y).

If we think of a map from a poset to 𝟚\mathbb{2} as a unary relation or predicate, then the left and right Kan extensions become

Lan fϕ(y)=x(fxyϕ(x))andRan fϕ(y)=x(yfxϕ(x)). \mathrm{Lan}_f \phi (y) = \exists x (f x \leq y \wedge \phi (x) ) \quad \text{and} \quad \mathrm{Ran}_f \phi (y) = \forall x (y \leq f x \Rightarrow \phi (x)).

If we restrict to the case when XX and YY are just sets, so that ff is an arbitrary function, this gives

Lan fϕ(y)=(xf 1(y))ϕ(x)andRan fϕ(y)=(xf 1(y))ϕ(x), \mathrm{Lan}_f \phi (y) = (\exists x \in f^{-1} (y)) \phi (x) \quad \text{and} \quad \mathrm{Ran}_f \phi (y) = (\forall x \in f^{-1}(y)) \phi (x),

and specializing even further to the case when ff is a product projection W×ZZW \times Z \to Z we have

Lan fϕ(z)=wϕ(w,z)andRan fϕ(z)=wϕ(w,z). \mathrm{Lan}_f \phi (z) = \exists w \phi (w,z) \quad \text{and} \quad \mathrm{Ran}_f \phi (z) = \forall w \phi (w,z).

So existential and universal quantification are (very) special cases of Kan extensions. This leads Lawvere to use the phrase “Kan quantification” for Kan extensions.

Every 𝒱\mathcal{V}-functor XYX \to Y gives rise to an adjunction f :XY:f f_{\star} \colon X \rightleftarrows Y \colon f^{\star} in the bicategory of bimodules, and it’s natural to ask under what conditions does every adjunction arise in this way. Recall that a metric space is (Cauchy) complete if every Cauchy sequence has a limit. A bit of care is needed when making this precise, because the possible non-symmetry of the metric means there are several things it could mean for a sequence to be Cauchy or convergent.

Proposition. A metric space YY is Cauchy complete if and only if every adjunction of bimodules XYX \rightleftarrows Y is induced by a Lipschitz-1 map XYX \to Y. In particular, the points of the Cauchy completion of YY correspond to bimodule adjunctions 1Y1 \rightleftarrows Y (where 11 is the one point metric space).

This gives a description of Cauchy complete metric spaces that is purely categorical, so it makes sense to talk about Cauchy complete 𝒱\mathcal{V}-categories for arbitrary 𝒱\mathcal{V}. It turns out that an ordinary (Set\mathrm{Set}-enriched) category is Cauchy complete iff all idempotents split, and the Cauchy completion of a ring regarded as a one object category enriched in Ab\mathrm{Ab} is the category of finitely generated projective modules.

These days it’s more common to see Cauchy complete categories defined in terms of absolute colimits. I’m not too sure about the history of the concept, but I think the definition Lawvere gives came first, and Street later gave the characterisation in terms of absolute colimits in “Absolute colimits in enriched categories”.

Free 𝒱\mathcal{V}-categories

Finally, Lawvere defines a 𝒱\mathcal{V}-graph (X,γ)(X, \gamma) to consist of a set XX of “vertices” and for every pair (x,x)(x,x') of vertices an object γ(x,x)\gamma (x,x') of 𝒱\mathcal{V} (i.e. like a 𝒱\mathcal{V}-category but without composition or identities). A morphism of 𝒱\mathcal{V}-graphs is a function f:XYf \colon X \to Y and a family of morphisms f x,x:γ(x,x)δ(fx,fx)f_{x,x'} \colon \gamma (x,x') \to \delta (f x, f x'). There is an evident forgetful functor from 𝒱-Cat\mathcal{V} \text{-} \mathrm{Cat} to the category of 𝒱\mathcal{V}-graphs. This has a left adjoint which sends a 𝒱\mathcal{V}-graph (X,γ)(X, \gamma) to the 𝒱\mathcal{V}-category FXF X which has the vertices of XX as objects and hom objects defined by

FX(x,x)= x 1,,x nγ(x,x 1)γ(x 1,x 2)γ(x n,x),FX(x,x') = \sum_{x_1,\ldots , x_n} \gamma (x,x_1) \otimes \gamma (x_1,x_2) \otimes \ldots \otimes \gamma (x_n, x'),

where the sum runs over all finite sequences of vertices. In the case of metric spaces, this gives the “least-cost” distance:

inf x 1,,x n[γ(x,x 1)+γ(x 1,x 2)++γ(x n,x)]. inf_{x_1, \ldots , x_n} [ \gamma(x, x_1) + \gamma (x_1, x_2) + \ldots + \gamma (x_n, x') ].

I’ll finish with a few questions for those more knowledgeable than myself:

  • I think the only categorical notion in the paper that I hadn’t at least heard of before is Lawvere’s notion of density (i.e. a functor ii such that i i =1i_{\star} \circ i^{\star} = 1; as noted above this is different from what people usually mean by a dense functor). Does this appear elsewhere in category theory, or is it only interesting for metric spaces?

  • A lot of the results about metric spaces are very similar to classical results, but are not quite the same because of the non-standard metric on \mathbb{R}. Is there any way of recovering the classical results without too much work?

  • The idea of generalized logic is intriguing, but I’m not sure how far you can go with it. Is it possible, for example, to talk about generalized models of a generalized theory?

Author: "riehl (eriehl@math.harvard.edu)" Tags: "Categories"
Send by mail Print  Save  Delicious 
Date: Sunday, 16 Feb 2014 11:00

You may recall how Tom Leinster, Tobias Fritz and I cooked up a neat category-theoretic characterization of entropy in a long conversation here on this blog. Now Tobias and I have a sequel giving a category-theoretic characterization of relative entropy. But since some people might be put off by the phrase ‘category-theoretic characterization’, it’s called:

I’ve written about this paper before, on my other blog:

  • Relative Entropy (Part 1): how various structures important in probability theory arise naturally when you do linear algebra using only the nonnegative real numbers.
  • Relative Entropy (Part 2): a category related to statistical inference, FinStat,\mathrm{FinStat}, and how relative entropy defines a functor on this category.
  • Relative Entropy (Part 3): statement of our main theorem, which characterizes relative entropy up to a constant multiple as the only functor F:FinStat[0,)F : \mathrm{FinStat} \to [0,\infty) with a few nice properties.

But now the paper is actually done! Let me give a compressed version of the whole story here… with sophisticated digressions buried in some parenthetical remarks that you’re free to skip if you want.

Our gives a new characterization of the concept of relative entropy, also known as ‘relative information’, ‘information gain’ or—by people who like to use jargon to make their work seem obscure—‘Kullback-Leibler divergence’.

Here’s the basic idea. Whenever you have two probability distributions pp and qq on the same finite set X,X, you can define the entropy of qq relative to pp:

S(q,p)= xXq xln(q xp x) S(q,p) = \sum_{x\in X} q_x \ln\left( \frac{q_x}{p_x} \right)

Here we set

q xln(q xp x)q_x \ln\left( \frac{q_x}{p_x} \right)

equal to \infty when p x=0,p_x = 0, unless q xq_x is also zero, in which case we set it equal to 0. Relative entropy thus takes values in [0,].[0,\infty].

Intuitively speaking, S(q,p)S(q,p) measures how surprised you’d be if you thought a situation was described by a probability distribution pp… but then someone came along and said no, it’s really qq.

Or if ‘surprise’ sounds too subjective, it’s the expected amount of information gained when you discover the probability distribution is really q,q, when you’d thought it was p.p.

Tobias and I wanted to use category theory to say what’s so great about relative entropy. We did it using a category FinStat\mathrm{FinStat} where:

  • an object (X,q)(X,q) consists of a finite set XX and a probability distribution xq xx \mapsto q_x on that set;
  • a morphism (f,s):(X,q)(Y,r)(f,s) : (X,q) \to (Y,r) consists of a measure-preserving function ff from XX to Y,Y, together with a probability distribution xs xyx \mapsto s_{x y} on XX for each element yYy \in Y, with the property that s xy=0s_{x y} = 0 unless f(x)=yf(x) = y.

If the raw math seems hard to swallow, perhaps some honey-coated words will help it go down. I think of an object of FinStat\mathrm{FinStat} as a system with some finite set of states together with a probability distribution on its states. This lets me think of a morphism

(f,s):(X,q)(Y,r) (f,s) : (X,q) \to (Y,r)

in a nice way. First, there’s a measurement process f:XYf : X \to Y, a function from the set XX of states of some system being measured to the set YY of states of some measurement apparatus. The condition that ff be measure-preserving says the probability that the apparatus winds up in any state yYy \in Y is the sum of the probabilities of all states of XX leading to that outcome:

r y= xf 1(y)q x \displaystyle{ r_y = \sum_{x \in f^{-1}(y)} q_x }

Second, there’s a hypothesis ss. This is a guess about the probability that the system being measured is in the state xXx \in X given any measurement outcome yY.y \in Y. The guess is the number s xys_{x y}.

Now, suppose we have any morphism

(f,s):(X,q)(Y,r) (f,s) : (X,q) \to (Y,r)

in FinStat.\mathrm{FinStat}. From this we get two probability distributions on XX. First, we have the probability distribution pp given by

p x= yYs xyr y \displaystyle{ p_x = \sum_{y \in Y} s_{x y} r_y } \qquad \qquad \heartsuit\heartsuit\heartsuit

This is our best guess about the the probability that the system is in any given state, given our hypothesis and the probability distribution of measurement results. Second, we have the ‘true’ probability distribution qq.

In fact, this way of assigning relative entropies to morphisms defines a functor

RE:FinStat[0,] RE : \mathrm{FinStat} \to [0,\infty]

where we use [0,][0,\infty] to denote the category with one object, the numbers 0x0 \le x \le \infty as morphisms, and addition as composition. More precisely, if

(f,s):(X,q)(Y,r) (f,s) : (X,q) \to (Y,r)

is any morphism in FinStat,\mathrm{FinStat}, we define

RE(f,s)=S(q,p) RE(f,s) = S(q,p)

where pp is defined as in equation \heartsuit\heartsuit\heartsuit. This tells us how surprised we are when we learn the true probability distribution qq, if our measurement results were distributed according to rr and our hypothesis was ss.

The fact that RERE is a functor is nontrivial and rather interesting! It says that given any composable pair of measurement processes:

(X,q)(f,s)(Y,r)(g,t)(Z,u) (X,q) \stackrel{(f,s)}{\longrightarrow} (Y,r) \stackrel{(g,t)}{\longrightarrow} (Z,u)

the relative entropy of their composite is the sum of the relative entropies of the two parts:

RE((g,t)(f,s))=RE(g,t)+RE(f,s). RE((g,t) \circ (f,s)) = RE(g,t) + RE(f,s) .

We prove that RERE is a functor. However, we go further: we characterize relative entropy by saying that up to a constant multiple, RERE is the unique functor from FinStat\mathrm{FinStat} to [0,][0,\infty] obeying three reasonable conditions.

Lower semicontinuity

The first condition is that RERE is lower semicontinuous. The set P(X)P(X) of probability distibutions on a finite set XX naturally has the topology of an (n1)(n-1)-simplex when XX has nn elements. The set [0,][0,\infty] has an obvious topology where it’s homeomorphic to a closed interval. However, with these topologies, the relative entropy does not define a continuous function

S:P(X)×P(X) [0,] (q,p) S(q,p). \begin{array}{rcl} S : P(X) \times P(X) &amp;\to&amp; [0,\infty] \\ (q,p) &amp;\mapsto &amp; S(q,p) . \end{array}

The problem is that

S(q,p)= xXq xln(q xp x)\displaystyle{ S(q,p) = \sum_{x\in X} q_x \ln\left( \frac{q_x}{p_x} \right) }

and q xln(q x/p x)q_x \ln(q_x/p_x) is \infty when p x=0p_x = 0 and q x>0q_x &gt; 0 — but it’s 00 when p x=q x=0.p_x = q_x = 0.

So, it turns out that SS is only lower semicontinuous, meaning that if p i,q ip^i , q^i are sequences of probability distributions on XX with p ipp^i \to p and q iqq^i \to q then

S(q,p)liminf iS(q i,p i) S(q,p) \le \liminf_{i \to \infty} S(q^i, p^i)

We give the set of morphisms in FinStat\mathrm{FinStat} its most obvious topology, and show that with this topology, RERE maps morphisms to morphisms in a lower semicontinuous way.

(Lower semicontinuity may seem like an annoying property. But there’s a way to redeem it. There’s a sneaky topology on [0,][0,\infty] such that a function taking values in [0,][0,\infty] is lower semicontinuous (in the lim inf sense above) if and only if it’s continuous with respect to this sneaky topology!

Using this idea, we can make FinStatFinStat and [0,][0,\infty] into topological categories — that is, categories internal to Top — in such a way that lower semicontinuity simply says

RE:FinStat[0,] RE : FinStat \to [0,\infty]

is a continuous functor.

A bit confusingly, this sneaky topology on [0,][0,\infty] is called the upper topology. I’ve fallen in love with the upper topology on [0,][0,\infty]. Why?

Well, [0,][0,\infty] is a very nice rig, or ‘ring without negatives’. Addition is defined in the obvious way, and multiplication is defined in the almost-obvious way, except that

0=0=0 0 \cdot \infty = \infty \cdot 0 = 0

Even this is actually obvious if you remember that it’s required by the definition of a rig. But if you try to put the ordinary closed interval topology on [0,][0,\infty], you’ll see multiplication is not continuous, because aa \cdot \infty is infinite when a>0a \gt 0 but then it suddenly jumps down to zero when aa hits zero. However, multiplication is continuous if we give [0,][0,\infty] the upper topology! Then [0,][0,\infty] becomes a topological rig.)

Convex linearity

The second condition is that RERE is convex linear. We describe how to take convex linear combinations of morphisms in FinStat,\mathrm{FinStat}, and then the functor RERE maps any convex linear combination of morphisms in FinStat\mathrm{FinStat} to the corresponding convex linear combination of numbers in [0,].[0,\infty].

Intuitively, this means that if we take a coin with probability PP of landing heads up, and flip it to decide whether to perform one measurement process or another, the expected information gained is PP times the expected information gain of the first process plus 1P1-P times the expected information gain of the second process.

(Operadically, the point is that both FinStatFinStat and [0,][0,\infty] are algebras of an operad P whose operations are convex linear combinations. The nn-ary operations in P are just probability distributions on an nn-element set. In other words, they’re points in the (n1)(n-1)-simplex.

So, saying that RERE is convex linear means that

RE:FinStat[0,] RE: FinStat \to [0,\infty]

is a map of P-algebras. But we avoid discussing this in our paper because FinStatFinStat, being a category, is just a ‘weak’ P-algebra, and we decided this would be too much for our poor little readers.

For those who like fine nuances: P is a topological operad, and FinStatFinStat and [0,][0,\infty] are algebras of this in the topological category TopCat. As I mentioned, FinStatFinStat is a ‘weak’ P-algebra, meaning the laws for convex linear combinations hold only up to coherent natural isomorphism. [0,][0,\infty] is strict… but to get convex linear combinations like λ0+(1λ)\lambda \cdot 0 + (1 - \lambda) \infty to behave continuously, we have to give [0,][0,\infty] the upper topology!)

Vanishing on a subcategory

The third condition is that RERE vanishes on morphisms (f,s):(X,q)(Y,r)(f,s) : (X,q) \to (Y,r) where the hypothesis ss is optimal. By this, we mean that equation \heartsuit\heartsuit\heartsuit gives a probability distribution pp equal to the ‘true’ one, qq.

That makes a lot of sense conceptually: we don’t gain any information upon learning the truth about a situation if we already knew the truth!

(But the subcategory of FinStatFinStat where we keep all the objects but only these ‘optimal’ morphisms also has a nice category-theoretic significance. Tom Leinster called it FP in this post:

That’s because it’s the ‘free P-algebra on an internal P-algebra’, where P is the operad I mentioned. I won’t explain what this means here, because Tom did it! Suffice it to say that it’s a shockingly abstract piece of operad theory that nonetheless manages to capture the concept of entropy very neatly. But that’s plain old entropy, not relative entropy.)

The result

Here, then, is our main result:

Theorem. Any lower semicontinuous, convex-linear functor

F:FinStat[0,] F : \mathrm{FinStat} \to [0,\infty]

that vanishes on every morphism with an optimal hypothesis must equal some constant times the relative entropy. In other words, there exists some constant c[0,]c \in [0,\infty] such that

F(f,s)=cRE(f,s) F(f,s) = c RE(f,s)

for any any morphism (f,s):(X,p)(Y,q)(f,s) : (X,p) \to (Y,q) in FinStat.\mathrm{FinStat}.

The proof

The proof is surprisingly hard. Or maybe we’re just surprisingly bad at proving things. But the interesting thing is this: the proof is swift and effective in the ‘generic’ case — the case where the support of the probability measures involved is the whole set they’re living on, and the constant cc is finite.

It takes some more work to handle the case where the probability measures have smaller support.

But the really hard work starts when we handle the case that, in the end, has c=c = \infty. Then the proof becomes more like analysis than what you normally expect in category theory. We slowly corner the result, blocking off all avenues of escape. Then we close in, grab its neck, and strangle it, crushing its larynx ever tighter, as it loses the will to fight back and finally expires… still twitching.

You’ve got to read the proof to understand what I mean.

Author: "john (baez@math.ucr.edu)" Tags: "Probability theory"
Send by mail Print  Save  Delicious 
Date: Friday, 14 Feb 2014 23:20

Stefan Forcey just alerted me to a long and reflective piece (edit: now paywalled) in the Chronicle of Higher Education about the relationship between the NSA and American academics — mathematicians in particular.

Now many academics are trying to be heard from the outside, arguing that the NSA’s spying tactics are proving counterproductive and that university researchers have a duty to stop assisting them. […]

In the months since Edward J. Snowden fled the United States with tens of thousands of electronic documents describing NSA practices, mathematicians are realizing that they are in the same position as nuclear physicists in the middle of the last century, and business students in more recent times — suddenly needing to figure out the ethics behind what they do, said Edward Frenkel, a professor of mathematics at the University of California at Berkeley.

The Chronicle article cites not only various sources that have been mentioned here at the Café before (e.g. Beilinson’s letter and Stefan’s Notices article), but also something that hasn’t:

Their appeals were followed on January 24 by an open letter from a group of 50 researchers warning of long-term damage to society and to the nation’s technological enterprise from the NSA’s reported tactic of intentionally weakening computer-security standards so it can carry out spy operations.

“Every country, including our own, must give intelligence and law-enforcement authorities the means to pursue terrorists and criminals,” the researchers wrote, “but we can do so without fundamentally undermining the security that enables commerce, entertainment, personal communication, and other aspects of 21st-century life.”

Shunned as NSA advisers, academics question their ties to the agency, The Chronicle of Higher Education, 10 February 2014. [Article was readable for free when I first put this post up, but got put behind a paywall a few days later.]

Author: "leinster (Tom.Leinster@ed.ac.uk)" Tags: "Science and Politics"
Send by mail Print  Save  Delicious 
Next page
» You can also retrieve older items : Read
» © All content and copyrights belong to their respective authors.«
» © FeedShow - Online RSS Feeds Reader