The Legendre–Fenchel transform, or Fenchel transform, or convex conjugation, is, in its naivest form, a duality between convex functions on a vector space and convex functions on the dual space. It is of central importance in convex optimization theory and in physics it is used to switch between Hamiltonian and Lagrangian perspectives.
Suppose that $V$ is a real vector space and that $f\colon V\to \mathbb{R}$ is a function; then the Fenchel transform is the function $f^{*}$ defined on the dual vector space $V^{*}$ by $f^{*}(k) := \sup_{x\in V}\bigl\{\langle k,x\rangle - f(x)\bigr\}.$
If you’re a regular reader then you will be unsurprised when I say that I want to show how it naturally arises from enriched category theory constructions. I’ll show that in the next post. In this post I’ll give a little introduction to the Legendre–Fenchel transform.
There is probably no best way to introduce the Legendre–Fenchel transform. The only treatment that I knew for many years was in Arnold’s book Mathematical Methods of Classical Mechanics, but I have recently come across the convex optimization literature and would highly recommend Touchette’s Legendre–Fenchel Transforms in a Nutshell — my treatment here is heavily influenced by this paper. I will talk mainly about the one-dimensional case as I think that gives a lot of the intuition.
We will start, as Legendre did, with the special case of a strictly convex differentiable function $f\colon \mathbb{R}\to\mathbb{R}$; for instance, the function pictured on the left hand side above. The derivative of $f$ is strictly increasing and so the function $f$ can be parametrized by the derivative $k$ instead of the parameter $x$. Indeed we can write the parameter $x$ in terms of the slope $k$, $x = x(k)$. The Legendre–Fenchel transform $f^{*}$ can then be defined to satisfy $\langle k, x\rangle = f(x) + f^{*}(k),$ where the angle brackets mean the pairing between a vector space and its dual. In this one-dimensional case, where $k$ and $x$ are thought of as real numbers, we just have $\langle k, x\rangle = k x$.
As $x$ is a function of $k$ we can rewrite this as $f^{*}(k) = \langle k, x(k)\rangle - f(x(k)).$ So the Legendre–Fenchel transform encodes the function in a different way. By differentiating this equation you can see that $(f^{*})'(k) = x(k)$, thus we have interchanged the abscissa (the horizontal co-ordinate) and the slope. So if $f$ has derivative $k$ at $x$ then $f^{*}$ has derivative $x$ at $k$. This is illustrated in the above picture.
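For a quick numerical sanity check of this strictly convex case, we can approximate the supremum form of the transform, $f^{*}(k) = \sup_x (kx - f(x))$, on a finite grid. For $f(x) = x^2/2$ the transform should again be $f^{*}(k) = k^2/2$. (This little script is a sketch of mine, not from the post.)

```python
# Numerical Legendre-Fenchel transform over a finite grid of sample points.
def fenchel(f, xs):
    """Approximate f*(k) = sup_x (k*x - f(x)) by a maximum over xs."""
    def fstar(k):
        return max(k * x - f(x) for x in xs)
    return fstar

xs = [i / 1000.0 for i in range(-5000, 5001)]   # grid on [-5, 5]
fstar = fenchel(lambda x: x * x / 2, xs)        # f(x) = x^2/2

print(fstar(1.0))    # about 0.5, matching f*(k) = k^2/2
print(fstar(-2.0))   # about 2.0
```

The grid point achieving the maximum is exactly the $x$ at which $f$ has slope $k$, which is the slope-interchange picture described above.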
I believe this is what Legendre did and then that what Fenchel did was to generalize this to non-differentiable functions.
For non-differentiable functions, we can’t talk about tangent lines and derivatives, but instead can talk about supporting lines. A supporting line is one which touches the graph at at least one point and never goes above the graph. (The fact that we’re singling out lines not going above the graph means that we have convex functions in mind.)
For instance, at the point the graph pictured below has no tangent line, but has supporting lines with gradient from to . A convex function will have at least one supporting line at each point.
It transpires that the right way to generalize the transform to this non-differentiable case is to define it as follows: $f^{*}(k) := \sup_{x}\bigl\{\langle k,x\rangle - f(x)\bigr\}.$ In this case, if $f$ has a supporting line of slope $k$ at $x$ then $f^{*}$ has a supporting line of slope $x$ at $k$. In the picture above, at , the function has supporting lines with slope from to : correspondingly, the function has supporting lines with slope all the way from to .
If we allow the function to be not strictly convex then the transform will not always be finite. For example, if then we have and for . So we will allow functions taking values in the extended real numbers: $\overline{\mathbb{R}} := [-\infty, +\infty]$.
We can use the above definition to get the transform of any function , whether convex or not, but the resulting transform is always convex. (When there are infinite values involved we can also say that is lower semi-continuous, but I’ll absorb that into my definition of convex for functions taking infinite values.)
Everything we’ve done for the one-dimensional case easily generalizes to any finite dimensional real vector space $V$, where we should say ‘supporting hyperplane’ instead of ‘supporting line’. From that we get a transform between sets of functions on $V$ and on $V^{*}$, where $V^{*}$ is the vector space dual of $V$. Similarly, we have a reverse transform going the other way, which is traditionally also denoted with a star: for $g\colon V^{*}\to \overline{\mathbb{R}}$ we define $g^{*}(x) := \sup_{k\in V^{*}}\bigl\{\langle k,x\rangle - g(k)\bigr\}.$
This pair of transforms has some rather nice properties; for instance, they are order reversing. We can put a partial order on any set of functions to $\overline{\mathbb{R}}$ by defining $f_{1}\le f_{2}$ if $f_{1}(x)\le f_{2}(x)$ for all $x$. Then $f_{1}\le f_{2}$ implies $f_{2}^{*}\le f_{1}^{*}$. Also for any function $f$ we have $f^{**}\le f$, which implies that the operator $f\mapsto f^{**}$ is idempotent. This means that $f\mapsto f^{**}$ is a closure operation. What it actually does is take the convex envelope of $f$, which is the largest convex function less than or equal to $f$. Here’s an example.
It follows that if $f$ is already a convex function then $f^{**} = f$. And as a consequence the Legendre–Fenchel transform and the reverse transform restrict to an order reversing bijection between convex functions on $V$ and convex functions on its dual $V^{*}$.
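The convex-envelope behaviour of $f^{**}$ is easy to see numerically. Here is a small sketch (the helper `conjugate`, the grids and the double-well example are all mine, not from the post): the biconjugate of a non-convex double well flattens out to its convex envelope.

```python
def conjugate(g, grid):
    """Approximate g*(k) = sup_x (k*x - g(x)) by a maximum over grid points."""
    return lambda k: max(k * x - g(x) for x in grid)

# A non-convex double well: f(x) = min((x-1)^2, (x+1)^2).
f = lambda x: min((x - 1) ** 2, (x + 1) ** 2)

xs = [i / 100.0 for i in range(-300, 301)]   # points x in [-3, 3]
ks = [i / 100.0 for i in range(-400, 401)]   # slopes k in [-4, 4]

fstar = conjugate(f, xs)          # the transform
fstarstar = conjugate(fstar, ks)  # the reverse transform of the transform

# f** flattens the well: it vanishes on [-1, 1] ...
print(fstarstar(0.0))   # about 0.0, whereas f(0.0) = 1.0
# ... and agrees with f outside the well, as the convex envelope should:
print(fstarstar(2.0))   # about 1.0 = f(2.0)
```

Note that $f^{**} \le f$ everywhere, with equality exactly where $f$ already agrees with its convex envelope.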
There are many other things that can be said about the transform, such as Fenchel duality and the role it plays in optimization, but I don’t understand such things to my own satisfaction yet.
Next time I’ll explain how most of the above structure drops out of the nucleus construction in enriched category theory.
Here’s a classic theorem about finite product theories, also known as algebraic theories or Lawvere theories. You can find it in Toposes, Triples and Theories, but it must go way back to Lawvere’s thesis. In my work with Nina Otter on phylogenetic trees, we need a slight generalization of it… and if true, this generalization must already be known. So I really just want a suitable reference!
Theorem. Suppose is a category with finite products that is single-sorted: every object is a finite product of copies of a single object . Let be the category of models of : that is, product-preserving functors
and natural transformations between these. Let
be the functor sending any model to its underlying set:
Then has a left adjoint
and is equivalent to the category of algebras of the monad
As a result, any model can be written as a coequalizer
where the arrows are built from the counit of the adjunction
in the obvious ways: , and .
The generalization we need is quite mild, I hope. First, we need to consider multi-typed theories. Second, we need to consider models in rather than . So, this is what we want:
Conjecture. Suppose is a category with finite products that is -sorted: every object is a finite product of copies of certain objects , where ranges over some index set . Let be the category of topological models of : that is, product-preserving functors
and natural transformations between these. Let
be the functor sending any model to its underlying spaces:
Then has a left adjoint
and is equivalent to the category of algebras of the monad
As a result, any model can be written as a coequalizer
where the arrows are built from the counit of the adjunction in the obvious ways.
1) There shouldn’t be anything terribly special about here: any sufficiently nice category should work… but all I need is . What counts as ‘sufficiently nice’? If we need to replace it by a ‘convenient category of topological spaces’ to make it cartesian closed, that’s fine with me—just let me know!
2) I’m sure this conjecture, if true, follows from some super-duper-generalization. I don’t mind hearing about such generalizations, and I suspect some of you will be unable to resist mentioning them… so okay, go ahead, impress me…
… but remember: all I need is this puny little result!
In fact, all I really need is a puny special case of this puny result. If it works, it will be a small, cute application of algebraic theories to biology.
Guest post by Christina Vasilakopoulou
In the eighth installment of the Kan Extension Seminar, we discuss the paper “Elementary Observations on 2-Categorical Limits” by G.M. Kelly, published in 1989. Even though Kelly’s classic book Basic Concepts of Enriched Category Theory, which contains the abstract theory related to indexed (or weighted) limits for arbitrary $\mathcal{V}$-categories, had been available since 1982, the existence of the present article is well justified.
On the one hand, it constitutes an independent account of the fundamental case $\mathcal{V}=\mathbf{Cat}$, thus it motivates and exemplifies the more general framework through a gentler, yet meaningful exposition of 2-categorical limits. The explicit construction of specific notable finite limits such as inserters, equifiers etc. promotes the comprehension of the definitions, via a hands-on description. Moreover, these finite limits and particular results concerning 2-categories rather than general enriched categories, such as the construction of the cotensor as a PIE limit, are central for the theory of 2-categories. Lastly, by introducing indexed lax and pseudo limits along with Street’s bilimits, and providing appropriate lax/pseudo/bicategorical completeness results, the paper also serves as an indispensable reference for the later “2-Dimensional Monad Theory” by Blackwell, Kelly and Power.
I would like to take this opportunity to thank Emily as well as all the other participants of the Kan Extension Seminar. This has been a unique experience of constant motivation and inspiration for me!
Presently, our base of enrichment is the cartesian monoidal closed category of (small) categories, with the usual adjunction . The very definition of an indexed limit requires a good command of the basic 2-categorical notions, as seen for example in “Review of the Elements of 2-categories” by Kelly and Street. In particular, a 2-natural transformation between 2-functors consists of components which not only satisfy the usual naturality condition, but also the 2-naturality condition expressing compatibility with 2-cells. Moreover, a modification between 2-natural transformations has as components families of 2-cells compatible with the mapped 1-cells of the domain 2-category, i.e. (where is whiskering).
A 2-functor is called representable, when there exists a 2-natural isomorphism The components of this isomorphism are in , and the unit of the representation is the corresponding `element’ via Yoneda.
For a general complete symmetric monoidal closed category $\mathcal{V}$, the usual functor category for two $\mathcal{V}$-categories is endowed with the structure of a $\mathcal{V}$-category itself, with hom-objects ends (which exist at least when the domain is small). In our context of $\mathcal{V}=\mathbf{Cat}$ it is not necessary to employ ends and coends at all, and the hom-category of the functor 2-category is evidently the category of 2-natural transformations and modifications. However, we note that computations via (co)ends simplify and are essential for constructions and (co)completeness results for enrichment in general monoidal categories.
The definition of weighted limits for 2-categories
To briefly motivate the definition of a weighted limit, recall that an ordinary limit of a (-) functor is characterized by an isomorphism natural in , where is the constant functor on the object . In other words, the limit is the representing object of the presheaf Since components of a natural transformation (i.e. cones) can be viewed as components of a natural , the above defining isomorphism can be written as In this form, ordinary limits can be easily seen as particular examples of conical indexed limits for =, and we are able to generalize the concept of a limit by replacing the functor by an arbitrary functor (weight) .
We may thus think of a 2-functor as a (small) indexing type or weight, and a 2-functor as a diagram in of shape : The 2-functor gives rise to a 2-functor which maps a 0-cell to the category . A representation of this contravariant 2-functor is an object along with a 2-natural isomorphism with components isomorphisms between categories The unit of this representation is which corresponds uniquely to a 2-natural transformation .
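In symbols, the defining representation can be written as follows. (This is the standard formulation from enriched category theory rather than a verbatim quotation, so take the notation as my own: $W\colon \mathcal{P}\to\mathbf{Cat}$ is the weight, $F\colon \mathcal{P}\to\mathcal{K}$ the diagram, and $\{W,F\}$ the weighted limit.)

```latex
\mathcal{K}\bigl(A,\{W,F\}\bigr)\;\cong\;[\mathcal{P},\mathbf{Cat}]\bigl(W,\;\mathcal{K}(A,F-)\bigr)
\qquad\text{2-naturally in } A\in\mathcal{K}.
```

Evaluating this isomorphism at $A=\{W,F\}$ on the identity gives the unit, a 2-natural transformation $\xi\colon W\Rightarrow\mathcal{K}(\{W,F\},F-)$.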
Via this 2-natural isomorphism, the object in satisfies a universal property which can be expressed in two levels:
The 1-dimensional aspect of the universal property states that every natural transformation factorizes as for a unique 1-cell in , where the vertical arrow is just pre-composition with .
The 2-dimensional aspect of the universal property states that every modification factorizes as for a unique 2-cell in .
The fact that the 2-dimensional aspect (which asserts an isomorphism of categories) does not in general follow from the 1-dimensional aspect (which asserts a bijection between the hom-sets of the underlying categories) is a recurrent issue of the paper. In fact, things would be different if the underlying category functor were conservative, in which case the 1-dimensional universal property would always imply the 2-dimensional one. Certainly though, this is not the case for $\mathcal{V}=\mathbf{Cat}$: the respective functor discards all the 2-cells and is not even faithful. However, if we know that a weighted limit exists, then the first level of the universal property suffices to detect it up to isomorphism.
Completeness of 2-categories
A 2-category is complete when all small weighted limits exist. The defining 2-natural isomorphism extends the mapping into a functor of two variables (the weighted limit functor) as the left parametrized adjoint (actually its opposite) of the functor mapping an object and a functor to . A colimit in is a limit in , and the weighted colimit functor is Apart from the evident duality, we observe that often colimits are harder to compute than limits. This may partially be due to the fact that is determined by the representable which gives generalized elements of , whereas the description of gives us arrows out of . For example, limits in are easy to compute via and in particular, taking gives us the objects of the category and gives us the morphisms. On the contrary, colimits in are not straightforward to compute (except via their defining property).
Notice that, just as ordinary limits are defined, via representability, in terms of limits in , we can define weighted limits in terms of limits of representables in : On the other hand, if the weights are representables, via the Yoneda lemma we get
The main result for general $\mathcal{V}$-completeness in Kelly’s book says that a $\mathcal{V}$-enriched category is complete if and only if it admits all conical limits (equivalently, products and equalizers) and cotensor products. Explicitly, conical limits are those with weight the constant $\mathcal{V}$-functor, whereas cotensors are those where the domain enriched category is the unit category , hence the weight and the diagram are determined by objects in and respectively. Once again, for $\mathcal{V}=\mathbf{Cat}$ an elementary description of both limits is possible.
Notice that when a 2-category admits tensor products of the form , then the 2-dimensional universal property follows from the 1-dimensional for every limit, because of conservativity of the functor and the definition of tensors. Moreover, the former also implies that the category is a strong generator in , hence the existence of only the cotensor along with conical limits in a 2-category is enough to deduce 2-completeness.
itself has cotensor and tensor products, given by and . It is ultimately also cocomplete, all colimits being constructed from tensors and ordinary colimits in (which give the conical limits in by the existence of the cotensor ).
If we were to make use of ends and coends, the explicit construction of an arbitrary 2-(co)limit in as the (co)equalizer of a pair of arrows between (co)products of (co)tensors coincides with Such an approach simplifies the proofs of many useful properties of limits and colimits, such as for appropriate 2-functors.
Famous finite 2-limits
The paper provides the description of some important classes of limits in 2-categories, essentially by exhibiting the unit of the defining representation for each particular case. A table which summarizes the main examples included is the following:
Let’s briefly go through the explicit construction of an inserter in a 2-category . The weight and diagram shape are as in the first line of the above table, and denote by the image of the diagram in . The standard technique is to identify the form of objects and morphisms of the functor 2-category , and then state both aspects of the universal property.
An object is a 2-natural transformation with components and satisfying the usual naturality condition (2-naturality follows trivially, since only has the identity 2-cell). This amounts to the following data:
a 1-cell , i.e. the object in determined by the functor ;
a 2-cell , i.e. the morphism in determined by the functor ;
properties, which make the 1-cells factorize as and .
We can encode the above data by a diagram Now a morphism is a modification between two objects as above. This has components
given by 2-cells and in satisfying naturality .
The modification condition and gives the components of as whiskered composites of . We can thus express such a modification as a 2-cell satisfying (graphically expressed by pasting accordingly to the sides of ).
This encoding simplifies the statement of the universal property for , as the object of in through which any natural transformation and modification uniquely factorize in an appropriate way (in fact, through the unit ). A very similar process can be followed for the identification of the other classes of limits. As an illustration, let’s consider some of these limits in the 2-category .
The inserter of two functors is a category with objects pairs where and in . A morphism is an arrow in such that the following diagram commutes: The functor is just the forgetful functor, and the natural transformation is given by .
The comma-object of two functors is precisely the comma category. If the functors have also the same domain, their inserter is a subcategory of the comma category.
The equifier of two natural transformations is the full subcategory of over all objects such that in .
There is a variety of constructions of new classes of limits from given ones, coming down to the existence of endo-identifiers, inverters, iso-inserters, comma-objects, iso-comma-objects, lax/oplax/pseudo limits of arrows and the cotensors , out of inserters, equifiers and binary products in the 2-category . Along with the substantial construction of arbitrary cotensors out of these three classes, P(roducts)I(nserters)E(quifiers) limits are established as essential tools, also in relation to categories of algebras for 2-monads. Notice that equalizers are `too tight’ to fit in certain 2-categories of importance such as .
Weaker notions of limits in 2-categories
The concept of a weighted 2-limit strongly depends on the specific structure of the 2-category of 2-functors, 2-natural transformations and modifications, for the 2-categories and . If we alter this structure by considering lax natural transformations or pseudonatural transformations, we obtain the definition of the lax limit and pseudo limit as the representing objects for the 2-functors Notice that the functor categories and are 2-categories whenever is a 2-category, hence the defining isomorphisms are again between categories as before.
An important remark is that any lax or pseudo limit in can be in fact expressed as a `strict’ weighted 2-limit. This is done by replacing the original weight with its image under the left adjoint of the inclusion functors , . The converse does not hold: for example, inserters and equifiers are neither lax nor pseudo limits.
We can relax the notion of limits in 2-categories even further, and define the bilimit of 2-functors and as the representing object up to equivalence: This is of course a particular case of general bilimits in bicategories, for which the domains are requested to be bicategories and the weight and the diagram homomorphisms of bicategories. The above equivalence of categories expresses a birepresentation of the homomorphism .
Evidently, bilimits (first introduced by Ross Street) may exist even when pseudo limits do not, since they require an equivalence rather than an isomorphism of hom-categories. The following two results sum up the conditions ensuring that a 2-category has all lax, pseudo and bilimits.
A 2-category with products, inserters and equifiers has all lax and pseudo limits (whereas it may not have all strict 2-limits).
A 2-category with biproducts, biequalizers and bicotensors is bicategorically complete. Equivalently, it admits all bilimits if and only if for all 2-functors , from a small ordinary category , the above mentioned birepresentation exists.
Street’s construction of an arbitrary bilimit requires a descent object of a 3-truncated bicosimplicial object in . An appropriate modification of the arguments exhibits lax and pseudo limits as PIE limits.
These weaker forms of limits in 2-categories are fundamental for the theory of 2-categories and bicategories. Many important constructions such as the Eilenberg-Moore object as well as the Grothendieck construction on a fibration, arise as lax/oplax limits. They are also crucial in 2-monad theory, for example when studying categories of (strict) algebras with non-strict (pseudo or even lax/oplax) morphisms, which are more common in nature.
A new Spanish language mathematical magazine has been launched: universo.math. Hispanophones should check out the first issue! There are some very interesting looking articles which cover areas from art through politics to research-level mathematics.
The editor-in-chief is my mathematical brother Jacob Mostovoy and he wants it to be a mix of the Mathematical Intelligencer, the Notices of the AMS and the New Yorker, together with less orthodox ingredients; the aim is to keep the quality high.
Besides Jacob, the contributors to the first issue that I recognise include Alberto Verjovsky, Ernesto Lupercio and Edward Witten, so universo.math seems to be off to a high quality start.
Guest post by Bruce Bartlett
The following is the greatest math talk I’ve ever watched!
- Etienne Ghys (with pictures and videos by Jos Leys), Knots and Dynamics, ICM Madrid 2006. [See below the fold for some links.]
I wasn’t actually at the ICM; I watched the online version a few years ago, and the story has haunted me ever since. Simon and I have been playing around with some of this stuff, so let me share some of my enthusiasm for it!
The story I want to tell here is how, via the modular flow of lattices in the plane, certain matrices in $\mathrm{SL}(2,\mathbb{Z})$ give rise to knots in the 3-sphere less a trefoil knot. Despite possibly sounding quite scary, this can be easily explained in an elementary yet elegant fashion.
As promised above, here are some links related to Ghys’ ICM talk.
- Video — watch this now!
- Slides — the (ungarbled) slides for the talk
- Web version with animations — a great summary
- Article — the accompanying article with details
- More pictures — from Jos Leys’s webpage
- More animations — from Jos Leys’s webpage
- Article by Dana Lyons — a great summary with historical context and interviews
I’m going to focus on the last third of the talk — the modular flow on the space of lattices. That’s what produced the beautiful picture above (credit for this and other similar pics below goes to Jos Leys; the animation is Simon’s.)
Lattices in the plane
For us, a lattice is a discrete subgroup of . There are three types: the zero lattice, the degenerate lattices, and the nondegenerate lattices:
Given a lattice $L \subset \mathbb{C}$ and an integer $k$ we can calculate a number — the Eisenstein series of the lattice: $G_{k}(L) := \sum_{\omega \in L,\ \omega \ne 0} \frac{1}{\omega^{k}}.$ We need $k \ge 3$ for this sum to converge. For, roughly speaking, we can rearrange it as a sum over $r$ of the lattice points on the boundary of a square of radius $r$. The number of lattice points on this boundary scales with $r$, so we end up computing something like $\sum_{r} r \cdot r^{-k} = \sum_{r} r^{1-k}$ and so we need $k \ge 3$ to make the sum converge.
Note that $G_{k} = 0$ for odd $k$ since every term $\omega^{-k}$ is cancelled by the opposite term $(-\omega)^{-k}$. So, the first two nontrivial Eisenstein series are $G_{4}$ and $G_{6}$. We can use them to put `Eisenstein coordinates’ on the space of lattices.
Theorem: The map $L \mapsto (G_{4}(L), G_{6}(L))$ is a bijection from the space of lattices in $\mathbb{C}$ to $\mathbb{C}^{2}$.
The nicest proof is in Serre’s A Course in Arithmetic, p. 89. It is a beautiful application of the Cauchy residue theorem, using the fact that and define modular forms on the upper half plane . (Usually, number theorists set up their lattices so that they have basis vectors and where . But I want to avoid this ‘upper half plane’ picture as far as possible, since it breaks symmetry and mystifies the geometry. The whole point of the Ghys picture is that not breaking the symmetry reveals a beautiful hidden geometry! Of course, sometimes you need the ‘upper half plane’ picture, like in the proof of the above result.)
Lemma: The degenerate lattices are the ones satisfying $20\, G_{4}^{3} = 49\, G_{6}^{2}$.
Let’s prove one direction of this lemma — that the degenerate lattices do indeed satisfy this equation. To see this, we need to perform a computation. Let’s calculate $G_{4}$ and $G_{6}$ of the lattice $\mathbb{Z} \subset \mathbb{C}$. Well, $G_{4}(\mathbb{Z}) = \sum_{n \ne 0} \frac{1}{n^{4}} = 2\zeta(4) = \frac{\pi^{4}}{45},$ where we have cheated and looked up the answer on Wikipedia! Similarly, $G_{6}(\mathbb{Z}) = 2\zeta(6) = \frac{2\pi^{6}}{945}$.
So we see that $20\, G_{4}(\mathbb{Z})^{3} = \frac{4\pi^{12}}{18225} = 49\, G_{6}(\mathbb{Z})^{2}$. Now, every degenerate lattice is of the form $t \cdot \mathbb{Z}$ where $t \in \mathbb{C}^{*}$. Also, if we transform the lattice via $L \mapsto t L$, then $G_{4} \mapsto t^{-4} G_{4}$ and $G_{6} \mapsto t^{-6} G_{6}$, so both sides of the equation scale by $t^{-12}$. So the equation remains true for all the degenerate lattices, and we are done.
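If you prefer not to trust Wikipedia, the two zeta values and the degenerate-lattice equation can be checked by brute-force summation. A small sketch (the function `G` is mine; I am using the normalization $20 G_4^3 = 49 G_6^2$, which is equivalent to the classical $g_2^3 - 27 g_3^2 = 0$ with $g_2 = 60 G_4$ and $g_3 = 140 G_6$):

```python
import math

# Eisenstein series of the degenerate lattice Z inside C, by direct summation:
# G_k(Z) = sum of 1/n^k over nonzero integers n, i.e. 2*zeta(k) for even k.
def G(k, N=10000):
    return 2 * sum(1.0 / n ** k for n in range(1, N + 1))

g4, g6 = G(4), G(6)

print(abs(g4 - math.pi ** 4 / 45))       # ~ 0: matches 2*zeta(4)
print(abs(g6 - 2 * math.pi ** 6 / 945))  # ~ 0: matches 2*zeta(6)
print(abs(20 * g4 ** 3 - 49 * g6 ** 2))  # ~ 0: the degenerate-lattice relation
```

The partial sums converge quickly because the tails decay like $N^{-3}$ and $N^{-5}$.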
Corollary: The space of nondegenerate lattices in the plane of unit area is homeomorphic to the complement of the trefoil in $S^{3}$.
The point is that given a lattice $L$ of unit area, we can rescale it, $L \mapsto t L$, until $(G_{4}(L), G_{6}(L))$ lies on the 3-sphere $S^{3} \subset \mathbb{C}^{2}$. And the equation $20\, G_{4}^{3} = 49\, G_{6}^{2}$ intersected with $S^{3}$ cuts out a trefoil knot… because it is “something cubed equals something squared”. And the lemma above says that the nondegenerate lattices are precisely the ones which do not satisfy this equation, i.e. they represent the complement of this trefoil.
Since we have not divided out by rotations, but only by scaling, we have arrived at a 3-dimensional picture which is very different to the 2-dimensional moduli space (upper half-plane divided by ) picture familiar to a number theorist.
The modular flow
There is an intriguing flow on the space of lattices of unit area, called the modular flow. Think of the lattice as sitting in $\mathbb{R}^{2}$, and then act on $\mathbb{R}^{2}$ via the transformation $\begin{pmatrix} e^{t} & 0 \\ 0 & e^{-t} \end{pmatrix}, \quad t \in \mathbb{R},$ dragging the lattice along for the ride. (This isn’t just some formula we pulled out of the blue — geometrically this is the ‘geodesic flow on the unit tangent bundle of the modular orbifold’.)
We are looking for periodic orbits of this flow.
“Impossible!” you say. “The points of the lattice go off to infinity!” Indeed they do… but disregard the individual points. The lattice itself can ‘click’ back into its original position:
How are we to find such periodic orbits? Start with an integer matrix $A \in \mathrm{SL}(2,\mathbb{Z})$ and assume $A$ is hyperbolic, which simply means $|\operatorname{tr} A| > 2$. Under these conditions, we can diagonalize $A$ over the reals, so we can find a real matrix $P$ such that $P A P^{-1} = \begin{pmatrix} e^{T} & 0 \\ 0 & e^{-T} \end{pmatrix}$ for some $T$. Now set $L = P\,\mathbb{Z}^{2}$. We claim that $L$ is a periodic orbit of period $T$. Indeed: $\begin{pmatrix} e^{T} & 0 \\ 0 & e^{-T} \end{pmatrix} L = P A P^{-1} P\, \mathbb{Z}^{2} = P A\, \mathbb{Z}^{2} = P\, \mathbb{Z}^{2} = L,$ since $A$, being invertible over the integers, preserves $\mathbb{Z}^{2}$. We have just proved one direction of the following.
Theorem: The periodic orbits of the modular flow are in bijection with the conjugacy classes of hyperbolic elements in $\mathrm{SL}(2,\mathbb{Z})$.
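To make the recipe concrete, here is a small sketch with the hyperbolic matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$ (the choice of matrix and all the helper code are mine, not from the talk):

```python
import math

# A hyperbolic element of SL(2,Z): integer entries, determinant 1, |trace| > 2.
A = [[2, 1], [1, 1]]
tr = A[0][0] + A[1][1]
assert A[0][0] * A[1][1] - A[0][1] * A[1][0] == 1 and abs(tr) > 2

# Real, positive, reciprocal eigenvalues: lam * mu = det(A) = 1.
disc = math.sqrt(tr * tr - 4)
lam, mu = (tr + disc) / 2, (tr - disc) / 2

# For this particular A, (1, lam - 2) and (1, mu - 2) are eigenvectors
# (the 2 is A[0][0]), so conjugating by P diagonalizes A.
P = [[1.0, 1.0], [lam - 2, mu - 2]]

def mul(X, Y):  # 2x2 matrix product
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv(X):     # 2x2 matrix inverse
    d = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / d, -X[0][1] / d], [-X[1][0] / d, X[0][0] / d]]

D = mul(inv(P), mul(A, P))   # = diag(lam, mu) up to rounding
T = math.log(lam)            # period of the corresponding closed orbit
print(T)                     # about 0.9624
```

Here $e^{T} = \lambda$ is the larger eigenvalue, so the period of the orbit is the logarithm of the eigenvalue, exactly as in the diagonalization argument above.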
These periodic orbits produce fascinating knots in the complement of the trefoil! In fact, they link with the trefoil (the locus of degenerate lattices) in fascinating ways. Here are two examples, starting with different matrices .
The trefoil is the fixed orange curve, while the periodic orbits are the red and green curves respectively.
Ghys proved the following two remarkable facts about these modular knots.
- The linking number of a modular knot with the trefoil of degenerate lattices equals the Rademacher function of the corresponding matrix in $\mathrm{SL}(2,\mathbb{Z})$ (the change in phase of the Dedekind eta function).
- The knots occurring in the modular flow are the same as those occurring in the Lorenz equations!
Who would have thought that lattices in the plane could tell the weather!!
I must say I have thought about many aspects of these closed geodesics, but it had never crossed my mind to ask which knots are produced. – Peter Sarnak
Guest post by Sean Moss
In this post I shall discuss the paper “On a Topological Topos” by Peter Johnstone. The basic problem is that algebraic topology needs a “convenient category of spaces” in which to work: the category of topological spaces has few good categorical properties beyond having all small limits and colimits. Ideally we would like a subcategory, containing most spaces of interest, which is at least cartesian closed, so that there is a useful notion of function space for any pair of objects. A popular choice for a “convenient category” is the full subcategory of $\mathbf{Top}$ consisting of compactly-generated spaces. Another approach is to weaken the notion of topological space, i.e. to embed $\mathbf{Top}$ into a larger category, hopefully with better categorical properties.
A topos is a category with enough good properties (including cartesian closedness) that it acts like the category of sets. Thus a topos acts like a mathematical universe with ‘sets’, ‘functions’ and its own internal logic for manipulating them. It is exciting to think that if a “convenient topos of spaces” could be found, then its logical aspects could be applied to the study of its objects. The set-like nature of toposes might make it seem unlikely that this can happen. For instance, every topos is balanced, but the category of topological spaces is famously not. However, sheaves (the objects in Grothendieck toposes) originate from geometry and already behave somewhat like generalized spaces.
I shall begin by elaborating on this observation about Grothendieck toposes, and briefly examine some previous attempts at a “topological topos”. I shall explore the idea of replacing open sets with convergent sequences and see how this leads us to Johnstone’s topos. Finally I shall describe how Johnstone’s topos is a good setting for homotopy theory.
I would like to thank Emily Riehl for organizing the Kan Extension Seminar and for much useful feedback. Thanks go also to the other seminar participants for being a constant source of interesting perspectives, and to Peter Johnstone, for his advice and for writing this incredible paper.
Are there toposes of spaces?
We shall need to be flexible about what we mean by “space”. For the rest of this post I shall try to use the term “topological space” in a strict technical sense (a set of points plus specified open sets), whereas “space” will be a nebulous concept. The idea is that spaces have existence regardless of having been implemented as a topological space or not, and may naturally have more (or perhaps less) structure. Topology merely forms one setting for their rigorous study. In topology we can only detect the topological properties of spaces. For example, and are isomorphic as topological spaces, but they are far from being the same space: consider how different their implementations as, say, metric spaces are. Some spaces are naturally considered as having algebraic or smooth structure. The type of question one wishes to ask about a space will bear upon the type of object as which it should be implemented.
An extremely important class of toposes consists of the Grothendieck toposes, which are categories of sheaves on a site. A site is a small category together with a Grothendieck coverage (also known as a Grothendieck topology). Informally, the Grothendieck coverage tells us how some objects can be “covered” by maps coming out of other objects. In the special case where the site is a topological space, the objects are open sets and the coverage tells us that an open set is covered by the inclusions of any family of open sets whose union is all of that open set. A sheaf on a site is then a contravariant $\mathbf{Set}$-valued functor on the underlying category (a presheaf) which satisfies a “unique patching” condition with respect to each covering sieve.
In the following two senses, a Grothendieck topos always behaves like a category of spaces:
(A) One way to describe the properties of a space is to consider the maps into that space. This is the idea behind the homotopy groups, where we consider (homotopy classes of) maps from the -sphere into a space. Given a small category , each object is determined by knowing all the arrows into it and how these arrows “restrict” along other arrows, i.e. precisely the data of the representable presheaf. A non-representable presheaf can be viewed as a generalized object of , which is testable by the ‘classical’ objects of : it is described entirely by what the maps into it from objects of ought to be. If we have in mind that is some category of spaces, with some sense in which some spaces are covered by families of maps out of other spaces, (i.e. we have a Grothendieck coverage), then we should be able to patch maps into these generalized spaces together. So the topos of sheaves on this site is a setting in which we may be able to implement certain spaces, if we wish to study their properties testable by objects of .
(B) The category of presheaves on a small category C is its free cocompletion. Intuitively, it is the category of objects obtained by formally gluing together objects of C. The use of the word “gluing” is itself a spatial metaphor. CW-complexes are built by gluing together cells - simplicial sets are instructions for carrying out this gluing. Manifolds are built from gluing together open subsets of Euclidean space. Purely formal ‘gluing’ is not quite sufficient: the Yoneda embedding of C into its presheaves typically does not preserve any colimits already in C. But if C is a category of spaces, its objects are not neutral with respect to each other: there may be a suitable Grothendieck coverage on C which tells us how some objects can cover others. The topos of sheaves is then the category of objects obtained by formally gluing objects of C in a way that respects these coverings. This is strongly connected with the preservation of colimits by the embedding of C into the sheaves. Colimits in the presheaf topos are constructed pointwise; to get the sheaf colimit one applies the reflection into the category of sheaves (“sheafification”) to the presheaf colimit. The more covers imposed on C, the more work is done by the sheafification, so the closer we end up to the original colimit.
Are there toposes in topology?
It is far from clear that we can choose a site for which the space-like behaviour of sheaves accords with the usual topological intuition. If we want to use a topological topos for homotopy theory, then ideally it should contain objects that we can recognize as the CW-complexes, and we should be able to construct them via more or less the usual colimits.
Attempt 1: The “gros topos” of Giraud
The idea is to take sheaves on the ‘site’ of topological spaces, where covers are given by families of open inclusions of subspaces whose union is the whole space. We do not automatically get a topos unless the site is small, so instead take some small, full subcategory of Top which is closed under open subspaces. The gros topos is the topos of sheaves for this site.
The Yoneda embedding exhibits the site as a subcategory of the gros topos, and in fact we can ‘embed’ the whole of Top via the functor sending a space X to the presheaf U ↦ Top(U, X); this will be full and faithful on a fairly large subcategory. By (B) one may like to consider the gros topos as the category of spaces glued together from objects of the site. This turns out not to be useful, since the site does not have enough covers for colimits to agree with those in Top. Moreover the site is so large that calculations are difficult.
Attempt 2: Lawvere’s topos
We use observation (A). Motivated by the use of paths in homotopy theory, we take the site to be the full subcategory of Top whose only object is the closed unit interval I = [0,1]. So the site is (the one-object category corresponding to) the monoid of continuous endomorphisms of I. Lawvere’s topos is the topos of sheaves on this site with respect to the canonical Grothendieck coverage (the largest Grothendieck coverage for which the representable presheaf is a sheaf).
Then an object of Lawvere’s topos is a set of paths, together with, for any continuous map γ: I → I, a reparametrization map, where this assignment is functorial. The points of such a space are given by maps from the terminal object, i.e. ‘constant paths’, or paths which are fixed by every reparametrization. We can see which point a path visits at time t by reparametrizing that path by the constant map with value t. A word of caution: a given object in Lawvere’s topos may have distinct paths which agree on points for all time.
This site is much easier to calculate with than the gros site (once we have a handle on the canonical coverage). Again there is a functor from Top given by X ↦ Top(I, X), which is full and faithful on a fairly large subcategory (including the CW-complexes). However, it is still the case that the site could do with more covers: the functor does not preserve all the colimits used to build up CW-complexes. By observation (B), an object of Lawvere’s topos is obtained by gluing together copies of the unit interval, so it is possible to construct the circle out of copies of I, but we cannot do this in the usual way. The coequalizer in the topos of the two endpoint inclusions of the point into I is not the circle, but a “signet-ring”: it is a circle with a ‘lump’, through which a path can cross only if it waits there for a non-zero amount of time. We cannot solve this problem by adding in more covers, because the coverage is already canonical (adding in more covers would evict the representable I from the topos).
The key idea in Johnstone’s topos is to replace paths with convergent sequences. Given a topological space X, a convergent sequence in X is a function x from ℕ ∪ {∞} to X such that whenever U is an open set containing x(∞), there exists an N such that x(n) ∈ U for all n ≥ N. The convergent sequences are precisely the continuous maps out of ℕ ∪ {∞} when we give it the topology that makes it the one-point compactification of the discrete space ℕ - we denote this topological space by ℕ⁺.
Convergent sequences as primitive
It is a basic theorem in general topology that, given a function between topological spaces, if it is continuous then it preserves convergent sequences. The converse is not true for general topological spaces, but it is true whenever the domain is a sequential space. Given a topological space X, a set U ⊆ X is sequentially open if any convergent sequence whose limit lies in U is eventually in U. (Clearly any open subset is sequentially open.) A topological space is then said to be sequential if all of its sequentially open sets are open. The sequential spaces include all first-countable spaces, and in fact they can be characterized as the topological quotients of metrizable spaces, so they certainly include all CW-complexes.
The notion of convergent sequence is arguably more intuitive than that of open set. For example, each convergent sequence gives you concrete data about the nearness of some family of points to another point, whereas open sets only give you such data when the topology (or at least a neighbourhood basis) is considered as a whole. It would be compelling to define a continuous function as one that preserves convergent sequences. This motivates the study of subsequential spaces.
A subsequential space consists of a set (of points) and a family of “convergent sequences”: a specified set of pairs consisting of a sequence of points and a point (its limit), such that:
- for every point x, the constant sequence at x converges to x;
- if a sequence converges to x, then so does every subsequence of it;
- if (xₙ) is a sequence and x is a point such that every subsequence of (xₙ) contains a (sub)subsequence converging to x, then (xₙ) converges to x.
The third axiom is the general form of intertwining two or more sequences with the same limit or changing a finite initial segment of a sequence. Note that there is no ‘Hausdorff’-style condition on the convergent sequences: a sequence may converge to more than one limit. A continuous map between subsequential spaces is a function from the points of one to the points of the other that preserves convergence of sequences.
The axioms above are all true of the set of convergent sequences which arise from a topology on a set. In fact, this process gives a full and faithful embedding of sequential spaces into subsequential spaces. Thus sequential spaces live inside both topological and subsequential spaces. They are coreflective in the former and reflective in the latter: given a topological space, its sequentially open sets constitute a new (finer) topology; given a subsequential space, we can consider the “sequentially open” sets with respect to its convergent sequences, and then take all convergent sequences in the resulting (sequential) topological space. Observe that the sense of the adjunction in each case comes from the fact that we either throw in more open sets - so there is a natural continuous map from the coreflection to the original space - or throw in more convergent sequences - so there is a natural map from the original space to the reflection.
In the following I shall denote the category of sequential spaces by Seq and that of subsequential spaces by SubSeq.
Let Σ be the full subcategory of Top on the objects 1 (the singleton space) and ℕ⁺ (the one-point compactification of the discrete space of natural numbers). The arrows in this category can be described without topology as well: as functions, the continuous maps ℕ⁺ → ℕ⁺ are the eventually constant ones and the ones that “tend to infinity”.
Given an infinite subset A ⊆ ℕ, let f_A : ℕ⁺ → ℕ⁺ denote the unique order-preserving injection whose image is A ∪ {∞}. One can check that there is a Grothendieck coverage on this category where 1 is covered only by the maximal sieve, and where ℕ⁺ is covered by any sieve R such that:
- R contains all of the points 1 → ℕ⁺;
- for any infinite subset A ⊆ ℕ there exists an infinite subset B ⊆ A such that f_B ∈ R.
The topos is then defined to be the category of sheaves on this site.
The objects in our topos are a slight generalization of subsequential space. If X is a sheaf (writing ℕ⁺ for the one-point compactification of the discrete natural numbers), then X(1) is its set of points, and X(ℕ⁺) is its set of convergent sequences. Each point n of ℕ⁺ induces a ‘projection map’ X(ℕ⁺) → X(1), giving you the point of the sequence at time n. The unique map ℕ⁺ → 1 induces a map X(1) → X(ℕ⁺), which sends each point to a canonical choice of constant sequence. Note that there may be more than one convergent sequence with the same points, thus it may be helpful to think of X(ℕ⁺) as the set of proofs of convergence for sequences.
Clearly we can embed the category of subsequential spaces into the topos: the points are the same, and the convergence proofs are just the convergent sequences. The first two axioms are satisfied because of the equations that hold in the site. The third axiom is encoded into the coverage. Conversely, any sheaf for which the projection maps are jointly injective is isomorphic to one coming from a subsequential space. There is a functor from Top to the topos sending a space to its points and convergent sequences, and it is indeed sheaf-valued since it is equal to the composite of the sequential coreflection with the inclusions just described. In fact, the Grothendieck coverage defining the topos is canonical, so it is the largest for which this functor is well-defined.
We can use observation (B) to think of the topos as consisting of all spaces constructed from gluing sequences together. It is just about possible that we could have motivated the construction this way: classically, any sequential space is a quotient in Top of a metrizable space, which may be taken to be a disjoint union of copies of ℕ⁺ - one for every convergent sequence in the space. Compare this with the canonical representation of a presheaf as a colimit of representables (one for each of its elements).
It turns out that the category of subsequential spaces is the subcategory of ¬¬-separated objects in the topos, hence it is a reflective subcategory. The category of sequential spaces is reflective in that of subsequential spaces, hence it is also reflective in the topos. In particular, all limits of sequential spaces are preserved by the inclusion into the topos. Take some caution, however, since products do not agree with those in Top: one has to take the sequential coreflection of the topological product. This is only a minor issue; having to modify the product arises in other “convenient categories” such as that of compactly generated spaces.
The colimits in the category of sequential spaces do agree with those in Top, because it is a coreflective subcategory. Surprisingly, the inclusion into the topos preserves many of these colimits.
Theorem Let X be a sequential space, and let {Uᵢ} be an open cover of X. Then the obvious colimit diagram in Top (exhibiting X as a colimit of the Uᵢ and their pairwise intersections) is preserved by the embedding into the topos.
Proof The recipe for this sort of theorem is: take the colimit in presheaves, show that the comparison map is monic, then show that it is dense for the coverage, for then it will exhibit X as the colimit upon reflecting into the topos. The colimit in presheaves is calculated “objectwise”, so it has the same points as X, but only those convergent sequences which lie entirely within some Uᵢ (hence the comparison map is monic). To sheafify, we need to add in all those sequences which are locally in the colimit, i.e. those for which the sieve of restrictions landing in some Uᵢ is covering. For any convergent sequence, this sieve clearly contains all of its points. But the sequence must also be eventually within one of the Uᵢ (namely any one containing its limit), so the second condition for the covering sieves is also satisfied.
There are several other colimit preservation results one can talk about (with similar proofs to the above). The amazing consequence of these is that the colimits used to construct CW-complexes are all preserved by the embedding. Thus classical homotopy theory embeds into the topos, and we have successfully found a topos of spaces which agrees with the classical theory.
Let Δ be the category of non-zero finite ordinals and order-preserving maps. Then objects of the presheaf category on Δ are known as simplicial sets.
Theorem The topos of simplicial sets is the classifying topos for intervals in Grothendieck toposes.
The closed unit interval [0,1] is sequential, and in the topos it is in fact an interval (a totally ordered object with distinct top and bottom elements). Thus it corresponds to a geometric morphism from Johnstone’s topos to simplicial sets (an adjunction whose left adjoint is left-exact).
The usual geometric realization is not left-exact if considered to take values in Top; one must choose a “convenient subcategory” first, and then there is some work to do in proving it. Here the left-exactness just arises out of the general theory of geometric morphisms. Should we wish to do so, the above method allows us to replace [0,1] with any other object that the internal logic of the topos sees as an interval, to get a different realization of simplicial sets.
The above is far from a complete survey of “On a Topological Topos”, which contains several more results of interest relating to the topos, and captures the elegance of using this small site for calculation - I thoroughly recommend taking a look if you know some topos theory. We have seen enough, though, to understand that for many spaces the sequential properties align with the topological properties. Unfortunately, the topos is yet to receive the attention it deserves.
(My university probably isn’t alone in encouraging mathematicians and computer scientists to embrace the idea of “big data”, or in more sober terminology, “data science”. Here, Nils Carqueville and Daniel Murfet introduce their really excellent article on big data in whole-population surveillance. —TL)
A particularly far-reaching development made possible by Big Data is that of unprecedented mass surveillance. As Alexander Beilinson, Stefan Forcey, Tom Leinster and others have pointed out, the role that mathematicians and computer scientists play here is central. With this in mind, last January we wrote an essay on
stressing some of those aspects of the matter that we think deserve more attention or additional elaboration. We hope this to be a useful contribution to the necessary discussion on modern mass surveillance, and we thank Tom for his efforts in this direction, and for allowing us to post here.
EDIT: New Submission Deadline: 15 April 2014, Notification of Acceptance: 30 April 2014
A friend, John Wigglesworth, has asked me to announce the following conference. Perhaps he’ll let us know if pluralist homotopy type theorists are welcome ;).
I wonder if any progress has been made on my MO ‘multiverse’ question.
CFP: Symposium on the Foundations of Mathematics, Kurt Gödel Research Center, University of Vienna.
Set theory is taken to serve as a foundation for mathematics. But it is well-known that there are set-theoretic statements that cannot be settled by the standard axioms of set theory. The Zermelo-Fraenkel axioms, with the Axiom of Choice (ZFC), are incomplete. The primary goal of this symposium is to explore the different approaches that one can take to the phenomenon of incompleteness.
One option is to maintain the traditional “universe” view and hold that there is a single, objective, determinate domain of sets. Accordingly, there is a single correct conception of set, and mathematical statements have a determinate meaning and truth-value according to this conception. We should therefore seek new axioms of set theory to extend the ZFC axioms and minimize incompleteness. It is then crucial to determine what justifies some new axioms over others.
Alternatively, one can argue that there are multiple conceptions of set, depending on how one settles particular undecided statements. These different conceptions give rise to parallel set-theoretic universes, collectively known as the “multiverse”. What mathematical statements are true can then shift from one universe to the next. From within the multiverse view, however, one could argue that some universes are preferable to others.
These different approaches to incompleteness have wider consequences for the concepts of meaning and truth in mathematics and beyond. The conference will address these foundational issues at the intersection of philosophy and mathematics. The primary goal of the conference is to showcase contemporary philosophical research on different approaches to the incompleteness phenomenon.
To accomplish this, the conference has the following general aims and objectives:
(1) To bring to a wider philosophical audience the different approaches that one can take to the set-theoretic foundations of mathematics. (2) To elucidate the pressing issues of meaning and truth that turn on these different approaches. (3) To address philosophical questions concerning the need for a foundation of mathematics, and whether or not set theory can provide the necessary foundation.
Date and Venue: 7-8 July 2014 - Kurt Gödel Research Center, Vienna
Confirmed Speakers: Sy-David Friedman (Kurt Gödel Research Center for Mathematical Logic), Hannes Leitgeb (Munich Center for Mathematical Philosophy)
Call for Papers: We welcome submissions from scholars (in particular, young scholars, i.e. early career researchers or post-graduate students) on any area of the foundations of mathematics (broadly construed). Particularly desired are submissions that address the role of set theory in the foundations of mathematics, or the foundations of set theory (universe/multiverse dichotomy, new axioms, etc.) and related ontological and epistemological issues. Applicants should prepare an extended abstract (maximum 1500 words) for blind review, and send it to sotfom [at] gmail [dot] com. The successful applicants will be invited to give a talk at the conference and will be refunded the cost of accommodation in Vienna for two days (7-8 July).
Submission Deadline: 31 March 2014
Notification of Acceptance: 30 April 2014
Scientific Committee: Philip Welch (University of Bristol), Sy-David Friedman (Kurt Gödel Research Center), Ian Rumfitt (University of Birmingham), John Wigglesworth (London School of Economics), Claudio Ternullo (Kurt Gödel Research Center), Neil Barton (Birkbeck College), Chris Scambler (Birkbeck College), Jonathan Payne (Institute of Philosophy), Andrea Sereni (Università Vita-Salute S. Raffaele), Giorgio Venturi (Université de Paris VII, “Denis Diderot” - Scuola Normale Superiore)
Organisers: Sy-David Friedman (Kurt Gödel Research Center), John Wigglesworth (London School of Economics), Claudio Ternullo (Kurt Gödel Research Center), Neil Barton (Birkbeck College), Carolin Antos (Kurt Gödel Research Center)
Conference Website: sotfom [dot] wordpress [dot] com
Further Inquiries: please contact
Claudio Ternullo (ternulc7 [at] univie [dot] ac [dot] at)
Neil Barton (bartonna [at] gmail [dot] com)
John Wigglesworth (jmwigglesworth [at] gmail [dot] com)
Nina Otter is a master’s student in mathematics at ETH Zürich who has just gotten into the PhD program at Oxford. She and I are writing a paper on operads and the tree of life.
Anyone who knows about operads knows that they’re related to trees. But I’m hoping someone has proved some precise theorems about this relationship, so that we don’t have to.
By operad I’ll always mean a symmetric topological operad. Such a thing has an underlying ‘symmetric collection’, which in turn has an underlying ‘collection’. A collection is just a sequence of topological spaces Xₙ, one for each natural number n. In a symmetric collection, each space Xₙ has an action of the symmetric group Sₙ.
I’m hoping that someone has already proved something like this:
Conjecture 1. The free operad on the terminal symmetric collection has, as its space of n-ary operations, the set of rooted trees having some of their leaves labelled 1, …, n.
Conjecture 2. The free operad on the terminal collection has, as its space of n-ary operations, the set of planar rooted trees having some of their leaves labelled 1, …, n.
Calling them ‘conjectures’ makes it sound like they might be hard — but they’re not supposed to be hard. If they’re right, they should be easy and in some sense well-known! But there are various slightly different concepts of ‘rooted tree’ and ‘rooted planar tree’, and we have to get the details right to make these conjectures true. For example, a graph theorist might draw a rooted planar tree like this:
while an operad theorist might draw it like this:
If the conjectures are right, we can use them to define the concepts of ‘rooted tree’ and ‘rooted planar tree’, thus side-stepping these details. And having purely operad-theoretic definitions of ‘tree’ and ‘rooted tree’ would make it a lot easier to use these concepts in operad theory. That’s what I want to do, ultimately. But proving these conjectures, and of course providing the precise definitions of ‘rooted tree’ and ‘rooted planar tree’ that make them true, would still be very nice.
And it would be even nicer if someone had already done this. So please provide references… and/or correct mistakes in the following stuff!
Definition. For any natural number n, an n-tree is a quadruple (V, E, s, t) where:
- V is a finite set whose elements are called internal vertices;
- E is a finite non-empty set whose elements are called edges;
- s: E → V ⊔ {1, …, n} and t: E → V ⊔ {0} are maps sending any edge to its source and target, respectively.
Given vertices u and v, we write e: u → v if e is an edge such that s(e) = u and t(e) = v.
This data is required to satisfy the following conditions:
- s is a bijection;
- there exists exactly one edge e such that t(e) = 0;
- for any vertex v there exists a directed edge path from v to 0: that is, a sequence of edges e₁, …, e_k such that s(e₁) = v, t(eᵢ) = s(eᵢ₊₁) for i < k, and t(e_k) = 0.
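The conditions above are all finitely checkable. Here is a minimal sketch in Python of a validator for the definition as I have reconstructed it (the name `is_n_tree` and the dict encoding of the source and target maps are my own choices, and I am assuming the reading on which exactly one edge has the root 0 as its target):

```python
def is_n_tree(n, V, E, s, t):
    """Check the n-tree conditions for a quadruple (V, E, s, t).

    V: set of internal vertices; E: set of edge names;
    s, t: dicts giving each edge's source and target.
    The full vertex set is V together with the root 0 and leaf labels 1..n.
    """
    leaves = set(range(1, n + 1))
    if not E:
        return False  # E must be non-empty
    # s must be a bijection from E onto V ∪ {1,...,n}
    if set(s.values()) != V | leaves or len(set(s.values())) != len(E):
        return False
    # exactly one edge has the root 0 as its target
    if sum(1 for e in E if t[e] == 0) != 1:
        return False
    # every vertex in V ∪ {1,...,n} admits a directed edge path to 0
    edge_from = {s[e]: e for e in E}  # total, since s is a bijection
    for v in V | leaves:
        seen, u = set(), v
        while u != 0:
            if u in seen:
                return False  # a cycle, so no path to the root
            seen.add(u)
            u = t[edge_from[u]]
    return True

# The 2-ary corolla as an operad theorist might draw it: leaves 1 and 2
# are joined to a pre-root vertex 'v', with a single edge down to the root 0.
V, E = {'v'}, {'e1', 'e2', 'e0'}
s = {'e1': 1, 'e2': 2, 'e0': 'v'}
t = {'e1': 'v', 'e2': 'v', 'e0': 0}
print(is_n_tree(2, V, E, s, t))  # True
```

Note that the example drawn at the end includes the ‘pre-root’ vertex discussed below, which is forced by the unique-edge-into-the-root condition as I have read it.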
So the idea is that our tree has V ⊔ {0, 1, …, n} as its set of vertices. There could be lots of leaves, but we’ve labelled some of them by the numbers 1, …, n. In our pictures, the source of each edge is at its top, while the target is at bottom.
There is a root, called 0, but also a ‘pre-root’: the unique vertex with an edge going from it to 0. I’m not sure I like this last bit, and we might be able to eliminate this redundancy, but it’s built into the operad theorist’s picture here:
It might be a purely esthetic issue. Like everything else, it gets a bit more scary when we consider degenerate special cases.
I’m hoping there’s an operad whose set of n-ary operations consists of isomorphism classes of n-trees as defined above. I’m hoping someone has already proved this. And I hope someone has characterized this operad in a more algebraic way, as follows.
Definition. A symmetric collection consists of topological spaces Xₙ, one for each natural number n, together with a continuous action of the symmetric group Sₙ on each space Xₙ. A morphism of symmetric collections consists of an Sₙ-equivariant continuous map Xₙ → Yₙ for each n. Symmetric collections and morphisms between them form a category.
(More concisely, if we denote the groupoid of sets of the form {1, …, n} and bijections between these as S, then the category of symmetric collections is the category of functors from S to Top.)
There is a forgetful functor from operads to symmetric collections
with a left adjoint
assigning to each symmetric collection the operad freely generated by it.
Definition. Consider the terminal operad: that is, the operad, unique up to isomorphism, whose space of n-ary operations is a 1-element set for each n.
The algebras of this operad are commutative topological monoids.
Conjecture 1. There is a unique isomorphism between the free operad on the terminal symmetric collection and the operad of rooted trees described above, which for each n sends the unique n-ary operation to the corolla with n leaves: that is, the isomorphism class of n-trees with no internal vertices.
(When I say “the unique n-ary operation”, but treating it as an operation in the free operad, I’m using the fact that the unique operation of the terminal symmetric collection gives an element of the free operad on it, and thus an operation there.)
Planar Rooted Trees
And there should be a similar result relating planar rooted trees to collections without symmetric group actions!
Definition. A planar n-tree is an n-tree in which each internal vertex v is equipped with a linear order on the set of its children, i.e. the set of vertices u with an edge u → v.
I’m hoping someone has constructed an operad whose set of n-ary operations consists of isomorphism classes of planar n-trees. And I hope someone has characterized this operad as follows:
Definition. A collection consists of topological spaces Xₙ, one for each natural number n. A morphism of collections consists of a continuous map Xₙ → Yₙ for each n. Collections and morphisms between them form a category.
(If we denote the category with natural numbers as objects and only identity morphisms as N, then the category of collections is the category of functors from N to Top.)
There is a forgetful functor
with a left adjoint
assigning to each collection the operad freely generated by it. This left adjoint is the composite
where the first functor freely creates a symmetric collection from a collection by setting the n-th space to be Sₙ × Xₙ, and the second freely generates an operad from a symmetric collection, as described above.
Conjecture 2. There is a unique isomorphism between the free operad on the terminal collection and the operad of planar rooted trees described above, which for each n sends the unique n-ary operation to the corolla with n leaves ordered so that 1 < 2 < ⋯ < n.
Have you seen a proof of this stuff???
Standard logic involving the truth values ‘true’ and ‘false’ can make it difficult to model some of the fuzziness we use in everyday speech. If you’d bought a bike yesterday then today it would be truthful to say “This bike is new”, but it wouldn’t be truthful to say it in 20 years’ time. However, between now and then there won’t be a specific day on which the statement “This bike is new” suddenly switches from being true to being false. How can you model this situation?
One approach to modelling this situation is with fuzzy logic where you allow your truth values to be things other than just true and false. For instance, you can take the interval [0,1] as the set of truth values with 0 representing false and 1 representing true. So the truth degree of the statement “This bike is new” would vary, being 1 today and decreasing to something very close to 0 in 20 years’ time.
This post is an attempt by me to understand this fuzzy logic in the context of enriched category theory, in particular, using [0,1] as a monoidal category to enrich over. We will see that categories enriched over [0,1] can be interpreted as fuzzy posets or fuzzy preorders.
This was going to be a comment on Tom Avery’s Kan Extension Seminar post on Metric Spaces, Generalized Logic, and Closed Categories but grew too big!
I have been trying to understand a bit of fuzzy logic because it feeds into fuzzy concept analysis which I’m also trying to understand. As ever, I understand the category theory better than the logic. In terms of references, I’ve picked up bits and pieces from Belohlavek’s Concept lattices and order in fuzzy logic and Hájek’s Metamathematics of fuzzy logic.
Truth degrees in [0,1]: fuzzy logic
As mentioned above you can replace your set of truth values by the interval [0,1], with 0 representing completely false and 1 representing completely true. Thus an element of [0,1] represents a ‘degree’ of truth. The idea is that this represents the degree of truthfulness of statements like “George is old” or “The film is good”: it is not supposed to represent the probability of the truth of a statement like “I will receive an email from a student tomorrow”. We are modelling vagueness and not probability. Also, we are supposed to be modelling objectivity, so the truth degree of the statement “George is old” will depend on the age of George and not on anyone’s opinion of how old he seems to be.
Being category theorists, we might try to fit this into Lawvere’s framework of generalized logic, where we think of our truth values as forming a closed monoidal category. We can make [0,1] into a poset — and hence a category — by using the order ≤; this will be our notion of entailment. So if p and q are statements then p entails q precisely when the truth degree of p is less than or equal to the truth degree of q. However, I’m not entirely sure how to interpret that.
The observant amongst you will have noticed that this poset is isomorphic to the poset ([0,∞], ≥) we use for metric spaces, via the mutually inverse maps d ↦ e^(−d) and p ↦ −ln(p). So an alternative interpretation of these truth values will be as ‘proximities’ — proximity 0 approximately meaning not at all close, and proximity 1 approximately meaning really, really close. [Switching from distances in [0,∞] to proximities in [0,1] is an important step in calculating magnitudes of metric spaces, but I won’t say any more about that.]
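To make the correspondence concrete, here is a quick numerical sketch in Python (the function names are mine) of the order-reversing isomorphism between distances in [0,∞] and proximities in [0,1], under which addition of distances becomes multiplication of proximities:

```python
import math

def proximity(d):
    """Send a distance d in [0, ∞] to a proximity in [0, 1]."""
    return math.exp(-d)

def distance(p):
    """Inverse map: send a proximity p in (0, 1] back to a distance."""
    return -math.log(p)

d1, d2 = 0.5, 1.7

# The map is order-reversing: larger distances give smaller proximities.
assert d1 < d2 and proximity(d1) > proximity(d2)

# Addition of distances corresponds to multiplication of proximities,
# matching the monoidal structures + on [0, ∞] and × on [0, 1].
assert math.isclose(proximity(d1 + d2), proximity(d1) * proximity(d2))

print(proximity(0.0))  # 1.0 -- distance 0 means "really, really close"
```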
Next we want some notion of ‘conjunction’ ⊗ and ‘implication’ ⇒. In category-theoretic terms we want a closed monoidal structure on the category [0,1]. There are at least three such structures that are well studied.
The product structure: Here a ⊗ b = ab, and a ⇒ b = 1 if a ≤ b and b/a otherwise. Via the exponential and logarithm maps this corresponds to the usual closed monoidal structure on [0,∞] of addition and truncated subtraction.
The Gödel structure: Here a ⊗ b = min(a, b), and a ⇒ b = 1 if a ≤ b and b otherwise. Via the exponential and logarithm maps this corresponds to the closed monoidal structure on [0,∞] which gives rise to ultrametric spaces.
The Łukasiewicz structure: Here a ⊗ b = max(a + b − 1, 0) and a ⇒ b = min(1 − a + b, 1). Via the exponential and logarithm maps this corresponds to a monoidal structure on [0,∞] that doesn’t look at all familiar!
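As a sanity check on the three structures, this Python sketch (names mine) implements each conjunction together with its implication and verifies the closedness condition, namely that a ⊗ b ≤ c exactly when a ≤ (b ⇒ c), on a grid of truth values:

```python
def product(a, b):      return a * b
def product_imp(b, c):  return 1.0 if b <= c else c / b

def godel(a, b):        return min(a, b)
def godel_imp(b, c):    return 1.0 if b <= c else c

def lukasiewicz(a, b):      return max(0.0, a + b - 1.0)
def lukasiewicz_imp(b, c):  return min(1.0, 1.0 - b + c)

def is_adjoint(tensor, imp, eps=1e-9):
    """Closedness: a ⊗ b <= c iff a <= (b ⇒ c), up to floating-point noise."""
    grid = [i / 10 for i in range(11)]
    return all((tensor(a, b) <= c + eps) == (a <= imp(b, c) + eps)
               for a in grid for b in grid for c in grid)

for name, (tensor, imp) in {
    "product":     (product, product_imp),
    "Gödel":       (godel, godel_imp),
    "Łukasiewicz": (lukasiewicz, lukasiewicz_imp),
}.items():
    print(name, is_adjoint(tensor, imp))  # each structure passes the check
```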
If the truth degrees of two statements are a and b then the truth degree of the two statements together should be given by the conjunction a ⊗ b, but I don’t know examples of real-world models with these three different logics. Anybody? I would be very interested in hearing thoughts and ideas on this.
In all of the above three cases the closed monoidal category {false, true} of classical truth values embeds closed monoidally into [0,1].
Enriching over [0,1]: fuzzy preorders
We can now think about what categories enriched in [0,1] are, as generalizations of preorders. Such an enriched category consists of a set X and, for each pair of elements x and y, a truth degree ord(x, y); we want to think of this as the degree to which x is ordered before y. This is what we will call a fuzzy preorder. It will satisfy fuzzy transitivity: ord(x, y) ⊗ ord(y, z) ≤ ord(x, z). How this is actually interpreted will depend on which product we took.
Here’s an example of a fuzzy preorder. Suppose that Bart is aged 9, Marge is aged 39 and Homer is aged 40. Is it true that Homer is (at least) as old as Marge? Yes. Is it true that Marge is (at least) as old as Homer? Well, nearly. Is it true that Bart is (at least) as old as Marge? Well, not really at all. We can formalise this into a fuzzy preorder “is at least as old as” with, for instance, ord(Homer, Marge) = 1 and ord(Marge, Homer) = 39/40. Algebraically we could define ord(x, y) = min(age(x)/age(y), 1). This gives a fuzzy preorder on the set {Bart, Marge, Homer} with respect to the product monoidal structure on [0,1], or in other words, it gives a category enriched over the monoidal category ([0,1], ×).
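Spelling the example out in Python (a sketch; min(age(x)/age(y), 1) is one natural choice of formula), we can also check fuzzy transitivity for the product structure by brute force:

```python
age = {"Bart": 9, "Marge": 39, "Homer": 40}

def order(x, y):
    """Degree to which x is at least as old as y."""
    return min(1.0, age[x] / age[y])

print(order("Homer", "Marge"))  # 1.0: fully true
print(order("Marge", "Homer"))  # 0.975: very nearly true
print(order("Bart", "Marge"))   # ~0.23: not really true at all

# Fuzzy transitivity for the product structure:
# order(x, y) * order(y, z) <= order(x, z) for all x, y, z
# (with a tiny tolerance for floating-point rounding).
people = list(age)
assert all(order(x, y) * order(y, z) <= order(x, z) + 1e-12
           for x in people for y in people for z in people)
```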
There should be some more convincing examples, but I haven’t found any yet.
An exercise at this point is to look at the dictionary in my post on Galois correspondences and enriched categories and extend it by another column for categories enriched over [0,1].
Term is nearly over, which for me means the end of the 4th year Fourier Analysis course I’ve been teaching for the last couple of years.
I was fortunate enough to take over the course from Jim Wright, a genuine expert on the subject, and I inherited a great set of notes from him. But I felt the need to make the course my own, so I’ve been writing my own notes, which I’ve just finished: notes here, plus accompanying problem sheets. They’re mostly about convergence of Fourier series, with a delicious dessert of Fourier analysis on finite abelian groups.
But what I wanted to write about here — and get your opinions on — was not Fourier analysis, but some questions of teaching. This year, I’ve been (in the jargon) “flipping the classroom”, or at least partially flipping it (which reminds me of that mysterious substance, partially inverted sugar syrup, that you sometimes see on ingredients lists). I’d like to hear about other people’s similar experiences.
The allotted class time is two 50-minute “lectures” a week, plus one “workshop” (Edinburgh lingo for tutorial or exercise class) a fortnight. Here’s the routine for lectures:
At least a week before a given lecture, I make Latexed notes for that lecture available online.
The students are committed to spending at least an hour before each lecture reading over the notes.
I spend the first half of each lecture giving an overview of that portion of the notes, on the firm assumption that the students have done the prescribed amount of reading beforehand. Thus, I can spend time talking about the big picture rather than the mechanics, and I can concentrate on parts that seem likely to cause most difficulty rather than having to go through everything.
The second half of the lecture is interactive. We do different activities each time, e.g.:
- solving exercises (individually, in small groups, or all together)
- question and answer sessions
- definitions quizzes
- working on mathematical writing
- identifying hard parts of the course and having students explain them to each other.
In a completely flipped classroom, all the time would be taken up with interactive work. The first half of each class isn’t too far from a traditional lecture, except for the data-projected notes and the prior reading by the students. That’s why I said it’s only partially flipped.
It’s the first time I’ve done things quite like this, so this year has been pretty experimental. In particular, before every lecture, I’ve had to come up with an idea for the second half of the class, and evidently some have been more successful than others. I’m also running out of ideas! It’s a good thing it’s the end of term.
Have you ever taught in a similar way? If so, what worked well, and what didn’t? Can you share some ideas for classroom activities?
I’ve just submitted a piece for the new Opinions section of the monthly LMS Newsletter: Should mathematicians cooperate with GCHQ? (Update: now available (p.34).) The LMS is the London Mathematical Society, which is the UK’s national mathematical society. My piece should appear in the April edition of the newsletter, and you can read it below.
Here’s the story. Since November, I’ve been corresponding with people at the LMS, trying to find out what connections there are between it and GCHQ. Getting the answer took nearly three months and a fair bit of pushing. In the process, I made some criticisms of the LMS’s total silence over the GCHQ/NSA scandal:
GCHQ is a major employer of mathematicians in the UK. The NSA is said to be the largest employer of mathematicians in the world. If there had been a major scandal at the heart of the largest publishing houses in the world, unfolding constantly over the last eight months, wouldn’t you expect it to feature prominently in every issue of the Society of Publishers’ newsletter?
To its credit, the LMS responded by inviting me to write an inaugural piece for a new Opinions section of the newsletter. Here it is.
I had a 500-word limit, so I omitted a lot. Here are the facts on the LMS’s links with GCHQ, as stated to me by the LMS President Terry Lyons:
Should mathematicians cooperate with GCHQ?
One of the UK’s largest employers of mathematicians has been embroiled in a major international scandal for the last nine months, stands accused of law-breaking on an industrial scale, and is now the object of widespread outrage. How has the mathematical community responded? Largely by ignoring it.
GCHQ and its partners have been systematically monitoring as much of our lives as they possibly can, including our emails, phone calls, text messages, bank transactions, web browsing, Skype calls, and physical location. The goal: “collect it all”. They tap internet trunk cables, bug charities and political leaders, disrupt lawful activist groups, and conduct economic espionage, all under the banner of national security.
Perhaps most pertinently to mathematicians, the NSA (GCHQ’s major partner and partial funder) has deliberately undermined internet encryption, inserting a secret back door into a standard elliptic curve algorithm. This can be exploited by anyone sufficiently skilled and malicious — not only the NSA/GCHQ. (See Thomas Hales’s piece in February’s Notices of the AMS.) We may never know what else mathematicians have been complicit in; GCHQ’s policy is not to comment on intelligence matters, which is to say, anything it does.
Indifference to mass surveillance rests partly on misconceptions such as “it’s only metadata”. This is certainly false; for instance, GCHQ has used webcams to collect images, many sexually intimate, of millions of ordinary citizens. It is also misguided, even according to the NSA’s former legal counsel: “metadata absolutely tells you everything about somebody’s life”.
Some claim to be unbothered by the recording of their daily activities, confident that no one will examine their records. They may be right. But even if you feel that way, do you want the secret services to possess such a powerful tool for chilling dissent, activism, and even journalism? Do you trust an organization operating in secret, and subject to only “light oversight” (a GCHQ lawyer’s words), never to abuse that power?
Mathematicians seldom have to face ethical questions. But now we must decide: cooperate with GCHQ or not? It has been suggested that mathematicians today are in the same position as nuclear physicists in the 1940s. However, the physicists knew they were building a bomb, whereas mathematicians working for GCHQ may have little idea how their work will be used. Colleagues who have helped GCHQ in the past, trusting that they were contributing to legitimate national security, may justifiably feel betrayed.
At a bare minimum, we as a community should talk about it. Sasha Beilinson has proposed that working for the NSA/GCHQ should be made “socially unacceptable”. Not everyone will agree, but it reminds us that we have both individual choice and collective power. Individuals can withdraw their labour from GCHQ. Heads of department can refuse staff leave to work for GCHQ. The LMS can refuse GCHQ’s money.
At a bare minimum, let us acknowledge that the choices are ours to make. We are human beings before we are mathematicians, and if we do not like what GCHQ is doing, we do not have to cooperate.
The Society has an indirect relationship with GCHQ via a funding agreement with the Heilbronn Institute, in which the Institute will give up to £20,000 per year to the Society. This is approximately 0.7% of our total income. This is a recently made agreement and the funding will contribute directly to the LMS-CMI Research Schools, providing valuable intensive training for early career mathematicians. GCHQ is not involved in the choice of topics covered by the Research Schools.
So, GCHQ’s financial support for the LMS is small enough that declining it would not make a major financial impact.
I hope the LMS will make a public statement clarifying its relationship with GCHQ. I see no argument against transparency.
Another significant factor (which Lyons alludes to above and is already a matter of public record) is that GCHQ is a funder of the Heilbronn Institute, which is a collaboration between GCHQ and the University of Bristol. I don’t know that the LMS is involved with Heilbronn beyond what’s mentioned above, but Heilbronn does seem to provide an important channel through which (some!) British mathematicians support the secret services.
Finally, I want to make clear that although I think there are some problems with the LMS as an institution, I don’t blame the people running it, many of whom are taking time out of extremely busy schedules for the most altruistic reasons. As I wrote to one of them:
I’m genuinely in awe of the amount that you […] give to the mathematical community, both in terms of your selflessness and your energy. I don’t know how you do it. Anything critical I have to say is said with that admiration as the backdrop, and I hope I’d never say anything of the form “do more!”, because to ask that would be ridiculous.
Rules for commenting here

I’ve now written several posts on this and related subjects (1, 2, 3, 4). Every time, I’ve deleted some off-topic comments — including some I’ve enjoyed and agreed with heartily. Please keep comments on-topic. In case there’s any doubt, the topic is the relationship between mathematicians and the secret services. Comments that stray too far from this will be deleted.
Guest post by Alexander Campbell
We want to develop category theory in a general 2-category, in order to both generalise and clarify our understanding of category theory. The key to this endeavour is to express the basic notions of the theory of categories in a natural 2-categorical language. In this way we are continuing a theme present in previous posts from the Kan Extension Seminar, wherein monads and adjunctions were given a 2-categorical setting, and by analogy, in our very first paper, whose purpose was to express basic notions of the theory of sets in a natural categorical language. In this post we consider a concept very central and special to category theory: the Yoneda lemma.
So what’s the Yoneda Lemma again?
The Yoneda lemma says that for any object of a category , the diagram is a left extension.
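The diagram itself did not survive extraction. As a hedged reconstruction in the Street–Walters spirit: for an object $a \colon 1 \to A$ of a locally small category $A$, the 2-cell picking out the identity element $1_a \in A(a,a)$ exhibits the hom-functor $A(a,-)$ as a left extension of the one-point-set functor $\ast \colon 1 \to \mathbf{Set}$ along $a$:

```latex
% Reconstruction of the elided diagram: the Yoneda lemma as a left
% extension in CAT.  Here 1 is the terminal category and
% \ast : 1 -> Set picks out the one-point set.
\[
  A(a,-) \;=\; \operatorname{Lan}_{a}(\ast),
  \qquad\text{since}\qquad
  [A,\mathbf{Set}]\bigl(A(a,-),\,F\bigr)
  \;\cong\; \mathbf{Set}(\ast,\,Fa)
  \;\cong\; Fa ,
\]
\[
  \text{with unit } \ast \Rightarrow A(a,-)\circ a
  \text{ given by the element } 1_a \in A(a,a).
\]
```

The displayed bijection is exactly the usual statement of the Yoneda lemma, $\mathrm{Nat}(A(a,-),F) \cong Fa$, repackaged as the universal property of a left extension.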
In this post I will give a motivation for the notion of Yoneda structure, as defined in the paper Yoneda Structures on 2-Categories of Ross Street and Bob Walters. But before we begin I would like to take this opportunity to thank Emily for inviting me to join the Kan Extension Seminar and for her support and encouragement throughout the course. This has been and continues to be a singularly valuable experience in my first year as a category theorist.
Liftings and extensions
After the various compositions available in a 2-category (nicely unified as the operation of pasting), the most basic notions of two-dimensional algebra are liftings and extensions. A diagram in a 2-category is said to be a left lifting if pasting along effects a bijection of 2-cells
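Since the diagram was lost, here is a reconstruction of the definition (notation mine): given $f \colon B \to C$, $c \colon A \to C$ and $g \colon A \to B$, a 2-cell $\theta \colon c \Rightarrow fg$ exhibits $g$ as a left lifting of $c$ through $f$ when, for every $h \colon A \to B$, pasting with $\theta$ gives a bijection:

```latex
% Reconstructed universal property of a left lifting in a 2-category K.
\[
  \mathcal{K}(A,B)(g,\,h)
  \;\xrightarrow{\;\cong\;}\;
  \mathcal{K}(A,C)(c,\,fh),
  \qquad
  \phi \;\longmapsto\; f\phi \cdot \theta .
\]
```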
There is a similar elementary definition of left extension, but we can simply say that a left extension is a left lifting in (reverse 1-cells). Hence these are just dual versions of a single notion. Left liftings in and are right liftings and right extensions respectively, but these will not concern us here.
An elementary result of 2-categorical algebra is the pasting lemma. This tells us that given a diagram in which is a left lifting, is a left lifting if and only if the composite is a left lifting. By duality, we immediately get a corresponding result for left extensions.
The Yoneda lemma and Yoneda structures
A Yoneda structure on a 2-category consists of two pieces of data satisfying three axioms. We will see that the data is what is necessary to naturally express the Yoneda lemma in the 2-category, and that the axioms are expressions of familiar results of category theory.
In a general 2-category, following the philosophy of generalised elements, it is natural to consider arrows with arbitrary domain (not just restricted to the terminal object 1) as objects of . In order to state the Yoneda lemma for such generalised objects, we must replace with some arrow . Here could be loosely thought of as “sets freely varying over ”; in CAT, it is the category of presheaves over and is the Yoneda embedding.
But now we must address the issue of size. Firstly, in CAT, the Yoneda embedding exists only for locally small . Secondly, in the wish for a snappy statement of the Yoneda lemma, I left an important condition unmentioned: it is not true for a general category that the hom-functor exists for every object . (It would be true if were locally small, but we do not want CAT to consist only of locally small categories; for if is not small, then is not locally small.) Hence we must restrict those elements for which we state the Yoneda lemma.
We can now specify the data of a Yoneda structure on a 2-category. The first piece of data is a class of admissible arrows in the 2-category; the only property we require of this class is that it be a right ideal, meaning that if is admissible, then so is for every such composable . We say that an object is admissible if the identity arrow is admissible; in the Yoneda structure on CAT, the admissible objects are the locally small categories. Hence we see that there is nothing mysterious about size; it is just part of the structure. The other piece of a Yoneda structure is the Yoneda arrows: an admissible arrow for each admissible object .
With this structure we can naturally state the Yoneda lemma in a 2-category. We take this as our first axiom for a Yoneda structure.
Axiom 1. For , both admissible, the left extension exists.
The universal property of this extension is indicated by
So we have used the Yoneda lemma as a definition of the hom-functors ; the axiom asserts their existence. We can further define the hom-functors on 1-cells and 2-cells in both variables: in the first variable by the universal property of extension, and in the second variable by composition. In particular, for (suitably admissible) , we can define as , with a similar definition on 2-cells. This all just follows from Axiom 1, hence is part of any Yoneda structure.
All further use of the admissibility notion is just to ensure the existence of the data of Axiom 1. So I will make minimal explicit mention of it from here; it can be easily filled in as necessary.
Universal arrows and absolute liftings
The second axiom of a Yoneda structure is expressed in terms of liftings. I will now exhibit the presence of liftings in category theory.
The first formalisation of the notion of universal property that Mac Lane gives in Categories for the Working Mathematician is that of universal arrow (I will deal with the ‘left’ sense here). Given a functor and an object of , he defines what it means for a “pair” consisting of an object of and an arrow to be a universal arrow from to . This definition says precisely that is a left lifting diagram.
In a general 2-category, the restriction of our consideration to such ‘global objects’ is untenable. So how do we express the universal property of a ‘generalised arrow’ from a ‘generalised object’ to ? It is not enough to say that the corresponding diagram is a left lifting diagram; we must consider the object at all its stages of development. Hence we define a 2-cell to be a universal arrow from to when for every earlier stage , the diagram is a left lifting. That is, we require our corresponding diagram to be an absolute left lifting (meaning that the property of being a left lifting is preserved when composed with any arrow into ).
Hence absolute liftings give a natural 2-categorical expression of the notion of universal arrow. This slogan can help us to interpret some basic facts of the algebra of 2-categories as they relate to category theory. For instance, the result that a 2-cell is the unit of an adjunction if and only if it is an absolute left lifting corresponds to the familiar result of basic category theory that states that a functor is a right adjoint if and only if there is a universal arrow to from every object of .
Universal elements and representability
Now, observe that the 2-cell in the ordinary Yoneda lemma, in addition to being a left extension, is trivially a left lifting; for this just means that there is a bijection between arrows and elements of the set . In familiar terms, this says that is a universal element of the functor , and by the above discussion we can immediately incorporate this into our setting as the second axiom for a Yoneda structure.
Axiom 2. The 2-cell of Axiom 1 is an absolute left lifting.
The universal property of this absolute left lifting is indicated by
The combination of these two universal properties of the 2-cells is the ignition that allows us to begin to develop category theory in our 2-category. Together they give a bijection of 2-cells and as indicated by the bijections and as displayed in the diagram. From Axiom 2 and the pasting lemma, we get that the LHS is an absolute left lifting (with base ) if and only if is an absolute left lifting. This is exactly the realisation in our setting of the familiar correspondence between universal elements of and universal arrows from to . In particular, if is an isomorphism, then this condition holds, and we have a universal element of . Note that we can understand being an isomorphism as meaning that is representable.
Now, recall the representability condition of ordinary category theory that says that a functor is representable if and only if it has a universal element. We have seen that one half of this implication follows from the first two axioms of a Yoneda structure. We could take the other half as an axiom, being a basic result of category theory that we wish to incorporate into our setting.
Axiom 3*. If a 2-cell yields an absolute left lifting diagram when pasted onto , then is an isomorphism.
Understood in our interpretation, this is indeed the other implication of the representability condition.
However this axiom is too strong to capture all the examples we want! While it does hold in ordinary category theory and in 2-categories of internal categories and variable (=indexed=parametrized) categories, it does not hold in general in enriched category theory. For in enriched category theory, representability of a functor is expressed as an isomorphism in the base of enrichment, and the bijection of sets of the universal element condition is not enough to capture this.
Hence we do not take Axiom 3* as an axiom for Yoneda structures. Instead we take a couple of special cases. By Axiom 1 for and in place of , we have the bijections of 2-cells. Let correspond to in the first bijection, and correspond to the 2-cell in the second. These corresponding 2-cells are an identity and a pasting of two absolute left liftings respectively. Hence (by the pasting lemma in the second case) they are both absolute left liftings, and Axiom 3* would imply that they are isomorphisms. We take these special cases as our third and final axiom.
Axiom 3. The 2-cells and are isomorphisms.
In summary, a Yoneda structure on a 2-category consists of admissible arrows and Yoneda arrows satisfying Axioms 1, 2, and 3.
Formal category theory
Now that we have established the notion of a Yoneda structure on a 2-category, we can begin to develop category theory within such a 2-category. What is remarkable is that everything follows formally and naturally from the Yoneda structure and from the elementary two-dimensional algebra of pasting, liftings, extensions and adjunctions. In their paper, Street and Walters develop results concerning adjunctions, weighted colimits, EM objects and Kleisli objects for monads, and they introduce the notion of totality. I will now briefly survey some of these results; I refer the reader to the paper for details.
They show that the 2-categorical definition of an adjunction in terms of its unit and counit is equivalent to the condition , which expresses the classical hom-set definition of an adjunction. This is an isomorphism of modules, and indeed, one benefit of a Yoneda structure is that it equips our 2-category with a notion of modules (a.k.a. profunctors): modules from to are arrows .
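If I am reading the elided condition correctly, it is the following module isomorphism (in Street–Walters notation, where $A(1,u)$ is $y_A \circ u$; this is my reconstruction of the lost formula):

```latex
% Reconstructed condition: an adjunction as an isomorphism of modules.
\[
  f \dashv u
  \quad\Longleftrightarrow\quad
  B(f,1) \;\cong\; A(1,u)
  \;\colon\; B \longrightarrow PA ,
\]
```

the formal counterpart of the classical natural isomorphism $B(fa,b) \cong A(a,ub)$.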
It is well known that a satisfactory theory of colimits in enriched category theory requires the consideration of colimits weighted by presheaves. However the natural generality of the theory calls for weighting by modules (presheaves on are modules from to ). Given a module and an arrow (suitably admissible), we define the colimit of weighted by by the formula . From this formula we can derive many familiar results such as the fact that left adjoints preserve colimits, associativity of colimits in the weight, and the Yoneda isomorphisms. Also we can introduce the notion of pointwise extension in terms of colimits and show that they are indeed extensions. Thus all the results of the calculus of colimits, which in enriched category theory depend on a “complicated machinery” involving enriched extranatural transformations, coends, enriched functor categories etc., in fact follow easily and formally in the context of the elementary notion of a 2-category with a Yoneda structure.
Street and Walters show that a certain result from Street’s earlier paper, the characterisation of the Eilenberg-Moore category of a monad in CAT as certain sheaves on the Kleisli category of the monad, holds for any Yoneda structure. They also show that in the presence of more properties of the 2-category (particularly a bo-ff factorisation of its arrows), we can make sense of the idea of the Kleisli object as a “full subcategory” of the Eilenberg-Moore object.
This paper introduced the notion of totality. An arrow is said to be total when and are admissible and the module has an admissible left adjoint. It follows immediately from the developed theory that this left adjoint must be given by and be the pointwise left extension of along the Yoneda arrow (note that these assertions require all the admissibility assumptions in the definition). Such adjunctions are absolutely fundamental to the use of category theory; they sometimes go by the name of nerve and realization.
An object is said to be total when the identity arrow is total; this is the same as saying the Yoneda arrow has a left adjoint. This can be thought of as a sort of strong cocompleteness property. In particular we get a “very satisfactory” adjoint functor theorem: if is total, an arrow has a right adjoint if and only if is admissible and preserves the colimit .
Important motivating examples of 2-categories with Yoneda structures are 2-categories of internal categories (see the papers of Street and Weber) and enriched categories; in particular CAT is the 2-category of categories internal to SET. Given categories and suitably nice and appropriate for internalisation and enrichment respectively, we get Yoneda structures on the 2-categories Cat and -Cat roughly as follows. For such a category , the size structure arises from an object of which is in some way understood as a “full subcategory” of ; for internal categories, this can be characterised as an internal full subcategory, or equivalently, a classifying discrete opfibration. Then the objects arise as exponentials , and the Yoneda arrows are exponential adjoints of hom-functors.
The 2-category LEX of finitely complete categories and finitely continuous functors inherits a Yoneda structure from CAT. This example is of interest because the total objects are (almost) Grothendieck toposes.
Another example, not treated in this paper but in this later paper of Street, is given by 2-categories of variable categories. These are of the form = Hom(,CAT), consisting of pseudofunctors, pseudonatural transformations, and modifications. I mention this example here for the interesting way in which the size structure arises. An arrow in is admissible if for every and where and are representables on objects of , the comma object is representable.
Guest post by Dimitri Zaganidis
First of all, I would like to thank Emily for organizing the Kan extension seminar. It is a pleasure to be part of it. I want also to thank my advisor Kathryn Hess and my office mate Martina Rovelli for their revisions.
In the fifth installment of the Kan Extension Seminar we read the paper “Review of the Elements of 2-categories” by G. M. Kelly and Ross Street. This article was published in the Proceedings of the Sydney Category Theory Seminar, and its purpose is to “serve as a common introduction to the authors’ paper in this volume”.
The article has three main parts, the first of them being definitions in elementary terms of double categories and 2-categories, together with the notion of pasting. In a second chapter, they review adjunctions in 2-categories with a nice expression of the naturality of the bijection given by mates using double categories. The last part of the article introduces monads in 2-categories, specializing to 2-monads towards the end.
Double categories and 2-categories
The article starts with the definition of a double category as a category object in the (not locally small) category of categories . (I think that there might be some set theoretic issues with such a category, but you can add small everywhere if you want to stay safe.)
The authors then switch to a description of such an object in terms of objects, horizontal arrows, vertical arrows, and squares, with various compositions and units. I will explain a bit how to go from one description to the other.
A category object consists of a category of objects, a category of morphisms, source and target functors, an identity functor, and a composition functor.
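Spelled out in symbols (a reconstruction, since the original formulas did not survive): writing $D_0$ for the category of objects and $D_1$ for the category of morphisms, the data is

```latex
% Reconstructed data of a category object (internal category) in Cat.
\[
  D_1 \times_{D_0} D_1 \xrightarrow{\;m\;} D_1
  \;\overset{s}{\underset{t}{\rightrightarrows}}\; D_0,
  \qquad
  i \colon D_0 \longrightarrow D_1 ,
\]
```

where $s, t$ are the source and target functors, $i$ is the identity functor, and $m$ is the composition functor defined on the pullback of composable pairs, subject to the usual associativity and unit axioms.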
The category of objects is the category whose objects are “the objects” and whose morphisms are the vertical arrows. The category of morphisms is the category whose objects are the horizontal morphisms and whose morphisms are the squares, with vertical composition.
Since the functors preserve pullbacks, by applying them to a double category seen as a category object, we get actual categories. Applying to the double category, we get the category whose objects are “the objects” and whose morphisms are the horizontal arrows. Applying , we get the category whose objects are the vertical morphisms and whose morphisms are the squares, but this time with horizontal composition.
An interesting thing to notice is that the symmetry of the explicit description of a double category is much more apparent than the symmetry of its description as a category object.
One can define a -category as a double category with a discrete category of objects, or as a -enriched category, exactly as one can define a simplicially enriched small category either as a category enriched over or as a category object in with a discrete simplicial set of objects.
The second viewpoint on 2-categories leads to definitions of 2-functors and 2-natural transformations and also to modifications, once one makes clear what enrichment a category of 2-functors inherits.
It is also worth mentioning that the pasting operation makes computations easier, because it makes them more visual. The proof of proposition 2.1 of this paper is a good illustration of this.
The basic example of a 2-category is itself, with natural transformations as 2-cells (squares).
As category theory describes set-like constructions, 2-category theory describes category-like constructions. You can usually build categories whose objects are sets with extra structure. In the same way, small V-categories, V-functors, and V-natural transformations form a 2-category.
My first motivation to learn about 2-categories was the 2-category of quasi-categories defined by Joyal, which has been studied by Emily Riehl and Dominic Verity in the article The 2-category theory of quasi-categories; in particular, I was interested in the category-like constructions one can make with quasi-categories, such as adjunctions and limits.
Adjunctions and mates in 2-categories
It is not a surprise that 2-categories are the right framework in which to define adjunctions. To build the general definition from the usual one, you just need to replace categories by objects in a 2-category, functors by 1-cells of the 2-category, and natural transformations by its 2-cells.
Adjunctions in a 2-category compose (as in ), and one can form two, a priori distinct double categories of adjunctions. Both of them will have the objects of as objects and the horizontal morphisms being the morphisms of , while their vertical morphisms are the adjunctions (going in the same direction as the right adjoint, by convention). The two double categories differ on the squares. Given adjunctions and together with 1-cells (between the domains of and ) and (between the codomains of and ), the squares of the first double category are 2-cells while the squares of the second are 2-cells .
Now, the bijective correspondence between these kinds of 2-cells given by mates induces an isomorphism of double categories. This means in particular that the horizontal (or vertical) composite of mates is equal to the mate of the corresponding composite.
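For reference, here is the standard mate formula (the article’s diagrams were lost here, so this is my reconstruction of the pasting): given adjunctions $f \dashv u \colon B \to A$ with unit $\eta$ and counit $\varepsilon$, and $f' \dashv u' \colon B' \to A'$ with unit $\eta'$ and counit $\varepsilon'$, together with 1-cells $a \colon A \to A'$ and $b \colon B \to B'$, the mate of $\lambda \colon f'a \Rightarrow bf$ is

```latex
% Reconstructed mate formula: paste \lambda with one unit and one counit.
\[
  \mu \;=\;
  \bigl(u'b\,\varepsilon\bigr)
  \cdot \bigl(u'\lambda\,u\bigr)
  \cdot \bigl(\eta' a u\bigr)
  \;\colon\; au \Longrightarrow u'b ,
\]
```

and $\lambda$ is recovered from $\mu$ by the dual pasting with the other unit and counit.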
This is a very beautiful way to express the naturality of the mate correspondence, and it provides a one-line proof of the fact that two 1-cells that are left adjoints to a same 1-cell are naturally isomorphic.
Monads in 2-categories
2-categories are also the right framework to define monads. A monad in a 2-category and on an object is a 1-cell together with 2-cells and , verifying the usual equations and . Since 2-functors preserve both horizontal and vertical compositions, for all objects of , induces a monad on , given by post-composition . The authors call this either an action of on , or an algebra structure on .
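In symbols (reconstructing the elided equations): for $t \colon A \to A$ with multiplication $\mu \colon tt \Rightarrow t$ and unit $\eta \colon 1_A \Rightarrow t$, the usual equations are

```latex
% Reconstructed monad axioms in a 2-category: associativity and unit laws.
\[
  \mu \cdot t\mu \;=\; \mu \cdot \mu t
  \;\colon\; ttt \Longrightarrow t,
  \qquad
  \mu \cdot t\eta \;=\; 1_t \;=\; \mu \cdot \eta t
  \;\colon\; t \Longrightarrow t .
\]
```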
The definition of monad morphism in this article is quite surprising for someone who has read the previous article by Ross Street, The formal theory of monads, which we read for the second meeting of the Kan extension seminar.
In Ross Street’s original paper, a monad morphism is a 1-cell together with a -cell verifying certain conditions.
In this paper, morphisms of monads are defined only for monads on the same object, letting the -cell part of a monad transformation of the previous article be the identity. This leads the authors to reverse the direction of the morphism, since the -cell seems to go in the reverse direction of the -cell!
One might think that fixing is needed because of the result that there is a bijection between monad morphisms and actions of on making a “-bimodule”. In fact, in the case where is not necessarily the identity, there is a bijection between 2-cells such that is a monad functor and actions of on making a “-bimodule”. A statement of the same kind can also be made for monad functor transformations (in the sense of the formal theory of monads). A 2-cell is a monad functor transformation if and only if is a morphism of “-bimodules”.
A 2-category admits the construction of algebras if for every monad , the 2-functor is representable. The representing object is called the object of -algebras. By Yoneda, the free-forgetful adjunction can be made internal in this case.
The terminology is justified, because in the -category , it specializes to the usual notions of the category of -algebras and the corresponding free-forgetful adjunction.
A monad in is the same as a 2-functor , where is the 2-category with one object and , the algebraist’s simplicial category as monoidal hom-category (with ordinal sum). Since moreover, (where is the subcategory of maps of preserving maxima, which is acted on by via ordinal sum) one can see that the object of t-algebras can be expressed as a weighted limit.
As a consequence, it is not surprising that a 2-category admits the construction of algebras under some completeness assumptions.
In the last part of the article, the authors review the notion of a doctrine, which is a 2-monad in 2-, i.e., a 2-functor , where is a 2-category, and 2-natural transformations and , which are respectively the multiplication and the unit, verifying the usual identities. The fact that it is both a monad on a 2-category and in another one can be a bit disturbing at first.
If is a doctrine over a 2-category , then its algebras will be objects of together with an action , exactly as in the case of algebras over a usual monad.
Already with morphisms, we can take advantage of the fact that a 2-category has 2-cells, and define -morphisms to be lax in the sense that the diagram is not supposed to be commutative, but is rather filled by a 2-cell with some coherence properties.
As one might expect, we can actually form a 2-category of such -algebras by adding 2-cells, using again the -cells existing in .
If we keep only the -morphisms that are strict, we obtain the object of algebras (which should be a -category) that we discussed before.
One example of a doctrine is together with the multiplication induced by the ordinal sum, and unit given on by the functor that sends to .
The algebras for this doctrine will be categories equipped with a monad acting on them, while the -morphisms are transformations of monads, and the -2-cells are exactly the monad functor transformations of Street’s article.
Here, since we have two different 2-categories of algebras (with strict -morphisms or with all of them), one can wonder whether monad morphisms will induce -functors on the level of these -categories.
This is indeed the case, and one can actually go even one step further and define monad modifications, using the fact that 2- is in fact a 3-category! These modifications between two given monad morphisms are in fact in bijective correspondence with the 2-natural transformations between the -functors induced by these monad morphisms on the level of algebras (with lax D-morphisms). Note that they are not the same as the monad morphism transformations of Street’s article.
This bijection is nice because it implies that you can compare 2-categories of algebras by only looking at the doctrines: if they are equivalent, so are the 2-categories of algebras.
The fact that this bijection does not hold when we restrict only to strict morphisms was really surprising to me, but I guess this is the price to pay to use the 3-category structure.
During the last days of April, the Kan extension seminar will be reading the article "Two-dimensional monad theory", by Blackwell, Kelly and Power. We will then have more to say about these 2-monads!
Leila Schneps is trying to raise $6,000 for what sounds like a good cause: translating a biography of Grothendieck into English:
As of this moment she’s raised $350… including $100 of her own money.
Leila Schneps writes:
Two volumes of a projected 3-volume biography of A. Grothendieck have been completed by Winfried SCHARLAU, and are available in German on Amazon.com through “Books on Demand” as
“Wer ist Alexander Grothendieck? Teil 1: Anarchy”
“Wer ist Alexander Grothendieck? Teil 3: Spiritualität”.
Grothendieck is aware of Scharlau’s work and even met him during the preparation of the part concerning the lives of his parents, although he subsequently ruptured relations with Scharlau as he has with everyone else.
In 2013, an English translation of volume 1 by my sister, Melissa Schneps, became available on Amazon as
“Who is Alexander Grothendieck? Part 1, Anarchy”
Melissa’s translation of volume 1 was funded by Jim Carlson of the Clay Institute for Mathematics.
The new director of the Clay Institute has declined to fund the translation of volume 3. The book covers Grothendieck's life from his rupture with the mathematical establishment in 1970 to his disappearance in 1991, and contains a wealth of information from documents and interviews.
Due to multiple expressions of interest in an English translation, we have decided to use this method to raise funding for the translation of volume 3. Once the project is underway, we propose to create a web page where chapters of the volume can be made accessible to donors as they appear. Once completed, the new volume will be available on Amazon through “Books on Demand” like the others.
If you are interested in Alexander Grothendieck and his astonishing life, please help to make this happen.
In response Kevin Lin asked the obvious question: “What about volume 2?” I don’t know the answer! Do you?
One of my dreams these days is to get people to apply modern math to ecology and biology, to help us design technologies that work with nature instead of against it. I call this dream ‘green mathematics’. But this will take some time to reach, since living systems are subtle, and most mathematicians are more familiar with physics.
So, I’ve been warming up by studying the mathematics of chemistry, evolutionary game theory, electrical engineering, control theory and information theory. There are a lot of ideas in common to all these fields, but making them clear requires some category theory. I call this project ‘network theory’. I’m giving some talks about it at Oxford.
(This diagram is written in Systems Biology Graphical Notation.)
Here’s the plan:
Nature and the world of human technology are full of networks. People like to draw diagrams of networks: flow charts, electrical circuit diagrams, signal-flow graphs, Bayesian networks, Feynman diagrams and the like. Mathematically minded people know that in principle these diagrams fit into a common framework: category theory. But we are still far from a unified theory of networks. After an overview, we will look at three portions of the jigsaw puzzle in three separate talks:
I. Electrical circuits and signal-flow graphs.
II. Stochastic Petri nets, chemical reaction networks and Feynman diagrams.
III. Bayesian networks, information and entropy.
All these talks will be in Lecture Theatre B of the Computer Science Department—you can see a map here, but the entrance is on Keble Road. Here are the times:
• Tuesday 25 February, 3:30 pm: Network Theory I: electrical circuits and signal-flow graphs. Also available on YouTube.
• Tuesday 4 March, 3:30 pm: Network Theory II: stochastic Petri nets, chemical reaction networks and Feynman diagrams. Also available on YouTube.
• Tuesday 11 March, 3:30 pm: Network Theory III: Bayesian networks, information and entropy. Also available on YouTube.
I thank Samson Abramsky, Bob Coecke and Jamie Vicary of the Computer Science Department for inviting me, and Ulrike Tillmann and Minhyong Kim of the Mathematical Institute for helping me get set up. I also thank all the people who helped do the work I’ll be talking about, most notably Jacob Biamonte, Jason Erbele, Brendan Fong, Tobias Fritz, Tom Leinster, Tu Pham, and Franciscus Rebro.
Ulrike Tillmann has also kindly invited me to give a topology seminar:
Trees are not just combinatorial structures: they are also biological structures, both in the obvious way but also in the study of evolution. Starting from DNA samples from living species, biologists use increasingly sophisticated mathematical techniques to reconstruct the most likely "phylogenetic tree" describing how these species evolved from earlier ones. In their work on this subject, they have encountered an interesting example of an operad, which is obtained by applying a variant of the Boardman–Vogt "W construction" to the operad for commutative monoids. The operations in this operad are labelled trees of a certain sort, and it plays a universal role in the study of stochastic processes that involve branching. It also shows up in tropical algebra. This talk is based on work in progress with Nina Otter.
I’m not sure exactly where this will take place, but probably somewhere in the Mathematical Institute, shown on this map. Here’s the time:
• Monday 24 February, 3:30 pm, Operads and the Tree of Life.
If you’re nearby, I hope you can come to some of these talks — and say hi!
(This diagram was drawn by Darwin.)
Guest post by Nick Gurski
I have been thinking about various sorts of operads with my PhD student Alex Corner, and have become interested in the following very concrete question: what are examples of operads in the category of finite groups under the cartesian product? I don’t know any really interesting examples, but maybe you do! After the break I will explain why I got interested in this question, and tell you about some examples that I do know.
Alex and I started off thinking about various sorts of things you might do with operads in $\mathbf{Cat}$, and were eventually forced into what we currently call an action operad. This is an operad whose job it is to act on the objects of other operads. The key examples to keep in mind are the terminal operad (each set is just a singleton), the symmetric operad (the $n$th set is the $n$th symmetric group $\Sigma_n$), and the braid operad (the $n$th set is the $n$th braid group $B_n$). The technical definition involves an operad $G$, a group structure on each set $G(n)$, a map of operads $\pi \colon G \to \Sigma$ to the symmetric operad which is levelwise a group homomorphism, and a final condition (when it makes sense) relating operadic composition with group multiplication.
These ideas have cropped up before, for example in Nathalie Wahl's thesis or this preprint of Wenbin Zhang. Once you have this definition, you can define operads which are equivariant with respect to $G$: you have an operad $P$, a group action of $G(n)$ on $P(n)$ for each $n$, and some equivariance conditions that generalize the equivariance conditions for a symmetric operad.
This isn't the only thing you can do with an action operad: you can also think about the 2-monad on $\mathbf{Cat}$ whose algebras are strict monoidal categories where $G(n)$ acts naturally on $n$-fold tensor products. If you do this with the symmetric operad, you get symmetric strict monoidal categories (or permutative categories, if you are a topologist); if you do this with the (ribbon) braid operad, you get (ribbon) braided strict monoidal categories; and if you do this with the action operad of all terminal groups, you get back plain old strict monoidal categories. The only other "naturally-occurring" example of an action operad that I know of is the operad of $n$-fruit cactus groups. These groups come up in the representation theory of quantum groups, particularly the theory of crystals, and the monoidal structure you get out here is something Drinfeld called a coboundary category. I can give you a generators-and-relations definition of these groups, at which point I would have completely exhausted my understanding of this operad. The best reference that I know of is the paper Crystals and coboundary categories by Henriques and Kamnitzer.
What does this have to do with my question about operads of finite groups? Well, as it turns out, the structure map $\pi \colon G \to \Sigma$ for an action operad only has two options: it can be surjective, or it can be the zero map (i.e., everything maps to the identity permutation). Furthermore, that condition I wrote down relating group multiplication and operadic composition says that giving an action operad with the zero map is equivalent to giving an operad in which the operadic composition maps all preserve group multiplication. Alex and I already showed that the operadic composition of identity elements is the identity element in the target; in other words, an action operad with the zero map is just an operad in the category of groups using the cartesian product.
You can take kernels of maps between action operads, so in particular given any action operad $G$ you can take the kernel of $\pi \colon G \to \Sigma$; this gives you an operad in groups. For the examples above, you get finite groups when $G$ is the terminal operad or $\Sigma$ itself (the terminal operad is obviously the kernel in the case of $\Sigma$), but for the rest of the examples you get an operad in the category of groups in which most of the groups are infinite. The groups involved are the so-called pure versions: pure braids, pure ribbon braids, and pure $n$-fruit cacti. One can then think of an action operad with surjective map as being an extension of the symmetric operad by an operad in the category of groups (the "pure" version), and that action operad is finite if and only if the operad in the category of groups contains only finite groups. We can translate this back into thinking about monoidal categories by noting that if we have some notion of strict monoidal category in which $n$th tensor powers come equipped with a natural action of a finite group for all $n$, then we must be able to dig up an operad in the category of finite groups.
Now let’s talk concrete examples: what operads do I know in the category of finite groups? Well, there is obviously the terminal operad, but I can go very slightly further in that I can tell you how to construct some new ones. Here are two methods you can use to construct operads of finite groups.
- Let $A$ be a finite abelian group. Then there is an operad $E$ where $E(n) = A^n$ (the power here is a cartesian power). The operadic composition map $E(n) \times E(k_1) \times \cdots \times E(k_n) \to E(k_1 + \cdots + k_n)$ takes the vector in the first coordinate and duplicates the $i$th coordinate $k_i$ times, then adds the result to the vector you get by just concatenating the vectors in the $E(k_i)$. Here $A$ must be abelian as it appears as $E(1)$, and $E(1)$ must be abelian for any action operad by an Eckmann-Hilton argument.
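To make this composition rule concrete, here is a minimal Python sketch of my reading of the construction, taking $A = \mathbb{Z}/5$; the names `compose` and `identity` are mine, not from the post.

```python
# Sketch of the operad E with E(n) = A^n for a finite abelian group A,
# following the "duplicate and add" composition described above (A = Z/5).
M = 5  # working in A = Z/5Z

def compose(a, bs):
    # a is an element of E(n) = A^n; bs = [b_1, ..., b_n] with b_i in E(k_i).
    # Duplicate the i-th coordinate of a k_i times, then add (coordinatewise
    # mod 5) the concatenation of the b_i.
    dup = [a[i] for i, b in enumerate(bs) for _ in b]
    cat = [x for b in bs for x in b]
    return tuple((u + v) % M for u, v in zip(dup, cat))

identity = (0,)  # the operad identity, living in E(1) = A

a = (1, 2)                 # an element of E(2)
b1, b2 = (3, 4, 0), (2,)   # elements of E(3) and E(1)
assert compose(a, [b1, b2]) == (4, 0, 1, 4)   # lands in E(4)
# unit laws of the operad
assert compose(a, [identity, identity]) == a
assert compose(identity, [a]) == a
```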
- Now let $G$ be any finite group, but in fact you will see that we don't use anything interesting about the group structure here. There is an operad $F$ with $F(n)$ a cartesian power of $G$, given the pointwise group structure. You should think of this as a set of functions valued in $G$. Operad composition is a map $F(n) \times F(k_1) \times \cdots \times F(k_n) \to F(k_1 + \cdots + k_n)$. Given an input, we must give back an element of $G$. If there is some $i$ such that the input lies entirely in the $i$th "interval" (in the same sense as before), then we use the function coming from $F(k_i)$ evaluated on that input. If not, then there exist $i \neq j$ such that part of the input lies in the $i$th interval and part lies in the $j$th interval, and then you use the function coming from $F(n)$ evaluated on the corresponding intervals.
I am reasonably content with the first of these constructions; I understand how to do the second but don't really know where it comes from; I have some ideas about how one could try to insert the finite group of one's choice but haven't checked the details; and I know basically nothing else. Furthermore, I don't know how these constructions interact with each other, or how you can form extensions of the symmetric operad with them outside of some obvious constructions.
Those are just the two straightforward approaches that I know of to construct operads in the category of finite groups. You can also try to construct these operads from operads of topological spaces or simplicial sets, but once again I don’t know of an example that produces finite things (apart from ones giving the groups above). Do you know any others?
Guest post by Tom Avery
Before getting started, I’d like to thank Emily for organizing the seminar, as well as all the other participants. It’s been a lot of fun so far! I’d also like to thank my supervisor Tom Leinster for some very helpful suggestions when writing this post.
In the fourth instalment of the Kan Extension Seminar we're looking at Lawvere's paper "Metric spaces, generalized logic, and closed categories". This is the paper that introduced the surprising description of metric spaces as categories enriched over a certain monoidal category $[0, \infty]$. A lot of people find this very striking when they first see it, and it helps to drive home the point that enriched categories are not just ordinary categories with some extra structure on the hom-sets; in fact the hom-sets don't have to be sets at all!
Lawvere also intended the paper to serve as an accessible introduction to enriched category theory, so it begins fairly gently with some basic definitions. For the purposes of this post however, I'll assume the reader has at least seen the definitions of symmetric monoidal closed categories, $\mathcal{V}$-categories, and $\mathcal{V}$-functors. If not, everything you need can be found on the nLab.
Lawvere begins by describing some of the philosophy behind the paper. He argues that rather than being abstract nonsense that only appears at the highest level of mathematics, category theory can be found even in the most elementary concepts. He mentions the well known examples of posets and groups as categories, and the aim of the paper is to describe another example of an elementary mathematical structure arising from categorical constructions, namely metric spaces.
We'll also look at posets as enriched categories that are closely related to metric spaces, and this is where the "generalized logic" of the title comes in. The idea is that objects in a monoidal category $\mathcal{V}$ will be truth values for our generalized logic, morphisms in $\mathcal{V}$ will be entailments between them, certain functors correspond to logical operations, and adjointness relations between these functors correspond to rules of inference.
We will only consider categories enriched in complete and cocomplete symmetric monoidal closed categories (which I'll just call "monoidal categories"), and we are particularly interested in two examples. The first is $\mathbf{2}$, which has two objects, $\mathsf{true}$ and $\mathsf{false}$, and three morphisms, which we write as entailments: $\mathsf{true} \vdash \mathsf{true}$, $\mathsf{false} \vdash \mathsf{false}$, and $\mathsf{false} \vdash \mathsf{true}$. The tensor product is given by conjunction $\wedge$, the unit is $\mathsf{true}$, and the internal hom is implication $\Rightarrow$.
The hom-tensor adjunction gives a correspondence between judgements $p \wedge q \vdash r$ and $p \vdash q \Rightarrow r$, which is just the deduction theorem from logic. Categories enriched in $\mathbf{2}$ are simply posets (well, really preorders); we interpret the hom-object $X(x, y)$ as the truth value of the statement $x \leq y$, and the composition and identities give us transitivity and reflexivity:
$$(x \leq y) \wedge (y \leq z) \vdash (x \leq z), \qquad \mathsf{true} \vdash (x \leq x).$$
The category $[0, \infty]$ has as objects the non-negative reals together with $\infty$, and has a single morphism $a \to b$ precisely when $a \geq b$. The tensor product is addition, with $0$ as the unit, and this forces the internal hom to be truncated subtraction:
$$[b, c] = \max(c - b, 0),$$
but we'll just write this as $c \dot{-} b$. Thus the fact that $a + b \geq c$ if and only if $a \geq c \dot{-} b$ can be thought of as somehow generalizing the deduction theorem. The composition and identities in an $[0, \infty]$-category $X$ give us
$$X(x, y) + X(y, z) \geq X(x, z), \qquad X(x, x) = 0,$$
which are familiar as axioms of metric spaces, and this is how we will refer to $[0, \infty]$-categories. This is a weakening of the usual metric space axioms in three ways:
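The "generalized deduction theorem" is easy to check numerically. Here is a small Python sketch (my own illustration; the function name `tsub` and the conventions at $\infty$ are my reading of the internal hom):

```python
# Check that a + b >= c iff a >= c - b (truncated), the hom-tensor
# adjunction in [0, oo], using float('inf') for the point at infinity.
INF = float('inf')

def tsub(c, b):
    # truncated subtraction c "minus" b: the least a with a + b >= c.
    # Conventions: c - oo = 0 (anything works), oo - b = oo for finite b.
    if b == INF:
        return 0.0
    if c == INF:
        return INF
    return max(c - b, 0.0)

vals = [0.0, 1.0, 2.5, 7.0, INF]
for a in vals:
    for b in vals:
        for c in vals:
            # the adjunction: tensoring with b is left adjoint to [b, -]
            assert (a + b >= c) == (a >= tsub(c, b))
```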
We don't require that $X(x, y) = 0$ only when $x = y$. This condition amounts to requiring that isomorphic objects are equal, which is undesirable from a categorical point of view; this is also the reason for dropping antisymmetry of posets. There are naturally occurring examples that don't satisfy the non-degeneracy condition; perhaps most notably, it makes sense to regard the distance between two measurable functions that are equal almost everywhere as being zero.
We allow $X(x, y) = \infty$. Allowing infinite distances is also quite natural, as it's easy to imagine spaces where it's impossible to get from one point to another. For example, this is the most sensible way to think of distances in a discrete space.
We don't require $X(x, y) = X(y, x)$. Allowing non-symmetry is a more significant shift in what we mean by a metric space, but even so there are natural examples, for example the Hausdorff metric on subsets of a metric space. It's best to think of the hom-values in a generalized metric space not as representing distances, but as representing the least time or effort it takes to get from one point or state to another. When you look at it this way, there's no reason to assume symmetry (e.g. going up and down a hill are not equally difficult).
There is a monoidal functor $\mathbf{2} \to [0, \infty]$ sending $\mathsf{true} \mapsto 0$ and $\mathsf{false} \mapsto \infty$, and this induces an inclusion of the category of posets into the category of metric spaces. There is also a monoidal functor $\mathbf{2} \to \mathbf{Set}$, so metric spaces and categories can be seen as generalizing posets in different directions.
Functor categories and Yoneda
A $\mathbf{2}$-functor between posets is precisely an order-preserving map, and a $[0, \infty]$-functor between metric spaces is a Lipschitz map with Lipschitz constant 1, i.e. a function $f \colon X \to Y$ such that $Y(f x, f x') \leq X(x, x')$ for all $x$ and $x'$ in $X$. For an arbitrary monoidal category $\mathcal{V}$ and $\mathcal{V}$-categories $X$ and $Y$, the functor $\mathcal{V}$-category $Y^X$ has as objects the $\mathcal{V}$-functors $X \to Y$, and the hom objects are defined by the end
$$Y^X(f, g) = \int_x Y(f x, g x).$$
When $\mathcal{V} = \mathbf{Set}$, this just gives the ordinary end representation of the set of natural transformations. Since $\mathbf{2}$ and $[0, \infty]$ are both posets, ends reduce to products, which are conjunctions in $\mathbf{2}$ and suprema in $[0, \infty]$. So for posets, the object of natural transformations between order-preserving maps $f$ and $g$ is the truth value of the statement $\forall x \colon f x \leq g x$. In other words, order-preserving maps are ordered by domination. For metric spaces the hom object is given by $\sup_x Y(f x, g x)$. So the space of Lipschitz-1 maps between metric spaces is endowed with the sup metric.
The closed structure of the monoidal category $\mathcal{V}$ makes $\mathcal{V}$ itself into a $\mathcal{V}$-category. For $\mathbf{2}$ this gives the obvious interpretation of $\mathbf{2}$ as a poset. However, the metric on $[0, \infty]$ is given by truncated subtraction $b \dot{-} a$, rather than the usual $|a - b|$. The self-enrichment of $\mathcal{V}$ means we can define the presheaf $\mathcal{V}$-category $\hat{X} = \mathcal{V}^{X^{\mathrm{op}}}$, and that allows us to define the Yoneda embedding $X \to \hat{X}$, which sends an object $x$ to the representable presheaf $X(-, x)$. Just as in ordinary category theory, this embedding is full and faithful, which in the case of posets gives:
Proposition. Any poset $X$ can be embedded into the poset of downwards closed subsets of $X$ (ordered by inclusion), by sending an element $x$ to the set of all things $\leq x$.
Here we have used the correspondence between downwards closed subsets and order-preserving maps $X^{\mathrm{op}} \to \mathbf{2}$. For metric spaces we have:
Proposition. Any metric space $X$ can be isometrically embedded into the space of Lipschitz-1 maps $X^{\mathrm{op}} \to [0, \infty]$.
Adequacy and density
We say that a $\mathcal{V}$-functor $i \colon A \to X$ is adequate if the composite
$$X \to \hat{X} \to \hat{A}$$
of the Yoneda embedding with restriction along $i$ is full and faithful. We say that $i$ is dense if we have an isomorphism of bimodules $i_* \otimes i^* \cong X$. I haven't yet said what bimodules are, but the important thing is that in the case of metric spaces this becomes the condition that
$$\inf_{a \in A} \big( X(x, i a) + X(i a, x') \big) = X(x, x') \quad \text{for all } x, x'.$$
This terminology is slightly unfortunate, because what Lawvere calls an adequate functor is nowadays called a dense functor, and Lawvere's notion of density doesn't have a modern name as far as I'm aware. A functor is dense iff its image is dense in the usual metric space sense. Lawvere shows that a dense functor is adequate, at least in the case of metric spaces. A metric space is separable if it admits a dense map from the discrete space on the natural numbers, and this result gives us
Proposition. Any separable metric space can be isometrically embedded in the space of sequences of reals endowed with the sup metric.
This space of sequences is not quite as usually defined, because of the non-standard metric on $[0, \infty]$.
The comprehension scheme
A(n ordinary) functor $f \colon E \to B$ is called a discrete opfibration if every morphism $f e \to b$ in $B$ is the image under $f$ of a unique morphism with domain $e$. There is a correspondence between discrete opfibrations over $B$ and functors $B \to \mathbf{Set}$, and this correspondence is obtained by restricting a certain adjunction to an equivalence. This adjunction can be defined for arbitrary $\mathcal{V}$, provided that the unit of $\mathcal{V}$ is in fact terminal (this gives a way of defining projections from a tensor product to each of the factors, which is not generally possible). I won't go into the details of how it's defined, but I'll describe what it gives for $\mathbf{2}$ and $[0, \infty]$.
Given an order-preserving map $f \colon X \to Y$ between posets, the left adjoint of this adjunction sends $f$ to the map $\phi \colon Y \to \mathbf{2}$ defined by setting $\phi(y)$ to be the truth value of the statement that $y$ lies above some element in the image of $f$. The right adjoint sends $\phi \colon Y \to \mathbf{2}$ to the inclusion of the (upwards closed) subset of $Y$ on which $\phi$ takes the value $\mathsf{true}$. If we take $Y$ to be a discrete poset (i.e. a set) then $\phi$ is just a unary predicate on $Y$, and the right adjoint gives the subset of elements satisfying it.
Lawvere calls the right adjoint of the adjunction the comprehension scheme, and this example shows why.
For metric spaces, the left adjoint sends a Lipschitz-1 map $f \colon X \to Y$ to the function $Y \to [0, \infty]$ given by $y \mapsto \inf_x Y(f x, y)$, i.e. the distance from the image of $f$. The right adjoint sends $\phi \colon Y \to [0, \infty]$ to the inclusion of its vanishing set.
It's interesting to note that for $\mathbf{2}$ and $\mathbf{Set}$, this adjunction restricts to an equivalence between a subcategory (the discrete opfibrations, and inclusions of upwards closed subsets respectively) and the whole functor category. For $[0, \infty]$ however, only those functions that give the distance from a closed subset are invariant under the adjunction, and these correspond to the inclusions of closed sets. Thus purely categorical considerations naturally give rise to the closed sets as a distinguished class of subsets.
Bimodules and Kan extensions
For $\mathcal{V}$-categories $X$ and $Y$, a bimodule $\phi \colon X ⇸ Y$ consists of a $\mathcal{V}$-functor $\phi \colon Y^{\mathrm{op}} \otimes X \to \mathcal{V}$ (the placement of the op is not completely agreed upon, but I'll stick with Lawvere's convention). We can think of a bimodule as a $\mathcal{V}$-category structure on the disjoint union of $X$ and $Y$, with $\phi(y, x)$ as the hom object between an object of $Y$ and an object of $X$, and with all the hom objects in the other direction being the initial object of $\mathcal{V}$. The $\mathcal{V}$-functoriality of $\phi$ means that we have a "composition" operation $X(x, x') \otimes \phi(y, x) \to \phi(y, x')$, and dually one involving $Y$, and that this composition is associative.
There are two special types of bimodule that are particularly important, and they are both induced by $\mathcal{V}$-functors. Given a $\mathcal{V}$-functor $f \colon X \to Y$, we define bimodules $f_* \colon X ⇸ Y$ and $f^* \colon Y ⇸ X$ by
$$f_*(y, x) = Y(y, f x), \qquad f^*(x, y) = Y(f x, y).$$
A morphism between two bimodules is just a $\mathcal{V}$-natural transformation between the corresponding $\mathcal{V}$-functors $Y^{\mathrm{op}} \otimes X \to \mathcal{V}$. Given bimodules $\phi \colon X ⇸ Y$ and $\psi \colon Y ⇸ Z$, we can define the composite bimodule $\psi \otimes \phi \colon X ⇸ Z$ by the coend
$$(\psi \otimes \phi)(z, x) = \int^y \psi(z, y) \otimes \phi(y, x),$$
and this composition is associative (up to isomorphism). The bimodule $X ⇸ X$ given by $X(-, -)$ serves as an identity for this composition, and hence there is a bicategory of $\mathcal{V}$-categories, bimodules and natural transformations.
Let $I$ be the $\mathcal{V}$-category with a single object and whose only hom object is the unit object of $\mathcal{V}$. Then a bimodule $I ⇸ I$ is just an object of $\mathcal{V}$, and composition is just the tensor product in $\mathcal{V}$. So bimodule composition generalizes the tensor product, and it's natural to ask whether the internal hom of $\mathcal{V}$ also generalizes. Since general bimodule composition is not commutative, there are two composition operations (on the left and the right) that could potentially have right adjoints, and in fact they both do, and these are defined by end formulae. If we restrict our attention to modules of the form $f_*$ for a $\mathcal{V}$-functor $f$, the operation of composing with $f_*$ also has a left adjoint, which is given by composing with $f^*$, and can also be written explicitly as a coend.
Let $f$ be a $\mathcal{V}$-functor $X \to Y$. Bimodules out of $I$ are just $\mathcal{V}$-functors, and composing a $\mathcal{V}$-functor $\phi \colon Y \to \mathcal{V}$ with the bimodules induced by $f$ corresponds to the $\mathcal{V}$-functor $\phi \circ f$. Thus the left and right adjoints to precomposition with $f$ in this case give left and right Kan extensions along $f$, and the formulae for these adjoints reduce to the familiar coend and end formulae for left and right Kan extensions:
$$(\mathrm{Lan}_f \phi)(y) = \int^x Y(f x, y) \otimes \phi(x), \qquad (\mathrm{Ran}_f \phi)(y) = \int_x \big[ Y(y, f x), \phi(x) \big].$$
When $f$ is full and faithful, the Kan extensions really are extensions; in other words, if you extend and then restrict, you get back what you started with. So specializing all this to the case $\mathcal{V} = [0, \infty]$ gives:
Proposition. Let $A$ be a subspace of a metric space $X$, and let $f$ be a Lipschitz-1 map from $A$ to $[0, \infty]$. Then $f$ has both maximal and minimal Lipschitz-1 extensions to the whole of $X$, given by
$$x \mapsto \inf_{a \in A} \big( f(a) + X(x, a) \big) \quad \text{and} \quad x \mapsto \sup_{a \in A} \big( f(a) \dot{-} X(a, x) \big).$$
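These two extension formulas are easy to play with on a finite example. Here is a Python sketch (my own illustration, using an ordinary symmetric metric on a few points of the real line rather than a generalized one):

```python
# Maximal and minimal Lipschitz-1 extensions of f from a subset A to X:
#   f_max(x) = inf_a ( f(a) + d(x, a) ),  f_min(x) = sup_a ( f(a) - d(a, x) )+
X = [0.0, 1.0, 3.0, 6.0]            # points on the real line
d = lambda x, y: abs(x - y)          # the usual (symmetric) metric
A = [0.0, 6.0]                       # the subspace
f = {0.0: 2.0, 6.0: 5.0}             # Lipschitz-1 on A: |2 - 5| <= d(0, 6)

f_max = lambda x: min(f[a] + d(x, a) for a in A)
f_min = lambda x: max(max(f[a] - d(a, x), 0.0) for a in A)

for x in X:
    for y in X:
        # both extensions are Lipschitz-1 on all of X ...
        assert abs(f_max(x) - f_max(y)) <= d(x, y) + 1e-9
        assert abs(f_min(x) - f_min(y)) <= d(x, y) + 1e-9
for a in A:
    # ... they restrict to f on A ...
    assert f_max(a) == f[a] and f_min(a) == f[a]
# ... and every other Lipschitz-1 extension is squeezed between them
assert all(f_min(x) <= f_max(x) for x in X)
```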
A particularly interesting example is given when $A$ is the space of simple, non-negative functions on a probability space (i.e. positive linear combinations of indicator functions of measurable sets), and $A \to X$ is the inclusion into the space of all measurable functions (with the sup metric). If you take $f$ to be the integral of a simple function, then the two Kan extensions of $f$ give the upper and lower integrals of a measurable function. In particular, a measurable function is integrable if and only if its upper and lower integrals agree.
If we think of a map from a poset $X$ to $\mathbf{2}$ as a unary relation or predicate, then the left and right Kan extensions along an order-preserving map $f \colon X \to Y$ become existential and universal quantification, relativized along $f$.
If we restrict to the case when $X$ and $Y$ are just sets, so that $f$ is an arbitrary function, this gives quantification over the fibres of $f$,
and specializing even further to the case when $f$ is a product projection $X \times Y \to Y$, we get ordinary quantification over $X$.
So existential and universal quantification are (very) special cases of Kan extensions. This leads Lawvere to use the phrase "Kan quantification" for Kan extensions.
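The special case above can be written out explicitly; this is the standard reduction (in my notation, not necessarily Lawvere's):

```latex
% For a function f : A -> B between sets and a predicate \phi : A -> 2,
% the Kan extensions along f quantify over the fibres of f:
(\mathrm{Lan}_f \phi)(b) \;=\; \exists a.\; \big( f(a) = b \big) \wedge \phi(a),
\qquad
(\mathrm{Ran}_f \phi)(b) \;=\; \forall a.\; \big( f(a) = b \big) \Rightarrow \phi(a).
% When f is a projection A x B -> B these become
% \exists a.\, \phi(a, b) and \forall a.\, \phi(a, b).
```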
Every $\mathcal{V}$-functor $f$ gives rise to an adjunction $f_* \dashv f^*$ in the bicategory of bimodules, and it's natural to ask under what conditions every adjunction arises in this way. Recall that a metric space is (Cauchy) complete if every Cauchy sequence has a limit. A bit of care is needed when making this precise, because the possible non-symmetry of the metric means there are several things it could mean for a sequence to be Cauchy or convergent.
Proposition. A metric space $X$ is Cauchy complete if and only if every adjunction of bimodules into $X$ is induced by a Lipschitz-1 map. In particular, the points of the Cauchy completion of $X$ correspond to bimodule adjunctions $1 ⇸ X$ (where $1$ is the one point metric space).
This gives a description of Cauchy complete metric spaces that is purely categorical, so it makes sense to talk about Cauchy complete $\mathcal{V}$-categories for arbitrary $\mathcal{V}$. It turns out that an ordinary ($\mathbf{Set}$-enriched) category is Cauchy complete iff all idempotents split, and the Cauchy completion of a ring regarded as a one-object category enriched in abelian groups is the category of finitely generated projective modules.
These days it’s more common to see Cauchy complete categories defined in terms of absolute colimits. I’m not too sure about the history of the concept, but I think the definition Lawvere gives came first, and Street later gave the characterisation in terms of absolute colimits in “Absolute colimits in enriched categories”.
Finally, Lawvere defines a $\mathcal{V}$-graph to consist of a set of "vertices" and, for every pair of vertices $(x, y)$, an object $G(x, y)$ of $\mathcal{V}$ (i.e. like a $\mathcal{V}$-category but without composition or identities). A morphism of $\mathcal{V}$-graphs is a function between the vertex sets and a family of morphisms between the hom objects. There is an evident forgetful functor from $\mathcal{V}\text{-}\mathbf{Cat}$ to the category of $\mathcal{V}$-graphs. This has a left adjoint which sends a $\mathcal{V}$-graph $G$ to the $\mathcal{V}$-category which has the vertices of $G$ as objects and hom objects defined by
$$\bar{G}(x, y) = \sum G(x, x_1) \otimes G(x_1, x_2) \otimes \cdots \otimes G(x_{n-1}, y),$$
where the sum runs over all finite sequences of vertices. In the case of metric spaces, this gives the "least-cost" distance:
$$d(x, y) = \inf \big( G(x, x_1) + G(x_1, x_2) + \cdots + G(x_{n-1}, y) \big).$$
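On a finite graph the least-cost distance is just the shortest-path distance, so the free metric space on a $[0, \infty]$-graph can be computed by Floyd–Warshall. A small Python sketch (my own, with a hypothetical three-vertex graph):

```python
# Free (generalized) metric space on a weighted graph: the least-cost
# distance is the infimum over finite paths of the sum of edge weights.
INF = float('inf')
verts = ['a', 'b', 'c']
G = {('a', 'b'): 1.0, ('b', 'c'): 2.0, ('a', 'c'): 10.0}  # hom-weights

# start from the graph, with d(x, x) = 0 as the identity
d = {(x, y): (0.0 if x == y else G.get((x, y), INF))
     for x in verts for y in verts}
# close under composition: enforce d(x, y) <= d(x, k) + d(k, y)
for k in verts:
    for x in verts:
        for y in verts:
            d[x, y] = min(d[x, y], d[x, k] + d[k, y])

assert d['a', 'c'] == 3.0   # the path a -> b -> c beats the direct edge
assert d['c', 'a'] == INF   # no path back: the distance is infinite
# the triangle inequality now holds, so this is a generalized metric space
assert all(d[x, z] <= d[x, y] + d[y, z]
           for x in verts for y in verts for z in verts)
```

Note that the result is naturally non-symmetric: the graph has an arrow from `a` to `c` but none back, exactly the kind of example mentioned above.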
I’ll finish with a few questions for those more knowledgeable than myself:
I think the only categorical notion in the paper that I hadn't at least heard of before is Lawvere's notion of density (which, as noted above, is different from what people usually mean by a dense functor). Does this appear elsewhere in category theory, or is it only interesting for metric spaces?
A lot of the results about metric spaces are very similar to classical results, but are not quite the same because of the non-standard metric on $[0, \infty]$. Is there any way of recovering the classical results without too much work?
The idea of generalized logic is intriguing, but I’m not sure how far you can go with it. Is it possible, for example, to talk about generalized models of a generalized theory?
You may recall how Tom Leinster, Tobias Fritz and I cooked up a neat category-theoretic characterization of entropy in a long conversation here on this blog. Now Tobias and I have a sequel giving a category-theoretic characterization of relative entropy. But since some people might be put off by the phrase ‘category-theoretic characterization’, it’s called:
I’ve written about this paper before, on my other blog:
- Relative Entropy (Part 1): how various structures important in probability theory arise naturally when you do linear algebra using only the nonnegative real numbers.
- Relative Entropy (Part 2): a category related to statistical inference, and how relative entropy defines a functor on this category.
- Relative Entropy (Part 3): statement of our main theorem, which characterizes relative entropy up to a constant multiple as the only functor with a few nice properties.
But now the paper is actually done! Let me give a compressed version of the whole story here… with sophisticated digressions buried in some parenthetical remarks that you’re free to skip if you want.
Our paper gives a new characterization of the concept of relative entropy, also known as ‘relative information’, ‘information gain’ or—by people who like to use jargon to make their work seem obscure—‘Kullback-Leibler divergence’.
Here’s the basic idea. Whenever you have two probability distributions $p$ and $q$ on the same finite set $X$, you can define the entropy of $q$ relative to $p$:
$$S(q, p) = \sum_{i \in X} q_i \ln\left( \frac{q_i}{p_i} \right).$$
Here we set
$$q_i \ln\left( \frac{q_i}{p_i} \right)$$
equal to $\infty$ when $p_i = 0$, unless $q_i$ is also zero, in which case we set it equal to 0. Relative entropy thus takes values in $[0, \infty]$.
Intuitively speaking, $S(q, p)$ measures how surprised you’d be if you thought a situation was described by a probability distribution $p$… but then someone came along and said no, it’s really $q$.
Or if ‘surprise’ sounds too subjective, it’s the expected amount of information gained when you discover the probability distribution is really $q$ when you’d thought it was $p$.
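The definition, conventions included, fits in a few lines of Python (a sketch of my own; the name `rel_entropy` is mine):

```python
# Relative entropy S(q, p) = sum_i q_i ln(q_i / p_i), with the conventions
# above: a term is 0 when q_i = 0, and +oo when q_i > 0 but p_i = 0.
from math import log, inf

def rel_entropy(q, p):
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0:
            continue          # 0 ln(0 / p_i) = 0
        if pi == 0:
            return inf        # positive mass where p vanishes
        total += qi * log(qi / pi)
    return total

p = [0.5, 0.5]
q = [0.9, 0.1]
assert rel_entropy(p, p) == 0.0           # no surprise at all
assert rel_entropy(q, p) > 0.0            # Gibbs' inequality
assert rel_entropy(q, [1.0, 0.0]) == inf  # infinite surprise
```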
Tobias and I wanted to use category theory to say what’s so great about relative entropy. We did it using a category $\mathrm{FinStat}$ where:
- an object $(X, q)$ consists of a finite set $X$ and a probability distribution $q$ on that set;
- a morphism $(f, s) \colon (X, q) \to (Y, r)$ consists of a measure-preserving function $f$ from $X$ to $Y$, together with a probability distribution $s_{x y}$ on $X$ for each element $y$ of $Y$, with the property that $s_{x y} = 0$ unless $f(x) = y$.
If the raw math seems hard to swallow, perhaps some honey-coated words will help it go down. I think of an object of $\mathrm{FinStat}$ as a system with some finite set of states together with a probability distribution on its states. This lets me think of a morphism
$$(f, s) \colon (X, q) \to (Y, r)$$
in a nice way. First, there’s a measurement process $f \colon X \to Y$, a function from the set of states of some system being measured to the set of states of some measurement apparatus. The condition that $f$ be measure-preserving says the probability $r_y$ that the apparatus winds up in any state $y$ is the sum of the probabilities of all states of $X$ leading to that outcome:
$$r_y = \sum_{x \colon f(x) = y} q_x.$$
Second, there’s a hypothesis $s$. This is a guess about the probability that the system being measured is in the state $x$ given any measurement outcome $y$. The guess is the number $s_{x y}$.
Now, suppose we have any morphism
$$(f, s) \colon (X, q) \to (Y, r)$$
in $\mathrm{FinStat}$. From this we get two probability distributions on $X$. First, we have the probability distribution $s \circ r$ given by
$$(s \circ r)_x = \sum_{y \in Y} s_{x y} \, r_y.$$
This is our best guess about the probability that the system is in any given state, given our hypothesis and the probability distribution of measurement results. Second, we have the ‘true’ probability distribution $q$.
In fact, this way of assigning relative entropies to morphisms defines a functor
$$RE \colon \mathrm{FinStat} \to [0, \infty],$$
where we use $[0, \infty]$ to denote the category with one object, the numbers in $[0, \infty]$ as morphisms, and addition as composition. More precisely, if
$$(f, s) \colon (X, q) \to (Y, r)$$
is any morphism in $\mathrm{FinStat}$, we define
$$RE(f, s) = S(q, s \circ r),$$
where $s \circ r$ is defined as in the equation above. This tells us how surprised we are when we learn the true probability distribution $q$, if our measurement results were distributed according to $r$ and our hypothesis was $s$.
The fact that $RE$ is a functor is nontrivial and rather interesting! It says that given any composable pair of measurement processes:

$$ (X, q) \xrightarrow{(f, s)} (Y, r) \xrightarrow{(g, t)} (Z, u) $$

the relative entropy of their composite is the sum of the relative entropies of the two parts:

$$ RE\big((g, t) \circ (f, s)\big) = RE(g, t) + RE(f, s) $$
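This additivity is easy to check numerically. Below is a sketch in Python — the encoding of a morphism as a pair of dictionaries (a function and a hypothesis) is my own, not the paper’s; composition composes the measurement functions one way and the hypotheses the other way:

```python
import math

def relative_entropy(q, p):
    """S(q, p) for distributions given as dicts keyed by states."""
    total = 0.0
    for x, qx in q.items():
        if qx == 0:
            continue
        px = p.get(x, 0.0)
        if px == 0:
            return math.inf
        total += qx * math.log(qx / px)
    return total

def pushforward(f, q):
    """r_y = sum of q_x over the fiber f^{-1}(y)."""
    r = {}
    for x, y in f.items():
        r[y] = r.get(y, 0.0) + q[x]
    return r

def guess(s, r):
    """(s . r)_x = sum_y s_xy * r_y."""
    out = {}
    for y, dist in s.items():
        for x, sxy in dist.items():
            out[x] = out.get(x, 0.0) + sxy * r[y]
    return out

def RE(f, s, q):
    """Relative entropy of the morphism (f, s) out of (X, q)."""
    return relative_entropy(q, guess(s, pushforward(f, q)))

# Two composable morphisms: X = {0,1,2} -> Y = {'a','b'} -> Z = {'*'}
q = {0: 0.2, 1: 0.3, 2: 0.5}
f = {0: 'a', 1: 'a', 2: 'b'}
s = {'a': {0: 0.5, 1: 0.5}, 'b': {2: 1.0}}
g = {'a': '*', 'b': '*'}
t = {'*': {'a': 0.4, 'b': 0.6}}

# Composite: g.f on the function side, s.t on the hypothesis side.
gf = {x: g[f[x]] for x in f}
st = {z: {x: sum(s[y].get(x, 0.0) * tyz for y, tyz in dist.items())
          for x in q}
      for z, dist in t.items()}

r = pushforward(f, q)
lhs = RE(gf, st, q)
rhs = RE(f, s, q) + RE(g, t, r)
print(lhs, rhs)   # equal, up to floating-point rounding
```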
We prove that $RE$ is a functor. However, we go further: we characterize relative entropy by saying that, up to a constant multiple, $RE$ is the unique functor from FinStat to $[0, \infty]$ obeying three reasonable conditions.
Lower semicontinuity

The first condition is that $RE$ is lower semicontinuous. The set of probability distributions on a finite set $X$ naturally has the topology of an $(n-1)$-simplex when $X$ has $n$ elements. The set $[0, \infty]$ has an obvious topology where it’s homeomorphic to a closed interval. However, with these topologies, the relative entropy does not define a continuous function.
The problem is that

$$ S(q, p) = \sum_i q_i \ln\left(\frac{q_i}{p_i}\right) $$

and the term $q_i \ln(q_i/p_i)$ is $0$ when $q_i = p_i = 0$ — but it’s $+\infty$ when $q_i > 0$ and $p_i = 0$.
So, it turns out that $S$ is only lower semicontinuous, meaning that if $p^{(n)}$ and $q^{(n)}$ are sequences of probability distributions on $X$ with $p^{(n)} \to p$ and $q^{(n)} \to q$, then

$$ S(q, p) \le \liminf_{n \to \infty} S(q^{(n)}, p^{(n)}) $$
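Here’s a minimal numerical illustration of that jump (the helper just implements the defining sum with the conventions above):

```python
import math

def relative_entropy(q, p):
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0:
            continue              # 0 * ln(0/p) = 0 by convention
        if pi == 0:
            return math.inf       # q puts mass where p puts none
        total += qi * math.log(qi / pi)
    return total

p = (0.0, 1.0)
# q^(n) -> (0, 1), but every q^(n) puts a little mass where p puts none:
for n in (10, 100, 1000):
    print(relative_entropy((1/n, 1 - 1/n), p))   # inf, every time

print(relative_entropy((0.0, 1.0), p))           # 0.0 at the limit
```

The values along the sequence are all infinite, yet the value at the limit point is zero: the function jumps *down* in the limit, which is exactly what lower semicontinuity permits.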
We give the set of morphisms in FinStat its most obvious topology, and show that with this topology, $RE$ maps morphisms to morphisms in a lower semicontinuous way.
(Lower semicontinuity may seem like an annoying property. But there’s a way to redeem it. There’s a sneaky topology on $[0, \infty]$ such that a function taking values in $[0, \infty]$ is lower semicontinuous (in the lim inf sense above) if and only if it’s continuous with respect to this sneaky topology!
Using this idea, we can make FinStat and $[0, \infty]$ into topological categories — that is, categories internal to Top — in such a way that lower semicontinuity simply says that

$$ RE \colon \mathrm{FinStat} \to [0, \infty] $$
is a continuous functor.
A bit confusingly, this sneaky topology on $[0, \infty]$ is called the upper topology. I’ve fallen in love with the upper topology on $[0, \infty]$. Why?
Well, $[0, \infty]$ is a very nice rig, or ‘ring without negatives’. Addition is defined in the obvious way, and multiplication is defined in the almost-obvious way, except that

$$ 0 \cdot \infty = 0 $$
Even this is actually obvious if you remember that it’s required by the definition of a rig. But if you try to put the ordinary closed interval topology on $[0, \infty]$, you’ll see multiplication is not continuous, because $x \cdot \infty$ is infinite when $x > 0$, but then it suddenly jumps down to zero when $x$ hits zero. However, multiplication is continuous if we give $[0, \infty]$ the upper topology! Then $[0, \infty]$ becomes a topological rig.)
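As an aside, IEEE floating-point arithmetic does not share this convention — it makes $0 \cdot \infty$ a NaN — so in code the rig multiplication has to be imposed by hand. A small sketch:

```python
import math

def rig_mul(x, y):
    """Multiplication in the rig [0, inf], with the convention 0 * inf = 0."""
    if x == 0 or y == 0:
        return 0.0                      # zero annihilates, even against infinity
    return x * y

print(math.isnan(0.0 * math.inf))       # True: IEEE arithmetic punts
print(rig_mul(0.0, math.inf))           # 0.0: the rig convention

# The discontinuity in the closed-interval topology: x * inf as x -> 0+
for x in (1.0, 0.1, 0.001):
    print(rig_mul(x, math.inf))         # inf every time, then 0.0 at x = 0
```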
Convex linearity

The second condition is that $RE$ is convex linear. We describe how to take convex linear combinations of morphisms in FinStat, and then the functor $RE$ maps any convex linear combination of morphisms in FinStat to the corresponding convex linear combination of numbers in $[0, \infty]$.
Intuitively, this means that if we take a coin with probability $\lambda$ of landing heads up, and flip it to decide whether to perform one measurement process or another, the expected information gained is $\lambda$ times the expected information gain of the first process plus $1 - \lambda$ times the expected information gain of the second process.
(Operadically, the point is that both FinStat and $[0, \infty]$ are algebras of an operad P whose operations are convex linear combinations. The $n$-ary operations in P are just probability distributions on an $n$-element set. In other words, they’re points in the $(n-1)$-simplex.
So, saying that $RE$ is convex linear means that

$$ RE \colon \mathrm{FinStat} \to [0, \infty] $$

is a map of P-algebras. But we avoid discussing this in our paper because FinStat, being a category, is just a ‘weak’ P-algebra, and we decided this would be too much for our poor little readers.
For those who like fine nuances: P is a topological operad, and FinStat and $[0, \infty]$ are algebras of this operad in the topological category TopCat. As I mentioned, FinStat is a ‘weak’ P-algebra, meaning the laws for convex linear combinations hold only up to coherent natural isomorphism. $[0, \infty]$ is strict… but to get convex linear combinations like $\lambda x + (1 - \lambda) y$ to behave continuously, we have to give $[0, \infty]$ the upper topology!)
Vanishing on a subcategory
The third condition is that $RE$ vanishes on morphisms where the hypothesis is optimal. By this, we mean that the formula for $s \circ r$ above gives a probability distribution equal to the ‘true’ one: $s \circ r = q$.
That makes a lot of sense conceptually: we don’t gain any information upon learning the truth about a situation if we already knew the truth!
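Concretely, an optimal hypothesis is just conditional probability: $s_{x y} = q_x / r_y$ for each $x$ in the fiber over $y$. Feeding it back through the measurement then recovers $q$ exactly, so every term in the relative entropy vanishes. A toy check (the numbers are my own):

```python
q = {0: 0.2, 1: 0.3, 2: 0.5}          # true distribution on X
f = {0: 'a', 1: 'a', 2: 'b'}          # measurement f : X -> Y

r = {}                                 # pushforward distribution on Y
for x, y in f.items():
    r[y] = r.get(y, 0.0) + q[x]

# Optimal hypothesis: s_xy = q_x / r_y when f(x) = y, else 0.
s = {y: {x: (q[x] / r[y] if f[x] == y else 0.0) for x in q} for y in r}

# The reconstructed guess equals the true distribution, so every term
# q_x * ln(q_x / guess_x) vanishes and the relative entropy is 0.
guess = {x: sum(s[y][x] * r[y] for y in r) for x in q}
print(guess == q)   # True
```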
(But the subcategory of FinStat where we keep all the objects but only these ‘optimal’ morphisms also has a nice category-theoretic significance. Tom Leinster called it FP in this post:
That’s because it’s the ‘free P-algebra on an internal P-algebra’, where P is the operad I mentioned. I won’t explain what this means here, because Tom did it! Suffice it to say that it’s a shockingly abstract piece of operad theory that nonetheless manages to capture the concept of entropy very neatly. But that’s plain old entropy, not relative entropy.)
Here, then, is our main result:
Theorem. Any lower semicontinuous, convex-linear functor

$$ F \colon \mathrm{FinStat} \to [0, \infty] $$

that vanishes on every morphism with an optimal hypothesis must equal some constant times the relative entropy. In other words, there exists some constant $c \in [0, \infty]$ such that

$$ F(f, s) = c \, RE(f, s) $$

for any morphism $(f, s)$ in FinStat.
The proof is surprisingly hard. Or maybe we’re just surprisingly bad at proving things. But the interesting thing is this: the proof is swift and effective in the ‘generic’ case — the case where the support of the probability measures involved is the whole set they’re living on, and the constant $c$ is finite.
It takes some more work to handle the case where the probability measures have smaller support.
But the really hard work starts when we handle the case that, in the end, has $c = \infty$. Then the proof becomes more like analysis than what you normally expect in category theory. We slowly corner the result, blocking off all avenues of escape. Then we close in, grab its neck, and strangle it, crushing its larynx ever tighter, as it loses the will to fight back and finally expires… still twitching.
You’ve got to read the proof to understand what I mean.
Stefan Forcey just alerted me to a long and reflective piece (edit: now paywalled) in the Chronicle of Higher Education about the relationship between the NSA and American academics — mathematicians in particular.
Now many academics are trying to be heard from the outside, arguing that the NSA’s spying tactics are proving counterproductive and that university researchers have a duty to stop assisting them. […]
In the months since Edward J. Snowden fled the United States with tens of thousands of electronic documents describing NSA practices, mathematicians are realizing that they are in the same position as nuclear physicists in the middle of the last century, and business students in more recent times — suddenly needing to figure out the ethics behind what they do, said Edward Frenkel, a professor of mathematics at the University of California at Berkeley.
Their appeals were followed on January 24 by an open letter from a group of 50 researchers warning of long-term damage to society and to the nation’s technological enterprise from the NSA’s reported tactic of intentionally weakening computer-security standards so it can carry out spy operations.
“Every country, including our own, must give intelligence and law-enforcement authorities the means to pursue terrorists and criminals,” the researchers wrote, “but we can do so without fundamentally undermining the security that enables commerce, entertainment, personal communication, and other aspects of 21st-century life.”
Shunned as NSA advisers, academics question their ties to the agency, The Chronicle of Higher Education, 10 February 2014. [Article was readable for free when I first put this post up, but got put behind a paywall a few days later.]