Freeman Dyson is a famous physicist who has also dabbled in number theory quite productively. If some random dude said the Riemann Hypothesis was connected to quasicrystals, I’d probably dismiss him as a crank. But when Dyson says this, it’s a lot more interesting. So I’ve been trying to understand his remarks on this. And it’s been productive, in that I’ve learned some interesting things, and I now feel closer to seeing why the Riemann Hypothesis is a natural and important conjecture.
But still, I could use a lot of help: I don’t have much time for number theory, and a few pointers from experts could keep me from going down dead ends.
Those of you who were using the internet around 1990 may remember newsgroups like sci.math, sci.physics, sci.math.research and sci.physics.research. That’s how internet-savvy mathematicians and physicists communicated back then. And if you were around then, you may remember Matt McIrvin, who wrote consistently intelligent and good-natured posts about physics.
I lost track of him for a while, but met him again on Google+, where has been using the open-source math software called Sage to play around with ideas from number theory.
Sage comes equipped with a table of nontrivial zeros of the Riemann zeta function computed by Andrew Odlyzko. Remember, this function is given by
when , but it can be analytically continued over the whole complex plane except for a pole at 1, and then it turns out that for a lot of points on a line:
where is a real number. Here are the first few:
The Riemann Hypothesis claims that all the zeros of the zeta function lie on this line, except for the so-called trivial zeros at and so on. Riemann only checked this for the first three cases… but by now people have checked it for the first 10,000,000,000,000 cases. So it seems to be true, but nobody can prove it.
The nontrivial zeros of the Riemann zeta function are really interesting. There’s no simple formula for them, but they encode information about prime numbers. Riemann was the first to notice this… but Matt ran into it on his own.
He took the first ten thousand positive numbers that make
and he added up a bunch of functions like this:
one for each .
The result is a function of , and he graphed its absolute value. The graph has lots of sharp spikes!
And where are these spikes? He zoomed in on the first few:
A closeup of those first spikes. I wonder what those numbers are, exactly? Probably they’re in the literature somewhere.
I looked around and soon found that those spikes should be here:
See the pattern?
In math jargon, what Matt did is take the Fourier transform of a sum of Dirac deltas supported at the imaginary parts of the nontrivial Riemann zeta zeros. The answer seemed to be another sum of Dirac deltas, times different numbers: the different spikes in the pictures above seem to have different heights. It’s unusual to take the Fourier transform of such a spiky function and get another spiky function. And according to Freeman Dyson, this is the defining feature of a quasicrystal!
When I was looking around for clues, one of the first things I ran into was a lecture by Dyson. He never actually delivered this lecture—it was cancelled at the last minute for some reason—but it was printed here:
- Freeman Dyson, Frogs and birds, Notices of the American Mathematical Society 56 (2009), 212–223.
It was about two styles of doing mathematics, hence the curious title. He said:
The proof of the Riemann Hypothesis is a worthy goal, and it is not for us to ask whether we can reach it. I will give you some hints describing how it might be achieved. Here I will be giving voice to the mathematician that I was fifty years ago before I became a physicist. I will talk first about the Riemann Hypothesis and then about quasicrystals.
There were until recently two supreme unsolved problems in the world of pure mathematics, the proof of Fermat’s Last Theorem and the proof of the Riemann Hypothesis. Twelve years ago, my Princeton colleague Andrew Wiles polished off Fermat’s Last Theorem, and only the Riemann Hypothesis remains. Wiles’ proof of the Fermat Theorem was not just a technical stunt. It required the discovery and exploration of a new field of mathematical ideas, far wider and more consequential than the Fermat Theorem itself. It is likely that any proof of the Riemann Hypothesis will likewise lead to a deeper understanding of many diverse areas of mathematics and perhaps of physics too. Riemann’s zeta-function, and other zeta-functions similar to it, appear ubiquitously in number theory, in the theory of dynamical systems, in geometry, in function theory, and in physics. The zeta-function stands at a junction where paths lead in many directions. A proof of the hypothesis will illuminate all the connections. Like every serious student of pure mathematics, when I was young I had dreams of proving the Riemann Hypothesis. I had some vague ideas that I thought might lead to a proof. In recent years, after the discovery of quasicrystals, my ideas became a little less vague. I offer them here for the consideration of any young mathematician who has ambitions to win a Fields Medal.
Quasicrystals can exist in spaces of one, two, or three dimensions. From the point of view of physics, the three-dimensional quasicrystals are the most interesting, since they inhabit our three-dimensional world and can be studied experimentally. From the point of view of a mathematician, one-dimensional quasicrystals are much more interesting than two-dimensional or three-dimensional quasicrystals because they exist in far greater variety. The mathematical definition of a quasicrystal is as follows. A quasicrystal is a distribution of discrete point masses whose Fourier transform is a distribution of discrete point frequencies. Or to say it more briefly, a quasicrystal is a pure point distribution that has a pure point spectrum. This definition includes as a special case the ordinary crystals, which are periodic distributions with periodic spectra.
Excluding the ordinary crystals, quasicrystals in three dimensions come in very limited variety, all of them associated with the icosahedral group. The two-dimensional quasicrystals are more numerous, roughly one distinct type associated with each regular polygon in a plane. The two-dimensional quasicrystal with pentagonal symmetry is the famous Penrose tiling of the plane.
Finally, the one-dimensional quasicrystals have a far richer structure since they are not tied to any rotational symmetries. So far as I know, no complete enumeration of one-dimensional quasicrystals exists. It is known that a unique quasicrystal exists corresponding to every Pisot–Vijayaraghavan number or PV number. A PV number is a real algebraic integer, a root of a polynomial equation with integer coefficients, such that all the other roots have absolute value less than one . The set of all PV numbers is infinite and has a remarkable topological structure. The set of all one-dimensional quasicrystals has a structure at least as rich as the set of all PV numbers and probably much richer. We do not know for sure, but it is likely that a huge universe of one-dimensional quasicrystals not associated with PV numbers is waiting to be discovered.
Here comes the connection of the one-dimensional quasicrystals with the Riemann Hypothesis. If the Riemann Hypothesis is true, then the zeros of the zeta-function form a one-dimensional quasicrystal according to the definition. They constitute a distribution of point masses on a straight line, and their Fourier transform is likewise a distribution of point masses, one at each of the logarithms of ordinary prime numbers and prime-power numbers. My friend Andrew Odlyzko has published a beautiful computer calculation of the Fourier transform of the zeta-function zeros . The calculation shows precisely the expected structure of the Fourier transform, with a sharp discontinuity at every logarithm of a prime or prime-power number and nowhere else.
My suggestion is the following. Let us pretend that we do not know that the Riemann Hypothesis is true. Let us tackle the problem from the other end. Let us try to obtain a complete enumeration and classification of one-dimensional quasicrystals. That is to say, we enumerate and classify all point distributions that have a discrete point spectrum […] We shall then find the well-known quasicrystals associated with PV numbers, and also a whole universe of other quasicrystals, known and unknown. Among the multitude of other quasicrystals we search for one corresponding to the Riemann zeta-function and one corresponding to each of the other zeta-functions that resemble the Riemann zeta-function. Suppose that we find one of the quasicrystals in our enumeration with properties that identify it with the zeros of the Riemann zeta-function. Then we have proved the Riemann Hypothesis and we can wait for the telephone call announcing the award of the Fields Medal.
These are of course idle dreams. The problem of classifying one-dimensional quasicrystals is horrendously difficult, probably at least as difficult as the problems that Andrew Wiles took seven years to explore. But if we take a Baconian point of view, the history of mathematics is a history of horrendously difficult problems being solved by young people too ignorant to know that they were impossible. The classification of quasicrystals is a worthy goal, and might even turn out to be achievable.
 M. J. Bertin et al., Pisot and Salem Numbers, Birkhäuser, Boston, 1992.
 A. M. Odlyzko, Primes, quantum chaos and computers, in Number Theory: Proceedings of a Symposium, 4 May 1989, Washington, DC, USA (National Research Council, 1990), pp. 35–46.
Well his definition of quasicrystals is not the one used by the quasicrystal community (we actually don’t have a formal mathematical definition yet)…. His statement about icosahedral group is false, actually most 3-dimensional models don’t have any symmetry group. Same issue for the 2 dimensional quasicrystals, most of them are not related to polygons in the plane. […] I really have no idea what he mean by “it is well known that a unique quasicrystal exists corresponding to every PV number”. The existence is true, the uniqueness is far for true… Unless I’m making a terrible mistake, there are constructions which produce pure point diffractive sets from PV numbers, and they produce uncountably many… In many situations, but not always, one can probably get that most of them are “equivalent” in some sense, but not all of them… And the big issue is that any equivalence in this sense, unless one adds very strong extra conditions, allows for .. small translations of the points… And there are of course uncountably many models which are not associated to PV numbers. Another issue is that the zeroes of the RZF are not a Delone set, so anything done so far by the quasicrystal community is not relevant to the problem… And last, I really don’t see how one can go around the following issue: Let be the set of zeroes. Let be the set obtained by moving all the zeroes, such that the th zero is moved by at most . Then diffraction cannot differentiate between and .
I don’t know if these remarks are true, so if any of you know, please tell me… preferably with references!
I don’t care too much if Dyson is using ‘quasicrystal’ in a nonstandard sense. He at least seems to be hinting at a fairly precise definition, perhaps “a countable sum of Dirac deltas on that defines a tempered distribution whose Fourier transform is a countable linear combination of Dirac deltas”. The phrase ‘defines a tempered distribution’ just means the Dirac deltas don’t bunch up too fast, so the Fourier transform is well-defined. Allowing the original sum to also be a more general linear combination of Dirac deltas might be be nice, too: then the Fourier transform of a quasicrystal would be another quasicrystal!
Anyway, what I’d like to learn is what’s known about such entities. In what sense, if any, does any Pisot–Vijayaraghavan number give a unique quasicrystal in 1 dimension? Do de Bruijn’s quasiperiodic tilings, like this one drawn by Greg Egan’s software, give quasicrystals in Dyson’s sense?
Is it really true that all quasicrystals in 3 dimensions are related to the icosahedral group? What’s the theorem there? And what about the 4-dimensional pattern built from the E8 lattice—is that a quasicrystal in Dyson’s sense? Is it hard to find higher-dimensional quasicrystals, or easy?
The Guinand–Weil explicit formula
But now I’d like to come to my actual point, which concerns this remark:
If the Riemann Hypothesis is true, then the zeros of the zeta-function form a one-dimensional quasicrystal according to the definition. They constitute a distribution of point masses on a straight line, and their Fourier transform is likewise a distribution of point masses, one at each of the logarithms of ordinary prime numbers and prime-power numbers.
Is this actually true? In other words, does the Riemann Hypothesis actually imply this? And if so, why?
At first I thought it was true. That would make a very nice bumper sticker explaining the ‘meaning’ of the Riemann Hypothesis… something like
RIEMANN ZETA ZEROS ARE FOURIER DUAL TO LOGS OF PRIME POWERS!
This is the first version of the Riemann Hypothesis I’ve seen that makes me really want it to be true. You can see it discussed in this excellent book draft here, with lots of pretty pictures:
- Barry Mazur and Richard Stein, Primes.
But it seems the truth is a bit more complicated. The truth is called the explicit formula of Guinand and Weil and it involves terms, not only for the nontrivial zeros of the zeta function, but also for the trivial zeros, and the pole. And in fact it’s best to think of this formula not in terms of the original Riemann zeta function, but the ‘corrected’ version that takes the ‘prime at infinity’ into account using the gamma function, namely:
This ‘corrected’ version has the all-important symmetry:
that lets you see the zeros of this function, and thus all the nontrivial zeros of the original Riemann zeta function, lie in the strip .
So, I still have hope for getting a conceptually clear statement of the Riemann Hypothesis that’s exactly correct! However, so far, I can’t seem to say something correct without it looking rather messy. For example, on Mathoverflow Brad Rogers stated a version of the Guinand–Weil explicit formula that looks about like this:
For a compactly supported smooth function with Fourier transform ,
Here the sum is over such that is a non-trivial zero of the Riemann zeta function. is the Chebyshev prime counting function:
Alas, this formula doesn’t look very ‘conceptual’, though I think I’m beginning to understand it, and Marc Palm gave a nice sketch of a proof.
In this formula, can be complex if the Riemann Hypothesis is false. But if it’s true, is always real, and the left-hand side of the big equation
is really just the test function integrated against a sum of complex exponentials, one for each nontrivial zero of the zeta function. (I should warn you that these zeros come in complex conjugate pairs, so for each positive real we get a corresponding negative ).
The right-hand side of the big equation contains a ‘nice’ term that’s a sum over prime powers, but also some ‘corrections’ that seem to make Dyson’s claim fail to be literally true. For example, besides a linear combination of Dirac deltas at logarithms of prime powers, there’s the correction term proportional to the function . Owen Maresh has plotted this function:
So, I’m thinking that the slow rise at right here:
might not be an artifact of numerical approximations, but an actual real thing: this function .
So: what’s the really neat way to write an ‘explicit formula’ relating prime powers and Riemann zeta zeros… which simplifies in some way iff the Riemann Hypothesis holds? And is there a way to modify Freeman Dyson’s claim here, that makes it correct while still maintaining a connection to quasicrystals?
If the Riemann Hypothesis is true, then the zeros of the zeta-function form a one-dimensional quasicrystal according to the definition. They constitute a distribution of point masses on a straight line, and their Fourier transform is likewise a distribution of point masses, one at each of the logarithms of ordinary prime numbers and prime-power numbers.
Here it is at last: the Selected Papers Network. Given that social networks already exist, all we need for truly open scientific communication is a convention on a consistent set of tags and IDs for discussing papers. Christopher Lee, in bioinformatics at UCLA, has developed software that makes this work. Try it out!
What’s cool about this system is that it’s federated. Instead of locking up your comments within its own website—the “walled garden” strategy followed by many other service—it explicitly shares these data in a way that people not on the Selected Papers Network can easily see. Any other service can see and use them too.
To learn about the problems that make us want this system, read this:
To learn how the system works, and to try it out, go here:
I just returned from a month at Hong Kong University, visiting James Fullwood, an algebraic geometer who likes to think about the mathematics of string theory. There, I gave a colloquium on G2 and the rolling ball, a paper John Baez and I wrote that is due to appear in Transactions of the AMS. This project began over a decade ago in conversations between John and Jim Dolan, later continued between Jim and me. Though Jim opted not to be a coauthor, his insights were crucial.
I would like to tell you about this paper, but I’ll warm up with a puzzle—one you’ve seen in several guises if you’ve read John Baez’s posts about this paper, but well-worth revisiting.
Puzzle: Roll a ball of unit radius on a fixed ball of radius , being careful not to let your ball slip or twist as you roll it. Suppose you roll along a great circle from the North Pole to the South Pole and back to the start at the North. How many turns did the ball make as it rolled?
Here’s a variant of this puzzle you can work out very concretely: for , roll one coin around another of the same kind, without slipping, and count the number of times the rolling coin turns.
Below the fold, I’ll give you the answer, and I’ll also tell you about something amazing that happens when , bringing in the exceptional Lie group and a funky 8-dimensional number system called the split octonions.
When Cartan and Killing classified the simple Lie groups, they found something unexpected. Besides the perfectly respectable infinite families, , and , there were five exceptions. In order of increasing dimension, these exceptional groups are prosaically called , , , and . While the first three kinds of groups are all symmetry groups of vector spaces equipped with some kind of bilinear form, the exceptions do not, at first glance, have a reason to exist. I like to imagine Cartan and Killing reacting like physicist I.I. Rabi to the discovery of the muon: “Who ordered that?” Since their discovery, finding simple models for the exceptional Lie groups has been an important program in mathematics. Here, I’ll tell you about one such model for , essentially due to Cartan: is almost the symmetry group of one ball rolling without slipping or twisting on another ball, provided the ratio of radii is 1:3 or 3:1.
When Jim Dolan and I started talking about this problem, we set out specifically to explain that funny ratio. We took our cue from Bor and Montgomery, who in their excellent paper G2 and the rolling distribution, write:
Open problem. Find a geometric or dynamical interpretation for the “3” of the 3:1 ratio.
Why only 1:3 or 3:1? You can see a hint of “threeness” in the Dynkin diagram for :
And that’s probably responsible for the 1:3 ratio, but by the end of the post, I’ll have shown you another explanation, far more geometric in nature. The key is going to be to relate the rolling ball picture to another description of , also due to Cartan: is the automorphism group of the ‘split octonions’, .
I’ll explain all of these ideas, so don’t worry if you’re unfamiliar with them. But first, an aside to those in the know: throughout this post, I’ll focus on the adjoint, split real form of , which is the only form for which all of these ideas make sense, though a lot of them continue to work for the complex form as well, as long as you replace the split octonions with their complexification. This is in keeping with a well-known theme in Lie theory: split real forms and complex forms behave in almost identical fashion.
The rolling ball
Now let’s get down to business, by working out a mathematical framework to describe a ball rolling on another ball, without slipping or twisting. To start off, I’ll let the fixed ball have radius , and the rolling ball have radius 1. Only later will we see what happens when we set .
First, a configuration of this system corresponds to a point in , as follows: for , is the point of contact where the rolling ball touches the fixed ball, and tells us the present orientation of the rolling ball, rotating it from your favorite starting orientation. To show you what I mean, I can’t do better than a picture from the paper of Bor and Montgomery (though note that what I call they call , and they let their rolling ball have arbitrary radius ):
That’s how we describe configurations of the rolling ball system. What does it mean for the ball to roll without slipping or twisting? To describe that, we will turn to ‘incidence geometry’. This kind of geometry sounds very bare bones: in its simplest incarnation, an incidence geometry has points, lines, and an incidence relation (a point lies on a line, or is incident on the line). Yet this minimal structure will suffice to capture what we mean by rolling without slipping or twisting.
The rolling ball system defines an incidence geometry where:
- Points are configurations of the rolling ball—elements of .
- Lines are given by rolling without slipping or twisting along great circles.
A line, then, is some one-dimensional submanifold of . Which one-dimensional submanifolds are lines? Well, that’s the point of the puzzle at the start of this post. If you didn’t think about it before, go back and give it a shot now!
Here’s the answer: each time we roll a ball of unit radius around a ball of radius , the rolling ball turns times! And that’s how we roll. For instance, if begin rolling from the North Pole and sweep out a central angle as we do so, we rotate our ball by an angle about the axis going into the picture:
Now that we have a incidence geometry, we can talk about its symmetries. These symmetries are the diffeomorphisms that preserve the lines:
Since the lines depend on , so does this group. Alas, this group is not , even when .
Remember, I said that was almost the symmetry group of the rolling ball, and we have run head first into that “almost”: as shown by Bor and Montgomery, does not act on . Instead, it acts on its double cover:
And, if we stretch our imaginations a bit, we can think of this as a variant of the rolling ball system, which we call the ‘rolling spinor’.
The rolling spinor…
What’s a rolling spinor? It’s like a ball, but thanks to the “double” in the double cover , a rotation does not act like the identity. Instead, we need to rotate by degrees to get back where we started.
Spinors show up physically and mathematically. Famously, electrons are spinors, but a far more concrete example is given by the belt trick: put a twist in a belt, and you cannot undo the twist just by translating the ends around, but for twist, you can!
I can make more sense of this using quaternions. If I identify with the group of unit quaternions:
and identify with the subspace of imaginary quaternions:
than it’s easy for me to describe the covering map, . Each unit quaternion goes to the rotation of given by conjugation:
The kernel of is . If I think of as giving a rotation by 0°, and as giving a rotation by 360°, then the idea that a 720° rotation is the identity becomes , which is indeed true. If this idea sounds funny, or you want an overview of quaternions, read this talk I gave at Fullerton College.
The rolling spinor system comes with an incidence geometry where:
- Points are elements of .
- Lines are preimages of lines in .
When , acts on as symmetries preserving lines. To understand this action, it helps to introduce yet another variant of the rolling ball system: the rolling spinor on a projective plane.
…on a projective plane
This system is double-covered by the rolling spinor:
where the map identifies antipodal points of . We write for the equivalence class containing and . There is an incidence geometry where:
- Points are elements of .
- Lines are images of lines in .
Unlike the original rolling ball system, when , acts as symmetries of this incidence geometry. To see why, it’s time to introduce the split octonions.
The split octonions are an eight dimensional real composition algebra. There are two such algebras over the reals, the other being their more famous cousin, the octonions. We can define the split octonions as pairs of quaternions:
with the following multiplication:
Here, the barred quaternions and denote the conjugate, in , of and , obtained by flipping the sign of the imaginary part, just as we would do for a complex number. With this multiplication, one can check that the quadratic form:
makes into a composition algebra, with the property:
Now we can define . And, as promised, this description of can be related to the rolling ball descriptions I have been sketching. But first, it helps to note that we don’t really need all of : since fixes , we really only need the subspace of imaginary split octonions. This is the subspace orthogonal to 1 with respect to . In terms of pairs of quaternions, this is the subspace where the first quaternion is purely imaginary while the second can be arbitrary:acts on the imaginary split octonions as well, and in fact it is the smallest irreducible representation of . We’re close now. Because preserves , acts on the space of 1d null subspaces of :
This space of null lines is almost the configuration space of the spinor rolling on the projective plane, . Choosing a representative of each line in , we can normalize its first component to have length 1, forcing the second component to also have unit length. Doing this, we see that:
This ought to remind you of the spinor rolling on a projective plane, which has configuration space:
In the first case, we mod out both components by their sign, and in the second, we just mod out the first. And in fact, these spaces are diffeomorphic:
Note that this is not true for the ordinary rolling ball. We needed to think about a spinor rolling on a projective plane.
Under the diffeomorphism above, lines in incidence geometry become submanifolds of . If and only if , these submanifolds ‘straighten out’: they are given by projectivizing 2-dimensional null subspaces of the imaginary split octonions .
However, not all 2d null subspaces of yield lines in . When , the incidence geometry of coincides with the incidence geometry on with:
- Points are 1d null subspaces of .
- Lines are 2d null subspaces of on which the product also vanishes.
In fact, if we call a subspace of a null subalgebra if the product of any two elements vanishes, we can give a very snappy description of the incidence geometry on , thanks to the helpful fact that a 1d subspace of is null if and only if :
- Points are 1d null subalgebras of .
- Lines are 2d null subalgebras of .
Now we have what we wanted: the geometry of a spinor rolling on a projective plane, when , is the geometry of null subalgebras of the imaginary split octonions, . Hence, acts as symmetries of the spinor rolling on the projective plane!
We’ve seen how a spinor rolling on a projective plane three times as big acquires symmetry, but we still haven’t seen a completely satisfying explanation of that 1:3 ratio. It’s hiding somewhere in the proofs of the theorems I described above.
Yet there is an intuitive way to see the 1:3 ratio. Let’s think of our spinor as a ball that comes back to itself after turning twice, and let’s think of our projective plane as a ball with antipodal points identified. Rolling from a point on our fixed ball to its antipode, we’ve come back to where we’ve started in our projective plane. For what ratio of radii will the rolling spinor also be back where it started? That is, for what ratio of radii does the rolling ball turn an even number of times as it goes from a point to its antipode? At minimum, we need to turn twice. So:
- For what ratio of radii does the rolling ball turn twice as it goes from a point to its antipode? Or, put another way, for what ratio of radii does the rolling ball turn four times as it rolls once around the fixed ball?
By the solution to our puzzle, a ball of unit radius turns times as it rolls once around a fixed ball of radius . To turn four times, we need .
In another instalment of my occasional series on ‘Things you didn’t realise were enriched categories (unless you’re an expert!)’ I want to talk about torsors and how they can be considered enriched categories where the enriching category is a group, considered as a discrete monoidal category.
I’ll start off by telling you what a torsor is and then explain how it can be thought of as having something like a ‘group-valued distance’ and how this relates to enriched category theory.
What is a torsor?
“Torsor” is a word that used to strike a dread fear into my heart. It is a word that was seemingly only used by high-brow mathematicians who wished to intimidate more lowly souls. That was until I read Dan Freed’s paper Higher Algebraic Structures and Quantization which made me realise that torsor is just a fancy name for a simple concept. For a group a torsor is something that looks like but isn’t actually a group. That might sound like an overly cryptic description but hopefully it will make things clear in a minute.
On my desk I have a coffee cup — we are in a café, after all.
The rim of my coffee cup is a circle: it looks like the group of unit complex numbers , but I don’t know how to multiply two points on the coffee cup rim, , so the coffee-cup rim is not canonically a group. However, the group of unit complex number acts on the rim : if you give me a unit complex number , I can rotate my coffee cup through the angle .
This action is transitive in that I can move any point on the rim to any other point by a rotation and the action is free in that precisely one rotation will move a specified point to another specified point. A set with a action which is free and transitive is known as a principal homogeneous -set, or simply a -torsor. So the rim of my coffee cup is a -torsor.
If I pick any point on the rim then I get an bijection between the points of and the points in the group by identifying the point with the identity in . So there are as many ways to identify the rim with the circle group as there are points in the rim but none of them is canonical.
Just as an affine space can be thought of as a vector space without an origin, so a torsor can be thought of as a group without an identity: a torsor is to a group as an affine space is to a vector space.
John has written a nice piece explaining how torsors, particularly in physics, are more prevalent than you might first think, because our gut reaction is to think in terms of groups. We see a line, so we want to put an origin on it and identify it with rather than accept that it doesn’t naturally have an origin, and should be thought of as an -torsor.
Another view of torsors
Torsors can be thought of in a different fashion and thinking in this fashion this will lead us enriched categories. Between each pair of points in a torsor there is something like a ‘group-valued distance’. More precisely, associated to each ordered pair of elements of a -torsor there is a unique element of the group which sends to , we can write this as , or as if we want to emphasise the torsor . Symbolically, this is the unique element of the group that satisfies . This is the ‘distance’ from to .
[I have chosen to write rather than the more usual category theoretic convention , because the actions are on the left and the chosen conventions make the formulas look a bit nicer.]
Of course, in the coffee cup example, the ‘distance’ from one point to another is the rotation required to move the first point to the second.
Because we have a -action on we know that and this implies for all .; for the same reason we also have . Moreover, we can recover the -action from this data because given is the unique element of such that .
In summary, we can think of a -torsor as a set such that
for each pair there is a group element ;
for each triple there is an equality ;
for each element there is the equality ;
for each , there is a unique such that .
You might have noticed that the first three of the above conditions form precisely the definition of a certain kind of enriched category, namely a category enriched over the monoidal category which is the group considered as a discrete monoidal category. This means that has the elements of as its objects, has only identity morphisms and has the group multiplication as the monoidal multiplication: .
We have shown that a -torsor is a -category satisfying an extra condition. A reasonable question to ask at this point is “What’s the good in that?” One answer is that it gives us another layer of intuition; it allows us to make analogies with categories, with metric spaces, with posets, amongst other things, and it allows us to use the tools from enriched category theory.
We should now ask what a -functor between -torsors is. This is a function such that
This is quite a rigid condition, saying that the right notion of map is some kind of isometry. Unsurprisingly, if and correspond to -torsors then this is precisely the condition that is an equivariant map:
New torsors from old
When we look at torsors over abelian groups there are ways of combining torsors. We will see below how these relate to standard enriched category theory constructions.
Hom torsor: If is an abelian group and both and are -torsors then we can form the set of equivariant maps from to . I learnt many years ago from Freed’s paper that this set of maps is itself an -torsor. We can define the action of on an equivariant map by . You can check that is an equivariant map, but you will see that it is necessary that is abelian for this to work.
Tensor torsor: If is an abelian group and both and are -torsors then we can also define the tensor product as where is the relation defined by
We find that is also an -torsor if we define the action by . Again, if you check, you will see that you need to be abelian for this to be well defined.
Things get more interesting with enriched categories when we enrich over categories which are closed, braided and bicomplete. Let’s consider each of these conditions for .
Closedness: A monoidal category is left-closed if for each object the functor has a right adjoint . Unwrapping this definition for we find that for each we need a function of sets such that for every we have
Clearly, we can take . So for any group the monoidal category is left-closed.
Similarly, a monoidal category is right-closed if for each object the functor has a right adjoint . For , we can take . So for any group the monoidal category is both left- and right-closed.
Braidedness: For a monoidal category to be braided we need isomorphisms for all objects and . For the category that would mean we need for all , in other words, we need that is abelian. In that case we have that is in fact symmetric.
Bicompleteness: A category is bicomplete if it has all limits and colimits. We have no hope of any non-trivial group having a bicomplete category . This is because this category is discrete: to form a product we would need projections to and but, because the only morphisms are identities, that would mean and . You can check that the self products are actually defined though: . This looks like it might spoil our fun, but it will transpire that the only limits that we need will be the self-products.
Summary: In summary then, if is abelian then is a closed symmetric monoidal category; if is non-abelian then is a closed monoidal category which is not braided.
As the definition of the tensor product of -categories requires that is braided, we will restrict ourselves to the case that is an abelian group, and we will emphasize this fact by renaming it .
Functor category and tensor product
We can now look at how the hom torsor and tensor torsor from above arise in enriched category theory. We will work with an abelian group so that is closed symmetric monoidal.
Functor categories and hom torsors: Suppose is a closed symmetric monoidal category and that and are -categories then (provided that has sufficiently many limits) there is -category of -morphisms where the objects are the -functors and the hom object is given by the equalizer of the following diagram.
We don’t need to worry here about what the two maps are as, in the case of interest when , they are both going to be equalities.
If and are -torsors, thought of as -categories, then we try to construct the functor category . This has -functors, i.e. equivariant maps as morphisms. The hom object , between equivariant maps is given by the equalizer of the above diagram, but as mentioned, the maps are equalities, so the equalizer is just the left-hand-side term, so
As mentioned in the previous section, does not have many limits, but fortunately it does have this product as all the terms in the product are the same: a calculation shows that
Hence we find that the “distance” between two functors is the distance between the two images of any point:
This means that the functor -category exists and is actually a torsor. You can easily check that the -action is exactly that of the hom torsor, so
Tensor product categories and tensor torsors: If is a braided monoidal category then you can define the tensor product of two -categories and . An object of is a pair
and the hom objects are defined by
The braiding of is needed to define the composition morphisms.
If and are -torsors, thought of as -categories, then we can construct the tensor product -category . You might expect that this is actually a torsor equal to the tensor torsor but that’s not quite true!
In the torsor we have made the identification however in the -category these two points are distinct despite having `no distance’ between them: you can check that .
This means that the -category is not actually a torsor but it is equivalent to the torsor : the quotient map on objects gives an equivalence of -categories . You can think of as a fattened-up version of ; it is like using a stack instead of a quotient space. This is what you might expect from a categorical approach, in that you don’t violently quotient out but rather encode the equivalence relation via isomorphisms.
I’ll just finish by mentioning the Yoneda map. For any -category , the presheaf -category is a torsor.
If is actually a torsor then the Yoneda map is an isomorphism, which is not something you would expect from the Yoneda map!
If is not a torsor then the Yoneda map is a torsorization of . For instance, in the tensor example above we have .
I’m happy to announce that Emily Riehl has joined us as a host at the Café. Emily is based at Harvard, and you can deduce some of her interests from the many guest posts she has written for us before:
Guest post by Bruce Bartlett
On Monday, David Corfield and Kobi Kremnitzer gave philosophy talks in a snazzy new building at Oxford:
- Kobi Kremnitzer, What is geometry?, 2-4pm.
- David Corfield, What might philosophy make of homotopy type theory?, 4.30-6.30pm.
The talks shared homotopy type theory as a common theme. The name “Per Martin-Löf” was mentioned a lot, which was good for me since I had always thought Martin and Löf were two separate people:
Notes are available above, but I will try to give some brief impressions.
Kobi Kremnitzer, What is geometry?
He started by answering the question “Why am I giving this talk?”, and explained that he followed the pragmatic approach to philosophy of mathematics. I think then he said his approach was somehow similar to that of Wittgenstein and Carnap (but he could have been saying the exact opposite :-)), and that for him, there is no Metaphysics in the joint carving. I’m afraid this totally went over my head, but I did imagine some Oxford dons pleasantly carving a roast chicken, which started to make me hungry!
He stressed that for him mathematics is not in the business of theorem proving only, but mathematicians create new systems, new languages, and that in the correct language a problem becomes trivial… the approach of Grothendieck.
2. Categorical language Since it was a philosophy talk, he motivated categories by starting with Kripke semantics, going via posets, and then he went from posets to categories by literally “raising the bar”!
3. Crashcourse in algebraic geometry Algebraic geometry is the study of solutions to polynomial equations, like a parabola:
I haven’t specified what “y” and “x” actually are, and that’s the point. We can interpret them in any ring. Hence the Grothendieck view is to think of an affine variety defined by a bunch of polynomial equations as being a presheaf on the (opposite) category of rings,
This leads us to define the category of algebraic spaces as being nothing but the category of presheaves on .
This category of algebraic spaces has lots of nice properties. Inside it live subcategories of objects having nice properties, such as schemes and sheaves. But Kobi stressed that it is very handy to understand them as living in this general universe of algebraic spaces.
4. What is algebra? There was a crucial idea lurking above – that a ring is a gismo which allows you to take any polynomial and evaluate it on elements of .
So – to go from algebraic to differential geometry, we could replace the concept of a “ring” with the concept of a “-ring” – this is a gismo which allows you to take any smooth function and evaluate it on elements of !
For instance, the space of smooth functions on a manifold is a -ring… but so is ! So by this slight change of view, we have accomplished Leibniz’s dream – calculus and infinitesimals in the same universe.
5. Final comments He spoke a bit about: derived geometry, set-theoretic foundations, noncommutative geometry, synthetic differential geometry, elementary theory of the category of sets – have a look at the last few pages of his notes.
He stressed that ordinary set-theoretic foundations pulverizes spaces into “atomic dust” where the elements have no “cohesion” with each other… we have to put this back in by hand using topology. As a foundation, homotopy type theory will have this cohesiveness natively built in, and that is attractive to a geometer.
David Corfield, What might philosophy make of homotopy type theory?
David began by defining three “camps” in the philosophy of mathematics:
- Antiformalism (eg. Heidegger, Wittgenstein)
- Proformalism (eg. Russell, Carnap, Quine)
- Historical / dialectical philosophy (eg. Cassirer, Collingwood, Lautman, Polanyi, Lakatos, Shapere, MacIntyre, Friedman)
This last camp has the longest list of names, so you guessed it, David placed himself there! He quoted from Domski and Dixon, which I couldn’t understand, but I could understand this quote from his book:
Straight away, from simple inductive considerations, it should strike us as implausible that mathematicians dealing with number, function and space have produced nothing of philosophical significance in the past seventy years in view of their record over the previous three centuries. Implausible, that is, unless by some extraordinary event in the history of philosophy a way had been found to filter, so to speak, the findings of mathematicians working in core areas, so that even the transformations brought about by the development of category theory, which surfaced explicitly in 1940s algebraic topology, or the rise of non-commutative geometry over the past seventy years, are not deemed to merit philosophical attention.
He had rather sobering news for philosophers keen on getting involved with homotopy type theory:
If one is not on-board already, that’s leaving it rather late to assist with Homotopy Type Theory.
He mentioned the n-category exposition on homotopy type theory by Mike Shulman (starting here) quite a lot, and also Urs Schreiber’s explanation of how homotopy type theory natively produces the “right answer” in a differential topology problem in string theory (flux quantization condition).
He also spoke about polarity, inference, modal logic and lots of other stuff - but my understanding was severely cramped.
Finally he ended on a positive note, there is plenty of places for philosophers to get involved:
For philosophers of the historical / dialectical persuasion, homotopy type theory is just the kind of development that needs to be written up. There is plenty to learn from the development – the role of logician-philosopher Martin-Löf, constructive type theory, category theory, homotopic mathematics, influence of physics, computer science.
For proformalist philosophers, there are also plenty of opportunities. Look closely at intensionality, modality, polarity, the dependent type theory-language relation, and philosophy of physics.
Guest post by Emily Riehl
Whether we grow up to become category theorists or applied mathematicians, one thing that I suspect unites us all is that we were once enchanted by prime numbers. It comes as no surprise then that a seminar given yesterday afternoon at Harvard by Yitang Zhang of the University of New Hampshire reporting on his new paper “Bounded gaps between primes” attracted a diverse audience. I don’t believe the paper is publicly available yet, but word on the street is that the referees at the Annals say it all checks out.
What follows is a summary of his presentation. Any errors should be ascribed to the ignorance of the transcriber (a category theorist, not an analytic number theorist) rather than to the author or his talk, which was lovely.
Let us write for the primes in increasing cardinal order. We know of course that this list is countably infinite. A prime gap is an integer . The Prime Number Theorem tells us that is approximately as approaches infinity.
The twin primes conjecture, on the other hand asserts that
i.e., that there are infinitely many pairs of twin primes for which the prime gap is just two. A generalization, attributed to Alphonse de Polignac, states that for any positive even integer, there are infinitely many prime gaps of that size. This conjecture has been neither proven nor disproven in any case. These conjectures are related to the Hardy-Littlewood conjecture about the distribution of prime constellations.
The basic question is whether there exists some constant so that infinitely often. Now, for the first time, we know that the answer is yes…when .
Here is the basic proof strategy, supposedly familiar in analytic number theory. A subset of distinct natural numbers is admissible if for all primes the number of distinct residue classes modulo occupied by these numbers is less than . (For instance, taking , we see that the gaps between the must all be even.) If this condition were not satisfied, then it would not be possible for each element in a collection to be prime. Conversely, the Hardy-Littlewood conjecture contains the statement that for every admissible , there are infinitely many so that every element of the set is prime.
Let denote the function that is when is prime and 0 otherwise. Fixing a large integer , let us write to mean ≤ . Suppose we have a positive real valued function —to be specified later—and consider two sums:
Then if for some function it follows that for some (for any sufficiently large) which means that at least two terms in this sum are non-zero, i.e., that there are two indices and so that and are both prime. In this way we can identify bounded prime gaps.
The trick is to find an appropriate function . Previous work of Daniel Goldston, János Pintz, and Cem Yildirim suggests define where
where and is a power of .
Now think of the sum as a main term plus an error term. Taking with , the main term is negative, which won’t do. When the main term is okay but the question remains how to bound the error term.
Zhang’s idea is related to work of Enrico Bombieri, John Friedlander, and Henryk Iwaniec. Let where (which is “small but bigger than ”). Then define using the same formula as before but with an additional condition on the index , namely that divides the product of the primes less that . In other words, we only sum over square-free with small prime factors.
The point is that when is not too small (say ) then has lots of factors. If and there is some so that and . This gives a factorization with which we can use to break the sum over into two sums (over and over ) which are then handled using techniques whose names I didn’t recognize.
On the size of the bound
You might be wondering where the number 70 million comes from. This is related to the in the admissible set. (My notes say but maybe it should be .) The point is that needs to be large enough so that the change brought about by the extra condition that is square free with small prime factors is negligible. But Zhang believes that his techniques have not yet been optimized and that smaller bounds will soon be possible.
There’s going to be a “thematic trimester” in Paris starting next spring:
- Semantics of proofs and certified mathematics, Institut Henri Poincaré, April 22nd - July 11th, 2014, organized by Pierre-Louis Curien, Hugo Herbelin, Paul-André Melliès.
If you like applications of category theory to logic and computer science, there should be a lot for you here!
The basic layout is this:
- Week 1 — Kick-off: Formalisation in mathematics and in computer science
- Week 3 — Workshop 1: Formalization of mathematics in proof assistants, organized by Georges Gonthier and Vladimir Voevodsky.
- Week 6 — Workshop 2: Constructive mathematics and models of type theory, organized by Thierry Coquand and Thomas Streicher.
- Week 8 — Workshop 3: Semantics of proofs and programs, organized by Thomas Ehrhard and Alex Simpson.
- Week 10 — Workshop 4: Abstraction and verification in semantics, organized by Paul-André Melliès and Luke Ong.
- Week 12 — Workshop 5, Certification of high-level and low-level programs organized by Christine Paulin and Zhong Shao.
A lot of people I know will attend parts of this, such as Jean Benabou, Marcelo Fiore, Dan Ghica, André Joyal, Samuel Mimram, and Bas Spitters. And that makes me happy, because Paul-André Melliès has invited me to spend up to a month attending this series of workshops, perhaps in two 2-week stretches. With a little luck I’ll be able to actually do this.
(My wife Lisa Raphals has gotten invited to Erlangen for the spring of 2014, meaning roughly April 1 - June 1. If she and I succeed in getting leaves of absence, I’ll go with her, and then take some trips to nearby places. Since I split my time between the Wild West and the Far East, Paris seems nearby to Erlangen to me. I also have vague invitations to IHES, Prague and Berlin which I might try to take advantage of. And if you have a luxurious villa in northern Italy or the French Riviera, let me know.)
Suppose is a topological space and is an open subset, with closed complement . Then and are, of course, topological spaces in their own right, and we have as a set. What additional information beyond the topologies of and is necessary to enable us to recover the topology of on their disjoint union?
Recall that the subspace topologies of and say that for each open , the intersections and are open in and , respectively. Thus, if a subset of is to be open, it must yield open subsets of and when intersected with them. However, this condition is not in general sufficient for a subset of to be open — it does define a topology on , but it’s the coproduct topology, which may not be the original one.
One way we could start is by asking what sort of structure relating and we can deduce from the fact that both are embedded in . For instance, suppose is open. Then there is some open such that . But we could also consider , and ask whether this defines something interesting as a function of .
Of course, it’s not clear that is a function of at all, since it depends on our choice of such that . Is there a canonical choice of such ? Well, yes, there’s one obvious canonical choice: since is open in , is also open as a subset of , and we have . However, , so choosing wouldn’t be very interesting.
The choice is the smallest possible such that . But there’s also a largest such , namely the union of all such . This set is open in , of course, since open sets are closed under arbitrary unions, and since intersections distribute over arbitrary unions, its intersection with is still .
Let’s call this set . In fact, it’s part of a triple of adjoint functors between the posets and of open sets in and , where is defined by , and is defined by . Here denotes the continuous inclusion .
Now we can consider the intersection , which I’ll also denote , where is the inclusion. It turns out that this is interesting! Consider the following example, which is easy to visualize:
- , the open left half-plane.
- , the closed right half-plane.
If an open subset “doesn’t approach the boundary” between and , such as the open disc of radius centered at , then it’s fairly easy to see that , and therefore is the open right half-plane.
On the other hand, consider some open subset which does approach the boundary, such as
the intersection with of the open disc of radius centered at . A little thought should convince you that in this case, is the union of the open right half-plane with the whole open disc of radius centered at . Therefore, is the open right half-plane together with the strip .
This example suggests that in general, measures how much of the “boundary” between and is “adjacent” to . I leave it to some enterprising reader to try to make that precise. Here’s another nice exercise: what can you say about for an open subset ?
Let us however go back to our original question of recovering the topology of . Suppose and are open such that is open in ; how does this latter fact manifest as a property of and ? Note first that . Thus, since is the largest such that , we have , and therefore . Let me say that again:
This is a relationship between and which is expressed purely in terms of the topological spaces and and the function , which we have just shown is necessary for to be open in .
In fact, it is also sufficient! For suppose this to be true. Since is open in , there is some open such that . Given such a , the union also has this property, since . Note that in fact , and also , the largest open subset of whose intersection with is . (Since , unlike , is not open, there may not be a smallest such, but there is always a largest such.) Now I claim we have
To show this, it suffices to show that the two sides become equal after intersecting with and with . For the first, we have
and for the second we have
using the assumption at the step .
In conclusion, the topology of is entirely determined by
- the induced topology of an open subspace ,
- the induced topology on its closed complement , and
- the induced function .
Specifically, the open subsets of are those of the form — or equivalently, by the above argument, — where is open in , is open in , and .
An obvious question to ask now is, suppose given two arbitrary topological spaces and and a function ; what conditions on ensure that we can define a topology on in this way, which restricts to the given topologies on and and induces as ? We may start by asking what properties has. Well, it preserves inclusion of open sets (i.e. ) and also finite intersections (), including the empty intersection (). In other words, it is a finite-limit-preserving functor between posets. Perhaps surprisingly, it turns out that this is also sufficient: any finite-limit-preserving allows us to glue and in this way; I’ll leave that as an exercise too.
Okay, that was some fun point-set topology. Now let’s categorify it. Open subsets of are the same as 0-sheaves on it, i.e. sheaves of truth values, or of subsingleton sets, and the poset is the (0,1)-topos of 0-sheaves on . So a certain sort of person immediately asks, what about -sheaves for ?
In other words, suppose we have , , and as above; what additional data on the toposes and of sheaves (of sets, or groupoids, or homotopy types, etc.) allows us to recover the topos ? As in the posetal case, we have adjunctions and relating these toposes, and we may consider the composite .
The corresponding theorem is then that is equivalent to the comma category of over , i.e. the category of triples where , , and . This is true for 1-sheaves, -sheaves, -sheaves, etc. Moreover, the condition on a functor ensuring that its comma category is a topos is again precisely that it preserves finite limits. Finally, this all works for arbitrary toposes, not just sheaves on topological spaces. I mentioned in my last post some applications of gluing for non-sheaf toposes (namely, syntactic categories).
One new-looking thing does happen at dimension 1, though, relating to what exactly the equivalence
looks like. The left-to-right direction is easy: we send to where is applied to the unit of the adjunction . But in the other direction, suppose given ; how can we reconstruct an object of ?
In the case of open subsets, we obtained the corresponding object (an open subset of ) as , but now we no longer have an ambient “set of points” in which to take such a union. However, we also had the equivalent characterization of the open subset of as , and in the categorified case we do have objects and of . We might initially try their cartesian product, but this is obviously wrong because it doesn’t incorporate the additional datum . It turns out that the right generalization is actually the pullback of and the unit of the adjunction at :
In particular, any object can be recovered from and by this pullback:
Now let’s shift perspective a bit, and ask what all this looks like in the internal language of the topos . Inside , the subtoposes and are visible through the left-exact idempotent monads and , whose corresponding reflective subcategories are equivalent to and respectively. In the internal type theory of , and are modalities, which I will denote and respectively. Thus, inside we can talk about “sheaves on ” and “sheaves on ” by talking about -modal and -modal types (or sets).
Moreover, these particular modalities are actually definable in the internal language of . Open subsets can be identified with subterminal objects of , a.k.a. h-propositions or “truth values” in the internal logic. Thus, is such a proposition. Now is definable in terms of by
I’m using type-theorists’ notation here, so is the exponential in . The other modality is also definable internally, though a bit less simply: it’s the following pushout:
In homotopy-theoretic language, is the join of and , written . And if we identify and with their images under and , then the functor is just the modality applied to -modal types.
Finally, the fact that is the gluing of with means internally that any type can be recovered from , , and the induced map as a pullback:
Now recall that internally, is a proposition: something which might be true or false. Logically, has a clear meaning: its elements are ways to construct an element of under the assumption that is true.
The logical meaning of is somewhat murkier, but there is one case in which it is crystal clear. Suppose is decidable, i.e. that it is true internally that “ or not ”. If the law of excluded middle holds, then all propositions are decidable — but of course, internally to a topos, the LEM may fail to hold in general. If is decidable, then we have , where is its internal complement. It’s a nice exercise to show that under this assumption we have .
In other words, if is decidable, then the elements of are ways to construct an element of under the assumption that is false. In the decidable case, we also have , so that — and this is just the usual way to construct an element of by case analysis, doing one thing if is true and another if it is false.
This suggests that we might regard internal gluing as a “generalized sort of case analysis” which applies even to non-decidable propositions. Instead of ordinary case analysis, where we have to do two things:
- assuming , construct an element of ; and
- assuming not , construct an element of
in the non-decidable case we have to do three things:
- assuming , construct an element of ;
- construct an element of the join ; and
- check that the two constructions agree in .
I have no idea whether this sort of generalized case analysis is useful for anything. I kind of suspect it isn’t, since otherwise people would have discovered it, and be using it, and I would have heard about it. But you never know, maybe it has some application. In any case, I find it a neat way to think about gluing.
Let me end with a tantalizing remark (at least, tantalizing to me). People who calculate things in algebraic topology like to work by “localizing” or “completing” their topological spaces at primes, since it makes lots of things simpler. Then they have to try to put this “prime-by-prime” information back together into information about the original space. One important class of tools for this “putting back together” is called fracture theorems. A simple fracture theorem says that if is a -local space (meaning that all primes other than are inverted) and some technical conditions hold, then there is a pullback square:
where denotes -completion and denotes “rationalization” (inverting all primes). A similar theorem applies to any space (with technical conditions), yielding a pullback square
where denotes localization at .
Clearly, there is a formal resemblance to the pullback square involved in the gluing theorem. At this point I feel like I should be saying something about . Unfortunately, I don’t know what to say! Maybe some passing expert will enlighten us.
This weekend I’m giving a talk on “The Foundations of Applied Mathematics”. It’s mostly about how the widespread use of diagrams in engineering, biology, and the like may force us to think about math in new ways, at least if we take those diagrams seriously.
I wouldn’t typically emphasize the term ‘foundations of mathematics’ when talking about this. But I’m speaking at a Category-Theoretic Foundations of Mathematics Workshop, so I want to fit in.
Unfortunately, this means I need to think a bit more generally about how applications of mathematics can impinge on foundational issues. I need to do this just so I’m not taken by surprise when people start objecting or asking tough questions.
So, I’d like to hear what you have to say. Preferably before Sunday morning!
Here’s a bit of what I’m saying. Again, this isn’t the important part of my talk, just the introduction… but I’m afraid I’m biting off more than I can chew.
We often picture the flow of information about mathematics a bit like this:
Of course this picture is oversimplified in many ways! For example:
- The details depend enormously on time and place. Let’s focus on post-WWII mathematics just to keep the topic from growing too large.
- There are many branches of science and engineering, and a very complex flow of information among these.
- In academia, only some applications of mathematics are now classified as “applied mathematics”.
- Some branches of physics now communicate more directly to “pure mathematics” than “applied mathematics”.
- Computer science also plays a distinctive role, often communicating directly to “foundations of mathematics”.
But the picture is close enough to true that deviations are interesting.
In particular: can developments in applied mathematics force changes in the foundations of mathematics?
Applied mathematics often provokes new developments in pure mathematics. But can it demand new foundations?
Can applications of math directly affect the foundations of math, or is the conversation always mediated by pure mathematicians?
Computer science is the biggest example of “applied math” that grew directly out of work in logic, where new ideas directly impact foundations:
- Uncomputability, undecidability,…
- Computer-aided proofs: what is a proof?
- Category-theoretic logic
But I want to talk about some other applications of mathematics that seem to call for category-theoretic foundations….
A new portrait gallery opened this week in the James Clark Maxwell Building which houses the Edinburgh University maths department. It consists of seventy portraits of mathematicians selected, as the name suggests, by Michael and Lily Atiyah.
Due to the magic of the internet, you do not have to travel to Edinburgh to see the exhibition as Andrew Ranicki, who helped organise the gallery, has made it available on his website.
The commentaries on each of the photographs give interesting personal insights of the Atiyahs. I’d definitely recommend that you younger mathematicians out there give them a read.
If Martin-Löf dependent type theory is a formal system which projects down via truncation to (typed) first-order logic, might we not expect there to be a modal type theory which would project down to first-order modal logic? I’ll use this post to play around with the idea.
Normally we take a modal logic to be a system which allows us to study certain ways of qualifying propositions.
It is necessarily the case that P; It is obligatory that P ; It is known that P, etc.
With first-order modal logic, these propositions may have free variables which we can then bind by a quantifier. This allows us to express controversial metaphysical theses such as:
‘Everything is necessarily self-identical’ or ‘For every thing, it is necessarily the case that that thing is identical to itself’ ().
It still seems that only propositions are modalised. But given all we’ve been hearing at the Café from Mike on type theory, we know that propositions can be considered as just a certain kind of type. Then, if we can modalise proposition-as-types, why not modalise all types? This is what happens in modal type theory.
Despite the enormous amount of attention given over to modal logic by philosophers, I’m unaware of any mention of the idea that modal operators could apply to anything other than propositions. Instead, the majority of the work in modal type theory occurs in computer science. To pick out one interpretation of ‘necessity’ from de Paiva, Goré and Mendler’s useful summary Modalities in constructive logics and type theories
…the constructive singles out those terms of type that are “closed”, i.e., which are completely built from constructors over which primitive recursion is well-defined.
On the other hand, we’ve seen Urs and Mike working out higher modalities for physics, in the case of cohesive structures.
But in view of the fact that dependent type theory seems to fit well with natural language, is it not possible that we already have a trace of modal types in our everyday conversations? Let’s think. A director might say while casting
He’s a possible Hamlet.
Now, we might try to rewrite every instance of
X is a possible Y
It is possibly the case that X is a Y,
but this may do some violence to our inclination to attach the modality directly to a noun phrase. When cooking, some things are necessary ingredients for a dish, others optional. Beef is a necessary ingredient of a beef stew, while ale is only a possible ingredient.
Looking now at the epistemic modality, in my current state of knowledge, I may have France and Germany as known EU members, and China as a known non-EU member. I may also have Norway as a possible EU-member. Indeed, I can tag on ‘known (to me)’ to most noun phrases, and the dual ‘possible’ = ‘not known to be not’.
Thinking of dependent types, I may not know for sure someone’s nationality.
Karen is a possible citizen of Norway, hence, since I think Norway a possible EU country, as far as I know she is a possible EU citizen.
We would then need to see how modalities and dependent types interact. It seems we have
Given that dependent sum is what projects down to existential quantification, this might put us in mind of the so-called ‘Barcan formula’. One of the originators of first-order modal logic was the philosopher Ruth Barcan, who gave her name to a postulated entailment and its converse. These postulated entailments run between and .
I like that idea that intuitions concerning one level of logic or mathematics often arise from something deeper behind the scenes. So the intuitionists had picked up on something when they saw that was acceptable but not the converse. Behind the scenes, the former is natural, arising from currying the evaluation of an exponential (, where is falsity here).
What then of the Barcan and converse Barcan formulas? Well we had some discussion over at the nForum, and came to see that the converse Barcan was a projection of something natural in the presence of a certain pullback. So
Let’s illustrate this using the dependent type of players of teams:
where that top right entry can be formulated differently
possible players of possible teams = possible (players of teams).
Converse Barcan is a projected shadow of the inclusion of possible players of actual teams within possible (players of teams), i.e., possible players of possible teams. Only when all possible teams are actual teams do we have Barcan, that is, that every possibly player is a possible player of an actual team.
In view of the large philosophical literature on topics such as possible objects, there are opportunities for modal dependent type theory and philosophy to interact. One thing to work out would be how to think about modalities applied to the universe of types itself. Semantically, are we to imagine a collection of possible worlds over which the given types vary? We may want the base here to be a full -groupoid.
Last summer in Barcelona, Joachim Kock floated the idea that there might be a connection between two invariants of graphs: the Tutte polynomial and the magnitude function. Here I’ll explain what these two invariants are, I’ll give you some examples, and I’ll ask for your help in understanding what the magnitude function tells us.
The Tutte polynomial of a graph is old and well-understood. It turns a graph into a polynomial in two variables, with natural number coefficients:
The magnitude function of a graph is new and ill-understood. It turns a graph into a rational function in one variable, with rational coefficients:
It can also be expanded as a power series with integer coefficients:
I can’t figure out what the relationship between the two is — if any — or what information is contained in the magnitude function. It’s exasperating! I’ll be very pleased if someone can shed light on the situation.
In this post, “graphs” are always going to be finite and undirected. But some of the time I’ll allow loops and multiple edges, and some of the time I won’t. I’ll say which I’m doing.
(Formally, a graph with loops and multiple edges consists of a finite set , a finite set , and a map . Here is the quotient of by the obvious action of the group of order ; it can be seen as the set of subsets of of cardinality 1 or 2. A graph without loops or multiple edges consists of a finite set and a subset that, seen as a relation on , is symmetric and irreflexive.)
The Tutte polynomial
The Tutte polynomial knows a lot about a graph. It knows how many edges the graph has, how many maximal forests it contains, and how many ways there are to colour each of its vertices from a palette of 38 colours without ever giving the same colour to adjacent vertices. It’s related to the Potts model in statistical physics and the Jones polynomial of knot theory.
I’ll give you the definition in a moment, but first we need some terminology.
For this part, graphs will be allowed to have loops and multiple edges. Given an edge of a graph , there are two fundamental operations we can perform:
Deletion. The graph is simply with the edge removed (but the vertices left untouched).
Contraction. The graph is formed from by first removing , then identifying the two vertices that it used to join.
An edge is called a bridge if has more connected-components than , and a loop if it joins a vertex to itself.
The Tutte polynomial of a graph is characterized by the following two conditions:
if is neither a bridge nor a loop;
if consists of bridges, loops, and no other edges.
It takes some work to prove that there’s any polynomial with these two properties, since in the first condition there are usually many different edges we could choose. But this form of the definition has the advantage that it provides a nice algorithm for calculating Tutte polynomials. For instance, here’s a picture from Wikipedia of the algorithm at work:
The conclusion is that if is the five-vertex graph at the top of the tree then
The magnitude function
I’ll say twice what the magnitude function of a graph is: once for people who know what the magnitude of a metric space is, and once for people who don’t. In this part, graphs are not allowed to have loops or multiple edges — or more exactly, they could do, but adjoining or deleting loops or duplicate edges doesn’t make any difference to the magnitude function.
First, here’s the explanation for if you know about magnitude of metric spaces. Every finite metric space has a magnitude function , where is scaled up by a factor of and the bars indicate magnitude. (This “function” may have a finite number of singularities.) We can turn a graph into a metric space by taking the points to be the vertices and the distance between two vertices to be the number of edges in a shortest path joining them. The magnitude function of a graph is the magnitude function of the resulting metric space. It will be convenient to make the substitution and write .
Now here’s a direct explanation.
List the vertices in some order: . (The choice of order won’t affect anything.) Let be the number of edges in a shortest path from to (assuming there is one). Let be the square matrix over whose -entry is , or if there is no path from to .
We’re going to want to invert , so let’s consider its determinant. It’s a polynomial in . Also, , so ; hence is not the zero polynomial. It follows that the matrix is invertible over , the field of rational functions with rational coefficients. We define
to be the sum of all entries of the inverse matrix . This is the magnitude function of the graph .
It’s a fact that the magnitude function of a graph can always be expanded as a power series with integer coefficients. This is a special property of : your average rational function over is merely a Laurent series with rational coefficients.
(The key to the proof is that the constant term of the polynomial is , which is invertible in . It follows that can be expanded as a power series over . Hence can be too.)
What does the magnitude function tell us about a graph?
I’ll come clean: I have no direct evidence that the magnitude function of a graph tells us anything interesting about it. However, in several other settings, magnitude and related invariants have proven to be very fruitful, producing important quantities associated to topological spaces, metric spaces, sets, groupoids, orbifolds, associative algebras, etc. So it’s a good bet that it’s going to be fruitful for graphs too.
Moreover, in the particularly interesting setting of metric spaces, magnitude has been particularly reluctant to give up its secrets. So, the lack of obvious interpretation for the magnitude function of a graph doesn’t make me think it’s less likely to be interesting: it makes me wonder what it’s hiding!
But what does the magnitude function tell us? It’s hard to see. For instance, take the 4-cycle :
How can we interpret this? It’s tempting to try to understand a power series over as the generating function of something, but what? Even ignoring the negative signs, what’s being counted here?
For another example, take to be a forest with vertex-set and edge-set . Then
where is the set of connected-components of . What is this function telling us about the graph?
Tutte vs. magnitude
In some ways, the information contained in the magnitude function seems to be complementary to that contained in the Tutte polynomial. For example:
the magnitude function knows how many vertices a graph has, but the Tutte polynomial doesn’t (at least if we’re allowing disconnected graphs)
the Tutte polynomial knows how many edges a graph has, but the magnitude function doesn’t (at least if we’re allowing multiple edges).
In other ways, the magnitude function and the Tutte polynomial seem more similar than complementary. Let’s consider some families of graphs known to have the same Tutte polynomial:
all trees with a given number of edges have the same Tutte polynomial; they also have the same magnitude function.
The three graphs known as the bull, the (3,2)-tadpole and the cricket (shown below) all have the same Tutte polynomial; they also all have the same magnitude function.
In fact, it’s consistent with the evidence below that for connected graphs, the Tutte polynomial determines the magnitude function.
However, in examples such as those above, I see no way of transforming the Tutte polynomial into the magnitude function. One way of proving that such a transformation exists would be to use the so-called “universal property” of the Tutte polynomial. (I don’t know whether it’s a universal property in the categorical sense.) Roughly, this says that Tutte determines magnitude if is determined by and whenever is an edge of that is neither a bridge nor a loop. But I don’t know whether that’s the case; I suspect not.
All in all, I don’t think the magnitude function is likely to be a specialization of the Tutte polynomial, even for connected graphs. In other words, the magnitude function seems to contain extra information about the graph that Tutte doesn’t. But what?
Properties and examples
I’ll now try to convey an idea of how the magnitude function behaves, by way of some basic properties and examples. For comparison, I’ll show the behaviour of the Tutte polynomial alongside.
Disjoint unions Any two graphs and have a disjoint union .
Joins Suppose we have a graph with a distinguished vertex , and another graph with a distinguished vertex . We can form a new graph , the “join”, by first taking then identifying with .
Knows the number of vertices?
Tutte: no, if we’re allowing graphs to have multiple connected-components: see “discrete graphs” below. Yes, if we stick to connected graphs: then the degree of as a polynomial in (with regarded as constant) is .
Magnitude: yes: it’s .
Knows the number of edges?
Tutte: yes: . (Here and later, I’ll use for the set of edges of a graph , and for the set of vertices.)
Magnitude: no, if we’re allowing loops or multiple edges, since adding a loop or a duplicate edge doesn’t change the magnitude. Yes, if we stick to graphs without loops or multiple edges: the coefficient of in the power series expansion of is times the number of edges. In other words, . (See this comment, and the reply to it, below.)
Discrete graphs Let be a discrete graph, that is, one with no edges at all.
Forests Let be a forest, that is, a disjoint union of trees.
Cycles Let be the -cycle:
Magnitude: . (Those are the floor and ceiling functions in the denominator.) This naturally splits into two cases. If is even then
If is odd then
Complete graphs Let be the complete graph on vertices (that is, every vertex is joined to every other by a single edge).
Tutte: it seems that there is no neat formula for , and that computing it is nontrivial. A single example:
Complete bipartite graphs I’ll just do one. Let be this graph:
Bull, tadpole, cricket These three graphs all have the same Tutte polynomial and the same magnitude function:
Let be any one of them.
There’s no mystery as to why they all have the same Tutte polynomial. It’s simply because all three graphs are the join of and two copies of the graph consisting of a single edge, and the Tutte polynomial of the join of a bunch of graphs is determined by their individual Tutte polynomials. The same goes for the magnitude functions.
Petersen graph Don Knuth observed that the Petersen graph “serves as a counterexample to many optimistic predictions about what might be true for graphs in general”. So even though I don’t have a prediction to be counterexampled, I thought I’d better check its magnitude function for disturbing behaviour. I’m not disturbed.
Tutte: according to Wikipedia, the Petersen graph has Tutte polynomial
Actually, this does defeat a conjecture! Having read the previous examples, you might have wondered whether the coefficients of the magnitude function (when expanded as a power series) always alternate in sign. The Petersen graph shows that they don’t.
Linear codes I’m going to be very sketchy here. (If you want to know more, just say so in the comments.) Let be a finite field. A linear code over is simply a linear subspace of , for some .
Tutte: Every linear code has a Tutte polynomial . (More exactly, the definition of Tutte polynomial extends smoothly from graphs to matroids, and every linear code gives rise to a matroid.) But also, with every linear code is associated an important polynomial called its “weight enumerator”. It’s a fact that the weight enumerator of a code is determined by its Tutte polynomial together with the cardinalities of and .
Magnitude: On the other hand, every linear code can be regarded as a metric space (with the Hamming distance inherited from ), and so has a magnitude function. It’s a fact that the weight enumerator of a code is essentially the same thing as its magnitude function; each determines the other once you know the cardinality of .
So in the case of linear codes, the Tutte polynomial (plus the cardinalities of and ) determines the magnitude function.
The magnitude function of a graph remains mysterious to me. I have very little intuition for what it means. I understand the magnitude of a metric space moderately well for subspaces of , but metric spaces coming from graphs are about as un-Euclidean as can be.
Here are some questions I’d like answering.
What information can be read off from the magnitude function of a graph?
Is there a useful way to regard the power series expansion of the magnitude function as a generating function? In other words, do the coefficients count something?
It’s consistent with the evidence above that two graphs with the same Tutte polynomial and the same number of vertices have the same magnitude function. Is it true?
It’s consistent with the evidence above that two connected graphs with the same Tutte polynomial have the same magnitude function. Is it true?
Consider a formal system, such as logic, type theory, or some programming language. Suppose you want to prove some “meta-theoretic property” of this system, such as the following.
- Consistency: There is no proof of false. This is obviously a desirable property of a formal system.
- The existence property: if it is provable that “there exists an such that ”, then there is some particular such that is provable.
- The disjunction property: if “ or ” is provable, then either is provable or is provable. The existence and disjunction properties are not true of all formal systems: they are characteristic of constructive logics. (Excluded middle obviously violates the disjunction property, while the axiom of choice obviously violates the existence property.) In fact, arguably they are a defining feature of constructive logics, where proving that something exists ought to involve “constructing” a particular such thing.
- Canonicity: any term of natural-number type evaluates to (or, is provably equal to) some numeral (and various similar statements). This is an important property for any system that can function as a programming language: we want our programs to compute results!
- Parametricity: I’ll explain this later on.
How might you go about proving properties like this? If you’re a syntactically minded type theorist, you might try to use induction over types. That is, you first prove something for “base types” such as the natural numbers , then prove that if it holds for and then it holds for derived types such as , , and , etc. After a lot of work reformulating the thing you’re trying to prove as a sufficiently general statement for such an induction to go through, you finally make it all work and write QED. Then, because the reformulation you ended up with involves assigning “relations” (sometimes unary, sometimes binary) to types in an inductive way, you call the method logical relations.
I’ve gathered that this sort of thing is perfectly intuitive to a type theorist. However, when I first encountered it, being a category theorist, I had a very hard time understanding it. Certain steps seemed extremely unmotivated, and I didn’t see any principled reason for what was going on. I think I understand the type theorists’ point of view a little better now — but fortunately there’s another way to structure the whole argument, which I think makes much more sense to a category theorist.
As category theorists, we like to use universal properties to avoid doing work. (Or, more precisely, we like to use them to package up work into an easily reusable theorem, so that we only have to do the work once.) Fortunately, the syntax of type theory has a nice universal property: it constructs free categories with specified structure.
A little more precisely, from any type theory , we can build a category whose objects are, roughly, the types in the theory, and whose morphisms are terms with free variables. That is, a morphism is a term of type involving a free variable of type . Some more details, and a few examples, can be found at the nLab page on syntactic categories.
Now whatever structure our type theory had, it is usually visible as some categorical property of the syntactic category . For instance, if had product types, then has categorical products. If also had function types, then is cartesian closed. And so on. Moreover, is the initial such category: for any other category with the appropriate categorical structure, there is an essentially unique functor preserving that structure. (I’m sweeping a lot of details under the rug here, having to do with weakness and coherence and strictification and what categorical level this universal property lives at, but the basic idea is the important thing.)
Now the question becomes, how can we use this universal property to prove meta-theoretic properties of the type theory? Clearly, we are going to have to cook up some particular structured category (or perhaps more than one) so that the existence of a structure-preserving functor tells us something about . Thus, the construction of will have to involve somehow—but must also be different from . Moreover, we should expect to have to use the uniqueness aspect of the universal property too.
When I first wrote the above paragraph, I said that we were going to have to cleverly cook up some category . Then I felt embarrassed, because I remembered Tom’s comment that “if cleverness is the first quality that comes to mind, then it suggests to me that something is insufficiently understood.” So I thought for a bit, and now I can say more about where this comes from. (I can actually say even more than I’m about to right now, but the rest will have to wait for another post.)
Let’s take a clue from type theory itself. Consider a type like , the natural numbers type. Semantically, we want to think of this as a natural numbers object, which is an initial object in some category. More specifically, for any equipped with morphisms and , there is supposed to be a unique morphism respecting this structure.
Inside of type theory, the existence of such a morphism is easy to specify: it’s called recursion and is the natural “elimination rule” for the type . But uniqueness of that morphism is not as natural, type-theoretically: it would be asserting an equality, which is less “computational”. Instead, we generalize the recursion rule to say that not only does every such come with a morphism out of , every such over comes with a section. This turns out to be equivalent to asking to be initial. For if is initial, then when is over , the map must be a section (by uniqueness of the composite ); and conversely, for any two morphisms , if their equalizer admits a section, then they are equal. A homotopical version of this equivalence was proven recently by Awodey, Gambino, and Sojakova.
(You may wonder whether talking about “sections” is any better than talking about equality, since being a section of means that . However, dependent type theory has a way to talk about sections of particular kinds of maps—called “display maps” or “dependent projections” or “fibrations”—without referring explicitly to any kind of equality.)
This suggests that to use the uniqueness as well as the existence part of the universal property of , we might look for a structured category over , for which it would be interesting to have a section. Moreover, we might expect the functor to be some sort of category-theoretic fibration.
Now how can we build an interesting fibration over , without knowing much about itself? Well, if we had a functor for some other category , and some other functor , then we could build the comma category of over or under . The most obvious choice of , of course, would be the identity functor of . But what sort of functor can we build without knowing much about ? Well, as Tom also reminded us last year, about the only functors out of an arbitrary category that we can even mention are the representables, . And finally, what’s the most canonical object ? One very canonical object is the terminal one.
There are still a few choices that we might make differently, but I hope I’ve convinced you that at least one reasonable possibility to consider for is the scone of : the comma category of over the global sections functor . Explicitly, an object of consists of an object together with a function . Elements of are called global sections of , so we can view as a family of sets indexed by the global sections of . Similarly, a morphism from to consists of a morphism in together with a function making the evident square commute. The projection simply forgets the primed data.
Over a year ago, I talked about scones from a “geometric” point of view. At that time, I said that given a “forgetful functor” , we can think of the scone as a “concretization” of : its objects are “sets equipped with -structure”. It’s at least interesting to ponder this in our current situation: the objects of are “sets equipped with the structure of a type”. I’m not really sure what that means, but it sounds profound.
Anyway, in order to apply the universal property of to get a functor , we’ll need to know that inherits all the categorical structure that has. And to conclude that this map is a section of the projection, we’ll need to know that the projection also preserves all this structure. But suppose we take these two facts as given for a moment, so that exists and is a section of the projection. That last means that for , is the object itself equipped with a family of its global sections. Thus, what does is just to equip every object of with a family of sets indexed by its global sections—that is, a map —in a way which is functorial and structure-preserving.
In particular, insofar as the maps are injective (as often happens), what does is assign to each type a predicate—that is, a unary relation—on its set of global sections. And a global section of a type in in just a “closed term” of type (meaning a term involving no free variables). Thus, in general we may think of the elements of over as “witnesses” or “proofs” that some relation “” holds.
Finally, if we look “under the hood” into the actual construction of , we see that it’s built inductively over the structure of types (since this is how we prove the universal property of ). Thus, these unary relations are precisely the syntactically-minded type theorist’s “logical relations”. What category theory does for us is (1) fold all inductions over the structure of types into the single statement that is initial in some category, so that we only have to do it once; and even more importantly, (2) tell us automatically what each inductive step in the construction of should look like: it’s determined by the corresponding categorical structure in , which in turn is characterized by a universal property.
In order to get any mileage out of this, of course, we need an explicit description of how inherits structure from and from , in a way that is preserved by the forgetful functor . This is basically always completely straightforward. Here are some easy examples that you can check for yourself:
The terminal object of is .
The initial object of is .
The product of and is . In other words, a witness of is just a witness of together with a witness of .
A morphism is a monomorphism just when both and are so. In particular, a subobject of is given by a subobject in , together with for each , a subset .
If includes (“truncated”) existential quantification, so that is a regular category, then is a regular epi just when both and are so.
If includes binary unions of subobjects, then the union of and is .
If includes a type of natural numbers, then the natural numbers object of is , where is the ordinary set of natural numbers and the function sends each to the numeral . In other words, a witness of is a natural number such that is equal to the numeral .
This is enough to prove consistency, canonicity, and the disjunction and existence properties! Consistency is the easiest: if there were a term of type false, it would be a morphism in . But then would take this to a morphism . Since preserves all the categorical structure, is the terminal object and is the initial object . But then we would have a morphism in , which is impossible. (Of course, this is a relative consistency result, as always – if our axioms for were inconsistent, then there would be a morphism .)
For the existence property, suppose we can prove . That means exactly that the composite in is a regular epi. Since preserves the relevant categorical structure in all cases, in this case it preserves regular epis. It follows that is regular epi, and in particular is a surjective function. But also preserves the terminal object, so , and hence must be nonempty. Therefore, is nonempty, so has a global element in . In other words, there is a closed term such that is provable. The argument for the disjunction property is similar; I’ll leave it as an exercise. (I think Peter Freyd was the first to prove these two properties categorically in this way.)
Finally, for canonicity, suppose is a term of natural-number type, hence a global section in . Then is a morphism in , since preserves the terminal object and the natural numbers object. But is the usual set of natural numbers, so this means that must be equal to some numeral (namely, the primed part of ).
I think this much is already pretty awesome, but now we get to parametricity. For this we need to know, first of all, what exponentials look like in . Of course, they must be preserved by the forgetful functor, so the underlying -object of is the exponential in . But now the elements of the set which lie over must be equivalent to maps in lying over in , and hence equivalently to maps lying over . In other words, if is a morphism in , hence a global element of , then the fiber of over consists of all functions lifting .
In particular, if and are monic, hence merely predicates on the global sections of and respectively, then is also a predicate on global sections of , which holds of precisely when maps into . In other words, a witness of a function is a function exhibiting the fact that preserves witnesses. (This isn’t a proof that has exponentials, of course, only a characterization of what they must be if they exist. But as usual with this sort of thing, it’s easy to prove that this definition works.)
More generally, we’ll need local exponentials (exponentials in slice categories) and dependent products, but these work in essentially the same way: to every “function term” we assign the set of “liftings” of it to witnesses.
The other thing we need to know about is universe objects. Thus, suppose is a type of (small) types, giving a universe object in . In particular, we have the type of pointed types, with the first projection being the “universal fibration” in . That is, for any type , dependent types over are classified by maps , with the resulting display map being the corresponding pullback of .
Now the first thing we need is a family of sets indexed by the global sections of . Of course, a global section of is just a type dependent over , which is to say a type defined in the empty context. Now given such an , a witness of must correspond to a map in lifting in . But if is to be a universe object, such maps must be equivalent to objects of lying over . In other words, a witness of a type is any lifting of to an object of . Any lifting at all.
In particular, since must preserve all structure, we must have . Note that this is not merely a unary relation on : a given type has many witnesses.
We also need the universal fibration . But this is likewise easy: suppose is an object of , hence a global element of , and suppose also that is a global section of . Then we must have a commutative square of sets
Elements of lying over should be the same as liftings of to which agree with . But since the object must be the pullback of along its classifying map , this means precisely to give an element of lying over .
That is, for a type with a witness , a witness of is just an element of over , i.e. a witness of in the ordinary sense defined above.
Finally, we’re ready for parametricity. Consider the type , an inhabitant of which is a function assigning to every (small) type an endomorphism of . In classical mathematics, there are lots and lots of such functions. However, in constructive type theory, there’s essentially only one such function: the one which assigns to each its identity function! The point is that without some additional assumption such as excluded middle or AC, we can’t write down such a function that behaves differently depending on what is: e.g. we can’t say “if has exactly 2 elements, then swap them, otherwise, …”. (I believe this is the origin of the word “parametricity” — we say the function has to be defined parametrically in .) And if you just have an arbitrary , without knowing anything about it, then as Tom would say, what’s the only function that you can even mention? The identity function.
Of course, you can throw other garbage into the definition of such a function, but it won’t be able to change the fact that what comes out at the end of the day acts like the identity function. This is the theorem that we can now prove using the scone.
Categorically, the type is built as the local exponential in the slice category of over , followed by the dependent product from this slice category to itself. And all these constructions must be preserved by . Unraveling this, the first step gives us an object of over , whose component in is the dependent type . The additional data assigned by consists of
- for every (this is a global section of in ), and
- for every choice of a lift of to (this is a witness of ), and
- for every term (this is a global section of lying over the classifying map of ),
the set of liftings of to witnesses. Note that these witnesses are precisely given by the specified above. If we think only about relations, then at this stage, is recording, for every term , whether it preserves each unary relation on .
Now at the second step, we consider the liftings of each to the witnesses defined above. In other words, a witness of such a consists of, for every and every , a lifting of to these witnesses. In particular, this means that the function preserves every unary relation on the set . Clearly this is only possible if is the identity function (consider the characteristic relations of singletons).
Finally, any actual term yields a global section in , hence a global section of , and hence a witness of in the above sense. This is a precise form of the first parametricity theorem: any term of type behaves like the identity function when applied to terms in the empty context.
What is the point of all this? I think the example of universe objects is where the category-theoretic point of view really shines. When you do “logical relations” from the syntactic type-theory point of view, you basically have to make up each inductive clause of the definition of the logical relation as you go. The definition for function types is not too surprising, if you think about the unary relation as a sort of “well-behavedness”: it’s not unreasonable to say that a function is well-behaved if it maps well-behaved inputs to well-behaved outputs. (I believe the usual word for “well-behaved” in this context is “reducible”.) But I think the definition for a universe is pretty opaque from this point of view: a witness to reducibility of a type is any relation on its global elements at all?
Moreover, if you insist on thinking of as merely a relation on global elements, then you can’t even phrase it like this. Basic presentations of parametricity don’t actually include a universe object; instead they work in a “polymorphic type theory”, where you can form types that behave like “” but which are not literally dependent products over any type “”. Then you basically have to make up the combined rule for defining on these types as above: is reducible if for every type , the function preserves every unary relation on .
Let me end with a few words about generalizations. There are lots of other “parametricity” theorems – have a look at this classic paper. For instance, for any definable functors and , any term of type must be a natural transformation. To prove this, you need not just unary logical relations but binary logical relations. But these is easy to do categorically too: instead of the scone of , we consider the category whose objects consist of an , a set , and a function . This category inherits all the relevant structure in essentially the same way.
The most general context that I know of in which to do this is the colax limit of a (pseudo)functor , where is an inverse category. If is the interval category and picks out , then this colax limit is the scone. If is the parallel pair and picks out twice, then this colax limit is the place for binary relations I mentioned above. We obviously have -ary relations for all , and also potentially “higher” logical relations. (Are there any uses for those?)
The assumption that is inverse is what gives things like exponentials and universe objects such a nice form in the colax limit. I made use of that same fact in this paper, focusing mainly on the special case of colax limits of constant functors — which is the same as categories of diagrams over in some fixed category . But I recently realized that the same techniques work for non-constant functors, so that the scones work homotopically as well (up to a point), and in particular, inherit things like the univalence axiom.
In principle, this means that we can extend all the above theorems to type theory with univalence. Canonicity in the presence of univalence, in particular, has been an important open question since its introduction: does the univalence axiom introduce new closed terms of natural number type whose numeric value isn’t determined? Dan Licata and Bob Harper showed that the answer is essentially no in the 1-truncated case, by adding definitional equality rules to make the type theory 1-truncated, and then giving a logical relations argument to show canonicity. The gluing approach, on the other hand, doesn’t modify the underlying type theory, but yields instead a homotopical answer: every term of natural number type is homotopic to a numeral. (That this is true was conjectured by Voevodsky when he introduced the univalence axiom.)
Unfortunately, the gluing approach also suffers from a coherence problem. In a homotopical setting, we’d like to glue along a “global sections” functor valued in simplicial sets (or some other model for ). However, it seems hard to obtain a strict functor of this sort. We do have global sections functors into or , and since and contain one and two univalent universes, respectively, we can derive homotopy canonicity for 1-truncated type theory (in which all types are 1-types) with one or two univalent universes. (The groupoid scone was also considered, without univalence, in this paper by Hofstra and Warren.) The new version of my paper (posted today) describes this; it’s an open question how any of these results could be extended to type theories without a truncation axiom, and with arbitrarily many univalent universes.
Acknowledgments: this post owes a lot to Bob Harper, from whom I first learned about logical relations at OPLSS; to Peter Lumsdaine, who first tried to tell me what they had to do with scones; and to Thierry Coquand, whose talk at IAS last term about regarding setoids as truncated semi-simplicial types finally crystallized for me the relation between inverse diagrams and logical relations.
My student Roald Koudenburg recently successfully defended his thesis and has yesterday put his thesis on the arXiv.
Roald Koudenburg, Algebraic weighted colimits
I will give a rough caricature of what he does. For a much nicer overview, I suggest you read the well-written introduction to the thesis! (Relating to the Café, there’s even an example including Simon Wadsley’s Theorem into Coffee.)
Roald was originally thinking about Ezra Getzler’s approach to operads in Operads revisited, and needed to generalize a result of Getzler’s on when the left Kan extension of a symmetric monoidal functor along a symmetric monoidal functor is itself a symmetric monoidal functor.
If you are of that kind of persuasion, you will be aware that symmetric monoidal categories are the pseudo-algebras for the free symmetric strict monoidal category -monad on , the -category of categories, and that symmetric monoidal functors are the algebra maps for this -monad.
You might then consider the situation of a -monad on a -category, and ask when the left Kan extension of a map of -algebras along a map of -algebras is again a -algebra map.
Here are three examples of algebras for a -monad that Roald considers.
Ordered compact Hausdorff spaces are algebras for the ultrafilter -monad on -Cat. In this case the question becomes, when is the left Kan extension of a continuous order preserving map along another such map also continuous and order preserving?
Double categories can be considered as algebras for a certain -monad on a -category of internal categories in a specific presheaf category. (Yes, I find that a mouthful.)
Similarly, monoidal globular categories of Batanin are algebras for some -monad.
Going back to the symmetric strict monoidal category monad, this monad actually extends from the -category to the the proarrow equipment of categories, functors and profunctors. Mike wrote a nice post here at the Café on Equipments and what they have to do with limits.
It turns out that there are several examples of -monads on equipments, with interesting algebras and Roald looks at conditions necessary for when, given such a -monad , the left Kan extension of a map of -algebras along map of -algebras is a map of -algebras.
I should at this point say something precise about the main result: given a ‘right suitable normal’ monad on a closed equipment , Roald defines ‘right colax -promorphisms’ which, together with the usual colax T-algebra maps, form an equipment . The main theorem is the following.
Theorem: Let be a ‘right suitable normal’ monad on a closed equipment . The forgetful functor ‘lifts’ all weighted colimits. Moreover its lift of a weighted colimit , where is a pseudomorphism and is a right pseudopromorphism, is a pseudomorphism whenever the canonical vertical cell
is invertible, where is the structure map of .
It had always seemed to me that category theory provided no useful perspective on graph theory. But yesterday I learned a small fact that caused me to slightly revise that opinion. It’s that colourings of graphs — which seem to be a major source of questions in graph theory — are simply homomorphisms into a complete graph.
I’ll explain this, and I’ll describe a conjecture that Tom Hirschowitz told me about yesterday: for graphs and ,
The chromatic number of is the minimum of the chromatic numbers of and .
It’s amazing that something so basic is unknown!
For this post, a graph is a finite set equipped with a symmetric, irreflexive binary relation. The elements of the finite set are called the “vertices”, the relation is usually called , and rather than saying that two vertices are related, we say that there is an “edge” between them. Concretely, then, “graph” means a finite undirected graph without loops (that being the “irreflexive” part) or multiple edges.
For instance, for each natural number , the complete graph has vertex-set and an edge between and whenever .
A homomorphism of graphs is what you’d imagine. That is, given graphs and , a homomorphism is a map of sets such that for ,
This gives us a category of graphs.
Graph theorists know a lot about colourings. By definition, a colouring of a graph by colours, or an -colouring of for short, is a way of painting each vertex one of colours in such a way that no two vertices of the same colour have an edge between them.
Here’s the small fact I learned yesterday (from Wikipedia):
A -colouring of a graph is the same thing as a homomorphism .
Why? Well, writing , a homomorphism is a map of sets such that if have an edge between them then the vertices and of have an edge between them. But every pair of vertices of is joined by an edge except when . So a homomorphism is a map of sets such that if and are joined by an edge then . That’s exactly an -colouring.
This fact makes it obvious that colouring is functorial: given a homomorphism , every -colouring of gives rise to an -colouring of . (Simply compose the maps .) Concretely, we paint each vertex of the same colour as .
Every graph can be -coloured for sufficiently large : just paint all the vertices different colours. The chromatic number of is the smallest for which there exists an -colouring of .
The functoriality of colouring means that whenever there is a homomorphism . Categorically put, is a functor
Here is the poset regarded as a category: it has one object for each natural number, there is one map when , and there are no maps when .
The very famous four-colour theorem states that the chromatic number of a planar graph is at most four. For a long time it was only a conjecture. But there’s another fundamental conjecture about chromatic numbers — perhaps even more fundamental, as it doesn’t refer to the geometric notion of planarity. Tom Hirschowitz mentioned it to me by email yesterday; he’d just heard it from Stéphan Thomassé. Here goes.
The category of graphs has binary products, defined in the expected way: given graphs and , the vertex-set of is , and there’s an edge between and if and only if there are an edge between and and an edge between and . The projections and are what you’d expect.
Now, the fact that there are any homomorphisms implies that . The same goes for . So:
for all graphs and .
The conjecture is that this is an equality:
Conjecture (Hedetniemi, 1966) for all graphs and .
It’s known to be true for various classes of s and s, and I guess people have had their computers check it in millions of special cases. As I only learned about it yesterday, I know almost nothing more — only what I’ve read on Wikipedia and this page by Stéphan Thomassé. But presumably a question this basic, open since 1966, has attracted a lot of attention.
Intuitively (as Wikipedia says) the meaning of the conjecture is that has no sneakily economical colourings. In other words, if your aim is to colour with as few colourings as possible, you can’t improve on the following strategy: either colour minimally and then declare the colour of to be the colour of , or do the same with the roles of and reversed.
Categorically, the conjecture can be expressed as follows:
Conjecture (Hedetniemi) The functor preserves binary products.
That’s just because binary products in the category are minima.
Postscript: Perturbing the definition
In what I wrote above, I tried to stick closely to what I understand to be graph theorists’ conventions, except of course that some things are phrased in categorical language. I don’t know nearly enough graph theory to challenge the wisdom of those conventions.
However, I can’t help wondering whether things would become cleaner if we allowed our graphs to have loops. Formally, a graph-with-loops is a finite set equipped with a symmetric binary relation. Thus, what we’re calling a “graph” is a special sort of graph-with-loops: it’s a graph-with-loops with no loops! Formally, it’s a graph-with-loops in which the relation is irreflexive.
Homomorphisms of graphs-with-loops are defined just as for ordinary graphs, and products are described in the same way too. Thus, is a full subcategory of , the category of graphs-with-loops, and the inclusion preserves binary products.
Why allow loops? The reasons are aesthetic.
First, the irreflexivity condition tastes strange on a category theorist’s tongue. We’re suspicious of negatives.
Second, the category of graphs has the strange feature of possessing binary products but not a terminal object. This is “fixed” (if you think it’s a problem) by passing to the category of graphs-with-loops, in which the terminal object is the graph consisting of a vertex and a single loop on it.
Third, there’s a general rule, discussed at the Café before (though I can’t find it now) and possibly due to Grothendieck:
A good category containing some bad objects is preferable to a bad category containing only good objects.
I’m not keen on the words “good” and “bad” in mathematics, but I understand the intent. For instance, the rule tells us to work with the category of schemes rather than the category of varieties. The category of varieties is “bad” in lacking certain colimits, for example; but from the traditional point of view, some schemes are “bad”. Similarly, it suggests that categories of smooth spaces are preferable to categories of manifolds.
In our present situation, the presence of loops might be thought to make a graph “bad”, but allowing such graphs makes the category good.
One way in which a graph containing loops is “bad” is that it has no colourings at all. If a vertex has a loop on it, then there’s an edge between and , so there’s no colour you can give it. Thus, the chromatic number of a graph containing one or more loops should be taken to be . Chromatic number therefore defines a functor
The category is also “better” than : unlike , it has a terminal object (greatest element), . Since , the terminal graph-with-loops, contains a loop, we have . That is, preserves terminal objects. So it preserves binary products if and only if it preserves all finite products.
Actually, there’s something funny going on. A loop in a graph is a homomorphism . In categories of spaces, a map from the terminal object to a space is often called a point of . So, loops are points. This means that the graphs without loops — the “good” graphs — are exactly the graphs without points! What’s funny is that in some categories of spaces, such as the category of locales, the nontrivial spaces with no points at all tend to be seen as rather exotic. (For example, there’s the locale of surjections , which is nontrivial but has no points.)
Hedetniemi’s conjecture is straightforward when or contains a loop. For if contains a loop then we have a homomorphism , which induces another homomorphism
So . (Concretely, if contains a loop on a vertex then is an isomorphic copy of contained in .) But we also know that ; so . Since , this is exactly what Hedetniemi’s conjecture predicts.
So the conjecture can equivalently be reformulated as:
Conjecture (Hedetniemi) The functor preserves finite products.
At first glance this looks harder than the previous versions, since is a larger category and we’re asking for the preservation of terminal objects as well as binary products. But the observations above show that it’s actually equivalent.
Here’s a final question. The chromatic number functor is defined in terms of the complete graphs . But what’s so special about the s? What is their place within the category of graphs? I see no nice universal property that complete graphs satisfy. They look “indiscrete”, so at first I looked for some universal property similar to that satisfied by indiscrete topological spaces or indiscrete categories. But no — you have to be careful about the lack of loops. Perhaps the interplay of perspectives between and is somehow helpful here.
I’d like to remind you of this workshop, which is coming up soon:
- Category-Theoretic Foundations of Mathematics Workshop, Department of Logic and Philosophy of Science, U.C. Irvine, May 4-5, 2013.
The schedule of talks is available now; I’ll just tell you what they’re about.
Here are the talks:
Speaker: Samson Abramsky, Oxford
Title: Category theory as a tool for making models
Abstract: I will discuss category theory in its aspect as a tool for mathematical modelling in the sciences. It has played an important part in computer science and some areas in theoretical physics for some decades, and is being used increasingly in areas ranging from quantum information and foundations to natural language semantics and game theory. I shall discuss how the conceptual and foundational features of category theory have a strong influence on its use in mathematical modelling and applications.
Speaker: John Baez, UC Riverside
Title: The Foundations of Applied Mathematics
Abstract: Suppose we take "applied mathematics" in an extremely broad sense that includes math developed for use in electrical engineering, population biology, epidemiology, chemistry, and many other fields. Suppose we look for mathematical structures that repeatedly appear in these diverse contexts - especially structures that aren't familiar to pure mathematicians. What do we find? The answers may give us some clues about the concepts that underlie the most applicable kinds of mathematics. We should not be surprised to find some category theory hiding here!
Speaker: Olivia Caramello, Cambridge
Title: Grothendieck toposes as unifying 'bridges' in Mathematics
Abstract: I will present a novel view of Grothendieck toposes as unifying spaces in Mathematics being able to effectively serve as 'bridges' for transferring concepts and results across distinct mathematical theories. This approach, first emerged in the context of my Ph.D. research, has already generated many applications into different mathematical fields, including Topology, Algebra, Geometry, Functional Analysis, Model Theory and Proof Theory, and the potential of this theory has just started to be explored. In the lecture, I will explain the fundamental principles that characterize my view of toposes as unifying 'bridges', and illustrate the technical usefulness of these methodologies by discussing a few selected applications.
Title: Category theory and set theory: examples of their interaction
Abstract: I will begin by reviewing basic arguments and counterarguments about either set theory or category theory as a possible foundational theory for mathematics. I will then devote the main part of my talk to a few examples of positive interaction between both theories in the field of logic. Two examples will be considered more specifically, the first one in connection with algebraic set theory, the second one in connection with sketch theory.
Title: Structuralist foundations for abstract mathematics
Abstract: As Paul Benacerraf famously pointed out, set-theoretic foundations provide too much information for the working mathematician. Our talk is driven by the question: is it possible or desirable to develop a foundational framework that omits the irrelevant and distracting information? We consider one proposal motivated by the slogan that, "all mathematically relevant properties are structural." Unlike many previous structuralist proposals, this proposal is linguistic (syntactic) in nature. Namely, we present a language L together with an intended semantics, such that the following hold: (1) L is rich enough to be considered foundational; (2) L satisfies the appropriate adequacy conditions with respect to its intended semantics; and crucially (3) well-formed sentences in L are necessarily invariant under isomorphisms of the relevant type of structure. This proposal is strongly influenced by Michael Makkai’s "structuralist foundation of abstract mathematics," and his language FOLDS (first-order logic with dependent sorts).
Speaker: Ralf Krömer, Siegen
Title: "Psychological priority" vs. effects of training. The foundational debate on Category theory revisited
Abstract: The paper develops a particular answer to a standard objection (by Feferman) against uses of Category theory as a foundations of mathematics. This objection roughly says that concepts like "collection" and "operation" are "psychologically prior" to the concept of category. Our answer amounts to showing the relevance of mathematical training to the foundational debate on Category theory.
First, we briefly review this debate from the beginnings to Bénabou's 1985 proposal to use elementary portions of Category theory as a foundation of "naive" Category theory as a whole; we discuss both some of the relevant mathematical and metamathematical facts and a number of records of how workers in the field interpret these facts and their philosophical impact. The main thesis of the paper is that "psychological priority" (Feferman) is in the last analysis irrelevant for the problem of a "proper presentation" (Isbell) of Category theory; one needs to take into account the effects of training because without training one is not able to judge the "properness''.
In order to develop this claim more fully, a sidestep is made to the epistemology of C.S. Peirce, especially to his account of intuition and the role of consciousness in hierarchies of cognitions. What is particularly relevant for us in Peirce's anticartesian approach is his criticism of the role of the individual for epistemology. While traditional philosophy of the foundations of mathematics seems to seek a foundational stance accessible for every individual in isolation (and this is also implicit in Feferman's conception of psychological priority), Peirce stresses the importance of the presence of a community, of collective processes of regulation of work with concepts, and of learning in such a situation. On these grounds, a (still sketchy) comparison between hierarchies of cognitions in the sense of Peirce and hierarchies of mathematical concepts is presented. The Peircean conception of intuition is found to be relevant whenever the role played by mathematical training for mathematical intuition is stressed. This ultimately leads to a revision of the notion of foundations of mathematics.
Title: Some elements of the descent theory, from the point of view of Algebraic Geometry, with philosophical remarks.
Abstract: In the framework of CT descent theory is a characterization of morphisms (descent morphism) that constitute a theory of generalization and of extension concerning very different mathematical concepts and theoretical points of view.
Also, right before the conference, I’m giving a department colloquium talk on “Key Moments in Category Theory” on Friday May 3. The Department of Logic and Philosophy of Science at Irvine has been doing a number of other category-theoretic activities lately, including Colin McLarty’s visit last spring and Jim Weatherall’s category theory course this spring.
The free monoid on a set is
and the same formula works in many other categories. I guess it works in any monoidal category with coproducts, where the tensor product distributes over coproducts.
Sometimes it’s nice to study this free monoid construction using geometric series, by pretending that
It doesn’t exactly make sense, at least not to me, since I can’t think of any categories where I know both what “” means for an object , and what the “reciprocal” of an object means. But we can still sometimes squeeze some useful insights out of this idea.
But Mike Stay asks, what about ?
If we say
We’re back where we started from! This is familiar in the theory of , where the matrix
acts as a fractional linear transformation
But this matrix and its negative give the same fractional linear transformation. So, it’s really the quotient that matters here. The transformation corresponds to a famous element of order 6 in , namely
But when you cube this element, you get
so the corresponding element in has order 3.
Anyway, Mike’s puzzle is this:
Puzzle. Is there some context in which the cube of the free monoid construction is ‘the same’ as doing nothing at all… perhaps in some weak sense of ‘the same’?
When I say ‘weak sense’ you should be thinking things like ‘ has some of the same invariants as ’, or maybe ‘ has some of the same invariants as ’. And you should be reminded of Andreas Blass’ paper Seven trees in one, which uses similar shenanigans to motivate, and then find, a nice bijection between and , where is the set of finite binary trees.
Of course, because is a monad, we have morphisms and . But I don’t see how these help with Mike’s puzzle.