Souvenirs from Bangalore 2015

by Orr Shalit

Last week I attended the conference “Complex Geometry and Operator Theory” at the Indian Statistical Institute, Bangalore. The conference was also an occasion to celebrate Gadadhar Misra’s 60th birthday.

As usual for me at conferences, I played a game with myself in which my goal was to find the most interesting new thing I learned, and then follow up on it to some modest extent. Although every day of the three-day conference had at least two excellent lectures that I enjoyed, I have to pick one or two things, so here goes.

1. Noncommutative geometric means

The most exciting new-thing-I-learned was something that I heard not in a lecture but rather in a conversation I had with Rajendra Bhatia in one of the generously long breaks.

A very nice exposition of what I will briefly discuss below appears in this expository paper of Bhatia and Holbrook.

The notion of arithmetic mean generalizes easily to matrices. If A,B are matrices, then we can define

M_a(A,B) = \frac{A+B}{2}.

When restricted to hermitian matrices, this mean has some expected properties of a mean. For example,

  1. M_a(A,B) = M_a(B,A),
  2. If A \leq B, then A \leq M_a(A,B) \leq B,
  3. M_a(A,B) is monotone in its variables.

A natural question – which one may ask simply out of curiosity – is whether the geometric mean (x,y) \mapsto \sqrt{xy} can also be generalized to pairs of positive definite matrices. One runs into problems immediately: if A and B are positive definite but do not commute, then their product AB need not be a positive matrix (it need not even be hermitian), so one cannot simply extract a “positive square root” from AB.
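To see the obstruction concretely, here is a small worked example; the two matrices are chosen purely for illustration. Both are positive definite (the eigenvalues of B are 1 and 3), yet their product is not even symmetric:

A = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}, \quad B = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \quad AB = \begin{pmatrix} 2 & 1 \\ 2 & 4 \end{pmatrix} \neq \begin{pmatrix} 2 & 2 \\ 1 & 4 \end{pmatrix} = BA = (AB)^*.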

It turns out that one can define a geometric mean as follows. For two positive definite matrices A and B, define

(*) M_g(A,B) = A^{1/2} \sqrt{A^{-1/2} B A^{-1/2}} A^{1/2} .

Note that when A and B commute (in particular, when they are scalars), M_g(A,B) reduces to \sqrt{AB}, so this is indeed a generalisation of the geometric mean. No less importantly, it has all the nice properties of a mean, in particular properties 1-3 above. It is not evident that it is symmetric (the first property), but once symmetry is granted, the other two properties follow readily.
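Formula (*) is easy to try out numerically. The following is a minimal sketch, assuming real symmetric positive definite matrices and NumPy; the helper names `spd_power` and `geometric_mean` and the test matrices are my own choices for illustration. It checks symmetry, the commuting case, and the standard fact that M_g(A,B) solves the equation X A^{-1} X = B.

```python
import numpy as np

def spd_power(M, p):
    """Return M**p for a real symmetric positive definite M, via eigendecomposition."""
    w, V = np.linalg.eigh(M)          # M = V diag(w) V^T with all w > 0
    return (V * w**p) @ V.T           # V diag(w**p) V^T

def geometric_mean(A, B):
    """Evaluate formula (*): A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}."""
    A_half = spd_power(A, 0.5)
    A_neg_half = spd_power(A, -0.5)
    inner = A_neg_half @ B @ A_neg_half
    inner = (inner + inner.T) / 2     # symmetrise away rounding noise
    return A_half @ spd_power(inner, 0.5) @ A_half

# Two non-commuting positive definite matrices (illustrative values).
A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[3.0, 0.0], [0.0, 1.0]])
G = geometric_mean(A, B)

# Property 1: the mean is symmetric in its arguments.
print(np.allclose(G, geometric_mean(B, A)))
# G solves the equation X A^{-1} X = B.
print(np.allclose(G @ np.linalg.inv(A) @ G, B))
# Commuting (diagonal) case: the mean is the entrywise sqrt of the product.
C = np.diag([4.0, 9.0])
D = np.diag([1.0, 4.0])
print(np.allclose(geometric_mean(C, D), np.diag([2.0, 6.0])))
```

All three checks should print `True`. Working through the eigendecomposition rather than a general matrix square root keeps every intermediate matrix symmetric, which is the only case needed here.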

Now suppose that one needs to consider the mean of more than two – say, three – matrices. The arithmetic mean generalises painlessly:

M_a(A,B,C) = \frac{A + B + C}{3}.

As for the geometric mean, no appropriate algebraic expression generalising equation (*) above has been found. About a decade ago, Bhatia and Holbrook and, separately, Moakher found a geometric way to define the geometric mean of any number of positive definite matrices.

The key is that they view the set \mathbb{P}_n of positive definite n \times n matrices as a Riemannian manifold, where the length of a curve \gamma : [0,1] \rightarrow \mathbb{P}_n is given by

L(\gamma) = \int_0^1 \|\gamma(t)^{-1/2} \gamma'(t) \gamma(t)^{-1/2}\|_2 dt,

where \|\cdot\|_2 denotes the Hilbert-Schmidt norm \|A\|_2 = \sqrt{trace(A^*A)}. The length of the geodesic (i.e., curve of minimal length) connecting two matrices A, B \in \mathbb{P}_n then defines a distance function on \mathbb{P}_n, \delta(A,B).
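For orientation, here are the standard closed forms for this geodesic and this distance. I state them without proof, and I reuse the letter \gamma for the geodesic from A to B:

\gamma(t) = A^{1/2} (A^{-1/2} B A^{-1/2})^t A^{1/2}, \quad t \in [0,1],

\delta(A,B) = \|\log(A^{-1/2} B A^{-1/2})\|_2.

As a sanity check, in the scalar case n = 1 with positive numbers a, b these read \gamma(t) = a^{1-t} b^t and \delta(a,b) = |\log b - \log a|, so the distance is the usual distance between logarithms.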

Now, the connection to the geometric mean is that M_g(A,B) turns out to be equal to the midpoint of the geodesic connecting A and B! That’s neat, but more importantly, this gives insight into how to define the geometric mean of three (or more) positive definite matrices: simply define M_g(A,B,C) to be the unique point X_0 in the manifold \mathbb{P}_n which minimises the quantity

\delta(A,X)^2 + \delta(B,X)^2 + \delta(C,X)^2.

This “geometric” definition of the geometric mean of positive definite matrices turns out to have all the nice properties that a mean should have (the monotonicity was an open problem, but was resolved a few years ago by Lawson and Lim).
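Since the mean of three matrices is defined as a minimiser, in practice it has to be computed iteratively. Below is a minimal sketch of one common approach, a fixed-point (Riemannian gradient descent) iteration with step size 1. This is an assumption on my part and is not taken from the papers mentioned above. The names `sym_apply` and `karcher_mean` and the test matrices are hypothetical, the code assumes real symmetric positive definite inputs, and convergence with this fixed step is not guaranteed in general, although it behaves well for matrices that are not too far apart.

```python
import numpy as np

def sym_apply(M, f):
    """Apply the scalar function f to the eigenvalues of a real symmetric matrix M."""
    w, V = np.linalg.eigh(M)
    return (V * f(w)) @ V.T

def karcher_mean(mats, tol=1e-10, max_iter=200):
    """Approximate the minimiser of sum_i delta(A_i, X)^2 over positive definite X."""
    m = len(mats)
    X = sum(mats) / m                         # start from the arithmetic mean
    for _ in range(max_iter):
        X_half = sym_apply(X, np.sqrt)
        X_neg_half = sym_apply(X, lambda w: 1.0 / np.sqrt(w))
        # Average of log(X^{-1/2} A_i X^{-1/2}); it vanishes exactly at the minimiser.
        T = sum(sym_apply(X_neg_half @ A @ X_neg_half, np.log) for A in mats) / m
        X = X_half @ sym_apply(T, np.exp) @ X_half
        X = (X + X.T) / 2                     # symmetrise away rounding noise
        if np.linalg.norm(T) < tol:
            break
    return X

# Three positive definite matrices (illustrative values).
A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[3.0, 0.0], [0.0, 1.0]])
C = np.array([[1.0, 0.5], [0.5, 2.0]])
G3 = karcher_mean([A, B, C])
print(G3)

# The determinant of the mean is the geometric mean of the determinants.
dets = np.linalg.det(A) * np.linalg.det(B) * np.linalg.det(C)
print(np.isclose(np.linalg.det(G3), dets ** (1.0 / 3.0)))

# Commuting (diagonal) case: entrywise cube root of the product.
D3 = karcher_mean([np.diag([1.0, 8.0]), np.diag([8.0, 1.0]), np.diag([1.0, 27.0])])
print(np.allclose(D3, np.diag([2.0, 6.0])))
```

The stopping test uses the averaged logarithm T, which is (up to a change of coordinates) the gradient of the quantity being minimised, so a small T means the current X is close to a critical point.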

This is a really nice mathematical story, but I was especially happy to hear that these noncommutative geometric means have found highly nontrivial (and important!) applications in various areas of engineering.

In various engineering applications, one makes a measurement whose result is a matrix. Since measurements are noisy, a first step towards a clean estimate of the true value of the measured matrix is to repeat the measurement and take the average, or mean, of the measurements. In many applications the mean that turned out to be most successful in practice is the geometric mean described above. Although the problem of generalising the geometric mean to pairs of matrices and then to tuples of matrices was pursued by Bhatia and his colleagues mostly out of mathematical curiosity, it turned out to be very useful in practice.

2. The Riemann hypothesis and a Schauder basis for \ell^2

I also have to mention Bhaskar Bagchi’s talk, which stimulated me to go and read his paper “On Nyman, Beurling and Baez-Duarte’s Hilbert space reformulation of the Riemann hypothesis”. The main result (which is essentially an elegant reformulation of a quite old result of Nyman and Beurling, see this old note of Beurling) is as follows. Let H be the weighted \ell^2 space given by all sequences (x_n)_{n=1}^\infty such that

\sum_n \frac{|x_n|^2}{n^2} < \infty.

In H consider the sequence of vectors:

\gamma_2 = (1/2, 0, 1/2, 0, 1/2, 0,\ldots)

\gamma_3= (1/3, 2/3, 0, 1/3, 2/3, 0, 1/3, 2/3, 0,\ldots)

\gamma_4 = (1/4, 2/4, 3/4, 0, 1/4, 2/4, 3/4, 0, \ldots)

\gamma_5 = (1/5, 2/5, 3/5, 4/5, 0, 1/5, \ldots),

etc. Then Bagchi’s main result is

Theorem: The Riemann Hypothesis is true if and only if the sequence \{\gamma_2, \gamma_3, \ldots \} is total in H.

This is interesting, though such results can always be interpreted simply as a claim that the necessary and sufficient condition is now provably hard. Clearly, nobody expects this to open up a fruitful path by which to approach the Riemann hypothesis, but it gives a nice perspective, as Bagchi writes in his paper:

[The theorem] reveals the Riemann hypothesis as a version of the central theme of harmonic analysis: that more or less arbitrary sequences (subject to mild growth restrictions) can be arbitrarily well approximated by superpositions of a class of simple periodic sequences (in this instance, the sequences \gamma_k).
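To make the “simple periodic sequences” concrete: the n-th coordinate of \gamma_k, as listed above, is the fractional part of n/k. Here is a minimal sketch in plain Python that generates initial segments of these vectors with exact rational arithmetic. The function names `gamma` and `weighted_inner` are my own, and `weighted_inner` computes only a partial sum of the inner product that induces the norm of H for real sequences.

```python
from fractions import Fraction

def gamma(k, length):
    """First `length` coordinates of gamma_k; the n-th one is the fractional part of n/k."""
    return [Fraction(n % k, k) for n in range(1, length + 1)]

def weighted_inner(x, y):
    """Partial sum of sum_n x_n * y_n / n^2 over the coordinates supplied."""
    return sum(a * b / (n * n) for n, (a, b) in enumerate(zip(x, y), start=1))

# Reproduce the initial segments displayed in the text.
for k in (2, 3, 4, 5):
    print(k, [str(c) for c in gamma(k, 8)])

# Partial sums of the squared norm of gamma_2 in H. They stay bounded because
# every coordinate lies in [0, 1) and sum_n 1/n^2 converges.
for length in (10, 100, 1000):
    g = gamma(2, length)
    print(length, float(weighted_inner(g, g)))
```

Note that fractions are printed in lowest terms, so the coordinate 2/4 of \gamma_4 shows up as 1/2.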