The complex matrix cube problem – new results from summer projects
by Orr Shalit
In this post I will summarize the results obtained by my group in the “Summer Projects Week” that took place two weeks ago at the Technion. As in last time (see here for a summary of last year’s project) the title of the project I suggested was “Numerical Explorations of Open Problems from Operator Theory”. This time, I was lucky to have Malte Gerhold and Satish Pandey, my postdocs, volunteer to help me with the mentoring. The students who chose our project were Matan Gibson and Ofer Israelov, and they did fantastic work.
If is an operator on a Hilbert space (we will henceforth write this as ), another operator on a larger space (we assume ) is said to be a dilation of if
where we have written in block operator form with respect to the decomposition . In this case, is said to be a compression of . We then write . If and are -tuples of operators, we say that is a dilation of , and that is a compression of , if for all (it is implicit that we use the same embedding of in ). We then write .
A -tuple is said to be normal if is normal and for all . Normal tuples of operators are the best understood ones, thanks to the spectral theorem. When a normal tuple acts on a finite dimensional space then it is simultaneously unitarily diagonalizable.
For an operator we define its norm to be the operator norm of , that is: . An operator is said to be a contraction if .
If is a tuple of contractions, we write for the minimal constant such that there exists a normal tuple such that and for all .
In other words, is the price we have to pay (in terms of norm) in order to be able dilate to a normal tuple. We call the dilation constant for .
Now we can define the universal dilation constant for -tuples of contractions to be
is a -tuple of contractions .
That is, is a dilation constant that works for all -tuples of contractions.
The complex matrix cube problem: What is ? In particular, what is ?
In other words, we ask: what is the minimal constant such that given a tuple of contractions, one can find a normal dilation of such that for all ?
It was proved that the constant always works if are all selfadjoint, and that this constant is optimal for selfadjoints; see this paper by Passer, Solel, and myself. Passer later proved that the constant holds for arbitrary tuples, however it is not known whether Passer’s constant is optimal. In particular we knew that . In last year’s project we found that is (at least slightly) strictly bigger than , and during the last year we improved that to , but there is a large interval where may be and we do not know what it is.
2. The basic approach
Suppose we are given a -tuple of contractions . We wish to know whether it is true or false that has a normal dilation such that for all .
The first observation is that it is enough to consider only tuples of unitaries. Indeed, if is a contraction (meaning that ) then
is a unitary dilation of . So given a -tuple of contractions, we can find a -tuple of unitaries such that . Thus, we may as well assume that is a tuple of unitaries, and ask whether we can dilate .
In order to be able to carry out calculations on a computer, we approximated a universal normal tuple (which can dilate anything up to a scale factor) by normal tuples of matrices with joint eigenvalues at the vertices of the polytope , where is a regular polygon with vertices that is circumscribed in the unit circle . When is moderately large, the boundary of is very close to . One can analyze easily bounds on the error that arises from this approximation.
Our first approach was to randomly select a tuple of unitaries and to check whether it has a normal dilation of norm at most . Up to a small error that arises from the approximation mentioned in the previous paragraph, the question is whether can be dilated to some ampliation of , where is the tuple of normals constructed above. Basically, modulo some equivalences within the theory, we know that has the required dilation of size at most , if and only if there exists a UCP map sending to for . This, modulo some more equivalences (and as been noted in this paper of Helton, Klep and McCullough) is equivalent to the existence of positive semidefinite matrices such that
where , for , and
The existence of such semidefinite matrices can be interpreted as the feasibility of a certain semidefinite program (SDP). In fact, we decided to treat the full semidefinite program as follows
Note that we moved to the right hand side, to make the equality constraint affine in the variables and . Recall that and are all fixed. In the implementation we actually defined this as a maximization problem
This is the same semidefinite optimization problem that we used last year as well. (Last year we solved it using CVX on MATLAB. This year we used the package cvxpy in Python for the semidefinite problem. Our code is available on Google Colaboratory, see the links below).
3. Failure/success of the basic approach and some observations
Currently the best known lower bound for is , which is obtained from , where and is a pair of unitaries such that . We say that and are -commuting unitaries, or that they -commute. The dilation constants for all -commuting unitary pairs were determined in a paper by Malte Gerhold and myself. It is worth noting that this paper of Gerhold and me was inspired by last year’s project, where Mattya Ben-Efraim and Yuval Yifrach computed . However, we do not believe that , we think that it is bigger.
So the first easy to state goal for this year was to search long enough and find an example of a pair of unitaties such that , or at least as big as possible. Let me say straight away that we did not find such an example. In fact, for all the random pairs that we tested, we typically got significantly lower, usually near (see below).
However, already last year we observed that concentration of measure phenomena form an obstruction to finding high dilation constants. Roughly speaking, “concentration of measure” means that a Lipschitz function on a high dimensional probability space will obtain values close to its mean with very high probability.
It turns out that the above heuristic has strong theoretical evidence. A paper of Collins and Male (which relies on an earlier breakthrough paper of Haagerup and Thorbjornsen) states that a tuple of independently sampled -tuples of random unitaries (sampled from the group given the Haar probability measure) converges strongly to the tuple given by the generators of the reduced C*-algebra of the free group . I won’t define what strong convergence is (see the attached papers), but it implies that , and we think that we should be able to prove that there is actually equality in the limit.
4. Some results and additional approaches
One chief outcome of our project is to have collected significant evidence that for a random (Haar distributed) pair of unitaries , we have that tends to be close to (up to the error bounds in our finite dimensional approximation). As remarked at the end of the previous section, this is also strong evidence that . Indeed we found that as the size of the matrices increases, the standard deviation from the mean goes down. For example, during one of the nights of the project-week, we ran an experiment with the following outcome. (Recall that is the number of vertices in the polygon approximating the disc, so that the dilation constant that we calculate has a relative error of at most . This means that for every pair for which we compute a dilation constant , the true dilation constant lies between and . Also, is just the number of operators in our -tuples, and is the size of the sampled unitaries).
k = 23, d = 2, no. of random samples for each n is 50 ----------------------------------------------------- n = 2 max_C = 1.397229, mean_C = 1.252607, std = 0.089505
n = 3 max_C = 1.435168, mean_C = 1.338410, std = 0.062456
n = 4 max_C = 1.442803, mean_C = 1.377033, std = 0.049362
n = 5 max_C = 1.431040, mean_C = 1.398623, std = 0.015876
n = 6 max_C = 1.433795, mean_C = 1.404585, std = 0.012104
n = 7 max_C = 1.428610, mean_C = 1.408119, std = 0.012655
n = 8 max_C = 1.439640, mean_C = 1.410235, std = 0.010174
n = 9 max_C = 1.425398, mean_C = 1.406535, std = 0.008882
n = 10 max_C = 1.427183, mean_C = 1.408760, std = 0.006846
n = 11 max_C = 1.425355, mean_C = 1.409243, std = 0.007134
Here are some more results, with smaller value of , which allows us to run more experiments with larger . Here we use the value , which gives a relative error of . The result below suggest that for large tends to be between and (and we believe it is ).
k = 12, d = 2, no. of random samples for each n is 50 --------------------------------------------------- n = 2 max_C = 1.389380, mean_C = 1.216731, std = 0.095579
n = 3 max_C = 1.426683, mean_C = 1.337595, std = 0.047033
n = 4 max_C = 1.417624, mean_C = 1.356411, std = 0.031138
n = 5 max_C = 1.418676, mean_C = 1.362433, std = 0.027081
n = 6 max_C = 1.397214, mean_C = 1.372076, std = 0.014774
n = 7 max_C = 1.406423, mean_C = 1.372238, std = 0.012304
n = 8 max_C = 1.412893, mean_C = 1.376826, std = 0.010334
n = 9 max_C = 1.395714, mean_C = 1.377344, std = 0.008770
n = 10 max_C = 1.390549, mean_C = 1.377275, std = 0.006920
n = 11 max_C = 1.395763, mean_C = 1.377383, std = 0.009049
n = 12 max_C = 1.398410, mean_C = 1.379734, std = 0.006276
n = 13 max_C = 1.398447, mean_C = 1.379338, std = 0.005721
n = 14 max_C = 1.390360, mean_C = 1.379050, std = 0.005425
n = 15 max_C = 1.393251, mean_C = 1.380928, std = 0.005420
n = 16 max_C = 1.389973, mean_C = 1.379776, std = 0.004730
n = 17 max_C = 1.393275, mean_C = 1.379678, std = 0.005481
n = 18 max_C = 1.392536, mean_C = 1.380155, std = 0.004127
n = 19 max_C = 1.389623, mean_C = 1.379491, std = 0.003916
n = 20 max_C = 1.390470, mean_C = 1.379793, std = 0.004657
n = 21 max_C = 1.391533, mean_C = 1.379091, std = 0.004251
n = 22 max_C = 1.390274, mean_C = 1.379855, std = 0.004138
n = 23 max_C = 1.385381, mean_C = 1.379334, std = 0.002905
Note that the maximal values of the dilation constant are (perhaps counter intuitively) attained for small values of . This is because the standard deviation diminishes with , and one is less likely to find a large counter example by chance. However, since we see that the standard deviation becomes smaller as increases, this means that to get a good feel of what happens we don’t need to run many tests. We can sample a single pair of random unitaries of large size (say ) and compute their dilation constant. With very high probability, you will get something that is equal to up to the error due to the fact that we are using finite . You can try it yourself, with the code linked to at the end of this post.
Here is a quick test I ran, with and one sampled pair of unitaries for every value of (this took one hour to run):
n = 10, C = 1.3991621945198882
n = 20, C = 1.4022442529059544
n = 30, C = 1.3974791375625744
n = 40, C = 1.4001629040203714
n = 50, C = 1.400105201371202
Although we have not found new examples with large dilation constants, the results we obtained have implications also for the dilation constant , because in joint work-in-progress with Gerhold, Pandey and Solel, we proved that , thus in particular would imply that , which is an improvement to the best known upper bound .
The numerical results mentioned above made us gain confidence in the conjecture , and hence . We were happy to quickly guess that . We then ran some tests for the case and found out that this is probably not true, and that probably (for the reader’s convenience: ):
d = 3, k = 12 (error margin = 0.035276), no. samples for each n = 20 ------------------------------------------------------------------- n = 2 max_C = 1.596420, mean_C = 1.410624, std = 0.100716
n = 12 max_C = 1.603867, mean_C = 1.589800, std = 0.011682
n = 20 max_C = 1.605948, mean_C = 1.591090, std = 0.007812
Finally, instead of randomly sampling pairs of Haar distributed unitaries, we constructed finite dimensional compressions of (we took the subspace of generated by all words of length less than or equal to some ) and computed the dilation constant of these compressions. The idea was that if , then we should be able to find this by checking a certain (large enough, but finite) compression. However, for very small the running time and memory requirements blow up, and the results do not teach us much:
m (maximal word length) = 1, n (dimension) = 5, k = 10 ------------------------------------------------------- C = 0.951057 (runtime: 1 seconds) m (maximal word length) = 2, n (dimension) = 17, k = 10 ------------------------------------------------------- C = 1.164818 (runtime: 4 seconds) m (maximal word length) = 3, n (dimension) = 53, k = 10 ------------------------------------------------------- C = 1.242592 (runtime: 45 seconds) m (maximal word length) = 4, n (dimension) = 161, k = 10 ------------------------------------------------------- C = 1.279138 (runtime: 657 seconds)
For the computer chokes. Perhaps it would be worth trying this with enlarged resources.
5. The numerical range of random unitary matrices
A final service that Matan and Ofer did for us, was to help us understand the shape of a certain intriguing geometrical object that arise in the theory.
Given two operators , their numerical range is the set
is a state .
This set contains significant information on the structure of the operator space generated by and . We can prove (work-in-progress with Gerhold) that for pairs of unitaries sampled randomly and independently from the Haar measure on , the numerical range converges in the Hausdorff metric to almost surely. Thus, is what a random matrix range of looks like. A paper of Collins, Gawron, Litvak and Zyczkowski showed that if one samples a series of two independent matrices from the Gaussian Unitary Ensemble (for example), then converges almost surely to the ball. We asked: what does converge to (almost surely)?
My first guess was that the limit (which we can prove is ) is a bidisc. But rather quickly, using formulas from a paper of Lehner, we realized that it is not a bidisc. What does it look like? Is it a ball?
This was computed for us by Matan and Ofer (who used the formulas from Lehner’s paper in the computation). Here is the projection onto the real parts:
This looks like the unit ball in the norm for (the “TV screen”), so they tested to see what ball this is closest to. It seems to be somewhere between and . However, we don’t really expect this to be an ball.
Actually, Lehner’s formulas do not directly give us but only its polar dual for all . So first Matan and Ofer computed , (the real part of) which is illustrated below, and then they computed the polr dual.
6. Summary of outcomes
The outcomes (and non-outcomes) of the project are:
- We have not found with larger than , and in particular we do not have a bigger lower bound for than what we already knew.
- On the other hand, we have evidence that converges to as . Repeating our experiments gives the same result. Therefore,
- Using theoretically-based heuristic, we have evidence that . Therefore,
- Using proved theoretical results, we have evidence that .
- We have not been able to obtain non-random lower bounds for by compression to the subspace of words of length at most . The best we get from this, adding the error, is a lower bound of at most .
- We have a program to draw to any precision, and we know that it looks kind of like a “tv screen”, or “Android app”.
- We have open source (courtesy of Matan and Ofer) code that is able to reproduce our results and expand our data.
A short presentation of the project by Matan and Ofer can be found here (the presentation they used made a combination of slides and whiteboard, so the above text hopefully makes up for the missing oral presentation).
Our code for calculating the dilation constants can be found on colaboratory here. Our code for calculating the numerical range of the free Haar unitaries can be found on colaboratory here. NOTE: in both files one has to play with the parameters to see significant results, in the code files the values of , for example, are small, so that one can get it running fast. (The same remarks hold for the resolution used to draw the numerical range in the second link.) If you want to carry out serious experimentation, you need to give thought to the parameters