The remarkable Hilbert space H^2 (Part II – multivariable operator theory and model theory)

by Orr Shalit

This is the second post in the series on the d–shift space, a.k.a. the Drury–Arveson space, a.k.a. H^2_d (see the previous post about the space H^2).

1. Model theory

One of the ways in which one can understand general linear operators on a finite dimensional space is through the Jordan normal form of a matrix. Recall that every linear operator T on a finite dimensional (complex) space can be decomposed as the direct sum T = J_1 \oplus \ldots \oplus J_k, where the J_i are Jordan blocks; that is, T is made up of simple, understandable building blocks. This is a ubiquitous strategy in mathematics: to decompose a general object into tractable, well–understood pieces (for example, every finitely generated abelian group is the direct sum of cyclic groups, etc., etc.).

When it comes to operators of general type on infinite dimensional spaces, no such decomposition is known to mankind (if it were, then mankind would probably have an answer to the invariant subspace problem). A completely different strategy that is used in the infinite dimensional setting is the following: instead of trying to decompose an operator into smaller and better understood pieces, what we do is exhibit the operator as a piece of a bigger and better understood operator. The various ways in which this strategy has been implemented go under the name model theory.

How can something complicated be a piece of something simple? How can this help us understand the complicated thing? A good example (from a different field) to have in mind which explains the philosophy behind this scheme and answers both of these questions, is Whitney’s theorem: every smooth manifold can be embedded in Euclidean space. 

Here is one way in which this works. Let S denote the operator of multiplication by the coordinate function on H^2:

Sf (z) = z f(z) .

S is called the shift. Let K be a fixed infinite dimensional and separable Hilbert space (say \ell^2). Consider the Hilbert space H^2 \otimes K. This space can be identified with the direct sum of H^2 with itself \dim K times. Now consider S \otimes I_K defined by (S \otimes I_K) (f \otimes k) = (Sf) \otimes k. This can be identified with the direct sum S \oplus S \oplus \cdots of S with itself \dim K times. Then we have the following theorem.

Theorem 1: Let T \in B(H) (H separable) with \|T\|<1. Then H can be identified with a subspace of H^2 \otimes K which is invariant under (S \otimes I_K)^* such that T^* = (S \otimes I_K)^* \big|_H .

(Of course, if one prefers, one may replace T with T^* and then one gets T = (S \otimes I_K)^*\big|_H). Stated in an almost equivalent way, the assertion is that

T^n = P_H (S \otimes I_K)^n \big|_H

for all n=1,2,\ldots . We say that the shift is a universal model for contractions.
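The simplest instance of Theorem 1 can be checked by hand, and the following sketch does so numerically. It assumes H is one dimensional, so that T is a scalar a with |a| < 1, and it identifies H with the span of the function k(z) = \sum_n \bar{a}^n z^n in H^2 (a standard choice, not spelled out in this post). Functions are stored as truncated lists of Taylor coefficients; the value of a, the truncation length N and all helper names are illustrative assumptions.

```python
# Sketch: the scalar "operator" T = a, with |a| < 1, realized inside H^2.
# Functions in H^2 are truncated lists of Taylor coefficients (length N).

def shift(c):
    """S: multiplication by z, i.e. (c0, c1, ...) -> (0, c0, c1, ...)."""
    return [0j] + list(c[:-1])

def backward_shift(c):
    """S^*: (c0, c1, c2, ...) -> (c1, c2, ...), padded with a zero."""
    return list(c[1:]) + [0j]

def inner(f, g):
    """H^2 inner product of coefficient lists, linear in the first slot."""
    return sum(x * y.conjugate() for x, y in zip(f, g))

a = 0.5 + 0.3j          # hypothetical choice of the scalar T
N = 200                 # truncation length (an assumption of this sketch)
k = [a.conjugate() ** n for n in range(N)]  # coefficients of k(z)

# S^* k equals conj(a) * k, up to the truncated last coefficient,
# so span{k} is invariant under S^* and S^* restricted to it is T^*.
lhs = backward_shift(k)
rhs = [a.conjugate() * x for x in k]
eigen_error = max(abs(x - y) for x, y in zip(lhs[:-1], rhs[:-1]))

def compression(n):
    """The scalar P_H S^n |_H, computed as <S^n k, k> / <k, k>."""
    f = k
    for _ in range(n):
        f = shift(f)
    return inner(f, k) / inner(k, k)

# Compare P_H S^n |_H with T^n = a^n for n = 1, ..., 5.
compression_errors = [abs(compression(n) - a ** n) for n in range(1, 6)]
print(eigen_error, max(compression_errors))
```

Both printed errors should be at the level of floating point rounding, since the truncation only discards coefficients of size about |a|^N.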

Here is a consequence of the fact that the shift is a universal model:

Theorem 2 (von Neumann’s inequality): Let T \in B(H), \|T\|<1. Then for any polynomial p

\|p(T)\| \leq \sup_{|z|=1} |p(z)| .

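Proof: By Theorem 1 we have p(T) = P_H p(S \otimes I_K) \big|_H, and therefore \|p(T)\| \leq \|p(S \otimes I_K)\| = \|p(S)\|. Thus the inequality follows once we know that \|p(S)\| = \sup_{|z|=1} |p(z)| for any polynomial p. But \|p(S)\| = \sup_{|z|=1}|p(z)| is a consequence of H^2_1 being a subspace of L^2(\mathbb{T}).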
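As a sanity check, von Neumann’s inequality can be tested numerically on a small example. The sketch below is only an illustration under stated assumptions: the 2x2 matrix T, the polynomial p(z) = 1 + 2z + 3z^2 and the helper names are all hypothetical choices, the operator norm is computed from the closed formula for the largest singular value of a 2x2 matrix, and the supremum over the circle is approximated by sampling.

```python
import cmath
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def poly_of_matrix(coeffs, T):
    """Evaluate p(T) = c0*I + c1*T + c2*T^2 + ... by Horner's rule."""
    P = [[0, 0], [0, 0]]
    for c in reversed(coeffs):
        P = matmul(P, T)
        P = [[P[i][j] + (c if i == j else 0) for j in range(2)]
             for i in range(2)]
    return P

def op_norm_2x2(A):
    """Largest singular value of a 2x2 matrix.

    The squared singular values are the roots of
    x^2 - ||A||_F^2 x + |det A|^2 = 0.
    """
    fro2 = sum(abs(x) ** 2 for row in A for x in row)
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    disc = max(fro2 ** 2 - 4 * abs(det) ** 2, 0.0)
    return math.sqrt((fro2 + math.sqrt(disc)) / 2)

T = [[0.3, 0.5], [0.0, -0.2]]   # hypothetical example with ||T|| < 1
coeffs = [1, 2, 3]              # p(z) = 1 + 2z + 3z^2

lhs = op_norm_2x2(poly_of_matrix(coeffs, T))

# Sampled approximation of sup_{|z|=1} |p(z)| (a lower bound for the sup).
rhs = max(
    abs(sum(c * cmath.exp(1j * t) ** n for n, c in enumerate(coeffs)))
    for t in (2 * math.pi * j / 720 for j in range(720))
)

print(op_norm_2x2(T), lhs, rhs)
```

For this p the supremum over the circle is attained at z = 1, which is one of the sample points, so the sampled value is the true supremum here; in general sampling only gives a lower bound.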

2. The d-shift as a universal model for commuting row contractions

We now come to multivariable operator theory. Multivariable operator theory is concerned with the analysis of tuples of operators, rather than single operators. That is, one has a d–tuple T = (T_1, \ldots, T_d) and one tries to understand their simultaneous action on the space and how they relate to each other.

To see why this is more complicated, consider a pair of operators T = (T_1, T_2) on a finite dimensional space. For each separate operator we can find a basis with respect to which it is in Jordan form, and this gives a relatively simple description of what T_i does to the space. In particular, given a polynomial p it is not hard to compute p(T_i) with respect to the Jordanizing basis.
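To illustrate why p(T_i) is easy to compute in a Jordanizing basis: for a single Jordan block J with eigenvalue \lambda, the matrix p(J) is upper triangular with the value p^{(k)}(\lambda)/k! along the k-th superdiagonal. The following sketch compares this closed form with a direct evaluation of p(J); the block size, eigenvalue, polynomial and helper names are hypothetical choices made for the example.

```python
from math import comb

def jordan_block(lam, n):
    """n x n Jordan block: lam on the diagonal, 1 on the superdiagonal."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_of_matrix(coeffs, A):
    """Evaluate p(A) = c0*I + c1*A + ... directly, by Horner's rule."""
    n = len(A)
    P = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        P = matmul(P, A)
        for i in range(n):
            P[i][i] += c
    return P

def taylor_coefficient(coeffs, lam, k):
    """p^(k)(lam) / k! for p(z) = sum_m coeffs[m] * z^m."""
    return sum(c * comb(m, k) * lam ** (m - k)
               for m, c in enumerate(coeffs) if m >= k)

def poly_of_jordan_block(coeffs, lam, n):
    """Closed form: entry (i, j) of p(J) is p^(j-i)(lam)/(j-i)! for j >= i."""
    return [[taylor_coefficient(coeffs, lam, j - i) if j >= i else 0
             for j in range(n)] for i in range(n)]

lam, n = 2, 3
coeffs = [1, 0, -3, 1]   # p(z) = 1 - 3z^2 + z^3

direct = poly_of_matrix(coeffs, jordan_block(lam, n))
closed_form = poly_of_jordan_block(coeffs, lam, n)

print(direct == closed_form)   # → True
print(closed_form[0])          # → [-3, 0, 3], i.e. p(2), p'(2), p''(2)/2
```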

However, in general (even if T_1 and T_2 commute) one cannot choose a Jordanizing basis that works for both operators at once. In particular, it is difficult to compute p(T_1, T_2) for a polynomial p in two variables.

There is a model theory for d–tuples of commuting operators (there is also a model theory for non-commuting tuples, which we shall not discuss; see, for example, the work of Gelu Popescu). As above, we will need to impose some norm condition. For a d–tuple T = (T_1, \ldots, T_d) let us denote by \|T\| the norm of the operator [T_1, \ldots, T_d] : H \oplus \cdots \oplus H \rightarrow H given by

[T_1, \ldots, T_d] (h_1, \ldots, h_d) = \sum_{i=1}^d T_i h_i .
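In finite dimensions this norm is easy to compute: since the row operator R = [T_1, \ldots, T_d] satisfies R R^* = \sum_i T_i T_i^*, one has \|T\|^2 = \|\sum_i T_i T_i^*\|. The sketch below uses this identity for a pair of commuting 2x2 matrices; the matrices and helper names are hypothetical, chosen only to illustrate the definition.

```python
import math

def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def adjoint(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)]
            for i in range(2)]

def row_norm(ops):
    """Norm of the row operator [T_1, ..., T_d] for 2x2 matrices T_i.

    Uses ||T||^2 = largest eigenvalue of M = sum_i T_i T_i^*, which for a
    positive semidefinite 2x2 matrix is (tr + sqrt(tr^2 - 4 det)) / 2.
    """
    M = [[0j, 0j], [0j, 0j]]
    for T in ops:
        P = matmul(T, adjoint(T))
        M = [[M[i][j] + P[i][j] for j in range(2)] for i in range(2)]
    trace = (M[0][0] + M[1][1]).real
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]).real
    disc = max(trace ** 2 - 4 * det, 0.0)
    return math.sqrt((trace + math.sqrt(disc)) / 2)

# Two commuting matrices (both are polynomials in the same nilpotent matrix).
T1 = [[0.5, 0.2], [0.0, 0.5]]
T2 = [[0.3, 0.4], [0.0, 0.3]]

print(row_norm([T1]), row_norm([T2]), row_norm([T1, T2]))
```

Note that the row norm of the pair is at least as large as the norm of each operator separately, so the condition \|T\| < 1 is stronger than asking that each T_i be a strict contraction.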

Let K be as in Theorem 1. Let S_i denote the operator of multiplication by the ith coordinate function in H^2_d:

S_i f (z) = z_i f(z) .

We denote S = (S_1, \ldots, S_d) and refer to this tuple as the d–shift.
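The d–shift can be explored concretely on polynomials. The sketch below assumes the standard description of H^2_d, which is not restated in this post: monomials are orthogonal, with \|z^\alpha\|^2 = \alpha!/|\alpha|!. Under this assumption one computes S_i^* z^\alpha = (\alpha_i/|\alpha|) z^{\alpha - e_i}, and the code checks this adjoint relation exactly (using rational arithmetic) for d = 2, as well as the identity \sum_i S_i S_i^* z^\alpha = z^\alpha for |\alpha| \geq 1. Polynomials are stored as dictionaries from multi-indices to real rational coefficients; all names are illustrative.

```python
from fractions import Fraction
from math import factorial

def monomial_norm_sq(alpha):
    """||z^alpha||^2 = alpha! / |alpha|! (assumed norm on H^2_d)."""
    num = 1
    for a in alpha:
        num *= factorial(a)
    return Fraction(num, factorial(sum(alpha)))

def inner(f, g):
    """Inner product of polynomials {multi-index: real coefficient}."""
    return sum(c * g[alpha] * monomial_norm_sq(alpha)
               for alpha, c in f.items() if alpha in g)

def shift(i, f):
    """S_i f = z_i f."""
    return {alpha[:i] + (alpha[i] + 1,) + alpha[i + 1:]: c
            for alpha, c in f.items()}

def shift_adjoint(i, f):
    """S_i^* z^alpha = (alpha_i / |alpha|) z^(alpha - e_i); 0 if alpha_i = 0."""
    out = {}
    for alpha, c in f.items():
        if alpha[i] > 0:
            beta = alpha[:i] + (alpha[i] - 1,) + alpha[i + 1:]
            out[beta] = c * Fraction(alpha[i], sum(alpha))
    return out

d = 2
monomials = [(a, b) for a in range(4) for b in range(4)]

# Check <S_i z^beta, z^alpha> = <z^beta, S_i^* z^alpha> on all these monomials.
adjoint_ok = all(
    inner(shift(i, {beta: Fraction(1)}), {alpha: Fraction(1)})
    == inner({beta: Fraction(1)}, shift_adjoint(i, {alpha: Fraction(1)}))
    for i in range(d) for alpha in monomials for beta in monomials
)

def sum_shift_shift_adjoint(f):
    """Apply sum_i S_i S_i^* to f."""
    out = {}
    for i in range(d):
        for alpha, c in shift(i, shift_adjoint(i, f)).items():
            out[alpha] = out.get(alpha, 0) + c
    return out

f = {(2, 1): Fraction(1)}          # the monomial z_1^2 z_2
constant = {(0, 0): Fraction(1)}   # the constant function 1

print(adjoint_ok)                              # → True
print(sum_shift_shift_adjoint(f) == f)         # → True
print(sum_shift_shift_adjoint(constant) == {}) # → True
```

In other words, \sum_i S_i S_i^* acts as the identity on monomials of positive degree and kills the constants, so \sum_i S_i S_i^* \leq I and the d–shift is itself a row contraction.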

Theorem 3: Let T = (T_1, \ldots, T_d) be a d–tuple of commuting operators on H such that \|T\|<1. Then H can be identified with a subspace of H^2_d \otimes K, invariant under (S_i \otimes I_K)^* for all i, such that

T_i^* = (S_i \otimes I_K)^* \big|_H , \quad i = 1, \ldots, d .

Again, as a consequence, one has

p(T) = P_H(p(S) \otimes I_K)\big|_H

for every polynomial p(z_1, \ldots, z_d).

Thus, we say that the d–shift is a universal model for commuting row contractions.

As a corollary we obtain the following generalization of von Neumann’s inequality (the case \|T\| = 1 follows by applying Theorem 3 to rT for 0 < r < 1 and letting r \rightarrow 1).

Theorem 4 (Drury’s inequality): Let T = (T_1, \ldots, T_d) be a commuting tuple of operators such that \|T\| \leq 1. Then for any polynomial p(z_1, \ldots, z_d) 

\|p(T)\| \leq \|p(S)\| .

Note the difference from von Neumann’s inequality: we do not claim that \|p(T)\| \leq \sup_{|z_1| = \ldots = |z_d| = 1}|p(z)|, and indeed, this inequality fails for T = S. But the fact that we can identify a (simple) tuple of operators on which the maximal norm is attained for every polynomial is quite remarkable.

3. Some words of warning

Theorem 1 and Theorem 3 (the “dilation theorems”) are nontrivial and important theorems, but one should be warned that there is a limit to what they can tell us. This is because the invariant subspace lattice of S \otimes I_K is very complicated. In fact, it is at least as complicated as the invariant subspace lattice of any operator!

Theorem 1 tells us that if we can completely understand the invariant subspace lattice of S \otimes I_K then we can solve the invariant subspace problem; indeed, by Theorem 1 the invariant subspace problem is equivalent to the question whether or not (S \otimes I_K)^* has a minimal infinite dimensional invariant subspace (or equivalently, whether S \otimes I_K has a maximal infinite co–dimensional invariant subspace). Unsurprisingly, this problem turns out to be just as hard.

However, some invariant subspace theorems were obtained using model theory. For example, there are operators for which S \otimes I_G is a model, where G is a finite dimensional Hilbert space. This case is tractable, and one can show that the operator S \otimes I_G has no maximal infinite co–dimensional invariant subspaces.