Introduction to von Neumann algebras, Lecture 1 (Introduction to the course, and a crash course in operator algebras, the spectral theorem)

by Orr Shalit

1. Micro prologue

Perhaps we cannot start a course on von Neumann algebras without making a few historical notes about the beginning of the theory.

(To say it more honestly and openly, what I wanted to say is that perhaps I cannot teach a course on von Neumann algebras without finally reading the classical works by von Neumann and also learning a bit about the man. von Neumann was a true genius and has contributed all over mathematics, see the Wikipedia article).

In the late 1920s, Hilbert, prompted by the latest developments in quantum mechanics, was running a seminar with his assistants Nordheim and von Neumann, trying to make sense of it all. The issue was that Heisenberg, Born and Jordan (who were at Göttingen at the time too) had recently introduced “matrix mechanics”, a mathematical formalism for quantum mechanics which involved infinite matrices – the eigenvalues of which were supposed to represent observable quantities of physical significance. At the time, the spectral theory of compact operators on Hilbert space was well understood (due to Hilbert’s previous work on integral equations – which were also inspired by problems in physics), but the infinite matrices arising in matrix mechanics were not bounded.

Hilbert, Nordheim and von Neumann quickly wrote a paper on the subject, but only in von Neumann’s subsequent work, published in the years 1927-1929, were the mathematical foundations for quantum mechanics crystallized. His treatment appeared in his 1932 monograph Mathematical Foundations of Quantum Mechanics; this account of the basic formalism of quantum mechanics was so definitive, that this is more or less the formalism that is still taught today (and we should note that his contemporaries, most notably Weyl and Dirac, also published their own closely related accounts; but each of them is the main character of a different story).

In that short period von Neumann defined Hilbert spaces (which were already “around”) and developed the spectral theorem for bounded and unbounded selfadjoint operators, and many of its applications (e.g., the functional calculus and the Stone-von Neumann theorem). After this fantastic success von Neumann was led to take a deeper look into operators on Hilbert space. His vision penetrating into the depths, he saw the beauty and richness, and took upon himself the construction of the foundations of the theory of operator algebras (in part jointly with Murray). Four of the foundational papers on operator algebras were a series of papers named “On Rings of Operators I-IV”. In the introduction to the first one, Murray and von Neumann list four reasons to tackle the problems in operator algebras which they treat:

First, the formal calculus with operator-rings leads to them. Second, our attempts to generalize the theory of unitary group-representations essentially beyond their classical frame have always been blocked by the unsolved questions connected with these problems. Third, various aspects of the quantum mechanical formalism suggest strongly the elucidation of this subject. Fourth, the knowledge obtained in these investigations gives an approach to a class of abstract algebras without a finite basis, which seems to differ essentially from all types hitherto investigated.

Von Neumann continued to work on quantum mechanics, and some of his ideas and the theory of operator algebras influenced further developments in algebraic quantum field theory and quantum statistical mechanics (as far as I gather, some of the motivations for developing aspects of the theory have turned out to be misguided from a physical point of view). But among his many other interests and activities, he also continued to develop the theory of operator algebras (or “rings of operators” as he called them) as a piece of pure mathematics. Indeed, the “pureness” of von Neumann’s motivations is already evident from the introduction to the first “Rings of Operators”, and it seems to me that “differ essentially from all types hitherto investigated” is the reason that appealed to them most. After his earlier developments in operator theory, it took them roughly five years (1930-1935) to understand the basic theory of von Neumann algebras (it then took roughly another ten years to have it polished and written, but it is clear that when writing the first “Rings of” paper, von Neumann knew the result that would appear in his 1949 paper “On Rings of Operators. Reduction Theory”. Let us not forget that there was a war in the middle of all the dramatic developments in operator algebras). The subject which has grown to become what is now known as “von Neumann algebras” has expanded exponentially since the 30s; the core and foundations of the subject – a sizable part of the course material – are all due to the early papers of von Neumann and Murray. Having learned this stuff from textbooks written many years later, it is humbling, inspiring and almost unbelievable to see how much was already there in the first papers.

We will now leave all discussions of historical background and connections to physics, and dive into pure, cold mathematics. The development of the material will be, as usual in mathematics, only loosely connected to the historical development. One small remark for the reader who has already mastered this theory:

Remark: It is customary to prove the spectral theorem for normal bounded operators via Gelfand’s theory of commutative Banach and C*-algebras; this is a good example of teaching things not the way they historically happened, as Gelfand’s theory came about a decade after von Neumann’s spectral theorem (late thirties versus late twenties). This is also how I learned it. I took it as a small challenge to “unlearn” Gelfand theory and prove the spectral theorem without it, in order to reach the subject matter by the shortest path possible.

2. Introduction

We start with an overview of the subject, and a sketchy description of what we hope to achieve in this course. Deeper discussions will come later.

We let B(H) denote the algebra of all bounded operators on a (complex) Hilbert space H, equipped with the usual algebraic operations (including conjugation T \mapsto T^* where \langle T g, h \rangle = \langle g, T^* h \rangle for all g,h \in H) and the operator norm

\|T\| = \sup_{\|h\|=1}\|T h\| = \sup_{\|g\|=\|h\|=1}|\langle T g, h \rangle| .

The adjoint and norm are related by the “C*-identity”, which is of key importance:

(C*)      \|T^*T\| = \|T\|^2.

Exercise A: In case you never have, prove the C*-identity.

We let 1 or I or I_H denote the identity operator on H.

Definition: A (concrete) C*-algebra A is a subalgebra A \subseteq B(H) such that

  1. A is a *-algebra (if T \in A then T^* \in A).
  2. A is norm closed (if T_n \in A and \lim_n T_n = T then T \in A).

Here, \lim T_n = T means \lim \|T - T_n\| = 0.

A C*-algebra A is said to be a von Neumann algebra if 1 \in A and if whenever T_n \in A and SOT \lim T_n = T, then T \in A. Here, SOT \lim T_n = T means that \lim \|T_n h - T h\| = 0 for all h \in H; in this case we say that T_n converges to T in the strong operator topology. (When we write T_n \to T, it is to be understood that \{T_n\} is a convergent net. For the purposes of this introduction, the reader can think of T_n as a convergent sequence of operators, but please refresh your memory regarding the notion of nets in topological spaces for later lectures.)

In short, a C*-algebra is a *-subalgebra of B(H) which is closed in the norm, and a von Neumann algebra is a C*-algebra that contains the identity and is also closed in the strong operator topology.
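
To see that strong convergence really is weaker than norm convergence, here is a small worked example. It is only a sketch under assumptions that are not part of the text above: H = \ell^2(\mathbb{N}) with orthonormal basis \{e_k\}_{k \geq 1}, and P_n is ad hoc notation for the orthogonal projection onto the span of e_1, \ldots, e_n (the display assumes the amsmath package).

```latex
% Assumptions: H = \ell^2(\mathbb{N}) with orthonormal basis \{e_k\}_{k \ge 1};
% P_n is the orthogonal projection onto span\{e_1, \ldots, e_n\}.
\begin{align*}
\|P_n h - h\|^2 &= \sum_{k > n} |\langle h, e_k \rangle|^2
  \xrightarrow[n \to \infty]{} 0 \quad \text{for every } h \in H, \\
\|P_n - I\| &\geq \|(P_n - I) e_{n+1}\| = \|e_{n+1}\| = 1 \quad \text{for every } n.
\end{align*}
```

So P_n \to I strongly but not in norm. For instance, the norm closure of the linear span of the P_n consists of compact operators and therefore does not contain I (H is infinite dimensional), while its strong closure does.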

There is another way to define von Neumann algebras. Given a set \mathcal{S} \subseteq B(H), we define the commutant of \mathcal{S} (denoted \mathcal{S}') to be

\mathcal{S}' = \{T \in B(H) : ST = TS for all S \in \mathcal{S}\}.

If \mathcal{S}^* = \{S^* : S \in \mathcal{S}\} = \mathcal{S} then it is easy to see that \mathcal{S}' is a von Neumann algebra. By the next lecture, we will be able to prove the following: every von Neumann algebra arises as the commutant \rho(G)', where G is a group and \rho : G \to B(H) is a unitary representation, i.e., a homomorphism from G into the group of unitaries on H. Thus, one may think of a von Neumann algebra as the algebra of all “symmetries” of some unitary representation.
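
A tiny finite dimensional computation may make the commutant concrete. The choices below are assumptions made for illustration only: H = \mathbb{C}^2, so B(H) = M_2(\mathbb{C}), and \mathcal{S} = \{E\} with E = diag(1,0), which is a selfadjoint set.

```latex
% Assumptions: H = \mathbb{C}^2 and \mathcal{S} = \{E\}, E = diag(1, 0).
\begin{align*}
E = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad
T = \begin{pmatrix} a & b \\ c & d \end{pmatrix}
\quad \Longrightarrow \quad
ET - TE
= \begin{pmatrix} a & b \\ 0 & 0 \end{pmatrix}
- \begin{pmatrix} a & 0 \\ c & 0 \end{pmatrix}
= \begin{pmatrix} 0 & b \\ -c & 0 \end{pmatrix}.
\end{align*}
% Hence T commutes with E exactly when b = c = 0:
\[
\mathcal{S}' = \left\{ \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix} : a, d \in \mathbb{C} \right\}.
\]
```

In this example \mathcal{S}' is the algebra of diagonal matrices, a copy of \ell^\infty on a two point set, and it is closed under adjoints and contains the identity, as the general statement predicts.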

In this course we will study the basic theory of von Neumann algebras. The first dividend of this theory is that it serves as a useful framework for studying operators on Hilbert space. Thus, our first task is to understand the C*- and von Neumann algebras that are generated by a single selfadjoint operator in B(H); much of this will be accomplished already in the first lecture.

We will see that if T is selfadjoint, then C^*(T) \cong C(\sigma(T)) and W^*(T) \cong L^\infty(X, \mu) where (X, \mu) is a measure space (precise meaning of the symbols will be given later). In fact, every commutative von Neumann algebra is isomorphic to L^\infty (X, \mu). First question: Given two von Neumann algebras L^\infty(X, \mu) and L^\infty(Y, \nu), when are they isomorphic? (in fact, there are at least two very natural notions of what “isomorphic” means, and we will have to be more precise about that). Second question: What other kinds of von Neumann algebras exist?

As a warm up, let us look at a baby example of the first question. The algebra L^\infty := L^\infty [0,1] acts on L^2 := L^2[0,1] by multiplication: given f \in L^\infty,

M_f : h \mapsto fh  ,  h \in L^2

is a bounded operator, and L^\infty \cong \{M_f : f \in L^\infty\} \subset B(L^2). Likewise \ell^\infty := \ell^\infty(\mathbb{N}) acts by multiplication on \ell^2(\mathbb{N}): given a = (a_n)_n, it acts as a diagonal operator

diag(a) : (b_n)_n \mapsto (a_n b_n)_n  ,  (b_n) \in \ell^2,

and \ell^\infty \cong \{diag(a) : a \in \ell^\infty\} \subset B(\ell^2). These two algebras, \ell^\infty and L^\infty, are abelian von Neumann algebras (the fact that they are strongly closed requires proof; it’s worth remembering that there is no dominated convergence theorem for nets). Are they isomorphic?

They might look to you pretty much the same, or very different, depending on who you are. If you have no experience with such questions, then it is not clear how one may go about deciding this problem. Perhaps a healthy intuition will say that they must be different, since they live on measure spaces of different natures. This will indeed solve the problem.

Here is one way to look at the problem. The algebra \ell^\infty has projections which are supported on single points. These projections have the property that there are no nonzero projections sitting strictly under them. On the other hand, any nonzero projection in L^\infty can be split into the sum of two smaller nonzero projections – this is because every set of nonzero measure can be split that way (the measure space has no atoms). It follows that the algebras cannot be *-isomorphic, since the notions of projections, positivity, and hence order, are invariant under *-isomorphisms.
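
The two properties used in this argument can be written out as follows. This is a sketch in ad hoc notation that does not appear above: \delta_n \in \ell^\infty denotes the indicator of the point n, \chi_E \in L^\infty[0,1] the indicator of a measurable set E, and m is Lebesgue measure. The splitting point t exists because t \mapsto m(E \cap [0,t]) is continuous and increases from 0 to m(E).

```latex
% Assumptions: \delta_n = indicator of \{n\} in \ell^\infty,
% \chi_E = indicator of E in L^\infty[0,1], m = Lebesgue measure.
\begin{align*}
\text{In } \ell^\infty: \quad
& q = q^2 = \overline{q}, \ q \leq \delta_n \implies q \in \{0, \delta_n\}. \\
\text{In } L^\infty[0,1]: \quad
& m(E) > 0 \implies \exists\, t \in (0,1) \text{ with } m(E \cap [0,t]) = \tfrac{1}{2} m(E), \\
& \chi_E = \chi_{E \cap [0,t]} + \chi_{E \setminus [0,t]},
  \quad \text{both summands nonzero projections}.
\end{align*}
```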

In the setting of C*-algebras, projections are not always helpful, since there exist C*-algebras that have no nontrivial projections (can you think of an example?). But in a von Neumann algebra there is always a very rich supply of projections, and it turns out that the structure of the lattice of projections is the key to the main classification scheme of von Neumann algebras. We will spend a couple of weeks studying the lattice of projections in a von Neumann algebra.

As for the second question raised above (what other kinds of von Neumann algebras exist): it is clear that B(H) itself is a von Neumann algebra, for every Hilbert space H. Of course, one can form direct sums, so there are von Neumann algebras of the form

B(H_1) \oplus B(H_2) \oplus L^\infty(X_1, \mu) \oplus \ldots.

The von Neumann algebras we listed are relatively simple examples of von Neumann algebras; we will later see that they all fall into one family, called type I. We will define later what it means to be type I; for now it suffices to say that type I algebras are either full matrix algebras of the kind M_n(\mathbb{C}), full operator algebras B(H), commutative algebras of the kind L^\infty(X,\mu), direct sums or tensor products of the above, or “continuous direct sums” of all the above (so called direct integrals). In principle, one is able to classify all type I algebras acting on a separable Hilbert space in relatively simple terms.

There are other kinds of von Neumann algebras, that are said to be of type II. Here is one way to construct such examples. Let G be a countable group. Let \ell^2(G) be the \ell^2 space with orthonormal basis \{e_g\}_{g \in G}. For every f \in G, we define the (unitary) operator U_f \in B(\ell^2(G)) by

U_f e_g = e_{fg}.

Clearly, \lambda : G \to B(\ell^2(G)), \lambda(g) = U_g is a faithful (unitary) representation of G. If we look at \mathbb{C} G – the subalgebra of B(\ell^2(G)) generated by \{U_g : g \in G\}, we get an algebra that in general “knows” some things about the group (though, it is in general not possible to recover G from \mathbb{C} G). Define

L(G) := \overline{\mathbb{C}G}^{SOT}

(the strong operator closure). Then L(G) is a von Neumann algebra; it is called the group von Neumann algebra of G. One of the problems we will study is: what can one learn about a group from its von Neumann algebra and vice versa. Another interesting thing to say is that group von Neumann algebras give a class of examples of von Neumann algebras that we have not listed above. Not always is L(G) a type II algebra – for example, if G is commutative then L(G) is commutative, so it is type I. But in certain cases (and one can give precise conditions) L(G) can be shown to be of a completely different nature than the type I examples, and is said to be of type II. For example, if G is the free group F_n generated by n \geq 2 generators, then L(F_n) is of type II.
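
For the commutative case, the simplest example is G = \mathbb{Z}, and it can be worked out by hand. The following sketch uses notation that is assumed rather than taken from the text: \mathbb{T} is the unit circle with normalized arc length measure, and F : \ell^2(\mathbb{Z}) \to L^2(\mathbb{T}) is the unitary determined by F e_n = z^n. The last equality is a standard fact which is quoted here without proof.

```latex
% Assumptions: G = \mathbb{Z} written additively, so U_k e_n = e_{k+n};
% F : \ell^2(\mathbb{Z}) \to L^2(\mathbb{T}) is the unitary with F e_n = z^n.
\begin{align*}
F U_1 F^* z^n &= F U_1 e_n = F e_{n+1} = z^{n+1} = M_z z^n, \\
F U_k F^* &= M_{z^k} \quad (k \in \mathbb{Z}), \\
F \, L(\mathbb{Z}) \, F^* &= \overline{\operatorname{span}\{M_{z^k} : k \in \mathbb{Z}\}}^{SOT}
  = \{M_f : f \in L^\infty(\mathbb{T})\}.
\end{align*}
```

Thus L(\mathbb{Z}) \cong L^\infty(\mathbb{T}), one of the commutative (type I) algebras already on our list.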

While on the subject of group von Neumann algebras, let us mention a very big open problem:

Open problem: Let m \neq n, with m, n \geq 2. Is it true or false that L(F_m) \cong L(F_n) ?

This is a notoriously difficult problem, and the attention that it has drawn resulted in several of the major developments in von Neumann algebras, for example free probability theory (about which we will probably have no time to elaborate).

We will say something about the general classification scheme for von Neumann algebras. It turns out that there are three basic types of von Neumann algebras: types I and II, examples of which we mentioned above, and yet another type – type III – which is of quite a different nature (as hinted above, these types are defined in terms of the structure of the lattice of projections in them). Every von Neumann algebra can be decomposed into a direct sum consisting of a type I, a type II and a type III von Neumann algebra. Classification of von Neumann algebras can be in principle reduced to the classification of “simple” von Neumann algebras, called factors, which are the “building blocks” of general von Neumann algebras.

Every type I factor is of the form B(H), and these algebras are completely classified by dim H. McDuff showed that there are uncountably many type II factors (acting on a separable Hilbert space). The group algebras L(F_n) mentioned above are examples of factors of type II, and the open problem above suggests that classification of type II factors is beyond all hope. However, we will see that if L(G_i) are infinite dimensional factors (i=1,2) and if G_1, G_2 are both amenable, then L(G_1) \cong L(G_2).

At first Murray and von Neumann were not able to decide whether there do or do not exist factors of type III. Eventually, von Neumann constructed an example, and decades later Powers showed that there are uncountably many non-isomorphic type III factors (acting on a separable Hilbert space). The classification of so-called amenable type III factors was carried out mostly by Connes, a work for which he was awarded the Fields Medal (following work of Tomita-Takesaki and others, and the classification was completed by Haagerup). We will not discuss this deep and difficult subject in this course, but I hope that we will at least see uncountably many non-isomorphic examples.

Just as one can form a von Neumann algebra L(G) that encodes some information about a group G, one can form a von Neumann algebra L^\infty(X, \mu) \rtimes G (called the crossed product) that encodes the action of a group G by measure preserving transformations on a measure space (X, \mu). We will discuss how the properties of the action are encoded in the crossed product L^\infty(X,\mu) \rtimes G. A relatively simple fact is that, under a certain “freeness” assumption, the action is ergodic if and only if the crossed product is a factor (in which case, it is a type II factor). In the last two decades, the classification problem for type II factors arising this way has been studied in depth by Popa and others, and there are some celebrated results. The tip of the iceberg is this: under certain assumptions on the action, Popa has shown that if two crossed products are isomorphic, then the actions are (essentially) conjugate. This result is quite surprising (the converse direction is quite trivial), and the proofs are highly nontrivial, and will unfortunately remain beyond the scope of our course (however, hopefully by the end of the course a student will be in a position to approach the literature on the subject).

One final kind of problem that we will discuss will be very different from the kinds discussed in the last few paragraphs. The problems will be of the kind: what are the fundamental structural properties of von Neumann algebras? For example, von Neumann algebras are, in particular, C*-algebras. Not all C*-algebras are von Neumann algebras. What makes von Neumann algebras special? Do they have an abstract characterization? von Neumann algebras are also, in particular, Banach spaces. Do they happen to have some special properties, in terms of their Banach space structure? It turns out that they do: if M is a von Neumann algebra, then there is a (unique!) Banach space X such that M = X^* (i.e., M is the Banach space dual of the Banach space X), and existence of such a pre-dual characterizes the C*-algebras that “happen to be” von Neumann algebras.

That was a panoramic view of what we might hope to achieve in this course. But now we must start the course proper, and let us start from the very beginning.

3. A bit of operator theory on B(H)

We now recall some things that everyone who attended a first course in functional analysis (so everyone attending this course) should know. An operator T \in B(H) is said to be

  1. selfadjoint  if T^* = T,
  2. normal if T T^* = T^* T,
  3. isometric if T^* T = I,
  4. unitary if it is a normal isometry,
  5. a projection if T^2 = T and T^* = T (in this case it is the orthogonal projection onto some subspace of H; in Hilbert spaces, we will use the word projection for orthogonal projections),
  6. a contraction if \|T\|\leq 1,
  7. positive if \langle T h, h \rangle \geq 0 for all h \in H; we then write T \geq 0.

Let us write P(H) for the projections in B(H), U(H) for the unitaries on H, B(H)_+ for the positive elements, and B(H)_{sa} for the selfadjoint elements. The notion of positivity induces an order on B(H)_{sa}: we say that T \geq S if T - S \geq 0.
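
A standard example that separates several of these notions is the unilateral shift. The sketch below uses notation that is assumed for the illustration only: H = \ell^2(\mathbb{N}) with orthonormal basis \{e_n\}_{n \geq 0}, S is the operator determined by S e_n = e_{n+1}, and E_0 is the orthogonal projection onto \mathbb{C} e_0.

```latex
% Assumptions: H = \ell^2(\mathbb{N}) with orthonormal basis \{e_n\}_{n \ge 0};
% S e_n = e_{n+1} (the unilateral shift); E_0 = projection onto \mathbb{C} e_0.
\begin{align*}
S^* e_0 &= 0, \qquad S^* e_n = e_{n-1} \quad (n \geq 1), \\
S^* S &= I, \\
S S^* &= I - E_0 .
\end{align*}
```

So S is an isometry (hence a contraction) which is neither normal nor unitary, S S^* is a projection, and S^* S - S S^* = E_0 \geq 0, that is, S S^* \leq S^* S in the order just defined.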

For any T \in B(H), the spectrum \sigma(T) of T is the subset of the complex plane defined by

\sigma(T) = \{\lambda \in \mathbb{C} : \lambda I - T does not have a bounded inverse \}.

For every T \in B(H), \sigma(T) is a closed set contained in \{|z| \leq \|T\|\}. For selfadjoint operators, the non-emptiness of the spectrum is easier to establish than for general operators, and follows from the following facts:

Fix T \in B(H)_{sa}, and set m_T = \inf_{\|h\|=1}\langle Th, h \rangle and M_T = \sup_{\|h\|=1}\langle Th, h \rangle (it is easy to see that \langle Th, h \rangle \in \mathbb{R} for T \in B(H)_{sa}). Then

  1. \sigma(T) \subseteq [m_T, M_T] \subseteq \mathbb{R},
  2. \|T\| = \max\{|m_T|, |M_T|\},
  3. m_T, M_T \in \sigma(T).

In particular, the above facts can be put together to yield for a selfadjoint operator T:

(*)  \|T\| = \sup\{|\lambda| : \lambda \in \sigma(T)\}.
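
Here is a quick sanity check of the three facts and of (*) on a 2 \times 2 matrix. The particular matrix is an assumption chosen for illustration; it also shows why the absolute values are needed in the formula for the norm.

```latex
% Assumption: T = diag(-3, 1) acting on \mathbb{C}^2, h = (h_1, h_2) a unit vector.
\begin{align*}
\langle T h, h \rangle &= -3|h_1|^2 + |h_2|^2, \qquad |h_1|^2 + |h_2|^2 = 1, \\
m_T &= -3, \qquad M_T = 1, \qquad \sigma(T) = \{-3, 1\} \subseteq [m_T, M_T], \\
\|T\| &= 3 = \max\{|m_T|, |M_T|\} = \sup\{|\lambda| : \lambda \in \sigma(T)\}.
\end{align*}
```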

Exercise B: Prove the above 3 facts and equation (*) above (assuming that T \in B(H)_{sa}). Hint: To solve the exercise, technology from a first course in functional analysis suffices; perhaps the most nontrivial part is \|T\| = \max\{|m_T|, |M_T|\}, which can be reformulated as \|T\| = r(T), where r(T) = \sup_{\|h\|=1}|\langle Th,h \rangle| is the numerical radius. A direct proof can be found in many texts, for example Proposition 10.2.6 in my book. Alternatively, one can cleverly reduce to the case of compact selfadjoint operators.

Given any T \in B(H) and any polynomial p(x) = \sum_{k=0}^n a_k x^k \in \mathbb{C}[x], the evaluation of p at T has an obvious meaning:

p(T) = \sum_{k=0}^n a_k T^k = a_0 I + a_1 T + \ldots + a_n T^n.

In particular, if T is selfadjoint then p(T) is a well defined selfadjoint operator whenever p \in \mathbb{R}[x], i.e., whenever p is a polynomial with real coefficients.

Theorem 1 (spectral mapping theorem): Let T \in B(H) and p \in \mathbb{C}[x]. Then

\sigma (p(T)) = p(\sigma(T)) := \{p(\lambda) : \lambda \in \sigma(T)\}.

Proof: Fix a non constant polynomial p. For every \lambda \in \mathbb{C}, we can factor the polynomial p(x) - \lambda = a(x - c_1)\cdots (x-c_n). We therefore have

p(T) - \lambda I = a(T-c_1 I) \cdots (T - c_n I).

The left hand side is not invertible if and only if one of the factors T - c_i I is not invertible, which happens if and only if c_i \in \sigma(T). Thus, \lambda \in \sigma(p(T)) if and only if there is some c \in \sigma(T) \cap p^{-1}(\lambda).
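
Note that the theorem does not require T to be selfadjoint. As an illustration (the matrix N and the polynomial p below are assumptions made up for this example), take a nilpotent 2 \times 2 matrix:

```latex
% Assumptions: N is the nilpotent 2x2 Jordan block and p(x) = x^2 + 2x + 1.
\begin{align*}
N = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad N^2 = 0, \quad \sigma(N) = \{0\},
\qquad
p(N) = N^2 + 2N + I = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}, \\
\sigma(p(N)) = \{1\} = \{p(0)\} = p(\sigma(N)),
\qquad
\|p(N)\| = 1 + \sqrt{2} > 1 = \sup\{|\lambda| : \lambda \in \sigma(p(N))\}.
\end{align*}
```

The norm is computed from p(N)^* p(N), whose eigenvalues are 3 \pm 2\sqrt{2} = (1 \pm \sqrt{2})^2. So the spectral mapping theorem holds for this non-selfadjoint operator, while equation (*) fails for it; (*) really uses selfadjointness.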

Given a topological space X, let C(X) denote the algebra of complex valued continuous functions on X, and let C_{\mathbb{R}}(X) be the real valued continuous functions. We equip these algebras with the supremum norm, and this gives both these algebras the structure of a Banach algebra (i.e., a Banach space with a multiplication such that \|fg\|\leq \|f\| \|g\|). These algebras also carry a * operation f^* = \overline{f}, and this makes them into “abstract” C*-algebras (i.e., Banach algebras satisfying the identity (C*)).

The following theorem makes sense of “evaluating a continuous function f \in C(\sigma(T)) at T”.

Theorem 2 (continuous functional calculus): Let T \in B(H)_{sa}, and let C^*(T) be the unital C*-algebra generated by T:

C^*(T) = \overline{\{p(T) : p \in \mathbb{C}[x]\}}^{\|\cdot\|}.

Then there exists an isomorphism \Phi : C(\sigma(T)) \to C^*(T) such that

  1. \Phi(p) = p(T) for every p \in \mathbb{C}[x].
  2. \|\Phi(f)\| = \|f\|_\infty for every f \in C(\sigma(T)).
  3. \Phi(f^*) = \Phi(f)^*.
  4. If f,g \in C_{\mathbb{R}}(\sigma(T)) and f\leq g then \Phi(f) \leq \Phi(g).

Remark: The mapping f \mapsto \Phi(f) is usually denoted simply f \mapsto  f(T). Note that for every f \in C(\sigma(T)), we have that f(T) \in C^*(T) \subseteq \{T,T^*\}''. The map \Phi is referred to as the continuous functional calculus. The inverse mapping \Gamma = \Phi^{-1} : C^*(T) \to C(\sigma(T)) is called the Gelfand transform.

Proof of Theorem 2: We consider first the real norm closed algebra

A_{\mathbb{R}} = \overline{\{p(T) : p \in \mathbb{R}[x]\}}

and show that there is an isomorphism \Phi:  C_{\mathbb{R}}(\sigma(T)) \to A_{\mathbb{R}}. The map p \mapsto p(T) is clearly an algebraic homomorphism from \mathbb{R}[x] into the real algebra \{p(T) : p \in \mathbb{R}[x]\}, which consists of selfadjoint operators only. By Weierstrass’s polynomial approximation theorem, \mathbb{R}[x] is dense in C_{\mathbb{R}}(\sigma(T)). So, to prove the existence of an isometric isomorphism, it suffices to show that \|p(T)\| =\|p\|_\infty for every p \in \mathbb{R}[x]. But by equation (*) and the spectral mapping theorem,

\|p(T)\| = \sup\{|\lambda| : \lambda \in \sigma(p(T))\} = \sup\{|p(c)| : c \in \sigma(T)\} = \|p\|_\infty,

where we have used the fact that p(T) is selfadjoint (here we use that the coefficients are real). Thus p \mapsto p(T) extends to an algebra isomorphism \Phi: C_{\mathbb{R}}(\sigma(T)) \to A_{\mathbb{R}} satisfying items 1, 2 and 3 in the statement of the theorem (item 3 is satisfied in an empty way). To show that \Phi preserves order, it is enough to show that if f \geq 0, then \Phi(f) \geq 0. But if f \geq 0 and is continuous on \sigma(T), then g := \sqrt{f} \in C_\mathbb{R}(\sigma(T)), so \Phi(f) = \Phi(g^2) = \Phi(g) \Phi(g) \geq 0, as the square of a selfadjoint operator.

Now if f \in C(\sigma(T)), then f = u + iv for unique u,v \in C_\mathbb{R}(\sigma(T)). Then we can define \Phi(f) = \Phi(u) + i \Phi(v). This extends to a well defined homomorphism into C^*(T), and it preserves positivity and the *-operation. Finally, \Phi is isometric:

\|\Phi(f)\|^2 = \|\Phi(f)^*\Phi(f)\| = \|\Phi(|f|^2)\| = \||f|^2\|_\infty = \|f\|^2_\infty .
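
In finite dimensions the functional calculus can be written down explicitly, and it may help to keep the following picture in mind. This is a sketch under assumptions made only for the illustration: T is a diagonal 2 \times 2 matrix with distinct real eigenvalues \lambda_1 \neq \lambda_2, and p is the interpolation polynomial that agrees with f on \sigma(T).

```latex
% Assumptions: T = diag(\lambda_1, \lambda_2) with real \lambda_1 \ne \lambda_2,
% f any function on \sigma(T) = \{\lambda_1, \lambda_2\}.
\begin{align*}
p(x) &= f(\lambda_1) \frac{x - \lambda_2}{\lambda_1 - \lambda_2}
      + f(\lambda_2) \frac{x - \lambda_1}{\lambda_2 - \lambda_1},
\qquad p(\lambda_i) = f(\lambda_i) \quad (i = 1, 2), \\
f(T) &= \Phi(f) = \Phi(p) = p(T)
      = \begin{pmatrix} f(\lambda_1) & 0 \\ 0 & f(\lambda_2) \end{pmatrix}, \\
\|f(T)\| &= \max\{|f(\lambda_1)|, |f(\lambda_2)|\} = \|f\|_\infty .
\end{align*}
```

Here \Phi(f) = \Phi(p) because f and p are the same element of C(\sigma(T)): only the values on the spectrum matter.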

From the continuous functional calculus we will derive the spectral theorem below, but first a couple of quick corollaries.

Corollary 3 (existence of a positive square root): Let T \in B(H)_+. Then there exists a unique positive operator S \in B(H) such that S^2 = T. In fact, S \in C^*(T).

Remark: The operator S is called the positive square root  of T, and is denoted T^{1/2} or \sqrt{T}.

Proof of Corollary 3: With the notation of the functional calculus, we have that T = \Phi(f), where f is the continuous function on \sigma(T) given by f(x) = x. Then S = f^{1/2}(T) = \Phi (f^{1/2}) is the required square root (the function f^{1/2} is just f^{1/2}(x) = \sqrt{x}; sorry for the pedantry!). The uniqueness is left as an exercise – you can find a solution at the end of this post.
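
For a concrete instance of the corollary, here is a positive 2 \times 2 matrix whose square root can be computed by hand from its spectral decomposition. The matrix T and the names Q_1, Q_3 for its spectral projections are assumptions introduced for this example.

```latex
% Assumptions: T is the 2x2 matrix below; Q_1, Q_3 are the orthogonal projections
% onto the eigenspaces for the eigenvalues 1 and 3.
\begin{align*}
T &= \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} = 1 \cdot Q_1 + 3 \cdot Q_3,
\qquad
Q_1 = \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix},
\quad
Q_3 = \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}, \\
T^{1/2} &= \sqrt{1} \, Q_1 + \sqrt{3} \, Q_3
 = \frac{1}{2} \begin{pmatrix} \sqrt{3} + 1 & \sqrt{3} - 1 \\ \sqrt{3} - 1 & \sqrt{3} + 1 \end{pmatrix}.
\end{align*}
```

One checks directly that the square of the last matrix has diagonal entries \frac{1}{4}\big((\sqrt{3}+1)^2 + (\sqrt{3}-1)^2\big) = 2 and off-diagonal entries \frac{1}{2}(\sqrt{3}+1)(\sqrt{3}-1) = 1, as it should.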

The following exercise shows that C*-algebras are generated by their selfadjoint elements. It will also allow us later to extend theorems that we obtain for selfadjoint operators to theorems on normal operators (see Exercise K below).

Exercise C: Prove that for every operator T in a C*-algebra A, there exist two unique selfadjoint operators T_1, T_2 such that T = T_1 + i T_2. Moreover, T is normal if and only if T_1 T_2 = T_2 T_1 (in this case we say that T_1 and T_2  commute).

Exercise D: Prove that every element T in a unital C*-algebra A \subseteq B(H) is a linear combination of unitaries in A. (Hint: use the continuous functional calculus and the previous exercise). In other words, every unital C*-algebra is generated (in fact, spanned) by its unitaries.

Exercise E: Prove that if \lambda is an isolated point in the spectrum of a selfadjoint operator T, then \lambda is an eigenvalue (i.e., there exists a nonzero v \in H such that Tv = \lambda v).

To state another important decomposition theorem, we need a new definition.

Definition: An operator U \in B(H) is said to be a partial isometry if the restricted operator U\big|_{ker U^\perp} is an isometry from ker U^\perp onto Im U. The space ker U^\perp is called the initial space of U and the space Im U is called the final space of U.

Exercise F: If U is a partial isometry, then U^* U is the orthogonal projection onto the initial space of U, and U U^* is the orthogonal projection onto the final space of U.

Exercise G: For an operator U \in B(H), the following are equivalent.

  1. U is a partial isometry.
  2. U U^* U = U.
  3. U^* U is a projection.
  4. U^* is a partial isometry.

Corollary 4 (polar decomposition): Let T \in B(H). Then there exists a unique partial isometry U with ker U = ker T and a unique positive operator P with ker P = ker T such that T = UP. The operator P is given by P = (T^*T)^{1/2}, and it is contained in C^*(T).

Remark: The operator P = (T^*T)^{1/2} is denoted |T| and is called the absolute value of T. The decomposition T = U|T| is called the polar decomposition of T. We have noted that |T| \in C^*(T). As for U, it is in general not contained in C^*(T), but we shall see later that it is always contained in the von Neumann algebra generated by T.

Proof of Corollary 4: Existence: Put P = (T^*T)^{1/2}. Then

\|Ph\|^2 = \langle P^2 h, h \rangle = \langle T^*T h, h \rangle = \|Th\|^2.

In particular, ker P = ker T. Moreover, the equality of norms implies that the map Ph \mapsto Th is a well defined isometric linear map from Im P to Im T. It therefore extends continuously to an isometry U from \overline{Im P} = \ker P^\perp to  \overline{Im T}. Setting U=0 on ker P completes the construction.

Uniqueness: the assumptions imply that T^*T = P U^* U P = P^2 so P is the unique positive square root of T^*T. In the “existence” part of the proof we already noted that there is a unique partial isometry U with initial space ker T^\perp that maps Ph \mapsto Th.
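
To see the construction at work, and to see why U is in general only a partial isometry, consider the following 2 \times 2 example (the matrix T is an assumption chosen for illustration; e_1, e_2 denote the standard basis of \mathbb{C}^2).

```latex
% Assumption: T is the 2x2 matrix below, acting on \mathbb{C}^2 with standard basis e_1, e_2.
\begin{align*}
T &= \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}, \qquad
T^* T = \begin{pmatrix} 0 & 0 \\ 0 & 4 \end{pmatrix}, \qquad
|T| = (T^*T)^{1/2} = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}, \\
U &= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
U |T| = \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix} = T, \\
U^* U &= \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \qquad
U U^* = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
\ker U = \ker |T| = \ker T = \mathbb{C} e_1 .
\end{align*}
```

Here U maps the initial space \mathbb{C} e_2 isometrically onto the final space \mathbb{C} e_1 and vanishes on \ker T, exactly as in the proof.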

4. The spectral theorem

The spectral theorem for selfadjoint operators is the basic structure theorem for selfadjoint operators. It tells us what a general selfadjoint operator looks like. Recall that if T is a selfadjoint operator acting on a finite dimensional space H = \mathbb{C}^n, then T is unitarily equivalent to a diagonal operator with real coefficients, that is, there exists a unitary operator U \in U(H) such that

U T U^* = \begin{pmatrix}\lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix},

where \sigma(T) = \{\lambda_1, \ldots, \lambda_n\} (where some points in \sigma(T) are possibly repeated).

Moreover, if T is a compact selfadjoint operator on a Hilbert space, then T is unitarily equivalent to a diagonal operator (an infinite diagonal matrix, acting by multiplication on \ell^2), the diagonal of which corresponds to the eigenvalues of T, which form a sequence converging to 0:

U T U^* = \begin{pmatrix}\lambda_1 & & \\ & \lambda_2 & \\ & & \ddots \end{pmatrix}.

If T is unitarily equivalent to a diagonal operator where the diagonal elements form a bounded sequence of real numbers (not necessarily converging to 0), then T is a bounded selfadjoint operator (which is not necessarily compact). However, a general bounded selfadjoint operator need not be unitarily equivalent to a diagonal operator.

Example: The operator T : L^2[0,1] \to L^2[0,1] given by T f(x) = xf(x) is a selfadjoint bounded operator, and it is an easy exercise to see that this operator has no eigenvalues (so it cannot be unitarily equivalent to a diagonal operator). However, the operator in this example is rather well understood, and it is “sort of” diagonal. The general case is not significantly more complicated than this.
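
For the record, the absence of eigenvalues in this example can be seen in one line. The symbols \lambda and f below are generic: \lambda is a candidate eigenvalue and f \in L^2[0,1] a candidate eigenvector.

```latex
% Assumption: \lambda \in \mathbb{C} and f \in L^2[0,1] satisfy T f = \lambda f,
% where (T f)(x) = x f(x).
\begin{align*}
T f = \lambda f
\iff (x - \lambda) f(x) = 0 \ \text{ for a.e. } x \in [0,1]
\implies f(x) = 0 \ \text{ for a.e. } x \neq \lambda
\implies f = 0 \ \text{ in } L^2[0,1],
\end{align*}
```

since the single point \{\lambda\} has Lebesgue measure zero.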

To understand general selfadjoint operators, one needs to recall the notions of measure space and of L^p spaces. Let (X, \mu) be a measure space and consider the Hilbert space L^2 = L^2(X,\mu). Every f \in L^\infty = L^\infty(X,\mu) defines a (normal) bounded operator M_f : h \mapsto fh on L^2.

Exercise H: In case you never have, prove the following facts (or look them up; Kadison-Ringrose have a nice treatment relevant to our setting). Let (X, \mu) be a \sigma-finite measure space and f \in L^\infty(X, \mu).

  1. \|M_f\| = \|f\|_\infty (where \|f\|_\infty is the essential supremum of f, which is defined to be \inf \{t \geq 0 : \mu\{x:|f(x)|> t\} = 0\}).
  2. M_f^* = M_{\overline{f}}.
  3. If g : X \to \mathbb{C} and h \mapsto gh defines a bounded operator on L^2(X, \mu), then g is essentially bounded: \|g\|_\infty < \infty.
  4. If f,g \in L^\infty(X, \mu), then M_f M_g = M_{fg} and M_f + M_g = M_{f+g}.
  5. M_f is selfadjoint if and only if f is real valued almost everywhere.

The algebra L^\infty is an abstract C*-algebra with the usual algebraic operations, the *-operation f^* = \overline{f}, and norm \|f\| = \|f\|_\infty = ess-sup_{x \in X}|f(x)|. The map \pi : L^\infty \to B(L^2)

\pi(f) = M_f

is a *-representation (i.e., an algebraic homomorphism that preserves the adjoint: \pi(f^*) = \pi(f)^*), which is isometric (\|\pi(f)\| = \|f\|), so omitting \pi we can think of L^\infty as a C*-subalgebra of B(L^2). Since (M_f)^* = M_{\overline{f}}, the operator f \sim M_f is selfadjoint if and only if f is a.e. real valued. The operator M_f, where f \in L^\infty, is called a multiplication operator. Multiplication operators form a rich collection of examples of selfadjoint operators. The spectral theorem says that this collection exhausts all selfadjoint operators: every selfadjoint operator is unitarily equivalent to a multiplication operator.

Theorem 5 (the spectral theorem): Let T be a selfadjoint operator on a Hilbert space H. Then there exist a measure space (X, \mu), a unitary operator U : L^2(X,\mu) \to H, and a real valued f \in L^\infty(X,\mu), such that

U^* T U = M_f

When H is separable, X can be taken to be a locally compact Hausdorff space, and \mu a regular Borel probability measure. 

We will prove the spectral theorem in the case that T has a cyclic vector; the general case will then easily follow and will be left as an exercise.

Definition: Let T \in B(H)_{sa}. A vector h \in H is said to be a cyclic vector for T if

\overline{\{p(T) h : p \in \mathbb{C}[x]\}} = H.

Exercise I: Let T \in B(H)_{sa}. Prove that there exists a family of vectors \{h_i\} such that

H = \oplus_i H_i,

where H_i = \overline{\{p(T) h_i : p \in \mathbb{C}[x]\}}; in particular, for every i, T H_i \subseteq H_i. In other words, every selfadjoint operator is the direct sum of operators that have a cyclic vector.

Proof of the spectral theorem under the assumption that there exists a cyclic vector: Suppose that h \in H is a cyclic unit vector for T. Let X = \sigma(T). By the continuous functional calculus, there is an isometric *-isomorphism \Phi : C(X) \to C^*(T) which satisfies \Phi(p) = p(T) for every p \in \mathbb{C}[x]. Recall that we write f(T) = \Phi(f) for f \in C(X).

Define a linear functional \rho : C(X) \to \mathbb{C} by

\rho(f) = \langle f(T) h,h \rangle.

Then \rho is a positive linear functional on C(X), and \rho(1) = 1. By the Riesz representation theorem there exists a unique regular probability measure \mu (defined on all Borel subsets of X) such that \rho(f) = \int f d \mu for all f \in C(X). This is the measure \mu that appears in the statement of the theorem.

Form L^2(X, \mu). We define U : L^2(X,\mu) \to H by first requiring that U g = g(T) h for all g \in C(X). Now, C(X) is a dense subspace of L^2(X,\mu), and by the cyclicity assumption, \{g(T) h : g \in C(X)\} is a dense subspace of H. So if we show that U is isometric on C(X), it will follow that U extends to a unitary L^2(X,\mu) \to H; that U is isometric follows from:

\|g\|^2_2 = \int|g|^2 d \mu = \langle g(T)^* g(T) h,h \rangle = \|g(T)h\|^2,

for all g \in C(X).

Finally, let f(x) = x. Clearly, f \in L^\infty(X,\mu) is a bounded real valued function. For every g \in C(X) we have T U g = T g(T) h = (fg)(T) h, while U M_f g = U (fg) = (fg)(T) h. Thus TU = U M_f on the dense subspace C(X), hence on all of L^2(X,\mu), and the proof is complete.
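
To see the construction in the proof at work, here is a minimal finite dimensional sketch. It assumes a hypothetical diagonal matrix T with distinct eigenvalues and a cyclic unit vector h chosen for illustration; in this case X = \sigma(T) is a finite set, C(X) consists of all functions on X, and \mu puts mass |h_i|^2 on the i-th eigenvalue. The code checks that U is isometric and that TU = U M_f.

```python
import math

# Hypothetical example: T = diag(t) on C^3 with distinct eigenvalues,
# so X = sigma(T) = {t[0], t[1], t[2]} and C(X) is the set of all lists of 3 values.
t = [-1.0, 0.5, 2.0]

# A unit vector with no zero coordinate is cyclic for such a T.
h = [0.6, 0.48j, 0.64]

# The measure from the proof: rho(g) = <g(T)h, h> = sum_i g(t_i) |h_i|^2,
# so mu puts mass |h_i|^2 on the point t_i.
mu = [abs(x) ** 2 for x in h]

def U(g):
    """U g = g(T) h, where g(T) = diag(g(t_0), g(t_1), g(t_2))."""
    return [gi * hi for gi, hi in zip(g, h)]

def l2_norm(g):
    """Norm of g in L^2(X, mu)."""
    return math.sqrt(sum(abs(gi) ** 2 * m for gi, m in zip(g, mu)))

def vec_norm(v):
    """Euclidean norm in H = C^3."""
    return math.sqrt(sum(abs(vi) ** 2 for vi in v))

def apply_T(v):
    """Apply the diagonal operator T to a vector in H."""
    return [ti * vi for ti, vi in zip(t, v)]

def M_f(g):
    """Multiplication by f(x) = x on L^2(X, mu)."""
    return [ti * gi for ti, gi in zip(t, g)]

g = [1 + 1j, -2.0, 0.5]   # an arbitrary element of C(X), listed by its values g(t_i)

isometry_gap = abs(l2_norm(g) - vec_norm(U(g)))
intertwining_gap = max(abs(x - y) for x, y in zip(apply_T(U(g)), U(M_f(g))))
```

Both gaps should vanish up to rounding. If some coordinate of h were zero, the vector would no longer be cyclic and U would fail to be onto.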

Remark: In the proof above, we constructed a measure \mu = \mu_h by

(**)  \int f d \mu_h = \langle f(T) h,h \rangle for all f \in C(\sigma(T))

where h \in H was assumed to be a cyclic vector for T. In fact, the same construction makes very good sense also when h is not necessarily cyclic. The measure \mu_h is then sometimes referred to as the spectral measure associated to h (or T). Warning: the term “spectral measure” will appear again below and will then mean something different. In any case, it is an instructive exercise to see what the measure \mu_h looks like when T is a selfadjoint matrix and h is an arbitrary vector.
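
As a small numerical companion to the last sentence, the following sketch computes \mu_h for one particular 2 \times 2 real symmetric matrix; it is meant as a sanity check, not as a solution of the exercise in general. The function name spectral_measure_2x2 and the choice T = [[2, 1], [1, 2]] are assumptions made for illustration.

```python
import math

def spectral_measure_2x2(a, b, d, h):
    """Atoms of mu_h for T = [[a, b], [b, d]] (real symmetric, b != 0) and a real vector h.

    Returns [(eigenvalue, mass), ...], where mass = <E h, h> for the
    orthogonal projection E onto the corresponding eigenspace.
    """
    mid = (a + d) / 2
    rad = math.hypot((a - d) / 2, b)
    lam_minus, lam_plus = mid - rad, mid + rad
    gap = lam_plus - lam_minus
    # E_plus = (T - lam_minus * I) / (lam_plus - lam_minus)
    e_plus = [[(a - lam_minus) / gap, b / gap],
              [b / gap, (d - lam_minus) / gap]]
    mass_plus = sum(e_plus[i][j] * h[j] * h[i] for i in range(2) for j in range(2))
    mass_minus = h[0] ** 2 + h[1] ** 2 - mass_plus   # E_minus = I - E_plus
    return [(lam_minus, mass_minus), (lam_plus, mass_plus)]

# T = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
atoms = spectral_measure_2x2(2, 1, 2, [1, 0])        # h = e_1 is cyclic
atoms_eig = spectral_measure_2x2(2, 1, 2, [1, 1])    # h is an eigenvector for 3, not cyclic

# Check (**) for f(x) = x^2 and h = e_1: the integral should equal <T^2 h, h> = ||T h||^2.
integral = sum(lam ** 2 * mass for lam, mass in atoms)
Th = [2 * 1 + 1 * 0, 1 * 1 + 2 * 0]
```

In this sketch the total mass of \mu_h equals \|h\|^2, and a vector h that is not cyclic produces a measure that misses part of the spectrum.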

Exercise J: Show how the spectral theorem for general selfadjoint operators follows from the case where T has a cyclic vector. Take care to establish also the final assertion of the theorem.

5. The Borel functional calculus

In Theorem 2, we saw that for a selfadjoint operator T and a continuous function f \in C(\sigma(T)), one can define an operator f(T) \in C^*(T). In fact,

C^*(T) = \{f(T) : f \in C(\sigma(T))\}.

The mapping f \mapsto f(T) is called the continuous functional calculus, and has some nice algebraic and analytic properties. In this section we will extend the functional calculus to all bounded Borel functions; that is, we will show how to define f(T) whenever f is a bounded Borel measurable function defined on \sigma(T). This assignment (called the Borel functional calculus) will have similar nice properties, with the main differences being (i) the map f \mapsto f(T) is not necessarily isometric, and (ii) f(T) will not necessarily lie in C^*(T), but rather in W^*(T) (i.e., in the von Neumann algebra generated by T).

Let B(X) denote the algebra of all bounded Borel measurable functions on a compact space X, equipped with the supremum norm and the adjoint operation f^* = \overline{f}.

Theorem 6 (the Borel functional calculus): Let T be a selfadjoint operator on a Hilbert space H, and write X = \sigma(T). There exists a contractive *-homomorphism \phi \mapsto \phi(T) from B(X) into W^*(T) that extends the continuous functional calculus. If \phi_n is a bounded sequence in B(X) that converges pointwise to \phi, then \phi_n(T) \to \phi(T) in the strong operator topology.

Remark: By the end of the next lecture, you will be able to establish that this *-homomorphism is surjective, that is, that the von Neumann algebra generated by T has the form \{f(T) : f \in B(\sigma(T))\} (so you better be on the lookout!).

Proof of Theorem 6: Given \phi \in B(X), the operator \phi(T) is defined to be U M_{\phi \circ f} U^*, where U^* T U = M_f is the unitary equivalence of T with a multiplication operator. This makes sense, because \phi being bounded and Borel measurable implies that \phi \circ f \in L^\infty(X, \mu). The only subtle point is to prove that \phi(T) \in W^*(T). We will prove this for the case where T has a cyclic vector. The case where T is a general selfadjoint operator on a separable Hilbert space will be left as an exercise (easy, given the proof for the cyclic case); the case where H is not even separable will be ignored.

Thus, let us assume that T = M_f on H = L^2(X, \mu), for the function f(x) = x, where \mu is a regular Borel probability measure on X = \sigma(T). By a consequence of Lusin’s theorem, there is a bounded sequence of continuous functions \phi_n : X \to \mathbb{C} that converges \mu-almost everywhere to \phi. By the dominated convergence theorem, for every h \in L^2(X, \mu),

\|M_{\phi_n}h - M_{\phi}h\|^2 = \int |\phi_n - \phi|^2 |h|^2 d \mu \to 0,

so the operators M_{\phi_n} = \phi_n(T) \in C^*(T) converge in the strong operator topology to \phi(T) := M_{\phi} = M_{\phi \circ f}, and thus \phi(T) \in W^*(T).

A similar argument also shows the final assertion of the theorem.
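
The difference between strong and norm convergence in this argument can be seen numerically. The sketch below assumes T = M_x on L^2[0,1] with Lebesgue measure, takes \phi to be the indicator of [1/2, 1], and uses a hypothetical sequence of continuous ramps \phi_n that converges to \phi pointwise; the integrals are estimated by a midpoint rule, so the numbers are approximations only.

```python
def phi(x):
    """Indicator of [1/2, 1], a bounded Borel function that is not continuous."""
    return 1.0 if x >= 0.5 else 0.0

def phi_n(n, x):
    """Continuous ramp: 0 up to 1/2 - 1/n, then linear, equal to 1 from 1/2 on."""
    return min(1.0, max(0.0, 1.0 + n * (x - 0.5)))

def sot_error_sq(n, grid=100000):
    """Midpoint-rule estimate of ||M_{phi_n} h - M_phi h||^2 for h = 1 in L^2[0,1]."""
    total = 0.0
    for i in range(grid):
        x = (i + 0.5) / grid
        total += (phi_n(n, x) - phi(x)) ** 2
    return total / grid

def sup_error(n, grid=100000):
    """Grid estimate of ||phi_n - phi||_infty, which equals ||M_{phi_n} - M_phi||."""
    return max(abs(phi_n(n, (i + 0.5) / grid) - phi((i + 0.5) / grid))
               for i in range(grid))

ns = (10, 100, 1000)
errors = [sot_error_sq(n) for n in ns]   # exact value is 1/(3n), tending to 0
sups = [sup_error(n) for n in ns]        # stays close to 1 for every n
```

For h = 1 the squared errors tend to zero, while the operator norm of the difference stays at 1. This is consistent with item (ii) above: the limit M_\phi lies in W^*(T) but is not a norm limit of the operators \phi_n(T).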

6. The spectral measure

Fix a selfadjoint operator T \in B(H). For the characteristic function \chi_A of a Borel set A \subseteq X = \sigma(T) we can define

E(A) = \chi_A(T).

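Since \chi_A^* = \chi_A and \chi_A^2 = \chi_A, the operator E(A) is a projection. The properties of the functional calculus also imply the following.

Exercise K: Prove the following properties, where A, B and A_1, A_2, \ldots denote Borel subsets of X.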

  1. E(A \cap B) = E(A) E(B).
  2. E(\emptyset) = 0 and E(X) = I.
  3. E(\cup_n A_n) = \sum_n E(A_n) for every family of pairwise disjoint Borel sets A_1, A_2, \ldots, where the sum converges in the strong operator topology.

A projection valued map with the properties above is called a spectral measure. The spectral measure E constructed from the functional calculus above is called the spectral measure associated with T.
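
In finite dimensions the spectral measure is simply a sum of eigenprojections, and the properties above can be verified directly. The following is a minimal sketch for the hypothetical matrix T = [[2, 1], [1, 2]], whose eigenvalues are 1 and 3; Borel sets are represented by predicates on real numbers, and all helper names are illustrative.

```python
# Hypothetical example: T = [[2, 1], [1, 2]], with sigma(T) = {1, 3}.
# P[lam] is the orthogonal projection onto the eigenspace of lam.
P = {1: [[0.5, -0.5], [-0.5, 0.5]],
     3: [[0.5, 0.5], [0.5, 0.5]]}
ZERO = [[0.0, 0.0], [0.0, 0.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

def add(A, B):
    """Entrywise sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

def scale(c, A):
    """Scalar multiple of a 2x2 matrix."""
    return [[c * A[i][j] for j in range(2)] for i in range(2)]

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def E(borel_set):
    """E(A) = chi_A(T). Only the part of A inside sigma(T) matters,
    so A is passed as a predicate on real numbers."""
    result = ZERO
    for lam, proj in P.items():
        if borel_set(lam):
            result = add(result, proj)
    return result

def in_A(x):
    return x > 2          # the set (2, infinity)

def in_B(x):
    return x < 5          # the set (-infinity, 5)

def in_A_and_B(x):
    return in_A(x) and in_B(x)

# T recovered as the sum of lam * E({lam}) over the spectrum.
T = add(scale(1, P[1]), scale(3, P[3]))
```

Here E(A \cap B) = E(A)E(B) can be confirmed by comparing E(in_A_and_B) with matmul(E(in_A), E(in_B)), and the last line reconstructs T from its spectral projections.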

Sometimes, the spectral theorem is stated in terms of the spectral measure, rather than in terms of multiplication operators. One can show that for every bounded Borel function \phi on X, the functional calculus is given by “integration against the spectral measure”

\phi(T) = \int \phi dE = \int \phi(\lambda) d E_\lambda,

where the integral converges in the following sense: for any \epsilon > 0, there is a partition A_1, \ldots, A_n of X such that

\left\|\phi(T) - \sum_{k=1}^n \phi(x_k) E(A_k) \right\| < \epsilon

for any choice of x_k \in A_k. (In fact, one can show that every spectral measure gives rise to a *-homomorphism \pi of B(X) by \pi (\phi) = \int \phi dE.) In particular, one has the formula

T = \int \lambda dE_\lambda.

This implies, in particular, that every selfadjoint operator T can be approximated in the norm by linear combinations of projections in the von Neumann algebra that it generates. Let us record this fact, and then give a more straightforward proof.

Corollary 7: Every von Neumann algebra M is equal to the norm closure of the linear span of projections in M. In fact, every selfadjoint operator is in the norm closure of the linear span of its spectral projections corresponding to intervals with rational endpoints.

Proof: By Exercise C, it suffices to show that every selfadjoint operator T can be approximated in the norm by linear combinations of projections in W^*(T). Assume that \sigma(T) \subseteq [a,b), and let a=x_0 < x_1 < \ldots < x_n = b be a partition of [a,b], so that the intervals [x_{k-1},x_k) are disjoint and cover \sigma(T). For every k=1, \ldots, n, and y \in [a,b]

x_{k-1} \chi_{[x_{k-1},x_k)}(y) \leq y \chi_{[x_{k-1},x_k)}(y)  \leq x_k \chi_{[x_{k-1},x_k)}(y),

so (functional calculus)

x_{k-1} E([x_{k-1},x_k)) \leq T E([x_{k-1},x_k)) \leq x_k E([x_{k-1},x_k)).

Summing, one has

\sum_k x_{k-1} E([x_{k-1},x_k)) \leq T \leq \sum_k x_k E([x_{k-1},x_k)),

and since the projections are mutually orthogonal we obtain

\left\| T - \sum_k y_k E([x_{k-1},x_k)) \right\| \leq \max_k |x_k - x_{k-1}|

for any y_k \in [x_{k-1},x_k).

The final statement of the corollary follows from the same argument.
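
The estimate in the proof is easy to test on a diagonal matrix, where E([x_{k-1},x_k)) is the projection onto the coordinates whose eigenvalue lies in that interval. The sketch below assumes a hypothetical spectrum inside [0,1), a uniform partition, and the left endpoints as the sample points y_k; it only illustrates the bound and is not part of the argument.

```python
def step_approximation(eigs, a, b, n):
    """For T = diag(eigs) with all eigenvalues in [a, b), return the diagonal of
    sum_k y_k E([x_{k-1}, x_k)) for the uniform partition of [a, b] into n
    intervals, with y_k = x_{k-1} (left endpoints), together with the mesh."""
    mesh = (b - a) / n
    diagonal = []
    for t in eigs:
        k = min(int((t - a) / mesh), n - 1)   # 0-based index of the interval containing t
        diagonal.append(a + k * mesh)
    return diagonal, mesh

def norm_of_difference(eigs, diagonal):
    """The operator norm of a diagonal matrix is its largest absolute entry."""
    return max(abs(t - y) for t, y in zip(eigs, diagonal))

eigs = [0.0, 0.3, 0.55, 0.9]           # hypothetical spectrum inside [0, 1)
results = []
for n in (4, 16, 64):
    diagonal, mesh = step_approximation(eigs, 0.0, 1.0, n)
    results.append((mesh, norm_of_difference(eigs, diagonal)))
```

Each entry of results pairs the mesh of the partition with the norm of T minus the step approximation; the second number should never exceed the first, up to rounding.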

Remark: The operator T is in the norm closure of the linear span of the spectral projections associated with it, but the spectral projections are (in general) not in the C*-algebra generated by T.

7. The spectral theorem for normal operators

The spectral theorem (Theorem 5) holds for normal operators in place of selfadjoint operators, with the difference that f is complex valued rather than real valued. Thus, every normal operator is unitarily equivalent to a multiplication operator. One may repeat the proof above (there is one and a half places where this poses a nontrivial challenge – the trickiest part being the equation labelled (*)). Another option is to use the result for selfadjoint operators, together with Exercise C and the existence of a spectral measure, in order to construct a spectral measure that is supported on (a compact subset of) the complex plane \mathbb{C}.

When dealing with a normal operator N, the spectrum X:=\sigma(N) is a subset of the complex plane, and one needs to use polynomials p(z,\overline{z}) in z and its conjugate; ordinary polynomials cannot approximate uniformly arbitrary continuous functions on X. Likewise, the C*-algebra generated by N is the closure of polynomials p(N,N^*) in N and its adjoint. Accordingly, the definition of a cyclic vector needs to be modified so that the proof runs smoothly: we say that a vector h \in H is *-cyclic for N if

\overline{\operatorname{span}}\{p(N,N^*)h : p \text{ is a polynomial in two variables}\} = H.

Then one can show that if H is a Hilbert space and N is a normal operator on H, then H decomposes into a direct sum of *-cyclic subspaces. Then one proves that a normal operator with a *-cyclic vector is unitarily equivalent to M_z on L^2(X, \mu), where X = \sigma(N). We leave the details as a significant exercise.

Exercise L: Show how to adjust the proof of the spectral theorem so that it works for normal operators; alternatively, deduce the spectral theorem for normal operators, from the spectral theorem for selfadjoint operators.

8. Additional exercises

Exercise M: Prove that a selfadjoint (or normal, if you wish) operator T is compact if and only if E(\{z : |z|>\epsilon\}) is a finite rank operator for every \epsilon > 0 (here E denotes the spectral measure associated with T).

Exercise N: Let T_1, T_2 be two cyclic selfadjoint operators, so that T_i is unitarily equivalent to M_x on L^2(\mathbb{R}, \mu_i), where \mu_i is a (compactly supported) probability measure on \mathbb{R}. Prove that T_1 is unitarily equivalent to T_2 if and only if \mu_1 and \mu_2 are mutually absolutely continuous. The same result holds for *-cyclic normal operators. (Hint: you may want to recall the Radon-Nikodym theorem).