Advanced Analysis, Notes 1: Hilbert spaces (basics)

by Orr Shalit

In this lecture and the next few lectures we will study the basic theory of Hilbert spaces. Hilbert spaces are usually studied over \mathbb{R} or over \mathbb{C}. In this course, whenever we consider Hilbert spaces, we shall consider only complex Hilbert spaces, that is, spaces over \mathbb{C}. There are two reasons for this. First, the results in this post hold equally well for real Hilbert spaces with similar proofs. Second, in some topics that we will discuss later the nice results only hold for complex spaces. So we will ignore real Hilbert spaces because they are essentially the same and also because they are fundamentally different!

Remark: The only situation I know of where it is really important to concentrate on real Hilbert spaces is when doing convex analysis (there must be others that I don’t know of). On the other hand, it is often convenient – indeed, we already did so in this course – to study real Banach spaces.

1. Basics of the basics

Definition 1: Let G be a complex vector space. G is said to be an inner product space if there exists a function (\cdot, \cdot) : G \times G \rightarrow \mathbb{C} that satisfies the following conditions: 

  1. \forall x \in G . (x,x) = 0 \Rightarrow x = 0.
  2. \forall x \in G . (x,x) \geq 0
  3. \forall x,y \in G . (x,y) = \overline{(y,x)}.
  4. \forall x,y,z \in G . \forall a,b \in \mathbb{C} . (ax + by, z) = a(x,z) + b(y, z).

From properties 3 and 4 above we have the following property too: \forall x,y,z \in G . \forall a,b \in \mathbb{C} . (z, ax + by) = \overline{a}(z,x) + \overline{b}(z,y).

Henceforth G will denote an inner product space.

Examples:

  1. Let G = \mathbb{C}^n with the standard inner product: (x,y) = \sum_{i=1}^n x_i \overline{y}_i.
  2. Let G = C[0,1] (the continuous functions on the interval) with the inner product: (f,g) = \int_{0}^1 f(t) \overline{g(t)} dt.
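For readers who like to experiment, here is a minimal Python sketch of the first example. It checks properties 3 and 4 of Definition 1, and the conjugate linearity in the second variable, on a few sample vectors. The helper name inner and the sample vectors are illustrative choices, not part of the notes.

```python
# Standard inner product on C^n: (x, y) = sum_i x_i * conj(y_i).
# Vectors are plain lists of Python complex numbers (an illustrative choice).

def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b.conjugate() for a, b in zip(x, y))

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 0 + 1j]
z = [0.5j, -1 + 1j]
a, b = 2 - 3j, 1 + 1j

# Property 3: conjugate symmetry.
sym_gap = abs(inner(x, y) - inner(y, x).conjugate())

# Property 4: linearity in the first argument.
ax_by = [a * s + b * t for s, t in zip(x, y)]
lin_gap = abs(inner(ax_by, z) - (a * inner(x, z) + b * inner(y, z)))

# Consequence of 3 and 4: conjugate linearity in the second argument.
conj_gap = abs(inner(z, ax_by)
               - (a.conjugate() * inner(z, x) + b.conjugate() * inner(z, y)))

print(sym_gap, lin_gap, conj_gap)  # all three should vanish up to rounding
```

Of course a numerical check on sample vectors proves nothing; it is only a sanity check of the conventions (linear in the first variable, conjugate linear in the second).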

Remark: The reader might not be familiar with Riemann integration of complex valued functions. There is really nothing to it: if f(t) = u(t) + iv(t) is a continuous, complex valued function where u and v are its real and imaginary parts, then

\int_a^b f(t) dt := \int_a^b u(t) dt + i \int_a^b v(t) dt .
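As a quick illustration of this definition, the following Python sketch approximates \int_0^\pi e^{it} dt in two ways: directly, and by integrating the real and imaginary parts separately. The midpoint rule, the step count, and the helper name integrate are assumptions made for the sake of the example.

```python
import cmath
import math

def integrate(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def f(t):
    return cmath.exp(1j * t)  # f(t) = cos(t) + i sin(t)

# Integrate the complex valued function directly ...
whole = integrate(f, 0.0, math.pi)

# ... and via its real and imaginary parts, as in the definition.
by_parts = (integrate(lambda t: f(t).real, 0.0, math.pi)
            + 1j * integrate(lambda t: f(t).imag, 0.0, math.pi))

print(whole, by_parts)  # both should be close to 2i
```

Since \int_0^\pi \cos t dt = 0 and \int_0^\pi \sin t dt = 2, both numbers should be close to 2i.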

If x \in G, one defines the norm of x to be the nonnegative real number \|x\| given by

\|x\| = \sqrt{(x,x)}.

Theorem 2 (Cauchy-Schwarz inequality): For all x,y \in G

|(x,y)| \leq \|x\|\|y\|.

Equality holds if and only if x and y are linearly dependent.

Proof: Exercise: prove the Cauchy-Schwarz inequality. Try also to come up with a proof for the inequality (and not the “equality holds if and only if” part) that only uses properties 2, 3 and 4 of inner product (from Definition 1). We will have occasion to use this.

Theorem 3 (triangle inequality): For all x, y \in G

\|x+y\| \leq \|x\| + \|y\|

Equality holds if and only if one of the vectors is a nonnegative multiple of the other.

Proof: One expands \|x+y\|^2 and then uses Cauchy-Schwarz :

\|x + y\|^2 = \|x\|^2 + 2 Re (x,y) + \|y\|^2 \leq \|x\|^2 + 2 \|x\|\|y\| + \|y\|^2 = (\|x\|+\|y\|)^2 .
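Both inequalities are easy to spot-check numerically in \mathbb{C}^2. The sketch below is only an illustration: the vectors and the helper names inner, norm and add are arbitrary choices made here.

```python
import math

def inner(x, y):
    """Standard inner product on C^n, linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(inner(x, x).real)

def add(x, y):
    return [a + b for a, b in zip(x, y)]

x = [1 + 2j, 3 - 1j]
y = [2 - 1j, 0 + 1j]

cs_ok = abs(inner(x, y)) <= norm(x) * norm(y)     # Cauchy-Schwarz
tri_ok = norm(add(x, y)) <= norm(x) + norm(y)     # triangle inequality

# Equality case: y2 = 2x is a positive multiple of x.
y2 = [2 * a for a in x]
cs_gap = norm(x) * norm(y2) - abs(inner(x, y2))
tri_gap = norm(x) + norm(y2) - norm(add(x, y2))

print(cs_ok, tri_ok)    # both inequalities hold for this pair
print(cs_gap, tri_gap)  # both gaps vanish up to rounding
```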

We see that \|\cdot\| satisfies all the familiar properties of a norm:

  1. \forall x \in G . \|x\| \geq 0 and \|x\| = 0 \Rightarrow x = 0.
  2. \forall x \in G . \forall a \in \mathbb{C} . \|a x \| = |a| \|x\|.
  3. \forall x,y \in G . \|x+y\| \leq \|x\|+ \|y\|.

It follows that \|\cdot \| induces a metric d on G by

d(x,y) = \|x-y\|.

Definition 4: G is said to be a Hilbert space if it is complete with respect to the (metric induced by the) norm. 

Recall that (in this setting) a sequence \{x_n\}_n in G is said to be a Cauchy sequence if \|x_n - x_m\| \rightarrow 0 as m,n \rightarrow \infty, and that G is said to be complete if every Cauchy sequence actually converges to a point in G.


Examples: Let us review the examples we saw above.

  1. G = \mathbb{C}^n. This space is complete w.r.t. the norm (as one should know from a basic course in topology), so it is a Hilbert space.
  2. G = C[0,1]. This space is not complete w.r.t. the norm, so this inner product space is not a Hilbert space. Indeed, let f_n be the continuous function that is equal to zero between 0 and 1/2, equal to 1 between 1/2+ 1/n and 1, and linear on [1/2,1/2+1/n]. If m<n, then f_m - f_n vanishes outside [1/2, 1/2+1/m] and is bounded by 1 in absolute value, so \|f_m - f_n\|^2 < 1/m; thus this sequence is Cauchy. However, one can show that there is no continuous function f on the interval for which \|f_n - f\| \rightarrow 0.
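The squared distances between the functions f_n can be estimated numerically. Here is a minimal Python sketch using a midpoint rule; the step count and the helper names f and dist_sq are illustrative choices, not part of the notes.

```python
def f(n, t):
    """Piecewise-linear ramp: 0 on [0, 1/2], 1 on [1/2 + 1/n, 1]."""
    if t <= 0.5:
        return 0.0
    if t >= 0.5 + 1.0 / n:
        return 1.0
    return n * (t - 0.5)

def dist_sq(m, n, steps=100_000):
    """Midpoint-rule approximation of ||f_m - f_n||^2 on [0, 1]."""
    h = 1.0 / steps
    return h * sum((f(m, (k + 0.5) * h) - f(n, (k + 0.5) * h)) ** 2
                   for k in range(steps))

# Compare the squared distance with 1/m for a few pairs m < n.
for m, n in [(2, 4), (4, 16), (10, 1000)]:
    print(m, n, dist_sq(m, n), 1.0 / m)
```

For m < n a direct computation gives \|f_m - f_n\|^2 = (1 - m/n)^2/(3m), so the printed values should sit below 1/(3m), and in particular below 1/m.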

Example: This is the fundamental example of a Hilbert space. Let \ell^2(\mathbb{N}) (or, for short, \ell^2) denote the set of all square summable sequences of complex numbers. That is

\ell^2 = \{x=(x_n) \in \mathbb{C}^\mathbb{N} : \sum_{n=0}^\infty |x_n|^2 < \infty \}.

We endow \ell^2 with the natural vector space operations, and with the inner product (x,y) = \sum x_n \overline{y}_n. The norm then has to be \|x\| = \sqrt{\sum |x_n|^2}.

There are several things to prove here. First, we have to show that \ell^2 is a vector space; second, we have to show that our definition of the inner product makes sense; third, we have to show that \ell^2 is complete.

The only non-trivial thing to check with regard to \ell^2 being a vector space is that if x, y \in \ell^2, then x+ y \in \ell^2. Using the triangle inequality in \mathbb{C}^{n+1} we find

\sum_{k=0}^n |x_k + y_k|^2 \leq (\sqrt{\sum_{k=0}^n|x_k|^2} + \sqrt{\sum_{k=0}^n|y_k|^2})^2 \leq (\|x\|+\|y\|)^2 < \infty

for all n. Letting n \rightarrow \infty we see that x + y \in \ell^2.  In a similar manner, one uses the Cauchy-Schwarz inequality in \mathbb{C}^n to verify that \sum x_k \overline{y}_k converges absolutely for all x,y \in \ell^2, and thus the inner product is well defined.
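To see these estimates at work, here is a small Python sketch with two concrete sequences, x_k = 1/(k+1) and y_k = i^k/(k+1), both of which have squared norm \sum 1/(k+1)^2 = \pi^2/6. The sequences, the cutoff 10000 and the variable names are assumptions made for illustration.

```python
import math

UNITS = [1 + 0j, 1j, -1 + 0j, -1j]  # the powers of i, indexed by k mod 4

def x(k):
    return 1.0 / (k + 1)

def y(k):
    return UNITS[k % 4] / (k + 1)

# ||x||^2 = ||y||^2 = pi^2 / 6, so (||x|| + ||y||)^2 = 4 * pi^2 / 6.
bound = (2 * math.sqrt(math.pi ** 2 / 6)) ** 2

partial_sum = 0.0     # partial sum of |x_k + y_k|^2
partial_inner = 0j    # partial sum of x_k * conj(y_k)
for k in range(10_000):
    partial_sum += abs(x(k) + y(k)) ** 2
    partial_inner += x(k) * y(k).conjugate()

print(partial_sum, bound)                     # partial sum stays below the bound
print(abs(partial_inner), math.pi ** 2 / 6)   # |partial inner product| <= ||x|| ||y||
```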

Suppose that \{x^{(n)}\}_n is a Cauchy sequence in \ell^2. That means that

\|x^{(n)} - x^{(m)}\|^2 = \sum_k |x_k^{(n)} - x_k^{(m)}|^2 \longrightarrow 0

as m,n\rightarrow \infty. In particular, for all k, we have that |x_k^{(n)} - x_k^{(m)}|^2 \longrightarrow 0, thus \{x_k^{(n)}\}_n is a Cauchy sequence of complex numbers. By completeness of \mathbb{C}, for each k there is some x_k such that \{x_k^{(n)}\}_n converges to x_k. Define x =(x_k)_k. We need to show that x \in \ell^2, and that \{x^{(n)}\}_n converges to x in the norm of \ell^2.

Now every Cauchy sequence in any metric space is bounded, so there is an M such that \|x^{(n)}\| \leq M for all n. Fix some \epsilon > 0, and choose some n_0 such that \|x^{(n)} - x^{(m)}\| < \epsilon for all m,n \geq n_0. Then for every integer N,

\sum_{k=0}^N |x^{(n)}_k - x^{(m)}_k|^2 < \epsilon^2

for all m,n \geq n_0. Letting m \rightarrow \infty, we find that

\sum_{k=0}^N |x^{(n)}_k - x_k|^2 \leq \epsilon^2

for all n \geq n_0, and therefore \sum_{k=0}^N |x_k|^2 \leq (M+\epsilon)^2. Letting N be arbitrarily large, we conclude that x \in \ell^2 and that \|x^{(n)} - x\| \leq \epsilon for all n \geq n_0. Since \epsilon was arbitrary, we also get that x^{(n)} \rightarrow x. That completes the proof that \ell^2 is a Hilbert space.
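Note that the proof really used the Cauchy property in the norm of \ell^2, and not just convergence in every coordinate. A small sketch of why the latter is not enough: the standard basis vectors e_n converge to 0 in every coordinate, yet \|e_n - e_m\| = \sqrt{2} whenever n \neq m, so the sequence is not Cauchy. The representation of finitely supported sequences as Python dictionaries, and the helper names e and dist, are illustrative choices.

```python
import math

def e(n):
    """Standard basis vector e_n of l^2, stored as {index: value}."""
    return {n: 1.0 + 0j}

def dist(x, y):
    """l^2 distance between two finitely supported sequences."""
    support = set(x) | set(y)
    return math.sqrt(sum(abs(x.get(k, 0) - y.get(k, 0)) ** 2 for k in support))

# The 0th coordinates of e_1, ..., e_5 are all zero ...
coordinate_0 = [e(n).get(0, 0) for n in range(1, 6)]

# ... but consecutive basis vectors stay at distance sqrt(2).
distances = [dist(e(n), e(n + 1)) for n in range(5)]

print(coordinate_0)
print(distances)
```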

Remark: It is convenient at times to consider the close relatives of \ell^2 defined as follows. Let S be a set. Then we define \ell^2(S) to be

\ell^2(S):= \{f:S \rightarrow \mathbb{C} : \sum_{s \in S} |f(s)|^2 < \infty\}

with the inner product (f,g) = \sum_{s \in S} f(s) \overline{g(s)}. (In case S is uncountable then one has to be careful how this sum is defined. This issue will be discussed later on).

2. Completion

Some of you may have forgotten what an important role the completeness property of \mathbb{R} played in the first real analysis courses (hint: it was probably used nine out of every ten times in which some series or sequence was proved to be convergent, but you had no clue as to what it was converging to). In infinite dimensional analysis it is very hard to get far without completeness. Most of the great theorems (but not all!) depend crucially on completeness of the spaces involved. That, together with the example of C[0,1] above, is the reason for the importance of the following theorem.

Theorem 5: Let G be an inner product space. There exists a Hilbert space H, and a linear map V: G \rightarrow H such that

  1. For all g,h \in G, (V(g),V(h))_H = (g,h)
  2. V(G) is dense in H

If H' is another Hilbert space and V' : G \rightarrow H' is a linear map satisfying the above two conditions, then there is a bijective linear map U: H \rightarrow H' such that (U(x),U(y))_{H'} = (x,y)_H  for all x,y \in H and U(V(g)) = V'(g) for all g \in G.

Remarks: 

  1. The Hilbert space H is said to be the completion of G
  2. The last assertion of the theorem says that H is essentially unique, up to the obvious isomorphism appropriate for Hilbert spaces.
  3. Since V preserves all of the structure that G enjoys, one usually identifies G with V(G) and then the theorem is stated as follows: There exists a unique Hilbert space H which contains G as a dense subset.

Proof: Let H_0 be the vector space consisting of all Cauchy sequences in G. To be precise, let G^\mathbb{N} be the vector space of all sequences in G

G^\mathbb{N} = \{\{x_n\}_{n=0}^\infty : x_n \in G \textrm{ for all } n=0,1,\ldots\}

with the usual addition and scalar multiplication: if  x = \{x_n\}_{n=0}^\infty and y = \{y_n\}_{n=0}^\infty, then

cx + y = \{cx_n + y_n\}_{n=0}^\infty .

Now let H_0 be the linear subspace of G^\mathbb{N} consisting of all Cauchy sequences (one checks that it is indeed a linear subspace). Our goal now is to play around with H_0 until it becomes the sought after Hilbert space H. First thing let us try to define an inner product on H_0. If  x = \{x_n\}_{n=0}^\infty and y = \{y_n\}_{n=0}^\infty, then we define

[x,y] := \lim_{n\rightarrow \infty} (x_n,y_n) .

The limit really does exist: every Cauchy sequence (in any metric space) is bounded, so the identity

(*) (x_n,y_n) - (x_m,y_m) = (x_n,y_n-y_m) + (x_n-x_m,y_m)

together with Cauchy-Schwarz implies that \{(x_n,y_n)\} is a Cauchy sequence of complex numbers. It is easy to check that the form [\cdot,\cdot] satisfies conditions 2,3 and 4 of Definition 1. However, it does not satisfy condition 1, so it does not make H_0 into an inner product space.
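For the record, here is how the estimate goes, together with a concrete witness for the failure of condition 1. This is only a sketch: M denotes a constant (introduced here) with \|x_n\| \leq M and \|y_n\| \leq M for all n, which exists because Cauchy sequences are bounded, and g is any nonzero vector in G.

```latex
\begin{align*}
|(x_n,y_n) - (x_m,y_m)|
  &\leq |(x_n, y_n - y_m)| + |(x_n - x_m, y_m)| \\
  &\leq \|x_n\| \|y_n - y_m\| + \|x_n - x_m\| \|y_m\| \\
  &\leq M \left( \|y_n - y_m\| + \|x_n - x_m\| \right)
  \longrightarrow 0 \quad \text{as } m,n \rightarrow \infty .
\end{align*}

% Failure of condition 1: take x_n = g/(n+1) with g \neq 0.
% Then x = \{x_n\} is a nonzero element of H_0, but
\[
[x,x] = \lim_{n \rightarrow \infty} \frac{\|g\|^2}{(n+1)^2} = 0 .
\]
```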

We define an equivalence relation (check that it is) on H_0 by x \sim y if and only if x_n - y_n \rightarrow 0 (in other words if and only if [x-y,x-y] = 0), and we denote by H the set of all equivalence classes. H can also be thought of as the quotient of H_0 by the linear subspace of sequences that converge to 0. It is well known (and easy to show) that H is a vector space, and now we will make it into a Hilbert space. Let us denote by \dot{x} the equivalence class of x. By an identity similar to (*) above, the form (\dot{x}, \dot{y})_H = [x,y] is well defined (i.e., does not depend on the representative), and it follows that (\cdot, \cdot)_H inherits from [\cdot, \cdot] properties 2, 3 and 4 of the definition of inner product. (An alternative way to see that (\cdot, \cdot)_H is well defined is to use the fact that [\cdot, \cdot], like any form satisfying 2, 3 and 4 of Definition 1, satisfies the Cauchy-Schwarz inequality – which you were asked to prove right after Theorem 2 above.) The form (\cdot, \cdot)_H also satisfies condition 1, since if [x,x] = 0 then x is equivalent to the zero element of H_0, hence \dot{x} = 0. To show that H is a Hilbert space it remains to show that it is complete. I’ll put in some details in the next few paragraphs, but it is a good idea for students to try to skip the next paragraphs (up to the next section) and fill in the details themselves.

Suppose that \{\dot{x}^{(n)}\}_n is a Cauchy sequence in H. As always with Cauchy sequences, to show that it converges to a limit, it suffices to show that a subsequence converges to a limit. Thus we may assume without loss of generality that \|\dot{x}^{(n)} - \dot{x}^{(m)}\| < 2^{-m} for all n \geq m.

To show that  \{\dot{x}^{(n)}\}_n converges in H, we will construct a y \in H_0 for which \dot{x}^{(n)} \rightarrow \dot{y}. Choose a representative x^{(n)} for \dot{x}^{(n)} for all n. We are free to choose any representative we wish, so let us use this freedom. Note that if k is any integer, x = \{x_n\}_{n=0}^\infty and x' = \{x_{n+k}\}_{n=0}^\infty, then x \sim x'. Therefore, we may always modify a representative by discarding as many initial terms in the sequence as we need without changing the equivalence class. So we choose x^{(n)} in such a way that \|x^{(n)}_k - x^{(n)}_l\| < 2^{-n} for all k,l.

Now define y \in G^\mathbb{N} by y_k = x^{(k)}_k. This sequence y is actually Cauchy, that is, y \in H_0. Indeed, let l \geq k. Since \|\dot{x}^{(k)} - \dot{x}^{(l)}\|_H = \lim_{m \rightarrow \infty} \|x^{(k)}_m - x^{(l)}_m\| < 2^{-k}, there is some m such that \|x^{(k)}_m - x^{(l)}_m\|< 2^{-k}. Then

\|y_k - y_l \| = \|x_k^{(k)} - x_l^{(l)}\|,

and

\|x_k^{(k)} - x_l^{(l)} \| \leq \|x_k^{(k)} - x_m^{(k)}\| + \|x_m^{(k)} - x_m^{(l)}\| + \|x_m^{(l)} - x_l^{(l)}\|.

Now \|x_k^{(k)} - x^{(k)}_m\|<2^{-k} and \|x_m^{(l)} - x_l^{(l)}\| < 2^{-l} because of the way in which we chose the representatives. The middle term is smaller than 2^{-k} by choice of m, thus

\|x^{(k)}_k - x_l^{(l)} \| < 3 \cdot 2^{-k},

which shows that y is a Cauchy sequence. To see that \dot{x}^{(n)} \rightarrow \dot{y}, we check, similarly to the above, that

\|\dot{x}^{(n)} - \dot{y}\|_H = \lim_{k\rightarrow \infty}\|x^{(n)}_k - x_k^{(k)} \| \leq 3\cdot 2^{-n} .

Now we define V: G \rightarrow H by letting V(g) be \dot{x}, where x is the sequence x = \{x_n\}_n with x_n = g for all n. Then it is clear that V : G \rightarrow H is well defined and linear and that (V(g), V(h))_H = (g,h) for all g,h \in G. Moreover, if \dot{x} \in H and \|x_n - x_m\| < \epsilon for all m,n \geq N, then for g = x_N we have \|V(g) - \dot{x}\| = \lim_{n \rightarrow \infty} \|x_N - x_n\| \leq \epsilon. So V(G) is dense in H.

We now come to the final assertion. Let H' and V' be as described in the theorem. We define a map U : V(G) \rightarrow V'(G) by U(V(g)) = V'(g). From the properties of V, V', it follows that U is a linear map and that

(U(V(g)), U(V(h)))_{H'} = (g,h) = (V(g), V(h))_H.

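It follows that U is an isometry (i.e., a metric preserving map) between dense subsets of two complete metric spaces, and thus it extends uniquely to an isometry (which we also denote by U) from H into H'. By continuity, the extended map is linear and preserves inner products. It is also surjective: its range is complete, hence closed in H', and it contains the dense subspace V'(G). That completes the proof.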

3. The space PC[a,b]

Let PC[a,b]  denote the space of piece-wise continuous functions on the interval [a,b], and define the following form:

(f,g)= \int_a^b f(t) \overline{g(t)} dt .

(The integral we require is the standard (complex valued) Riemann integral or equivalently the Darboux integral). Note that this isn’t exactly an inner product on PC[a,b] because it may happen that (f,f) = 0 for a function  f which is not identically zero; for example if f(x) \neq 0 at finitely many points and zero elsewhere. However, this problem is easily fixed, because

Exercise A: \|f\| = 0 if and only if the set where f is nonzero is finite.

So to fix the problem, say that f \sim g if f and g are equal at all points of [a,b] except finitely many. The set of equivalence classes may then be given the structure of an inner product space, with

(\dot{f},\dot{g})= \int_a^b f(t) \overline{g(t)} dt .

It is customary to denote the set of equivalence classes of piece-wise continuous functions also as PC[a,b], and to consider elements of this space as functions, with the provision of declaring two functions to be “equal” if they are equivalent.
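To get a feeling for why changing a function at finitely many points does not change its equivalence class, here is a small Python sketch. It compares f(t) = t on [0,1] with a function g that differs from f only at t = 1/2, and computes left-endpoint Riemann sums for \int_0^1 |f - g|^2 dt that are forced to sample the exceptional point. The functions, the value 100 and the helper name tagged_sq_dist are arbitrary choices made for illustration.

```python
def f(t):
    return t

def g(t):
    # Same as f except at the single point t = 1/2 (an arbitrary choice).
    return 100.0 if t == 0.5 else t

def tagged_sq_dist(steps):
    """Left-endpoint Riemann sum for the integral of |f - g|^2 over [0, 1].

    For even `steps` the tag k / steps hits t = 1/2 exactly, so the
    exceptional point is always sampled.
    """
    h = 1.0 / steps
    return h * sum(abs(f(k / steps) - g(k / steps)) ** 2 for k in range(steps))

for steps in [10, 1_000, 100_000]:
    print(steps, tagged_sq_dist(steps))
```

The single exceptional point contributes |f(1/2) - g(1/2)|^2 times the mesh width, so the sums shrink to 0 as the partition is refined, in agreement with \|f - g\| = 0.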

Exercise B: Show that for every f \in PC[a,b], there is at most one g \in C[a,b] in its equivalence class. Thus, our new equality in PC[a,b] is consistent with the equality of functions in C[a,b].

Exercise C: Prove that C[a,b] is dense in PC[a,b].

4. The space L^2[a,b]

By Theorem 5, every inner product space can be completed to a Hilbert space, and when working with infinite dimensional inner product spaces it is often advantageous to pass to the completion. Let us look at the most important example of this.

In many problems in classical analysis or mathematical physics, such as Fourier series, differential equations, or integral equations, it seems that the natural space of interest is C[a,b] or PC[a,b]. Experience has led mathematicians to feel that it is helpful to introduce on these spaces the inner product

(f,g)= \int_a^b f(t) \overline{g(t)} dt

and to use the induced norm as a measure of size in this space. However, neither of these spaces is complete (we have seen that C[a,b] is not).

Exercise D: Show that with respect to the above inner product PC[a,b] is not complete.

Consider the inner product space PC[a,b] with the above inner product. By Theorem 5, PC[a,b] has a unique completion; denote the completion by L^2[a,b].

Exercise E: Prove that C[a,b] is dense in L^2[a,b], and hence deduce that L^2[a,b] is also “equal” to the abstract completion of C[a,b] (as a first step, clarify to yourself in what sense C[a,b] is “in” L^2[a,b]).

Now, L^2[a,b] was defined in Theorem 5 rather abstractly, as the quotient space of a space of sequences of elements in PC[a,b]. The space L^2[a,b] can be defined in a completely different manner, in rather concrete function-theoretic terms, as the space of square integrable Lebesgue measurable functions on the interval. This is the way this space is defined in a course in measure theory (in fact this is how it is usually defined). After one defines the space L^2[a,b], one can prove that it is complete, and that the continuous functions (or the piece-wise continuous functions) are dense in this space. The uniqueness part of Theorem 5 then promises us that we get the same Hilbert space as we do from the construction we carried out above. Note that we did not get any measure theoretic result for free: it is the theorems in measure theory (completeness of L^2 and density of the piece-wise continuous functions in it) that allow us to conclude that the measure theoretic construction and the abstract construction give rise to the “same” Hilbert space.

At this point we do not require any knowledge from measure theory; rather, we will stick with our definition of L^2[a,b] as an abstract completion, and we will derive some of its function theoretic nature from this. This approach is good enough for many applications of the theory.

Let us decide that we will call every element f \in L^2[a,b] a function. Since PC[a,b] \subset L^2[a,b], it is clear that we will be identifying some functions – we were already identifying functions at the level of PC[a,b], and since we added new functions we are sure to have at least as many identifications (in fact, we will have many more). Now, if f \in L^2[a,b] we cannot really say what is the value of f at a point x \in [a,b] (since already in PC[a,b] we cannot do this), but f has some other function-like aspects.

First of all, f is “square integrable” on [a,b]. To be precise, we define the integral of |f|^2 on [a,b] to be

\int_a^b |f(t)|^2 dt := (f,f) .

We see that if f and g are functions in L^2[a,b], then we consider f and g as being equal if \int_a^b |f(t) - g(t)|^2 dt = 0.

Second, f is actually “integrable” on [a,b], that is, we can define

\int_a^b f(t) dt := (f,1) ,

where 1 is the constant function 1 on [a,b], which is in PC[a,b], and therefore in L^2[a,b].

Lastly, we can define the integral of f on every sub-interval [c,d] \subseteq [a,b]:

\int_c^d f(t) dt := (f, 1_{[c,d]}),

where 1_{[c,d]} is the piece-wise continuous function that is equal to 1 on [c,d] and zero elsewhere.

All these definitions are consistent with the definitions of integral of piece-wise continuous functions. Moreover,  for a continuous function f, if we know \int_c^d f(t) dt for every interval [c,d] \subseteq [a,b] then we can completely recover f; for a piece-wise continuous function f, if we know \int_c^d f(t) dt for every interval [c,d] \subseteq [a,b] then we can recover the equivalence class of f in PC[a,b]. It follows (with some work) that if f \in L^2[a,b] then the collection of quantities (f, 1_{[c,d]}) for all c,d determines f, and consequently we can say that a function in L^2[a,b] is uniquely determined by its integrals over all intervals.
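For a continuous function the recovery is completely explicit: f(c) is the limit of the averages \frac{1}{h} \int_c^{c+h} f(t) dt as h \rightarrow 0. The following Python sketch illustrates this for f(t) = e^{it}; the function, the point c = 0.3, the midpoint rule and the helper name integrate are assumptions made for the example.

```python
import cmath

def integrate(f, c, d, n=2_000):
    """Midpoint-rule approximation of the integral of f over [c, d]."""
    h = (d - c) / n
    return h * sum(f(c + (k + 0.5) * h) for k in range(n))

def f(t):
    return cmath.exp(1j * t)

c = 0.3
errors = []
for h in [0.1, 0.01, 0.001]:
    average = integrate(f, c, c + h) / h   # (1/h) * integral over [c, c + h]
    errors.append(abs(average - f(c)))

print(errors)  # the errors shrink as the interval shrinks
```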

Exercise F: Show that a function in L^2[a,b] is completely determined by the values of its integrals over sub-intervals of [a,b].

You are meant to prove the above exercise using only our definition of L^2[a,b] as the abstract completion of PC[a,b] and the subsequent definition of integral over a sub-interval. It is also a fact in measure theory, that any “function” f \in L^2[a,b] is nothing more and nothing less than the totality of values \int_c^d f(t) dt.

5. The spaces L^2(K)

In a way similar to the above, if K is a subset in \mathbb{R}^n of the form K = [a_1, b_1] \times \cdots \times [a_n,b_n], then we define L^2(K) to be the completion of the space C(K) with respect to the inner product

(f,g) = \int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(t_1, \ldots, t_n) \overline{g(t_1, \ldots, t_n)} d t_1 \cdots d t_n .

The details are similar to the one dimensional case and we do not dwell on them.
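As a small illustration in the case n = 2, the following Python sketch approximates the inner product on C(K) for K = [0,1] \times [0,1] by a two dimensional midpoint rule. The grid size, the sample functions and the helper name inner_2d are illustrative assumptions.

```python
def inner_2d(f, g, a1, b1, a2, b2, n=200):
    """Midpoint-rule approximation of (f, g) on [a1, b1] x [a2, b2]."""
    h1 = (b1 - a1) / n
    h2 = (b2 - a2) / n
    total = 0j
    for i in range(n):
        s = a1 + (i + 0.5) * h1
        for j in range(n):
            t = a2 + (j + 0.5) * h2
            total += f(s, t) * complex(g(s, t)).conjugate()
    return total * h1 * h2

def f(s, t):
    return s + 1j * t

def one(s, t):
    return 1.0

print(inner_2d(f, one, 0, 1, 0, 1))  # close to 1/2 + i/2
print(inner_2d(f, f, 0, 1, 0, 1))    # close to 2/3
```

Here (f,1) = \int_0^1 \int_0^1 (s + it) ds dt = 1/2 + i/2 and (f,f) = \int_0^1 \int_0^1 (s^2 + t^2) ds dt = 2/3.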

L^2 spaces can be defined on spaces of a more general nature, but that is best done with some measure theory. The reason that we are taking our rather unusual approach is that students are allowed to take the course “Measure Theory” concurrently with “Advanced Analysis”, so we are trying in the beginning to get along with nothing from measure theory. But it is also interesting for me to see how far we get with only basic analysis and “pure” Hilbert space methods.
