Advanced Analysis, Notes 7: Banach spaces (dual spaces and duality, Lp spaces, the double dual, quotient spaces)

by Orr Shalit

Today we continue our treatment of the dual space X^* of a normed space (usually Banach) X. We start by considering a wide class of Banach spaces and their duals. 

1. L^p spaces

Perhaps the best studied class of examples of infinite dimensional Banach spaces which are not Hilbert spaces are the L^p spaces. In this section we collect some important facts about L^p spaces which we shall use to illustrate some results or in applications to analysis.

In this section, we will assume some experience with measure theory.

Let (X, \mathcal{M}, \mu) be a measure space. Recall that this means that X is some set, \mathcal{M} is a \sigma-algebra of subsets of X (this just means that \mathcal{M} is a collection of subsets of X that contains X and is closed under taking complements and countable unions) and \mu : \mathcal{M} \rightarrow [0,\infty] is a function that satisfies:

  1. \mu(\emptyset) = 0, and
  2. If \{E_i\} is a countable sequence of disjoint sets in \mathcal{M}, then \mu (\cup_i E_i) = \sum_i \mu( E_i).

The function \mu is called a measure, and elements of \mathcal{M} are called measurable sets. All our measure spaces will be assumed \sigma-finite, meaning that there is a sequence \{X_n\} of measurable sets such that X = \cup X_n and \mu(X_n)< \infty for all n.

For every measurable function f and every p\in [1, \infty), we define

\|f\|_p = (\int |f|^p d \mu )^{1/p}.

We also define

\|f\|_\infty = \inf\{a : \mu(\{|f|>a\}) = 0\} .

For p \in [1, \infty] we define L^p = L^p(\mu) = L^p(X, \mathcal{M}, \mu) to be the set of all measurable functions f : X \rightarrow \mathbb{C} for which \|f\|_p < \infty. Of course, we will identify functions which are equal almost everywhere. The spaces \ell^p arise as L^p spaces when X = \mathbb{N} and \mu is the counting measure.
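As a concrete instance of these definitions, here is how the norms read for the counting measure on \mathbb{N}. The sample sequence x_n = 1/n is an illustrative choice of ours, not part of the notes.

```latex
% Counting measure on \mathbb{N}: integrals become sums, and the only null set is the empty set.
\[
\|x\|_p = \Big(\sum_{n=1}^\infty |x_n|^p\Big)^{1/p} \quad (1 \le p < \infty),
\qquad
\|x\|_\infty = \sup_n |x_n| .
\]
% Illustrative sequence x_n = 1/n:
\[
\|x\|_1 = \sum_{n=1}^\infty \frac{1}{n} = \infty, \qquad
\|x\|_2 = \Big(\sum_{n=1}^\infty \frac{1}{n^2}\Big)^{1/2} = \frac{\pi}{\sqrt{6}}, \qquad
\|x\|_\infty = 1 .
\]
% Hence x belongs to \ell^2 and to \ell^\infty, but not to \ell^1.
```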

Definition: A function f : X \rightarrow \mathbb{C} is said to be simple if it has the form

f = \sum_{k=1}^N a_k \chi_{E_k}

where a_1, \ldots, a_N \in \mathbb{C}, E_1, \ldots, E_N \in \mathcal{M} all have finite measure, and \chi_{E_k} denotes the characteristic function of E_k.

We shall denote by \mathcal{S} the vector space of all simple functions. Clearly, \mathcal{S} is contained in L^p for all p. It is a standard and important fact, which we shall use below, that if p<\infty, then \mathcal{S} is dense in L^p.

Theorem 1: For 1 \leq p \leq \infty, L^p is a Banach space. For p<\infty, the space of simple functions of finite measure support is dense in L^p. The set of simple functions (where the sets E_k are now allowed to have infinite measure) is dense in L^\infty.

Proof: We prove the theorem for 1\leq p < \infty; p = \infty is left as an exercise. By Minkowski’s inequality L^p is a normed space, and the issue we dwell on is completeness.

We will use the fact that a normed space is complete if and only if every absolutely convergent series in the space is convergent (see the homework exercises). Thus, it suffices to show that if f_k are elements of L^p such that \sum \|f_k\|_p< \infty, then \sum f_k converges in the norm of L^p to an element of this space.

Let \{f_k\} be such a sequence. For every N, define

F_N = \sum_{k=1}^N |f_k| .

The sequence F_N converges everywhere to a function F = \sum |f_k| that has values in [0,\infty]. For all N,

\|F_N\|_p \leq \sum_{k=1}^N \|f_k\|_p \leq \sum_{k=1}^\infty \|f_k\|_p < \infty .

By the monotone convergence theorem,

\|F\|_p = \lim_{N\rightarrow \infty}\|F_N\|_p \leq \sum_{k=1}^\infty \|f_k\|_p < \infty

thus F \in L^p. In particular, the function F = \sum_{k=1}^\infty |f_k| is finite (meaning that its value is not \infty) almost everywhere. It follows that the series \sum f_k(x) converges absolutely for almost every x \in X, hence it defines (almost everywhere) a measurable function f = \sum_{k=1}^\infty f_k. Since |f| \leq F, we have that f \in L^p.

Now we will show that f = \lim_{N \rightarrow \infty}\sum_{k=1}^N f_k in the norm of L^p. For this we have to show that

(*)\int_X |f - \sum_{k=1}^N f_k |^p d \mu \rightarrow 0.

But |f - \sum_{k=1}^N f_k |^p \rightarrow 0 pointwise almost everywhere. On the other hand,

|f-\sum_{k=1}^N f_k|^p = |\sum_{k=N+1}^\infty f_k|^p \leq F^p \in L^1

for all N. Thus (*) follows from the dominated convergence theorem. That completes the proof.

Given p \in [1, \infty], it is customary to denote q = \frac{p}{p-1} (with the conventions q = \infty when p = 1 and q = 1 when p = \infty), so that 1/p + 1/q = 1. The number q is called the conjugate exponent of p.
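A few sample values may help fix the notation. The particular exponents below are illustrative choices of ours. The two identities at the end follow directly from 1/p + 1/q = 1 and are the ones that make the norm computations further down work.

```latex
% Sample conjugate pairs (illustrative):
\[
p = 2 \;\Rightarrow\; q = 2, \qquad
p = 3 \;\Rightarrow\; q = \tfrac{3}{2}, \qquad
p = 4 \;\Rightarrow\; q = \tfrac{4}{3}, \qquad
p = 1 \;\Rightarrow\; q = \infty \ \text{(by convention)} .
\]
% For 1 < p < \infty, multiplying 1/p + 1/q = 1 by pq gives p + q = pq, hence
\[
(q-1)\,p = q \qquad \text{and} \qquad (p-1)\,q = p .
\]
```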

Theorem 2 (Holder’s inequality): For all measurable functions f and g, \|fg\|_1 \leq \|f\|_p\|g\|_q.

For a proof, see this.
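To see the inequality at work, here is a small numerical instance on a two-point space with the counting measure. The functions f and g are hypothetical examples chosen only for illustration.

```latex
% X = \{1, 2\} with counting measure, f = (1, 2), g = (3, 4) (illustrative choice).
\[
\|fg\|_1 = 1\cdot 3 + 2 \cdot 4 = 11 .
\]
% Case p = q = 2 (the Cauchy-Schwarz inequality):
\[
\|f\|_2 \, \|g\|_2 = \sqrt{5}\cdot 5 \approx 11.18 \;\geq\; 11 .
\]
% Case p = 1, q = \infty:
\[
\|f\|_1 \, \|g\|_\infty = 3 \cdot 4 = 12 \;\geq\; 11 .
\]
```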

It is immediate from Holder’s inequality that every g \in L^q gives rise to a functional \phi_g \in (L^p)^* by way of

\phi_g(f) = \int fg d \mu .

By Holder’s inequality, \|\phi_g\| \leq \|g\|_q. A little more effort shows that \|\phi_g\| = \|g\|_q (this will be shown below). It is a deeper fact (to be shown further down below) that g \mapsto \phi_g is actually an isometric isomorphism from L^q onto (L^p)^*.

Theorem 3: Let p \in [1, \infty). Every continuous linear functional on L^p is of the form \phi_g for some g \in L^q.

We may summarize the above discussion as (L^p)^* = L^q for all p \in [1, \infty). The rest of this section is devoted to proving this fact. The proof will follow pretty much the proof from Folland’s book “Real Analysis”. We fix a \sigma-finite measure space (X, \mu), and let p,q \in [1,\infty] be related by p^{-1}+q^{-1} = 1.

Lemma: For all g \in L^q, \|\phi_g\| = \|g\|_q.

Proof: We already know that \|\phi_g\| \leq \|g\|_q, so it remains to show the reverse. We may assume that g \neq 0.

Assume first that q< \infty. Define f = \|g\|_q^{1-q}|g|^{q-1}\overline{sign(g)}. Here sign is the function that returns for every nonzero complex number z = re^{it} (with r > 0) the phase e^{it}, and sign(0) = 0. One checks readily that f \in L^p and that \|f\|_p = 1. Thus

\|\phi_g\| \geq |\int f g d \mu | = \|g\|_q,

so \|\phi_g\| \geq \|g\|_q.
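The check that this f has norm one and that it norms \phi_g can be written out as follows. This is a sketch of the routine computation, using (q-1)p = q and \overline{sign(g)} g = |g|.

```latex
% Case 1 < q < \infty, so that (q-1)p = q and (1-q)p = -q:
\begin{align*}
\int |f|^p \, d\mu
  &= \|g\|_q^{(1-q)p} \int |g|^{(q-1)p} \, d\mu
   = \|g\|_q^{-q} \int |g|^{q} \, d\mu = 1 , \\
\int f g \, d\mu
  &= \|g\|_q^{1-q} \int |g|^{q-1} \, \overline{\mathrm{sign}(g)} \, g \, d\mu
   = \|g\|_q^{1-q} \int |g|^{q} \, d\mu = \|g\|_q .
\end{align*}
% Case q = 1 (so p = \infty): f = \overline{\mathrm{sign}(g)}, and since g \neq 0,
\[
\|f\|_\infty = 1, \qquad \int f g \, d\mu = \int |g| \, d\mu = \|g\|_1 .
\]
```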

Now we treat the case q = \infty. Fix \epsilon > 0. The set \{x: |g(x)| > \|g\|_\infty - \epsilon\} has nonzero measure. By \sigma-finiteness, it has a subset A such that 0 < \mu(A) < \infty. Let f = \mu(A)^{-1} \overline{sign(g)}\chi_A. Then f \in L^1, and \|f\|_1 = 1. Moreover,

\|\phi_g\| \geq |\int f g | \geq \|g\|_\infty - \epsilon.

Since this holds for all \epsilon > 0, we find \|\phi_g\| \geq \|g\|_\infty. The other inequality has already been established, thus \|\phi_g\| = \|g\|_\infty.

Lemma: Let g : X \rightarrow \mathbb{C} be measurable and assume that

M_q(g) := \sup \{\int|fg| d \mu : f \in \mathcal{S}, \|f\|_p=1 \}

satisfies M_q(g) < \infty. Then g \in L^q and M_q(g) = \|g\|_q.

Proof: By Holder’s inequality, M_q(g) \leq \|g\|_q, so we will only prove \|g\|_q \leq M_q(g) (and this will also give us g \in L^q). We treat the cases q< \infty and q = \infty separately.

The case q<\infty:

There is a sequence of simple functions g_n satisfying |g_n| \leq |g| and g_n \rightarrow g pointwise (such a sequence, which is easy to construct by hand, is constructed in any course on measure theory). Define f_n = \|g_n\|_q^{1-q}|g_n|^{q-1}. Then f_n \in \mathcal{S}, \|f_n\|_p \leq 1. Then

\|g\|_q\leq \liminf \|g_n\|_q = \liminf \int |f_n g_n| d \mu \leq M_q(g)

where the first inequality follows from Fatou’s Lemma.
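The chain of inequalities can be unpacked as follows. This is a sketch that assumes g_n \neq 0 and that each g_n is supported on a set of finite measure (which \sigma-finiteness allows us to arrange). For q = 1 we read f_n as the characteristic function of \{g_n \neq 0\}, which is our interpretation of the formula.

```latex
% Step 1: the norm of g_n is recovered by integrating against f_n (here 1 < q < \infty).
\[
\int |f_n g_n| \, d\mu
  = \|g_n\|_q^{1-q} \int |g_n|^{q} \, d\mu
  = \|g_n\|_q .
\]
% Step 2: since |g_n| \le |g|, f_n \in \mathcal{S} and \|f_n\|_p = 1,
\[
\int |f_n g_n| \, d\mu \;\le\; \int |f_n g| \, d\mu \;\le\; M_q(g) .
\]
% Step 3: Fatou's lemma applied to |g_n|^q \to |g|^q gives
\[
\int |g|^q \, d\mu \;\le\; \liminf_{n} \int |g_n|^q \, d\mu ,
\qquad \text{that is,} \qquad
\|g\|_q \;\le\; \liminf_n \|g_n\|_q \;\le\; M_q(g) .
\]
```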

The case q=\infty:

Let \epsilon > 0, and put A = \{x: |g(x)| \geq M_\infty(g) + \epsilon\}. We shall show that \mu(A) = 0, from which it would follow that \|g\|_\infty \leq M_\infty(g).

Indeed, if \mu(A) > 0, then (from \sigma-finiteness) there is B \subseteq A such that 0<\mu(B) <\infty. Define f = \mu(B)^{-1}\chi_B. Then \|f\|_1 = 1, and \int|fg| d \mu \geq M_\infty(g) + \epsilon > M_\infty(g); this is a contradiction to the very definition of M_\infty(g). Thus \mu(A) = 0, and the proof is complete.
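For completeness, the estimate behind the contradiction is the following one-line computation (a sketch, using only that |g| \geq M_\infty(g) + \epsilon on B \subseteq A).

```latex
\[
\int |f g| \, d\mu
  = \frac{1}{\mu(B)} \int_B |g| \, d\mu
  \;\ge\; \frac{1}{\mu(B)} \, \mu(B) \, \big( M_\infty(g) + \epsilon \big)
  = M_\infty(g) + \epsilon .
\]
```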

Lemma: With the notation of the previous lemma,

M_q(g) = \sup \{ |\int fg d \mu | : f \in \mathcal{S}, \|f\|_p = 1\}.

Proof: It is clear that the right hand side is less than or equal to M_q(g). For the reverse inequality, let \epsilon > 0, and let f \in \mathcal{S} have p-norm 1 such that

\int |f g| d \mu > M_q(g) - \epsilon.

Let h = |f| \overline{sign(g)}. Then |\int h g d \mu| = \int|hg| d \mu > M_q(g) - \epsilon. Note, however, that h is not necessarily simple. But there is a sequence of simple functions h_n \rightarrow h such that |h_n| \leq |h|. By the dominated convergence theorem, |\int h_n g d \mu| \rightarrow |\int hg d \mu | > M_q(g) - \epsilon. Since \|h_n\|_p \leq \|h\|_p = 1, normalizing h_n can only increase |\int h_n g d \mu|, and this establishes the required inequality.

Proof of Theorem 3: We already know that the map \Phi: L^q \rightarrow (L^p)^* given by \Phi(g) = \phi_g is isometric (here as above \phi_g is the functional \phi_g(f) = \int fg d \mu). It remains to show that this map is surjective. Recall that we are assuming that p<\infty; it is your job to find where we use this assumption.

Assume first that \mu is a finite measure. Let \phi \in (L^p)^*. We seek a function g \in L^q such that \phi = \phi_g. To obtain this function, define a function \nu : \mathcal{M} \rightarrow \mathbb{C} by

\nu(E) = \phi(\chi_E)

for E \in \mathcal{M}. By finiteness of \mu, the function \chi_E is in L^p so \nu is well defined. In fact, it turns out that \nu is a complex valued measure. Indeed, if E is the disjoint union \cup E_n, then \chi_E = \sum \chi_{E_n} pointwise and hence (by monotone convergence) in norm, whence

\nu(E) = \phi(\sum \chi_{E_n}) =\sum \phi(\chi_{E_n}) = \sum \nu(E_n).

Now if E has \mu-measure zero, then \chi_E = 0 in L^p, hence \nu(E) = \phi(\chi_E) = 0. Thus \nu \ll \mu, so by the Radon-Nikodym Theorem, it follows that there is some g \in L^1 such that \nu(E) = \int_E g d\mu for all E \in \mathcal{M}. From this we have that

(*)\phi(f) = \int fg d\mu

for all f \in \mathcal{S}. The lemma above implies that g \in L^q and that \|g\|_q \leq \|\phi\|. Examining (*) again, we obtain by density of \mathcal{S} that \phi = \phi_g.
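The two steps used here, passing from characteristic functions to simple functions and then from simple functions to all of L^p, can be spelled out as follows. This is a sketch; the approximating sequence f_n is notation of ours.

```latex
% Step 1: linearity gives (*) on simple functions f = \sum_{k=1}^N a_k \chi_{E_k}.
\[
\phi(f) = \sum_{k=1}^N a_k \, \phi(\chi_{E_k})
        = \sum_{k=1}^N a_k \, \nu(E_k)
        = \sum_{k=1}^N a_k \int_{E_k} g \, d\mu
        = \int f g \, d\mu .
\]
% Step 2: once g \in L^q is known, both \phi and \phi_g are continuous on L^p.
% For f \in L^p choose f_n \in \mathcal{S} with \|f_n - f\|_p \to 0 (possible since p < \infty). Then
\[
\phi(f) = \lim_{n \to \infty} \phi(f_n)
        = \lim_{n \to \infty} \phi_g(f_n)
        = \phi_g(f) .
\]
```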

Finally, it remains to prove the theorem for a \sigma-finite measure. As the heart of the proof is behind us, this is left as an exercise for the reader.

2. Duality and the double dual

Corollary 14 in the previous lecture is a generalization of the fact that in an inner product space, if a vector x is orthogonal to all other vectors in the space then x = 0. This suggests the following notation. In some texts, when considering the action of f \in X^* on x \in X, the notation \langle f,x\rangle is used instead of f(x). This invites one to think of the relationship between X and X^* as something more symmetrical. Sometimes, elements of the dual space X^* are denoted as x^*. Let’s use this notation for a little while to make it familiar.

One can learn many things about a Banach space X from its dual X^*. For example:

Exercise A: If X^* is separable, then X is separable, too.

Example: The example of \ell^\infty = (\ell^1)^* shows that the converse is false. In addition, the exercise shows that \ell^1 \neq (\ell^\infty)^*.

Let S \subseteq X. We denote S^\perp : = \{x^* \in X^* : \forall x \in S . \langle x^*, x \rangle = 0 \}. If T \subseteq X^* then we denote ^\perp T = \{x \in X : \forall x^* \in T . \langle x^*, x \rangle = 0 \}. The spaces S^\perp and ^\perp T are called the annihilator of S and the pre-annihilator of T, respectively. With this notation we can state Corollary 16 from the previous lecture as follows: A subspace M contains x in its closure if and only if x \in ^\perp( M^\perp). In other words, \overline{M} = ^\perp (M^\perp).

Exercise B: True or false: (^\perp N )^\perp = \overline{N}?

Every x \in X induces a function \hat{x} \in X^{**} = (X^*)^* by way of

\hat{x}(x^*) = \langle x^*, x \rangle \,\, , \,\, x^* \in X^*.

For every normed space Y, we denote by Y_1 the (norm) closed unit ball of Y.

Proposition 4: The map x \mapsto \hat{x} is an isometry from X into X^{**}.

Proof: Linearity is trivial. To see that the map is norm preserving,

\|\hat{x}\| =\sup_{x^* \in X^*_1} |\langle x^*,x\rangle| = \|x\|,

where for the second equality we used Corollary 13 from the last lecture.

Definition 5: A Banach space X is said to be reflexive if the map x \mapsto \hat{x} is an isometry of X onto X^{**}.

Example: L^p is reflexive for all p \in (1,\infty). L^1[0,1], \ell^1, L^\infty[0,1] and \ell^\infty are examples of non-reflexive spaces (as Exercise A and the example following it show). It follows that none of these spaces is isomorphic to an L^p space for 1<p<\infty.
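The reflexivity of L^p for 1 < p < \infty can be derived from Theorem 3 applied twice. The following is a sketch of the standard argument; the names \Phi_q and \Lambda are notation of ours.

```latex
% Notation (ours): \Phi_q : L^q \to (L^p)^* is the map g \mapsto \phi_g of Theorem 3.
% Since 1 < p, q < \infty, Theorem 3 applies both to L^p and to L^q.
% Let \Lambda \in (L^p)^{**}. Then \Lambda \circ \Phi_q is a bounded functional on L^q,
% so by Theorem 3 (applied to L^q) there is f \in L^p such that
\[
\Lambda(\phi_g) = \int g f \, d\mu = \phi_g(f) = \hat{f}(\phi_g)
\qquad \text{for all } g \in L^q .
\]
% By Theorem 3 (applied to L^p), every element of (L^p)^* is of the form \phi_g,
% hence \Lambda = \hat{f}, and x \mapsto \hat{x} is onto.
```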

3. Quotient spaces

Let X be a normed space, and let M \subseteq X be a closed subspace. In linear algebra we learn how to form the quotient space X/M, which is defined to be the set of all cosets x + M, x \in X, with the operations

(x + M) + (y + M) = x+y+M .

c(x+M) = cx + M .

It is known and also easy to show that these operations are well defined, and give X/M the structure of a vector space. Our goal now is to make X/M into a normed space, and to prove that when X is complete, so is X/M.

To abbreviate, let us write \dot{x} for x + M. We define

\|\dot{x}\| = d(x,M) = \inf \{\|x-m\|: m \in M\}.

Let us denote by \pi the quotient map X \rightarrow X/M.
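A small finite dimensional example may make the quotient norm concrete. The choice of X and M below is ours, purely for illustration.

```latex
% X = \mathbb{R}^2 with the Euclidean norm, M = \{(t, 0) : t \in \mathbb{R}\} (illustrative choice).
\[
\|(a,b) + M\| = \inf_{t \in \mathbb{R}} \sqrt{(a - t)^2 + b^2} = |b| ,
\]
% with the infimum attained at t = a. Thus (a,b) + M \mapsto b identifies X/M isometrically
% with \mathbb{R}, and the quotient map is a contraction:
\[
\|\pi(a,b)\| = |b| \;\le\; \sqrt{a^2 + b^2} = \|(a,b)\| .
\]
```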

Theorem 6: With the above defined norm, the following hold:

  1. X/M is a normed space.
  2. \pi is a contraction: \|\pi(x) \| \leq \|x\| for all x.
  3. For every y \in X/M such that \|y\|<1,  there is an x \in X with \|x\|<1 such that \pi(x) = y.
  4. U is open in X/M if and only if \pi^{-1}(U) is open in X.
  5. If F is a closed subspace containing M, then \pi(F) is closed.
  6. If X is complete, then so is X/M.

Proof: 1. \|c\dot{x}\| = |c|\|\dot{x}\| is trivial. \|\dot{x}\| = 0 \Leftrightarrow \dot{x} = 0 follows from the fact that M is closed. For every x,y \in X, let m,n \in M be such that \|x-m\| and \|y-n\| are very close to \|\dot{x}\| and \|\dot{y}\|, respectively. Then

\|\dot{x} + \dot{y}\| \leq \|x+y - m -n \| \leq \|x-m\| + \|y-n\| \ ,

and the right hand side is very close to \|\dot{x}\| + \|\dot{y}\|, so \|\dot{x} + \dot{y}\| can only be very slightly bigger than \|\dot{x}\| + \|\dot{y}\|. That proves the triangle inequality.
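The phrase "very close" can be made precise with an explicit \epsilon. The following is a sketch of the same argument in quantitative form.

```latex
% Fix \epsilon > 0 and choose m, n \in M with
\[
\|x - m\| < \|\dot{x}\| + \epsilon , \qquad \|y - n\| < \|\dot{y}\| + \epsilon .
\]
% Since m + n \in M, the definition of the quotient norm gives
\[
\|\dot{x} + \dot{y}\|
  \;\le\; \|x + y - (m + n)\|
  \;\le\; \|x - m\| + \|y - n\|
  \;<\; \|\dot{x}\| + \|\dot{y}\| + 2\epsilon .
\]
% Letting \epsilon \to 0 yields \|\dot{x} + \dot{y}\| \le \|\dot{x}\| + \|\dot{y}\|.
```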

Exercise C: Complete the proof of the theorem.

Remark: Note that by 4 of the above theorem, the topology induced by the quotient norm is precisely the quotient topology which one defines in topology.

Theorem 7: Let X be a normed space and M \subseteq X a closed subspace. Then

  1. M^* is isometrically isomorphic to X^*/M^\perp.
  2. (X/M)^* is isometrically isomorphic to M^\perp.

Proof: 1. Define T : M^* \rightarrow X^*/M^\perp by

T(m^*) = x^* + M^\perp ,

where x^* is any extension of m^* to X. This is well defined, because if y^* is another extension of m^* then x^* - y^* \in M^\perp. Well-definedness implies linearity (think about it). T is also surjective, since T(x^*\big|_{M}) = x^* + M^\perp for all x^* \in X^*. It remains to prove that T is isometric.

Let m^* \in M^* and let x^* be an extension. Then \|T(m^*)\| = \inf \{\|x^*+n^*\|:n^* \in M^\perp\}. But each functional x^*+n^* extends m^*, so \|x^*+n^*\| \geq \|m^*\|, whence \|T(m^*)\| \geq \|m^*\|. On the other hand, as n^* ranges over all n^* \in M^\perp, x^* + n^* ranges over all extensions of m^*. By Hahn-Banach, there is some extension, say y^* = x^* + n_1^*, such that \|y^*\| = \|m^*\|. Thus the infimum is attained and \|T m^*\| = \|m^*\|.

2. Define a linear map \pi^*: (X/M)^* \rightarrow X^* by

\pi^*(y^*) = y^* \circ \pi .

It is obvious that R(\pi^*) \subseteq M^\perp, since \pi vanishes on M. We need to show that \pi^* is a surjective isometry.

By 2 of Theorem 6, \|\pi^*(y^*)\|= \|y^*\circ \pi\| \leq \|y^*\|\|\pi\| \leq \|y^*\|. On the other hand,

\|\pi^*(y^*)\| = \sup \{|\langle y^* \circ \pi, x\rangle|: \|x\|<1\} = \sup\{|\langle y^* , \pi(x)\rangle|: \|x\|<1\} .

Using 2 and 3 of Theorem 6, \pi maps the open unit ball of X onto the open unit ball of X/M, so the right hand side is equal to

\sup\{|\langle y^* , y\rangle|: \|y\|<1\} = \|y^*\|.

It remains to show that the range of \pi^* is equal to M^\perp. Let x^* \in M^\perp. Then N(x^*) \supseteq M. It follows that if x_1 - x_2 \in M, then \langle x^*, x_1 \rangle = \langle x^*, x_2 \rangle, so we may define a functional f on X/M by

f(\dot{x}) = x^*(x).

By definition x^* = f \circ \pi. The kernel of f is equal to \pi(N(x^*)), and is closed by Theorem 6. By Exercise F in Notes 6, f is continuous. That completes the proof.

4. The adjoint of an operator

The idea of the proof in 2 of Theorem 7 — the construction of the map \pi^* — is one of general applicability.

Let T \in B(X,Y). Then we define T^* : Y^* \rightarrow X^* by way of

T^* y^* = y^* \circ T .
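In finite dimensions this definition reduces to the transpose of a matrix. The following sketch assumes X = \mathbb{C}^n and Y = \mathbb{C}^m (with any norms), and identifies a functional with the vector of its coefficients; the symbols A and \eta are illustrative notation of ours.

```latex
% Assumptions (ours): T x = A x for an m \times n matrix A, and a functional y^* on \mathbb{C}^m
% is written as y^*(y) = \eta^{T} y = \sum_{i=1}^m \eta_i y_i for some \eta \in \mathbb{C}^m.
\[
(T^* y^*)(x) = y^*(T x) = \eta^{T} (A x) = (A^{T} \eta)^{T} x .
\]
% So T^* y^* is the functional with coefficient vector A^{T} \eta: under this bilinear
% identification the adjoint is the transpose (no complex conjugation appears).
```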

Theorem 8: Let X and Y be normed spaces, and let T \in B(X,Y). Then the above defined T^* satisfies:

  1. T^* \in B(Y^*,X^*), so T^* is linear and bounded. 
  2. T^* satisfies \langle y^*, Tx \rangle = \langle T^* y^*, x \rangle for all x \in X, y^* \in Y^*. Moreover, T^* is the unique function from Y^* to X^* with this property.
  3. \|T^*\| = \|T\|.

Exercise D: Prove Theorem 8.

Theorem 9: Let X and Y be normed spaces, and let T \in B(X,Y). Then N(T^*) = R(T)^\perp and N(T) = ^\perp R(T^*).

Proof: Thanks to the notation, the proof is exactly the same as in the Hilbert space case (note, though, that both assertions require proof, and do not follow one from the other by conjugation).
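For readers who want the details, both assertions unwind directly from the definition T^* y^* = y^* \circ T. The following is a sketch; the second chain uses the Hahn-Banach theorem in the form "a vector annihilated by every bounded functional is zero".

```latex
% First assertion: N(T^*) = R(T)^{\perp}.
\[
y^* \in N(T^*)
  \iff y^* \circ T = 0
  \iff \langle y^*, T x \rangle = 0 \ \ \forall x \in X
  \iff y^* \in R(T)^{\perp} .
\]
% Second assertion: N(T) = {}^{\perp} R(T^*).
\[
x \in N(T)
  \iff T x = 0
  \iff \langle y^*, T x \rangle = 0 \ \ \forall y^* \in Y^*
  \iff \langle T^* y^*, x \rangle = 0 \ \ \forall y^* \in Y^*
  \iff x \in {}^{\perp} R(T^*) .
\]
% The second equivalence in the second chain is where Hahn-Banach is used.
```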

Corollary 10: For T \in B(X,Y), ^\perp N(T^*) = \overline{R(T)}.

Proof: This follows from Theorem 9 together with Corollary 16 from the previous lecture.

Exercise E: Give an example of an operator for which N(T)^\perp \neq \overline{R(T^*)}.