Functional Analysis – Introduction. Part I

by Orr Shalit

I begin by making a certain point clear. Functional analysis is an enormous branch of mathematics, so big that it does not seem appropriate to call it “a branch”; it sometimes looks more like another tree. When I talk below about functional analysis, I will mean “textbook functional analysis” and not “research functional analysis”. By this I mean that I will refer only to the core of the theory, which is several decades old and which is more-or-less agreed to be the essential and basic part of the subject.

The goal of this post is to serve as an introduction to the course “Advanced Analysis, 201.2.5401”, which is a basic graduate course on (textbook) functional analysis. In the lectures I will only have time to give a limited description of the roots of the subject, and the motivation will have to be brief. Here I will aim to describe what the climate was in which this tree grew, where its roots are, and what its fruits are.

To prepare this introduction I am relying on the following sources. First and foremost, my love of the subject and my point of view on it were strongly shaped by my teachers, and in particular by Boris Paneah (my Master’s thesis advisor) and Baruch Solel (my PhD thesis advisor). Second, I learned a lot about the subject from the book “Mathematical Thought from Ancient to Modern Times” by M. Kline and from the notes sections of Rudin’s and Reed-Simon’s books “Functional Analysis”.

And a warning to the kids: this is a blog, not a book, and if you really want to learn something go read the books (the books I mentioned have precise references).

The climate around 1900

The turn of the century marked a movement towards an abstractization of mathematics: groups, linear spaces, fields and metric spaces were defined around this time and studied as abstract objects.  There were several reasons for this abstractization. One was the reconstruction that analysis underwent in the 19th century, which opened the door for consideration of “general” functions and their properties. Another was Cantor’s set theory, which is the basis on which abstract structures can be defined. Perhaps another reason for the abstractization was that during the 19th century mathematics grew so big that it had to be re-organized or else it would explode.

Maybe the most dramatic cause for the abstractization in mathematics was the parting of mathematicians and Truth. The discovery of non-Euclidean geometry, as one example, or the logic paradoxes, as a completely different example, convinced many that mathematics deals not with absolute truths, but rather studies certain axiomatic systems. Defining a metric on the “space” of infinite sequences and studying the resulting “geometry”, something unthinkable a hundred years before, became a natural idea that was quickly adopted.

The atmosphere was appropriate for gleaning the common structure that pervaded several problems of mathematical analysis, and for slowly arranging it, during the years 1900 to 1930, into what today lies at the basis of the subject. The problems that had the most direct impact on the development of the theory came from the calculus of variations and from integral equations.

The calculus of variations

In the calculus of variations one seeks to minimize (usually under some additional constraint) an expression such as:

J(f) = \int_a^b F(t,f(t),f'(t)) dt ,

where F is some given function of three variables and a<b are fixed numbers. The objective here is to find a function f which minimizes the expression J(f) (under the given constraints). J can be considered as a function that takes functions as arguments and returns real numbers as values, and, following Hadamard, is usually referred to as a functional. (Incidentally, Hadamard’s student Levy seems to be the one who coined the term functional analysis; he understood it to mean the study of functionals, so the term is somewhat dated.)

As a simple example, consider F(t,x,y) = \sqrt{1+y^2}. Then

J(f) = \int_a^b \sqrt{1+f'(t)^2} dt ,

thus the problem of minimizing J(f), subject to the constraints f(a) = a and f(b) = b, is easily recognized as the problem of finding the function f whose graph is the curve of minimal length joining the points (a,a) and (b,b).
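This minimizing property can be checked numerically. The following sketch (my own illustration, not taken from any of the sources above; the function names are made up) approximates J(f) by the midpoint rule, and compares the straight line with a perturbed curve through the same endpoints:

```python
import math

def J(f_prime, a=0.0, b=1.0, n=10_000):
    """Approximate J(f) = ∫_a^b sqrt(1 + f'(t)^2) dt by the midpoint rule."""
    h = (b - a) / n
    return sum(math.sqrt(1.0 + f_prime(a + (i + 0.5) * h) ** 2) * h
               for i in range(n))

# Straight line f(t) = t joining (0,0) and (1,1): f'(t) = 1, length sqrt(2).
straight = J(lambda t: 1.0)

# A perturbed curve f(t) = t + 0.3*sin(pi*t) with the same endpoints.
perturbed = J(lambda t: 1.0 + 0.3 * math.pi * math.cos(math.pi * t))

print(straight, perturbed)  # the straight line gives the smaller value of J
```

Any admissible perturbation with the same endpoints can only increase the value of the functional, which is exactly the statement that the straight line is the global minimizer.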

Such minimization problems have been studied for about 200 years (since 1696), and for good reasons: not only do these problems arise in many geometrical settings, they also turn out to govern classical mechanics. Already in the mid 18th century Euler and Lagrange knew how to solve many of these problems in practice, but there were still some difficulties (for example, regarding sufficient conditions for a function to be a local minimizer, and even more so regarding global solutions). It was only in the last decades of the 19th century that Volterra, Hadamard and others started to study functionals systematically and abstractly. The notions of continuity and differentiability of functionals were defined, and the goal was to obtain general conditions for the existence of extrema similar to those in the calculus of functions of a single variable. The earlier works were not free of difficulties, as the abstract theory of function spaces (the spaces on which these functionals were to operate) was not yet developed enough. In the thesis of Frechet a major step forward was made with the introduction of metric spaces (the terminology itself was introduced only later, by Hausdorff). Frechet not only introduced metric spaces quite abstractly (along with even more abstract spaces); he also studied (sequential) compactness and continuity, proved a generalization of Cantor’s nested interval theorem, proved the existence of extrema for continuous functionals on compact sets, and clarified the notion of differentiability of a functional. Like Volterra and Hadamard, Frechet thought of functions as points in a “space”, but he sharpened the understanding of what this “space” is.
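For reference (this classical equation is standard material, not quoted from the sources above): the condition that Euler and Lagrange derived says that a smooth minimizer f of J(f) = \int_a^b F(t,f(t),f'(t))\,dt must satisfy

```latex
\frac{\partial F}{\partial x}\bigl(t, f(t), f'(t)\bigr)
  \;-\; \frac{d}{dt}\,\frac{\partial F}{\partial y}\bigl(t, f(t), f'(t)\bigr) \;=\; 0 .
```

In the arc-length example F(t,x,y) = \sqrt{1+y^2}, the first term vanishes and the equation reduces to \frac{d}{dt}\bigl(f'(t)/\sqrt{1+f'(t)^2}\bigr) = 0, so f' is constant and the minimizer is a straight line, as expected.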

We see how the beginning of the theory of abstract (function) spaces stemmed from the calculus of variations. This is one root of functional analysis. There is another root, no less important, which contains a crucial notion about which we have said nothing so far: the notion of a linear operator.

Integral equations

An integral equation is an equation of the form

f(x) + \int_a^b k(x,t)f(t) dt = g(x)

(or some variant of this form) where g and k are given functions on [a,b] and [a,b] \times [a,b], respectively, and f is an unknown function to be determined. The function k is called the kernel of the equation. Let me note in passing that in the early days such an equation was called a functional equation, a term that today is reserved for something quite different. Many differential equations of mathematical physics (for example the equation of forced oscillations, or the equation for the electrostatic potential) can be reduced to integral equations, and this is the reason for the attention that these equations received during the 19th century (we will see an example of such an equation and its reduction to an integral equation in a later lecture). Integral equations aren’t only a means for solving differential equations – in some problems of mathematical physics one can directly derive the integral equation that governs the system (for example, Hilbert showed this in 1912 for problems in gas dynamics).

Specific integral equations were studied throughout the nineteenth century by mathematicians such as Abel, Liouville, Neumann and Poincaré. A general theory, dealing with general kernels, was kicked off by Volterra in the 1880s and then developed much further by Fredholm, Hilbert, and Schmidt in the first decade of the 20th century. Fredholm (in his 1903 paper) denoted

S_k f (x) = f(x) + \int_a^b k(x,t) f(t) dt

and manipulated the transformations S_k algebraically, in effect introducing the notion of an operator (today one usually denotes Kf (x) = \int_a^b k(x,t) f(t) dt and writes the integral equation as (I+K)f = g, where I is the identity operator). To this day a key theorem in the subject – the Fredholm alternative, which we shall learn in a future lecture – is named after him. Hilbert and his student Schmidt wrote (separately) a series of papers on the subject in which they came back to Fredholm’s equation from different angles, and introduced the notions of eigenvalues, eigenvectors, orthonormal systems and eigenvalue expansion in inner product spaces, though they worked at first in a rather concrete setting, and their main interest was in spaces of continuous functions. Since a (continuous) function is determined by its Fourier coefficients, and since the Fourier coefficients of continuous functions are square summable, Hilbert was led to consider the space \ell^2 , that is the space of all sequences x = \{x_n\}_{n=1}^\infty such that \sum_{n=1}^\infty |x_n|^2 < \infty. Hilbert also introduced the inner product on \ell^2, namely (x,y) = \sum x_n y_n. However, Hilbert did not seem to have the idea of a Hilbert space – he did not treat the sequences as points in an infinite dimensional space. This idea was pursued by Schmidt, who studied \ell^2 in a more systematic way, introduced the notation \|x\| = (\sum |x_n|^2)^{1/2} for the norm, and established the Cauchy-Schwarz, triangle and Bessel inequalities, the generalized Pythagorean theorem, and the completeness of the space. Schmidt also introduced the concept of a closed subspace of \ell^2, and proved (in essence) the orthogonal decomposition theorem.
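To see the operator viewpoint in action, here is a small numerical sketch (my own illustration, not from the sources above; the kernel k(x,t) = xt and the exact solution f(x) = x are chosen purely for demonstration). It discretizes (I+K)f = g on [0,1] by the midpoint rule and solves the resulting linear system:

```python
# Assumed setup: kernel k(x,t) = x*t on [0,1] and known solution f(x) = x,
# for which g(x) = f(x) + x * ∫_0^1 t·t dt = x + x/3 = 4x/3.
n = 50
h = 1.0 / n
xs = [(i + 0.5) * h for i in range(n)]  # midpoint quadrature nodes

def k(x, t):
    return x * t

g = [4.0 * x / 3.0 for x in xs]

# Discretize (I + K)f = g: A[i][j] = delta_ij + k(x_i, x_j) * h.
A = [[(1.0 if i == j else 0.0) + k(xs[i], xs[j]) * h for j in range(n)]
     for i in range(n)]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    A = [row[:] for row in A]
    b = b[:]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, m):
            factor = A[r][col] / A[col][col]
            for c in range(col, m):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        x[r] = (b[r] - sum(A[r][c] * x[c] for c in range(r + 1, m))) / A[r][r]
    return x

f = solve(A, g)
err = max(abs(f[i] - xs[i]) for i in range(n))
print(err)  # small: the recovered f agrees with f(x) = x at the nodes
```

The point of the exercise is that once the equation is written as (I+K)f = g, solving it becomes a question about inverting the operator I+K, and the discretization turns that question into plain linear algebra.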

Another major contributor to the birth of functional analysis is F. Riesz. It was Riesz who sought to solve the integral equation in the space of (Lebesgue) square integrable functions (today denoted as L^2[a,b]), and in the paper where he successfully achieves this goal (1907) he also establishes a one-to-one correspondence between \ell^2 and L^2[a,b] – probably the first isomorphism between Banach spaces ever discovered. The reason why Riesz was interested in obtaining this one-to-one correspondence was that he wanted to characterize the sequences which are (generalized) Fourier coefficients of a function in L^2[a,b] with respect to some orthonormal system. This characterization was also obtained by E. Fischer in the paper (1907) in which he proves that L^2[a,b] is complete, and is known today as the Riesz-Fischer theorem.

Afterwards Riesz set out to solve the integral equation (and also other problems) in other spaces of functions. In 1910 he introduced the spaces L^p[a,b] (1<p<\infty) and managed to recapture many of the results of Fredholm, Hilbert and Schmidt in these spaces. In doing so, he came very close to the theory of Banach spaces that we know today: he defined bounded operators, operator norms, continuous linear functionals, the dual of a space (and he proved that L^p[a,b] is the dual of L^q[a,b] for 1<q<\infty, where 1/p + 1/q = 1), and the dual operator. Riesz also adapted the notion of a completely continuous operator (what is today known as a compact operator) that was previously introduced by Hilbert.

Thus, by the 1920s there were enough results and examples to suggest that some reorganization was due. This was taken up by Banach, who set down the axiomatic theory of Banach spaces and their operator theory, and by von Neumann, who set down the axiomatic theory of Hilbert spaces and the operator algebras which act on them. These two mathematicians shaped functional analysis and basic operator theory pretty much as we recognize them today. Much of what we shall learn in this course they already knew. This is the point where the material for our course starts.

Parenthetical remark: The role of Lebesgue integration

Let me append this note: the appearance of Lebesgue integration in 1901 was probably crucial to the way functional analysis developed, since in the absence of a concrete and complete function space such as L^2[a,b] it is not clear that there would have been a natural path to follow from the analytic problems in integral equations to the notion of a Hilbert space.

The fruits of the general theory

Today students study compact operators on general Banach and Hilbert spaces in standard courses on functional analysis, and the problem of studying the solvability of integral equations – a challenging topic for the thesis of a genius PhD student a hundred years ago – can be given as an exercise to third year undergraduates. The general theory has given us an elegant and streamlined approach to the problems that gave rise to it. But is an alternative solution to a class of problems that preceded its inception the only triumph of functional analysis?

In the book that I cited above (published in 1972), M. Kline closes the chapter on functional analysis with the words (p. 1095):

Applications of functional analysis to the generalized moment problem, statistical mechanics, existence and uniqueness theorems in partial differential equations, and fixed point theorems have been and are being made. Functional analysis now plays a role in the calculus of variations and in the theory of representations of continuous compact groups. It is also involved in algebra, approximate computations, topology, and real variable theory. Despite this variety of applications, there has been a deplorable absence of new applications to the large problems of classical analysis. This failure has disappointed the founders of functional analysis.

I wish I understood better what is meant by this thought-provoking (albeit somewhat self-contradictory) remark. Let me make a list of fruits that will hopefully make things appear less deplorable.

  1. First, to set things straight, let me say that there are countless instances where researchers working on problems in mathematics applied ready-to-use functional analytic tools which they just took off the shelf (this even happened to me in my research, and I can tell you about examples when we meet). And in countless other instances, the tools were not exactly ready-to-use, but could be adapted. There are examples of this in many areas outside of functional analysis proper, such as approximation theory, partial differential equations, functional equations, computational geometry, abstract and concrete harmonic analysis, and more. We will see examples throughout this course. Of course, this usually doesn’t happen as cleanly as one might dream, and considerable set up is often required.
  2. That being said, there are also a great many instances where a certain proof of a certain result is known, but then somebody comes up with a functional analytic proof which is cleaner, shorter, or just different. (To cite an example outside my area of expertise: I think that the existence of harmonic differentials on compact Riemann surfaces is an example of this.) Having additional proofs of known theorems deepens our understanding and strengthens the unity within mathematics.
  3. One may ask: is there really a theorem outside of functional analysis (say, in some field of hard analysis) whose known proof uses functional analysis in an essential way, meaning that one simply cannot prove that theorem without the tools of the general theory? But this is like asking: is it really essential to take the plane to Australia, or can’t you just sail there like in the good old days? (After explaining why I think that the question which “one may ask” is silly, let me tell you that I actually find it extremely interesting! I don’t know if anybody has an answer to this question, but I have encountered in my research (together with John McCarthy) a situation where we can prove a result in matrix theory using operator theory on infinite dimensional spaces, and we really do not know if this can be avoided.)
  4. Sometimes the benefit of a theory is not only in the results but also in the language it provides. The language of operators on Banach spaces is natural for many problems in analysis (such as ergodic theory) and even for problems in engineering (for example control theory). Having the problems formulated in a unified and clean language makes them accessible to other mathematicians, suggests what the results should be and sometimes also suggests a way of solution (and sometimes point #1 above is applicable).
  5. It often happens that, once a problem is formulated in the language of functional analysis but no ready-to-use tool is available off the shelf, a researcher in, say, partial differential equations will find it more convenient to prove the needed result within functional analysis and then apply it to his original problem. Boris Paneah has told me that this has happened to him (it never happened to me, probably because I am an operator theorist; but it did happen that my co-authors and I had to prove a result within the theory we were trying to apply).
  6. The language provided by a general mathematical theory can also shape the way we view the world. The most striking example relevant to our topic is the formulation of quantum mechanics in terms of operators on Hilbert space (to be historically precise, the abstract formulation of Hilbert spaces was stimulated by early quantum mechanics. However, the abstract idea of a Hilbert space was already there).
  7. On the shoulders of functional analysis and operators on Hilbert spaces stands the theory of operator algebras – which opened up still new possibilities for formulation of theories in mathematical physics.
  8. The general point of view provided by functional analysis has shed new light on finite dimensional mathematics as well: geometry of finite dimensional spaces, matrix theory, linear algebra, numerical analysis, data analysis, and computer vision are examples where training in abstract functional analysis has proved beneficial.