The perfect Nullstellensatz
by Orr Shalit
Question: to what extent can we recover a polynomial from its zeros?
Our goal in this post is to give several answers to this question and its generalisations. In order to obtain elegant answers, we work over the complex field (e.g., there are many polynomials, such as $x^2 + 1$, that have no real zeros; the fact that they don't have real zeros tells us something about these polynomials, but there is no way to "recover" these polynomials from their non-existing zeros). We will write $\mathbb{C}[z]$ for the algebra of polynomials in one complex variable with complex coefficients, and consider every $p \in \mathbb{C}[z]$ as a function of the complex variable $z \in \mathbb{C}$. We will also write $\mathbb{C}[z_1, \ldots, z_d]$ for the algebra of polynomials in $d$ (commuting) variables, and think of its elements – at least initially – as functions of the variable $z = (z_1, \ldots, z_d) \in \mathbb{C}^d$.
Let us begin by recalling that, by the Fundamental Theorem of Algebra, every (one variable) polynomial decomposes into a product of linear factors. Thus, if we know the zeros of a polynomial, including their multiplicities, then we can determine the polynomial up to a multiplicative factor. Moreover, if all we know is that the zeros of some polynomial $p$ are $\lambda_1, \ldots, \lambda_k$, then we know that $p$ must have the form

(*) $\quad p(z) = c (z - \lambda_1)^{n_1} (z - \lambda_2)^{n_2} \cdots (z - \lambda_k)^{n_k},$

where $c$ is a nonzero constant and the exponents $n_1, \ldots, n_k$ can, in principle, be any positive integers.
Let us reformulate the above observation in a slightly different language, which generalizes well to the multivariable setting. If $p \in \mathbb{C}[z]$ is a polynomial, we write

$Z(p) = \{z \in \mathbb{C} : p(z) = 0\}$

for its zero set.
Every polynomial $p$ generates a principal ideal $(p) = p\,\mathbb{C}[z]$. Conversely, every ideal in $\mathbb{C}[z]$ is principal. For an ideal $I \trianglelefteq \mathbb{C}[z]$ we write

$Z(I) = \{z \in \mathbb{C} : p(z) = 0 \text{ for all } p \in I\}.$
If $I = (p)$, then $Z(I) = Z(p)$. Now, if we begin with a polynomial $p$ as in (*), and we are given a polynomial $q$ such that $Z(q) = Z(p)$, what can we say about $q$? Well, if we knew that the zeros of $q$ have the same multiplicities as those of $p$, then we would know that $q = cp$ for some nonzero scalar $c$, and in particular we would know that $q \in (p)$ (and vice versa, of course). However, in general, $Z(q) = Z(p)$ only implies that $q \in \big((z - \lambda_1)(z - \lambda_2) \cdots (z - \lambda_k)\big)$, which is usually larger than $(p)$. Note that if $q \in (p)$, then $Z(q) \supseteq Z(p)$, because $q$ is clearly equal to the product of $p$ and some other polynomial.
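To see the gap concretely, here is a small pure-Python sketch (our own illustration; the helper names `polydiv` and `in_principal_ideal` are hypothetical, and polynomials are coefficient lists $[a_0, a_1, \ldots]$). With $p(z) = z^2$ and $q(z) = z$ we have $Z(q) = Z(p) = \{0\}$, yet $q \notin (p)$, while $q^2 \in (p)$:

```python
# Polynomials as coefficient lists [a0, a1, ...]; membership in a principal
# ideal (p) is just a divisibility test, which we implement by long division.

def polydiv(num, den):
    """Polynomial long division; returns (quotient, remainder)."""
    rem = num[:]
    q = [0.0] * max(1, len(num) - len(den) + 1)
    for i in range(len(num) - len(den), -1, -1):
        q[i] = rem[i + len(den) - 1] / den[-1]
        for j, d in enumerate(den):
            rem[i + j] -= q[i] * d
    return q, rem[:len(den) - 1]

def in_principal_ideal(q, p):
    """q lies in (p) iff p divides q, i.e. the remainder vanishes."""
    _, r = polydiv(q, p)
    return all(abs(c) < 1e-9 for c in r)

p = [0.0, 0.0, 1.0]   # p(z) = z^2
q = [0.0, 1.0]        # q(z) = z
q2 = [0.0, 0.0, 1.0]  # q(z)^2 = z^2

print(in_principal_ideal(q, p))   # False: z is not in (z^2)
print(in_principal_ideal(q2, p))  # True:  z^2 is in (z^2), so z is in the radical
```

So the zero set alone cannot distinguish $q$ from $p$; only a power of $q$ lands in the ideal.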
Now let us consider the much richer case of polynomials in several commuting variables. For brevity, let us write $z$ for the vector variable $(z_1, \ldots, z_d)$, and let us write $\mathbb{C}[z] = \mathbb{C}[z_1, \ldots, z_d]$. Since this algebra is not a principal ideal domain (that's an easy exercise), it turns out to be more appropriate to talk about ideals rather than single polynomials. Let us define the zero locus of an ideal $I \trianglelefteq \mathbb{C}[z]$ similarly to the above:

$Z(I) = \{z \in \mathbb{C}^d : p(z) = 0 \text{ for all } p \in I\}.$
We also introduce the following notation: given a subset $X \subseteq \mathbb{C}^d$, we write

$I(X) = \{p \in \mathbb{C}[z] : p(x) = 0 \text{ for all } x \in X\}.$
Note that $I(X)$ is always an ideal.
The question now becomes: to what extent can we recover $I$ from $Z(I)$? A slightly different but related question is: what is the gap between $I$ and $I(Z(I))$? We know already from the one variable case that we cannot hope to fully recover an ideal from its zero locus, but it turns out that a rather satisfactory solution can be given.
Suppose that $q$ is a polynomial which is not necessarily contained in $I$, but such that $q^n \in I$ for some $n$ (think, for example, of $q(z) = z$ and $I = (z^2)$). Then since $q^n \in I$, we also have that $Z(I) \subseteq Z(q^n) = Z(q)$, so $q \in I(Z(I))$. So the ideal $I(Z(I))$ contains at least all polynomials $q$ such that $q^n \in I$ for some $n$.
Definition: Let $I \trianglelefteq \mathbb{C}[z]$. The radical of $I$ is the ideal

$\sqrt{I} = \mathrm{rad}(I) = \{q \in \mathbb{C}[z] : \text{there exists some } n \text{ such that } q^n \in I\}.$
(On the left hand side, $\sqrt{I}$ and $\mathrm{rad}(I)$ are two different commonly used notations for the radical.)
Exercise: The radical of an ideal is an ideal.
Theorem (Hilbert's Nullstellensatz): For every ideal $I \trianglelefteq \mathbb{C}[z_1, \ldots, z_d]$,

$I(Z(I)) = \sqrt{I}.$
Nullstellensatz means "theorem of the zero locus" in German, and we can all agree that this is an appropriate name for this theorem. We shall not prove this theorem; it is usually proved in a first or second graduate course in commutative algebra. It is a beautiful theorem, indeed, but it is not perfect. Below we shall obtain a perfect Nullstellensatz, that is, one in which the ideal is completely recovered from the zeros, with no need to take a radical. Of course, we will need to change the meaning of "zeros".
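As a quick illustration of the theorem (a standard example, not taken from the discussion above): for the ideal $I = (x^2, y)$ in two variables, the zero locus is a single point, and taking the radical is genuinely necessary:

```latex
I = (x^2,\, y) \trianglelefteq \mathbb{C}[x,y], \qquad
Z(I) = \{(0,0)\}, \qquad
I(Z(I)) = (x,\, y) = \sqrt{I} \supsetneq I.
```

Indeed, $x$ vanishes at $(0,0)$ and $x^2 \in I$, so $x \in \sqrt{I}$, but $x \notin I$.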
2. An introduction to noncommutative commutativity
My recent work in operator algebras and noncommutative analysis has led me, together with my collaborators Guy Salomon and Eli Shamovich, to discover another Nullstellensatz (actually, we have a couple of Nullstellensatze, but I'll tell you only about one). This result has already been known to some algebraists in one form or another – after we proved it, we found that it can be dug out of a paper of Eisenbud and Hochster – but does not seem to be well known. I will write the result and its proof in a language that I (and therefore, hopefully, anyone who's had some graduate commutative algebra) can understand and appreciate.
Let $M_n^d$ denote the set of all $d$-tuples of $n \times n$ matrices. We let $M^d = \sqcup_{n=1}^{\infty} M_n^d$ be the disjoint union of all $d$-tuples of matrices, where $n$ runs from $1$ to $\infty$. That is, we are looking at all $d$-tuples of matrices of all sizes. This set is referred to in some places as "the noncommutative universe". Elements of $M^d$ can be plugged into polynomials in $d$ noncommuting variables, and subsets of $M^d$ are where most of the action in "noncommutative function theory" takes place. We leave that story to be told another day.
Similarly, we let $CM_n^d \subseteq M_n^d$ denote the set of all commuting $d$-tuples of $n \times n$ matrices. Note that we can consider $M_n^d$ to be the space $\mathbb{C}^{d n^2}$, and then $CM_n^d$ is an algebraic variety in $\mathbb{C}^{d n^2}$ given as the joint zero locus of $\binom{d}{2} n^2$ quadratic equations in $d n^2$ variables. We let $CM^d = \sqcup_{n=1}^{\infty} CM_n^d$. Now we are looking at all commuting $d$-tuples of matrices of all sizes. This can be considered as the "commutative noncommutative universe", or the "free commutative universe". Another way of thinking about $CM^d$, is as the "noncommutative variety" cut out in $M^d$ by the equations (in noncommuting variables)

$x_i x_j - x_j x_i = 0, \qquad 1 \leq i < j \leq d.$
Points in $CM^d$ can be simply plugged into any polynomial $p \in \mathbb{C}[z_1, \ldots, z_d]$. For example, if $d = 2$ and $p(z_1, z_2) = z_1 z_2^2 + 5$, then for $X = (X_1, X_2) \in CM_n^2$, we put

$p(X) = X_1 X_2^2 + 5 I_n,$

where $I_n$ is the identity matrix of the same size as $X_1$ and $X_2$ (that is, if $X \in CM_n^2$, then the correct identity to use is $I_n$). In fact, points in $CM^d$ can be naturally identified with the space of finite dimensional representations of $\mathbb{C}[z_1, \ldots, z_d]$, by

$X \mapsto \pi_X, \qquad \pi_X(p) = p(X).$
(We shall use the word "representation" to mean a homomorphism of an algebra or ring into $M_n(\mathbb{C})$ for some $n$).
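To make the evaluation rule concrete, here is a minimal pure-Python sketch (our own illustration, with hypothetical helper names) that evaluates the sample polynomial $p(z_1, z_2) = z_1 z_2^2 + 5$ at a commuting pair of $2 \times 2$ matrices, replacing the constant term by $5 I_2$:

```python
# Plain nested-list matrix arithmetic, no libraries.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matadd(A, B):
    return [[A[i][j] + B[i][j] for j in range(len(A))] for i in range(len(A))]

def scal(c, A):
    return [[c * x for x in row] for row in A]

def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def p(X1, X2):
    # p(z1, z2) = z1 * z2^2 + 5, with the constant term becoming 5 * I_n
    n = len(X1)
    return matadd(matmul(X1, matmul(X2, X2)), scal(5.0, eye(n)))

# X1 and X2 commute: X2 = X1^2, a polynomial in X1
X1 = [[2.0, 1.0], [0.0, 2.0]]
X2 = [[4.0, 4.0], [0.0, 4.0]]

print(p(X1, X2))  # [[37.0, 80.0], [0.0, 37.0]]
```

The point evaluation $\pi_X : p \mapsto p(X)$ computed this way is exactly a $2$-dimensional representation of the polynomial algebra.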
Now, given an ideal $I \trianglelefteq \mathbb{C}[z_1, \ldots, z_d]$, we can consider its zero set in $CM^d$:

$Z_{CM^d}(I) = \{X \in CM^d : p(X) = 0 \text{ for all } p \in I\}.$
(We will omit the subscript $CM^d$ for brevity.) In the other direction, given a subset $S \subseteq CM^d$, we can define the ideal of functions that vanish on it:

$I(S) = \{p \in \mathbb{C}[z_1, \ldots, z_d] : p(X) = 0 \text{ for all } X \in S\}.$
Tautologically, for every ideal $I \trianglelefteq \mathbb{C}[z]$,

$I \subseteq I(Z(I)),$

because every polynomial in $I$ annihilates every tuple on which every polynomial in $I$ is zero, right? The beautiful (and maybe surprising) fact is that the converse inclusion holds as well.
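The following pure-Python sketch (our own illustration) shows, in the case $d = 1$, why the converse has a chance of holding: the matrix zero set remembers multiplicities. For $I = (z^2)$, the nilpotent Jordan block $J_2(0)$ belongs to $Z(I)$, but $q(z) = z$ does not vanish on it, so $q \notin I(Z(I))$ even though $q$ vanishes at every scalar zero of $I$:

```python
# Plain nested-list matrix multiplication, no libraries.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

J = [[0.0, 1.0], [0.0, 0.0]]   # the Jordan block J_2(0)

p_of_J = matmul(J, J)          # p(z) = z^2 evaluated at J
q_of_J = J                     # q(z) = z evaluated at J

print(p_of_J)  # [[0.0, 0.0], [0.0, 0.0]] -- J annihilates every element of I = (z^2)
print(q_of_J)  # [[0.0, 1.0], [0.0, 0.0]] -- but q does not vanish at J
```

So in the matrix sense there is no polynomial outside $(z^2)$ that vanishes on all of $Z((z^2))$; the radical has disappeared.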
3. The perfect Nullstellensatz – statement and full proof
We are now ready to state the free commutative Nullstellensatz. The following formulation is taken from Corollary 11.7 from the paper “Algebras of bounded noncommutative analytic functions on subvarieties of the noncommutative unit ball” by Guy Salomon, Eli Shamovich and myself (which I already advertised in an earlier blog post).
Theorem (free commutative Nullstellensatz): For every ideal $I \trianglelefteq \mathbb{C}[z_1, \ldots, z_d]$,

$I(Z(I)) = I.$
Proof: This proof should be accessible to someone who took a graduate course in commutative algebra (but not too long ago!). We shall split it into several steps, including some review of required material. Someone who is fluent in commutative algebra will be able to understand the proof by just reading the headlines of the steps, without going into the explanations. Recall that we are using the notation $\mathbb{C}[z] = \mathbb{C}[z_1, \ldots, z_d]$.
Step I: Changing the point of view slightly: what we shall prove is the following proposition.
Proposition: Let $I \trianglelefteq \mathbb{C}[z]$ and let $q \in \mathbb{C}[z]$. Suppose that $\pi(q) = 0$ for every unital representation $\pi$ of $\mathbb{C}[z]$ that annihilates $I$. Then $q \in I$.

Noting that:
- Representations of $\mathbb{C}[z]/I$ are precisely the representations of $\mathbb{C}[z]$ that annihilate $I$, and
- Representations of $\mathbb{C}[z]$ are precisely the point evaluations $\pi_X : p \mapsto p(X)$ at points $X \in CM^d$, thus
- Representations of $\mathbb{C}[z]/I$ are precisely the point evaluations at points in $Z(I)$,
we see that if we prove the proposition, we obtain that it means precisely that if $q \in I(Z(I))$ then $q \in I$, which is the direction of the Nullstellensatz that we need to prove.
Thus our goal is to prove the proposition.
Step II: A refresher on localization.
We shall require the notion of the localization of a ring. Let $R$ be a commutative ring with unit (any ring we shall consider henceforth will be commutative and with unit) and let $\mathfrak{m}$ be a maximal ideal in $R$. Define $S = R \setminus \mathfrak{m}$ (the complement – not quotient – of $\mathfrak{m}$ in $R$). The localization of $R$ at $\mathfrak{m}$ is a ring that is denoted as $R_{\mathfrak{m}}$ (or $S^{-1}R$) that contains "a copy of $R$" and in which, loosely speaking, all elements of $S$ are invertible. Thus, still loosely speaking, the localization is the ring formed from all fractions $\frac{r}{s}$ where $r \in R$ and $s \in S$.
More precisely, $R_{\mathfrak{m}}$ is the quotient of the set $R \times S$ by the equivalence relation

$(r, s) \sim (r', s')$ if and only if $u(r s' - r' s) = 0$ for some $u \in S$.
The equivalence class of a pair $(r, s)$ is written as $\frac{r}{s}$, and addition and multiplication are defined so as to agree with the usual formulas for addition and multiplication of fractions, that is,

$\frac{r}{s} + \frac{r'}{s'} = \frac{r s' + r' s}{s s'}, \qquad \frac{r}{s} \cdot \frac{r'}{s'} = \frac{r r'}{s s'}.$
We define a map $\varphi : R \to R_{\mathfrak{m}}$ by $\varphi(r) = \frac{r}{1}$. Clearly, $\frac{1}{1}$ is the unit of $R_{\mathfrak{m}}$, and $R_{\mathfrak{m}}$ is again a commutative ring with unit.
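A standard concrete example (our illustration): localizing $R = \mathbb{C}[z]$ at the maximal ideal $\mathfrak{m} = (z)$ gives the ring of rational functions that are defined at $0$:

```latex
R = \mathbb{C}[z], \qquad \mathfrak{m} = (z), \qquad S = \{ g \in \mathbb{C}[z] : g(0) \neq 0 \},
\qquad
R_{\mathfrak{m}} = \left\{ \tfrac{f}{g} : f, g \in \mathbb{C}[z],\ g(0) \neq 0 \right\}.
```

A fraction $f/g$ with $f(0) \neq 0$ is invertible here, so the non-units $\{f/g : f(0) = 0\}$ form the unique maximal ideal; this is an instance of Fact I below.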
We shall require the following two facts, which can be taken as exercises:
Fact I: The localization $R_{\mathfrak{m}}$ of $R$ at a maximal ideal $\mathfrak{m}$ is a local ring, that is, it is a ring with a unique maximal ideal.
Fact II: If $r \in R$ is such that $\varphi(r) = 0$ in $R_{\mathfrak{m}}$ for every maximal ideal $\mathfrak{m}$ in $R$, then $r = 0$.
As we briefly mentioned in Fact I above, we remind ourselves that a local ring is a ring that has a unique maximal ideal. A commutative ring $R$ is said to be Noetherian if every ideal in $R$ is finitely generated.
We shall also require the following theorem, which is not really an exercise. If $\mathfrak{m}$ is an ideal in a ring $R$, we write $\mathfrak{m}^n$ for the ideal generated by all elements of the form $m_1 m_2 \cdots m_n$, where $m_i \in \mathfrak{m}$ for all $i = 1, \ldots, n$.
Krull's intersection theorem: Let $R$ be a commutative Noetherian local ring with identity. If $\mathfrak{m}$ is the maximal ideal in $R$, then

$\bigcap_{n=1}^{\infty} \mathfrak{m}^n = \{0\}.$
Take it on faith for now (or see Wikipedia).
Step III: A lemma on local algebras.
Recall that a ring $A$ is said to be a $\mathbb{C}$-algebra (or an algebra over $\mathbb{C}$) if it is also a complex vector space. We say that an algebra is local if it is a local ring. A commutative algebra is said to be Noetherian if its underlying ring is a Noetherian ring.
Lemma: Let $A$ be a commutative Noetherian local $\mathbb{C}$-algebra with maximal ideal $\mathfrak{m}$ satisfying $A/\mathfrak{m} \cong \mathbb{C}$, and fix $a \in A$. Suppose that $\pi(a) = 0$ for every finite dimensional representation $\pi$ of $A$. Then $a = 0$.
Proof: First, note that $a \in \mathfrak{m}$, because the quotient $A/\mathfrak{m}$ is isomorphic to $\mathbb{C}$, so the quotient map $A \to A/\mathfrak{m}$ is a one dimensional representation and $a$ must be mapped to zero under this map. Since $A$ is Noetherian, $\mathfrak{m}$ is finitely generated, as is also every power $\mathfrak{m}^n$. It follows by induction that for every $n$, the algebra $A/\mathfrak{m}^n$ is a finite dimensional vector space (each quotient $\mathfrak{m}^{n-1}/\mathfrak{m}^n$ is a finitely generated module over $A/\mathfrak{m} \cong \mathbb{C}$, hence finite dimensional). Hence the quotient map $A \to A/\mathfrak{m}^n$ can also be considered as a finite dimensional representation, so it annihilates $a$. Thus $a \in \mathfrak{m}^n$ for all $n$. By Krull's intersection theorem, $a \in \bigcap_{n=1}^{\infty} \mathfrak{m}^n = \{0\}$, so $a = 0$.
Step IV and conclusion: proof of the proposition.
We now prove the above proposition, which, as explained in Step I, proves the free commutative Nullstellensatz. Let $I$ be an ideal in $\mathbb{C}[z]$, and let $q \in \mathbb{C}[z]$ be an element such that $\pi(q) = 0$ for every representation $\pi$ of $\mathbb{C}[z]$ that annihilates $I$; equivalently, $\rho(\dot{q}) = 0$ for every representation $\rho$ of $A = \mathbb{C}[z]/I$, where $\dot{q}$ denotes the image of $q$ in $A$. We wish to prove that $q \in I$, or equivalently, that $\dot{q} = 0$. By Fact II above, it suffices to show that $\varphi(\dot{q}) = 0$ in $A_{\mathfrak{m}}$ for every maximal ideal $\mathfrak{m}$ in $A$.
Now let $\mathfrak{m}$ be any maximal ideal in $A = \mathbb{C}[z]/I$. By the lemma of Step III (which is applicable, thanks to Fact I, since $A_{\mathfrak{m}}$ is a Noetherian local $\mathbb{C}$-algebra), $\varphi(\dot{q}) = 0$ if and only if its image under every finite dimensional representation of $A_{\mathfrak{m}}$ is zero. But every representation $\rho$ of $A_{\mathfrak{m}}$ gives rise to a representation $\rho \circ \varphi$ of $A$, which, by assumption, annihilates $\dot{q}$. It follows that $\varphi(\dot{q}) = 0$ in $A_{\mathfrak{m}}$ for every maximal ideal $\mathfrak{m}$ in $A$, whence (Fact II) $\dot{q} = 0$, and $q \in I$ as required. That concludes the proof.
Remark: The proof presented here is from my paper with Guy and Eli. I mentioned above that the theorem follows from the results in a paper of Eisenbud and Hochster. Our proof is simpler than theirs, but they prove more: our result says that $q \in I$ if $q(X) = 0$ for every $d$-tuple $X$ of commuting matrices that annihilates $I$, where in principle one might have to consider matrices of all sizes $n$. Eisenbud and Hochster's result implies that there exists some $N$ (depending on $I$, of course) such that, if $q(X) = 0$ for all $X \in Z(I)$ of size less than or equal to $N$, then $q \in I$. (If you are asking yourself why we are proving in our paper a weaker result than one that already appears in the literature, let me say that this theorem is a rather peripheral result in our paper, and serves a motivational and contextual purpose, rather than supporting the main line of investigation).
4. The perfect Nullstellensatz in the one variable case
We now treat the Theorem (the free commutative Nullstellensatz) in the case of one variable. This really should be understood by everyone. The short explanation is that matrix zeros of a polynomial determine not only the location of the zeros but also their multiplicity.
Let $p(z) = c(z - \lambda_1)^{n_1} \cdots (z - \lambda_k)^{n_k}$ be as in (*), and let $q \in I(Z((p)))$. So we know that $q(A) = 0$ for every square matrix $A$ that annihilates $p$ (that is, every $A$ such that $p(A) = 0$). Our goal is to understand why this is equivalent to $q$ belonging to the ideal generated by $p$. One direction is immediate: if $q = fp \in (p)$, and $p(A) = 0$, then $q(A) = f(A) p(A) = 0$.
In the other direction, we need to show that if $q(A) = 0$ for every $A$ with $p(A) = 0$, then $(z - \lambda_i)^{n_i}$ is a factor of $q$ for all $i = 1, \ldots, k$. Everything boils down to understanding how polynomials operate on Jordan blocks. Consider a Jordan block of size $m$ with eigenvalue $\lambda$,

$J = J_m(\lambda) = \begin{pmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix},$

and consider the polynomial $q$. Then one checks readily:
- $q(J)$ is invertible if and only if $q(\lambda) \neq 0$.
- $q(J) = 0$ if and only if $q(\lambda) = 0$ and the multiplicity of $\lambda$ as a zero of $q$ is at least $m$ (equivalently, $(z - \lambda)^m$ divides $q$).
It follows (assuming that $p$ has the form (*)) that $p(J_m(\lambda)) = 0$ if and only if $\lambda = \lambda_i$ for some $i$, and $m \leq n_i$. Since every matrix has a unique canonical Jordan form (up to a permutation of the blocks), we can understand precisely which matrices belong to $Z((p))$: it is those matrices whose Jordan blocks have eigenvalues in the set $\{\lambda_1, \ldots, \lambda_k\}$, each block with eigenvalue $\lambda_i$ being of size no bigger than $n_i$.
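One can check the block criterion numerically; in this pure-Python sketch (our own illustration, with the sample polynomial $p(z) = (z-1)^2(z-3)$, so $\lambda_1 = 1$, $n_1 = 2$, $\lambda_2 = 3$, $n_2 = 1$), the block $J_2(1)$ annihilates $p$ while the too-large block $J_3(1)$ does not:

```python
# Plain nested-list matrix arithmetic, no libraries.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def jordan(lam, m):
    """The m x m Jordan block J_m(lam)."""
    return [[lam if i == j else (1.0 if j == i + 1 else 0.0) for j in range(m)] for i in range(m)]

def shift(A, c):
    """A - c*I."""
    return [[A[i][j] - (c if i == j else 0.0) for j in range(len(A))] for i in range(len(A))]

def p_of(A):
    """p(z) = (z - 1)^2 (z - 3) evaluated at the matrix A."""
    B = shift(A, 1.0)
    return matmul(matmul(B, B), shift(A, 3.0))

def is_zero(A):
    return all(abs(x) < 1e-12 for row in A for x in row)

print(is_zero(p_of(jordan(1.0, 2))))  # True:  size 2 <= n_1 at lambda = 1
print(is_zero(p_of(jordan(1.0, 3))))  # False: size 3 > n_1, the block is too big
print(is_zero(p_of(jordan(3.0, 1))))  # True:  size 1 <= n_2 at lambda = 3
```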
So, if $q \in I(Z((p)))$, then $q(J) = 0$ for every Jordan block $J = J_m(\lambda_i)$ for which $m \leq n_i$ (and vice versa). So letting $J = J_{n_i}(\lambda_i)$, we see that $(z - \lambda_i)^{n_i}$ must be a factor of $q$, that is, $q$ has the form $q(z) = (z - \lambda_i)^{n_i} g(z)$. Since this holds for all $i = 1, \ldots, k$, we have that $q \in (p)$.
Remark: Note that the proof also shows that to conclude that $q \in (p)$, one needs to know only that $q(A) = 0$ for all $A \in Z((p))$ of size less than or equal to $N$, for $N = \max\{n_1, \ldots, n_k\}$.
5. Further questions
The beautiful theorem we proved raises two important questions:
- Why is it interesting (besides the plain reason that it is evidently interesting)? What questions does this kind of theorem help to answer?
- What does the set $CM^d$ of commuting tuples of matrices look like? In order for the above theorem to be "useful", we will need to understand this set well.
I hope to write two posts addressing these issues soon.
Added April 23:
Remark: I should also mention the following very well known observation, which also explains how evaluation on Jordan blocks can identify the zeros of a polynomial, including their multiplicity. If $J = J_m(\lambda)$ is a Jordan block:

$J = \begin{pmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix},$

and $f$ is an analytic function, then

$f(J) = \begin{pmatrix} f(\lambda) & f'(\lambda) & \frac{f''(\lambda)}{2!} & \cdots & \frac{f^{(m-1)}(\lambda)}{(m-1)!} \\ & f(\lambda) & f'(\lambda) & \ddots & \vdots \\ & & \ddots & \ddots & \frac{f''(\lambda)}{2!} \\ & & & f(\lambda) & f'(\lambda) \\ & & & & f(\lambda) \end{pmatrix}.$
This gives another point of view of the free Nullstellensatz in one variable.
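As a quick sanity check of the formula in the remark (our own illustration), take the polynomial $f(z) = z^3$ and the block $J = J_3(2)$: then $f(J) = J^3$ should have $f(2) = 8$ on the diagonal, $f'(2) = 12$ on the first superdiagonal, and $f''(2)/2! = 6$ in the corner:

```python
# Plain nested-list matrix multiplication, no libraries.

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

J = [[2.0, 1.0, 0.0],
     [0.0, 2.0, 1.0],
     [0.0, 0.0, 2.0]]          # the Jordan block J_3(2)

fJ = matmul(matmul(J, J), J)   # f(J) for f(z) = z^3

# f(2) = 8, f'(z) = 3z^2 gives f'(2) = 12, f''(z) = 6z gives f''(2)/2! = 6
print(fJ)  # [[8.0, 12.0, 6.0], [0.0, 8.0, 12.0], [0.0, 0.0, 8.0]]
```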