Lecture 20: Proof of Mazur's theorem (part 1)

$$ \DeclareMathOperator{\JH}{JH} \DeclareMathOperator{\div}{div} \newcommand{\bQ}{\mathbf{Q}} \let\ol\overline \newcommand{\bP}{\mathbf{P}} \newcommand{\fh}{\mathfrak{h}} \newcommand{\fp}{\mathfrak{p}} \newcommand{\bT}{\mathbf{T}} \newcommand{\bZ}{\mathbf{Z}} \newcommand{\fa}{\mathfrak{a}} \newcommand{\bF}{\mathbf{F}} $$

In this lecture, we begin the proof of Mazur's theorem: if $N$ is a prime greater than 7 and not 13 then no elliptic curve over $\bQ$ has a rational point of order $N$. We begin by analyzing $[0]-[\infty]$ as a point on $J_0(N)$. We show that it is a non-trivial torsion point of order dividing $N-1$ and compute the Hecke action on it. We then prove the theorem under the assumption that all eigenforms have rational coefficients. This hypothesis allows us to apply our criteria directly to quickly prove the theorem.

Our goal

Our purpose in the next two lectures is to prove the following result, due to Mazur.

Theorem. Let $N$ be a prime greater than 7 and not 13. Then no elliptic curve over $\bQ$ has a rational point of order $N$.

Recall that it is enough to construct a quotient $A$ of $J_0(N)$ such that $A(\bQ)$ has rank 0 and $0 \ne \infty$ in $A$. Furthermore, rank 0 is ensured if the Jordan--Holder constituents of $A[p](\ol{\bQ})$ are each trivial or cyclotomic.

Throughout today's lecture, $N$ is as in the theorem. The prime 13 is excluded because $X_0(13)$ has genus 0, and therefore $J_0(13)$ is trivial. For the $N$ allowed by the theorem, $X_0(N)$ has positive genus.
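As a sanity check on the excluded primes, the following sketch evaluates the standard genus formula for $X_0(N)$ with $N$ prime. The formula is not derived in these notes and is taken here as an assumption; the function name `genus_x0_prime` is ours.

```python
from fractions import Fraction


def genus_x0_prime(N):
    """Genus of X_0(N) for a prime N, via the standard genus formula.

    Assumption (not proved in the lecture):
        g = 1 + mu/12 - nu2/4 - nu3/3 - nu_inf/2,
    with index mu = N + 1, nu_inf = 2 cusps, and nu2, nu3 the numbers of
    elliptic points of order 2 and 3.
    """
    if N == 2:
        nu2, nu3 = 1, 0
    elif N == 3:
        nu2, nu3 = 0, 1
    else:
        nu2 = 2 if N % 4 == 1 else 0  # 1 + Legendre symbol (-1/N)
        nu3 = 2 if N % 3 == 1 else 0  # 1 + Legendre symbol (-3/N)
    # The final "- 1" is nu_inf / 2 with nu_inf = 2.
    g = 1 + Fraction(N + 1, 12) - Fraction(nu2, 4) - Fraction(nu3, 3) - 1
    assert g.denominator == 1, "the genus must be an integer"
    return int(g)


for N in [2, 3, 5, 7, 11, 13, 17, 19, 23]:
    print(N, genus_x0_prime(N))
```

Among the primes listed, the genus is 0 exactly for 2, 3, 5, 7 and 13, which matches the primes excluded in the theorem.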

The difference of the cusps

Order on the Jacobian

Proposition. The point $[0]-[\infty]$ of $J_0(N)$ is a non-trivial torsion point of order dividing $N-1$.

Proof. If $[0]-[\infty]=0$ in $J_0(N)$ then there would be a function $f$ on $X_0(N)$ with $\div(f)=[0]-[\infty]$. Such an $f$ would provide a degree 1 map to $\bP^1$, and so $X_0(N)$ would have genus 0. But this is not the case for the $N$ under consideration.

Consider the modular form $\Delta(z)$ of weight 12 for $\Gamma(1)$ on the upper half-plane. A basic fact about this form is that it is nowhere vanishing; this can be seen either from the product formula for it, or in terms of its description as the discriminant. Its $q$-expansion begins $q+\cdots$. Now, $\Delta(z)$ and $\Delta(Nz)$ are both modular forms of weight 12 for $\Gamma_0(N)$, and both are non-vanishing on the upper half-plane. Thus $f(z)=\Delta(z)/\Delta(Nz)$ is a nowhere vanishing function on the upper half-plane which is invariant under $\Gamma_0(N)$. It therefore descends to a meromorphic function on $X_0(N)$ which is holomorphic and non-vanishing on $Y_0(N)$. The $q$-expansion of $f$ at $\infty$ begins $q^{-(N-1)}+\cdots$. It follows that $f$ has a pole of order $N-1$ at $\infty$, as a function on $X_0(N)$. (The function $q$ on $\fh$ descends to a local parameter at $\infty$ on $X_0(N)$.) Since the only other zero or pole of $f$ occurs at 0, and the divisor of $f$ has degree 0, we necessarily have $\div(f)=(N-1)[0]-(N-1)[\infty]$, which shows that $[0]-[\infty]$ is $(N-1)$-torsion. ◾
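The leading term of $f$ at $\infty$ can be written out explicitly. The computation below assumes the standard product formula $\Delta = q \prod_{n \ge 1} (1-q^n)^{24}$ with $q=e^{2\pi i z}$, so that replacing $z$ by $Nz$ replaces $q$ by $q^N$:

$$ f(z) = \frac{\Delta(z)}{\Delta(Nz)} = \frac{q \prod_{n \ge 1} (1-q^n)^{24}}{q^N \prod_{n \ge 1} (1-q^{Nn})^{24}} = q^{-(N-1)} \prod_{n \ge 1} \left( \frac{1-q^n}{1-q^{Nn}} \right)^{24} = q^{-(N-1)} (1 - 24q + \cdots). $$

The remaining product is a power series with constant term 1, so the order of the pole at $\infty$ is exactly $N-1$.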

Remark. In fact, Ogg showed that the exact order of $[0]-[\infty]$ is $(N-1)/\gcd(N-1,12)$, but we will not need this statement.
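To get a feel for the sizes involved, here is a small sketch that tabulates the order given by Ogg's formula for a few admissible primes. It takes the formula in the remark as given; the function name `cuspidal_order` is ours.

```python
from math import gcd


def cuspidal_order(N):
    """Order of [0] - [infinity] in J_0(N), using Ogg's formula as quoted."""
    return (N - 1) // gcd(N - 1, 12)


for N in [11, 17, 19, 23, 29, 31, 37]:
    n = cuspidal_order(N)
    # Consistency with the proposition: the order must divide N - 1.
    assert (N - 1) % n == 0
    print(N, n)
```

For instance, the formula gives order 5 for $N=11$ and order 11 for $N=23$.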

Remark. Mazur proves that $[0]-[\infty]$ generates the entire torsion subgroup of the Mordell--Weil group of $J_0(N)$.

Action of Hecke operators

Recall that for primes $\ell \ne N$, we have the Hecke operator $T_{\ell}$, which we can regard as an endomorphism of $J_0(N)$.

Proposition. We have $T_{\ell}([0]-[\infty])=(\ell+1)([0]-[\infty])$.

Proof. Consider the Hecke correspondence $f,g \colon X_0(N\ell) \rightrightarrows X_0(N)$. We have the following facts:

The curve $X_0(N\ell)$ has 4 cusps. In fact, the set of cusps for $X_0(N\ell)$ is the product of the sets of cusps for $X_0(N)$ and $X_0(\ell)$. We can therefore represent its cusps as pairs $(x, y)$ with $x,y \in \{0, \infty\}$. The first coordinate is the $X_0(N)$ coordinate.
We have $f(x,y)=g(x,y)=x$. This can be seen as follows. The maps $f$ and $g$ lift to the identity map and multiplication by $\ell$ on $\fh^*$. The elements of $\bP^1(\bQ)$ with $N$ in the denominator map to the cusp $\infty$ of $X_0(N)$, while all others map to 0. Since multiplication by $\ell$ cannot introduce or remove a factor of $N$ in the denominator, the statement follows.
The map $f$ has ramification index $\ell$ at $(\ast, 0)$ and index 1 at $(\ast, \infty)$. The map $g$ is the opposite. This is a simple calculation with stabilizer groups in $\Gamma_0(N)$ and $\Gamma_0(N\ell)$.

We thus see that $f^*([x])=\ell [(x,0)] + [(x,\infty)]$. Applying $g_*$, we find $g_*(f^*([x]))=(\ell+1)[x]$. ◾
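Written out for the divisor $[0]-[\infty]$, and using only the three facts listed in the proof (in particular that $g_*$ sends $[(x,y)]$ to $[x]$), the computation reads:

$$ \begin{aligned} f^*([0]-[\infty]) &= \ell [(0,0)] + [(0,\infty)] - \ell [(\infty,0)] - [(\infty,\infty)], \\ g_* f^*([0]-[\infty]) &= \ell [0] + [0] - \ell [\infty] - [\infty] = (\ell+1)([0]-[\infty]). \end{aligned} $$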

The case of rational eigenforms

The quotient of the Jacobian

Say an abelian variety $A/\bQ$ satisfies condition $\JH(p)$ if the Jordan--Holder constituents of $A[p](\ol{\bQ})$ are all trivial or cyclotomic. Note that this condition is isogeny invariant: it is equivalent to asking that the semi-simplified reduction of the rational $p$-adic Tate module of $A$ is a direct sum of trivial and cyclotomic characters.

We have shown that, up to isogeny, we have a decomposition $J_0(N) = \prod A_f$, where the product is over the (Galois orbits of) normalized weight 2 cuspidal eigenforms $f$. Let's suppose for simplicity that each $f$ has rational coefficient field, so the $A_f$'s are elliptic curves (and the Galois orbits are singletons). Given $p$, there is a maximal quotient (up to isogeny) of $J_0(N)$ that satisfies $\JH(p)$, namely, the product of the $A_f$'s that do.

We want to define this quotient more precisely. For an eigenform $f$, let $\fp_f$ be the kernel of the homomorphism $\bT \to \bZ$ giving the eigenvalues of $f$. Recall that $A_f$ is by definition $J_0(N)/\fp_f J_0(N)$. Let $S$ be the set of those $f$ for which $A_f$ satisfies $\JH(p)$, and let $I=\bigcap_{f \in S} \fp_f$. We define $A=J_0(N)/I J_0(N)$. Up to isogeny, $A=\prod_{f \in S} A_f$, and so $A$ satisfies $\JH(p)$. From our criterion for rank 0, we find:

Proposition. $A(\bQ)$ has rank 0.

How do we know that there will be a prime $p$ for which $A$ is actually non-trivial? In fact, the analysis of $[0]-[\infty]$ provides such a prime. Let $p$ be a prime dividing the order of $[0]-[\infty]$. Then $J_0(N)(\bQ)$ has a $p$-torsion point, and so $J_0(N)[p]$ has a copy of the trivial representation in it. This copy must come from one of the $A_f$'s, and this $A_f$ must satisfy $\JH(p)$. From now on, we fix such a prime $p$.

Alternate characterization of I

To actually work with the quotient $A$, we will need to better understand the ideal $I$. We begin with the following observation:

Lemma. $f \in S$ if and only if $a_{\ell}(f)-(\ell+1)$ is divisible by $p$ for all $\ell$.

Proof. Suppose $f \in S$, so that $A_f$ satisfies $\JH(p)$. Then the semi-simplification of $A_f[p]$ is isomorphic to trivial plus cyclotomic, as it is a 2-dimensional representation with cyclotomic determinant. It follows that the trace of the Frobenius at $\ell$ on $A_f[p]$, which is equal to $a_{\ell}(f)$ mod $p$, is $\ell+1$. Conversely, if all the $a_{\ell}(f)$ are equal to $\ell+1$ modulo $p$, then the Frobenius elements have the same traces and determinants on $A_f[p]$ as on trivial plus cyclotomic. Hence $A_f[p]$ is (up to semi-simplification) the sum of trivial and cyclotomic, and thus $A_f$ satisfies $\JH(p)$. ◾
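For a concrete instance of the congruence, take $N=11$, where $X_0(11)$ has genus 1 and there is a single eigenform $f$. The sketch below assumes two facts not established in these notes: that this form is $q \prod_{n \ge 1} (1-q^n)^2 (1-q^{11n})^2$, and that $[0]-[\infty]$ has order 5 (Ogg's formula), so that $p=5$. The function name `level11_coeffs` is ours.

```python
def level11_coeffs(n_terms):
    """Return [a_1, ..., a_{n_terms}] for q * prod (1-q^n)^2 (1-q^(11n))^2.

    Assumption: this product is the weight 2 eigenform of level 11.
    """
    # series[k] holds the coefficient of q^k in the product without the
    # leading factor q, so series[k] = a_{k+1}.
    series = [0] * n_terms
    series[0] = 1

    def multiply_by_one_minus_q_power(step):
        # In-place multiplication by (1 - q^step), highest degree first so
        # that each update only reads coefficients not yet modified.
        for d in range(n_terms - 1, step - 1, -1):
            series[d] -= series[d - step]

    for n in range(1, n_terms):
        for _ in range(2):
            multiply_by_one_minus_q_power(n)
            if 11 * n < n_terms:
                multiply_by_one_minus_q_power(11 * n)
    return series


coeffs = level11_coeffs(30)
p = 5
for ell in [2, 3, 5, 7, 13, 17, 19, 23, 29]:
    a_ell = coeffs[ell - 1]
    # The lemma predicts a_ell - (ell + 1) is divisible by p.
    print(ell, a_ell, (a_ell - (ell + 1)) % p == 0)
```

In this example the Hecke algebra is $\bZ$, and the congruences say that every generator $a_{\ell}(f)-(\ell+1)$ of the Eisenstein ideal defined next is a multiple of 5.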

Define the $p$-Eisenstein ideal $\fa$ to be the ideal of $\bT$ generated by $p$ and the $T_{\ell}-(\ell+1)$.

Lemma. We have $\bT/\fa=\bF_p$. The ideal $\fa$ is maximal.

Proof. By assumption, there exists a form $f$ in $S$, and for such a form the image of $T_{\ell}-(\ell+1)$ in $\bT/\fp_f \cong \bZ$ is divisible by $p$ for all $\ell$. Thus the image of $\fa$ in $\bT/\fp_f$ is not the unit ideal, and so $\fa$ is not the unit ideal. It is clear now that $\bT/\fa=\bF_p$, since the quotient is non-trivial, and every Hecke operator is identified with an integer. ◾

Lemma. The following are equivalent: (1) $f \in S$; (2) the image of $\fa$ in $\bT/\fp_f$ is not the unit ideal; (3) $\fp_f \subset \fa$.

Proof. The image of $\fa$ in $\bT/\fp_f \cong \bZ$ is the ideal generated by $p$ and the $a_{\ell}(f)-(\ell+1)$. This is not the unit ideal if and only if $a_{\ell}(f)-(\ell+1)$ is divisible by $p$ for all $\ell$. Thus (1) and (2) are equivalent. Clearly, (3) implies (2). Conversely, if $\fp_f \not\subset \fa$ then $\fp_f+\fa=(1)$, since $\fa$ is maximal, and so the image of $\fa$ in $\bT/\fp_f$ is the unit ideal. ◾

Proposition. $I$ is the intersection of the minimal primes $\fp$ of $\bT$ which are contained in $\fa$.

Proof. The minimal primes of $\bT$ are exactly the $\fp_f$. By definition, $I$ is the intersection of the $\fp_f$ with $f \in S$. By the previous lemma, this coincides with the intersection of the $\fp_f$ which are contained in $\fa$. ◾

The image of the cusps

Lemma. Let $R$ be a reduced noetherian ring, let $\fa$ be a maximal ideal, and let $I$ be the intersection of the minimal primes contained in $\fa$. Then $I_{\fa}=0$.

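Proof. The minimal primes of the local ring $R_{\fa}$ are the localizations of the minimal primes of $R$ contained in $\fa$. Since $R$ is noetherian, there are finitely many of these, and localization commutes with finite intersections, so $I_{\fa}$ is the nilradical of $R_{\fa}$. As $R_{\fa}$ is reduced, its nilradical vanishes. ◾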

Corollary. We have $I_{\fa}=0$ for our ideals $\fa$ and $I$.

Suppose $X$ is a $\bT$-module in which all elements are killed by a power of $p$. Then the action of $\bT$ extends to one of the $p$-adic completion $\bT_p=\varprojlim \bT/p^n \bT$. Since this is a complete semi-local ring, it is a product of local rings, the factors corresponding to the maximal ideals. In particular, the localization $\bT_{\fa}$ is a direct factor of $\bT_p$. It follows that $X$ decomposes as $X_{\fa} \oplus X'$, where $\bT_{\fa}$ acts by zero on $X'$. We can identify $X_{\fa}$ with $X[\fa^{\infty}]=\bigcup_{n \ge 0} X[\fa^n]$, where $X[\fa^n]$ is the $\fa^n$-torsion in $X$.

Lemma. The map $J_0(N)[\fa^{\infty}] \to A[\fa^{\infty}]$ is an isomorphism.

Proof. Let $X=J_0(N)[p^\infty]$ and $Y=A[p^{\infty}]$. We then have a surjection of $\bT_p$-modules $X \to Y$. The kernel of this map is $X \cap IJ_0(N)$, which is $IX$. (Let $t_1, \ldots, t_n$ generate $I$. Then $IJ_0(N)$ is the image of the map $J_0(N)^n \to J_0(N)$ given by the $t_i$. Any $p$-power torsion element in the image comes from one in the source.) We thus have an exact sequence

$$ 0 \to IX \to X \to Y \to 0. $$

Now localize at $\fa$. On the one hand, this is an exact operation. On the other, $(IX)_{\fa} = I_{\fa} X_{\fa}=0$, since $I_{\fa}=0$. Thus the map $X_{\fa} \to Y_{\fa}$ is an isomorphism. ◾

Proposition. We have $[0] \ne [\infty]$ in $A$.

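Proof. Let $P=[0]-[\infty]$, as a point of $J_0(N)$. Let $Q$ be a multiple of $P$ which is non-zero and $p$-torsion; such a $Q$ exists because $p$ divides the order of $P$. Then $Q$ is killed by $\fa$: it is killed by $p$ by construction, and we showed above that $T_{\ell}P=(\ell+1)P$, so the same holds for $Q$. Thus $Q \in J_0(N)[\fa^{\infty}]$, and so by the previous lemma its image in $A[\fa^{\infty}]$ is non-zero. Since the image of $Q$ is a multiple of the image of $P$, the image of $P$ in $A$ is non-zero as well. ◾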

The general case

The problem

In general, we have shown that we have a decomposition $V_pJ_0(N) = \prod_{f,\lambda} V_{f,\lambda}$, where the product is over pairs $(f,\lambda)$ consisting of a normalized weight 2 eigenform $f$ and a place $\lambda$ of its coefficient field $K_f$ above $p$, and $V_{f,\lambda}$ is a 2-dimensional Galois representation over the field $K_{f,\lambda}$. We would like to take $A$ to be the quotient of $J_0(N)$ for which $V_pA$ is the product of those $V_{f,\lambda}$ which reduce mod $p$ to trivial plus cyclotomic. However, there might not be such a quotient. For instance, it could be that there is only one $f$, and that for some $\lambda$ the representation $V_{f,\lambda}$ has the right form, and for other $\lambda$ it does not.

What we'll do is take $A$ to be the quotient of $J_0(N)$ which is isogenous to the product of $A_f$'s over $f$'s for which $V_{f,\lambda}$ is of the right form for some $\lambda$. Since this might not hold for all $\lambda$, the abelian variety $A$ will not necessarily satisfy $\JH(p)$, and so our criterion for rank 0 will not apply directly. However, one can make sense of the piece of the Mordell--Weil group of $A$ corresponding to the "good" $\lambda$, and the proof of Theorem B will show that this is finite. Then a simple commutative algebra result will allow us to deduce that $A(\bQ)$ is finite.

The quotient A

As before, choose a prime $p$ dividing the order of $[0]-[\infty]$ in $J_0(N)$, and define $\fa$ to be the ideal of $\bT$ generated by $p$ and $T_{\ell}-(\ell+1)$ for all $\ell \ne N$.

Lemma. We have $\bT/\fa=\bF_p$. The ideal $\fa$ is maximal.

Proof. The representation $J_0(N)[p]$ contains a copy of the trivial representation. Thus for some $(f,\lambda)$, the semi-simplified reduction of $V_{f,\lambda}$ contains the trivial representation, and is therefore the sum of trivial and cyclotomic. It follows that $a_{\ell}(f)=\ell+1$ holds modulo $\lambda$, for all $\ell$. Thus the image of $\fa$ in $\bT/\fp_f$ is contained in $\lambda$. It follows that $\fa$ is not the unit ideal, and so $\bT/\fa=\bF_p$, as before. ◾

Let $I$ be the intersection of the minimal primes of $\bT$ contained in $\fa$, and let $A=J_0(N)/IJ_0(N)$. The following is proved exactly as before.

Proposition. We have $[0] \ne [\infty]$ in $A$.