Kepler's Laws

Kepler's Laws

We start here with a study of “celestial mechanics”, examining the interaction of bodies through gravitational forces. The fundamental problem at hand is to determine the trajectory of a particle. First, we’ll focus on finding the shape of the trajectory; in bound orbits, we’ll develop tools to find the position as a function of time.

The basic law underlying this is Newton’s law of universal gravitation, which is the classical description of gravity (we’ll see a bit of the modern modification to it—the study of general relativity—later on).

Consider two bodies of masses m1m_1 and m2m_2, with position vectors r1\mathbf{r_1} and r2\mathbf{r_2}. The force exerted on body 2 due to body 1, described by Newton’s law of universal gravitation, is:

F=Gm1m2r2r13(r2r1)(3.1.1)\tag{3.1.1} \mathbf{F} = -\frac{G m_1 m_2}{|\mathbf{r_2} - \mathbf{r_1}|^3} (\mathbf{r_2} - \mathbf{r_1})

Here, GG is the universal gravitational constant, equal to 6.67430×1011m3kg1s26.67430 \times 10^{-11} \: \mathrm{m^3\,kg^{-1}\,s^{-2}}.

👀
Modern analytic celestial mechanics started with Isaac Newton’s Principia (1687). The name celestial mechanics is more recent than that. Newton wrote that the field should be called “rational mechanics”. The term “dynamics” came in a little later with Gottfried Leibniz, and over a century after Newton, Pierre-Simon Laplace introduced the term celestial mechanics. Prior to Kepler, there was little connection between exact, quantitative prediction of planetary positions, using geometrical or numerical techniques, and contemporary discussions of the physical causes of the planets’ motion.

The problem we’re going to tackle is Kepler’s two-body problem, which asks us to calculate and predict the motion of two bodies under the influence of their mutual gravitational attraction. The two bodies are assumed to be point masses, and the force acting on them is given by Newton’s law of gravitation.

We’ll see later what we can say about the other cases of the nn-body problem.

To solve for the trajectories of the two bodies, it is most convenient to work in the center of mass frame. The center of mass R\mathbf{R} of the two-body system is given by

R=m1r1+m2r2m1+m2 \mathbf{R} = \frac{m_1 \mathbf{r_1} + m_2 \mathbf{r_2}}{m_1 + m_2}

The position vectors of the two bodies with respect to the center of mass are given by

r1=m2Mr+Rr2=m1Mr+R \mathbf{r_1} = \frac{-m_2}{M} \mathbf{r} + \mathbf{R} \qquad \qquad \mathbf{r_2} = \frac{m_1}{M} \mathbf{r} + \mathbf{R}

where M=m1+m2M = m_1 + m_2 is the total mass of the system, and r=r2r1\mathbf{r} = \mathbf{r_2} - \mathbf{r_1} is the vector joining the two bodies, also called the separation vector.

Working in the center of mass frame, R=R˙=0\mathbf{R} = \dot{\mathbf{R}} = \mathbf{0}, since there are no external forces. With this, we now only have to solve for the trajectory of the separation vector r\mathbf{r} as a function of time. Note that the center of mass frame is an inertial frame, so we can apply Newton’s second law to it.

To simplify the algebra, we define the reduced mass mm of the two-body system as

m=m1m2m1+m2(3.1.2)\tag{3.1.2} m = \frac{m_1 m_2}{m_1 + m_2}

such that Mm=m1m2Mm = m_1 m_2. Recommended, we define the gravitational parameter μ\mu as

μ=GM=G(m1+m2)(3.1.3)\tag{3.1.3} \mu = G M = G (m_1 + m_2)

The force acting on the two bodies are

F1=μmrr3F2=μmrr3 \mathbf{F_1} = \mu m \frac{\mathbf{r}}{r^3} \qquad \qquad \mathbf{F_2} = -\mu m \frac{\mathbf{r}}{r^3}

Using Newton’s second law F=ma\mathbf{F} = m \mathbf{a}, we can write

r¨1=μmm1rr3r¨2=μmm2rr3 \ddot{\mathbf{r}}_1 = \frac{\mu m}{m_1} \frac{\mathbf{r}}{r^3} \qquad \qquad \ddot{\mathbf{r}}_2 = -\frac{\mu m}{m_2} \frac{\mathbf{r}}{r^3}

Subtracting the two equations, we get

r¨=μrr3(3.1.4)\tag{3.1.4} \boxed{\ddot{\mathbf{r}} = -\mu \frac{\mathbf{r}}{r^3}}
📍

In the limit m1m2m_1 \gg m_2, we get that

  • M=m1M = m_1
  • m=m2m = m_2
  • r1=0\mathbf{r_1} = \mathbf{0}
  • r2=r\mathbf{r_2} = \mathbf{r}

We see that the trajectory of the separation vector is simply the trajectory of the smaller mass, m2m_2, in the center of mass frame, while the larger mass m1m_1 is at rest. This is the case when one of the bodies is much heavier, for instance the motion of Earth around the Sun.

The separation vector r\mathbf{r} also gives the trajectory of one of the bodies when viewed in the frame of the other body (which is a non-inertial frame).

Constants of Motion

There are a couple of observations we can make here to simplify the tasks. The first is to look for invariants—quantities that don’t change. These turn out to be quite helpful (and sometimes we can’t even solve the problem without them!). Because the force is along the line joining the two masses, it makes sense to think about the angular momentum, because the force is central in a sense.

In particular, because the force on the reduced mass is f(r)r^f(r) \mathbf{\hat{r}}, the “angular momentum”, r×mr˙\mathbf{r} \times m\mathbf{\dot{r}} is constant.

The total orbital angular momentum is given by

L=m1r1×v1+m2r2×v2=mr×v \begin{align*} \mathbf{L} &= m_1 \mathbf{r_1} \times \mathbf{v_1} + m_2 \mathbf{r_2} \times \mathbf{v_2} \\ & = m \mathbf{r} \times \mathbf{v} \\ \end{align*}

where v=r˙\mathbf{v} = \dot{\mathbf{r}}. The algebra is a little tricky, and offers no insight, so I’ve omitted it, but you should go through it once (you’ll actually get a R˙\mathbf{\dot{R}} term as well, but see the energy discussion later). Anyway, from our previous discussion, this is an invariant.

The total energy of the system is given by

E=12m1v12+12m2v22Gm1m2r1r22=12m1(R˙m2Mr˙)2+12m2(R˙+m1Mr˙)2Gm1m2r1r22=12MR˙2+12mv2μmr2 \begin{align*} E &= \frac{1}{2} m_1 v_1^2 + \frac{1}{2} m_2 v_2^2 - \frac{G m_1 m_2}{|\mathbf{r_1} - \mathbf{r_2}|^2} \\ &= \frac{1}{2} m_1 \left(\mathbf{\dot{R}} - \frac{m_2}{M} \mathbf{\dot{r}}\right)^2 + \frac{1}{2} m_2 \left(\mathbf{\dot{R}} + \frac{m_1}{M} \mathbf{\dot{r}}\right)^2 - \frac{G m_1 m_2}{|\mathbf{r_1} - \mathbf{r_2}|^2}\\ &= \frac{1}{2}M\mathbf{\dot{R}}^2 + \frac{1}{2} m v^2 - \mu \frac{m}{r^2} \end{align*}

but since the COM does not experience a net force, R˙\mathbf{\dot{R}} is constant, so we might as well discard it. Then our modified energy becomes,

E12mv2μmr2 E \to \frac{1}{2} m v^2 - \mu \frac{m}{r^2}

which is an invariant because the gravitational force is conservative. Alternatively, R˙\mathbf{\dot{R}} is just 0 in the com frame, so the energy in the COM frame is simply this.

We define the specific angular momentum and specific energy as

h=Lm=r×r˙ε=Em=12v2μr \begin{align*} \tag{3.1.5} \mathbf{h} &= \frac{\mathbf{L}}{m} = \mathbf{r} \times \dot{\mathbf{r}}\\ \tag{3.1.6} \varepsilon &= \frac{E}{m} = \frac{1}{2} v^2 - \frac{\mu}{r} \end{align*}

Since both energy and angular momentum are invariants, so are h\mathbf{h} and ε\varepsilon.

Since hr=0\mathbf{h} \cdot \mathbf{r} = 0, h\mathbf{h} lies perpendicular to the plane of motion, and hence defines the plane of motion, which must be fixed because h\mathbf{h} is constant.

We define the eccentricity vector as

e=h×r˙μrr(3.1.7)\tag{3.1.7} \mathbf{e} = -\frac{\mathbf{h} \times \dot{\mathbf{r}}}{\mu} - \frac{\mathbf{r}}{r}

Taking its time derivative

ddte=ddt(h×r˙μrr)=1μ(h×r¨)(r˙rr˙r2r)=1μ((r×r˙)×(μrr3))(r˙rr˙r2r)=1r3[(rr)r˙(rr˙)×r](r˙rr˙r2r)=1r3(r2r˙rr˙rr2r˙+rr˙r)=0 \begin{align*} \frac{d}{dt} \mathbf{e} &= \frac{d}{dt} \left( -\frac{\mathbf{h} \times \dot{\mathbf{r}}}{\mu} - \frac{\mathbf{r}}{r} \right) \\ &= -\frac{1}{\mu} \left(\mathbf{h} \times \ddot{\mathbf{r}} \right) - \left( \frac{\dot{{\mathbf{r}}}}{r} - \frac{\dot{r}}{r^2} \mathbf{r} \right)\\ &= -\frac{1}{\mu} \left( \left( \mathbf{r} \times \dot{\mathbf{r}} \right) \times \left(-\mu \frac{\mathbf{r}}{r^3} \right) \right) - \left( \frac{\dot{{\mathbf{r}}}}{r} - \frac{\dot{r}}{r^2} \mathbf{r} \right)\\ &= \frac{1}{r^3} \left[(\mathbf{r} \cdot \mathbf{r}) \dot{\mathbf{r}} - (\mathbf{r} \cdot \dot{\mathbf{r}}) \times \mathbf{r} \right] - \left( \frac{\dot{{\mathbf{r}}}}{r} - \frac{\dot{r}}{r^2} \mathbf{r} \right)\\ &= \frac{1}{r^3} \left( r^2 \dot{\mathbf{r}} - r \dot{r} \mathbf{r} - r^2 \dot{\mathbf{r}} + r \dot{r} \mathbf{r} \right)\\ &= 0\\ \end{align*}

we find that the eccentricity vector does not change with time, hence is also a constant of motion. It lies in the plane of motion (he=0\mathbf{h} \cdot \mathbf{e} = 0) and points in the direction of the periapsis. The magnitude of the eccentricity vector is given by

e=1+2εh2μ2(3.1.8)\tag{3.1.8} e = \sqrt{1 + \frac{2 \varepsilon h^2}{\mu^2}}

You can derive the magnitude by taking the dot product with itself, and subbing in the specific energy. It is a little tedious, so I’ve omitted it for clarity.

I can’t quite give you a motivation for the eccentricity vector, there is some rhyme or reason to it, but justifying it is a great task in itself, so I must ask you to take my word for this.

The magnitude of the eccentricity vector (as we’ll see in a bit), determins the eccentricity of a planetary orbit, but this is sadly not very apparent from our definition, although suggestive from its name.

First Law

Kepler’s first law of planetary motion states that the orbit of a planet is an ellipse with the sun at one of the foci.

First, we can ask what the solutions for the trajectories of particles in such a system are. Also we’ll be using properties of ellipse somewhat freely for the next sections, so if you aren’t familiar with them, I’d recommend reading this wikipedia page. The sections of importance are mostly the cartesian co-ordinates one, and metric properties.

To do this we define a new quantity (we don’t necessarily need it, but its very useful later, so I might as well define it here), we call the angle between the eccentricity vector and the position vector the true anamoly, θ\theta. We can use this to take the dot product:

re=recosθ=r(h×r˙μrr)=h2μr \mathbf{r} \cdot \mathbf{e} = r e \cos \theta = \mathbf{r} \cdot \left( -\frac{\mathbf{h} \times \dot{\mathbf{r}}}{\mu} - \frac{\mathbf{r}}{r} \right) = \frac{h^2}{\mu} - r    r=h2/μ1+ecosθ(3.1.9)\tag{3.1.9} \implies \boxed{r = \frac{h^2 / \mu}{1 + e \cos \theta}}

Note that this, if you have seen it before, is the equation of a conic section in polar coordinates, with the focus being at origin, the eccentricity being ee and the length of the semi-latus rectum being p=h2μp = \frac{h^2}{\mu}, which gives the relation

h=μp(3.1.10)\tag{3.1.10} h = \sqrt{\mu p}

The magnitude of e\mathbf{e} determines the shape of the trajectory:

  • e=0e = 0: Circle
  • 0e<10 \leq e < 1: Ellipse
  • e=1,h0e = 1,\, h \neq 0: Parabola
  • e>1e > 1: Hyperbola
  • e=1,h=0e = 1,\, h = 0: Straight line (degenerate conic section)

where the focus of the conic section lies at the origin of our system, i.e at the center of mass of the system.

The individual trajectories of the two bodies in the center of mass frame is therefore

r1=m2Mh2/μ1+ecosθ,r2=m1Mh2/μ1+ecosθ r_1 = \frac{m_2}{M} \frac{h^2 / \mu}{1 + e \cos \theta} \,, \qquad \qquad r_2 = \frac{m_1}{M} \frac{h^2 / \mu}{1 + e \cos \theta}

Of course because the centre of mass is fixed, the trajectories of the two bodies however will be oriented opposite to each other in our frame, with an angle of π\pi radian between them, to keep the center of mass fixed.

We can actually determine the energy of the system can by just the eccentricity ee and length of semi-latus rectum pp of the trajectory. Consider the dot product, ee=e2\mathbf{e} \cdot \mathbf{e} = ||\mathbf{e}||^2, using the magnitude of the eccentricity vector:

e2=1+2h2εμ2    ε=12(e21)μ2h2e^2 = 1 + \frac{2h^2 \varepsilon}{\mu^2} \implies \varepsilon = \frac{1}{2} \left( e^2 - 1 \right) \frac{\mu^2}{h^2}    ε=12μ(1e2)p(3.1.11)\tag{3.1.11} \implies \varepsilon = -\frac{1}{2} \frac{\mu (1 - e^2)}{p}

From here we can see that the sign of the total energy depends on the eccentricity of the trajectory.

  • 0e<10 \leq e < 1: An elliptical trajectory gives a negative total energy, and hence a bound orbit
  • e=1e = 1: A parabolic trajectory gives a zero total energy, and hence an unbound orbit
  • e>1e > 1: A hyperbolic trajectory gives a positive total energy, and hence an unbound orbit
🤔
Why are Bound Orbits Ellipses?

First of all the sign of ϵ\epsilon is the same as that of EE, of course. If we write down the expression for total energy:

E=12mv2Gm1m2r E = \frac{1}{2} mv^2 - \frac{Gm_1m_2}{r}

The first term is non-negative. If E0E \ge 0, rr is unbounded for large values, because its fine for it be large, as the first term will accomodate for it. Contrastingly, this is not the case for E<0E<0 as if rr grows too large, the last term goes 0\to 0, and the first term is 0\ge 0, even though E<0E < 0. Thus rr can’t grow too large, and is thus bounded.

You can also see this qualitatively in an energy diagram. For an extended discussion, see Kleppner & Kolenkow, An Introduction to Mechanics, 2nd ed, Section 10.4, page 382.

Therefore all bound orbits are ellipses, and all unbound orbits are hyperbolae or parabolae.

For our solar system, the orbits of the planets are bound. The sun is much more massive than any of the planets, hence is nearly at the center of mass, or the focus of the elliptical orbits of all planets, which proves the first law. Of course this isn’t exactly true, the orbits influence each other a bit, but its still good enough.

Second Law

Kepler’s second law of planetary motion states that a line segment joining a planet and the sun sweeps out equal areas during equal intervals of time.

The proof of it relies on the fact that the cross product of two vectors gives the area of a parallelogram possessing sides of those vectors. Hence, the triangular area dAdA swept out in a short period of time is given by half the cross product of the r\mathbf{r} and dr\mathbf{dr} vectors, for some short piece of the orbit, dr\mathbf{dr}.

Sweeping out area dA
dA=12r×dr=12r×vdt=h2dt dA = \frac{1}{2}|\mathbf{r} \times \mathbf{dr}| = \frac{1}{2}|\mathbf{r} \times \mathbf{v} dt| = \frac{h}{2} dt     dAdt=h2(3.1.12)\tag{3.1.12} \implies \boxed{\frac{dA}{dt} = \frac{h}{2}}

which is constant. Thus, we are done.

👀

Johannes Kepler’s laws improved the model of Copernicus. According to Copernicus:

  1. The planetary orbit is a circle with epicycles.
  2. The Sun is approximately at the center of the orbit.
  3. The speed of the planet in the main orbit is constant.

Despite being correct in saying that the planets revolved around the Sun, Copernicus was incorrect in defining their orbits. Introducing physical explanations for movement in space beyond just geometry, Kepler correctly defined the orbit of planets as follows:

  1. The planetary orbit is not a circle with epicycles, but an ellipse.
  2. The Sun is not at the center but at a focal point of the elliptical orbit.
  3. Neither the linear speed nor the angular speed of the planet in the orbit is constant, but the area speed (closely linked historically with the concept of angular momentum) is constant.

Third Law

Kepler’s third law of planetary motion states that the square of the period of revolution of a planet is directly proportional to the cube of the semi-major axis of its orbit.

Consider two bodies moving in bound elliptical orbits, with the semi-major axis of the orbit being aa, and the semi-minor axis being bb. The lengths of the two axes are related by the eccentricity of the orbit, ee, as

e2=1b2a2 e^2 = 1 - \frac{b^2}{a^2}

The area of an ellipse is πab=πa21e2\pi a b = \pi a^2 \sqrt{1 - e^2}. Moreover, the length of the semi-latus rectum is p=b2/a=a(1e2)p = {b^2}/{a} = a (1 - e^2), hence h=μa(1e2)h = \sqrt{\mu a (1 - e^2)} because as we had shown in eqn. (3.1.53.1.5), h=μph = \sqrt{\mu p}.

Integrating the equation of areal velocity, from the previous law we have:

dA=h2dtdA=h2dtπa21e2=12μa(1e2)P \begin{align*} dA &= \frac{h}{2} dt\\ \int dA &= \frac{h}{2} \int dt\\ \pi a^2 \sqrt{1 - e^2} &= \frac{1}{2} \sqrt{\mu a (1 - e^2)} P\\ \end{align*}

where PP is the time period of revolution.

Rearranging, we get

P2=4π2μa3(3.1.13)\tag{3.1.13} \boxed{P^2 = \frac{4 \pi^2}{\mu} a^3}

where PP is the period of one revolution. This is known as Newton’s form of the third law. If we work in the units of AU, sidereal years, and solar masses, we have GM=4π2G M_\odot = 4 \pi^2. Hence the equation simplifies to

a3=MP2(3.1.14)\tag{3.1.14} \boxed{a^3 = M P^2}

In our solar system, M=M+mplanetM=1M = M_\odot + m_{planet} \approx M_\odot = 1, we get the relation P2=a3P^2 = a^3. This gives us the third and final law!