
2009-05-03

Cause, effect and Maxwell's equations

I've always thought classical electrodynamics was a somewhat mysterious thing.

Consider Maxwell's equations, in free space and natural units:

  1. ∇·E = ρ
  2. ∇·B = 0
  3. ∇×E = –∂B/∂t
  4. ∇×B = J + ∂E/∂t

where E and B are vector fields that respond to the sources ρ (charge density scalar) and J (current density vector) while also interacting with each other.

These equations tell us that (1) charge causes the electric field to diverge; (2) the magnetic field does not diverge (magnetic monopoles do not exist); (3) a changing magnetic field causes the electric field to curl; (4a) a current causes the magnetic field to curl; (4b) a changing electric field works like a current.

With me so far? Good.

It is indisputable that these laws work; huge swaths of practical engineering depend on them for their bread-and-butter. But how can they work? Say that we turn on a current somewhere, such that B starts to change. The E-field feels this change and instantly rearranges itself (globally!) such that its curl matches what the changing B-field requires. But surely the E field will need to change in order to achieve this. And the B-field will need to respond to that, so the change of B we started out with is not actually the right one, and so on ad infinitum... How do the fields manage to obey the laws always and everywhere, no matter how we shake the sources about? In other words, is Maxwell's theory even internally consistent?

Only recently did I realize how easy it is to see that, yes, the theory is in fact consistent. I feel a bit like a simpleton for not having noticed this earlier – but I'm going to arrogantly assume that some readers have had trouble seeing it too, and therefore would find the following explanation interesting.

The trick is to reorder the terms in the equations slightly:

  1. ρ = ∇·E
  2. 0 = ∇·B
  3. ∂B/∂t = –∇×E
  4. ∂E/∂t = ∇×B – J

The first two equations are unchanged; they describe conditions on the fields that hold at any frozen instant of time. But the last two equations, rather than being formulas for finding the curls of the fields, now describe how each field changes from time t to t+dt, given the instantaneous values of the fields and sources at time t.

In principle, if we know all of the sources, we can choose how the fields are going to look at time t0, subject only to the constraints of equations (1) and (2). Then we can use equations (3) and (4) to evolve the fields as time passes, with nothing left to choose. (We can choose how the sources behave, of course; and we can let the fields influence the motion of the sources through the Lorentz force law, but that poses no conceptual problem.)
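
To see how mechanical this recipe is, here is a toy numerical sketch in Python (entirely my own illustration; the grid and step sizes are arbitrary choices). In one dimension, with E along y, B along z and everything depending only on x, equations (3) and (4) reduce to ∂Bz/∂t = –∂Ey/∂x and ∂Ey/∂t = –∂Bz/∂x – Jy, and we can pick initial fields freely and just march forward:

    import numpy as np

    # A toy discretization in natural units (c = 1); all numbers are arbitrary.
    nx, dx, dt = 400, 1.0, 0.5          # dt small enough to keep the scheme stable
    x = np.arange(nx) * dx

    # Freely chosen initial data at t0 (in 1D the divergence
    # constraints (1) and (2) are trivially satisfied):
    Ey = np.exp(-((x - 200.0) / 10.0) ** 2)   # a Gaussian pulse
    Bz = np.zeros(nx)
    Jy = np.zeros(nx)                          # no sources in this sketch

    def ddx(f):
        # centered difference with periodic boundaries
        return (np.roll(f, -1) - np.roll(f, 1)) / (2 * dx)

    for step in range(200):
        Ey += dt * (-ddx(Bz) - Jy)   # equation (4): dEy/dt = -dBz/dx - Jy
        Bz += dt * (-ddx(Ey))        # equation (3): dBz/dt = -dEy/dx

    # The pulse has split into two packets travelling in opposite
    # directions at speed 1, as radiation should:
    print(x[np.argmax(np.abs(Ey))])

In one dimension the divergence constraints are trivial, so this sketch shows only the marching; in three dimensions we would also have to choose initial fields with the right divergences.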

However, as we do this, we had better make sure that equations (1) and (2) still hold at time t>t0. Since there is no room for choice after applying (3) and (4), this had better happen automatically – or the theory really is inconsistent. Let's see what happens when we differentiate (1) and (2), remembering that time and space derivatives commute and then unfolding the LHS using (4) and (3):

  1. ∂ρ/∂t = (∂/∂t)∇·E = ∇·(∂E/∂t) = ∇·(∇×B – J) = ∇·(∇×B) – ∇·J = –∇·J
  2. 0 = (∂/∂t)∇·B = ∇·(∂B/∂t) = ∇·(–∇×E) = –∇·(∇×E) = 0

The second equation dissolves into 0=0, so as long as ∇·B=0 is true at t0, it will stay true forever. On the other hand, the first equation becomes ∂ρ/∂t = –∇·J, that is, charge conservation! To demand that ∇·E=ρ will stay true after t0 is exactly the same as to demand that charge is conserved.
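
For the sceptical, the vector-calculus identity ∇·(∇×B)=0 that does all the work here is easy to check symbolically. A minimal sketch, assuming sympy and its vector module are available:

    import sympy as sp
    from sympy.vector import CoordSys3D, curl, divergence

    # div(curl B) vanishes for any smooth field B, which is the step
    # that lets equation (2) maintain itself:
    R = CoordSys3D('R')
    Bx, By, Bz = [sp.Function(n)(R.x, R.y, R.z) for n in ('Bx', 'By', 'Bz')]
    B = Bx * R.i + By * R.j + Bz * R.k

    print(sp.simplify(divergence(curl(B))))   # prints 0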

The fact that Maxwell's equations imply charge conservation is well known and can be found in textbooks. But now we also know that any fields at t0 that have the right divergences can be extended to a solution of Maxwell's equations over all of spacetime, simply by evolving with (3) and (4). (That is, unless we worry about fields becoming infinite, which I don't at the moment.) If the t0 fields do not have the curl we would expect from electro- or magnetostatics given the sources at t0, this simply means that some radiation was passing through the lab at t0. The equations then tell us where that radiation is going, and (running time backwards) where it came from.

So far, so good. But note that in our new perspective, the whys and therefores of electrodynamics look somewhat different from the usual presentation: (1) the divergence of the electric field always points to the charges; (2) the magnetic field never diverges; (3) a curly electric field causes the magnetic field to change in the opposite direction of the curl; (4a) a curly magnetic field causes the electric field to change in the direction of the curl; (4b) a current causes the electric field to change in the opposite direction of the current.

It's not that one field feels the other one changing and curls in response. Rather, the changing field feels the other one's curl and begins to change in response to that curl. That seems to me to be a much more philosophically satisfying causal relation.

But the big novelty here is (4b). Currents no longer cause magnetic fields, at least not directly. Instead, currents cause the E-field to change, and it is the resulting imbalance in E that causes magnetic fields to build up as a secondary effect.

Here is one way of looking at it: Imagine we have some positive charge sitting atop an equal amount of negative charge, such that ρ=0 throughout. Now we move the positive and negative charges away from each other, creating a current. The immediate effect of this current is to string out a pencil of electric field lines between the positive and negative charges. This initial E-field closely follows the path of the current – of course we shouldn't expect a nice dipole field immediately, due to the finite speed of light. But now, at the boundaries of the region of current, the strength of E varies in a direction perpendicular to the field lines, which means that there is a curl of the E-field. This makes a B-field begin to circulate around the path of the current. At its edge there's a curl of B which again starts to modify the E-field, and so on ... The end result is that the field lines of the original pencil radiate away from each other into free space until they assume the configuration predicted by electrostatics.

Of course, all this does not mean that it isn't useful to imagine that changes cause curls rather than vice versa, when solving concrete problems. Once we've seen that the equations are consistent, we can apply them in whatever direction we choose.


Just for completeness, we can do the same thing with potentials. Introducing the vector and scalar potentials A and φ, a standard presentation of electrodynamics is:

  1. ∇²φ – ∂²φ/∂t² = –ρ
  2. ∇²A – ∂²A/∂t² = –J

together with the Lorenz gauge ∇·A+∂φ/∂t=0. As before, we solve each equation for the highest time derivatives:

  1. ∂²φ/∂t² = ∇²φ – ρ
  2. ∂²A/∂t² = ∇²A – J

We see that if we know the sources at all times, and the potentials and their rate of change at time t0, the potentials at all times are determined. Of course the potentials at t0 must satisfy the Lorenz condition, and they must keep satisfying it in the future. We differentiate it to find

0 = (∂/∂t)(∇·A + ∂φ/∂t) = ∇·(∂A/∂t) + ∂²φ/∂t² = ∇·(∂A/∂t) + ∇²φ – ρ

which is another relation between the potentials and their rate of change, this time also involving the charge density. This has to hold at t0, just as there was an initial condition involving charge density in the E-B formulation. But it also has to continue holding, to maintain the Lorenz condition beyond t0, so we differentiate it once more:

0 = (∂/∂t)(∇·(∂A/∂t) + ∇²φ – ρ) = ∇·(∂²A/∂t²) + ∇²(∂φ/∂t) – ∂ρ/∂t = ∇·∇²A – ∇·J – ∇²(∇·A) – ∂ρ/∂t

But ∇·∇²A=∇²(∇·A) as a matter of vector calculus, so this reduces to charge conservation. The end result, as for the field case, is that we have two equations for the time evolution of the world, and two equations that give initial conditions that must hold at t0 – and the initial conditions will be preserved by the time evolution exactly if charge is conserved.
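
As in the field case, the only nontrivial vector-calculus step, ∇·∇²A=∇²(∇·A), can be checked symbolically if desired; a sketch in the same spirit as before (again assuming sympy's vector module, whose laplacian acts componentwise on vector fields):

    import sympy as sp
    from sympy.vector import CoordSys3D, divergence, laplacian

    # div(laplacian A) = laplacian(div A): the identity that reduces
    # the differentiated Lorenz condition to charge conservation.
    R = CoordSys3D('R')
    Ax, Ay, Az = [sp.Function(n)(R.x, R.y, R.z) for n in ('Ax', 'Ay', 'Az')]
    A = Ax * R.i + Ay * R.j + Az * R.k

    print(sp.simplify(divergence(laplacian(A)) - laplacian(divergence(A))))   # prints 0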

2008-06-15

The origin of spin

When I was in high school I read Stephen Hawking's A Brief History of Time. It introduced the concept of quantum-mechanical spin rather confusingly:
... a particle of spin 1 is like an arrow: it looks different from different directions. Only if one turns it round a complete revolution (360 degrees) does the particle look the same. A particle of spin 2 is like a double-headed arrow: it looks the same if one turns it round half a revolution (180 degrees)... there are particles that do not look the same if one turns them through just one revolution: you have to turn them through two complete revolutions! Such particles are said to have spin ½.
I tried in vain to imagine which kind of geometrical shape would have to be turned around twice in order to look the same. And if such a shape could exist, would it stop here? Are there particles that have to be turned three or four times in order to look the same? Or perhaps two-and-a-half?

I decided that someday I would find a book with actual formulas in it and see whether I could understand what was actually going on here. Almost 20 years later, I'm still making progress.

I bought The Feynman Lectures on Physics and worked my way through them. That enabled me to figure out what Hawking tried to say: It is not about how the particle "looks" from different directions, but about how your mathematical description of a particle such as an electron changes when you express it with respect to coordinate systems that point in different directions. If you turn the coordinate system through 360° (which might be thought of as rotating the electron 360° in the other direction, although I'm not sure that it is helpful to try to imagine rotating a point particle), and make sure that all parts of the mathematical description vary continuously, you end up with certain numbers in the mathematical model being exactly what they started as, but multiplied by –1. These negations happen to cancel each other out when you use the model to find out how the electron behaves, which is good: The particle ought not to behave any differently because you've walked around it.
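
Here is a tiny numerical illustration of that sign flip (my own toy example, using the standard spin-½ rotation matrix about the z-axis): a 2π rotation multiplies the amplitudes by –1, but anything observable, being quadratic in the amplitudes, comes out unchanged.

    import numpy as np

    # Rotation of a spin-1/2 state about the z-axis:
    # U(theta) = exp(-i * theta * sigma_z / 2), written in closed form.
    def U(theta):
        return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

    psi = np.array([0.6, 0.8])          # an arbitrary normalized state
    psi_rot = U(2 * np.pi) @ psi

    print(np.allclose(psi_rot, -psi))   # True: a full turn flips the sign

    # ...but observables are quadratic in the amplitudes, so the two
    # sign flips cancel; e.g. the expectation value of sigma_x:
    sigma_x = np.array([[0, 1], [1, 0]])
    print(np.allclose(psi.conj() @ sigma_x @ psi,
                      psi_rot.conj() @ sigma_x @ psi_rot))   # True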

So I'd say that the electron "looks" the same after a single revolution, but we speak of it in a slightly different way. It's as if it were a glass that started out half full, and after we turn it through 360° it appears to be half empty instead.

So far, so good. But how about the turn-three-times (spin 1/3) or turn-two-and-a-half-times (spin 0.4) varieties I'd hypothesized? More reading had to be done.

Eventually I got to the point where mathematical gibberish such as "spin is a two-state quantum property where the amplitudes transform under SU(2)" appears to make sense to me. The two-revolutions rule is because SU(2) is a double cover of SO(3), which is the group of rotations in three-dimensional space. But why does the electron choose to transform under SU(2) – say, could it have picked a different group which is a triple cover of SO(3), leading to a three-revolutions rule instead?

Recently I figured out how to think of this in a way that makes it clear that SU(2) is special. I'm rather pleased about this, because I've had to invent it myself – none of the textbooks I've consulted explains it. (It would be ridiculous to pretend that I'm the first to invent it; these are recreational musings, not serious research.)

The first thing to note is that even though SO(3) is often described as the group of rotations in space, this is a bit misleading. It would be better to say that it is the group of instantaneous rotations in space. If you use an element of SO(3) to specify how to rotate a body in space, what you really get is a mapping that tells how to get from the old position of any point in the body to its new position, but says nothing about how it got there. Yet in everyday language "rotation" denotes the process of rotating something, rather than the end result. If you take a tangible object such as a book and rotate it, we speak of a process that takes place over time, and during that time the book occupies various intermediate positions, which change smoothly during the rotation. Just pointing to the element of SO(3) that describes the book's final state ignores all that.

For example, you can place the book front side up on a table and flip it to the back side either by turning it around the left edge or around the right edge. The book ends up in precisely the same position, yet the two ways of flipping are qualitatively different. You can't construct a continuously varying family of ways-to-flip that contains right-flipping as well as left-flipping and whose members all end up in the same orientation. Try it! What should come right in the middle between left and right? We could turn the book around the bottom edge, towards ourselves, but then the flipped book ends up upside down, and we have to decide whether to turn it clockwise or counterclockwise in order to reach the specified ending position.
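
One can even exhibit the obstruction numerically, borrowing the SU(2) double cover mentioned above. The loop "left flip, then the right flip run backwards" is just a continuous 2π rotation about the edge axis; lift it continuously to SU(2), starting at the identity. If the loop could be contracted, the lift would end at the identity too – instead it ends at –1. A sketch:

    import numpy as np

    sigma_y = np.array([[0, -1j], [1j, 0]])
    I2 = np.eye(2)

    def lift(alpha):
        # exp(-i * alpha * sigma_y / 2): the continuous lift to SU(2)
        # of a rotation by alpha about the book's edge axis
        return np.cos(alpha / 2) * I2 - 1j * np.sin(alpha / 2) * sigma_y

    # Left flip (alpha: 0 -> pi) followed by the right flip run backwards
    # (alpha: pi -> 2*pi) is a closed loop in SO(3); its lift is not closed:
    print(np.allclose(lift(0.0), I2))          # True: starts at the identity
    print(np.allclose(lift(2 * np.pi), -I2))   # True: ends at minus the identity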

The idea of a continuously varying family of continuous rotation processes turns out (ha!) to be key. Let's try to make this a bit more formal and general. Warning: higher mathematics up ahead!

Start with a topological group G, i.e., a group which is also a topological space and where composition and inversion are continuous. The main example to think of is G=SO(3), but most of what we'll do does not depend on the deep inner structure of SO(3) in particular.

Define an auxiliary group A whose elements are continuous maps a:[0,1]→G such that a(0)=1G. The law of composition on A is pointwise multiplication in G, that is, (a1*a2)(t)=a1(t)*a2(t). Clearly, A is a group. When G=SO(3), an element of A represents a particular continuous rotation process. The composition in A is algebraically easy but has no intuitive geometrical interpretation.

An element of A contains more information than we're really interested in, so let's quotient out the differences between elements that end in the same final state and belong to the same continuously varying family:

Let T consist of all elements a of A for which there exists a continuous map α:[0,1]×[0,1]→G such that α(t,0)=a(t) and α(0,u)=α(t,1)=α(1,u)=1G for all t and u. It is easy to see that T is a normal subgroup of A.

The goal of all this is to define the quotient group A/T, which I choose to call Gspun. One may now prove the following:

  • Gspun is simply connected.
  • There is a continuous homomorphism from Gspun to G, since T lies in the kernel of the "end-state" homomorphism from A to G which maps a to a(1). (The kernel of Gspun→G is the "fundamental group" for the topological structure of G).
  • For a∈A, choose any continuous f: [0,1]→[0,1] such that f(0)=0 and f(1)=1. Then a and a◦f represent the same element of Gspun.
  • For any a, b∈A, define (a;b)(t) to be b(2t) for t≤1/2 and a(2t-1)*b(1) for t≥1/2. Then a*b and a;b represent the same element of Gspun.
Thus in Gspun the group operation does have a geometrical interpretation: it corresponds to the process of first doing one continuous rotation and then another one.
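
To make this a bit more tangible, here is a small sketch (the names and example paths are mine) with elements of A represented as Python functions into SO(3) matrices. It cannot check homotopy, of course, but it does show the pointwise product a*b and the concatenation a;b agreeing where they must – at the final state:

    import numpy as np

    def rot_z(angle):
        # an element of SO(3): rotation about the z-axis
        c, s = np.cos(angle), np.sin(angle)
        return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

    # Two elements of A: continuous maps [0,1] -> SO(3) with a(0) = 1.
    a = lambda t: rot_z(np.pi * t)        # half a turn about z
    b = lambda t: rot_z(np.pi * t / 2)    # a quarter turn about z

    # The group law on A is pointwise multiplication...
    star = lambda t: a(t) @ b(t)

    # ...while concatenation does all of b first, then all of a:
    def concat(t):
        return b(2 * t) if t <= 0.5 else a(2 * t - 1) @ b(1.0)

    # Both end at a(1) * b(1), as they must if a*b and a;b are to
    # represent the same element of Gspun:
    print(np.allclose(star(1.0), concat(1.0)))   # True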

Now back to physics, fixing G=SO(3). Imagine that we have a mathematical model of some physical system and a recipe that says how to change the model when we rotate the system in a gradual, physically plausible, continuous way. Such a rotation corresponds to an element of A, so the recipe really maps A into the space of changes to the model. Now we may want to consider only recipes that do not distinguish between rotation processes that can be varied continuously into each other. If so, the recipe must be a homomorphism from Gspun to the space of changes to the model.

And for G=SO(3), it turns out by pure accident that Gspun is isomorphic to SU(2)!
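
Concretely, the covering homomorphism from SU(2) to SO(3) can be written down and played with numerically; a sketch using the standard formula Rij = ½ tr(σi U σj U†):

    import numpy as np

    sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
             np.array([[0, -1j], [1j, 0]]),
             np.array([[1, 0], [0, -1]], dtype=complex)]

    def to_so3(U):
        # the covering homomorphism SU(2) -> SO(3):
        # R[i][j] = (1/2) * Re tr(sigma_i U sigma_j U^dagger)
        return np.array([[0.5 * np.trace(sigma[i] @ U @ sigma[j] @ U.conj().T).real
                          for j in range(3)] for i in range(3)])

    # A rotation by theta about z, lifted to SU(2):
    theta = 1.0
    U = np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

    print(np.round(to_so3(U), 3))              # the usual rotation about z by theta
    print(np.allclose(to_so3(U), to_so3(-U)))  # True: U and -U give the same rotation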

The books I've read tend to start by pulling SU(2) out of a hat, and then deriving that it accidentally corresponds to certain rotations. How lucky that the group under which the electron chose to transform happens to have a geometrical interpretation! I find it much more compelling to think of it the other way around: The electron chose the most general way of responding to rotations it could, and that turned out, accidentally, to have a simple interpretation in terms of complex numbers.