2008-06-29

In the future, anything can be a TLD

ICANN has announced the approval of a new policy for the creation of new top-level domains, known as gTLDs in the jargon.

A gTLD is the suffix such as .com or .net that appears at the end of an Internet address, except for two-letter country codes (such as .dk or .us). Originally the only gTLDs were .com, .org and .net, for general use, plus a few stray ones (.mil, .gov and .edu) which should probably have been under .us but were inherited from the Internet Stone Age where .us had not yet been invented. During the last decade or so, a dozen or so new gTLDs, such as .info, .museum and .cat, have been created, as an experiment in how the Internet's namespace should evolve in the future.

The experimental process that gave us .info and its friends was, hmm, let us just call it rather heavy. Formal proposals had to be made, public comments collected, innumerable constituencies consulted, pages upon pages of reports written, read and eventually voted on by the board of ICANN. Unsurprisingly this approach has been found not to scale, so ICANN is streamlining it considerably.

In the future everyone will be able to propose a new top-level domain, and if you pass a few technical sanity checks (not yet quite defined, it seems) and pay the fee (at least several thousand dollars, it is rumored), then presto, you have your own top-level domain name.

The announcement suggests that this new process will be used to create city-based gTLDs such as .paris or .nyc in which second-level domains can be registered by the public like it happens in the existing ones. And we can readily imagine future trade-specific gTLDs – say, .food or .bank – and language-specific ones like the existing .cat for the Catalan community. (I appreciate that ICANN would not want to conduct a heavy voting-based process on .kurd ...)

Such community-based domains are the lofty, noble ideal. If there is a market for being zum-schwanen.berlin instead of zum-schwanen-wannsee.de, it's cool with me. Sure, the first one to squat the name of your city will be able to skim a profit off .yourcity domains, but they cannot bleed you too white because they need to compete with .com at least for new customers. In practice, though, I suppose the ICANN's suggestions of .brandname and .yournamehere will get more attention. Expect to see .ebay, .google, or .dell sometime soon.

This is really inevitable. It is already the prevailing wisdom in many places that if you're serious about running an internet-enabled business you need to register your name in all top-level domains where you qualify. If www.lego.net, www.lego.biz and www.lego.info all lead to the same site (which they do), then what is the point of having a gTLD there in the first place? The Market has spoken, and according to the Market, businesses do not want to fit into a hierarchical naming scheme if they can help it.

But there is a problem here. I cannot imagine that the Big Red will be satisfied to have a web site at http://www.coke/ – in the brave TLD-less world of tomorrow they'll want to be http://coke/. Fine, you say? Yes, but remember that up until now, http://coke/ has been an instruction to the browser to connect to the machine on the local network named coke, and display what that machine sends back. And, it will continue to mean that irrespective of the creation of a new TLD. If there is no such machine on the local network, you ought to get whatever the TLD resolves to. This may or may not be the case with your browser right now, but in a few years all major operating systems and browsers will be upgraded to understand this.

But what happens when you're on a local network where one machine happens to be called coke? Either you get The Coca-Cola Company or you get your local machine, but which one? That depends on seldom-explored details of your DNS resolver software, but it doesn't matter because neither answer is Right! If you get Coca-Cola, you suddenly can't access your local machine, and if you get the local machine, then you'll be misdirected when you click on a link to http://coke/ on a third-party website.

So it appears that in the future, network administrators must be careful to choose names for their machines that do not collide with any website their users may want to visit. You may think it is easy to avoid calling your machines coke, but any word out there might collide. For example, I name machines on my home network for people in Greek myth; no obvious trademarks there. The file server is called kreon, and I refer to it by this name on the command line dozens of times a day. But what if the fine people at kreon.com (which I did not know existed before I started this entry) decide they want their own TLD? I'll have to choose between renaming my machine or losing access to part of the web. Perhaps an easy choice for me to make, but imagine that you manage IT services at a university department with hundreds of machines... Choose strange names that are unlikely to collide? But how? All of pc42.com, box25.com, abc10.com, and k19.com exist today. These were the first four nonsense letter-digit combinations I tried to look up.

As far as I can see, the only way not to drive every network administrator in the world insane is if ICANN preemptively declares that, sure, you can all have the TLDs you want, but addresses like http://lego/ or foobar@lego must not be used. This can be enforced easily by a rule that one-element domain names must not resolve to any A, AAAA, CNAME, or MX records, on penalty of losing control over the TLD.
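To make the proposed rule concrete, here is a minimal sketch in C of the test that a registry checker or a cautious resolver might apply. It assumes that "one-element" means exactly one label, ignoring an optional trailing root dot. The name is_single_label is made up for this illustration and is not part of any resolver library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `name` consists of exactly one
 * non-empty label ("lego" or "lego."), and 0 otherwise. */
int is_single_label(const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name);

    /* Ignore one trailing dot, which marks a fully qualified name. */
    if (len > 0 && name[len - 1] == '.')
        len--;

    if (len == 0)
        return 0;               /* "" and "." denote the root, not a label */

    /* Any remaining dot means the name has at least two labels. */
    return memchr(name, '.', len) == NULL;
}
```

Under the rule above, any name for which this returns 1 would not be allowed to carry A, AAAA, CNAME, or MX records.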

I'm currently trying to figure out the right procedure for formally making this proposal.

(Thanks to Version 2 for making me aware of this.)

2008-06-20

You may be a compiler writer if ...

Some time ago, in a (probably misguided) attempt to adapt at runtime to a bit of endian weirdness, I wrote some code that looked more or less like

   int64_t buffer[2];
   ...
   volatile int64_t *a = (volatile int64_t*)buffer;
   volatile int32_t *b = (volatile int32_t*)buffer;
   *a = 1;
   if( *b == 1 ) doSomeStuff();
There was a bit more to it, but this is representative. This code worked well for quite some time, until we needed to move the program that it is part of to a machine with a different processor architecture. When we turned on optimization in the compiler, the program stopped working.

Frantic debugging ensued. Eventually I discovered that the optimizing compiler turns the snippet above into something like this (retouched into hypothetical pseudo assembly so as to protect the innocent):

   add    sp,#24,r13
   mov    #0,r3
   load   0(r13),r5
   mov    #1,r4
   store  r3,4(r13)
   cmp    #1,r5
   store  r4,0(r13)
   bsr.eq doSomeStuff

Now, an ordinary competent programmer should be able to understand that C99's strict aliasing rules authorize a compiler to assume that *a and *b refer to different objects, and therefore the optimizer may reorder the code such that the read of *b happens before *a is written. I admit it surprised me a bit that the volatile qualifiers did not prevent this, which seems to violate the letter of the C standard regarding volatile accesses and sequence points. However, apparently the compiler in question does not recognize any ordering constraints on accesses to "different" volatile-qualified objects within a basic block.

However, notice that even though the compiler is not smart enough to know that the two pointers alias, it is smart enough to figure out that they can share the same register! If you think that is perfectly reasonable and understandable, you're seriously at risk of being a compiler writer at heart.

For the record, I do and I am.

P.S. The Right way to code this is to type-pun through a union instead:

   union {
     int64_t a[2];
     int32_t b;
   } buffer;
   ...
   buffer.a[0] = 1;
   if( buffer.b == 1 ) doSomeStuff();

which in this case compiled into an unconditional call to doSomeStuff. Understanding Strict Aliasing by Mike Acton explains both the rules and the cheats that nevertheless work with admirable clarity. Respectfully recommended for your edification.
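For comparison, here is a sketch of a different technique from the union above: copying the bytes out with memcpy, which standard C allows for any object and which therefore gives the optimizer no license to reorder the read before the write. The function names nth_word_of and low_half_comes_first are hypothetical stand-ins for the check in the original snippet. Compilers typically turn such a small fixed-size memcpy into a plain load, though that is an expectation rather than a guarantee.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns the n-th int32_t-sized chunk (n = 0 or 1) of the in-memory
 * representation of `value`, read with memcpy instead of a pointer cast. */
int32_t nth_word_of(int64_t value, size_t n)
{
    assert(n < sizeof value / sizeof(int32_t));
    int32_t word;
    memcpy(&word, (const unsigned char *)&value + n * sizeof word, sizeof word);
    return word;
}

/* Hypothetical equivalent of the test `*a = 1; if (*b == 1) ...`:
 * returns 1 when the low-order half of an int64_t comes first in memory. */
int low_half_comes_first(void)
{
    return nth_word_of(1, 0) == 1;
}
```

With this, the original condition becomes `if (low_half_comes_first()) doSomeStuff();`.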

2008-06-18

My router wants Taiwan

Every so often – actually, every so seldom – my Zyxel Prestige 660R-61 ADSL router will enter the strangest failure mode you can imagine. Everything works except I'm unable to resolve DNS names, giving the immediate impression that the entire internet has disappeared. (I once had a smart, brilliant coworker burst into my office and announce that this was a big moment in the history of the internet: Google had gone down! Turned out that our nameserver had fallen ill, and the web addresses he had used as a control group happened to be cached locally).

No, lacking DNS is not in itself strange. The strange thing is how this lack arises.

See, every time I send a DNS query packet from one of the several computers behind the router, what comes back is a response packet that purports to tell me the IP address of www.kimo.com.tw. Apparently the router is not trying to falsify the address of the host I want to find. I ask about, for example, www.google.com, and back comes a packet saying (translated from RFC-1035 speak): "Thank you for your inquiry about the IP address of www.kimo.com.tw. It is my pleasure to inform you that the IP address of www.kimo.com.tw is 207.69.188.186".

It's always www.kimo.com.tw. It's always 207.69.188.186. It only happens for UDP queries; DNS over TCP is unaffected. It's not the nameserver at my ISP that misbehaves; I get the same pattern when I ask a root server about "com.". I don't know whether the router rewrites my outgoing requests to be about www.kimo.com.tw, or responds with a stock reply on its own, or rewrites incoming answers. I don't know whether it affects UDP packets not to/from port 53.

This has happened two or three times over a period of several years. It seems to tend to follow downtime on the ADSL connection. But whenever it happens, rebooting my local router clears the problem. It's very strange.

Is the router getting infected with some malware? I have a hard time figuring out what said malware could be attempting to achieve. Because the name in the reply does not match that in the request, the resolver on my local computer will just fail instead of returning a wrong answer to the application.
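The check that makes the resolver give up can be sketched roughly as follows. This is a simplified illustration based on the RFC 1035 message layout (12-byte header, then the question as length-prefixed labels followed by QTYPE and QCLASS). It is not the code of any actual resolver, and the function names are hypothetical. It assumes a single, uncompressed question in both packets.

```c
#include <assert.h>
#include <stddef.h>

#define DNS_HEADER_LEN 12

/* Length of the question entry that follows the header (uncompressed
 * QNAME + QTYPE + QCLASS), or 0 if it is malformed or truncated. */
static size_t question_len(const unsigned char *msg, size_t len)
{
    size_t pos = DNS_HEADER_LEN;
    while (pos < len && msg[pos] != 0) {
        if (msg[pos] > 63)              /* compression pointer or bad label */
            return 0;
        pos += 1 + (size_t)msg[pos];
    }
    /* Need the terminating zero byte plus 4 bytes of QTYPE and QCLASS. */
    if (pos >= len || len - pos < 5)
        return 0;
    return pos + 5 - DNS_HEADER_LEN;
}

static unsigned char ascii_lower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Returns 1 if `reply` is a response that echoes the ID and the
 * question of `query`, and 0 otherwise. */
int reply_matches_query(const unsigned char *query, size_t qlen,
                        const unsigned char *reply, size_t rlen)
{
    assert(query != NULL && reply != NULL);
    if (qlen < DNS_HEADER_LEN || rlen < DNS_HEADER_LEN)
        return 0;
    if (query[0] != reply[0] || query[1] != reply[1])
        return 0;                       /* transaction IDs differ */
    if ((reply[2] & 0x80) == 0)
        return 0;                       /* QR bit clear: not a response */
    if (query[4] != 0 || query[5] != 1 || reply[4] != 0 || reply[5] != 1)
        return 0;                       /* QDCOUNT must be 1 in both */

    size_t n = question_len(query, qlen);
    if (n == 0 || n != question_len(reply, rlen))
        return 0;

    for (size_t i = 0; i < n; i++) {
        unsigned char a = query[DNS_HEADER_LEN + i];
        unsigned char b = reply[DNS_HEADER_LEN + i];
        if (i < n - 4) {                /* name part: ASCII case-insensitive */
            a = ascii_lower(a);
            b = ascii_lower(b);
        }
        if (a != b)
            return 0;
    }
    return 1;
}
```

A reply about www.kimo.com.tw to a query about www.google.com fails this comparison, so nothing is passed on to the application.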

After intensive web searching the best I have been able to find is this page in Russian, which judging by Google's translation seems to describe exactly this syndrome. Apparently it is claimed that no malware is involved, but I cannot make sense of the machine-translated explanation of what actually happens.

2008-06-17

Fridge logic: 2001, A Space Odyssey

– So basically we have no idea what the damned thing is for?

– Basically no.

– Or why the Chinese went and buried it under umpteen tons of moon rock?

– Well, we don't really have solid evidence that it's Chinese in the first place. It's just the working hypothesis that appeared to be least crazy at the moment.

– Hrmf. What are some of the more crazy working hypotheses?

– Natural geological formation. Aliens from outer space. Something truly evil that the NSA cooked up and have not told the rest of us about.

– Look, I know we're famous for eating little children, but could we please cease the interdepartmental potshots until we're sure the nation's security is not under imminent attack?

– Hey, he asked for crazy theories.

– Yes, and I'm sorry for that. So what I hear is that the Chinese connection is not as solid as we thought?

– Some of our analysts are pretty certain that the Chinese could not possibly have reached the spot with more than two astronauts for 5 hours without our knowing it when they went up in 1998.

– Well, personally I think it is still the least crazy theory, but I concede it is plenty crazy already.

– What about the Russians?

– Don't be silly.

– Okay, gentlemen, we know absolutely nothing. The question is, what do we do about that? We can't realistically delay briefing the President more than until tonight, and whoever goes to talk to him had better have some recommendation for concrete action with him.

– Isn't that obvious? Since we don't know anything all we can do is wait until the scientists up there get us some more information.

– How is that "concrete action"?

– One thing we do have to decide is whether to go public with this or not. I'm all for putting a lid on it, but we could get into really nasty stuff if we hush it up now and then –

– I will not accept any public announcement for as long as there's a risk that this is some kind of Chinese superweapon.

– How could an inert featureless slab of whateverite be a weapon? It doesn't even point towards the earth.

– Let me tell you –

– There are rumors that it is making our people sick somehow. Some sort of bioweapon.

– Rumors, which kind of rumors? Unless I've been severely misinformed, nobody on Earth even knows the thing exists except for a few deeply trusted people who all report directly to somebody who's present in this room, and don't know who else knows. So who is spreading rumors to whom?

– Correction. The rumors do not mention this TMA thing, but there are rumors that people in Clavius are falling ill like flies.

– A natural assumption given that the base has been quarantined since yesterday afternoon. It's utterly false, but we've been discreetly encouraging it. There has to be some explanation.

– Great. We'll have the world press poking into this in a matter of hours. Then afterwards we'll have to defend not only withholding information but also lying.

– All in a day's work for you, I'd think.

– Stop that, you two. I'm more concerned about security at the site. There are two thousand people stationed on Clavius, and at least two hundred of them are aliens. The rest are civilians without any security indoctrination to speak of. I'd prefer if we –

– 1700.

– What?

– There are 1700 people in Clavius, not 2000.

– Thank you for that highly relevant correction. Now, is there any way we could get some actual military –

– Look, even if we had the ability to deploy any significant number of troops to the moon on a day's notice – which I'll neither confirm nor deny even to this exclusive audience – we'd be running openly afoul of any number of international political commitments if we did. And for what good? Soldiers are not some kind of magical pixie dust that just makes everything right. We'll fight any known enemy that we can see and know how to kill, but I thought we agreed that such an enemy is simply not present on the moon right now.

– What I meant was –

– I say we pull the digging team back to base and then nuke the thing.

– WHAT?!

– I'm not even going to dignify that with a response.

– Gentlemen, please!

– Wait a moment .. I think I've got it!

– Yes?

– Let's just send a senior NASA bureaucrat to the moon and have him look at the thing in person!

– By golly! That'll solve all our problems.

– Good thinking, man.

– Excellent. We have a plan. Heywood, you're going.

– What, me?!

– Yes, you. Judging from past performance, you've contributed absolutely nothing of value so far, and you might as well continue doing that up on the moon.

– But I have to be at a tennis tournament next –

– Our hearts weep. Cancel it. I'll have one of my people put together a powerpoint for you to show to the Old Man, but you'll have to do the talking. I propose Justin go with you and provide moral support. Any other protests? Good. David, can you tell your people to start warming up a rocket or something for Dr. Floyd to go in?

* * *

Seriously, though, why did Heywood Floyd go to the moon? A fair part of 2001: A Space Odyssey is devoted to his journey, which proceeds with considerable haste and at enormous taxpayer expense. But it never becomes clear that there is anything he's uniquely qualified to do once he gets there, save for just being at the center of a Visiting VIP Tour. He does get a glowing but nonspecific introduction before he gives a pep talk on the moon base, though.

In reality his only purpose is to show us, the readers, around. He's a Watson without a Sherlock. Come to think of it, this describes many of Arthur C. Clarke's protagonists.

2008-06-15

The origin of spin

When I was in high school I read Stephen Hawking's A Brief History of Time. It introduced the concept of quantum-mechanical spin rather confusingly:
... a particle of spin 1 is like an arrow: it looks different from different directions. Only if one turns it round a complete revolution (360 degrees) does the particle look the same. A particle of spin 2 is like a double-headed arrow: it looks the same if one turns it round half a revolution (180 degrees)... there are particles that do not look the same if one turns them through just one revolution: you have to turn them through two complete revolutions! Such particles are said to have spin ½.
I tried in vain to imagine which kind of geometrical shape would have to be turned around twice in order to look the same. And if such a shape could exist, would it stop here? Are there particles that have to be turned three or four times in order to look the same? Or perhaps two-and-a-half?

I decided that someday I would find a book with actual formulas in it and try if I could understand what was actually going on here. Almost 20 years later, I'm still making progress.

I bought The Feynman Lectures on Physics and worked my way through them. That enabled me to figure out what Hawking tried to say: It is not about how the particle "looks" from different directions, but about how your mathematical description of a particle such as an electron changes when you express it with respect to coordinate systems that point in different directions. If you turn the coordinate system through 360° (which might be thought of as rotating the electron 360° in the other direction, although I'm not sure that it is helpful to try to imagine rotating a point particle), and make sure that all parts of the mathematical description vary continuously, you end up with certain numbers in the mathematical model being exactly what they started as, but multiplied by -1. These negations happen to cancel each other out when you use the model to find out how the electron behaves, which is good: The particle ought not to behave any differently because you've walked around it.

So I'd say that the electron "looks" the same after a single revolution, but we speak of it in a slightly different way. It's as if it was a glass that started out half full, and after we turn it through 360° it appears to be half empty instead.

So far, so good. But how about the turn-three-times (spin 1/3) or turn-two-and-a-half-times (spin 0.4) varieties I'd hypothesized? More reading had to be done.

Presently I got to the point where mathematical gibberish such as "spin is a two-state quantum property where the amplitudes transform under SU(2)" appears to make sense to me. The two-revolutions rule is because SU(2) is a double cover of SO(3) which is the group of rotations in three-dimensional space. But why does the electron choose to transform under SU(2) – say, could it have picked a different group which is a triple cover of SO(3), leading to a three-revolutions rule instead?
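The double cover can be watched in action with a small sketch in C. It relies on the standard identification of SU(2) with the unit quaternions, where a unit quaternion q rotates a vector v by v ↦ q v q*. The type and function names below are invented for this illustration. Only half turns about the z axis are used, so that all the arithmetic is exact.

```c
#include <assert.h>

/* A quaternion w + xi + yj + zk; the unit quaternions play the role of SU(2). */
typedef struct { double w, x, y, z; } quat;

/* Hamilton product a*b. */
quat quat_mul(quat a, quat b)
{
    quat r;
    r.w = a.w * b.w - a.x * b.x - a.y * b.y - a.z * b.z;
    r.x = a.w * b.x + a.x * b.w + a.y * b.z - a.z * b.y;
    r.y = a.w * b.y - a.x * b.z + a.y * b.w + a.z * b.x;
    r.z = a.w * b.z + a.x * b.y - a.y * b.x + a.z * b.w;
    return r;
}

quat quat_conj(quat q)
{
    quat r = { q.w, -q.x, -q.y, -q.z };
    return r;
}

/* Rotate the vector stored in v.x, v.y, v.z (with v.w == 0) by the unit
 * quaternion q. This is the map from SU(2) down to SO(3). */
quat rotate(quat q, quat v)
{
    assert(v.w == 0.0);
    return quat_mul(quat_mul(q, v), quat_conj(q));
}

/* A half turn (180 degrees) about the z axis is the quaternion k. */
const quat HALF_TURN_Z = { 0.0, 0.0, 0.0, 1.0 };
```

Multiplying HALF_TURN_Z by itself, i.e. one full revolution, gives the quaternion -1 rather than +1, yet rotate() with -1 leaves every vector where it was. Only after four half turns, i.e. two revolutions, is the quaternion itself back at +1.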

Recently I figured out how to think of this such that it is clear that SU(2) is special. I'm rather pleased about this, because I've had to invent it myself – none of the textbooks I've consulted explain it. (It would be ridiculous to pretend that I'm the first to invent it; these are recreational musings, not serious research).

The first thing to note is that even though SO(3) is often described as the group of rotations in space, this is a bit misleading. It would be better to say that it is the group of instantaneous rotations in space. If you use an element of SO(3) to specify how to rotate a body in space, what you really get is a mapping that tells how to get from the old position of any point in the body to its new position, but says nothing about how it got there. Yet, in everyday language "rotation" denotes the process of rotating something, rather than the end result. If you take a tangible object such as a book and rotate it, we speak of a process that takes place over time, and during that time the book occupies various intermediate positions, which change smoothly during the rotation. Just pointing to the element of SO(3) that describes the book's final state ignores all that.

For example, you can place the book front side up on a table and flip it to the back side either turning it around the left edge or around the right edge. The book ends up in precisely the same position, yet the two ways of flipping are qualitatively different. You can't construct a continuously varying family of ways-to-flip which contains right-flipping as well as left-flipping and all end up in the same orientation. Try it! What should come right in the middle between left and right? We could turn the book around the bottom edge, towards ourselves, but then the flipped book ends up upside down, and we have to decide whether to turn it clockwise or counterclockwise in order to reach the specified ending position.

The idea of a continuously varying family of continuous rotation processes turns out (ha!) to be key. Let's try to make this a bit more formal and general. Warning: higher mathematics up ahead!

Start with a topological group G, i.e., a group which is also a topological space and where the law of composition is continuous. The main example to think of is G=SO(3), but most of what we'll do does not depend on the deep inner structure of SO(3) in particular.

Define an auxiliary group A whose elements are continuous maps a:[0,1]→G such that a(0)=1_G, the identity element of G. The law of composition on A is pointwise multiplication in G, that is, (a1*a2)(t)=a1(t)*a2(t). Clearly, A is a group. When G=SO(3), an element of A represents a particular continuous rotation process. The composition in A is algebraically easy but has no intuitive geometrical interpretation.

An element of A contains more information than we're really interested in, so let's quotient out the differences between elements with the same final state that are members of the same continuously varying family:

Let T consist of all elements a of A for which there exists a continuous map α:[0,1]×[0,1]→G such that α(t,0)=a(t) and α(0,u)=α(t,1)=α(1,u)=1_G for all t and u. It is easy to see that T is a normal subgroup of A.

The goal of all this is to define the quotient group A/T, which I choose to call Gspun. One may now prove the following:

  • Gspun is simply connected.
  • There is a continuous homomorphism from Gspun to G, since T lies in the kernel of the "end-state" homomorphism from A to G which maps a to a(1). (The kernel of Gspun→G is the "fundamental group" for the topological structure of G).
  • For a∈A, choose any continuous f: [0,1]→[0,1] such that f(0)=0 and f(1)=1. Then a and a◦f represent the same element of Gspun.
  • For any a, b∈A, define (a;b)(t) to be b(2t) for t≤1/2 and a(2t-1)*b(1) for t≥1/2. Then a*b and a;b represent the same element of Gspun.
Thus in Gspun the group operation does have a geometrical interpretation: it corresponds to the process of first doing one continuous rotation and then another one.
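One way to check the last claim in the list, sketched here using only the definitions above (with 1_G the identity of G and inverses in A taken pointwise): gradually reparametrize a and b so that they stop overlapping in time.

```latex
\begin{align*}
\varphi_u(t) &= (1-u)\,t + u\,\max(0,\,2t-1), &
\psi_u(t) &= (1-u)\,t + u\,\min(1,\,2t), \\
H(t,u) &= a(\varphi_u(t)) * b(\psi_u(t)), \\
H(t,0) &= (a*b)(t), &
H(t,1) &= (a;b)(t), \\
H(0,u) &= 1_G, &
H(1,u) &= a(1)*b(1), \\
\alpha(t,u) &= (a*b)(t)^{-1} * H(t,\,1-u).
\end{align*}
```

Then α is continuous, α(t,0) = (a*b)(t)⁻¹ * (a;b)(t), and α(0,u) = α(t,1) = α(1,u) = 1_G. So (a*b)⁻¹ * (a;b) lies in T, which says exactly that a*b and a;b are in the same coset of T.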

Now back to physics, fixing G=SO(3). Imagine that we have a mathematical model of some physical system and a recipe that says how to change the model when we rotate the system in a gradual, physically plausible, continuous way. Such a rotation corresponds to an element of A, so the recipe really maps A into the space of changes to the model. Now we may want to consider only recipes that do not distinguish between rotation processes that can be varied continuously into each other. If so, the recipe must be a homomorphism from Gspun to the space of changes to the model.

And for G=SO(3), it turns out by pure accident that Gspun is isomorphic to SU(2)!

The books I've read tend to start by pulling SU(2) out of a hat, and then deriving that it accidentally corresponds to certain rotations. How lucky that the group the electron chose to represent happens to have a geometrical representation! I find it much more compelling to think oppositely: The electron chose the most general way of responding to rotations it could, and that turned out, accidentally, to have a simple interpretation in terms of complex numbers.

Memetracing: Doctorow on a Balloon

What is it with Cory Doctorow and hot-air balloons? Humorous references appear to be popping up in the strangest places – last in Bruce Schneier and the King of the Crabs (which I discovered through Schneier's own blog).

But just what do these references actually reference? I find them mildly funny because they evoke memories of the Xkcd episode "Blogofaire", as well as the small print in "Online Communities".

On the other hand, I don't get out much. For all I know, Doctorow might be famous for being an avid balloon pilot in addition to a writer, and that's what everybody is alluding to. On the other-other hand, I cannot seem to google up any non-humorous Doctorow/balloon juxtapositions.

So perhaps it is all an in-joke referencing the xkcd strips, or perhaps xkcd was itself referencing some ur-joke known to only a few initiates (or, alternatively, to everybody but me). Impossible to know, really – when an Internet meme reaches a certain critical mass, half of those who pass it on have no idea what it refers to, anyway. Close to nobody ever played Zero Wing, but that does not prevent the all your X are belong to us snowclone from being productive.

2008-06-10

Unix domain socket woes

At work we needed to create a "doorkeeper" process that accepts incoming network connections and then hands over the connected sockets to one of several, already running, service processes. The service process then talks directly to the client through the network layer. I can't go into detail about why that happens to be the right design for us, so just stipulate that it is what we want.

I was assigned the task of researching how to do the socket handover and write code to implement it. Solutions had to be found for Windows, Linux, and Mac OS X. Possibly later other BSD variants too.

On Windows things turned out to be fairly easy. There's a dedicated system call, WSADuplicateSocket(), for exactly this scenario. This syscall fills in an opaque structure in the doorkeeper process, whose binary content you then transmit to the service process (say, through pipes that you set up when the service process was started). In the service process another system call recreates a copy of the socket for you, and then you're free to close the doorkeeper's socket handle.

Things were not quite as straightforward on the Unix side. (At least it works the same way on Linux and Darwin, modulo a few ifdefs). The way to move an open file descriptor between existing processes is to send it as out-of-band data (aka an "ancillary message") with the SCM_RIGHTS tag on a Unix domain socket (aka an AF_UNIX/PF_UNIX socket). The kernel knows that SCM_RIGHTS is magic and processes its payload of file descriptors such that the receiving process receives descriptors that are valid in its own context, in effect an inter-process dup(2).
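For readers who have not met SCM_RIGHTS before, the core of the handover looks roughly like this. It is a minimal blocking sketch, not the code we ended up with at work. The names send_fd and recv_fd are invented here, exactly one descriptor is passed per message, and one dummy data byte is sent along with the ancillary message.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Control buffer big enough for one descriptor, aligned for cmsghdr. */
union fd_ctrl {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor `fd` over the connected AF_UNIX socket `sock`.
 * Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    assert(fd >= 0);
    char dummy = 'F';                    /* one byte of ordinary data */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;

    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;        /* payload is an array of fds */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor sent by send_fd(). Returns the new descriptor,
 * valid in the calling process, or -1 on error. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;

    memset(&ctrl, 0, sizeof ctrl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof ctrl.buf;

    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof fd);
    return fd;
}
```

After a successful send_fd() the sender can close its own copy of the descriptor; the receiver's copy keeps the underlying connection open.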

Or at least, this was the only way to do it that a few hours of googling disclosed to me. It works, but ... ho boy, Unix domain sockets! In my 20+ years of programming, many of them with a distinctly Unix focus, I've never before encountered a situation where AF_UNIX is the preferred solution. This turns out to be not entirely without cause.

Firstly, Unix domain sockets are sockets, which means that you have to go through all the motions of the client/server-based sockets paradigm we know and love from the TCP/IP world. First the two processes must somehow agree on a rendezvous address. Then one of the processes creates a socket, binds it to your chosen address, calls listen() and accept() on it, then gets a second socket on which the actual SCM_RIGHTS transaction can take place, and finally closes both of its sockets. Meanwhile the other process goes through a socket()–connect()–communicate–close() routine of its own. Many of these calls are potentially blocking, so if you're doing non-blocking I/O and want to schedule unrelated work packets in your thread while all this goes on (which we happen to do), the entire transaction needs to be broken into a fair number of select()-triggered work packets. Not that any of this is difficult to code as such, but it is still about the most complex-to-set-up way of doing IPC Unix offers.

Secondly, Unix domain sockets are also files, or more accurately, they are file-system objects. A listening socket has a name and a directory entry (otherwise it cannot be connected to) and an inode. The inode and directory entry are created automatically by the bind() call. This means, among other things, that you have to create your socket in a particular directory, and it had better be one where you have write permission. All of the standard security considerations about temporary files kick in – if you create your socket in /tmp, somebody might conceivably move it out of the way and substitute his own socket before the client process tries to connect, and so forth.

Further, cleaning up after the transaction becomes an issue. The inode for the listening socket will disappear by itself once nobody is using it, as is the nature of inodes. However, the directory entry counts as using the inode, and it does not go away by itself. You need to explicitly unlink(2) it in addition to closing the socket. If you forget to unlink it, it will stick around in the directory, taking up space but being utterly useless because nobody is listening to it and nobody can ever start listening to it except by unlinking it and creating a new one in its place with bind(). In particular, bind() will fail if you try to bind to an existing but unused AF_UNIX socket inode.

What happens if the server process dies while it listens for the client to connect? The kernel will close the listening socket, but that will not make the directory entry go away. You can register an atexit() handler to do it, but atexit() handlers are ignored if the process dies abnormally (e.g. if it receives a fatal signal). There is essentially no way to make sure that things are properly cleaned up after.

This is a major downside of Unix domain sockets. It is very probably impossible to change, because of the fundamental design choice of the Unix file system model that you can't get from an inode to whatever name(s) in the file system that refer to it. Not even the kernel can. But, good reason or not, it means that you either have to accept the risk of leaking uncollected garbage in the file system, or try to use the same pathname for each transaction, unlinking it if it already exists. Unfortunately the latter option conflicts with wanting to run several transfers (or several instances of the doorkeeper process) in parallel.
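The "same pathname, unlink if it already exists" option can be sketched as below. The helper names listen_unix and close_unix are made up for this illustration. Note that the unlink() is unconditional, so it removes whatever happens to sit at that path, including the socket of an instance that is still listening; this is exactly the conflict with parallel operation mentioned above.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a listening AF_UNIX socket at `path`, first removing any stale
 * directory entry left behind by an earlier run. Returns the listening
 * descriptor, or -1 on error. */
int listen_unix(const char *path)
{
    assert(path != NULL);
    struct sockaddr_un addr;

    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    if (strlen(path) >= sizeof addr.sun_path)
        return -1;                       /* path does not fit in sun_path */
    strcpy(addr.sun_path, path);

    unlink(path);                        /* failure (ENOENT) is the normal case */

    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Orderly shutdown: close the socket and remove its directory entry. */
void close_unix(int fd, const char *path)
{
    close(fd);
    unlink(path);
}
```

If the process dies before close_unix() runs, the directory entry stays behind, and the next call to listen_unix() with the same path cleans it up.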

Access control for the Unix domain socket transaction is another issue. In theory, the permission bits on the inode control who can connect to the listening socket, which would be somewhat more useful if you could set them using fchmod(2) on the socket. But at least on Linux you can't do that. Probably you can use chmod(2) after bind() but before listen(), but you'd need to be sure that no bad guy has any window to change the meaning of the pathname before you chmod it. This sounds a bit too tricky for my taste, so I ended up just checking the EUID of the client process after the connection has been accept()ed. For your reference, the way to do this is getsockopt() with SO_PEERCRED for Linux, and the getpeereid() function for BSDs. The latter uses a socket option of its own internally, but getpeereid() appears to be more portable. Note that on Linux you have the option to check the actual process ID of your conversation partner; BSD only gives you its user and group IDs.
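A sketch of the credential check described above, with the platform split hidden behind one function. The name peer_euid is invented for this example. The Linux branch assumes glibc, where struct ucred is only visible with _GNU_SOURCE defined before the first include. The other branch assumes a system that provides getpeereid().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE                      /* for struct ucred on Linux/glibc */
#endif
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Store the effective UID of the peer of the connected AF_UNIX socket
 * `sock` in *uid_out. Returns 0 on success, -1 on error. */
int peer_euid(int sock, uid_t *uid_out)
{
    assert(uid_out != NULL);
#if defined(__linux__)
    struct ucred cred;                   /* pid, uid and gid of the peer */
    socklen_t len = sizeof cred;
    if (getsockopt(sock, SOL_SOCKET, SO_PEERCRED, &cred, &len) < 0)
        return -1;
    *uid_out = cred.uid;
    return 0;
#else
    gid_t gid;                           /* reported too, unused here */
    return getpeereid(sock, uid_out, &gid);
#endif
}
```

The doorkeeper would call this right after accept() and drop the connection unless the reported UID is the expected one, for instance its own geteuid().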

I'm as critical as the next guy about stuff that comes from the Dark Tower of Redmond, but in this particular case the two-click approach that works on Win32 does appear to be significantly less complex than the Unix solution.



2008-06-07

Proposing a constitutional reform

Thursday was Constitution Day in Denmark. I went to a celebratory get-together organized by the party's Hvidovre branch, and spent some quality time sitting on a lawn in the sun, eating brunch and listening to live jazz. Very nice.

The main (and only) speaker was our local MP Morten Helveg Petersen, who for about a decade now has been calling for a rewrite of the constitution. Not, if I understand him correctly, because there is anything directly wrong with our current one, but more because its language and organization are antiquated. It is hard to understand in general for people who don't know what it means already – it defines our form of government as being "limited monarchial" and says "the King" whenever it means "the government". Its enumeration of human rights and political liberties is rather sketchy compared to what our neighbouring countries have. It creates no end of trouble each time the EU wants a broader jurisdiction. And so forth.

Isn't that all exciting? I guess not, and therein lies a problem. I have nothing against most of the revisions Morten wants to make, but – I'm very sorry – it is hard for me to see why we should bother. Apparently the plan is that the revised constitution should say to do things exactly how we do them already, only expressing that more clearly and thoroughly. This does not appear to me as a promising approach to constitutional reform. The optimistic take on this would be that if even our most vocal proponents of reform don't go beyond merely editorial changes, we as a country must be doing pretty well. Which we probably are, but who says we cannot do better yet?

So let us stir things up a bit. I propose that we reintroduce a bicameral legislature!

(For those following along outside Denmark, the original democratic constitution of 1849 created a legislature consisting of an upper house, the Landsting, whose election rules favored landowning and conservative interests, and a lower house, the Folketing, with a closer approximation to universal suffrage. The precise rules changed several times over the years, and after the 1936 elections the relative strengths of the parties were the same in both houses. This made the Landsting increasingly irrelevant, and it was abolished in the 1953 Constitution which is still in force. Today the Folketing alone makes up a unicameral legislature.)

There are three main points in my proposal:

  1. The Folketing becomes the upper house of a bicameral parliament. It is supplemented by a new Lower House, name to be decided. (Put suggestions in the comments, please!)
  2. Members of the LH are not elected, but selected by lot among citizens who have registered as willing to serve. Seats are refilled on a rolling basis, such that new members are continuously entering the LH.
  3. Bills are heard and voted on first in the Folketing and then in the LH. Add the percentage of ayes in the Folketing to the percentage of ayes in the LH. If the sum exceeds 100%, the bill becomes law.

The idea of selecting members by lot goes back to the democracies of ancient Greece. Classical Athens had a 500-seat executive assembly called the boule, selected by lot among the citizenry. Lots were also drawn to fill most public offices; only a small number of individual offices were filled by election. I don't know any modern democracies that have taken up this idea for its formal legislative process. But actually we're not that far from it; many politicians act with great deference to the results of random opinion polls. One way to think of my Lower House is as a standing opinion poll and "focus group" for the Folketing. Its opinions will be at least somewhat more informed than those emerging from a point-blank phone poll, because the members are full-time legislators and have some time to read and consider each bill before voting on it.

In most existing bicameral legislatures, a bill has to pass in each house individually. My "add the percentages" method is what I think is really innovative in my proposal. Here is how it works:

Because members of the LH have no hope of re-election, they cannot be held to any partisan or ideological allegiances. They are unlikely to depart from their own honest opinion just because it might not be popular in the electorate at large. They don't have to keep friends with anybody in the LH after their term expires. They will have few tools for enforcing parliamentary deals among themselves, and are thus unlikely to make such deals in the first place. In short, the LH will be an extremely unruly and unpredictable lot.

And this is a good thing? Yes it is, because the Folketing votes first on every bill. At that time, it is impossible to be sure how much support it will have in the LH, and therefore ministers who want their bills to pass will have an interest in securing as large a majority in the Folketing as possible. It is said jokingly that politics in our current system is all about "being able to count to 90", that being the size of a majority bloc in the 179-seat Folketing. A minister who makes a concession to an opposition party in order to get 100 votes for his bill rather than 90 is a fool; he does not need more than 90 votes. With an unruly LH in the equation, a minister doing this gets a real advantage, namely that his bill is less likely to be killed in the LH.
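To put numbers on that advantage, here is a small sketch of the add-the-percentages rule. Two things in it are my assumptions and not part of the proposal as stated: a Lower House of 100 seats (the size is left open above), and percentages measured against all seats, so that absentees count as nays.

```python
from fractions import Fraction

FOLKETING_SEATS = 179   # from the text
LH_SEATS = 100          # hypothetical; the proposal does not fix a size


def bill_passes(folketing_ayes, lh_ayes,
                folketing_seats=FOLKETING_SEATS, lh_seats=LH_SEATS):
    """Add the share of ayes in each house; pass if the sum exceeds 100%."""
    total = Fraction(folketing_ayes, folketing_seats) + Fraction(lh_ayes, lh_seats)
    return total > 1


def lh_ayes_needed(folketing_ayes,
                   folketing_seats=FOLKETING_SEATS, lh_seats=LH_SEATS):
    """Smallest number of LH ayes that lets the bill pass, or None if none does."""
    for ayes in range(lh_seats + 1):
        if bill_passes(folketing_ayes, ayes, folketing_seats, lh_seats):
            return ayes
    return None
```

Under these assumptions a bill with 90 ayes in the Folketing needs 50 of the 100 LH members, while one with 100 ayes needs only 45. Exact fractions are used so that a sum of precisely 100% is not misjudged by rounding.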

Thus the net effect of the LH will be a much greater incentive for the government to seek broad compromises among the Folketing's parties, which I count as a definite win for our democracy.

It gets better yet. The "professional" politicians in the Folketing will have a direct interest in having the LH members vote for their bills (or against bills they would like to fail). They will want to convince a certain number of LH members that this is a good (or bad) bill. The "amateur" members of the LH have little reason to be impressed with appeals to tactics or political tit-for-tat (you might promise me to vote for something else I want next month, but how do I collect on that promise if you don't keep it and my term expires shortly after?) or technocratic gobbledygook. The professionals will need to produce and present actual convincing and intelligible arguments about the merits of each bill. Certainly it will not be possible just to look the other way, cross one's fingers and hope that people will have forgotten this come next election.

As a bonus, the list of participants in the Lower House seat lottery can also be used to draw pools of lay judges and jurors for the criminal justice system. The current procedure for this is ill-defined and highly biased towards members of organized parties. That could definitely use a makeover, too.

2008-06-05

Not quite a blog: The Easterbrook files

No long-time reader of this blog would accuse me of being a prostrate admirer of everything American. But one thing I do find quite nifty is the detailed and well-reasoned judgements their court system produces. They're generally freely available on the web. They are fairly accessible once one learns a few key words and turns of phrase. Some of them are even entertaining reads, in the best sense of that word. I often wish we had something like them in Denmark, where analysis and discussion are not usually found in court rulings.

Now, before you all lynch me, there are certainly many features of the American legal system that I definitely would not want to import. For example, the ground rule that each party in a lawsuit ends up paying his own lawyer's fees, no matter who wins (except when there is an explicit statutory provision otherwise) is not one I'd want to live under – it appears to encourage "nuisance suits" where the plaintiff knows that his case would likely not survive an actual trial, but hopes to get the defendant to settle out of court for a smaller sum than it would cost him to litigate for the win that he is clearly entitled to. And the idea of using juries to decide civil suits strikes me as quaint, to say the least. Many of the laws that American judges have to apply are, of course, batshit insane, but that is hardly the fault of the judges themselves. And something appears to be seriously broken about their tort system.

None of that keeps me from liking the form of the judgements.

In particular, I've become quite the fan of the opinions written by Chief Judge Easterbrook of the U.S. Court of Appeals for the 7th Circuit. You'll find his latest opuses here, intermixed with those of his colleagues; there is no option to limit the search by author. [Amended June 15: yes there is, with RSS feeds even!]

My first encounter with Easterbrook's writing was this one in which the infamous Wallace v. GPL case was finally put to rest. As a longtime user, writer and sometimes advocate of free software, I'm of course pleased with the legal conclusions in there, but they cannot really have been difficult to arrive at. What struck me was that the writing was beautiful. At first I thought that the text I'd arrived at through a dubious chain of links couldn't possibly be the actual ruling of the court. It read more like an essay than an official document. Perhaps it was a summarizing article for some magazine? But no, this was the real stuff.

The Wallace opinion mentioned in passing that it and similar "opinions" were available for free on the court's website, so I went there and took a look. As it turned out, not all the opinions I sampled reached quite the heights of prosely pleasure that Wallace did. It stands to reason that many court cases are inherently so full of stuffy details and other drudgery that they cannot possibly be made interesting to read about. But a fair number of them proved interesting enough reads to be worth my time.

Among the highlights is one case which opens like this:

[Defendant], a tax lawyer whose opinion letters while at [a law firm] led to the firm's demise (it had to pay more than $75 million in penalties on account of his work), designed a tax shelter for himself, with one client owning a 37% share. Like many tax shelters it was complex in detail but simple in principle, and to facilitate exposition we cover only its basics, rounding all figures.
[...]
A transaction with an out-of-pocket cost of $6,000 and no risk beyond that expense, while generating a tax loss of $3.6 million, is the sort of thing that the Internal Revenue Service frowns on.

and another one which carefully dissects a piece of statutory language:

There remains the question whether the paraphernalia conviction relates to "simple possession of 30 grams or less of marijuana". If [defendant] had been caught with the pipe and five grams, the answer would be yes. As it happens, he was caught with the pipe and zero grams. Yet zero is less than five. The ancient Romans and Greeks did not think zero a number, but today we understand that zero is smaller than 30.

You can't make that up. Some you can't even excerpt without doing injustice to the writing: this one you'll just have to read for yourself.

Yet – and this is an important point – for all the colorful (sometimes downright snarky) language, Easterbrook always manages to convey the impression that justice is in fact being dealt out, and that things that need serious thinking about have gotten it. Doing that while still being an entertaining read takes serious rhetorical talent.

Respectfully recommended for your reading pleasure.

(Be sure also to enjoy the excellent typography of the 7th Circuit opinions. Most of the other circuit courts whose websites I have sampled do provide the text all right, but in rather ugly formats).

P.S. I do know one good argument for having less detailed judgements in the Danish system: time. It takes lots of time and care to draft opinions to the American standard, which would not only burden the already strained resources of the courts, but also add delays of weeks or months to each particular case. The people whose actual money or futures are at stake might reasonably prefer their judgements to arrive quickly rather than being beautifully presented.

2008-06-04

Fridge logic: Ender's Game

Okay, so we've got hyper-intelligent children. We've got antigravity and instantaneous FTL communication and (if only in the first three pages of the book) direct neural interfaces. We've got bug-eyed, telepathic space aliens with hive minds, and a nasty breed of wasp that stings without waiting to be insulted first. We've got a Fantasy Game which cannot quite decide whether it is HAL or merely Eliza.

And we accept all that because science fiction is all about disbelief properly suspended. We accept that interstellar war works exactly like Napoleonic-era land warfare, except that it's in 3D. One side's army meets another side's army at a designated time and place in empty space, and then they have a battle, and one of the armies wins. Each battle can be planned and fought and won in a single sitting, 10 to 15 hours tops (this appears to imply that battles take place on a spatial scale much, much smaller than the distance between IPL and Eros, which takes three months to traverse in the fastest available craft). We even accept that Special Relativity seems to apply only halfway, because time dilation does occur but everybody shares the same time coordinate, or the ansible would make no sense.

I don't complain about any of this. Really. It's okay.

But I've been wondering about the Battle School and how it hangs together on its own premises.

See, the point is to take gifted children and train them for military command from an early age. You want to identify and stimulate those children who turn out to excel at leadership. They'll get the most demanding subsequent education and have golden career paths ahead of them. Those who don't excel quite that much will nevertheless end up commanding something. "None has retired from a position of lower rank than chief executive officer on an interplanetary vessel".

However: Not many of the Battle School students seem to get much, or any, hands-on command experience. Commanders do, of course. Toon leaders, perhaps. The subtext seems to be that before Ender, toon leaders are mostly for passing on the commander's executive decisions. But the majority of students are common soldiers, whose responsibility is limited to keeping formation and shooting straight. That's supposed to prepare anybody for command? Are two thirds of the children the I.F. spends fortunes launching into orbit doomed to never get a chance anyway? Or is the system set up such that almost everyone gets a turn at commanding before they graduate? (Mick implies as much: "All the guys from my launch have their own teams now. Not me.")

So I sat down to do the numbers on the Battle School, in its pre-Ender steady state.

We have some hard input data. An army comprises a commander and 40 kids, 4 of whom will be toon leaders. We don't know how many armies there are, but that's OK; everything will scale with the number of armies. Students enter the school at age 6. They are promoted into an actual army after they turn 8. The earliest possible graduation is at 12 years, but I cannot find a definite source for the typical graduation age. Let's put it at 13 years; that leaves three years for pre-Command before entering Command School at 16. Now if we can estimate the average time a commander is in command before he graduates, we can compute the percentage of students who get to be commanders at some point in their course.
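A back-of-the-envelope version of that computation might look like the sketch below, with the commander's tenure left as a parameter. The simplifications are mine and not from the book: armies are always full, every student spends the whole time from 8 to 13 in an army, and a commander keeps his post until he graduates.

```python
from fractions import Fraction

ARMY_SIZE = 41        # one commander plus 40 soldiers
PROMOTION_AGE = 8     # age at which a student joins an army
GRADUATION_AGE = 13   # the guess at the typical graduation age made above


def commander_fraction(tenure_years):
    """Steady-state share of students who ever command an army.

    Per year, each army graduates ARMY_SIZE / (years spent in an army)
    students and uses up 1 / tenure_years commanders.
    """
    years_in_army = GRADUATION_AGE - PROMOTION_AGE
    graduates_per_year = Fraction(ARMY_SIZE, years_in_army)
    commanders_per_year = 1 / Fraction(tenure_years)
    return commanders_per_year / graduates_per_year
```

For example, commander_fraction(1) is 5/41, or about 12%, and commander_fraction(2) is 5/82, or about 6%. The number of armies cancels out, as promised.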

Major Anderson tells us that "Usually they go [commander] at 11". That leaves one or two years of command before graduation, which is consistent with other data. For example, fresh commanders do not have battles for the first three months; they would need to be in the rotation for battles for an appreciable multiple of that time, or comparative rankings of armies and soldiers would be meaningless. Battle is usually every two weeks. When we first meet Bonzo Madrid, we hear that "Salamander Army is just beginning to emerge from indecent obscurity. We have won twelve of our last twenty games", which presumably means 20 games since Bonzo's three-month break-in period. Thus Bonzo has been a commander for just about a year ...

Um, wait just a minute here.

Three years after that, Bonzo is still commander in Salamander Army when he fights Ender in the shower. A short time before that Graff tells General Pace that if Bonzo were to be graduated now it would be "ahead of schedule" and reveal to Ender that he is being protected. However, by then Bonzo must be at least 15; he would be long overdue for graduation.

Ladies and gentlemen of this supposed jury, it does not make sense!

And it's not just Bonzo. Ender joins Rat Army a few days after his 7th birthday; at that time Dink Meeker is a toon leader in Rat, but has been offered promotion to commander (and refused it) twice. He is also still at Battle School when Ender graduates. In Ender's last battle for Salamander, the opponent is Leopard, commanded by one Pol Slattery. And on the morning of the shower fight, Ender commands Dragon against Pol Slattery's Badger Army. If it's the same Pol Slattery, he must be around 14 at that time.

Even worse things surface if we turn to Ender's Shadow. There, on the day after Ender graduates from Battle School, Bean goes to the commanders' mess and extemporizes a speech against the competitive standing system. Among the senior commanders that he has to convince, we find Shen and Alai! Those two are explicitly from Ender's launch, but now they're suddenly commanders, at most a few months after Ender himself went commander at an impossibly young age.

Has Orson Scott Card no respect for chronology?

Of course he has. But, for the ansible to work, he needed to find somewhere to stow away that pesky relativity of simultaneity from SR, to wit, at Battle School. Only here it does not apply to observers with a nonzero mutual velocity; it applies to students in their respective courses of study.

P.S. Join me next week when I model the economy of Lusitania. How many full-time brickmakers can it sustain?

P.P.S. Just kidding.

P.P.P.S. Obligatory xkcd reference.

2008-06-02

Comment policy

(This sure seems pretentious, to start by putting up an officious comment policy. However, if this does end up being a huge success, it will be easier to find it if it's right at the end of the archive. Never hurts to think ahead just a bit).

For the time being, new comments on old posts are welcome and solicited. Just don't expect an immediate response. For that matter, don't expect immediate responses to new comments on new posts. My past record in responding to things such as email is less than stellar, so even with the best of intentions I probably shouldn't be making any promises here.

If you just want to get into contact with me and your point is not related to any particular posting, please use email instead of commenting on a random post. I make no promises to answer email within any reasonable timeframe, but that is no worse than for comments.

Please imagine that the next paragraph is written in lawyery all-caps. I do not plan to have to invoke it, but might if a troll or spammer shows up.

I reserve the right to censor any comment without warning, without apology, and without any attempt at fairness or consistency. Freedom of speech means you get to start your own blog or website, not that you have any inherent right to speak here.

First p0st!

Yeah, so I got myself a blog.

Somewhere inside my skull there's a slightly younger and more principled me desperately trying to grab my attention: What on earth for, you dummy?! You've got a perfectly serviceable website already. You want to tell something to the world, you put it up on your website. And it's true, I could do that. But I sorta never got around to doing it, more or less because putting stuff on my regular website seemed to require that I find some good structure for it. The blog concept offers the extraordinary idea that it is okay not to try to structure the inherently unstructured; you just keep a chronological list of everything in the order you wrote it and leave it to the search engines to try to make order of it. If that's what it will take to get me to commit my Messages to the Cosmos to ink (or, at least, bits), then it deserves a try. Also you won't get comments or backpings from putting up static web pages, and I'd like to play with something more reactive.

Fair enough, the imaginary younger me concedes. But, look, ... why on blogspot?! You're a moderately accomplished hacker, you run your own GNU/Linux server, the least you could do is to install some blogging software on that and host it yourself. And indeed, that is what the younger me would have done. (Actually, I suspect that he would have preferred to code his own blog system using perl scripts and duct tape). However ... these days I just don't care. If Google wants to host my blog for free, and I don't have to spend time managing the software or, even worse, writing it, then it looks like a win to me. I'll try to remember to make local backups of my postings.

It perhaps bears explaining that the younger me dates from a time before I left academia and started writing software for money full-time. I get all the programming I want to do done while at work, and I get a good appreciation of the fact that software which I designed and wrote is running right now to provide valuable services to customers who pay real money for that. That sort of thing does wonders for one's self-confidence – which I'd never have thought needed any wonders done to it. But the fact stands that I no longer think I need to do things the hard way, as if to prove that I'm a real grown-up who can do things the hard way.

There's a different sub-self which is quite cross at me because I'm writing this blog in English rather than my native Danish. This sub-self insists that I ought to know better than to further the "domain loss" of Danish when most of the people who could conceivably care what I think share my native language and yet I don't use it in the blog. This argument has merits, but in the end vanity won out. I increase my potential audience by at least a hundredfold by writing in English; given the low number of people who are likely to care what I think, this could mean the difference between an audience and no audience. Sorry, but the Danish language will have to survive without the help of this blog.

So what can you expect to find here?

Basically, anything that catches my fancy, or that I think The World ought to know. I'm not going to try to stick to any particular topic or format. In no particular order, I'll probably have something to say about physics, railway track layouts, science fiction, programming languages, politics, going barefoot, Copenhagen and a dozen other odd topics. The format will range from carefully constructed essays to quick by-the-ways. Posting frequency will be whenever I write something. We'll see whether I'm going to run out of topics to write about.

In general, I'm going to assume that the reader knows at least approximately what I'm talking about. Some postings will have heavy prerequisites. Others will not. It's not as if I'm aspiring to have any "regular readership" anyway. I'm just putting stuff out on the net and relying on the search engines to bring forth interested readers for each piece – or not.

I promise that I will never make any self-deprecating jokes about how low my readership is. I will probably try to use the phrase "as long-time readers of this blog will know" with a straight face within the first ten postings. People either read this or they don't. And if they don't, then at least an empty universal quantification is always true.