Monday, March 24, 2008

Leap Years and The Calculation

One of the first of many obscure details I learned as a working computer programmer was how leap years truly work. Most adults can probably tell you that leap years come every four years. Many can no doubt even tell you which years they will be, though they may have to refer to their correspondence with Presidential Elections in the U.S. or the Summer Olympic Games. I suspect very few people could tell you the exceptions to the four-year pattern in general conversation.

Little details like this are exactly the kinds of things one must deal with when programming, and it's often critically important to get the full details correct. Leap years actually occur on years evenly divisible by four, excepting those years that are also divisible by one hundred, unless they are divisible by four hundred. Getting things like this wrong can cause really obscure issues. The year 2000 was one of those divisible-by-four-hundred leap years. Interconnected systems are particularly vulnerable to people getting things like this wrong. It's easy to imagine a reporting program that expects input on the first of every month. (Accounting, inventory, and physical observation data are some examples that can all be batched up by remote stations and "mailed in" using this pattern.) If one side of the communications channel gets the date wrong, suddenly things don't work the way they were expected to. It's annoying if you are collecting data on the movement of a glacier using a remote laser range finder. It's lawsuit bait if you are batching accounting transactions for an automated billing system.
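The full rule is short enough to sketch directly. Here is a minimal Python version; the function name `is_leap_year` is my own choice for illustration, and it assumes the Gregorian rule applies to whatever year you hand it.

```python
def is_leap_year(year: int) -> bool:
    """Return True if `year` is a leap year under the Gregorian rule."""
    # Divisible by four, except century years, unless the century
    # year is also divisible by four hundred.
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


if __name__ == "__main__":
    for y in (1900, 2000, 2007, 2008):
        print(y, is_leap_year(y))
```

In real code you would probably reach for the standard library's `calendar.isleap`, which performs the same check. The point of the sketch is that the naive "divisible by four" test agrees with it for every year from 1901 through 2099, which is exactly why the mistake can sit unnoticed for so long.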

Leap years are a simple example. The fun really starts when you get into things like correcting for daylight saving time, or figuring out when Easter will be this year. In the Middle Ages, Easter was a much bigger milestone than it is today, and figuring out the date on which it fell was so important that the process had its own name: Computus. Translated from the Latin, this simply means The Calculation. Tying the observance to both the solar and lunar cycles, while also requiring it to fall on a Sunday, makes for a fairly nasty little computational problem. The basic algorithm goes like this: Easter Sunday falls on the first Sunday after the first full moon on or after the vernal equinox. If you have this year's calendar in front of you, it's easy enough (though you will usually have to agree to assume the equinox falls on the 21st of March). If you don't have a calendar, or need to be making out next year's, or the following, it gets harder quickly. One math doctoral student calculates that the progression of Easter dates takes nearly six million years to repeat. The Wikipedia article on Computus has some examples of the tables that were used to help with The Calculation. Those tables are a medieval version of what we would call a "feature requirements specification" these days.
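To give a feel for how much arithmetic hides behind that one-sentence rule, here is a sketch of one widely published closed-form method for the Western (Gregorian) date, usually called the anonymous Gregorian algorithm and often cited as Meeus/Jones/Butcher. It is an arithmetic shortcut, not the medieval tables, and it relies on the church's tabular approximations of the full moon and a fixed March 21 equinox rather than on astronomical observation. The function name `easter_sunday` is hypothetical.

```python
from datetime import date


def easter_sunday(year: int) -> date:
    """Western (Gregorian) Easter Sunday for `year`.

    Anonymous Gregorian algorithm; valid only for years in the
    Gregorian calendar. Does not give the Eastern Orthodox date.
    """
    a = year % 19                     # position in the 19-year lunar cycle
    b, c = divmod(year, 100)          # century and year within the century
    d, e = divmod(b, 4)               # century leap-year corrections
    f = (b + 8) // 25
    g = (b - f + 1) // 3              # lunar orbit correction
    h = (19 * a + b - d - g + 15) % 30  # days from March 21 to the full moon
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7  # days from that full moon to Sunday
    m = (a + 11 * h + 22 * l) // 451      # correction for two edge cases
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


if __name__ == "__main__":
    print(easter_sunday(2008))  # 2008-03-23
```

Nothing in those single-letter variables tells you what "full moon" or "equinox" means; the definitions are baked into the constants.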

Of course, groups of churches settled on different, incompatible methods for The Calculation. Differences in the interpretation of what "full moon" and "vernal equinox" mean have led, down the years, to divergent Easter dates. Folks, this kind of stuff is why you want to have precise, documented definitions of all the terms used in your specifications...
