Tuesday, February 26, 2008

500-mile Limit For Email

I should preface the following geek-post with some words for our gentle readers who may not have spent a lifetime handling bits, bytes, and software in general. True geeks can skip the intro (or read it and laugh smugly at my own limitations -- Jej).



Nowadays (and this includes the last decade), to send email to someone, one just puts his/her email address (like "my_friend@his_company.com", or "john.doe@mailco.com") in the address field of a form displayed on one's computer screen, and then clicks "Send". The "his_company.com" or "mailco.com" names a computer that receives email. It doesn't matter if that email computer is on the other side of the earth, or in space. The email is routed to that email computer. The email computer takes care of letting one's recipient know that email has arrived.



The internet is built such that the email being sent is routed along a path from the sending computer to the recipient's email computer. The length of the path doesn't matter. So, absolutely *any* talk about some limit on the *distance* one can send email is like talking about a magic beanstalk.



With that, sit back, gentle reader, and enjoy this true story. We'll try to degeek a few geekish terms for clarity as we go -- in square brackets [like this].



The case of the 500-mile email



Read the FAQ [FAQ = Frequently Asked Questions] about the story.



The following is the 500-mile email story in the form it originally appeared, in a post to sage-members [SAGE = the System Administrators Guild, a professional group for sysadmins; sage-members was its members' mailing list] on Sun, 24 Nov 2002:



From trey@sage.org Fri Nov 29 18:00:49 2002

Date: Sun, 24 Nov 2002 21:03:02 -0500 (EST)

From: Trey Harris

To: sage-members@sage.org

Subject: The case of the 500-mile email (was RE: [SAGE] Favorite impossible task?)



Here's a problem that *sounded* impossible... I almost regret posting the story to a wide audience, because it makes a great tale over drinks at a conference. :-) The story is slightly altered in order to protect the guilty, elide over irrelevant and boring details, and generally make the whole thing more entertaining.



I was working in a job running the campus email system some years ago when I got a call from the chairman of the statistics department.



"We're having a problem sending email out of the department."



"What's the problem?" I asked.



"We can't send mail more than 500 miles," the chairman explained.



I choked on my latte. "Come again?" [The Stat chairman talking about a magic beanstalk.]



"We can't send mail farther than 500 miles from here," he repeated. "A little bit more, actually. Call it 520 miles. But no farther."



"Um... Email really doesn't work that way, generally," I said, trying to keep panic out of my voice. One doesn't display panic when speaking to a department chairman, even of a relatively impoverished department like statistics. "What makes you think you can't send mail more than 500 miles?"



"It's not what I *think*," the chairman replied testily. "You see, when we first noticed this happening, a few days ago--"



"You waited a few DAYS?" I interrupted, a tremor tinging my voice. "And you couldn't send email this whole time?"



"We could send email. Just not more than--"



"--500 miles, yes," I finished for him, "I got that. But why didn't you call earlier?"



"Well, we hadn't collected enough data to be sure of what was going on until just now." Right. This is the chairman of *statistics*. "Anyway, I asked one of the geostatisticians to look into it--"



"Geostatisticians..." [Says our beleaguered email maintenance guy]



"--yes, and she's produced a map showing the radius within which we can send email to be slightly more than 500 miles. There are a number of destinations within that radius that we can't reach, either, or reach sporadically, but we can never email farther than this radius."



"I see," I said, and put my head in my hands. [Talk of magic beanstalks can get demoralizing in the 21st Century.] "When did this start? A few days ago, you said, but did anything change in your systems at that time?"



"Well, the consultant came in and patched our server and rebooted it. But I called him, and he said he didn't touch the mail system."



"Okay, let me take a look, and I'll call you back," I said, scarcely believing that I was playing along. It wasn't April Fool's Day. I tried to remember if someone owed me a practical joke. [Personally, I believe that if this geek had that thought then the answer should be, "Yes" -- but I digress.]



I logged into their department's server, and sent a few test mails. This was in the Research Triangle of North Carolina [a big high-tech area, like Silicon Valley], and a test mail to my own account was delivered without a hitch. Ditto for one sent to Richmond, and Atlanta, and Washington. Another to Princeton (400 miles) worked.



But then I tried to send an email to Memphis (600 miles). It failed. Boston, failed. Detroit, failed. I got out my address book and started trying to narrow this down. New York (420 miles) worked, but Providence (580 miles) failed.



I was beginning to wonder if I had lost my sanity. [The magic beanstalk begins to intrude on reality. It's an ugly disquieting sensation.] I tried emailing a friend who lived in North Carolina, but whose ISP [ISP = Internet Service Provider -- which often provides a central email recipient computer (an email "server") for its subscribers] was in Seattle. Thankfully, it failed. If the problem had had to do with the geography of the human recipient and not his mail server, I think I would have broken down in tears.



Having established that--unbelievably--the problem as reported was true, and repeatable [Scientific Method -- hence, magic beanstalk is real], I took a look at the sendmail.cf file. [sendmail.cf = a file that tells the computer's operating system some specific preferences in how to go about sending email. "Cf" = "configuration file".] It looked fairly normal. In fact, it looked familiar.



I diffed it [diff = to check for the differences, or changes, or alterations, between two copies of a file] against the sendmail.cf in my home directory. It hadn't been altered--it was a sendmail.cf I had written. And I was fairly certain I hadn't enabled the "FAIL_MAIL_OVER_500_MILES" option. [The configuration file allows preferences, also called options, to be changed. Some of the options can be turned on, or "enabled". Geeks often talk about insane imaginary things -- this particular option is one of those: not real, a joke.] At a loss, I telnetted [telnet = a gritty underbelly way to communicate directly between computers] into the SMTP port [SMTP = Simple Mail Transfer Protocol; port = a virtual "door" into, or out of, a computer]. The server [a computer providing some service, like SMTP] happily responded with a SunOS [SunOS = another operating system, from the Sun computer company] sendmail banner [the standard heading information from the email-sender program].



Wait a minute... a SunOS sendmail banner? At the time, Sun was still shipping Sendmail 5 [version #5 of the email-sender program] with its operating system, even though Sendmail 8 [version #8, a newer version] was fairly mature [mature = likely to have few bugs]. Being a good system administrator [sysadmin = job is to keep the computers running correctly], I had standardized on Sendmail 8. And also being a good system administrator, I had written a sendmail.cf that used the nice long self-documenting option and variable names [geeks talk cryptically, but some things are even too cryptic to geeks; hence, geeks often prefer long clear names for things -- to avoid getting themselves overgeeked] available in Sendmail 8 rather than the cryptic punctuation-mark codes [versus nice clear long names] that had been used in Sendmail 5.



The pieces fell into place, all at once, and I again choked on the dregs of my now-cold latte. [This sysadmin has just realized that not only has the magic beanstalk intruded into reality -- something that one must accept due to the Scientific Method -- the beanstalk was *really* real, it had a rock solid reason to exist.] When the consultant had "patched the server," [patch = upgrade part of, without replacing the whole thing] he had apparently upgraded the version of SunOS, and in so doing *downgraded* Sendmail [from version 8 to version 5]. The upgrade helpfully left the sendmail.cf alone, even though it was now the wrong version.



It so happens that Sendmail 5--at least, the version that Sun shipped, which had some tweaks [tweaks = small changes]--could deal with the Sendmail 8 sendmail.cf, as most of the rules [rules = sendmail instructions to do "this" in case of "that"] had at that point remained unaltered. But the new long configuration options--those it saw as junk, and skipped. [Some computer programs are designed to die immediately (called a "crash") if they don't get pure nutritious input; others are designed to skip over impure filthy incomprehensible input and keep on ticking.] And the sendmail binary [binary = a computer program finalized in the computer's raw machine language of ones and zeros] had no defaults compiled in [built in] for most of these, so, finding no suitable settings in the sendmail.cf file, they were set to zero.
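
[For the curious, here is a minimal Python sketch of that failure mode. It is not sendmail's real parser: the one-letter codes, the long option names, and the "name=value" line format are all made up for illustration. The point is only that a parser which skips lines it doesn't recognize, and which has nothing compiled in, leaves every setting at zero.]

```python
def parse_old_style(lines):
    """Mimic a parser that only understands one-letter option codes.

    Hypothetical illustration; the codes and line format are invented.
    """
    # No defaults are compiled in, so every setting starts at zero.
    settings = {"connect_timeout": 0, "queue_interval": 0}
    short_codes = {"C": "connect_timeout", "Q": "queue_interval"}  # made-up codes
    skipped = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no settings
        key, _, value = line.partition("=")
        if key in short_codes:
            settings[short_codes[key]] = int(value)
        else:
            skipped.append(line)  # seen as junk and silently ignored
    return settings, skipped


# A config written for the newer version, using long self-documenting names.
new_style_config = [
    "# long option names (hypothetical)",
    "ConnectTimeout=300",
    "QueueInterval=1800",
]

settings, skipped = parse_old_style(new_style_config)
print(settings)  # both settings are still zero
print(skipped)   # both long-name lines were skipped
```

[Nothing crashes and nothing is logged, which is exactly why this kind of mismatch can go unnoticed for days.]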



One of the settings that was set to zero was the timeout to connect to the remote SMTP server. [timeout = wait for such and such time, then give up if you didn't get what you expected; "remote SMTP server" = an SMTP server program running on another computer somewhere.] Some experimentation established that on this particular machine with its typical load [load = the number and kind of programs currently running], a zero timeout would abort a connect call [connect call = a request to send or receive information directly between computers] in slightly over three milliseconds.



An odd feature of our campus network [network = a bunch of computers that can talk directly with each other] at the time was that it was 100% switched. An outgoing packet [packet = a small chunk of data; everything sent over the internet, email included, is broken into packets] wouldn't incur a router delay until hitting the POP [POP = Point of Presence, the spot where the campus network connects to the wider internet; not to be confused with the Post Office Protocol used to retrieve email] and reaching a router on the far side. So time to connect to a lightly-loaded remote host [remote host = a different computer providing some service] on a nearby network would actually largely be governed by the speed of light distance to the destination rather than by incidental router delays.



Feeling slightly giddy, I typed into my shell [a Unix line-at-a-time command receiver]:



$ units
1311 units, 63 prefixes

You have: 3 millilightseconds
You want: miles
* 558.84719
/ 0.0017893979

[$ = a prompt to type a command; units = a program to translate between different types of measurement units]

[units, prefixes = the number of different units and prefixes known]

[You have: = prompt; sysadmin enters his choice]

[3 millilightseconds = the distance light travels in 3 milliseconds. Signals on the internet travel no faster.]

[You want: = prompt; sysadmin enters his choice]



"500 miles, or a little bit more."



Trey Harris
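
[The arithmetic is easy to check for yourself. The Python sketch below repeats the "units" conversion and then compares the resulting radius with the test destinations from the story. It makes the same simplifications the story does: light in a vacuum, a one-way trip, and a timeout of exactly 3 milliseconds (the story says "slightly over"). The function and variable names are mine, not the author's.]

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
METERS_PER_MILE = 1609.344            # international mile, exact by definition


def light_distance_miles(milliseconds):
    """Distance light covers in a vacuum in the given number of milliseconds."""
    return SPEED_OF_LIGHT_M_PER_S * (milliseconds / 1000) / METERS_PER_MILE


# Assumption: the timeout is treated as exactly 3 ms.
radius = light_distance_miles(3)
print(f"3 millilightseconds is about {radius:.2f} miles")

# Approximate distances (in miles) as quoted in the story.
destinations = [
    ("Princeton", 400),
    ("New York", 420),
    ("Providence", 580),
    ("Memphis", 600),
]

for city, miles in destinations:
    verdict = "delivered" if miles <= radius else "failed"
    print(f"{city:10s} {miles:4d} miles: {verdict}")
```

[Under those assumptions Princeton and New York fall inside the radius, while Providence and Memphis fall outside it, matching what the test emails showed.]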

4 comments:

  1. Wow. Like watching an episode of "House" -- incomprehensible most of the way through, but then just barely coalescing into human-speak in the nick of time, with a pretty cool conclusion to make it worth the ride.

    That was cool. Thanx

    GHS

  2. There's a reason you don't upgrade the OS on a production server. You build the server. You install the services you want it to run. You configure. You test. You adjust, and test again. And when you're convinced that it's Ready for Prime Time, you go live and let real users at it.

    Helmuth Graf von Moltke famously wrote: "No campaign plan survives first contact with the enemy". I apply this to the tech support business as: "No software configuration survives first contact with users." I now have to add "...or 'helpful' consultants."

  3. 3 millilightseconds = 558 miles. I'm sure that knowledge will come in handy some day. I'm just not sure I want to be there when that day comes.

  4. daddyquatro - interestingly, in quantum physics, you often measure time in distance terms based on speed-of-light (because they work more easily that way, but don't ask me to explain it, it's been too long). It is really interesting the first time you work calculations and all your "time" units are in METERS.

    I'll have to admit, though, that I didn't expect the timeout to be the problem. I was figuring on it somehow rejecting all but regional IP address values.

