February 18, 2004

E-mail Failure Affects 3,200


Many on campus are still smarting from a Feb. 10 e-mail shutdown that left 3,200 users without access to their mail from approximately 2:30 a.m. to 2:00 p.m. The root of the shutdown, according to an e-mail from Don MacLeod, Cornell Information Technologies (CIT) assistant director of client/server systems and services, is believed to be “an error in [vendor Sun Microsystems’] operating system.”

According to the e-mail, only users whose mail is routed through the “postoffice8” server were affected. Postoffice8, one of five CIT mail servers, consists of four mail spools. Only one of the spools was affected by the failure. This isn’t the first time this spool was impacted. A similar system failure occurred on Jan. 19, though its magnitude was not as severe as the most recent malfunction. The root of both problems appears to be the same.

In addition to users losing the ability to send and receive mail for nearly 12 hours, old mail stored in postoffice8 accounts on the affected spool was temporarily lost. As of yesterday, the old e-mail had not yet been recovered.

“We’ll be starting the restore process tomorrow,” MacLeod said yesterday. “Our indications seem to be that a large percentage of it will be restored, [but] we don’t know an exact percentage.”

CIT was inundated with inquiries from postoffice8 users who wanted to know why all of their mail seemed to have been erased. Most students do not back up their mail and, as a result, won’t see any of their old mail until today at the earliest.

“The collection of user mailboxes that make up an e-mail spool file system contains many millions of small individual files,” MacLeod said, explaining why the retrieval process takes so long. “Backing up and restoring a large volume of data that is stored as many small files is far less efficient (and [more] time consuming) than backing up the same volume of data that is stored as a small number of much larger files.”

CIT is implementing a number of measures intended to prevent similar problems in the future and to let users take steps to preserve their future e-mail. In addition to waiting for a new server, MacLeod added, “we are working on documentation and procedures that would allow [postoffice8] users to self-migrate to another server.”

Ultimately, though, MacLeod does not expect postoffice8 to keep failing once the new measures are in place, and he does not believe users should feel the need to switch to another postoffice.

“[The change] would be more psychological than physical,” MacLeod said.

Archived article by Billy McAleer