[lbo-talk] Probably for Entertainment Only

Carrol Cox cbcox at ilstu.edu
Wed Sep 10 08:51:26 PDT 2003


Below is a collection of the reports on ISU's web page re the mail outage that began last Thursday. I have not the slightest idea whether this was a one-time thing specific to ISU or whether it may be an index to more widespread difficulties with e-mail service. Perhaps people more computer-savvy than I can comment.

Carrol

Sep. 10, 2003

Unscheduled Outage - Mail & ULID Services 11:45am 09/09/03

The following services are now up and running:

Webmail, IMAP Email, iCampus Email, Password Changes, ULID Account Activation, ULID Services

POP email is still unavailable. Also, when trying to check email using iCampus, some users may receive a Null String error. If you get this error when trying to check email through iCampus, call the Help Desk at 438-HELP and they will log your call while this problem is being researched.

Unscheduled Outage - Incoming Email 1:30pm 09/04/03

A hardware failure has caused the central server that handles incoming email to be temporarily unavailable. Computer Infrastructure Support Services (CISS) is working with the vendor to resolve this problem as quickly as possible. Outgoing email is fine, but messages are being queued for delivery once the incoming email server is back online.

Update 6:44am 09/05/03

Email through the central servers continues to be unavailable this morning as CISS works to resolve this problem with assistance from the vendor. Further information will be posted to this page as it becomes available.

Update 1:14pm 09/05/03

CISS continues to work on bringing the email server back online. There is an enormous amount of stored email to verify, and doing that safely takes a very long time.

All incoming messages are being queued for delivery and none have been lost or rejected during this outage.

Update 10:17am 09/06/03

Email remains unavailable this morning as data integrity continues to be verified on the enormous amount of stored email. At this time hundreds of thousands of email messages have been delivered to your respective addresses and will be available once we bring the systems back online.

Update 9:11am 09/07/03

The data integrity checks were finally completed early this morning. IMAP service was restored at approximately 4:30 AM. Service via Webmail2 was restored around 8:15am. We will now work to restore mail service to the iCampus portal. Unfortunately, email access via POP will continue to be disabled until the failed hardware components in the server can be replaced later this afternoon. We are planning to take all mail offline at 5:00 pm today to replace the components. We estimate the outage to be no longer than 1 to 2 hours at the very longest. Further details are available at http://www.listserv.ilstu.edu/cgi-bin/wa?A2=ind0309&L=netalert-l&F=&S=&P=180.

[This text is added at the end of this post]

Update 2:52pm 09/07/03

Email access has been temporarily suspended while we investigate performance issues with access to stored email.

Update 6:44am 09/08/03

As reported above, the mail server finished verifying data integrity and was brought back online yesterday morning. Later yesterday, however, older email was becoming re-corrupted upon access, which degraded server performance to the point that access needed to be suspended once again. CISS continues this morning to work with the vendor to resolve this issue and restore service to you as soon as possible.

Update 4:22pm 09/08/03

Email service was restored this afternoon for IMAP connections only (mail left on the server, accessed through an email client like Eudora or Outlook). Webmail2 and POP (mail downloaded directly to the local computer) connections have not been restored, as they consume many times more resources per connection. We are still seeing evidence of data corruption, including duplicate or garbled messages and errors transferring between mailboxes. The system is stable as of this moment, but we are not confident that it will remain so, and it may have to be taken offline again if performance degrades significantly due to the volume of corrupted messages.

CISS continues to monitor the email system, and continues to work toward a permanent solution to these email problems. Again, we appreciate your patience throughout the process. Please continue to monitor this page for the most up-to-date information on the status of campus email.

- - - - - - -

[Detailed report referred to above]

Subject: Outage: Campus Email

Around 1 pm on Thursday, September 4, one of the central email servers unexpectedly became unreachable. This server is the primary email message store and serves as the post office to which your mail is delivered. Due to a hardware failure and subsequent software issues, the service was taken offline. The hardware issues were isolated and addressed by late Thursday evening, with testing done to verify the integrity of the system and the physical disk array. After that time, however, it was necessary to allow the email system to verify all of the stored email and attempt to correct any errors found. There is an enormous amount of stored email to verify, and many errors were found and repaired. This is a very time-consuming process, but absolutely necessary to ensure that users have access to all email present on this server prior to the outage. Although many data errors may have been present prior to the outage, it was necessary for system stability to verify all data present, which dates back to 1999. Our diagnostics show only negligible data corruption; we hope this is indeed the case.

The data integrity checks were finally completed early this morning. IMAP service was restored at approximately 4:30 AM. Service via Webmail2 was restored around 8:15am. We will now work to restore mail service to the iCampus portal. Unfortunately, email access via POP will continue to be disabled until the failed hardware components in the server can be replaced later this afternoon. POP is a very inefficient protocol that consumes considerably more resources per instance than IMAP. We continue to urge you to use IMAP and to set your Check Mail interval to no less than 10 minutes.

We are planning to take all mail offline at 5:00 pm today to replace the components. We estimate the outage to be no longer than 1 to 2 hours at the very longest.

For those desiring more information, the server involved in this outage hosts all stored email, including Inbox contents. Outgoing mail (sending mail) continued to work during the outage, as did the receiving of incoming email. However, because this host was down, our users were unable to view any of their email. None of the messages sent to @ilstu.edu email accounts during this outage were lost or rejected. All incoming email was queued and held on our email front-end servers, then slowly dequeued to its final destination. At this time, those messages are in the final process of being flushed out of the queue, and will be delivered over the next several hours. There were well over 500,000 messages queued for delivery, and they likely will arrive in your Inbox out of chronological order.
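
As a rough illustration of why queued mail can arrive out of chronological order, here is a minimal Python sketch. It is not a description of the actual ISU setup: the report does not say how many front-end servers there were or how their queues were flushed, so the two queues, the `Message` type, and the `drain` function below are all hypothetical. The sketch assumes each front end holds its own first-in, first-out queue and that the queues are emptied one after another.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    received_at: int  # arrival time at the front end, in arbitrary ticks
    subject: str


def drain(front_ends):
    """Flush each front-end queue in turn, one whole queue at a time."""
    delivered = []
    for queue in front_ends:
        while queue:
            delivered.append(queue.popleft())  # FIFO within a single queue
    return delivered


# Two hypothetical front-end servers, each holding mail in arrival order.
front_a = deque([Message(1, "Thu 1:05pm"), Message(4, "Sat 9:00am")])
front_b = deque([Message(2, "Thu 6:30pm"), Message(3, "Fri 8:15am")])

inbox = drain([front_a, front_b])
arrival_order = [m.received_at for m in inbox]
print(arrival_order)  # [1, 4, 2, 3]: nothing lost, but not chronological

# Sorting by the time each message was received restores the order.
by_date = sorted(inbox, key=lambda m: m.received_at)
```

Under these assumptions every message is delivered exactly once, but the Saturday message from the first queue lands ahead of Thursday evening's message from the second. Sorting the mailbox by date, as most mail clients can, puts things back in sequence.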

CISS appreciates your patience during this unscheduled outage, and has worked around the clock to restore email service to you safely, without loss of data, as quickly as possible. Please contact us at ciss at ilstu.edu if you have any questions or concerns regarding this outage.


