I am alive, but RAID 5 is not

Uncategorized — Антон Марчуков @ 15.12.06 22:15

Do not think that I’ve suddenly disappeared. I am still here. On Monday I came to work and got a call asking whether our mail server was working. That was the beginning of a week full of hard work. The RAID 5 array on the mail server, which serves several thousand users, had failed, and we had not set up a backup system yet. Actually, that was in progress at the time, but death came first :-(

To make things worse: when we got to the point where we could read something from that RAID, we found a broken UFS file system on it (our sysadmins prefer FreeBSD even on mail servers).

And to make things even worse: the master LDAP server was on that mail server. It was replicated to a daemon on another server… and yeah, that second daemon had hung some time ago, so replication had not been working for a month or so. And of course there were no backups at all (I had some, but they were also a month old).

So this week I came to work at 8:30 and worked without a break until 20:00 every day. I did several things at once and set up so many servers that I can hardly count them; I may never have installed so much software in my whole life.

It is almost finished. The hardware is not considered operational yet, but I have set up two reserve servers that will be merged into one next week. And yes, with the help of our magic skills we have restored all the mail, along with the LDAP database, from the broken UFS partition on the broken RAID array.

The awful part of the story ends here.

