Tough Week - The First of 2008

February 21, 2008 | Author: Nerbie | Filed under: General Topics

I had a tough week this week. In the last 96 hours I slept less than 24. It all started last Sunday, when I found out that one of our web hosting servers, which serves 100+ small to medium websites, had some corruption on its Linux file system. Given that, I decided to do a Linux OS reload. First, I needed to get the latest backup of all the accounts on it, which took 12 hours to finish. Later that night the admin pulled the plug for the reformat and OS reload. Naturally, I had to stay up the entire night until Monday morning in case any decisions were needed.

Come Monday morning, the server was ready and I started restoring accounts from backup. It could have been the end of the story before lunch, but I found out the datacenter admin had made a very wrong move: he used different IP addresses for the server. Of course, the DNS changes for the domain names then needed to propagate. That made some accounts look down and unreachable from clients’ Internet connections. So that was it. It’s very stressful when clients are mad because they cannot send email or browse their websites. Everything was more or less back to its original state by Monday evening.

That’s not the end of the story. Come Tuesday afternoon, another server suddenly showed a very high CPU load. Imagine, it was always at 98%. No matter what I did, it kept going back to 98%. Web browsing and sending and retrieving email were all affected. The server was up, but it was very slow. Again clients were calling and texting left and right. Of course I was replying to them and picking up my mobile phone, though sometimes I just couldn’t since I was also working on the server. My server admin abroad also helped me with the issue. He ran some rootkit hunter checks, suspecting the server was compromised. It took several hours to figure out what went wrong, and we eventually got it down to around 60% CPU usage. The server was clean except for some poorly coded PHP scripts used by some clients, but CPU usage was still abnormal.

Around 3 AM (Wednesday) I decided to ask the datacenter to do a hardware check. Basically, we needed to pull the plug. Again I needed to be up, waiting for the result of the hardware testing. Past 5 AM, bingo! The culprit? The server’s cooling fan was no longer in top condition. After it was replaced, the server worked normally, with CPU usage down to 10% or even less. What a week, I would say.

So you think it’s over? Not quite. I have a new client in Korea, a Filipino whose website is currently hosted on our newer server. His website, barely a month old, was a hit, with visitors who are mostly OFWs in Korea and Canada. The site started with 400 MB of disk space and 10 GB of bandwidth. Within a week it used up the 10 GB of bandwidth, so I upgraded the account to 50 GB. As of today it’s nearly hitting that limit too. To make the story short, the site was affecting the normal operation of the server. Of course some clients were affected, since server load goes up during peak hours with this site around. I decided to take the site down temporarily, and guess what? Since my mobile number was posted on the suspended page, people abroad were calling and texting me, asking why the site went down. Grabeh! I could barely handle it, since I needed to explain to them that I had already talked with the owner and that I was only doing what the owner of the site wanted us to do.

Today I ordered a new dedicated server for this account, since the owner allowed me to move his account to a new server. Hopefully it will turn out well tonight, if I can get the server.

That’s all for now.

xygoxen
