As much of the web seemed to notice this morning, several sites running on Rackspace’s servers went down. Yes, again.
For the second time in 8 days, a power outage interrupted service at one of its data centers. And again it was the Dallas center that was affected. This time, however, Rackspace was able to get things up and running fairly quickly and, more importantly, communicated well through its blog and Twitter throughout the downtime.
Still, it raises the question, why do power outages keep taking down a service that so many rely on? They have backups in place, so what’s going on?
Last time, Rackspace blamed the failure on a series of events that began with a power failure, and eventually tripped up its backup system. It isn’t saying exactly what happened this time yet, but if it was the same issue, obviously that’s a problem.
The issue here is reliability. A lot of companies run their services through Rackspace. If it goes down, even for just an hour, that’s lost business. Last time around, Rackspace coughed up as much as $3.5 million in credits to those who were affected. It will undoubtedly have to cough up money this time around as well.
The promise of reliability is presumably one of the reasons Google kept Gmail in beta for so long, finally removing it today after several years. Now, if something happens, there’s no leaning on that beta crutch anymore.
To be clear, Rackspace has a pretty good history when it comes to reliability. Before last week’s downtime, its last major outage was in November 2007. And that’s why it’s so troubling that we’ve seen two outages in just about a week.
[photo: flickr/Gordon McDowell]

That title is one of the funniest ones I’ve seen in a long time.
Because the people that work there don’t actually work. They’re hanging out on crap sites like http://www.anonboard.com all the time, of course everything’s breaking.
Seeing this comment really “grinds my gears.” I started working at Rackspace in the DFW datacenter in Aug 06 working with the DataCenter Operations team. I was involved in getting servers back up during the outage a year and a half ago caused by the truck running into a transformer. I actually got called in after work to help out with it. Making a comment like this is really disrespectful of the people that work in that DC and bust their asses to satisfy customers and meet our goals. When an outage like this occurs, everyone drops everything they’re doing to fix it. I can guarantee you that they aren’t goofing off during times like these. We don’t treat it like a joke because it isn’t. It’s very serious business.
NB – I should have said this before, anything I say on here are my own opinions and don’t represent the opinions of Rackspace. Don’t hold what I say against the company itself.
NB – I should have said this before, but anything I say on here are my own opinions, and not the opinions of Rackspace itself. Don’t hold what I say as a comment from Rackspace or as a representation of anything from them. What I say is said with the information and knowledge of events that I’m aware of. I may not always be 100% correct.
The question really is: is it possible to provide fanatical service that exceeds 100% uptime? I think that’s what hosting services should try to accomplish and something *I* would pay for!
I have some cold fusion I can sell you. Or maybe a 14-inch….. [crap, propriety] … powerstrip.
If they have backups and cloud computing systems, why should customers expect regular downtimes? This is the 2nd time in 8 days!
How exactly do you EXCEED 100% uptime? 100% uptime is near impossible as it is for anybody. Rackspace gives the 100% uptime SLA to say that if you EVER have downtime from infrastructure issues, that we’ll pay you compensation for it and as far as I’ve seen working here, we’ve always stood by it.
Expecting more than 100% uptime is asinine.
Best blog post title ever! (now I’ll read the rest)
I am wondering what all this affected. I use them for Cloud Servers and I just checked my server and it has been up for a while now.
Maybe this is their co-lo stuff? I have actually never had the cloud server go down (over the last month) and I am in the DFW center.
We rent a single managed server in the DFW data center, and our sites went down both last week and today. We’re an inside web team and recently convinced a major division to drop their vendor and work with us on their sites. But now Rackspace can’t keep their server up. Not good!
Funnily enough I received an e-mail survey from Rackspace this morning before the outage, which asked how their “fanatical” service was. The funny part was that it was signed by a VP and included his personal office and cell numbers. I was very tempted to call that cell number today when my sites were down…but ultimately decided it was a dick move that was not likely to get the sites back up any faster.
Hi, Snowwrestler,
Please take advantage of the opportunity to contact that VP and share your thoughts with them. Hit me up on twitter (r_mac) and I’d be happy to provide my direct contact info as well. We’re committed to fixing the issues we’ve experienced and living up to your expectations. Please let me know how I can help.
Best regards,
Richard
Yea, I use them for my Cloud Files and Cloud Servers as well, and haven’t noticed any downtime on my servers. And it’s all located in DFW.
+1 on best title ever
I’m on Cloud Sites – it didn’t seem to affect any sites – just mail & webmail connectivity & couldn’t access control panel or any other rackspace sites.
It affected us–we have a managed dedicated server there. I think this outage was confined to a particular “sector” of the DFW data center because some of our competitors (who are coincidentally also hosted at Rackspace’s DFW data center) did not go down. But that’s just my guess based on anecdotal evidence. Their phones were busy, and their ticket system went offline briefly, so I doubt that it was a trivial number of servers/customers.
The most recent outage was only in a small segment of one of the phases of the DFW datacenter. Doesn’t mean it’s any less damaging or depressing when events like this do happen, but we work through them and improve.
They say it was a power issue but I didn’t see anything actually go offline, just become inaccessible. As in network issues.
All my Rackspace Mail accounts were down for a while, then were rejecting my password, then were back up less than 20 minutes after they first went down.
It could have been power to the routers or switches, not power to the whole place or your servers.
The switches in the cabinets don’t have redundant power supplies. It was only switches that were connected to powerstrips that were powered by one of the UPS units that went down. The problem was related to a buss duct (http://en.wikip...g/wiki/Bus_duct) that supplied power from the UPS unit to the power distribution unit. As far as I understand it, this is a very unusual failure and is pretty unpredictable. I’m not even sure if it’s something checked during routine maintenances.
Glad I have my sites hosted at StrataScale (www.stratascale.com)
You gotta have reliability when you have managed servers…ouch Rackspace
It’s already been a year and a half since that outage? Wow. It weirds me out when I remember stuff from TC years in the future.
We have two servers in the DFW center and we went down for about 5-10 minutes today. After last week’s problems my heart fluttered a bit. But all was good after a few short minutes.
Rackspace still remains the most reliable hosting company I’ve ever done business with, and believe me, there are lots in that list.
This seems to be a trend with datacenters lately for some reason. Glad I don’t own a mission critical site right now.
We have servers at Rackspace and pay a significant premium for their (normally) excellent service. We have recently moved a couple servers to a different, less expensive provider to test. Times are tough and saving 50% on server costs is huge.
The sad part is we chose to leave our DFW servers online (and move the Virginia servers first). The new service, at LiquidWeb, has been significantly better with all of the Rackspace DFW trouble. I really like Rackspace and we’ve been with them for over 6 years, but if the outages continue I’m afraid it will be an easy decision to move.
Here come all the anti-cloud clowns:
“how can you trust the cloud, bla,blabla.. wait a second, my company’s database server is down, and the fallback server isn’t loading. And the IT guy forgot to run backups yesterday..well, no work today.. how much are we paying the backup guy, anyway?? maybe we should hire a restore guy..”
And congratulations to Techcrunch for reporting this news on one of their longtime sponsors!!
Rackspace is continuously trying to please a consumer base that is never happy. Kudos to Rackspace for continuing to work hard for people who do not appreciate the efforts.
Rackspace is working hard to make money. They charge a 3-5x premium for a service they haven’t been providing. I’m sure you would feel differently if you had an affected server.
The premium is for the Fanatical Support and the SLA. Having people available to you on the phone, email, and online 24/7/365 isn’t cheap and having to guarantee that a huge infrastructure like ours doesn’t go down isn’t cheap either. If you don’t need the support or SLA guarantees there are plenty of other places to get that sort of thing for cheaper. It’s a matter of what you need.
I previously worked for a company that had about 15 boxes all at a co-lo. I just don’t understand in this day and age with Amazon/Rackspace Cloud why anyone would go “physical” anymore.
It is only a matter of time before Google gets into the game (fully).
Of course it is only a matter of time before they own the whole Internet and are powerful enough to just charge for everything, i.e. penny per e-mail, etc.
“I just don’t understand in this day and age with Amazon/Rackspace Cloud why anyone would go ‘physical’ anymore.”
Yeah …. ’cause we all know that cloud hosting doesn’t need power to stay up.
huh, RackSpace sucks donkey huevos!
Robert Scoble = trail of destruction
I keep saying it, but can someone name me a company that Mr. Scoble has joined that actually was in better shape after he left? Anyone?
I’m glad I chose to deploy my SaaS with AWS and not RackSpace. Cheaper … and it doesn’t fall over because Amazon haven’t invested properly in infrastructure.
RackSpace have no excuse for their outages – they claim they are fanatical about their service – apparently that’s just lip service.
@Chris: you can tout Amazon all day, but their outage a month ago lasted FOUR hours, which is longer than both of Rackspace’s outages combined and doubled (probably tripled):
http://www.data...zon-ec2-outage/
And it’s not the first time: “EC2 previously experienced extended outages in February 2008 and October 2007.”
http://www.data...zon-s3-and-ec2/
http://www.data...wipes-out-data/
http://news.cne...-9962010-7.html
http://www.roug...y_s3_failed.php
“Why do power outages keep taking down a service that so many rely on? They have backups in place, so what’s going on?”
It could be the hackers that are causing the problems. Hackers can hack through any computer hardware and cause it to fail.
Nutty. I almost used them to host my stuff. http://www.fast2290.com They’ve always been really reliable.
I decided to try GoGrid this time.
Rackspace – Fanatical about Downtime.
Yes The Planet is way better.
We’ve made mistakes, we admit that, but we are definitely not like that. If you really want to poke fun, The Planet had an outage of their own a while back caused by a room blowing up. That outage was much longer than ours and I don’t recall people going as crazy over it as they have with ours. We admit our faults and work to correct them.
Well, I was so close to hosting http://www.appgiveaway.com with Rackspace after hearing lots of good things about them, but I was constantly having problems when testing the website on the Rackspace servers. Anyway, they did have a 30-day trial, so I took my money back and went to The Planet, and I must say they have been great.
Rackspace do have good support, but what use is that if your site is going to be down often? Personally, I would rather have a website up and running without any downtime and no support at all.
Why not just use AWS?
MY GOD .!!
Scoble shows up at Rackspace and suddenly they start experiencing major power outages. I’m just saying!
Not surprising. They were a bunch of nubs a year ago when a foolish compatriot of mine hosted his site there, and they’re no better now.
This is why I host my own servers. If they go down, I’ve got nobody to blame but myself — but more importantly, I know exactly how long it’s going to take to get them back online and precisely why they went offline. Nothing worse than telling customers that their site is down, swearing up and down it’s an isolated incident — and then it happens again a week or two later.
My experience is that colo power redundancy is a joke. Generators usually work, but there are often failures in the mechanical switches and other equipment between the generator and the UPS system. No colo is going to double up on generators (can you imagine the cost?) so if one generator fails to get its power through to the colo, a lot of people are screwed.
Got a call from Rackspace as soon as our site went down…much better response this time around, which was half the problem during the last outage.
We can live with this recent outage.
“The issue here is reliability. A lot companies run their services through Rackspace. If it goes down, even for just an hour, that’s lost business.”
Online Tech prides itself on its reliability. Our data center in mid-Michigan has performed at 100% uptime in power delivery over the last 4 years running, never experiencing an outage.
We truly feel sorry for Rackspace customers. No one wants to lose business.
We are currently offering a special for Rackspace customers: $0 setup fee, one month free.
Upgrade to Online Tech. http://www.onlinetech.com
Classy.
I couldn’t resist responding to this. I’ve seen you guys tweeting like crazy after our outages, being vultures trying to pick off customers that are unhappy. I personally think this is a very douchebaggy way of getting new customers, and I would never get service with a company that does this. I shouldn’t pick you based upon comparing yourselves to another company; you should be able to stand on your own two feet. Also, just a side note, but Rackspace was outage free for over 6 years prior to the recent outages.
Do you guys at onlinetech really think that anybody is impressed with this kind of low class marketing? Good luck with that.
Here’s a good indication of your real importance: Gartner doesn’t even recognize that onlinetech exists in the hosting market:
http://mediapro...2/article2.html
http://mediapro...1/article1.html
The difference in reliability between RackSpace and another decent provider (with a much less expensive monthly cost) is not as much as one perceives.
This further proves that 100% (or 99.999%) uptime is impossible, no matter what the provider says, advertises or promises you.
Yes, 100% uptime is nigh impossible, but we still say it as a show of trust that we are behind our customers 100% and will compensate them if we ever do let them down.
99.999% means about 5.26 minutes of downtime per year.
99.99% means about 52.6 minutes of downtime per year.
99.9% means 8.76 hours of downtime per year.
So with Rackspace’s outage, it looks more like 99.9% than 5-9’s to me.
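For anyone who wants to check the arithmetic behind those "nines," here is a minimal sketch. It assumes a 365-day year and treats all downtime as counting against the figure. The function names are made up for illustration, and the 60-minute outage at the end is a hypothetical input, not a reported duration.

```python
# Rough downtime arithmetic for availability percentages.
# Assumption: a 365-day year (no leap day), all downtime counted.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Minutes of downtime per year allowed at a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


def availability_from_downtime(downtime_minutes: float) -> float:
    """Availability percentage implied by total downtime minutes in a year."""
    return 100 * (1 - downtime_minutes / MINUTES_PER_YEAR)


if __name__ == "__main__":
    for nines in (99.999, 99.99, 99.9):
        minutes = downtime_minutes_per_year(nines)
        print(f"{nines}% -> {minutes:.2f} minutes ({minutes / 60:.2f} hours) per year")

    # Hypothetical example: a single 60-minute outage in a year.
    print(f"60 minutes down -> {availability_from_downtime(60):.3f}% availability")
```

Going the other direction, from total minutes down to a percentage, is usually the more useful check when comparing an actual year against an SLA.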
affected
Is Serverbeach/Peer1 co-located with Rackspace? They had downtime at the same time, and also said it was the power supply.
While MG Siegler’s headline for this post is humorous, it unfairly conjures up images of the guy unplugging the power cord for the runway lights in the movie Airplane. Last week a Rackspace competitor (also in Dallas) was down for 8 hours and made little effort to keep customers informed. Kudos to Rackspace (& Justin) for good communication, transparency and taking their lumps like a man.
Online Tech – you have mastered the art of bottom feeding. Good luck with that.
Thanks for the compliment. I’m not a PR person, just work for Rackspace. It’ll be 3 years in Aug and I’ve really grown to love the company and truly believe in our vision. I just felt that some of these comments deserved a response. We’ll keep working to earn our customers’ trust back and learn from our failings.
Though we are not directly affected, we can still feel the pain of other webmasters (hosting at Rackspace) from downtime like that. We have been considering Rackspace to host our websites, but I think we still need to wait before making a move. Thank you, Twitter, for keeping us posted.