Google’s Apps SLA Allows It To Minimize Downtime Of Gmail, Calendar And More
by Robin Wauters on December 4, 2008

Any Google App could be unavailable for more than 21 hours on a given day, and the company could still claim they had 100% uptime. That’s the gist of an analysis penned by Pingdom, who took a closer look at the Service Level Agreement for Google Apps.

The most interesting tidbit in the SLA, which applies to Gmail, Google Docs, Google Calendar and more (emphasis ours):

“Downtime Period” means, for a domain, a period of ten consecutive minutes of Downtime. Intermittent Downtime for a period of less than ten minutes will not be counted towards any Downtime Periods.

“Monthly Uptime Percentage” means total number of minutes in a calendar month minus the number of minutes of Downtime suffered from all Downtime Periods in a calendar month, divided by the total number of minutes in a calendar month.
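As a rough sketch of how these two definitions interact, the snippet below compares the uptime the SLA would report with the uptime a user actually experiences. The function names, the 30-day month, and the reading that an outage of ten or more consecutive minutes counts in full are assumptions made for illustration; the SLA text quoted above does not spell out that last detail.

```python
# Sketch only: names and the 30-day month are illustrative assumptions.
MINUTES_PER_MONTH = 30 * 24 * 60
THRESHOLD_MINUTES = 10  # from the quoted "Downtime Period" definition


def sla_uptime_percentage(outages, total=MINUTES_PER_MONTH):
    """Uptime as the SLA would count it.

    `outages` is a list of outage lengths in minutes. Outages shorter
    than the threshold are ignored; longer ones are assumed to count
    in full.
    """
    counted = sum(m for m in outages if m >= THRESHOLD_MINUTES)
    return 100.0 * (total - counted) / total


def actual_uptime_percentage(outages, total=MINUTES_PER_MONTH):
    """Uptime as a user would experience it: every outage minute counts."""
    return 100.0 * (total - sum(outages)) / total


# Pingdom's worst case: 9 minutes down, 1 minute up, for a whole month.
worst_case = [9] * (MINUTES_PER_MONTH // 10)
print(sla_uptime_percentage(worst_case))     # 100.0
print(actual_uptime_percentage(worst_case))  # 10.0
```

A single 30-minute outage, by contrast, would count under both measures, so the gap only opens up when outages are short and frequent.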

True enough, this exempts Google from admitting it had up to 21 hours of downtime in one day (a worst case, of course; see Pingdom’s calculation for more on that, as well as a more likely scenario), because it ignores all unavailability under 10 minutes. By today’s standards that is a very long period, even for free services (the SLA applies only to paying customers, but still).
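The 21-hour figure follows from a repeating cycle of 9 minutes down and 1 minute up. Here is a minimal sketch of that arithmetic; the function name and the choice of a strictly repeating cycle are assumptions for illustration, not something taken from Pingdom's post.

```python
def hidden_downtime_hours_per_day(down_minutes, up_minutes):
    """Hours of downtime per day that never reach the 10-minute cut-off.

    Assumes the service alternates forever between `down_minutes` of
    outage and `up_minutes` of availability.
    """
    if down_minutes >= 10:
        raise ValueError("an outage of 10+ minutes counts as a Downtime Period")
    if up_minutes <= 0:
        raise ValueError("outages must be separated by some uptime")
    cycle = down_minutes + up_minutes
    return 24 * down_minutes / cycle


print(hidden_downtime_hours_per_day(9, 1))  # 21.6
```

Shrinking the uptime gap pushes the result toward 24 hours, so 21.6 hours is a consequence of the one-minute gap chosen for the example rather than a hard ceiling.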

Google’s Apps SLA may guarantee 99.9% uptime, but this little loophole makes it darn easy for the company to honor that.


Comments

  • What kind of scenarios would they have repeated periods of < 10 minute downtime? I’m talking about more than 3-4 periods of that.

    Even though this is possible, I think it’s safe to say that it’s not probable, they would have to be intentionally trying to bother people.

    • It’s a theoretical worst-case scenario. From the blog post I referred to:

      “Here is the problem: What if Google Apps was down for 9 minutes, up for 1 minute, down 9 minutes, etc. That would mean 54 minutes of downtime each hour.”

      • You could do the same with 9.999 minutes and 0.001 seconds of uptime between them for 99.99% downtime. 21 hours isn’t a theoretical maximum in this doomsday scenario, total lack of access is.

        I doubt Google has any malicious designs in this license wording. It’s likely there to make it clear-cut whether they have downtime or not… if a website drops out for 5 seconds and comes back, does that really count as downtime? Probably no one cares. Google is just defining a cut-off point at 10 minutes before they would call it downtime.

  • FYI – the SLA is only for paying customers

  • 1. Is it usual to be able to, I don’t know, kick a server and make it boot up for one minute before crashing again? If not, this is a completely made-up problem. Although I’m not a hardware expert, I think that in most cases if a computer is down for more than a few minutes, then it stays down. There’s no way for Google to make it sputter into life to let it manipulate uptime statistics.

    2. Even if we accept that this could happen, then why limit the headline to “They could be down for 21 hours, all they have to do is be up for one minute every ten”? Why should Google make the time between each downtime one minute? Why not one second? “Google could be down for 23 hours and 58 minutes and still claim 100% uptime!” Or epsilon? “Google could be down all day!”

    The point at the end of Pingdom’s post – that outages under ten minutes can be very common – is valid, but the “worst case scenario” analysis is just silly, and TechCrunch’s spin that Google could use this clause to ignore much longer periods of downtime is even sillier. They’re not technically capable of it, and even if they were, there’d be no point, because the argument is so facile that absolutely no one would say “oh OK, I guess you didn’t break the SLA, that’s alright then”.

    • I worked for a company during the original dot com boom, and we had a web cluster of 24 servers, and each was rebooted every seven minutes. That was the official work-around given by our application server provider, at like $1M per processor. Ah, heady times indeed!

      So, yes, this is a real scenario.

  • I see your points, Sam B, but I think this is an interesting and important post. TOS agreements can be crafty little foxes… just read Bank of America’s TOS (I know this is completely different, but the craftiness is the same, though with BOA it’s not downtime, it’s taking poor peoples’ money)

  • I had blogged about this some while ago. Even with 99.9% availability your email can be down for more than 8 hours a year. If it is 8 hours at a stretch, it can seriously affect your organization. It is not the average downtime that gets you but the spikes. For more,
    http://www.manu...good-enough/61/

  • We use Google Apps Premier in my company and it was down for me (just me not the whole company) for 12 hours on Tuesday. I still have not received an incident report and find this unacceptable as e-mail is a critical part of my business. The outage has really made me question the SAAS model. My only recourse is to leave Google and frankly they won’t care about losing a customer of my size.

  • Good find Robin.

    The bigger point here is not the worst case scenario, but the weak promise of their SLA–this loophole severely weakens the value of their 99.9% uptime guarantee.

    Imagine running your business on Google apps and the service blacks out 7 minutes every day during peak hours. This would equate to a 99.5% uptime. Is this what you were expecting when you paid $$ for an SLA guarantee?

    Would you be happy if your phone stopped working for 7 minutes a day at peak hours?

    I’m not sure how this passes their “do no evil” mantra.

  • While 9 minutes down and 1 up every 10 minutes is a very unlikely scenario, 9 minutes downtime three or four times in a day is VERY possible. That’s 27 minutes of downtime per day – which works out to 98% uptime. But even with that 2% downtime, google can still CLAIM 100%

  • A theoretical loophole, but one that Google would never exploit… not to this extent anyway.

    I’m sure they’re pleased they can hide 99% of their downtime though, as it’s probably for less than 10 minutes, so they never have to declare it and can keep the 100% uptime stat… sneaky.

  • Tony (upthread) nailed it. Google can offer a poor service and claim 100% uptime. This is very disingenuous – some might say, evil :)

    Claiming 100% uptime when in actuality you are achieving 96-98% uptime is an out-and-out lie.

  • I’m actually amazed people believe that anything can have an uptime of 100%. It’ll be awhile before that’s the case as we see things like local network issues that prevent customers from accessing our services, provider issues at the colo affecting service, etc. In the colo scenario, we’ve actually seen some of the patterns outlined in the article — up for a few minutes, down for a few minutes, up again, etc. But, of course, you’d write your SLA to only take into account issues within your control, therefore a power outage knocks your whole colo out and technically you never violated your SLA as the issue didn’t reside with you, but with your colo. The other crafty thing companies do with SLAs is build a metric of how much, if any, they will credit. Most customers believe they’ll get a full month of credit, but in reality an outage of a few hours might only result in a few dollars worth of credit, not a full credit. Only rarely do companies credit the full bill and that’s when there’s a major issue and they have dozens of customers threatening to leave them.

    -matt

  • While I partly agree with most of the comments, this post reminds us that you can’t claim 100% uptime when it’s not. Lying is an evil act indeed.

  • If any of you people can create a more robust + scalable web system than Google can, you have the right to talk shit about them.

    Otherwise, you guys are just bunch of cynical turds.

    • I agree with Tech Ninja, I think it’s ridiculous that people expect so much out of a _free_ service. You get what you pay for.

      • I don’t think Google Apps is a free service — people (companies) contract Google to create separate (local or remote) instances of Gmail / Docs for them to use for their business/organization.

  • Who in their right mind would give Google their code, above all the other tons of data we let them have about us.

  • Almost all SLAs seem to define uptime as net of “scheduled downtime” and also downtime of < x minutes.

    There is validity to this approach — but also a clear fiction. We’ve taken the approach and just used Pingdom data at EchoSign electronic signature and show it going back 18 months at http://trust.echosign.com, warts and all.

    That way, there is no debate.

  • Whatever happened to the slogan “Do No Evil”? I guess Google now has attorneys involved.

    BTW, they also had thousands of “temp workers” they can lay off without having to claim they ever had layoffs, or report this extra workforce (1/3 of the company staff) as employees to their stockholders. Glad I dumped their stock months ago; I can’t agree with their tactics.

  • That’s something in every SLA, but even so, I think a 10-minute duration is rather long. I don’t know how Google defines “downtime” itself, besides the “Downtime Period”. In some/many companies, it is defined as “no connection at all”, meaning all servers are down, or all network connections are down. If someone in Asia can use the service when an American can’t, it won’t be counted as “downtime”.

  • Galvanick Lucipher - December 5th, 2008 at 7:50 am PST

    Oh look, more whining from the entitlement generation. “I WANT FIVE NINES AND I WANT IT CHEAP!” Google’s pay-for hosting service is pretty economical. If you want five nines of uptime with a real SLA behind it, you can get it from other providers, but they, um, charge more for better service. Whodda thought?

    • It doesn’t mean that their SLA needs to be predatory. Sure, buyer beware, you should read the SLA to know what you’re getting into, but the language is not really all that clear, is it?

  • Google’s pay-for hosting service is pretty economical

  • Some folks like a conspiracy theory, but it’s a lot simpler than that.

    This was really driven by engineers who do indeed believe in “don’t be evil”, who looked at some other companies’ incredible SLA promises and would have none of it, who want to spell out explicitly for customers the worst they’re signing up to, and who want to under-promise and over-deliver on expectations.

    SLA terms are a surprisingly complicated business. Google’s gets to the meat of it within a few paragraphs. And when was the last time anyone remembers a “scheduled downtime” for GMail, also stated in the SLA?

    The important thing is to look at what the company does to keep its customers happy, uptime, ROI, and all. Google has had a pretty good track record so far, if imperfect.

    Notwithstanding some availability monitoring service providers that have a history of issuing news releases with the word “Google” in it. If it works for their business, more power to them :-)
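Several of the comments above quote uptime figures: 7 minutes a day as roughly 99.5%, 27 minutes a day as roughly 98%, and 99.9% availability as more than 8 hours of downtime. The sketch below is one way to check them; the function names are made up for illustration, and the last figure assumes the 99.9% is measured over a 365-day year, which the commenter does not state.

```python
MINUTES_PER_DAY = 24 * 60
HOURS_PER_YEAR = 365 * 24  # assumption: availability measured over a year


def uptime_percent(down_minutes_per_day):
    """Real uptime percentage for a given amount of downtime every day."""
    return 100 * (1 - down_minutes_per_day / MINUTES_PER_DAY)


def allowed_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime a given availability figure still permits."""
    return period_hours * (100 - availability_percent) / 100


print(round(uptime_percent(7), 1))             # about 99.5
print(round(uptime_percent(27), 1))            # about 98.1
print(round(allowed_downtime_hours(99.9), 2))  # about 8.76 hours per year
```

In both daily scenarios every outage is shorter than ten minutes, so none of that downtime would count toward the SLA's Monthly Uptime Percentage.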
