Oops! LinkedIn Goes Down, Can't Get Up

LinkedIn may be profitable and growing fast, but something seems to be going very wrong with the business social networking service today. The service has been going up and down all day over here in Europe, and has now been displaying an ‘Oops’ page for the past 55 minutes (4:30 PM CET).

Update: it’s back up; downtime lasted exactly one hour.

Update 2: and it’s back down again (4:45 PM CET). They’re having serious technical issues over there.

Update 3: it’s bouncing up and down, but seems to be stable now (4:52 PM CET).

As we’ve learned from people e-mailing us about it (and from comments on this post and on Twitter), LinkedIn seems to have been dealing with various technical problems for weeks now. We’re getting LinkedIn’s side of the story and will update this post ASAP.

It’s a very unusual thing to happen with LinkedIn, which has always had quite a reliable web service, and there’s no official update to be found on the LinkedIn Blog (which is still up). Meanwhile, Twitter users are spreading the word about its current downtime fast.

We got in touch with Pingdom, which states:

LinkedIn started having intermittent problems around 5:30 a.m. US EST. The errors we are getting from them are HTTP 500 errors (internal server error). At 9:36 a.m. US EST the site went down and has kept returning errors since (HTTP 500 errors). As of this writing it’s been that way for 55 minutes.

Update 4: The problem was a backlog of messages in LinkedIn’s Message Queuing services. An explanation is up on the LinkedIn Blog:

Many of you trying to use LinkedIn between 2:18am and 4:08am US Pacific time this morning, and all of you trying to use LinkedIn between 6:10am and 7:43am, were unable to get in. This is not what we want for our users, and we are very sorry for the inconvenience caused. Rest assured, all of your postings and messages were sent out.

What caused the outage? LinkedIn uses a technology called “Message Queuing” within our site to allow our various services (for example, Network Update Service, inMails) to communicate with each other asynchronously, so that a sudden surge of usage on one part of the site will not affect performance on another. Starting early this morning, we ran into some issues with our Message Queuing services, which caused the message queues to back up.
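For readers wondering how a queue can "back up", here is a minimal sketch of the general idea. It is not LinkedIn's code: the function name `simulate_backlog` and the rates used are made up for illustration. A producer service drops messages on a queue and moves on, while a consumer service drains them at its own pace. As long as the consumer keeps up, the queue stays empty; once it slows down, the backlog grows on every tick.

```python
from collections import deque


def simulate_backlog(produce_per_tick, consume_per_tick, ticks):
    """Return the queue depth after each tick (hypothetical illustration)."""
    queue = deque()
    depths = []
    next_id = 0
    for _ in range(ticks):
        # Producer side: enqueue messages without waiting for the consumer.
        for _ in range(produce_per_tick):
            queue.append(next_id)
            next_id += 1
        # Consumer side: process at most consume_per_tick messages.
        for _ in range(min(consume_per_tick, len(queue))):
            queue.popleft()
        depths.append(len(queue))
    return depths


# Consumer keeps up: the queue never grows.
print(simulate_backlog(5, 5, 4))  # [0, 0, 0, 0]

# Consumer falls behind: the backlog grows by 3 messages every tick.
print(simulate_backlog(5, 2, 4))  # [3, 6, 9, 12]
```

The second run shows the failure mode in miniature: nothing is lost, but messages pile up faster than they are handled.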