Google Blames Outage on Software Bug

By Jennifer LeClaire
January 27, 2014 1:30PM

Isolated events like this are not a problem and users will forgive the Google outage. It becomes a problem when a pattern develops. If it were to happen multiple times it could become a problem for Google. Gmail has become a very strategic product and it's unlikely that Google will experience many more of these outages, said analyst Greg Sterling.
 


If you are a hardcore Google user, you may have been tempted to pull out a few hairs last Friday as several of the company’s key services experienced a painful hiccup. Now, Google is shedding some light on the incident.

Specifically, users of logged-in Google services such as Gmail, Google+, Calendar and Documents were unable to access them for about 25 minutes, according to Ben Treynor, Google’s vice president of engineering.

“For about 10 percent of users, the problem persisted for as much as 30 minutes longer,” he said on Friday. “Whether the effect was brief or lasted the better part of an hour, please accept our apologies -- we strive to make all of Google’s services available and fast for you, all the time, and we missed the mark today.”

What Really Happened?

Treynor reports that the issue has been resolved, and the company is now focused on correcting the bug that caused the outage, as well as putting more checks and monitors in place to ensure that this kind of problem doesn’t happen again. He then offered a technical explanation for what occurred and how it was fixed.

At 10:55 a.m. PST on Friday, Treynor explained, an internal system that generates configurations -- essentially, information that tells other systems how to behave -- encountered a software bug and generated an incorrect configuration. Over the next 15 minutes, that configuration was sent to live services, where it caused users’ requests for their data to be ignored; those services, in turn, generated errors.

“Users began seeing these errors on affected services at 11:02 a.m., and at that time our internal monitoring alerted Google’s Site Reliability Team. Engineers were still debugging 12 minutes later when the same system, having automatically cleared the original error, generated a new correct configuration at 11:14 a.m. and began sending it; errors subsided rapidly starting at this time,” Treynor said. “By 11:30 a.m. the correct configuration was live everywhere and almost all users’ service was restored.”
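The intervals in Treynor's account can be checked with simple arithmetic. The short Python sketch below uses only the four clock times quoted above; the dictionary keys and function names are illustrative labels, not Google terminology.

```python
from datetime import datetime

# Clock times (PST) as quoted in Treynor's account of the outage.
TIMELINE = {
    "bad_config_generated": "10:55",
    "errors_visible": "11:02",
    "good_config_generated": "11:14",
    "good_config_everywhere": "11:30",
}


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes from start to end, both given as HH:MM on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def summarize(timeline: dict) -> dict:
    """Derive the intervals between the reported events, in minutes."""
    return {
        "bug_to_first_errors": minutes_between(
            timeline["bad_config_generated"], timeline["errors_visible"]),
        "alert_to_new_config": minutes_between(
            timeline["errors_visible"], timeline["good_config_generated"]),
        "errors_to_full_recovery": minutes_between(
            timeline["errors_visible"], timeline["good_config_everywhere"]),
    }


if __name__ == "__main__":
    for label, minutes in summarize(TIMELINE).items():
        print(f"{label}: {minutes} min")
```

By this count, users saw errors for just under half an hour, which is roughly in line with the "about 25 minutes" figure Treynor gave for most users.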

Will Google See User Backlash?

With services once again working normally, Treynor said work is now focused on removing the source of failure that caused Friday’s outage, and speeding up recovery when a problem does occur. Google then took three more steps:

(1) Correcting the bug in the configuration generator to prevent recurrence, and auditing all other critical configuration generation systems to ensure they do not contain a similar bug; (2) adding input validation checks for configurations, so that a bad configuration generated in the future will not result in service disruption; and (3) adding targeted monitoring to more quickly detect and diagnose the cause of service failure.
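Google did not publish details of its validation checks, but the general idea behind step (2) can be illustrated with a minimal Python sketch. Everything here is hypothetical: the `ServiceConfig` fields, the specific rules in `validate`, and the fallback to a last known good configuration are assumptions chosen for illustration, not a description of Google's systems.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ServiceConfig:
    """Hypothetical generated configuration for a live service."""
    version: int
    backends: Tuple[str, ...]   # hosts that should answer user requests
    request_timeout_ms: int


def validate(config: ServiceConfig) -> List[str]:
    """Return a list of problems; an empty list means the config looks sane."""
    problems = []
    if not config.backends:
        problems.append("no backends listed, so every request would be ignored")
    if any(not host.strip() for host in config.backends):
        problems.append("blank backend hostname")
    if not 1 <= config.request_timeout_ms <= 60_000:
        problems.append("request timeout outside the 1-60000 ms range")
    return problems


def choose_config_to_push(
    candidate: ServiceConfig, last_known_good: ServiceConfig
) -> Tuple[ServiceConfig, List[str]]:
    """Push the candidate only if it validates; otherwise keep the old config."""
    problems = validate(candidate)
    if problems:
        return last_known_good, problems
    return candidate, []


if __name__ == "__main__":
    good = ServiceConfig(version=1, backends=("mail-1.example",), request_timeout_ms=500)
    bad = ServiceConfig(version=2, backends=(), request_timeout_ms=500)
    chosen, problems = choose_config_to_push(bad, good)
    print(f"pushing version {chosen.version}; rejected because: {problems}")
```

The point of a gate like this is that a faulty generator can still produce a bad configuration, but the bad output is rejected before it reaches live services, and the rejection itself gives monitoring something specific to alert on.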

We caught up with Greg Sterling, principal analyst at Sterling Market Intelligence, to get his take on the outage -- and its resolution. He told us that because Google has such a strong reputation as an engineering-driven company, an incident like this comes as a surprise to many people.

“Again, isolated events like this are not a problem and users will forgive the outage,” Sterling said. “It becomes a problem when a pattern develops. If this were to happen multiple times it would start to become a problem for Google. Gmail has become a very strategic product for the company and it's unlikely that Google will experience many more of these outages.”
 

© Copyright 2000-2014 NewsFactor Network. All rights reserved.