Thursday, March 2, 2017

How a single typo brought the web to its knees

This week, an Amazon Web Services (AWS) failure caused a massive outage across the internet. Now we know why: a typo. The company released a detailed report today explaining what happened. An employee entered what they thought was a routine command to remove a few servers from an S3 subsystem. By mistake, they entered one of the command's inputs incorrectly and removed far more servers than intended. Those servers supported two other S3 subsystems, which together manage storage and metadata for the entire region. Down went the dominoes. AWS assures everyone that it’s prepared for the occasional failure. Fixing the employee’s error should have been as simple as…

This story continues at The Next Web
