
Word is that the last time Google had a failure involving a cyclical dependency they had to rip open a safe. It contained the backup password to the system that stored the safe combination.


The safe in question contained a smartcard required to boot an HSM. The safe combination was stored in a secret manager that depended on that HSM.

The engineer attempted to restart the service, but did not know that a restart required a hardware security module (HSM) smart card. These smart cards were stored in multiple safes in different Google offices across the globe, but not in New York City, where the on-call engineer was located. When the service failed to restart, the engineer contacted a colleague in Australia to retrieve a smart card. To their great dismay, the engineer in Australia could not open the safe because the combination was stored in the now-offline password manager.

Source: Chapter 1 of "Building Secure and Reliable Systems" (https://sre.google/static/pdf/building_secure_and_reliable_s... size warning: 9 MB)
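
For what it's worth, a loop like this is easy to catch mechanically once the dependencies are written down. Here is a minimal sketch in Python, assuming a hand-written dependency map. The node names and the `find_cycle` helper are made up for illustration and are not taken from Google's tooling or the book; the map only mirrors the chain described above (secret manager needs the HSM, the HSM needs the smart card, the card is in the safe, the safe needs its combination, and the combination lives in the secret manager).

```python
from typing import Dict, List, Optional

# Hypothetical dependency map: each key needs every item in its list
# to be available before it can be used or started.
DEPENDS_ON: Dict[str, List[str]] = {
    "secret_manager": ["hsm"],
    "hsm": ["smartcard"],
    "smartcard": ["safe"],
    "safe": ["safe_combination"],
    "safe_combination": ["secret_manager"],
}


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle (first node repeated at the end), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so we have closed a loop.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for start in graph:
        if color.get(start, WHITE) == WHITE:
            found = visit(start)
            if found:
                return found
    return None


cycle = find_cycle(DEPENDS_ON)
if cycle:
    print(" -> ".join(cycle))
```

The hard part in practice is presumably not the graph search but getting physical things like safes and smart cards into the dependency map in the first place.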


Lovely.

Safes typically have the instructions for changing the combination glued to the inside of the door, ending with something like "Store the combination securely. Not inside the safe!"

But as they say: make something foolproof and nature will create a better fool.


I'm sure this sort of thing won't be a problem for a company whose founding ethos is 'move fast and break things.' O:-)



