The Obama campaign's technologists were tense and tired. It was game day and everything was going wrong.
Josh Thayer, the lead engineer of Narwhal, had just been informed that they'd lost another one of the services powering their software. That was bad: Narwhal was the code name for the data platform that underpinned the campaign and let it track voters and volunteers. If it broke, so would everything else.
They were talking with people at Amazon Web Services, but all they knew was that they had packet loss. Earlier that day, they lost their databases, their East Coast servers, and their memcache clusters. Thayer was ready to kill Nick Hatch, a DevOps engineer who was the official bearer of bad news. Another of their vendors, StallionDB, was fixing databases, but needed to rebuild the replicas. It was going to take time, Hatch said. They didn't have time.
They'd been working 14-hour days, six or seven days a week, trying to reelect the president, and now everything had been broken at just the wrong time. It was like someone had written a Murphy's Law algorithm and deployed it at scale.
And that was the point. “Game day” was October 21. The election was still 17 days away, and this was a live-action role-playing (LARPing!) exercise that the campaign's chief technology officer, Harper Reed, was inflicting on his team. “We worked through every possible disaster situation,” Reed said. “We did three actual all-day sessions of destroying everything we had built.”