Good Retry, Bad Retry: An Incident Story

medium.com

I've never run a system at this scale, but like the lead character in the story, I always figured exponential backoff would be enough. Turns out there's more to it.