It was 8:42 PM at the Amazon office when Priya, a senior engineer, noticed a Slack message from Raj, the newest backend SDE on her team.

“Hi Priya, the orders API endpoint you own is suddenly returning 503s after my changes. Could you help me debug? Sorry for the late message.”

Priya hesitated for a second. She’d spent months refining that endpoint, handling edge cases, scaling for Prime Day traffic, even documenting the entire flow for new team members. She remembered her own early days, when breaking something felt terrifying, and the seniors weren’t always patient.

For a split second, Priya wanted to reply, “It was working fine earlier. Did you check the wiki?”

But she didn’t. She joined the call.

Raj was clearly stressed, walking through his recent commits. First month at Amazon, production issue, and it was late. Priya listened, asked focused questions, and together they tracked down a missing API key in the environment config.

Raj sighed, “Thank you, Priya. Didn’t mean to break anything, just trying to add new promo logic.”

Priya replied, “No worries. Bugs happen to all of us. What matters is catching them fast. I’d always rather have a teammate who asks for help than one who hides a problem.”

The next morning, Priya shared a quick note in the team channel:

> “Thanks to Raj for surfacing the issue quickly and collaborating on the fix. These things happen, what counts is how we respond.”

And as she updated her notes, Priya reflected:

→ Being smart is good but being reliable, calm, and collaborative under pressure is what builds trust.
→ The most critical projects rarely go to the “lone genius.” They go to the person the team can depend on, day after day.

Over time, more teammates started coming to Priya for tricky launches, because she was steady, approachable, and always showed up when it mattered.

Because in the long run, that’s what makes you the engineer everyone trusts to deliver. And that’s the reputation that really opens doors.
I get the message behind the story, but if it's out of hours and your change just broke something, then the first choice should always be to roll back to the known good version and fix it in the morning (including fixing why your tests didn't spot the issue before deployment). If your process doesn't support that, then the focus should be on fixing the process.
Reliable teammates for the win!
Let’s not confuse emotional support with operational excellence. Also, please don't celebrate such rescues as a triumph of team culture. It’s not. It’s a symptom. Let’s be real. Who reviewed Raj’s changes? Were they peer-reviewed or peer-overlooked? Did any kind of tests or config checks exist for these changes? What about guardrails, automated safeguards, and a backout plan? Did any of them exist, or was it just vibes and hope? Priya was kind. Raj was brave. But this isn’t a culture win; it’s a systems fail. Kindness is lovely. But in high-reliability teams it shouldn’t be the incident response plan, and definitely not a substitute for basic engineering hygiene.
Doesn’t this point to a lot of problems in the on-call/incident process and team culture in general?
- Why was Priya available on Slack at 8:42pm?
- If she was supposed to be a backup, why couldn’t Raj page her using a dedicated mechanism?
- Why does Raj think that saying “endpoint you own” is an OK thing to say?
- Why did Raj deploy so late?
- Why didn’t canary deployment catch this and roll back the change?
I get the point: Priya is nice and she jumped in to help, ownership, etc. Raj was in a panic, but ultimately this team needs to sit down together and get their basics right. 😅
Are we suggesting Priya overwork and fix others' bugs while Raj is getting stressed? And then give all the credit to Raj for surfacing the bug, when Priya took the time to help resolve the issue? Is this the culture we are trying to manage?
Do you honestly think that Amazon legal would appreciate these stories about them? You work at Google. Most engineers sign an NDA.
Good reviews turn juniors into strong engineers. Bad reviews just create fear.
Got the point: Priya is an amazing human being and a great team player. But I can see a lot of gaps in this whole process. Any organization that encourages such episodes and treats them as normal is not setting the right example, and neither are posts like this one!
Reliability > brilliance
This is why mentorship matters as much as architecture