Observability-first Approach
Tracing is an immensely productive tool. Its value is hard to overstate, especially in production, but also in development when we debug our own mistakes. In the previous article I covered why we should use it and why it was invented in the first place. After the first look and a couple of rounds, you fall in love with the tool. Once you gain some experience, though, more questions start to arise.
Debugging Concurrency
Let me tell you a story. We had deployed a new version of our software across the whole fleet: hundreds of machines in 4 different regions around the world. We watched our metrics and inspected our logs, and found nothing. Success, another deployment without any issues. Fast forward two days, and we received a customer support ticket to investigate. One of our integrations complained that they were observing a significant number of errors when calling our endpoints in two distinct regions.