It was clear that his site reliability engineering skills helped, and that a test automation suite added value by providing feedback about the production environment, beyond just checking things pre-prod. The discussion reiterated the importance of experimenting with (monitoring) Service Level Objectives (SLOs) and other important metrics.
What have you learnt from working with Site Reliability Engineers (SREs)? How has that influenced your approach to continuous quality? How do you work with SREs in a quality engineering environment? How do you help SREs consider risk and testing approaches?
Working with, and later leading, a platform engineering team (which might have been called SRE in another org), I found a surprising amount of overlap between our goals and the work required to achieve them, especially around pipelines/workflows, observability and monitoring, performance and scalability, and disaster recovery.
I'd always encourage anyone working in a cross-functional team to spend time pairing with people from the other functions to better understand their work and learn more about it. Often you can learn things that will be relevant to you and help you improve in your area of specialism. Certainly this was the case for me. Our platform team had a broad remit, and included Backend Software Engineers, Platform / Site Reliability Engineers, and Database Administrators.
Through working with them all I became a better Quality Engineer: I improved my software development skills, learnt how to write and manage Terraform for infrastructure as code, improved my SQL skills, and developed a much better understanding of how our myriad services worked together and impacted each other. More than that though, I developed my leadership skills. I gained a better understanding of how this group of very senior, diverse, and highly experienced subject matter experts could go from being just a group of specialists to becoming a team that delivered value together.