Through a career spanning over twenty years at ISPs, eCommerce shops, and technology giants like Twitter and Stripe, I have learned that complex systems will inevitably fail. To repair and expand these systems, we need humans! The goal is to create adaptive capacity, to be resilient.

Professional Work

After joining Twitter in 2012 as one of the first Site Reliability Engineers (SREs), I leaned into observability. While I still consider this work important, I came to the conclusion that charts don’t solve problems; people do. This led me to invest more in learning about resilience engineering and adaptive capacity.

In 2019 I joined SignalFx, later acquired by Splunk, and researched resilience engineering and its precursors to inform the design of products, organizations, and tooling.



I speak regularly at conferences across the country, promoting resilient, thoughtful, and empathetic operations.

Having spent most of my career on call, I believe that organizations can greatly improve the happiness and effectiveness of employees and customers by investing in resilience!