Resilience

Through a career spanning over twenty years at ISPs, eCommerce shops, and technology giants like Twitter and Stripe, I have learned that complex systems will inevitably fail. To repair and expand these systems, we need humans! The goal is to create adaptive capacity: to be resilient.

Professional Work

After joining Twitter in 2012 as one of the first Site Reliability Engineers (SREs), I leaned into observability. While I still consider this work important, I came to the conclusion that charts don’t solve problems; people do. This led me to invest more in learning about resilience engineering and adaptive capacity.

In 2019 I joined SignalFx, later acquired by Splunk, and researched resilience engineering and its precursors to inform the design of products, organizations, and tooling. In 2021 I joined Jeli, later acquired by PagerDuty, to work on incident analysis so that organizations can more easily learn from their work.

Most recently I co-founded Oilcan and created Hotpot.

Writing

A three-part series on automation.
Contributor to Jeli’s Incident Analysis 101 series in Putting It All Together.
I wrote How to turn an engineering incident into an opportunity for LeadDev.
Explained micro-learning opportunities for practical, daily improvement.

Speaking

RubyConf 2021

At RubyConf 2021 I spoke about finding inspiration for resilience in other industries and settings.

DevOps Days Denver 2024

In 2024 I spoke at DevOps Days Denver about why on-call needs a rethink.

SRECon Americas 2025

Early in 2025 we completed research at Hotpot showing that organizations are not set up to support on-call work, so I spoke about it at SRECon Americas.

Podcasts

In November of 2021 I spoke with Software Misadventures about failure and success in software engineering.

Publications

Technical reviewer for Increment Issue 16 (Reliability, February 2021) and Increment Issue 17 (Containers, May 2021).

Future

Having spent most of my career on call, I believe that organizations can greatly improve the happiness and effectiveness of employees and customers by investing in resilience!