One Mo' Gin

Resilience

Through a career spanning more than twenty years at ISPs, eCommerce shops, and technology giants like Twitter, Stripe, and Airbnb, I have learned that complex systems will inevitably fail. To repair and expand these systems, we need humans! The goal is to create adaptive capacity: to be resilient.

Professional Work

After years focused on observability, I came to the conclusion that charts don’t solve problems; people do. This led me to invest more in resilience engineering and adaptive capacity.

In 2019 I joined SignalFx, later acquired by Splunk, where I researched resilience engineering and its precursors to inform the design of products, organizations, and tooling. In 2021 I joined Jeli, later acquired by PagerDuty, to work on incident analysis so that organizations can more easily learn from their work.

I spent a few fulfilling years building my own startup around ergonomic on-call tooling before joining Airbnb, where I’ve continued that thread working on large-scale reliability initiatives — most notably SLOs — helping engineers understand and trust the systems they build.

Writing

Speaking

RubyConf 2021

At RubyConf 2021 I spoke about finding inspiration for resilience in other industries and settings.

DevOps Days Denver 2024

In 2024 I spoke at DevOps Days Denver about why on-call needs a rethink.

SRECon Americas 2025

Early in 2025 we completed research at Hotpot showing that organizations are not set up to support on-call work, so I spoke about it at SRECon Americas.

Podcasts

Publications

Future

Having spent most of my career on call, I believe that organizations can greatly improve the happiness and effectiveness of employees and customers by investing in resilience!