One Mo' Gin

Observability

Throughout my career as a software engineer, manager, and executive, observability has been a constant theme.

Observability is more than monitoring and charts. It’s a collection of techniques and tools that deepen our understanding of complex systems, not just when things go wrong, but all the time. The goal is to give operators genuine insight into how a system behaves: what changed, what’s slow, what’s correlated. Good observability improves the ergonomics of that understanding, making systems less surprising to the people who run them.

Open Source Work

Professional Work

After joining Twitter in 2012 I quickly found my calling in the Observability team. My Observability at Twitter post was the first mention of “observability” in this context. (The team existed before me; I was just the one to share it outside of Twitter!)

Upon joining Stripe in 2015 I created and led an observability team and worked to change Stripe’s culture such that observing our systems was a core concern. I led the creation of an entirely new observability stack with minimal interruption, managed and changed vendors a few times, and contributed to large improvements in reliability and confidence at Stripe through both observability tooling and incident process.

In 2019 I joined SignalFx as a Technical Director, functioning as a Field CTO. My role was a mix of advocacy, customer engagement, and product improvement. Late in 2019 SignalFx was acquired by Splunk.

After Splunk I spent time at Jeli, working on incident analysis and learning. From there I founded Oilcan, where I spent several years building ergonomic on-call tooling aimed at making the lives of on-call engineers less miserable.

I’m now at Airbnb, working on infrastructure engineering. My focus includes large-scale reliability initiatives like SLOs — building the systems and culture that let engineers understand and trust what they’ve built.

I’m often asked by investors to share my thoughts on new or existing monitoring products, and I enjoy discussing these tools with others, both to learn and to offer my perspective. I’ve also served on customer advisory boards, representing my engineering teammates and learning about challenges from vendors.

Writing

Speaking

I speak regularly at conferences across the country, promoting observability and thoughtful, empathetic operations.

Monitorama 2016:

Monitorama PDX 2016 - Cory Watson - Creating A Culture of Observability at Stripe from Monitorama on Vimeo.

Here are the slides if you prefer to flip through them rather than listen to me talk.

There are also versions of this talk from:

AWS Loft 2019:

I gave a talk at the New York AWS Loft office called “Demystifying Observability” for startups. It’s a combination of beginner info and practical advice for how observability can help you even when you’re just getting started.

Monitorama PDX 2019:

I had the pleasure of giving a 5-minute “vendor talk” at Monitorama PDX 2019. These talks are sometimes product pitches, but more often they are just a chance to speak about something important or interesting for the attendees and maybe mention your product. I decided to talk about how to think about observability tooling, inspired by John Allspaw’s “An Open Letter to Monitoring/Metrics/Alerting Companies”.

Monitorama Baltimore 2019:

I spoke about a Dashboard Renaissance, or techniques and processes for making dashboards a more helpful part of your observability and monitoring work. Slides are here.

KubeCon US 2019:

This talk was originally conceived as a set of lessons from my move from personally purchasing tools to a job at a vendor, where I work with dozens of customers making the same decisions. It covers 6 different ways to improve your stance on observability from a social perspective.

SREcon20 Americas:

Incidents are an amazing source of education, but we often fail to incorporate the findings into our observability tooling. This talk provides methods for doing just that, with a bit of help from my friends at Jeli.

Podcasts

Other