
Automate Data Incidents with New Monte Carlo Integration

by Nick Freund - February 9, 2024

In product development, the best things are often born of happenstance. Completely new ways of thinking about what a product does, and where it might go, can be catalyzed by a single conversation. And such was the birth of our Monte Carlo integration, and our new module automating data incident workflows — both of which are available for you to enhance your data knowledge practice today. 

Earlier this year, we spoke with our longtime customer Dr. Squatch about a very specific feature we intended to develop: data status pages. Their immediate, knee-jerk response was to ask us for help.

“We use Monte Carlo as our data observability solution. But our ability to quickly let stakeholders know which assets are impacted is a mess once Monte Carlo surfaces an issue. Can you help us?”
- Danielle Mendheim, Director of Data and Technology at Dr. Squatch

What has emerged from that conversation is not only a new and exciting integration but also an entirely new vector of data knowledge: trust. We believe trust is the most important pillar of data knowledge, and we’re thrilled to bring it to innovative data teams like Dr. Squatch.

Why being on call sucks

We have spoken ad nauseam about why staffing data service desks blows, but do you know what also sucks? Being the unlucky data team member stuck in the “on call” rotation for data incidents.

“Bottom of the pyramid” work like triaging incidents and answering questions is rarely fun, but left unchecked, it can soon dominate a team’s time and capacity. To address this, most data teams of scale we speak with, whether a team of full-stack analytics engineers or a dedicated data engineering org, use the on-call model for data incidents. This gives everyone some respite from the low-value busy work that folks hate.

“Workstream's new integration with Monte Carlo is a game-changer for our team. It allows us to proactively manage incidents, and quickly communicate those issues to the business through alerting and status pages.”
- Danielle Mendheim, Director of Data and Technology at Dr. Squatch

On one hand, this busy work resembles software debugging: sifting through flagged incidents or failing tests on code written by someone else, where you might have relatively little context on why specific decisions were made. On the other hand, being on call for data incidents includes social nuance: triangulating what dashboards the failure might have broken, who in the business the incident might have impacted, and how you might go about communicating the incident without creating trust issues and backlash. 

And while your time on call might be thankfully short, communicating to the next person up what issues exist, what you have learned, and what follow-up remediation is needed is laborious and annoying.

Cue the trumpets, and the Workstream.io and Monte Carlo integration!

Automating your data incident workflow

Today, we are excited to announce a set of capabilities to automate your data incident workflow from end to end. Teams can now say goodbye to manually tracking down issues and logging tickets, and hello to automation. 

Now, Workstream.io is connected with the core tools that you use to observe data, such as Monte Carlo. Workstream.io will automatically alert you and create incidents in your workspace whenever anomalous activity is detected in that system, whether a unit test breaks or a custom monitor fails.
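To make the idea concrete, here is a minimal sketch of how alerts from an observability tool might be folded into incidents. It is illustrative only: the alert shape (`table`, `kind`, `detected_at`), the `Incident` class, and the `ingest_alert` helper are all hypothetical names for this example, not Monte Carlo's webhook schema or Workstream.io's API. The sketch also assumes one open incident per affected table, so several failing tests on the same table land in the same incident.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Incident:
    """One open incident per affected table (an assumption of this sketch)."""
    table: str
    opened_at: datetime
    status: str = "open"
    alerts: list = field(default_factory=list)


def ingest_alert(open_incidents: dict, alert: dict) -> Incident:
    """Attach an alert to the open incident for its table, creating one if needed.

    `alert` uses a hypothetical shape: {"table", "kind", "detected_at"}.
    """
    table = alert["table"]
    incident = open_incidents.get(table)
    if incident is None:
        # First alert for this table: open a new incident.
        incident = Incident(table=table, opened_at=alert["detected_at"])
        open_incidents[table] = incident
    incident.alerts.append(alert)
    return incident


# Made-up alerts: two on the same table, one on another.
open_incidents = {}
alerts = [
    {"table": "orders", "kind": "failed_test", "detected_at": datetime(2024, 2, 9, 8, 0)},
    {"table": "orders", "kind": "custom_monitor", "detected_at": datetime(2024, 2, 9, 8, 5)},
    {"table": "customers", "kind": "custom_monitor", "detected_at": datetime(2024, 2, 9, 9, 0)},
]
for alert in alerts:
    ingest_alert(open_incidents, alert)
```

In this made-up run, the two `orders` alerts collapse into a single incident, which is the kind of grouping that keeps an on-call engineer from chasing the same root cause twice.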

This integration gives teams a centralized view of their data incidents, including what each incident relates to both upstream and downstream, and streamlines collaboration among the on-call team. You can use Workstream to manage handoffs by assigning ownership of incidents, sharing critical context with your team, and automatically triggering the creation of tickets in your agile project management system.

And when your time on call ends, your teammates can easily see where things left off without extra effort from you. When writing up your retrospective, you can now directly reference and deep link to your data incidents in our Workstream docs functionality.

Containing the blast radius of an incident

With Workstream.io and Monte Carlo, you can now instantly know who and what is affected when you have data downtime. This includes what we would consider table-stakes features, such as seeing which dashboards and which data users might have been impacted. But the integration goes further: it will show you exactly when an end user might have viewed broken or stale data, and exactly what they were looking at or saying about the data at that moment in time. With our data incident workflows, you can instantly know how an incident has impacted your business, from end to end.
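As a rough illustration of what "blast radius" means here, the sketch below walks a lineage graph downstream from a broken table to find affected dashboards, then checks which dashboard views fell inside the incident window. Everything in it is an assumption made for the example: the `LINEAGE` and `VIEWS` data, the `dash:` naming prefix, and the `downstream_assets` and `blast_radius` helpers are hypothetical and do not describe how Workstream.io or Monte Carlo implement this.

```python
from collections import deque
from datetime import datetime

# Hypothetical lineage: each asset maps to the assets directly downstream of it.
LINEAGE = {
    "raw.orders": ["analytics.orders"],
    "analytics.orders": ["dash:revenue", "analytics.ltv"],
    "analytics.ltv": ["dash:retention"],
    "raw.customers": ["dash:signups"],
}

# Hypothetical dashboard view log: (viewer, dashboard, viewed_at).
VIEWS = [
    ("ana", "dash:revenue", datetime(2024, 2, 9, 9, 30)),
    ("raj", "dash:retention", datetime(2024, 2, 9, 7, 0)),
    ("lee", "dash:signups", datetime(2024, 2, 9, 10, 0)),
]


def downstream_assets(lineage, source):
    """Breadth-first walk returning every asset reachable from `source`."""
    seen, queue = set(), deque([source])
    while queue:
        for child in lineage.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def blast_radius(lineage, views, source, started_at, resolved_at):
    """Return affected dashboards and the views that happened during the incident."""
    dashboards = {a for a in downstream_assets(lineage, source) if a.startswith("dash:")}
    exposed = [
        view for view in views
        if view[1] in dashboards and started_at <= view[2] <= resolved_at
    ]
    return dashboards, exposed


dashboards, exposed = blast_radius(
    LINEAGE, VIEWS, "raw.orders",
    started_at=datetime(2024, 2, 9, 8, 0),
    resolved_at=datetime(2024, 2, 9, 12, 0),
)
```

With this sample data, both the revenue and retention dashboards sit downstream of `raw.orders`, but only the 9:30 view of the revenue dashboard falls inside the incident window, so that viewer is the one stakeholder who needs a heads-up.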

Lastly, this integration allows you to store all the communications and processes around a single incident in one place, even when that incident impacts many dashboards or is linked to many failing tests. This helps not only your data team’s core workflow but also your efforts to (re)build trust across your organization.

(Re)building trust with the business

If a tree fell in a forest, and no one heard it, did anything happen? Unfortunately, this adage cannot be applied to most data failures because someone will almost always notice and ask “Does this data look right?” You can now instantly demonstrate accountability and your dedication to data quality by keeping the business informed throughout the incident resolution process.

“Monte Carlo's new integration with Workstream’s data knowledge solution addresses the needs of customers to be alerted on data issues directly where they work, further streamlining data observability workflows. We are excited to be partnering with the Workstream team to help our mutual customers automate their data incident workflows and build a culture of trust in data across their organization.”
- Barr Moses, Co-founder and CEO of Monte Carlo

Now you can automatically send alerts to your stakeholders where they already work via our push notifications and our data status pages. Your end users will be able to see not only the current status of an incident and any core context you have added but also historical insights and past trends in data downtime and availability.

With Workstream.io, you no longer need to jump across multiple data, task management, or communication tools. Simply connect us to Monte Carlo, and we’ll handle the rest.

The cherry on top: month-to-month pricing

We are pretty excited by this new integration, and how our data incident workflows are automating busy work for our customers, but we also understand that nobody wants to buy software right now. If the impact of this new integration resonates, we would love to be helpful to you and your team before the end of the year. Feel free to reach out and we can get you started with our new month-to-month pricing plan in 30 minutes or less. Getting started with data knowledge is cheaper and easier than ever before.

With our Team plan, you can now enable everyone with unlimited knowledge for just $800/month when you use the code EARLYADOPTER, valid until 12/31.

You can also lock in this low month-to-month pricing here, and our team will be in touch quickly to get you going.
