“Thanks so much for being so damn rock solid with Temporal Cloud [during the AWS outage]. Our on-call didn’t flinch when us-east-1 went down, and being able to trigger a failover to us-east-2 was 🤌 ” — Robert Ross, CEO of FireHydrant
Industry: High Tech
Use Case: Event-driven alerting and escalation orchestration
Company Size: 51-250
SDK: Go
Temporal: Cloud
FireHydrant builds incident management and alerting software that helps engineering teams define the services they own, ingest alerts from their observability tools, route and escalate notifications, open incidents directly from alerts, and manage the entire incident lifecycle through to retrospectives.
Signals, one of FireHydrant’s flagship products, competes with PagerDuty and sits at the center of that workflow.
The team wanted a platform they could trust during bad days, not just the easy ones. Visualization of executions mattered because it lets engineers see exactly what happened for a given alert rather than guess.
Being able to visualize exactly what happened gives us confidence the system behaved as expected.
Offloading workflow state to Temporal was equally important, because it let the team put more attention on customer problems and less on orchestration plumbing. After more than a year of running Signals, they found Temporal a strong fit wherever they needed more observability or a single place to unify work.
Signals was new work, but some surrounding runbook logic lived in a generic async job processor. During incident bursts, many events could arrive in a short window, and the system risked re-evaluating the same rules repeatedly. That led to duplicate work and queuing complexity right when teams needed clarity.
Troubleshooting third-party integration failures also took longer than it should have because it wasn’t obvious which customers or incidents were affected. Adopting a workflow mindset required moving away from “a simple queue” toward Workflows, Activities, and the guarantees between them.
Before Signals, the team had experimented with self-hosting Temporal. Once they scoped the product and saw how extensive it would be, they launched on Temporal Cloud to minimize operational risk while building expertise. Over time, they moved more workloads into Cloud and, with growing familiarity, felt confident evaluating hosting choices again.
The team modeled the alert journey as a Workflow. When a monitoring provider reports an issue, a single Workflow instance becomes the durable thread that ties everything together: identify customer configuration, deliver notifications, wait for acknowledgements, escalate if there’s no response, and record outcomes.
That same thread is where support starts when someone asks why a message didn’t arrive. Engineers open the Workflow’s history and work back from the alert.
Under the hood, long-lived Workflows coordinate short, idempotent Activities such as “send Slack message,” “send SMS,” or “create Slack channel.” This keeps side effects small and traceable while the Workflow owns timing and recovery. To avoid duplicate work during bursts, the Workflow evaluates rules and executes only those that have just become true. If another event arrives mid-evaluation, the design prevents unnecessary rework instead of piling up jobs in a queue.
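The source does not describe how the Workflow decides which rules have "just become true". One common way to get that behavior is edge-triggered evaluation: remember each rule's previous result and fire only on a false-to-true transition. The sketch below assumes that approach; `Rule`, `Evaluator`, and `NewlyTrue` are hypothetical names, and the alert state is simplified to a map of integers.

```go
package main

// Rule is a hypothetical shape: a name plus a predicate over the alert state.
type Rule struct {
	Name string
	Cond func(state map[string]int) bool
}

// Evaluator remembers which rules were true on the previous evaluation.
// In a Workflow, this map would simply be part of the Workflow's own state.
type Evaluator struct {
	wasTrue map[string]bool
}

// NewEvaluator returns an Evaluator with no rules marked as previously true.
func NewEvaluator() *Evaluator {
	return &Evaluator{wasTrue: map[string]bool{}}
}

// NewlyTrue evaluates every rule against state and returns, in rule order,
// the names of rules that changed from false to true since the last call.
// Rules that were already true are skipped, so a burst of identical events
// does not trigger the same action again.
func (e *Evaluator) NewlyTrue(rules []Rule, state map[string]int) []string {
	var fired []string
	for _, r := range rules {
		now := r.Cond(state)
		if now && !e.wasTrue[r.Name] {
			fired = append(fired, r.Name)
		}
		e.wasTrue[r.Name] = now
	}
	return fired
}
```

With this shape, a second event that leaves a rule's condition unchanged produces no new work, while a rule that goes false and later becomes true again fires once more.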
Signals also spans languages. FireHydrant’s Ruby code can create Workflows that Go Workers execute, and vice versa. The team standardized on Protocol Buffers for parameters so cross-language calls are safe and predictable.
Temporal Cloud handled the heavy lifting from day one. Instead of running their own cluster while ramping up a complex new product, the team focused on behavior and user experience. As they grew more comfortable with Temporal and clearer about their own needs, they moved additional services to Cloud and kept the door open to self-hosting where it makes sense.
During a Google Cloud authentication outage and the more recent AWS outage, many systems struggled to send notifications. But Signals kept moving. Durable Workflows, established connections, and clear escalation logic meant alerts continued their path and reached the people who needed to respond.
Temporal did very well during the recent AWS outage and we were able to continue serving alerts to our customers.
Tying executions to specific alerts and incidents shortened the path from “something failed” to “which customer saw it and why.” What once felt like a black box became an execution you can open, read, and reason about.
Workflows and Activities give reviewers a common mental model. Determinism encourages careful thinking earlier in the process, surfaces potential problems before they reach production, and makes stepwise rollouts easier to plan.
Once the code is written in the pattern, we know it will run to completion and we can track it every time.
The experience with Signals made Temporal the default choice for new areas where observability and unification help. As the platform matured, the team moved additional workloads to Temporal Cloud and now evaluates hosting based on what each service needs rather than habit.
The FireHydrant team shared a few learnings: