If you're using Temporal for your workflows, you already know what a powerful tool it is for building durable and reliable systems. But what if you need your workflows to respond faster—especially for user-facing applications? Latency can be a challenge when responsiveness matters. The good news is, there are ways to make your workflows snappier without sacrificing durability.

When working with interactive user experiences, even small latency variations can cause significant problems. Temporal Cloud is fast, but its minimum workflow end-to-end latency is around 100ms. That might not seem like much, but it can quickly add up when you factor in necessary network calls and activity execution time. For workflows that need to fit within a 500ms latency budget, this end-to-end latency can become a real bottleneck.

This post covers techniques for reducing workflow response time while preserving Temporal’s core strengths: durability and reliability.

It all begins with understanding where latency comes from.

Hot Paths and the Latency Problem

Durable Execution with Temporal is a great solution for developers building simple, transparent, reliable systems, but there are valid objections to putting Temporal in the hot path for interactive user transactions. Take a look at this example, which is fairly standard for e-commerce use cases:

[Figure: payment order processing]

The order has to be submitted, sent to the payment system, and sent to the pay order workflow where the transaction is validated, authorized, and captured. From there, the fulfillment process is triggered, and only then can the user receive the much-needed feedback regarding the success or failure of their payment.

Latency reduction starts with breaking down and understanding where a workflow consumes time.

Measuring Temporal Workflow Latency

When you manage your workflows with Temporal, the time it takes for a Workflow to complete depends on a variety of factors. These include Workflow startup, Task scheduling, and network calls to the server.

This figure shows how latency can build up over time. Note, this is just an illustrative example and real latencies will vary.

[Figure: workflow latency breakdown]

In this example, the Workflow Task requires some time for the Schedule and Start stages, plus some time to record results in the Completion stage, in addition to the 40 milliseconds of Worker run time.

In the corresponding Activity Task, the Schedule, Start, and Completion stages are also necessary. They add another 60 milliseconds to the 60 milliseconds used by the Activity Worker. As such, in this example, a Workflow with a single Activity takes 100 milliseconds (for the Workflow Task) + 120 milliseconds (for the Activity Task) = 220 milliseconds to complete.
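The arithmetic in this example can be sketched in a few lines of Python. This is a minimal model rather than Temporal SDK code: the function name `single_activity_latency_ms` and the split into "overhead" and "run time" are assumptions made for illustration, and the numbers are the illustrative ones from the figure, not measurements.

```python
def single_activity_latency_ms(workflow_task_ms: int,
                               activity_overhead_ms: int,
                               activity_run_ms: int) -> int:
    """Total latency for a Workflow that runs one Activity (illustrative model)."""
    # Activity Task = Schedule/Start/Completion overhead + Activity Worker run time
    activity_task_ms = activity_overhead_ms + activity_run_ms
    return workflow_task_ms + activity_task_ms


# Illustrative numbers from the example above:
# 100 ms Workflow Task, 60 ms Activity overhead, 60 ms Activity run time.
total_ms = single_activity_latency_ms(100, 60, 60)
print(total_ms)  # 220
```

Plugging in your own measurements shows how quickly each additional Activity eats into a fixed latency budget.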

Latency Testing Methodology

Throughout this post, we use time measurements to show how Temporal application efficiency improves by adopting latency-reducing patterns and techniques. Here’s how we did that:

  • We used a single laptop-based Worker written in Java.
  • We measured response times from the originating Client app, also on a laptop.
  • To simulate business logic in the Activities, we added a total of 120 milliseconds of sleep (see diagram below).
  • Our Worker was connected to Temporal Cloud in the US-East-1 region of AWS.

[Figure: payment order processing 2]

The right number of properly tuned Workers, deployed closer to the Temporal Cloud Service, would have shown better performance and lower network latency. Investigating Worker tuning and placement is a great place to start when lowering overall response time.

Using this testing methodology, the total end-to-end latency for the Workflow was 850 milliseconds before implementing the efficiency improvements discussed in this post.

Faster User Feedback with the Early Return Pattern

You can deliver faster user feedback if you re-design this flow to run the payment and fulfillment tasks asynchronously.

The Early Return pattern is an effective way to reduce latency by splitting your workflow into two distinct parts:

  • The first part, Synchronous Quick Response, handles the essential tasks that need to be completed immediately, such as validation. It quickly processes these tasks and sends a response back to the user without delay.
  • The second part, Asynchronous Background Work, takes care of the longer, more complex tasks that don’t require immediate user feedback. These tasks continue in the background, without blocking the user's workflow.

This approach improves the user experience, making your system feel more responsive without sacrificing your overall workflow integrity.
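The control flow of the two parts can be sketched with plain `asyncio`. This is only a model of the pattern, not Temporal code: the function names (`validate`, `complete_transaction`, `submit_order`) and the sleep durations are hypothetical, and an in-process `asyncio` task is not durable. In Temporal, the background part keeps running inside the Workflow, which is what preserves durability.

```python
import asyncio


async def validate(order: dict) -> bool:
    """Synchronous Quick Response: stands in for fast validation."""
    await asyncio.sleep(0.01)
    return order.get("amount", 0) > 0


async def complete_transaction(order: dict, results: list) -> None:
    """Asynchronous Background Work: stands in for authorize, capture, fulfill."""
    await asyncio.sleep(0.05)
    results.append(("completed", order["id"]))


async def submit_order(order: dict, results: list):
    """Return the validation result right away; the rest continues in the background."""
    ok = await validate(order)
    background = None
    if ok:
        background = asyncio.create_task(complete_transaction(order, results))
    return ok, background


async def main() -> list:
    results: list = []
    ok, background = await submit_order({"id": "order-1", "amount": 42}, results)
    early = list(results)  # snapshot: nothing has completed when the caller gets its answer
    if background is not None:
        await background  # the remaining work still runs to completion
    return [ok, early, results]


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The caller receives the validation result before any of the slower work has finished, which is the behavior the pattern aims for.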

The updated diagram shows validation completing and returning to the user while the rest of the transaction continues asynchronously in the Workflow.

[Figure: payment orders with early return]

When we tested with Early Return, the first response arrived in 199 milliseconds instead of 850 milliseconds, roughly 75% faster as measured from the Client.

[Figure: early return speed]

Try it out

Implementing the Early Return pattern is a great way to reduce latency without sacrificing durability. Getting started with Early Return is easy, whether you're using Temporal Cloud or running a local or self-hosted Temporal Service.

These links let you try out the Early Return pattern for reducing latency in Java and Go:

Please note: Early Return uses the Update-With-Start feature:

  • In Temporal Cloud: Update-With-Start is in Public Preview.
  • Self-Hosted/Local: If you’re not using Temporal Cloud, upgrade your Temporal Server to version 1.26.2, which includes the Public Preview release of Update-With-Start.

Use Local Activities

Early Return optimizes Workflows that can be split into synchronous and asynchronous parts. But what if you can’t split your workflow? You can still improve latency using Local Activities.

Local Activities run in the same process as your Workflows. This means there’s no need to make a round-trip call to the Temporal Service. Local Activities can save 50ms or more for each activity.

Be aware there are some limitations of Local Activities:

  • Local Activities work only for short activities that do not exceed the workflow task timeout. Local Activities do not support heartbeating.
  • Local Activities are not efficient for long retries. When a retry interval exceeds the workflow task timeout a timer is used to schedule the next retry. This implies that multiple events are added to the history on each retry. Normal activities can be retried practically indefinitely.
  • Local Activities have “at least once” semantics. A failure of a workflow task would lead to their re-execution. This includes re-execution of the whole sequence of Local Activities.
  • Local Activities extend Workflow task execution. While the task is running it cannot react to signals. So it increases the latency for signal and update handling.
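The limitations above can be turned into a quick checklist. The sketch below is a hypothetical helper, not part of any Temporal SDK: the `ActivityProfile` type, the `local_activity_concerns` function, and the 10-second timeout value are assumptions for illustration, so substitute the Workflow Task timeout your own deployment uses.

```python
from dataclasses import dataclass
from typing import List

# Assumption: 10 seconds stands in for the Workflow Task timeout; check your own configuration.
WORKFLOW_TASK_TIMEOUT_S = 10.0


@dataclass
class ActivityProfile:
    """Hypothetical description of an Activity, used only for this checklist."""
    name: str
    expected_duration_s: float
    needs_heartbeat: bool
    idempotent: bool


def local_activity_concerns(profile: ActivityProfile,
                            workflow_task_timeout_s: float = WORKFLOW_TASK_TIMEOUT_S) -> List[str]:
    """Return reasons this Activity may be a poor fit for a Local Activity."""
    concerns = []
    if profile.expected_duration_s >= workflow_task_timeout_s:
        concerns.append("runs longer than the Workflow Task timeout")
    if profile.needs_heartbeat:
        concerns.append("needs heartbeating, which Local Activities do not support")
    if not profile.idempotent:
        concerns.append("not idempotent, but may be re-executed (at-least-once semantics)")
    return concerns


quick_check = ActivityProfile("validate", 0.04, needs_heartbeat=False, idempotent=True)
long_job = ActivityProfile("transcode", 600.0, needs_heartbeat=True, idempotent=False)

print(local_activity_concerns(quick_check))  # []
print(len(local_activity_concerns(long_job)))  # 3
```

An empty list does not guarantee a good fit; it only means none of the listed limitations was flagged.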

What you get, though, is a great tradeoff. You reduce the execution time significantly, making your Workflow speedier and more responsive.

[Figure: local activity workflow latency breakdown]

Here you can see a breakdown of latency for Local Activity calls. Local Activities count as one Workflow Task, and they don’t require a Start step to match with a Worker because they’re already matched to the Workflow worker. They can also save time by not recording completion until the end of a series of Local Activities.

As a result, our Local Activity results were 60-70% faster when measured from the Client versus regular activities. In our testing, the Local Activity implementation took only 275 milliseconds for the entire workflow vs 850 milliseconds for the original Workflow response time.

[Figure: local activity results]

What is Idempotency

“Idempotency” means designing your Activities so that running them more than once has the same effect as running them exactly once. An idempotent approach avoids duplicated processing that could withdraw money twice or ship extra orders by mistake. Because a retried or re-executed Activity produces no additional effects, idempotency maintains data integrity, prevents costly errors, and makes execution more reliable.
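One common way to get this behavior is an idempotency key. The sketch below is a minimal in-memory illustration, not the approach used in the post's samples: the `PaymentLedger` class, its methods, and the key format are all hypothetical, and a real system would keep the keys in durable storage.

```python
class PaymentLedger:
    """In-memory stand-in for a payment system that deduplicates by key."""

    def __init__(self) -> None:
        self._charges = {}

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A repeated call with the same key returns the original receipt
        # instead of charging again.
        if idempotency_key in self._charges:
            return self._charges[idempotency_key]
        receipt = {"key": idempotency_key, "amount_cents": amount_cents}
        self._charges[idempotency_key] = receipt
        return receipt

    def total_charged_cents(self) -> int:
        return sum(r["amount_cents"] for r in self._charges.values())


ledger = PaymentLedger()
first = ledger.charge("order-1-payment", 1000)
retry = ledger.charge("order-1-payment", 1000)  # e.g. the Activity was re-executed

print(first == retry)                # True
print(ledger.total_charged_cents())  # 1000
```

Because the retry is recognized by its key, re-executing the Activity does not withdraw money twice.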

For more information, check out Idempotency and Durable Execution.

In this example, our activities are short-lived and idempotent, so they are a good fit for Local Activity requirements.

Combining Techniques

Want to squeeze even more speed out of your workflows? You can combine Early Return with Local Activities to return a response to the user more quickly while also continuing longer tasks in the background. In our tests, this combination cut response time by about 81%.

Combining Early Return and Local Activities reduced the time to first response from the original 850ms to 160ms, making it 20% faster than Early Return alone, and 81% faster than the original workflow.

[Figure: combining techniques result]

Eager Workflow Start to Make Workflows More Local

In a previous section, we discussed Local Activities. What if we could additionally make the Workflow local? That’s Eager Workflow Start.

Eager Workflow Start allows a Temporal Client to quickly execute a Workflow instead of delegating that execution to a separate worker. Normally, the Temporal Service assigns a task to a Worker that will durably execute the tasks. This assignment is called Matching. Matching adds extra time before the work begins to execute. With Eager Workflow Start, there’s no need for that matching step, which allows your workflow to begin faster.

[Figure: eager workflow start]

As we said above, with Local Activities, the workflow went from running in 850ms to running in 275ms. When adding Eager Workflow Start (which cannot run without Local Activities), the workflow speed improved further to 262ms.

[Figure: eager workflow start breakdown]

Try it out

  • In Temporal Cloud: You can request the pre-release "Eager Workflow Start" feature by opening a ticket or talking to your support team.
  • Self-Hosted/Local: If you’re not using Temporal Cloud, set your Temporal server configuration by launching with the following flag to enable Eager Workflow Start:
    temporal server start-dev \
      --dynamic-config-value system.enableEagerWorkflowStart=true

Calculate your Savings

Here’s a quick comparison of the different methods and their impact on latency:

  • Regular workflows: 850ms (total workflow)
  • Early Return: 199ms (first response)
  • Local Activities: 275ms (total workflow)
  • Early Return + Local Activities: 160ms (first response)
  • Eager Workflow Start + Local Activities: 265ms (total workflow)
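To turn these timings into percentage savings, the calculation can be sketched as follows. The numbers are copied from the list above; the helper name `percent_faster` is an assumption for illustration. Note that some rows measure time to first response and others total workflow time, so they are not strictly comparable.

```python
BASELINE_MS = 850  # regular workflow, total time


def percent_faster(new_ms: float, baseline_ms: float = BASELINE_MS) -> float:
    """Percentage reduction in latency relative to the baseline."""
    return round(100 * (baseline_ms - new_ms) / baseline_ms, 1)


timings_ms = {
    "Early Return (first response)": 199,
    "Local Activities (total workflow)": 275,
    "Early Return + Local Activities (first response)": 160,
    "Eager Workflow Start + Local Activities (total workflow)": 265,
}

for name, ms in timings_ms.items():
    print(f"{name}: {ms}ms, {percent_faster(ms)}% faster")
```

Substituting your own baseline and measurements gives a quick estimate of what each technique buys you in your environment.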

[Figure: latency outcomes]

Moving Forward

The techniques discussed in this post can help you drastically reduce latency, making your Temporal workflows faster and more responsive. You can mix and match Early Return, Local Activities, and Eager Workflow Start to achieve the latency your end users are looking for.
