How we built it: Building Scheduled Workflows in Temporal
Temporal’s new Scheduler feature introduces an improved experience for developers with a schedule-specific API, finer-grained configurations, improved visibility, and the ability to have overlapping runs. For an overview of its features and use, Keith Tenzer, one of our Solutions Architects, wrote a great blog post.
As this article is a contribution to our How We Built It series, I want to focus on the Why? and How? of our new Scheduler feature. As Keith covers in his post, there was a previous simple schedule feature in Temporal, but it had its drawbacks:
- Schedule couldn’t be stopped without affecting the running Workflow
- Workflow couldn’t be terminated without affecting schedule
- Schedule couldn’t be paused or have its configuration updated
- Couldn’t support overlapping runs
- Started in an unqueryable state
- No good visibility into running schedules
To address these drawbacks, we knew we had a few options.
We could build a separate scheduler system in the server using low-level components like mutable state and timer queues. This would be our most efficient option, but also the most work intensive. Ultimately, we didn’t go with this option. But, as a nod to its potential, we still designed the API with this implementation in mind, should the opportunity to switch present itself down the road.
Another option was to build more developer-friendly functionality on top of the existing CronSchedule implementation. But, as this would mean a Schedule and a Schedule Workflow are still conflated, the problem of overlapping runs wouldn’t be solved. The inability to support overlapping runs put this option to rest.
The third, and prevailing, option was to build out the new Scheduler as a Workflow. The logic of the Scheduler process would look a lot like a Workflow even if we didn’t write it as one. And, if we did write it as one, we get to reuse a lot of functionality already in the server. Ultimately, this ability to reuse existing functionality proved a strong motivator and we moved forward with building a Workflow-based scheduler.
Building a Workflow-based scheduler
Deciding to build the Scheduler as a Workflow certainly wasn’t the last design choice we had to make. The next big decision we had to make was where to house the Workflow implementation.
Do we implement the scheduler as part of the SDKs? What about as a separate library that users can link to in their workers? Or do we implement it directly on the Server workers?
Including the scheduler in the SDKs, or shipping it as a library, would mean re-implementing the scheduler Workflow in multiple languages (Go and Java don’t use our Core SDK). And, in the case of implementing it as a library, users would have to worry about an additional library in their build process and potentially suffer a degraded experience from build complications.
All this considered, we decided to build the scheduler directly into Temporal Server. With this option, users wouldn’t have to install or add any additional components—they'd just make calls to an API. From a product perspective, implementing the API on the Server also gives us the flexibility to switch to a more efficient implementation later on.
Implementing the scheduler as a Workflow presented some unique engineering challenges.
One of the first challenges we ran into was with Namespaces. Our existing internal Workflows run on the Worker role, in a Namespace called temporal-system. This is a local, non-replicated Namespace. But our new scheduler Workflows correspond to entities (“Schedules”), which live in a user Namespace. So we asked ourselves: can we make it work if the Schedule entity lives in a user Namespace, but the Workflow that implements it lives in a system Namespace?
We decided against trying to make this fit, primarily because of concerns around replication and migration. We support replicating a user Namespace to another Cluster and migrating active copies to new Clusters. As I mentioned above, the system Namespace is not replicated, so keeping the scheduler Workflows there would require building a lot of new replication functionality. If we run the scheduler Workflows in the user Namespace instead, we pretty much get replication for free.
So what did we do?
We run the scheduler Workflows in the user Namespace. But what about the SDK worker in the Worker role? It only listens on a task queue in the system Namespace. For now, we did the simplest thing possible: we just run more SDK workers! For each user Namespace, we run at least one SDK worker on one of the server Workers. There’s a new component that keeps the running workers in sync with registered Namespaces.
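To make the "keep workers in sync" idea concrete, here is a minimal sketch of a reconcile pass. It is an illustration only, not the server's code: `FakeWorker`, `WorkerManager`, and `reconcile` are hypothetical names invented for this example, and a real SDK worker does far more than flip a flag.

```python
from dataclasses import dataclass, field


@dataclass
class FakeWorker:
    """Stand-in for an SDK worker bound to one Namespace (hypothetical)."""
    namespace: str
    running: bool = False

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False


@dataclass
class WorkerManager:
    """Keeps one running worker per registered Namespace."""
    workers: dict = field(default_factory=dict)

    def reconcile(self, registered: set) -> None:
        current = set(self.workers)
        # Start a worker for every Namespace that appeared since the last pass.
        for ns in sorted(registered - current):
            worker = FakeWorker(ns)
            worker.start()
            self.workers[ns] = worker
        # Stop and forget workers whose Namespace is no longer registered.
        for ns in sorted(current - registered):
            self.workers.pop(ns).stop()
```

Calling `reconcile` repeatedly with the latest set of registered Namespaces is enough to converge: the pass is idempotent, so it can run on a timer or in response to Namespace change notifications.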
This may sound excessive but the SDK worker is relatively light, so hundreds of them in one process is not a problem. In the future, we plan to optimize this by allowing one SDK worker to handle multiple Namespaces and task queues at once (this may be the subject of a future post).
As things stood, with the scheduler Workflows in the user Namespace, if a user were to list all Workflows, they would see these scheduler Workflows as well. We consider these an internal implementation detail and didn’t want them coming up in user queries. So we had to figure out how to “hide” these scheduler Workflows from user interactions.
One option would be to simply exclude these Workflows from visibility. However, we need visibility to implement “list schedules”. We could push them to a separate visibility index, but our current visibility implementation only works with a single index, so this would require a lot of work.
Then we thought, maybe we can just filter them out of the results of visibility queries that we get back from the database? But this led to pagination woes: a request for a page of 100 user Workflows might return only 95, or worse, 0, if only scheduler Workflows were found. But we were on the right track.
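A toy example shows why filtering after the fact breaks pagination. The data and helper names below are made up for illustration and do not reflect the real visibility store.

```python
def fetch_page(rows, offset, page_size):
    """Stand-in for a visibility store query that returns one page."""
    return rows[offset:offset + page_size]


def user_visible(rows):
    """Drop scheduler Workflows after the page has already been fetched."""
    return [r for r in rows if r["kind"] != "scheduler"]


# Hypothetical data: 5 scheduler Workflows sort ahead of 10 user Workflows.
rows = [{"id": f"sched-{i}", "kind": "scheduler"} for i in range(5)]
rows += [{"id": f"user-{i}", "kind": "user"} for i in range(10)]

# The store returns full pages, but the post-filter shrinks them unevenly.
first_page = user_visible(fetch_page(rows, offset=0, page_size=5))
second_page = user_visible(fetch_page(rows, offset=5, page_size=5))
```

Here `first_page` comes back empty even though ten user Workflows exist, which is exactly the confusing result a caller would see. The filter has to be part of the query itself for page sizes to stay predictable.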
In the end, we decided to use a simple trick: a search attribute. We marked these Workflows with a specific search attribute, and added a clause on that attribute to all queries (unless a query mentions it specifically). To keep things general, we called this new level of grouping within a Namespace a "Namespace division."
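The query rewriting can be sketched roughly as follows. This is an assumption-heavy illustration: `NamespaceDivision` is a placeholder attribute name, the SQL-like `IS NULL` clause is illustrative syntax, and the substring check stands in for what would more realistically be a pass over a parsed query.

```python
import re

# Placeholder attribute name for this sketch; not taken from the server.
DIVISION_ATTR = "NamespaceDivision"


def add_division_clause(query: str) -> str:
    """Restrict a visibility query to Workflows outside any division,
    unless the caller already filters on the division attribute."""
    if re.search(rf"\b{DIVISION_ATTR}\b", query):
        return query  # the caller mentioned the attribute; leave the query alone
    default_clause = f"{DIVISION_ATTR} IS NULL"
    if not query.strip():
        return default_clause
    # Parenthesize the user's query so any OR in it cannot escape the filter.
    return f"({query}) AND {default_clause}"
```

With this in place, an ordinary "list Workflows" call never sees scheduler Workflows, while "list schedules" can ask for the scheduler division explicitly and get only those.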
Seeing some benefits, we might later extend this Namespace division functionality to other elements of Temporal.
Signal vs. query
I noted above that two of the big improvements in the new scheduler are that it is stateful and that it can be interacted with via a synchronous gRPC API. Under the hood, calls like create schedule and delete schedule are really start workflow and terminate workflow. Commands like update change the state of the Workflow in a way that alters its behavior, so they are implemented with signals. But what about something like describe schedule? This got a little tricky. Part of the output of describe schedule is the set of currently running Workflows that were started by the scheduler. The scheduler knows which Workflows it started, but does it know which are still running? For the scheduler’s state to be updated, it would need to be notified when a Workflow completes. But the only things that can wake up a Workflow are a timer (or timeout), a signal, an Activity completion, or a Child Workflow completion.
The Workflows that are started by the scheduler are not Child Workflows; they're independent (this is because the scheduler Workflow is an implementation detail that we don’t want to expose and because Child Workflow completion notification doesn’t work across continue-as-new anyway). This means that the only reasonable way to get notified if a scheduled Workflow completes is to have an Activity that listens and returns when the Workflow completes.
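The shape of such a "watcher" Activity is simple: block until the target Workflow is no longer running, then return, which wakes the scheduler. The sketch below is a simplified stand-in, assuming a hypothetical `get_status` callable and plain polling. A real implementation would more likely use a long-poll against the server than a sleep loop.

```python
import time


def watch_until_closed(get_status, poll_interval=1.0, sleep=time.sleep):
    """Return the final status once the watched Workflow stops running.

    get_status is a hypothetical callable that returns a status string
    such as "RUNNING" or "COMPLETED" for the Workflow being watched.
    """
    while True:
        status = get_status()
        if status != "RUNNING":
            # Returning completes the Activity, which wakes the scheduler.
            return status
        sleep(poll_interval)
```

Because the Activity's completion is itself one of the events that can wake a Workflow, this turns "an unrelated Workflow finished" into something the scheduler can react to.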
So based on this, we can query the Workflow for "describe", but the set of running Workflows it returns might include ones that have completed already but whose state has not yet updated. How did we avoid this?
Eventually, we’ll use a new feature that allows synchronous Workflow updates, but this is still in development. So for now, the “describe” implementation in the frontend service checks that all the “running Workflows” are still running. If any are completed, it sends the scheduler a signal, to tell it to wake up and refresh the state of all its running Workflows, and then it does the query again.
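The query, verify, signal, and re-query sequence can be sketched like this. Everything here is a hypothetical in-memory model: `FakeScheduler`, `describe`, and the `cluster` dictionary are invented for illustration, and the signal is handled synchronously, whereas a real signal is asynchronous.

```python
class FakeScheduler:
    """Stand-in for a scheduler Workflow (hypothetical, in-memory)."""

    def __init__(self, started, cluster):
        self.cluster = cluster        # workflow id -> True if still running
        self.running = list(started)  # the scheduler's possibly stale view

    def query_running(self):
        return list(self.running)

    def signal_refresh(self):
        # Wake up and re-check every Workflow the scheduler believes is running.
        self.running = [w for w in self.running if self.cluster[w]]


def describe(scheduler, cluster):
    """Frontend-side describe: query, verify, and refresh once if stale."""
    running = scheduler.query_running()
    if any(not cluster[w] for w in running):
        scheduler.signal_refresh()
        running = scheduler.query_running()
    return running
```

The extra check costs a lookup per running Workflow on each describe call, but it means callers do not see Workflows listed as running after they have already completed.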
Building our new scheduler as a Workflow was a great challenge and ultimately allowed us to build a feature that I really believe is going to improve the developer experience for Temporal users.
If you want to learn more about using the new scheduler in your applications, check out our docs for SDK-specific information.