Back when I joined Temporal in early 2020, one of the first things our CEO and co-founder Maxim said to me was, “we need a single binary.” “Single binary” was the term Max used for a cross-platform binary that contained a full-fledged Temporal Cluster (including persistence and dependencies). At the time, I would never have imagined that this feature request would turn into a three-year journey.
For context, even in early 2020 (when there wasn’t a production build of Temporal), we knew that our technology could be intimidating and unapproachable for new users. Although a lot of that intimidation had to do with the nature of the product itself, a fair share was also caused by how we packaged and distributed Temporal.
Up until now, the official way of developing locally with Temporal was a solution based on Docker Compose. Many of you are very comfortable with Docker, but the reality is that it’s still a foreign concept to many developers. Even for those who know how to use it, Docker can represent additional overhead and indirection, which slows you down. We’ve received direct feedback about this over the years, and we also intuitively knew it to be true. This was the initial and major motivation behind the idea of a single binary: a lighter-weight, self-contained Temporal development experience.
Little did I know that the single binary would become one of the most widely discussed things we would never end up building ourselves.
Shoulda coulda woulda
Those of you reading may be rightfully asking, “why didn’t you just build it?” Fair. The answer is mostly a bunch of bad Temporal-agnostic reasons like “conflicting priorities” and “lack of an owner,” but to our credit, the project did represent serious engineering work. It’s totally reasonable if it doesn’t immediately make sense why serious development was needed. To make things clear, I will briefly talk about what goes into a Temporal Cluster.
At the end of the day, a Temporal Cluster is really made up of three main things:
1. Services distributed and run as binary artifacts. These are created by compiling the service code defined in the Temporal Server repo.
2. The dedicated UI server and the UI itself. Historically, the UI server was a Node.js application.
3. Dependencies that are run as part of the Temporal Cluster and are required for it to function. The two dependencies that are especially interesting are the persistence and visibility stores.
As you might have guessed, when it comes to compiling the entire cluster into a single binary file, the services (#1) are the least of the issues. We simply compile the services into a single binary instead of distributing them separately. Not much more to say.
The UI (#2) was a mixed bag. Bundling the UI itself (as in the CSS, HTML, and JavaScript) doesn’t represent much of an issue because it's just a matter of including the relevant files in the binary. The UI server, though, definitely presented an interesting challenge because it's not really feasible to compile a Node.js application (with all required libraries and packages) into a Go binary. The problem wasn’t insurmountable, but it needed solving.
Dependencies (#3) are where things get really interesting. For the majority of Temporal history, Cassandra was the default database. Let’s just say that Cassandra is not designed to be compiled into a binary with a bunch of other code that runs in memory. That being said, Temporal does support pluggable persistence, so we’re not actually limited to Cassandra. Unfortunately, MySQL and PostgreSQL also fail the sniff test when compiling to a single binary that can run in memory. Fortunately, early in our discussions, a team member realized that, although MySQL wasn’t going to work out, a SQL-compatible alternative could: SQLite. SQLite is interesting because it's a SQL database engine that is used directly as a library inside the host process and can keep its data entirely in memory. These properties made it an ideal fit.
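To make the "database as a library" point concrete, here is a minimal sketch using Python's standard-library `sqlite3` module. Python is used only for brevity (the single binary itself is a Go program), and the `executions` table, its columns, and the helper function names are hypothetical illustrations, not Temporal's actual schema or code.

```python
import sqlite3


def open_in_memory_store() -> sqlite3.Connection:
    """Open a SQLite database that lives entirely inside this process."""
    # ":memory:" asks SQLite to keep the database in RAM: there is no
    # server process, no socket, and no data directory to manage.
    conn = sqlite3.connect(":memory:")
    # Hypothetical table for illustration only (not Temporal's schema).
    conn.execute(
        "CREATE TABLE executions ("
        " workflow_id TEXT PRIMARY KEY,"
        " status TEXT NOT NULL)"
    )
    return conn


def record_execution(conn: sqlite3.Connection, workflow_id: str, status: str) -> None:
    """Insert one execution row using ordinary SQL."""
    conn.execute(
        "INSERT INTO executions (workflow_id, status) VALUES (?, ?)",
        (workflow_id, status),
    )
    conn.commit()


def list_by_status(conn: sqlite3.Connection, status: str) -> list:
    """Return workflow IDs with the given status, sorted by ID."""
    rows = conn.execute(
        "SELECT workflow_id FROM executions WHERE status = ? ORDER BY workflow_id",
        (status,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    store = open_in_memory_store()
    record_execution(store, "order-1", "Running")
    record_execution(store, "order-2", "Completed")
    print(list_by_status(store, "Running"))
```

Everything here runs in one process with plain function calls, which is the property that makes this style of database a candidate for bundling into a single executable.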
With SQLite, we felt confident that the persistence aspect of the single binary was solvable, but one other major dependency was in our way: Advanced Visibility. Temporal has two levels of Visibility support:
- Basic Visibility: Always supported and powers basic listing and filtering.
- Advanced Visibility: Optional extension to Basic Visibility; enables users to create and update custom Workflow indices.
Basic Visibility is designed to work directly with the standard Temporal persistence, but for performance reasons, Advanced Visibility can be leveraged only by setting up Temporal with Elasticsearch. Like Cassandra, Elasticsearch is not designed to be compiled and run in memory with a bunch of other code. If we wanted to leverage SQLite, we would also need a solution to the Elasticsearch problem. At this point, analysis paralysis set in, and the project stopped making forward progress.
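As a rough illustration of the difference between the two Visibility levels, the sketch below models built-in listing and custom-attribute lookup on an in-memory SQLite database. This is an assumption-laden toy: the table layout, the `search_attributes` name/value design, and every function name are hypothetical, and none of it reflects how Temporal actually implements Visibility.

```python
import sqlite3


def create_visibility_store() -> sqlite3.Connection:
    """Create a toy visibility store (hypothetical schema, not Temporal's)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Built-in fields: enough for basic listing and filtering.
        CREATE TABLE executions (
            run_id        TEXT PRIMARY KEY,
            workflow_type TEXT NOT NULL,
            status        TEXT NOT NULL
        );
        -- User-defined attributes: one row per (run, attribute name).
        CREATE TABLE search_attributes (
            run_id TEXT NOT NULL,
            name   TEXT NOT NULL,
            value  TEXT NOT NULL,
            PRIMARY KEY (run_id, name)
        );
        -- Index so lookups by custom attribute do not scan every row.
        CREATE INDEX idx_attr_name_value ON search_attributes (name, value);
        """
    )
    return conn


def add_execution(conn, run_id, workflow_type, status, attributes=None):
    """Store one execution plus any custom attributes (values kept as text)."""
    conn.execute(
        "INSERT INTO executions (run_id, workflow_type, status) VALUES (?, ?, ?)",
        (run_id, workflow_type, status),
    )
    for name, value in (attributes or {}).items():
        conn.execute(
            "INSERT INTO search_attributes (run_id, name, value) VALUES (?, ?, ?)",
            (run_id, name, str(value)),
        )
    conn.commit()


def list_by_status(conn, status):
    """Basic-style query: filter on a built-in field only."""
    rows = conn.execute(
        "SELECT run_id FROM executions WHERE status = ? ORDER BY run_id",
        (status,),
    )
    return [row[0] for row in rows]


def list_by_attribute(conn, name, value):
    """Advanced-style query: filter on a user-defined attribute."""
    rows = conn.execute(
        "SELECT e.run_id FROM executions AS e"
        " JOIN search_attributes AS a ON a.run_id = e.run_id"
        " WHERE a.name = ? AND a.value = ?"
        " ORDER BY e.run_id",
        (name, str(value)),
    )
    return [row[0] for row in rows]
```

A separate name/value table is only one of several possible designs. The point is that custom attributes need their own indexed storage, which is the part that historically depended on Elasticsearch.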
Overthinking it
While we were busy overthinking things, some Temporal power users at Datadog had been independently inspired to solve the same problem. More than that, they had similarly converged on SQLite as the most viable answer to solving the persistence problem. They also made a key decision to target an MVP without support for Advanced Visibility. It wouldn’t be perfect, but you have to start somewhere.
This is where things were more than a year ago when Jacob LeGrone (the primary author of Temporalite and the person who brought Temporal into Datadog) approached me about the idea of creating a self-contained Temporal binary along with Datadog contributing an SQLite driver upstream. We enthusiastically approved the idea, and Temporalite was born.
Although we’re immensely appreciative of all the time and energy Jacob and others at Datadog put into Temporalite, I don’t actually see that as the biggest contribution the team made to us. When Datadog came in and proved out the single binary, their motivation wasn’t limited to improving the getting-started experience for Temporal. For example, one of their planned use cases required running an embedded Temporal server instance. The Datadog team also led the charge in using Temporalite as a testing tool, which quickly became a huge boon for the local dev cycle. This is all to say that they didn’t just build the core foundation of a single binary. They inspired us to think more fundamentally about the idea itself.
Getting inspired
With Temporalite now available, we immediately began to see the impact and excitement from early adopters. It became clear fairly quickly that Temporalite represented the evolution of our Docker-based getting-started experience, along with much more. After a few months, we felt confident that switching to a Temporalite-based solution as the default would be the best choice for future adopters of our technology.
After we internally committed to Temporalite, we realized that we might need to ask a tough favor of Datadog. Although we have immense gratitude and respect for the Datadog brand and engineering team (probably some of the strongest Temporal users that exist), we knew that it would be risky to have a default getting-started experience that was not formally owned by Temporal. Beyond the ideological reasons, there are also practical ones. Many companies and users want to consume binaries that are signed directly by the originating company, and Temporalite is a flavor of compiling Temporal Server itself. With this in mind, I approached the Datadog team and asked whether they would be willing to transfer ownership. Their number-one goal was the health and success of the technology and the community around it, so they happily transferred the repo to us. They remain, and will remain, maintainers and equal partners.
A unified solution
As previously mentioned, Datadog gifted us more than the Temporalite codebase: they gave us the inspiration to think bigger and better. We began to imagine what a fundamentally better getting-started experience would look like if based on Temporalite. One clear requirement was complete support for all Temporal APIs, which meant bringing Advanced Visibility to SQLite because, as discussed earlier in this post, those features have been historically limited to Elasticsearch. Making the UI available was also a requirement, but thanks to some solid foresight on our part, we had already rewritten the UI server in Go as part of releasing our next-generation UI experience last year.
One novel requirement emerged after seeing what Datadog had done. I realized that (at least in my own imagined ideal experience) there would be only a single Temporal tool to rule them all. Compare this with current reality, where even with Temporalite, users are still required to install and separately manage our command-line tooling tctl (which enables users to operate and interact with running Temporal Clusters).
I began to discuss with the team the idea of unifying our tooling under a single, easily installable package. It took some selling, but eventually, we were all able to align. Thus Temporal CLI was born.
Temporal CLI
Temporal CLI comes with the outstanding power and convenience of Temporalite plus support for all core Temporal functionality, including Advanced Visibility and our next-generation tctl experience. For the first time, all that is required to start using Temporal is running one install command from a familiar package manager, such as Homebrew:
brew install temporal
We’re incredibly excited and grateful for the official release of Temporal CLI and truly believe it will drastically improve the experience of Temporal development for new and existing users.
For us, this is just step 1. We plan to work closely with Datadog on a shared roadmap and vision for the Temporal CLI to ensure it continues to evolve and expand to suit the needs of our users.
One spoiler in that area that I’m happy to preview is the guided onboarding experience we’re hoping to build into and distribute with Temporal CLI. This experience is made possible by our powerful Web UI rewrite that landed last year.
But as I said, this isn’t just our vision. Jacob had some specific thoughts to share for what’s happening and what’s coming next.
“Here at Datadog, we’re looking forward to further stabilization now that temporaltest package is validated with each change to Temporal server. Removing the dependency on temporalio/temporalite in our repos also makes it easier to upgrade quickly to the latest server release.
There is a ton of excitement for Advanced Visibility support with SQLite. This was the last missing piece to make our Workflow implementations fully portable between our larger production clusters and Temporalite-based dev/test environments. We should be able to ship higher level helpers for enabling Advanced Visibility and pre-registering custom Search Attributes in temporaltest soon as well.
Embedding Temporal in a larger application is a crazy new direction that we’re beginning to explore seriously because it opens up a bunch of use cases that would have previously caused circular runtime dependencies on our production Temporal service. For example, we’re working on tooling to compile a Temporal server, UI, and one or more workers in a single binary. The idea is that this tool can run automation which will bootstrap and upgrade our lowest-level cloud infrastructure. Even these single-node servers can be configured for high durability by replicating SQLite state to blob storage via Litestream.” - Jacob LeGrone