Stripe’s payment processing relies on open-source Kafka, with 700 terabytes of publish throughput every day. Managing 50 Kafka clusters across three environments and multiple regions, however, is a difficult task. That’s where Stripe’s Stream Infrastructure team stepped in, using Temporal to build a Kafka Control Plane. In this talk, we’ll take a deep dive into Stripe’s Kafka Control Plane, exploring:

  • The functionality of the Control Plane, ranging from broker replacement and cluster rebalancing to topic configuration and health management.
  • The reasons behind choosing Temporal as the foundation for this advanced platform and the benefits it offers for managing Kafka at scale.
  • How we ensured the safety of long-running workflows that make state changes to our system, which is crucial for preventing payment-processing downtime.
  • The features we’ve developed to ensure all system state changes remain secure, providing a reliable and efficient experience.
  • The challenges that we faced when building our control plane in Temporal.