Using Temporal with Anthropic's Message Batches API
Ever find yourself jotting down questions while researching, only to run them through an AI assistant later? That's exactly what inspired this project. I had a simple script that would process my research questions one by one through Claude, but occasionally requests would fail and—let's be honest—I was too lazy to figure out which ones. I'd just run the whole thing again and eat the cost. 💸
This integration tackles several key problems when working with AI at scale:
- Cost efficiency - The Message Batches API offers a 50% discount on token costs compared to standard API calls (see the quick math after this list). Who doesn't like saving money?
- Reliability - Using Temporal workflows means our processing survives crashes, restarts, and even my laptop being tossed into the ocean (which I considered at one point).
- Simplicity - No need to build complex queuing systems or track batch IDs manually. Temporal handles all that state management for us.
- Scalability - The Message Batches API supports up to 10,000 queries per batch, with each batch processed within 24 hours, and higher rate limits to boot.
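To put the discount in concrete terms (back-of-the-envelope, assuming Claude 3.7 Sonnet's list prices of $3 per million input tokens and $15 per million output tokens): a full batch of 10,000 questions averaging 500 input and 300 output tokens apiece comes to 5M input and 3M output tokens, roughly $60 at standard rates but about $30 through the Batches API.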
How It Works
The solution combines two powerful tools:
- Anthropic's Message Batches API: A specialized endpoint for processing large volumes of Claude queries asynchronously at a significant discount
- Temporal: A distributed orchestration platform that makes our workflows durable and fault-tolerant. If you're here, you've probably heard of Temporal before.
The core workflow is surprisingly simple:
- Submit questions in a batch to Claude
- Poll for completion status (with Temporal managing this state)
- Retrieve and process results when ready
Implementation Highlights
// activities.ts (assumed file layout): the Anthropic calls live in Temporal activities
import Anthropic from '@anthropic-ai/sdk';
import type { BatchCreateParams } from '@anthropic-ai/sdk/resources/messages/batches';

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

export interface SimpleBatchMessage {
  id: string; // becomes the request's custom_id, used to match results back up later
  content: string;
}

// Create a batch request and return its ID for later polling
export async function createBatch(messages: SimpleBatchMessage[]): Promise<string> {
  const requests: BatchCreateParams.Request[] = messages.map((message) => ({
    custom_id: message.id,
    params: {
      model: 'claude-3-7-sonnet-latest',
      max_tokens: 1024,
      messages: [{ role: 'user', content: message.content }],
    },
  }));
  // The Message Batches API is GA now, so the beta prefix is no longer needed
  const batch = await anthropic.messages.batches.create({ requests });
  return batch.id;
}
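The workflow in the next snippet also leans on two activities the post doesn't show in full: checkBatchStatus and getBatchResults. Here's a minimal sketch of what they might look like, built on the SDK's batches.retrieve and batches.results endpoints; keeping only the successful text responses is my own simplification:

export async function checkBatchStatus(batchId: string): Promise<boolean> {
  const batch = await anthropic.messages.batches.retrieve(batchId);
  // processing_status is 'in_progress', 'canceling', or 'ended'
  return batch.processing_status === 'ended';
}

export async function getBatchResults(batchId: string): Promise<Record<string, string>> {
  const answers: Record<string, string> = {};
  // results() streams the batch's JSONL output one entry at a time
  const results = await anthropic.messages.batches.results(batchId);
  for await (const entry of results) {
    if (entry.result.type === 'succeeded') {
      const block = entry.result.message.content[0];
      answers[entry.custom_id] = block.type === 'text' ? block.text : '';
    }
  }
  return answers;
}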
The Temporal workflow ties everything together with built-in resilience:
// workflows.ts (assumed file layout): orchestration only, no direct I/O
import { proxyActivities, sleep } from '@temporalio/workflow';
import type * as activities from './activities';

// Activities are called through a proxy so Temporal can persist and retry them
const { createBatch, checkBatchStatus, getBatchResults } = proxyActivities<typeof activities>({
  startToCloseTimeout: '1 minute',
});

export async function batchRequests(messages: activities.SimpleBatchMessage[]) {
  // Step 1: Create the batch; the ID is durably recorded in the workflow history
  const batchId = await createBatch(messages);
  // Step 2: Poll for completion status
  while (true) {
    if (await checkBatchStatus(batchId)) {
      // Step 3: Retrieve and return the results when ready
      return getBatchResults(batchId);
    }
    await sleep('10 seconds'); // a durable timer, not a blocked thread
  }
}
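None of this runs by itself, of course: Temporal needs a worker process hosting the workflow and activity code. A minimal sketch, assuming the two files above live at ./activities and ./workflows; the task queue name batch-requests is a placeholder I picked:

// worker.ts: registers the workflow bundle and activities on a task queue
import { Worker } from '@temporalio/worker';
import * as activities from './activities';

async function main() {
  const worker = await Worker.create({
    workflowsPath: require.resolve('./workflows'),
    activities,
    taskQueue: 'batch-requests', // clients must start workflows on this same queue
  });
  await worker.run();
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});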
Visual Monitoring
One of the nicest bonuses is being able to see what's happening through Temporal's Web UI. Watching your workflows in progress gives you that warm fuzzy feeling of knowing your batch jobs are chugging along nicely, even if you step away.
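If you develop against the local dev server (temporal server start-dev), the Web UI lives at http://localhost:8233, and giving each run a recognizable workflow ID makes it easy to spot there. A hypothetical starter script; the queue name, ID scheme, and sample question are all placeholders:

// client.ts: starts a run and waits for the results
import { Client } from '@temporalio/client';
import { batchRequests } from './workflows';

async function main() {
  const client = new Client(); // connects to localhost:7233 by default
  const handle = await client.workflow.start(batchRequests, {
    taskQueue: 'batch-requests',
    workflowId: `research-batch-${Date.now()}`, // this is the ID you'll see in the UI
    args: [[{ id: 'q1', content: 'Why do batch APIs cost less?' }]],
  });
  console.log(await handle.result());
}

main().catch(console.error);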
Why This Matters
This isn't just about running a bunch of AI queries—it's about creating infrastructure that's:
- Resilient: No more losing track of batches when processes crash
- Cost-effective: Cutting token costs in half compared to standard API usage
- Observable: See exactly what's happening with your processing at all times
- Maintainable: Clean separation of concerns through Temporal's activity/workflow pattern
For anyone working with AI at scale—whether analyzing customer feedback, processing documents, or running large evaluations—this approach provides a solid foundation that won't break the bank or your sanity.