Code Exchange - Temporal Hugging Face Question Planetarium

This is an example implementation of running a Hugging Face model using Temporal activities and workers. The page uses WebSockets to send updates back and forth between our Flask server and the client browser. By leveraging Temporal, we get automatic job queuing and execution, along with retries, timeout management, question status tracking, and more.

Screenshot of running application

How this project works:

Screenshot of terminal running the above processes

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   Flask App     │    │  Temporal       │    │  Worker         │
│                 │◄──►│  Server         │◄──►│  (AI Model)     │
│                 │    │                 │    │                 │
│ • Web Interface │    │ • Workflow      │    │ • SmolLM3-3B    │
│ • WebSockets    │    │   Queue         │    │ • Inference     │
│ • Status Updates│    │ • Job Management│    │ • Retry Logic   │
└─────────────────┘    └─────────────────┘    └─────────────────┘

This project has three major components that communicate with each other.
- The Flask app manages our frontend, pushes updates to the browser over WebSockets, and creates new workflows to be coordinated by our Temporal server.
- The Temporal server assigns work to our worker, manages the queue, handles retries, and more.
- Our worker executes our Hugging Face model and returns the answers to our questions.

Temporal UI

You can ask as many questions as you'd like, then open the Temporal UI running on port 8233 to view the status of the queue. You can navigate to a specific task queue by clicking on the question in our Flask UI. From the Temporal UI, you can restart, terminate, or simply view the status of any specific workflow.
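
One way a question in the Flask UI could link into the Temporal UI is by building a URL from the workflow ID. The sketch below assumes that each question maps to one workflow ID, that the `default` namespace is in use, and that the UI serves workflow pages under `/namespaces/<namespace>/workflows/<workflow_id>`. The helper name `workflow_ui_url` is hypothetical.

```python
from urllib.parse import quote

TEMPORAL_UI_BASE = "http://localhost:8233"  # UI port mentioned above
NAMESPACE = "default"                       # assumed namespace


def workflow_ui_url(workflow_id: str) -> str:
    """Build a link to a workflow's page in the Temporal UI (assumed route layout)."""
    # Percent-encode the ID so characters such as "/" or spaces stay in one path segment.
    return f"{TEMPORAL_UI_BASE}/namespaces/{NAMESPACE}/workflows/{quote(workflow_id, safe='')}"


print(workflow_ui_url("question-42"))
```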