AgentOfCode
AgentOfCode is an "agentic" LLM solution that uses several Gemini models and Sonnet 3.5 to iteratively work through Advent of Code problems, committing its incremental progress to GitHub along the way.
The agent parses the problem HTML, extracts the examples, generates unit tests and an implementation, and then automatically runs the tests. It then iteratively "debugs" any errors or test failures by rewriting the unit tests and/or the implementation until the tests pass. Finally, it runs the solution against the real problem input and submits the answer to find out whether it was actually correct.
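As a rough illustration of that generate, test, and revise loop, here is a minimal sketch. It is not the project's actual code: `Attempt`, `RunResult`, `debug_loop`, the `max_iterations` cap, and the `fake_*` stubs are all hypothetical names invented for this example, and the stubs stand in for the LLM calls and the real test runner.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    tests: str           # generated unit-test source (hypothetical shape)
    implementation: str  # generated solution source (hypothetical shape)


@dataclass
class RunResult:
    passed: bool
    output: str  # captured errors/failures, fed back into the next revision


def debug_loop(
    generate: Callable[[], Attempt],
    run_tests: Callable[[Attempt], RunResult],
    revise: Callable[[Attempt, RunResult], Attempt],
    max_iterations: int = 5,  # assumed cap so the loop always terminates
) -> Optional[Attempt]:
    """Return the first attempt whose tests pass, or None if the cap is hit."""
    attempt = generate()
    for _ in range(max_iterations):
        result = run_tests(attempt)
        if result.passed:
            return attempt
        # Rewrite the tests and/or the implementation using the failure output.
        attempt = revise(attempt, result)
    return None


# Stubs standing in for the LLM and the test runner (assumptions, not real APIs).
def fake_generate() -> Attempt:
    return Attempt(tests="assert solve('1 2') == 3", implementation="buggy")


def fake_run_tests(attempt: Attempt) -> RunResult:
    ok = attempt.implementation == "fixed"
    return RunResult(passed=ok, output="" if ok else "AssertionError")


def fake_revise(attempt: Attempt, result: RunResult) -> Attempt:
    return Attempt(tests=attempt.tests, implementation="fixed")


solved = debug_loop(fake_generate, fake_run_tests, fake_revise)
print(solved.implementation if solved else "gave up")  # prints "fixed"
```

Passing the three steps in as callables keeps the loop itself independent of any particular model or test runner. In this sketch, running against the real input and submitting would happen only after `debug_loop` returns a passing attempt.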
Temporal Workflow
To give you a sense of the agent's debugging process, here's a screenshot of the Temporal workflow that implements the agent, taken from the run that solved both parts of day 1.