At Traceloop, we’re solving the single thing engineers hate most: writing tests for their code. More specifically, writing tests for complex systems with lots of side effects, such as this imaginary one, which is still a lot simpler than most architectures I’ve seen:

As you can see, when an API call hits a service, a lot happens asynchronously in the backend; some of it is even conditional.
Traceloop leverages LLMs to understand the OpenTelemetry traces emitted by such systems. Those LLMs can then generate tests that verify the system always works as designed. For example, in the diagram above, we make sure that:
- Another service stores data in the database
- Potato Handler Service is only called if the user actually likes potatoes, and so on…
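To make this concrete, here is a minimal sketch of what a generated trace-based test might look like. The span names, attributes, and the `assert_trace_behavior` helper are all hypothetical stand-ins for illustration; real OpenTelemetry span exports carry much more structure.

```python
# Hypothetical sketch of a generated trace-based test.
# Spans are simplified stand-ins for real OpenTelemetry span exports.

def assert_trace_behavior(spans):
    """Check that a traced request behaved as designed."""
    by_name = {s["name"]: s for s in spans}

    # Another service stores data in the database
    assert "db.insert" in by_name, "expected a database write span"

    # Potato Handler Service is only called if the user likes potatoes
    root = by_name["POST /users"]
    if not root["attributes"].get("user.likes_potatoes"):
        assert "PotatoHandler.handle" not in by_name, \
            "PotatoHandler called for a user who doesn't like potatoes"

# Example exported trace (simplified): user dislikes potatoes,
# so the handler span must be absent but the DB write must be present.
spans = [
    {"name": "POST /users", "attributes": {"user.likes_potatoes": False}},
    {"name": "db.insert", "attributes": {}},
]
assert_trace_behavior(spans)
```

The idea is that assertions run against the trace itself rather than against mocked internals, so the test survives refactors as long as the observable behavior stays the same.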
But how does it actually work?
While LLMs feel like magic in many domains, getting them to generate high-quality tests is a difficult task. We need to give the LLM significant context about what's happening in the system so it can generate meaningful tests. But we can't