DeepEval v0.14 Update
For those new to DeepEval: it provides a Pythonic way to run offline evaluations on your LLM pipelines so you can launch into production with confidence. Think of it as a testing suite for LLMs.
This product update covers a number of improvements, including:
- Synthetic Data Creation Using LLMs
- Bulk Review For Synthetic Data Creation
- Custom Metric Logging
- Improved Developer Experience + CLI Improvements
Let’s get started:
For Retrieval Augmented Generation (RAG) applications built with tools like LlamaIndex, developers want an easy way to quickly measure the performance of their RAG pipeline.
This is now achievable in just one line of code.
# Assumed import path for DeepEval v0.14; it may differ in other releases.
from deepeval.dataset import create_evaluation_query_answer_pairs

dataset = create_evaluation_query_answer_pairs(
    openai_api_key="sk-xxx",
    context="FastAPI is a Python language.",
    n=3
)
Under the hood, it uses ChatGPT to automatically create n query-answer pairs: a simple ChatGPT prompt takes in the original context, and each generated pair is fed into an LLMTestCase. The LLMTestCase abstraction is one of the building blocks of DeepEval, and it is what makes measuring the performance of these RAG pipelines possible.
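To make that concrete, here is a minimal sketch of what a single generated pair looks like as a test case. The import path and field names (query, expected_output, context) are assumptions based on the v0.14-era API and may differ in your installed version.

# Sketch only: import path and field names are assumed and may differ by version.
from deepeval.test_case import LLMTestCase

test_case = LLMTestCase(
    query="What is FastAPI?",                         # synthetic query generated by ChatGPT
    expected_output="FastAPI is a Python language.",  # synthetic answer generated by ChatGPT
    context="FastAPI is a Python language.",          # the original context you supplied
)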
Interested in finding out more? Read about how to run this here.
Once you have created synthetic data, you can easily add or remove individual data points. A sample screenshot of the dashboard for reviewing synthetic data is shown below.
The best part? The dashboard is launched entirely from Python and can be self-hosted. This is done simply by running:
dataset.review()
When reviewing the dataset, you can easily add or delete rows depending on which data you think is important for your evaluation.
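After curating the dataset, the natural next step is to run an evaluation against it. The sketch below is hypothetical: generate_llm_output stands in for your own RAG pipeline, and the run_evaluation method with its completion_fn argument is an assumption rather than confirmed v0.14 API, so check the documentation for the exact call.

# Hypothetical workflow; the method and argument names below are assumptions.
def generate_llm_output(query: str) -> str:
    # Placeholder for your own pipeline, e.g. a LlamaIndex query engine.
    return "FastAPI is a web framework for building APIs in Python."

# Assumed API: run every curated query through your pipeline and score the results.
dataset.run_evaluation(completion_fn=generate_llm_output)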
Custom metric logging has be