Custom Metrics

Upload and manage custom evaluation metrics

Each custom metric must be a Python file that defines a function evaluate(prediction: str, reference: str) -> float, which returns a score in the range [0.0, 1.0]. After upload, the AI agent reviews your code, tests it in a sandbox, and confirms compatibility.
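As a rough illustration of the required shape, here is a minimal sketch of a metric file. The choice of token-overlap F1, the lowercasing and whitespace tokenization, and the handling of empty strings are all assumptions made for this example; only the evaluate signature and the [0.0, 1.0] range come from the requirement above.

```python
from collections import Counter


def evaluate(prediction: str, reference: str) -> float:
    """Token-overlap F1 between prediction and reference, in [0.0, 1.0].

    Illustrative only: the tokenization and edge-case rules below are
    assumptions, not requirements of the platform.
    """
    # Assumption: case-insensitive comparison, split on whitespace.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # Assumption: two empty strings count as a perfect match,
    # one empty string against a non-empty one counts as a miss.
    if not pred_tokens and not ref_tokens:
        return 1.0
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Because precision and recall are each at most 1, the result always stays within the required range. A stricter metric, such as exact string match, would fit the same signature by returning 1.0 or 0.0.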

Shaprompt Agent

AI-powered prompt optimization assistant

Hi! I'm Shaprompt Agent. I can help you optimize prompts, generate datasets, clean data, and set up evaluation metrics. What would you like to do?