Custom Metrics

Upload and manage custom evaluation metrics

Each custom metric must be a Python file that defines a function evaluate(prediction: str, reference: str) -> float, which returns a score in the range [0.0, 1.0]. The AI agent will review your code, test it in a sandbox, and confirm compatibility.
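
As a minimal sketch of a file that matches this signature, the example below computes a token-level F1 score. Only the function name, parameter types, return type, and the [0.0, 1.0] range come from the requirements above. The choice of metric, the lowercasing and whitespace tokenization, and the handling of empty strings are illustrative assumptions.

```python
from collections import Counter


def evaluate(prediction: str, reference: str) -> float:
    """Token-level F1 between prediction and reference, in [0.0, 1.0]."""
    # Assumption: case-insensitive comparison with whitespace tokenization.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()

    # Assumption: two empty inputs count as a perfect match,
    # while exactly one empty input scores zero.
    if not pred_tokens and not ref_tokens:
        return 1.0
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Multiset intersection: each shared token is counted at most as
    # many times as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Because precision and recall each lie in [0, 1], their harmonic mean does too, so the returned score stays within the required range without explicit clamping.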