@schnecle schnecle commented May 2, 2024

No description provided.

@schnecle schnecle force-pushed the schnecle-custom-evaluator-docs branch from 2020b3f to 20ae1ea Compare May 2, 2024 18:48
async (datapoint: BaseDataPoint) => {
  const score = await deliciousnessScore(judge, datapoint, judgeConfig);
  return {
    testCaseId: datapoint.testCaseId,
Contributor

Thought: can we avoid requiring the plugin writer to write the boilerplate `testCaseId: datapoint.testCaseId`?

Contributor Author

I don't know if we can at the moment, because `testCaseId` is how we map all the additional metadata from the input dataset into the final scored result.
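
To make the mapping concrete, here is a minimal sketch of joining scored results back onto dataset rows by `testCaseId`. The `DatasetRow`, `ScoredResult`, `JoinedRow`, and `joinScores` names are hypothetical and chosen only for illustration; this is not Genkit's actual implementation.

```typescript
// Hypothetical shapes for illustration only; the real Genkit types may differ.
interface DatasetRow {
  testCaseId: string;
  input: unknown;
  output?: unknown;
  metadata?: Record<string, unknown>;
}

interface ScoredResult {
  testCaseId: string;
  evaluation: { score: number };
}

interface JoinedRow extends DatasetRow {
  evaluation?: { score: number };
}

// Index the scores by testCaseId, then attach each one to its dataset row.
// Rows without a matching score keep `evaluation` undefined.
function joinScores(dataset: DatasetRow[], scores: ScoredResult[]): JoinedRow[] {
  const byId = new Map<string, { score: number }>();
  for (const s of scores) {
    byId.set(s.testCaseId, s.evaluation);
  }
  return dataset.map((row) => ({ ...row, evaluation: byId.get(row.testCaseId) }));
}
```

If the evaluator omitted `testCaseId` from its result, a join like this would have no key to match on, which is why the field is currently written out by hand.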


Next, define the function that takes an example containing `output`, as the prompt requires, and scores the result. Genkit test cases include `input` and `output` as required fields, with `context` as an optional field. The evaluator is responsible for validating that all fields required for evaluation are present.

This example leverages `handlebars` to hydrate the prompt and `zod` to format and validate the response.
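
Setting prompt hydration and response parsing aside, the validation responsibility described above could look roughly like the sketch below. The `DataPointLike`, `ScoredDataPoint`, `requireOutput`, and `scoreDatapoint` names are assumptions made for illustration, and the `judge` callback stands in for whatever model call the evaluator makes; none of these are confirmed Genkit APIs.

```typescript
// Hypothetical shapes for illustration only; the real Genkit types may differ.
interface DataPointLike {
  testCaseId: string;
  input: unknown;
  output?: unknown;
  context?: unknown[];
}

interface ScoredDataPoint {
  testCaseId: string;
  evaluation: { score: number };
}

// Fail early when a field the prompt depends on is missing,
// and normalize the output to a string for the prompt.
function requireOutput(datapoint: DataPointLike): string {
  if (datapoint.output === undefined || datapoint.output === null) {
    throw new Error(`Output is required for test case ${datapoint.testCaseId}`);
  }
  return typeof datapoint.output === 'string'
    ? datapoint.output
    : JSON.stringify(datapoint.output);
}

// `judge` is a placeholder for the model call that produces a numeric score.
async function scoreDatapoint(
  datapoint: DataPointLike,
  judge: (output: string) => Promise<number>
): Promise<ScoredDataPoint> {
  const output = requireOutput(datapoint);
  const score = await judge(output);
  return { testCaseId: datapoint.testCaseId, evaluation: { score } };
}
```

Validating before calling the judge keeps a malformed test case from consuming a model call and makes the failure message point at the offending `testCaseId`.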
Contributor

We should probably update the evaluators to use dotprompt instead.

Contributor Author

Totally agree!

@schnecle schnecle merged commit 59a38e7 into main May 3, 2024
@schnecle schnecle deleted the schnecle-custom-evaluator-docs branch May 3, 2024 14:48