Add scorer purpose #87

ankrgyl · 2024-08-04T17:28:43Z

This allows us to exclude LLM-as-a-judge calls from certain metrics (e.g. token counts).
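
For illustration, here is a minimal sketch of how a purpose tag could be used downstream to leave judge calls out of a token total. The record shape and names (`LLMCall`, `purpose`, `tokens`, the value `"scorer"`) are assumptions made for this example and are not the actual logged span schema.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class LLMCall:
    # Hypothetical record of one logged LLM call; not the real span schema.
    name: str
    tokens: int
    purpose: Optional[str] = None  # assumed: "scorer" marks LLM-as-a-judge calls


def total_tokens(calls: Iterable[LLMCall], exclude_purpose: str = "scorer") -> int:
    """Sum token counts, skipping calls tagged with the excluded purpose."""
    return sum(c.tokens for c in calls if c.purpose != exclude_purpose)


calls = [
    LLMCall("answer_question", tokens=120),
    LLMCall("Factuality", tokens=300, purpose="scorer"),
    LLMCall("summarize", tokens=80),
]

# Only the two untagged application calls contribute to the total.
print(total_tokens(calls))
```

Untagged calls are counted as before, so existing logs that carry no purpose would be unaffected under this scheme.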

github-actions · 2024-08-04T17:29:15Z

Braintrust eval report

Autoevals (purpose-scorer-1722992568)

Score	Average	Improvements	Regressions
NumericDiff	75.2% (+0pp)	-	-

github-actions · 2024-08-04T17:29:15Z

Braintrust eval report

Autoevals (purpose-scorer-1722792524)

Score	Average	Improvements	Regressions
NumericDiff	75.2% (+0pp)	-	-

manugoyal · 2024-08-04T18:56:25Z

js/oai.ts

@@ -69,6 +72,7 @@ export function buildOpenAIClient(options: OpenAIAuth): OpenAI {
 }

 declare global {
+  // eslint-disable-next-line


Is there a more specific warning we can disable? Or maybe add a comment to explain what we are ignoring?

Yep, added a more specific one.

manugoyal · 2024-08-04T18:57:56Z

py/autoevals/oai.py

@@ -18,7 +18,12 @@ class OpenAIWrapper:
    RateLimitError: Exception


+_WRAPPED_OPENAI = False


Can we avoid the global variable by returning this boolean as part of prepare_openai?
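
One way the suggestion above might look, as a rough sketch only: the preparation step hands back the flag alongside the client, so callers read it from the return value instead of a module-level `_WRAPPED_OPENAI`. The names `prepare_client`, `wrap`, and `build_call_params` are hypothetical stand-ins, and the real `prepare_openai` signature may differ.

```python
from typing import Any, Callable, Optional, Tuple


def prepare_client(
    client: Any, wrap: Optional[Callable[[Any], Any]] = None
) -> Tuple[Any, bool]:
    """Hypothetical stand-in for prepare_openai: returns (client, wrapped).

    `wrap` plays the role of an optional tracing wrapper. When it is absent,
    the client is returned untouched and `wrapped` is False.
    """
    if wrap is None:
        return client, False
    return wrap(client), True


def build_call_params(wrapped: bool) -> dict:
    # Assumed behavior: only attach the purpose tag when a wrapper is
    # present to consume it; an unwrapped client gets no extra parameters.
    return {"purpose": "scorer"} if wrapped else {}


# Non-wrap case: no wrapper available, so no purpose tag is passed along.
client, wrapped = prepare_client(object())
params = build_call_params(wrapped)
```

Returning the flag keeps the state local to each call site, which avoids hidden coupling between imports and makes the non-wrap case straightforward to test.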

github-actions · 2024-08-07T02:01:54Z

Braintrust eval report

Autoevals (main-1722996118)

Score	Average	Improvements	Regressions
NumericDiff	75.2% (+0pp)	-	-

Add scorer purpose

8087e44

ankrgyl requested a review from manugoyal August 4, 2024 17:29

ankrgyl added 2 commits August 4, 2024 11:32

Fix non-wrap case

945d291

Fix

33b3d28

manugoyal approved these changes Aug 4, 2024

ankrgyl added 2 commits August 4, 2024 12:01

Comments

5d90221

Merge branch 'main' into purpose-scorer

19402c9

ankrgyl merged commit 2c47406 into main Aug 7, 2024
8 checks passed
