
Factuality Evaluator failing #28

Closed
ishaan-jaff opened this issue Nov 9, 2023 · 7 comments

Comments

@ishaan-jaff

Tried this code snippet:

from autoevals.llm import *
import openai

openai.api_key = "sk-"
 
# Create a new LLM-based evaluator
evaluator = Factuality()
 
# Evaluate an example LLM completion
input = "Which country has the highest population?"
output = "People's Republic of China"
expected = "China"
 
result = evaluator(output, expected, input=input)
print(result)
 
# The evaluator returns a score from [0,1] and includes the raw outputs from the evaluator
print(f"Factuality score: {result.score}")
print(f"Factuality metadata: {result.metadata['rationale']}")

I see this error:
Score(name='Factuality', score=0, metadata={}, error=KeyError('usage'))
Factuality score: 0

Traceback (most recent call last):
  File "/Users/ishaanjaffer/Github/litellm/litellm/tests/test_autoeval.py", line 19, in <module>
    print(f"Factuality metadata: {result.metadata['rationale']}")
KeyError: 'rationale'
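
For what it's worth, guarding the metadata lookup avoids the second KeyError, since metadata comes back empty when the evaluator itself errors; a minimal sketch:

# result.error already carries KeyError('usage'), and metadata is {} in that case,
# so guard the lookup instead of indexing 'rationale' directly
if result.error is not None:
    print(f"Evaluator failed: {result.error}")
else:
    print(f"Factuality metadata: {result.metadata.get('rationale')}")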

Any suggestions on how I can debug this?

@ankrgyl (Contributor) commented Nov 10, 2023

Hmm, this error seems to imply that the response from OpenAI did not include the "usage" key. We have some logic in autoevals that tries to extract usage metrics and log them; that's probably what's failing for some reason.
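
Roughly, I'd guess the failing pattern looks something like this (illustrative only, not the actual autoevals internals):

# Hypothetical sketch: indexing "usage" directly raises KeyError when the
# OpenAI response omits that field
usage = response["usage"]  # KeyError('usage') if the field is missing

# a defensive lookup tolerates the missing field instead
total_tokens = response.get("usage", {}).get("total_tokens")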

We're in the middle of reworking this in #27, and I suspect that will resolve the issue, especially if you're not using Braintrust.

In the meantime, could you share the versions of autoevals, openai, and Python you're using?

@ishaan-jaff (Author)

autoevals 0.0.30
openai 0.28.1

python 3.10

@ishaan-jaff (Author)

It would also be useful if you printed the stack trace on errors.
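
Something along these lines, just as a sketch of what I mean:

import traceback

try:
    raise KeyError("usage")  # stand-in for whatever fails inside the evaluator
except Exception as e:
    print(traceback.format_exc())  # surface the full stack trace for debugging
    error = e                      # while still attaching the exception to the Score as today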

@ankrgyl (Contributor) commented Nov 10, 2023

Interesting, do you mind trying the patch from #27, or re-testing after we land that change? I wasn't able to repro the error, but I suspect the response you're getting from OpenAI (perhaps related to your key?) is missing the usage field.

@ishaan-jaff (Author)

Yes, I can re-test once you've landed #27. Should I leave this issue open till then?

@ankrgyl (Contributor) commented Nov 10, 2023

Just published 0.0.31. Please leave it open!
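
Assuming you installed from PyPI, a plain upgrade should pick it up:

pip install --upgrade autoevals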

@ishaan-jaff (Author)

works now
