-
For my use case, I managed to get something working doing the following:

```python
from collections.abc import Iterable, Sequence
from typing import TypeVar, cast, get_args, get_origin

from anthropic import Anthropic
from anthropic.types.beta.prompt_caching import PromptCachingBetaMessage
from instructor.dsl.iterable import IterableModel
from instructor.function_calls import OpenAISchema, openai_schema
from pydantic import BaseModel

T = TypeVar("T", bound=BaseModel)


def generate_structured_completion(
    self, system_prompt: str, messages: Sequence[ChatMessage], response_model: type[T]
) -> T:
    # `ChatMessage` and `api_key` come from the surrounding class/module.
    client = Anthropic(api_key=api_key)
    # This is for all other single-model cases
    # stolen from instructor/patch.py
    if get_origin(response_model) is Iterable:
        iterable_element_class = get_args(response_model)[0]
        response_model = IterableModel(iterable_element_class)  # type: ignore
    if not issubclass(response_model, OpenAISchema):
        response_model = openai_schema(response_model)
    response: PromptCachingBetaMessage = cast(
        PromptCachingBetaMessage,
        client.beta.prompt_caching.messages.create(
            model="claude-3-5-sonnet-20240620",  # model and max_tokens are required by the SDK
            max_tokens=1024,
            system=system_prompt,
            messages=messages,
            extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
            tools=[response_model.anthropic_schema],
        ),
    )
    tool_calls = [c.input for c in response.content if c.type == "tool_use"]
    if tool_calls:
        tool_call = tool_calls[0]
        return response_model.model_validate(tool_call)
    else:
        # No tools were called. Might be helpful for others, so leaving it here.
        raise ValueError("Model did not call the structured-output tool")
```
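For anyone adapting this, here is a minimal, hypothetical call site. The `UserInfo` model, the `extractor` instance, and the plain message dicts are illustrative assumptions on my part, not part of the snippet above:

```python
from pydantic import BaseModel


class UserInfo(BaseModel):
    name: str
    age: int


# `extractor` is an instance of whatever class defines
# generate_structured_completion above.
user = extractor.generate_structured_completion(
    system_prompt="Extract the user's name and age from the message.",
    messages=[{"role": "user", "content": "Alice is 31 years old."}],
    response_model=UserInfo,
)
print(user.name, user.age)  # -> Alice 31
```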
-
I think it would be a great fit for instructor to support Anthropic's new Prompt Caching. For example, when parsing structured data, it could cache the base prompt once and send only the content-to-parse in subsequent calls to save tokens.
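For reference, a rough sketch of what that looks like against the raw Anthropic beta API today, without instructor: the shared base prompt is passed as a system content block marked with `cache_control`, so repeated calls re-send only the short content-to-parse. `LONG_BASE_PROMPT` is a placeholder for the shared parsing instructions:

```python
from anthropic import Anthropic

client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# The long, shared base prompt is marked as cacheable; only the short
# user content changes between calls, so subsequent calls can reuse the
# cached prefix instead of re-sending (and re-paying for) it.
response = client.beta.prompt_caching.messages.create(
    model="claude-3-5-sonnet-20240620",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": LONG_BASE_PROMPT,  # placeholder: shared parsing instructions
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "content to parse goes here"}],
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},
)
```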