InstructorRetryException: Error code: 400 / Invalid schema for function due to date field #1314

Open
drorata opened this issue Jan 20, 2025 · 2 comments
Labels: bug (Something isn't working), question (Further information is requested)

drorata commented Jan 20, 2025

I start with the following model:

class SpeedingTicket(BaseModel):
    """
    You are processing speeding tickets at the service center and expected to extract the data for the defined fields
    """

    license_plate: str | None = Field(
        description="The license plate number of the speeding vehicle."
    )
    location: str | None = Field(
        description="The address/location where the speeding took place"
    )
    speed: int | None = Field(description="The recorded speed of the vehicle.")

In addition, I have the following speeding ticket:

ticket="""
# SPEEDING TICKET

**Ticket Number:** 123456789

**Date Issued:** 20 January 2025
**Time Issued:** 14:35

## Driver Information
- **Name:** John Doe
- **License Number:** D1234567
- **Address:** 123 Main Street, Berlin, Germany

## Vehicle Information
- **Make:** BMW
- **Model:** 320i
- **License Plate:** B-1234XYZ
- **Color:** Black

## Violation Details
- **Location:** A100, Berlin, Germany
- **Speed Limit:** 80 km/h
- **Recorded Speed:** 110 km/h
- **Excess Speed:** 30 km/h

## Fine Amount
- €120

## Officer Information
- **Name:** Officer Anna Schmidt
- **Badge Number:** 5678

## Payment Information
- **Payment Due Date:** 20 February 2025
- **Payment Options:**
  1. **Online:** [www.berlintrafficfines.de](http://www.berlintrafficfines.de)
  2. **By Mail:** Traffic Department, P.O. Box 1234, Berlin, Germany
  3. **In-Person:**
"""

Lastly, I run:

client = instructor.from_openai(
    OpenAI(
        api_key=settings.openai_api_key,
    ),
    mode=Mode.TOOLS_STRICT,
)

result = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": ticket,
        }
    ],
    response_model=SpeedingTicket,
)
print(result)

And I get the expected result, namely: license_plate='B-1234XYZ' location='A100, Berlin, Germany' speed=110

However, if I modify the model to (note the added date field):

class SpeedingTicket(BaseModel):
    """
    You are processing speeding tickets at the service center and expected to extract the data for the defined fields
    """

    license_plate: str | None = Field(
        description="The license plate number of the speeding vehicle."
    )
    location: str | None = Field(
        description="The address/location where the speeding took place"
    )
    ticket_date: date | None = Field(description="The date of the ticket issuing.")
    speed: int | None = Field(description="The recorded speed of the vehicle.")

and execute the same call, I get the following error:

InstructorRetryException: Error code: 400 - {'error': {'message': "Invalid schema for function 'SpeedingTicket': In context=('properties', 'ticket_date', 'anyOf', '0'), 'format' is not permitted.", 'type': 'invalid_request_error', 'param': 'tools[0].function.parameters', 'code': 'invalid_function_parameters'}}

How does this small and rather reasonable change break the call? Thanks in advance!

github-actions bot added the bug and question labels on Jan 20, 2025

drorata commented Jan 22, 2025

Following this hint and the OpenAI documentation here, I tried replacing the ticket_date field above with the following:

ticket_date: str | None = Field(description="The date of the ticket issuing. Provide in an ISO datetime format.")

With this change the call works as expected, and I get the date and time of the ticket. Why doesn't it work with date and datetime?

paulelliotco (Contributor) commented

The call fails because OpenAI's strict function-schema validation rejects the JSON Schema that Pydantic generates for Python's date and datetime types. Here's the explanation and a workaround:

Root Cause

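  1. OpenAI's strict function schemas accept only a subset of JSON Schema keywords
  2. Pydantic describes a date field as a string with "format": "date", and the API rejects that keyword (the "'format' is not permitted" part of the error)
  3. A plain str field carries no format keyword, so the date has to be requested as an ISO-formatted string and parsed on the client side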
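To see where the rejected keyword sits, here is a minimal sketch that walks a schema and reports every node carrying a given keyword. The helper name find_keyword is hypothetical, and ticket_schema is written by hand to approximate what Pydantic v2 typically emits for a `date | None` field; calling SpeedingTicket.model_json_schema() shows the real schema for your model.

```python
from typing import Any, Iterator, Tuple


def find_keyword(
    schema: Any, keyword: str, path: Tuple[str, ...] = ()
) -> Iterator[Tuple[str, ...]]:
    """Yield the path of every schema node that contains `keyword`."""
    if isinstance(schema, dict):
        # Skip the "properties" mapping itself, where keys are field names,
        # so a field that happens to be called "format" is not reported.
        if keyword in schema and (not path or path[-1] != "properties"):
            yield path
        for key, child in schema.items():
            yield from find_keyword(child, keyword, path + (key,))
    elif isinstance(schema, list):
        for index, child in enumerate(schema):
            yield from find_keyword(child, keyword, path + (str(index),))


# Hand-written approximation of the schema for the model with ticket_date: date | None
ticket_schema = {
    "type": "object",
    "properties": {
        "license_plate": {"anyOf": [{"type": "string"}, {"type": "null"}]},
        "ticket_date": {
            "anyOf": [{"type": "string", "format": "date"}, {"type": "null"}]
        },
        "speed": {"anyOf": [{"type": "integer"}, {"type": "null"}]},
    },
}

offending = list(find_keyword(ticket_schema, "format"))
print(offending)  # [('properties', 'ticket_date', 'anyOf', '0')]
```

The reported path matches the context in the error message, which suggests the only problem with the date field is the format keyword inside the first anyOf branch.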

Solution

Declare the field as a string for the API schema and validate it as a date inside the model. Here's an implementation that does this:

from datetime import date
from typing import Optional

from pydantic import BaseModel, Field, field_validator


class SpeedingTicket(BaseModel):
    """
    You are processing speeding tickets at the service center and expected to extract the data for the defined fields
    """
    license_plate: Optional[str] = Field(
        description="The license plate number of the speeding vehicle."
    )
    location: Optional[str] = Field(
        description="The address/location where the speeding took place"
    )
    # Declared as a string so the generated schema carries no "format" keyword
    ticket_date: Optional[str] = Field(
        description="The date of the ticket issuing in ISO format (YYYY-MM-DD)."
    )
    speed: Optional[int] = Field(description="The recorded speed of the vehicle.")

    # Pydantic v2 style; replaces the deprecated @validator(..., pre=True)
    @field_validator("ticket_date", mode="before")
    @classmethod
    def validate_date(cls, value):
        if value is None:
            return None
        if isinstance(value, date):
            return value.isoformat()
        try:
            # Parse and re-serialize so the stored string is always valid ISO
            return date.fromisoformat(value).isoformat()
        except ValueError as exc:
            raise ValueError("Date must be in ISO format (YYYY-MM-DD)") from exc

    def get_date_object(self) -> Optional[date]:
        """Helper method to get the date as a date object"""
        return date.fromisoformat(self.ticket_date) if self.ticket_date else None

Usage Example

result = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": ticket,
        }
    ],
    response_model=SpeedingTicket,
)

# Access the date as a string
print(result.ticket_date)  # e.g. 2025-01-20

# Access the date as a date object
print(result.get_date_object())  # also prints 2025-01-20; the repr is datetime.date(2025, 1, 20)

Key Benefits

  1. Works with OpenAI's API constraints
  2. Keeps date validation inside the model, although the field itself is typed as a string
  3. Provides easy conversion to date objects when needed
  4. Ensures consistent ISO format for dates

This solution should resolve the InstructorRetryException while maintaining proper date handling. Hope this helps!
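One possible follow-up for the Jan 22 variant, where the field description asks for an ISO datetime rather than a date: date.fromisoformat is meant for date-only strings, so a value with a time part may need different handling. The sketch below uses only the standard library; the helper name to_ticket_date is hypothetical, and it assumes the model returns either YYYY-MM-DD or an ISO datetime such as 2025-01-20T14:35:00.

```python
from datetime import date, datetime
from typing import Optional


def to_ticket_date(raw: Optional[str]) -> Optional[date]:
    """Convert an ISO date or datetime string into a date (hypothetical helper)."""
    if raw is None:
        return None
    text = raw.strip()
    # Python versions before 3.11 do not accept a trailing "Z",
    # so spell out the UTC offset explicitly.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    # datetime.fromisoformat accepts both "YYYY-MM-DD" and
    # "YYYY-MM-DDTHH:MM:SS"; .date() drops the time part.
    return datetime.fromisoformat(text).date()


print(to_ticket_date("2025-01-20"))
print(to_ticket_date("2025-01-20T14:35:00"))
```

Any other format (for example "20 January 2025") still raises ValueError here, which is usually what you want so that a badly formatted answer is noticed rather than silently accepted.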
