jxnl / instructor

structured outputs for llms

Home Page: https://python.useinstructor.com/

Is it possible to get the inputs that Instructor uses?

ahuang11 opened this issue

Is your feature request related to a problem? Please describe.
I'd like more visibility into how Instructor builds and posts its requests.

Describe the solution you'd like
I'd like a create_request method that returns the request payload Instructor would send, without making the API call:

import instructor
from pydantic import BaseModel
from openai import OpenAI

# Define your desired output structure
class UserInfo(BaseModel):
    name: str
    age: int


# Patch the OpenAI client
client = instructor.from_openai(OpenAI())

# Build the request payload without sending it
request_data = client.chat.completions.create_request(
    model="gpt-3.5-turbo",
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)

request_data

Outputs:

{
  "model": "gpt-3.5-turbo",
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "UserInfo",
        "parameters": {
          "type": "object",
          "required": [
            "age",
            "name"
          ],
          "properties": {
            "age": {
              "type": "integer",
              "title": "Age"
            },
            "name": {
              "type": "string",
              "title": "Name"
            }
          }
        },
        "description": "Correctly extracted `UserInfo` with all the required parameters with correct types"
      }
    }
  ],
  "messages": [
    {
      "role": "user",
      "content": "John Doe is 30 years old."
    }
  ],
  "tool_choice": {
    "type": "function",
    "function": {
      "name": "UserInfo"
    }
  }
}
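For context, the `parameters` block in that payload is essentially the model's own JSON schema, which Pydantic exposes directly (a sketch assuming Pydantic v2):

```python
from pydantic import BaseModel


class UserInfo(BaseModel):
    name: str
    age: int


# model_json_schema() returns the dict Instructor derives `parameters` from:
# title (class name), required field names, and typed properties.
schema = UserInfo.model_json_schema()
print(schema["title"])       # UserInfo
print(schema["required"])    # ['name', 'age']
```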

Describe alternatives you've considered
Logfire, but I'd like to use the request downstream for LLM evaluation purposes.

Additional context

As a temporary workaround, I think this works:

from pydantic import BaseModel


class UserInfo(BaseModel):
    name: str
    age: int

def build_function_request(base_model: type[BaseModel]) -> dict:
    # Derive the OpenAI tool definition from the model's JSON schema.
    schema = base_model.model_json_schema()
    name = schema.pop("title")  # Pydantic uses the class name as the schema title
    return {
        'type': 'function',
        'function': {
            'name': name,
            'parameters': {
                'type': 'object',
                'required': schema.pop('required'),
                'properties': schema.pop('properties'),
            },
            # Mirrors the description Instructor generates for extraction tools
            'description': f"Correctly extracted `{name}` with all the required parameters with correct types"
        }
    }

build_function_request(UserInfo)
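Building on that helper, the full payload from the Outputs section above can be assembled by hand. This is only a sketch: `build_request` is a hypothetical name, not an Instructor API, and it reproduces the payload shape shown earlier rather than calling into the library.

```python
from pydantic import BaseModel


class UserInfo(BaseModel):
    name: str
    age: int


def build_function_request(base_model: type[BaseModel]) -> dict:
    # Same workaround as above: turn the model's JSON schema into a tool entry.
    schema = base_model.model_json_schema()
    name = schema.pop("title")
    return {
        "type": "function",
        "function": {
            "name": name,
            "parameters": {
                "type": "object",
                "required": schema.pop("required"),
                "properties": schema.pop("properties"),
            },
            "description": f"Correctly extracted `{name}` with all the required parameters with correct types",
        },
    }


def build_request(base_model: type[BaseModel], model: str, messages: list[dict]) -> dict:
    # Hypothetical helper: mirrors the request Instructor posts in tool-calling mode.
    tool = build_function_request(base_model)
    return {
        "model": model,
        "messages": messages,
        "tools": [tool],
        # Force the model to call the extraction function, as in the Outputs example.
        "tool_choice": {"type": "function", "function": {"name": tool["function"]["name"]}},
    }


request_data = build_request(
    UserInfo,
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)
```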