Client Error / API error fails to pickle in multi-processing setup

Question

farhanhubble opened this issue 7 months ago · comments

farhanhubble commented 7 months ago

Environment details

Programming language: Python
OS: Fedora Linux 41 (Workstation Edition) Linux 6.13.5-200.fc41.x86_64
Language runtime version: 3.12.7
Package version: 1.7.0

Steps to reproduce

Set up generate_content to be called from a function that runs through multiprocessing.Pool().imap
Pass an attachment to generate_content that will trigger a google.genai.errors.ClientError: 400 INVALID_ARGUMENT. {'error': {'code': 400, 'message': 'The document has no pages.', 'status': 'INVALID_ARGUMENT'}}. For example, a BytesIO buffer that has already been read or seek()'d to the end of the file.
The client error is raised in the worker, but it cannot be unpickled in the parent process, and a stack trace like this is seen:

Exception in thread Thread-3 (_handle_results):
Traceback (most recent call last):
File "/home/farhanhubble/.pyenv/versions/3.12.7/lib/python3.12/threading.py", line 1075, in _bootstrap_inner
  self.run()
File "/home/farhanhubble/.pyenv/versions/3.12.7/lib/python3.12/threading.py", line 1012, in run
  self._target(*self._args, **self._kwargs)
File "/home/farhanhubble/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/pool.py", line 579, in _handle_results
  task = get()
         ^^^^^
File "/home/farhanhubble/.pyenv/versions/3.12.7/lib/python3.12/multiprocessing/connection.py", line 251, in recv
  return _ForkingPickler.loads(buf.getbuffer())
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: APIError.__init__() missing 1 required positional argument: 'response'
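
The TypeError above is consistent with how Python pickles exceptions by default: the class and the contents of the args attribute are stored, and unpickling calls the class again with only those args. The sketch below reproduces the same kind of failure without the SDK. FakeAPIError is a hypothetical stand-in, and its constructor signature is an assumption inferred from the error message, not the library's actual code.

```python
import pickle


class FakeAPIError(Exception):
    """Hypothetical stand-in for an exception with two required arguments."""

    def __init__(self, code, response):
        self.code = code
        self.response = response
        # Only one value reaches Exception.args, so pickle records just that one.
        super().__init__(f"{code} {response}")


err = FakeAPIError(400, {"error": {"message": "The document has no pages."}})
payload = pickle.dumps(err)  # serializing in the worker succeeds

try:
    pickle.loads(payload)  # rebuilding calls FakeAPIError(*err.args)
    unpickle_error = None
except TypeError as exc:
    unpickle_error = exc

print(type(unpickle_error).__name__, unpickle_error)
```

This matches the traceback: the failure happens in loads() on the receiving side, not when the worker serializes the exception.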

Reference: farhanhubble/jfk-tell#1

farhanhubble · Answer 1 · Wed Mar 26 2025 12:31:00 GMT+0800 (China Standard Time)

MRE:

import multiprocessing
import os
from io import BytesIO

from google import genai

# Configure the API key
api_key = os.getenv("GEMINI_API_KEY")
if not api_key:
    raise ValueError("Please provide an API key.")


# Global client object (initialized separately inside each worker process)
client = None


def init_model():
    global client
    client = genai.Client(api_key=api_key)


def run_generation(prompt):
    global client
    try:
        mock_attachment = BytesIO(b"")
        mock_attachment.read()  # Seek to end of file
        client.files.upload(
            file=mock_attachment, config=dict(mime_type="application/pdf")
        )
        response = client.models.generate_content(
            model="gemini-2.0-flash", contents=[mock_attachment, prompt]
        )
        return response.text
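    except ValueError as e:
        # The ClientError raised above is not caught here, so it propagates
        # to the pool and has to be pickled back to the parent process.
        return f"Error: {e}"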


if __name__ == "__main__":
    prompts = [
        "Explain the theory of relativity.",
        "What is the capital of France?",
        "Write a Python function to sort a list.",
    ]

    with multiprocessing.Pool(initializer=init_model) as pool:
        results = pool.map(run_generation, prompts)

    for r in results:
        print(r)
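
On versions where the exception does not survive the trip back to the parent, one possible workaround is to catch everything inside the worker and return plain, picklable data instead. The sketch below illustrates the idea without calling the API. FragileError, flaky_task, and safe_task are hypothetical names, and the pickle round trip stands in for what the pool does with a worker's return value.

```python
import pickle
import traceback


class FragileError(Exception):
    """Hypothetical exception that cannot be rebuilt from its args alone."""

    def __init__(self, code, response):
        self.code = code
        self.response = response
        super().__init__(f"{code} {response}")


def flaky_task(prompt):
    """Hypothetical worker body standing in for the API call."""
    if not prompt:
        raise FragileError(400, {"status": "INVALID_ARGUMENT"})
    return prompt.upper()


def safe_task(prompt):
    """Run flaky_task and convert any exception into a plain dict."""
    try:
        return {"ok": True, "value": flaky_task(prompt)}
    except Exception as exc:
        # Strings and dicts always pickle, whatever the exception class does.
        return {
            "ok": False,
            "error_type": type(exc).__name__,
            "message": str(exc),
            "traceback": traceback.format_exc(),
        }


# Simulate the worker-to-parent transfer with an explicit pickle round trip.
results = [pickle.loads(pickle.dumps(safe_task(p))) for p in ["hello", ""]]
for r in results:
    print(r["ok"], r.get("value") or r.get("error_type"))
```

In the MRE, the equivalent change would be to pass a wrapper like safe_task to pool.map, so the parent receives a description of the error instead of the exception object.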

Yvonne Yu · Answer 2 · Sat Apr 19 2025 05:29:21 GMT+0800 (China Standard Time)

@farhanhubble

I have fixed the bug. If you update to 1.9.0, the error TypeError: APIError.__init__() missing 1 required positional argument: 'response' should no longer appear.
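
For readers who hit the same problem with their own exception classes, one common way to make such an exception picklable is to define __reduce__ so that it returns every constructor argument. This is a generic sketch of that technique, not necessarily how the fix in 1.9.0 was implemented. PicklableAPIError is a hypothetical class.

```python
import pickle


class PicklableAPIError(Exception):
    """Hypothetical exception that tells pickle how to rebuild itself."""

    def __init__(self, code, response):
        self.code = code
        self.response = response
        super().__init__(f"{code} {response}")

    def __reduce__(self):
        # Return the callable and the full argument tuple needed to recreate
        # the instance, instead of relying on Exception.args.
        return (self.__class__, (self.code, self.response))


original = PicklableAPIError(400, {"status": "INVALID_ARGUMENT"})
restored = pickle.loads(pickle.dumps(original))

print(restored.code, restored.response)
```

With __reduce__ in place, both constructor arguments travel with the pickled exception, so rebuilding it in the parent process no longer raises a TypeError.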