No PDF support for files API.

Question

KevinRoller opened this issue 3 months ago

Description of the bug:

I executed the following code with the GEMINI_API_KEY environment variable loaded, but received the error "google.api_core.exceptions.InvalidArgument: 400 Unsupported MIME type: application/pdf".

Dependency:
google-generativeai == 0.6.0

Here is the code:

import os
import time

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

def upload_to_gemini(path, mime_type=None):
  """Upload the file at `path` through the Files API and return the file handle."""
  file = genai.upload_file(path, mime_type=mime_type)
  print(f"Uploaded file '{file.display_name}' as: {file.uri}")
  return file

def wait_for_files_active(files):
  """Poll each uploaded file until it leaves PROCESSING; raise if it is not ACTIVE."""
  print("Waiting for file processing...")
  for name in (file.name for file in files):
    file = genai.get_file(name)
    while file.state.name == "PROCESSING":
      print(".", end="", flush=True)
      time.sleep(10)
      file = genai.get_file(name)
    if file.state.name != "ACTIVE":
      raise Exception(f"File {file.name} failed to process")
  print("...all files ready")
  print()

generation_config = {
  "temperature": 1,
  "top_p": 0.95,
  "top_k": 64,
  "max_output_tokens": 8192,
  "response_mime_type": "text/plain",
}

model = genai.GenerativeModel(
  model_name="gemini-1.5-pro",
  generation_config=generation_config,
  system_instruction="Just summarize the document",
)


files = [
  upload_to_gemini("./resources/few-shot/canon-filter/difm-adapter-ef-eosr-im-eng.pdf", mime_type="application/pdf"),
]

wait_for_files_active(files)

chat_session = model.start_chat(
  history=[
    {
      "role": "user",
      "parts": [
        "Summarize file",
        files[0],
      ],
    },
  ]
)

response = chat_session.send_message("INSERT_INPUT_HERE")

print(response.text)

The traceback of error:

Traceback (most recent call last):
  File "/init_project/app/services/taskAgent.py", line 223, in <module>
    response = chat_session.send_message("INSERT_INPUT_HERE")
  File "/python3.10/site-packages/google/generativeai/generative_models.py", line 504, in send_message
    response = self.model.generate_content(
  File "/python3.10/site-packages/google/generativeai/generative_models.py", line 258, in generate_content
    response = self._client.generate_content(
  File "/python3.10/site-packages/google/ai/generativelanguage_v1beta/services/generative_service/client.py", line 812, in generate_content
    response = rpc(
  File "/python3.10/site-packages/google/api_core/gapic_v1/method.py", line 131, in __call__
    return wrapped_func(*args, **kwargs)
  File "/python3.10/site-packages/google/api_core/retry/retry_unary.py", line 293, in retry_wrapped_func
    return retry_target(
  File "/python3.10/site-packages/google/api_core/retry/retry_unary.py", line 153, in retry_target
    _retry_error_helper(
  File "/python3.10/site-packages/google/api_core/retry/retry_base.py", line 212, in _retry_error_helper
    raise final_exc from source_exc
  File "/python3.10/site-packages/google/api_core/retry/retry_unary.py", line 144, in retry_target
    result = target()
  File "/python3.10/site-packages/google/api_core/timeout.py", line 120, in func_with_timeout
    return func(*args, **kwargs)
  File "/python3.10/site-packages/google/api_core/grpc_helpers.py", line 78, in error_remapped_callable
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InvalidArgument: 400 Unsupported MIME type: application/pdf

Actual vs expected behavior:

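Expected: the code runs without error, because it was generated by Google AI Studio from a prompt that was tested and works in the UI. Actual: the request fails with the 400 "Unsupported MIME type: application/pdf" error shown above.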

Any other information you'd like to share?

I sent the request from Vietnam.

Niraj Singh · Answer 1 · Thu Jun 06 2024 13:34:22 GMT+0800 (China Standard Time)

@KevinRoller, thank you for reporting this issue. We are already aware of it and are working on it. We will update this thread once it is fixed. Thank you!

Anushka_Sonawane · Answer 2 · Tue Jun 11 2024 14:51:29 GMT+0800 (China Standard Time)

Hi @singhniraj08 @MarkDaoust, has this issue been resolved?

Mark Daoust · Answer 3 · Tue Jun 11 2024 23:58:52 GMT+0800 (China Standard Time)

Unsupported mime type

This is exactly what it says: The API doesn't directly handle PDFs yet. You can do something like this to upload the text and/or screenshots:

https://github.com/google-gemini/cookbook/blob/main/quickstarts/PDF_Files.ipynb
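
For readers who want the text-only variant of this workaround spelled out, here is a minimal sketch. It is not taken from the linked notebook: it assumes the third-party pypdf package for text extraction, and the helper names `extract_pdf_pages` and `build_text_parts` are made up for illustration.

```python
def extract_pdf_pages(path):
    """Return the extracted text of each page (empty string for pages without text)."""
    # Assumption: pypdf is installed. It is imported lazily so that
    # build_text_parts below can be used and tested without it.
    from pypdf import PdfReader

    reader = PdfReader(path)
    return [page.extract_text() or "" for page in reader.pages]


def build_text_parts(prompt, page_texts):
    """Build a `parts` list of plain strings: the prompt, then one labelled chunk per non-empty page."""
    parts = [prompt]
    for number, text in enumerate(page_texts, start=1):
        text = text.strip()
        if text:
            # Keep the original page number so the model can refer back to it.
            parts.append(f"--- Page {number} ---\n{text}")
    return parts
```

With the `model` object from the original report, the result could then be sent as `model.generate_content(build_text_parts("Summarize file", extract_pdf_pages("document.pdf")))`, where `document.pdf` is a placeholder path. Scanned pages yield little or no text this way, which is where the screenshot route comes in.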

Anushka_Sonawane · Answer 4 · Wed Jun 12 2024 13:43:39 GMT+0800 (China Standard Time)

Thanks, @MarkDaoust.

Earlier, the application/pdf format was supported. Will it be available again soon?

I have a PDF file containing both text and screenshots. As a workaround, I converted the PDF to images and uploaded the images to the GenAI API. Will the model extract accurate information from images that contain both text and screenshots?

Do we need to provide the text & screenshot separately? Also, does the API accurately extract information from the screenshots?
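
One way to avoid choosing between text and screenshots is to send both, interleaved page by page. The sketch below only shows how such a `parts` list might be assembled; it says nothing about how accurate the model's reading of the images will be. `interleave_pages` is a hypothetical helper, and `page_images` is assumed to hold objects the SDK accepts as parts (for example uploaded image files, like `files[0]` in the original report).

```python
from itertools import zip_longest


def interleave_pages(prompt, page_texts, page_images):
    """Return a `parts` list: the prompt, then for each page a label, its text (if any) and its image (if any)."""
    parts = [prompt]
    # zip_longest pads the shorter list with None, so a page may have
    # only text or only an image.
    pages = zip_longest(page_texts, page_images)
    for number, (text, image) in enumerate(pages, start=1):
        parts.append(f"--- Page {number} ---")
        if text and text.strip():
            parts.append(text.strip())
        if image is not None:
            parts.append(image)
    return parts
```

The page labels are there so that the extracted text and the image of the same page stay visibly grouped in the prompt.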

Dipanjan (DJ) Sarkar · Answer 5 · Thu Jun 20 2024 04:46:25 GMT+0800 (China Standard Time)

Any update on this? It works fine on Vertex AI, but it still doesn't work with this API.

sleepless-se · Answer 6 · Sun Jun 23 2024 20:34:21 GMT+0800 (China Standard Time)

I am encountering the same issue. When I try to upload a PDF file, I receive a 400 Unsupported MIME type: application/pdf error. I am looking forward to this problem being resolved. If there are any other suggestions or workarounds, please let me know.
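
Until PDFs are accepted directly, one possible stopgap is to try the PDF first and retry with extracted text only when this specific error comes back. This is a rough sketch, not an SDK feature: `send_with_fallback` is a hypothetical helper, and matching on the message text is brittle because the wording of the error may change.

```python
def send_with_fallback(send, primary_parts, fallback_parts, error_type=Exception):
    """Call send(primary_parts); on an 'Unsupported MIME type' error, retry once with fallback_parts.

    Any other error is re-raised unchanged.
    """
    try:
        return send(primary_parts)
    except error_type as exc:
        # The traceback in this issue shows the message
        # "400 Unsupported MIME type: application/pdf".
        if "Unsupported MIME type" not in str(exc):
            raise
        return send(fallback_parts)
```

In real use, `send` would be something like `model.generate_content`, and `error_type` would be narrowed to `google.api_core.exceptions.InvalidArgument`, the exception class shown in the traceback above.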

Mohamed Negm · Answer 7 · Mon Jun 24 2024 22:21:28 GMT+0800 (China Standard Time)

Same here :(

Mark Daoust · Answer 8 · Fri Jul 26 2024 08:11:12 GMT+0800 (China Standard Time)

Fixed: https://github.com/google-gemini/cookbook/blob/main/quickstarts/PDF_Files.ipynb