browser.py call to summarizer exceeds token limits
jtac opened this issue
Please check that this issue hasn't been reported before.
- I searched previous Bug Reports and didn't find any similar reports.
Expected Behavior
Even if a scraped page exceeds the model's token limit, the text should be chunked and summarized piece by piece rather than failing.
Current Behavior
NEXT_COMMAND: browser, Args: {'url': 'https://www.soundguys.com/sony-wf-1000xm4-review-31815/', 'question': 'Extract specs, prices, and reviews for Sony WF-1000XM4.'}
SYSTEM: Executing command: browser
Summarizing text...: 0%| | 0/2 [00:00<?, ?it/s]
Summarizing text...: 50%|█████ | 1/2 [00:00<00:00, 2.66it/s]
Summarizing text...: 50%|█████ | 1/2 [00:00<00:00, 1.45it/s]
SYSTEM: browser output: An error occurred while scraping the website: This model's maximum context length is 8192 tokens. However, your messages resulted in 15534 tokens. Please reduce the length of the messages.. Make sure the URL is valid.
Steps to reproduce
Run the browser command with the arguments shown above.
Possible solution
Check the chunking logic; it may be getting bypassed after BeautifulSoup handles the content/links splitting, so the full page text reaches the summarizer in one piece.
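For illustration, here is a minimal sketch of the guarantee the browser command should enforce before calling the summarizer. The names (`chunk_text`, `CHUNK_TOKENS`) are hypothetical, not LoopGPT's actual API, and token counts are approximated by whitespace-separated words for simplicity:

```python
MAX_TOKENS = 8192    # model context limit from the error message above
CHUNK_TOKENS = 3000  # leave headroom for the prompt and the response

def chunk_text(text: str, chunk_tokens: int = CHUNK_TOKENS):
    """Yield pieces of `text` that each fit under the summarizer's limit.

    Approximates tokens with whitespace words; a real implementation
    would use the model's tokenizer (e.g. tiktoken) for exact counts.
    """
    words = text.split()
    for i in range(0, len(words), chunk_tokens):
        yield " ".join(words[i:i + chunk_tokens])
```

If every chunk is summarized individually and the summaries are then combined, the 15534-token payload from the traceback can never reach the model in one request, regardless of how the soup step splits content from links.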
Which Operating Systems are you using?
- Linux
- macOS
- Windows
Python Version
- >= v3.11
- v3.10
- v3.9
- <= v3.8
LoopGPT Version
feature/azure_openai 0.0.13
Acknowledgements
- My issue title is concise, descriptive, and in title casing.
- I have searched the existing issues to make sure this bug has not been reported yet.
- I am using the latest version of LoopGPT.
- I have provided enough information for the maintainers to reproduce and diagnose the issue.