browser.py call to summarizer exceeds token limits
jtac opened this issue
Please check that this issue hasn't been reported before.
- I searched previous Bug Reports and didn't find any similar reports.
Expected Behavior
Even if a scraped page exceeds the model's token limit, the text should be chunked and summarized piece by piece rather than failing.
Current Behavior
NEXT_COMMAND: browser, Args: {'url': 'https://www.soundguys.com/sony-wf-1000xm4-review-31815/', 'question': 'Extract specs, prices, and reviews for Sony WF-1000XM4.'}
SYSTEM: Executing command: browser
Summarizing text...: 0%| | 0/2 [00:00<?, ?it/s]
Summarizing text...: 50%|█████ | 1/2 [00:00<00:00, 2.66it/s]
Summarizing text...: 50%|█████ | 1/2 [00:00<00:00, 1.45it/s]
SYSTEM: browser output: An error occurred while scraping the website: This model's maximum context length is 8192 tokens. However, your messages resulted in 15534 tokens. Please reduce the length of the messages.. Make sure the URL is valid.
Steps to reproduce
Run the browser command with the arguments shown above.
Possible solution
Check the chunking logic; it may be getting bypassed after BeautifulSoup handles the content/links splitting, so the full page text reaches the summarizer in one piece.
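For illustration, here is a minimal sketch of the guarantee the browser command should enforce before calling the summarizer. The names (`chunk_text`, `CHUNK_TOKENS`) are hypothetical, not LoopGPT's actual API, and token counts are approximated by whitespace-separated words for simplicity:

```python
MAX_TOKENS = 8192    # model context limit from the error message above
CHUNK_TOKENS = 3000  # leave headroom for the prompt and the response

def chunk_text(text: str, chunk_tokens: int = CHUNK_TOKENS):
    """Yield pieces of `text` that each fit under the summarizer's limit.

    Approximates tokens with whitespace words; a real implementation
    would use the model's tokenizer (e.g. tiktoken) for exact counts.
    """
    words = text.split()
    for i in range(0, len(words), chunk_tokens):
        yield " ".join(words[i:i + chunk_tokens])
```

If every chunk is summarized individually and the summaries are then combined, the 15534-token payload from the traceback can never reach the model in one request, regardless of how the soup step splits content from links.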
Which Operating Systems are you using?
- Linux
- macOS
- Windows
Python Version
- >= v3.11
- v3.10
- v3.9
- <= v3.8
LoopGPT Version
feature/azure_openai 0.0.13
Acknowledgements
- My issue title is concise, descriptive, and in title casing.
- I have searched the existing issues to make sure this bug has not been reported yet.
- I am using the latest version of LoopGPT.
- I have provided enough information for the maintainers to reproduce and diagnose the issue.