[Bug]: Rate limit hit when using Groq
ksylvan opened this issue
Kayvan Sylvan commented
What happened?
With this alias in place:
groq_api=$(cat ~/Documents/groq-api.txt)
alias groq="env OPENAI_BASE_URL=https://api.groq.com/openai/v1 OPENAI_API_KEY=$groq_api"
The file groq-api.txt contains the key generated at https://console.groq.com/keys
Running this:
cat ~/Downloads/Leadership\ Committee\ Meeting_otter_ai.txt | groq fabric --pattern extract_wisdom -m Llama3-70b-8192
Error: Error code: 429 - {'error': {'message': 'Rate limit reached for model `llama3-70b-8192` in organization `org_01hrsvc1b8ey0anvbgha0xckf2` on tokens per minute (TPM): Limit 7000, Used 0, Requested ~12903. Please try again in 50.597142857s. Visit https://console.groq.com/docs/rate-limits for more information.', 'type': 'tokens', 'code': 'rate_limit_exceeded'}}
Error code: 429 - {'error': {'message': 'Rate limit reached for model `llama3-70b-8192` in organization `org_01hrsvc1b8ey0anvbgha0xckf2` on tokens per minute (TPM): Limit 7000, Used 0, Requested ~12903. Please try again in 50.597142857s. Visit https://console.groq.com/docs/rate-limits for more information.', 'type': 'tokens', 'code': 'rate_limit_exceeded'}}
We should wait and retry each time the query returns HTTP code 429, using the delay the response asks for ("Please try again in 50.597142857s" above), as described on this page: https://console.groq.com/docs/rate-limits
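A minimal sketch of what that retry could look like, assuming the delay is read from the "Please try again in ...s" hint in the error text shown above. `parse_retry_delay`, `call_with_retry`, and the `send` callable are hypothetical names for illustration, not part of fabric or any client library. Note that in the run above a single request asks for ~12903 tokens against a limit of 7000, so waiting alone may not be enough for an input this large.

```python
import re
import time

# Matches the "Please try again in 50.597142857s" hint seen in the error above.
# Assumption: the delay is given as plain seconds with an "s" suffix.
_RETRY_HINT = re.compile(r"try again in ([0-9]+(?:\.[0-9]+)?)s")


def parse_retry_delay(message, default=1.0):
    """Return the suggested wait in seconds, or `default` if no hint is found."""
    match = _RETRY_HINT.search(message)
    return float(match.group(1)) if match else default


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call `send()` and retry when it raises an error mentioning code 429.

    `send` is a hypothetical zero-argument callable wrapping the API request.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except Exception as exc:
            text = str(exc)
            # Give up on non-rate-limit errors or once attempts are exhausted.
            if "Error code: 429" not in text or attempt == max_attempts:
                raise
            sleep(parse_retry_delay(text))
```

Passing `sleep` as a parameter keeps the helper easy to test without actually waiting.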
Version check
- Yes I was.
Relevant log output
kayvan@dharma fabric % wc -l ~/Downloads/Leadership\ Committee\ Meeting_otter_ai.txt
456 /Users/kayvan/Downloads/Leadership Committee Meeting_otter_ai.txt
kayvan@dharma fabric % wc -c ~/Downloads/Leadership\ Committee\ Meeting_otter_ai.txt
48479 /Users/kayvan/Downloads/Leadership Committee Meeting_otter_ai.txt
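For scale, a back-of-the-envelope check using only the numbers reported above (the byte count from `wc -c` and the figures in the 429 message). It is a rough estimate, since the requested token count presumably also covers the pattern's own prompt text and not just the file.

```python
import math

file_bytes = 48479        # from `wc -c` above
requested_tokens = 12903  # "Requested ~12903" in the 429 error
tpm_limit = 7000          # "Limit 7000" in the 429 error

# Rough bytes-per-token ratio for this input (approximate: the request
# also contains the pattern prompt, not only the file contents).
bytes_per_token = file_bytes / requested_tokens

# Minimum number of separate requests so that each one fits under the limit.
min_requests = math.ceil(requested_tokens / tpm_limit)

print(f"{bytes_per_token:.2f} bytes per token, at least {min_requests} requests")
```

So an input of this size would have to be split into at least two requests to fit under the current per-minute limit.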
Relevant screenshots (optional)
No response
Daniel Miessler commented
Yep, that's just a Groq limitation right now. It should be fixed once they enable paid keys, which should be soon.