[Bug]: Rate limit hit when using Groq

Question

ksylvan opened this issue a month ago

What happened?

With this alias in place:

groq_api=$(cat ~/Documents/groq-api.txt)
alias groq="env OPENAI_BASE_URL=https://api.groq.com/openai/v1 OPENAI_API_KEY=$groq_api"

groq-api.txt contains the key generated at https://console.groq.com/keys

Running this:

cat ~/Downloads/Leadership\ Committee\ Meeting_otter_ai.txt | groq fabric --pattern extract_wisdom -m Llama3-70b-8192
Error: Error code: 429 - {'error': {'message': 'Rate limit reached for model `llama3-70b-8192` in organization `org_01hrsvc1b8ey0anvbgha0xckf2` on tokens per minute (TPM): Limit 7000, Used 0, Requested ~12903. Please try again in 50.597142857s. Visit https://console.groq.com/docs/rate-limits for more information.', 'type': 'tokens', 'code': 'rate_limit_exceeded'}}
Error code: 429 - {'error': {'message': 'Rate limit reached for model `llama3-70b-8192` in organization `org_01hrsvc1b8ey0anvbgha0xckf2` on tokens per minute (TPM): Limit 7000, Used 0, Requested ~12903. Please try again in 50.597142857s. Visit https://console.groq.com/docs/rate-limits for more information.', 'type': 'tokens', 'code': 'rate_limit_exceeded'}}
kayvan@dharma fabric % cat ~/Downloads/Leadership\ Committee\ Meeting_otter_ai.txt | groq fabric --pattern extract_wisdom -m Llama3-70b-8192
Error: Error code: 429 - {'error': {'message': 'Rate limit reached for model `llama3-70b-8192` in organization `org_01hrsvc1b8ey0anvbgha0xckf2` on tokens per minute (TPM): Limit 7000, Used 0, Requested ~12903. Please try again in 50.597142857s. Visit https://console.groq.com/docs/rate-limits for more information.', 'type': 'tokens', 'code': 'rate_limit_exceeded'}}
Error code: 429 - {'error': {'message': 'Rate limit reached for model `llama3-70b-8192` in organization `org_01hrsvc1b8ey0anvbgha0xckf2` on tokens per minute (TPM): Limit 7000, Used 0, Requested ~12903. Please try again in 50.597142857s. Visit https://console.groq.com/docs/rate-limits for more information.', 'type': 'tokens', 'code': 'rate_limit_exceeded'}}

We should wait and retry whenever a query returns HTTP 429, using the delay the API suggests, as described at https://console.groq.com/docs/rate-limits
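
To make the request concrete, here is a minimal sketch of what such a retry could look like. It is not fabric's actual code: `RateLimitError`, `wait_seconds` and `call_with_retry` are hypothetical names, and `send` stands in for whatever function performs the API call. The wait time comes from an explicit retry-after value when one is available, otherwise from the "try again in ...s" hint seen in the error message above.

```python
import re
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a client raises on HTTP 429."""

    def __init__(self, message, retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after  # seconds, if the server supplied a value


# Matches the seconds-only hint from the log, e.g. "try again in 50.597142857s".
_WAIT_RE = re.compile(r"try again in ([0-9]+(?:\.[0-9]+)?)s")


def wait_seconds(err, default=1.0):
    """Prefer an explicit retry_after, else parse the hint in the message."""
    if err.retry_after is not None:
        return float(err.retry_after)
    match = _WAIT_RE.search(str(err))
    return float(match.group(1)) if match else default


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send(); on RateLimitError, wait the suggested time and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except RateLimitError as err:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            sleep(wait_seconds(err))
```

Note that in the log above a single request asks for ~12903 tokens against a limit of 7000 per minute with 0 used, so waiting alone would probably not let this particular input through. A retry like this mainly helps when the per-minute budget is only temporarily used up.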

Version check

Yes I was.

Relevant log output

kayvan@dharma fabric % wc -l ~/Downloads/Leadership\ Committee\ Meeting_otter_ai.txt
     456 /Users/kayvan/Downloads/Leadership Committee Meeting_otter_ai.txt
kayvan@dharma fabric % wc -c ~/Downloads/Leadership\ Committee\ Meeting_otter_ai.txt
   48479 /Users/kayvan/Downloads/Leadership Committee Meeting_otter_ai.txt
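
For scale: 48479 bytes of input against ~12903 requested tokens works out to roughly 3.76 bytes per token, and the request is close to twice the 7000 TPM limit. One possible workaround, sketched below under assumptions, is to split the transcript on line boundaries so each piece fits a chosen token budget. The ratio is only a rough estimate (the requested count presumably also covers the pattern prompt, and `wc -c` counts bytes, not characters), and `max_chars_for` and `split_on_lines` are hypothetical helpers, not part of fabric.

```python
# Rough ratio taken from the numbers in this report (an estimate, not a tokenizer).
EST_CHARS_PER_TOKEN = 48479 / 12903  # about 3.76


def max_chars_for(token_budget):
    """Convert a token budget into an approximate character budget."""
    return int(token_budget * EST_CHARS_PER_TOKEN)


def split_on_lines(text, max_chars):
    """Group whole lines into chunks of up to max_chars characters.

    A single line longer than max_chars is kept whole as its own chunk,
    so that chunk can exceed the budget.
    """
    chunks, current, size = [], [], 0
    for line in text.splitlines(keepends=True):
        if current and size + len(line) > max_chars:
            chunks.append("".join(current))
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append("".join(current))
    return chunks
```

Each chunk could then be piped to the pattern separately, leaving headroom under the limit for the prompt and the response.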

Relevant screenshots (optional)

No response

Daniel Miessler · Answer 1 · Sat May 11 2024 12:15:33 GMT+0800 (China Standard Time)

Yep, that's just a Groq limitation right now. It should be resolved once they enable paid keys, which is expected soon.