deedy5 / duckduckgo_search

Search for words, documents, images, videos, news, maps, and text translation using the DuckDuckGo.com search engine. Download files and images to a local hard drive.

What is the exact rate limit of DDG?

heesuju opened this issue

Hello,

I'm using duckduckgo_search version 5.1.0.
In my code, I'm using "AsyncDDGS.text(keyword)" in a for loop.
I'm iterating through the loop with an interval of 10 seconds using "await asyncio.sleep(10)".

However, after 5-6 search requests, I'm getting a rate limit error.
Any subsequent requests trigger a rate limit error as well for about 10-15 minutes.
Until now I thought that the rate limit was 1-2 requests per 10 seconds, but this doesn't seem to be the case.

Is there a total number of requests that I'm allowed to send?
Any help would be appreciated.
Thank you in advance.

Hi, show me the code.

After upgrading to 5.1.0, it returns an error:

_aget_url() https://duckduckgo.com RequestsError: Impersonating BrowserType.chrome120 is not supported

@phamxtien
Some kind of problem with curl-cffi.
Try to reinstall duckduckgo_search:
pip install -I duckduckgo_search

I followed your guide, but it still returns the error.
However, using the CLI, it runs smoothly.
[screenshot]

My code

from duckduckgo_search import DDGS
from bs4 import BeautifulSoup

def ddgSearch(keywords, region='vn-vi', count=5):
    documents = []
    urls = []
    ddgs = DDGS()
    i = 1
    for keyword in keywords:
        icount = 1
        try: 
            for r in ddgs.text(keyword, region=region, safesearch='off', timelimit='y', max_results=count):
                print(r)
                try:
                    response = requests.get(r['href'])
                    soup = BeautifulSoup(response.text, 'html.parser')
                    body = soup.find('body').text
                    body = ' '.join(body.split())
                    documents.append(body)
                    urls.append(r['href'])
                    i = i + 1
                    icount = icount + 1
                    if icount > count: break
                except Exception as e:
                    print(str(e))
                    continue
        except Exception as e:
            print(str(e))
            continue
        time.sleep(6)
    return {'urls': urls, 'documents': documents}

and I get the error: _aget_url() https://duckduckgo.com/ RequestsError: Impersonating BrowserType.chrome120 is not supported

Environment

OS: Ubuntu 23.10
Python: 3.11

are you importing requests?

Yes, I import requests already.
I missed it when creating the comment above.
[screenshot]

I don't see the above error when I run your code.
Reinstall duckduckgo_search in the virtual environment from which you are running the code.
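
A quick way to confirm which installation the script actually imports (a sketch assuming the package exposes __version__, as recent releases do):

import sys
import duckduckgo_search

print(sys.executable)                 # interpreter actually running the script
print(duckduckgo_search.__version__)  # installed package version
print(duckduckgo_search.__file__)     # location the package was imported from

If the interpreter path differs from the virtual environment you reinstalled into, the script is picking up a stale copy.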

I think this is what causes the error:
[screenshot]
and I'm still stuck :(

Hi, show me the code.

Hello again, sorry for the late reply.
Here's my sample code.

I'm using the code from the AutoGPT repository to get search results from duckduckgo_search.
This code worked fine until about a week ago.
I think my IP might be blocked after making too many requests? (I used to use multi-threading to run about 20 requests at once.)
Now I get a rate limit error every 5-6 requests.

import asyncio
from itertools import islice
from duckduckgo_search import AsyncDDGS

async def web_search(query: str, num_results: int = 8) -> list[dict]:
    """Retry the search up to 3 times if it comes back empty."""
    search_results = []
    attempts = 0

    while attempts < 3:
        if not query:
            return search_results  # nothing to search for

        async with AsyncDDGS() as ddgs:
            results = await ddgs.text(query, safesearch='on', max_results=num_results, backend="html")
            search_results = list(islice(results, num_results))

        if search_results:
            break

        await asyncio.sleep(1)
        attempts += 1

    return search_results

async def main():
    keywords = ["keyword1", "keyword2", "keyword3", "keyword4", "keyword5"]
    for keyword in keywords:
        results = await web_search(keyword, 10)
        await asyncio.sleep(10)

1. Try to use backend='api', it's less likely to be blocked.
2. "I used to use multi-threading to run like 20 requests at once" -> use a proxy.
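
A minimal sketch combining both suggestions, assuming the proxies parameter that DDGS/AsyncDDGS accept in 5.x; the SOCKS address below is a placeholder, substitute your own proxy:

import asyncio
from duckduckgo_search import AsyncDDGS

async def search_with_proxy(query: str, num_results: int = 8) -> list[dict]:
    # backend="api" is less likely to be blocked than "html";
    # the proxy URL is a placeholder, not a working endpoint
    async with AsyncDDGS(proxies="socks5://127.0.0.1:9150") as ddgs:
        return await ddgs.text(query, safesearch="on", max_results=num_results, backend="api")

# results = asyncio.run(search_with_proxy("test query"))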

Completely forgot to mention that I switched over to 'html' from 'api' after my IP started getting blocked.
I guess my only option is using proxies.
Thank you for the help!

Hi - did using proxies resolve the issue? I'm having the same problem: it used to work fine, but now I keep getting the rate limit exception after 5-6 search runs, and I have to wait a while before it runs properly again. I was trying to see if there's a way to actually pay for duckduckgo-search so that I can guarantee it'll work for what I need, but I can't find that either.
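
For anyone hitting the same wall, a rough sketch of backing off when the library raises its rate-limit error, assuming the RatelimitException that recent duckduckgo_search releases define in their exceptions module (check your installed version):

import asyncio
from duckduckgo_search import AsyncDDGS
from duckduckgo_search.exceptions import RatelimitException

async def search_with_backoff(query: str, num_results: int = 8, max_attempts: int = 4) -> list[dict]:
    delay = 10  # seconds; doubled after every rate-limit hit
    for attempt in range(max_attempts):
        try:
            async with AsyncDDGS() as ddgs:
                return await ddgs.text(query, max_results=num_results)
        except RatelimitException:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            await asyncio.sleep(delay)
            delay *= 2
    return []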