amit1nayak / llm-agent-web-tools

A simple Google Search Engine Crawler.

Home Page: https://github.com/microsoft/ProphetNet/tree/master/CRITIC

LLM-Agent Web Tools

This repository contains the web search tools used in the paper CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing [ICLR'24]. The main repository is CRITIC (https://github.com/microsoft/ProphetNet/tree/master/CRITIC).

Supported web tools:

  • Google
  • Bing
  • Baidu
  • Google Scholar
  • DuckDuckGo
  • GitHub
  • StackOverflow
  • YouTube
  • ...

Caching Mechanism for Replicability of CRITIC Paper

We built a caching system designed specifically for web searches. It archives every API query generated via greedy decoding, for each model and evaluation sample, together with the corresponding search results. This keeps the CRITIC results stable, fair, and reproducible.
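The idea can be illustrated with a minimal sketch. This is not the repository's implementation: the `SearchCache` class, its `get_or_fetch` method, the on-disk JSON layout, and the `fake_fetch` stub are all hypothetical names chosen for illustration. The sketch assumes a cache keyed by search engine, query string, and search parameters, so that a repeated query returns the archived results instead of hitting the live web.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Any, Callable


class SearchCache:
    """Hypothetical disk-backed cache: (engine, query, params) -> search results."""

    def __init__(self, cache_dir: str) -> None:
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path(self, engine: str, query: str, params: dict) -> Path:
        # Serialize with sorted keys so identical requests always hash identically.
        key = json.dumps(
            {"engine": engine, "query": query, "params": params}, sort_keys=True
        )
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.cache_dir / f"{digest}.json"

    def get_or_fetch(
        self, engine: str, query: str, params: dict, fetch: Callable[[str], Any]
    ) -> Any:
        path = self._path(engine, query, params)
        if path.exists():
            # Cache hit: replay the archived results, no network access.
            return json.loads(path.read_text(encoding="utf-8"))
        # Cache miss: run the real search once and archive its results.
        results = fetch(query)
        path.write_text(json.dumps(results, ensure_ascii=False), encoding="utf-8")
        return results


# Stub standing in for a live search call (assumption, for demonstration only).
calls = []


def fake_fetch(query: str) -> list:
    calls.append(query)
    return [{"title": "stub result", "url": "https://example.com"}]


with tempfile.TemporaryDirectory() as tmp:
    cache = SearchCache(tmp)
    first = cache.get_or_fetch("google", "example query", {"topk": 1}, fake_fetch)
    second = cache.get_or_fetch("google", "example query", {"topk": 1}, fake_fetch)

print(first == second, len(calls))
```

In this sketch the second call is served from disk, so the stub is invoked only once. Because the parameters are part of the key, changing a value such as `topk` produces a separate cache entry rather than reusing results fetched under different settings.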

Usage

from src.tools.web_tools.core.engines.google import Search as GoogleSearch

# init a search engine
gsearch = GoogleSearch(proxy=None)

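# example query (placeholder; replace with your own)
query = "Large Language Models Can Self-Correct with Tool-Interactive Critiquing"

# automatically parses the Google results and the corresponding web pages
gresults = gsearch.search(query, cache=True, page_cache=True, topk=1, end_year=2024)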

print(gresults)

About

License: MIT License


Languages

Language: Python 100.0%