VinciGit00 / Scrapegraph-ai

Python scraper based on AI

Home Page: https://scrapegraphai.com

ValueError: No HTML body content found, please try setting the 'headless' flag to False in the graph configuration. (Urgent help wanted)

MalakW opened this issue · comments

Initially it worked and produced output, but it has since stopped working. I have been trying to resolve this error for three days. Despite using a VPN and adding credit to my OpenAI account, the error persists.

Screenshot 2024-05-27 122100

Hey @MalakW, the headless flag should not be inside the "browser" key in the graph configuration. Let me know if that fixes it.

the headless flag should not be inside the "browser" key in the graph configuration.

Like this, you mean?

image

Hello, is there any help available for this issue?

Take a look at this ^_^ Hope it helps.

graph_config = {
    "llm": {
        "api_key": "<Your API KEY>",
        "model": "oneapi/qwen-turbo",
        "base_url": "http://127.0.0.1:13000/v1", 
    },
    "embeddings": {
        "model": "ollama/nomic-embed-text",
        "base_url": "http://127.0.0.1:11434",
    },
    "headless": False
}
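
For anyone checking their own config against the layout above, here is a minimal sketch of a helper that moves a nested "headless" flag out of a "browser" key to the top level. The name `hoist_headless` is hypothetical and not part of ScrapeGraphAI; it only rearranges a plain dict and assumes the top-level placement described in this thread is what the library reads.

```python
import copy


def hoist_headless(config: dict) -> dict:
    """Return a copy of config with a nested browser.headless moved to the top level."""
    fixed = copy.deepcopy(config)  # leave the caller's dict untouched
    browser = fixed.get("browser")
    if isinstance(browser, dict) and "headless" in browser:
        # Remove the nested flag; an existing top-level value takes precedence.
        fixed.setdefault("headless", browser.pop("headless"))
        if not browser:
            # Drop the "browser" key if nothing else was stored in it.
            del fixed["browser"]
    return fixed


nested = {"llm": {"model": "oneapi/qwen-turbo"}, "browser": {"headless": False}}
flat = hoist_headless(nested)
print(flat)
```

The result can then be passed as `config=` in place of the original dict.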

Hi @MalakW, the reason is that you have not installed Playwright. Take a look at this Colab to see how it is implemented: link
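
A quick way to test the "Playwright is not installed" theory is to check whether the package can be imported in the same environment that runs the scraper. This is only a rough sketch: `missing_packages` is a hypothetical helper name, and it checks the Python package only, not whether the browser binaries have been downloaded with `playwright install`.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(["playwright"])
if missing:
    print("Missing:", ", ".join(missing))
    print("Try: pip install playwright, then: playwright install")
else:
    print("playwright is importable; browser binaries may still need 'playwright install'")
```

If the package is reported missing, install it in that same environment and rerun the graph.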

Hey @MalakW, you can try this.

import asyncio
import sys
from playwright.async_api import async_playwright
from scrapegraphai.graphs import SmartScraperGraph

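# llm_model_instance and embedder_model_instance are assumed to be model
# objects created earlier (their construction is not shown in this comment).
graph_config = {
    "llm": {
        "model_instance": llm_model_instance
    },
    "embeddings": {
        "model_instance": embedder_model_instance
    },
    # As noted earlier in the thread, "headless" goes at the top level,
    # not inside a "browser" key.
    "headless": False
}


def scrape_website(prompt, source):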
    print(prompt, source)
    # Ensure the event loop policy is set correctly for Windows
    if sys.platform == "win32":
        asyncio.set_event_loop_policy(asyncio.WindowsProactorEventLoopPolicy())

    # Create the SmartScraperGraph instance
    smart_scraper_graph = SmartScraperGraph(
        prompt=prompt,
        source=source,
        config=graph_config
    )

    result = smart_scraper_graph.run()