VinciGit00 / Scrapegraph-ai

Python scraper based on AI

Home Page:https://scrapegraphai.com

NotImplementedError while running on Windows.

desainad opened this issue · comments

Describe the bug:

I am trying to develop a web scraper on Windows using Streamlit and ScrapeGraphAI. It gives an error:
2024-05-19 12:12:07.798 Uncaught app exception
Traceback (most recent call last):
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 600, in _run_script
    exec(code, module.__dict__)
  File "D:\My Python Projects\SiteScrapers\ai_scraper.py", line 31, in <module>
    result=smart_scraper_graph.run()
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\scrapegraphai\graphs\smart_scraper_graph.py", line 109, in run
    self.final_state, self.execution_info = self.graph.execute(inputs)
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\scrapegraphai\graphs\base_graph.py", line 107, in execute
    result = current_node.execute(state)
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\scrapegraphai\nodes\fetch_node.py", line 88, in execute
    document = loader.load()
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\langchain_core\document_loaders\base.py", line 29, in load
    return list(self.lazy_load())
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\langchain_community\document_loaders\chromium.py", line 76, in lazy_load
    html_content = asyncio.run(self.ascrape_playwright(url))
  File "C:\Python310\lib\asyncio\runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "C:\Python310\lib\asyncio\base_events.py", line 646, in run_until_complete
    return future.result()
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\langchain_community\document_loaders\chromium.py", line 52, in ascrape_playwright
    async with async_playwright() as p:
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\playwright\async_api\_context_manager.py", line 46, in __aenter__
    playwright = AsyncPlaywright(next(iter(done)).result())
  File "C:\Users\XXXX\AppData\Roaming\Python\Python310\site-packages\playwright\_impl\_transport.py", line 120, in connect
    self._proc = await asyncio.create_subprocess_exec(
  File "C:\Python310\lib\asyncio\subprocess.py", line 218, in create_subprocess_exec
    transport, protocol = await loop.subprocess_exec(
  File "C:\Python310\lib\asyncio\base_events.py", line 1667, in subprocess_exec
    transport = await self._make_subprocess_transport(
  File "C:\Python310\lib\asyncio\base_events.py", line 498, in _make_subprocess_transport
    raise NotImplementedError
NotImplementedError

To Reproduce
Steps to reproduce the behavior:

  1. command: streamlit run ai_scraper.py

Expected behavior
I expected to enter a URL (which it accepts fine) and a prompt describing the output I want, and then get the scraped results.

Desktop (please complete the following information):

  • OS: Windows 10 Pro 22H2
  • Browser Chrome
  • Version 124.0.6367.203

Here is the simple code: file name - ai_scraper.py

import streamlit as st
from scrapegraphai.graphs import SmartScraperGraph

st.title('Web Scraping AI Assistant')
st.caption('This app allows you to scrape a website using openAI API')

# Set up the configuration for the SmartScraperGraph
graph_config = {
    "llm": {
        "model": "ollama/llama3",
        "temperature": 0,
        "format": "json",
        "base_url": "http://localhost:11434",  # set ollama url
    },
    "embeddings": {
        "model": "ollama/nomic-embed-text",
        "base_url": "http://localhost:11434",
    },
    "verbose": True,
}

# ------ Get the url of the website to scrape ------
url = st.text_input("Enter the url of the site you want to scrape")
# ------ Get the user prompt ------
user_prompt = st.text_input("What you want the assistant to scrape from the ui?")

# ------ Create a SmartScraperGraph object ------
smart_scraper_graph = SmartScraperGraph(prompt=user_prompt, source=url, config=graph_config)
if st.button("scrape"):
    result = smart_scraper_graph.run()
    st.write(result)

Hey, try SearchGraph and see if you still have the problem. If not, then it is an asyncio problem in SmartScraper and I will fix it.

I have the same issue

I have the same issue

playwright = AsyncPlaywright(next(iter(done)).result())
raise self._exception.with_traceback(self._exception_tb)
self._proc = await asyncio.create_subprocess_exec(
transport, protocol = await loop.subprocess_exec(
transport = await self._make_subprocess_transport(
raise NotImplementedError
NotImplementedError

Edit: I switched from FastAPI to Flask and it's fixed.

Getting the same error with SearchGraph as well. How can this be resolved?

@Shivansh-yadav13 Do you know what the problem was?

Hello @Ravel226, I'm not sure what exactly the issue is, but I get this error whenever I use it inside a framework. I recently tried it with the Reflex framework and hit the same issue.

When I use it in a simple Python program it works fine.
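
A note on the plain-script versus framework difference: the traceback ends in `_make_subprocess_transport` in `asyncio\base_events.py`, which is the base-class stub that raises `NotImplementedError`. On Windows the selector-based event loop does not implement subprocess support, while the proactor loop does, and Playwright has to spawn a subprocess. A plausible explanation (an assumption, not confirmed by anyone in this thread) is that the framework switches the process to the selector event loop policy, so the `asyncio.run(...)` call inside the loader gets a loop that cannot start the browser driver. A minimal check, where `new_loops_support_subprocesses` is a hypothetical helper name, might look like this:

```python
import asyncio
import sys


def new_loops_support_subprocesses() -> bool:
    """Report whether a freshly created event loop can spawn subprocesses.

    Hypothetical diagnostic helper, not part of scrapegraphai or Playwright.
    """
    # asyncio.run() builds its loop the same way, via the current policy.
    loop = asyncio.new_event_loop()
    try:
        if sys.platform != "win32":
            # The default Unix event loop implements subprocess transports.
            return True
        # On Windows only the proactor loop implements them.
        return isinstance(loop, asyncio.ProactorEventLoop)
    finally:
        loop.close()


if __name__ == "__main__":
    print("new loops can spawn subprocesses:", new_loops_support_subprocesses())
```

Calling the helper once from a plain script and once from inside the Streamlit (or FastAPI, Reflex) app should show whether the two environments hand out different loop types.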

I think I will build my own Streamlit app from scratch. I hope it will work.

Hi @ALL,

Has anybody solved this issue? If yes, please guide me on it. Thanks.
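
One workaround that may be worth trying, offered as a sketch rather than a confirmed fix from the maintainers: temporarily switch to the proactor event loop policy around the scraper call, so that the `asyncio.run(...)` inside the loader creates a loop that supports subprocesses. The `proactor_policy` context manager below is a hypothetical helper, and it assumes the selector policy is what triggers the `NotImplementedError`. The policy is process-wide, so other threads that create event loops during the call are affected too, and recent Python releases deprecate the policy API, so this targets the Python 3.10 setup shown in the traceback.

```python
import asyncio
import sys
from contextlib import contextmanager


@contextmanager
def proactor_policy():
    """Temporarily install the Windows proactor event loop policy.

    Hypothetical helper. Yields True if the policy was switched and
    False on platforms where no switch is needed.
    """
    if sys.platform != "win32":
        yield False
        return
    previous = asyncio.get_event_loop_policy()
    asyncio.set_event_loop_policy(asyncio.WindowsProactorEventLoopPolicy())
    try:
        yield True
    finally:
        # Restore whatever policy the framework had installed.
        asyncio.set_event_loop_policy(previous)


# Intended use inside the app from the original report:
#
#     if st.button("scrape"):
#         with proactor_policy():
#             result = smart_scraper_graph.run()
#         st.write(result)
```

Restoring the previous policy afterwards keeps the change limited to the scraper call, which matters if the framework relies on the selector loop for its own server.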

Hi, the app where you can find this problem could be this one: https://github.com/ScrapeGraphAI/Scrapegraph-LabLabAI-Hackathon

I opened an issue there, but you closed it as well. I don't know where the problem is.