Can ScrapeGraph-AI be used in a Kaggle Notebook?
Kingki19 opened this issue · comments
When I try to use scrapegraphai in a Kaggle Notebook, I get an error. This is my code.
Assume I have already installed scrapegraphai in the notebook with
!pip install scrapegraphai
and that I have a valid API key. I am using Gemini in this case.
import os
from dotenv import load_dotenv

load_dotenv()
gemini_key = os.getenv("GOOGLE_API_KEY")  # env var name is an assumption; use whatever holds your Gemini key
from scrapegraphai.graphs import SmartScraperGraph
prompt = """
list me all presidents of Indonesia, their presidency timespan, and their lifespan.
"""
url = "https://en.wikipedia.org/wiki/List_of_presidents_of_Indonesia"
graph_config = {
    "llm": {
        "api_key": gemini_key,
        "model": "gemini-pro",
    },
}

smart_scraper_graph = SmartScraperGraph(
    prompt=prompt,
    # also accepts a string with the already downloaded HTML code
    source=url,
    config=graph_config,
)
result = smart_scraper_graph.run()
print(result)
The error message:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[17], line 15
6 url = "https://en.wikipedia.org/wiki/List_of_presidents_of_Indonesia"
8 graph_config = {
9 "llm": {
10 "api_key": gemini_key,
11 "model": "gemini-pro",
12 },
13 }
---> 15 smart_scraper_graph = SmartScraperGraph(
16 prompt=prompt,
17 # also accepts a string with the already downloaded HTML code
18 source=url,
19 config=graph_config
20 )
22 result = smart_scraper_graph.run()
23 print(result)
File /opt/conda/lib/python3.10/site-packages/scrapegraphai/graphs/smart_scraper_graph.py:47, in SmartScraperGraph.__init__(self, prompt, source, config)
46 def __init__(self, prompt: str, source: str, config: dict):
---> 47 super().__init__(prompt, config, source)
49 self.input_key = "url" if source.startswith("http") else "local_dir"
File /opt/conda/lib/python3.10/site-packages/scrapegraphai/graphs/abstract_graph.py:48, in AbstractGraph.__init__(self, prompt, config, source)
46 self.source = source
47 self.config = config
---> 48 self.llm_model = self._create_llm(config["llm"], chat=True)
49 self.embedder_model = self._create_default_embedder(llm_config=config["llm"]
50 ) if "embeddings" not in config else self._create_embedder(
51 config["embeddings"])
53 # Create the graph
File /opt/conda/lib/python3.10/site-packages/scrapegraphai/graphs/abstract_graph.py:152, in AbstractGraph._create_llm(self, llm_config, chat)
150 except KeyError as exc:
151 raise KeyError("Model not supported") from exc
--> 152 return Gemini(llm_params)
153 elif llm_params["model"].startswith("claude"):
154 try:
File /opt/conda/lib/python3.10/site-packages/scrapegraphai/models/gemini.py:20, in Gemini.__init__(self, llm_config)
17 def __init__(self, llm_config: dict):
18 # replace "api_key" to "google_api_key"
19 llm_config["google_api_key"] = llm_config.pop("api_key", None)
---> 20 super().__init__(**llm_config)
File /opt/conda/lib/python3.10/site-packages/pydantic/v1/main.py:339, in BaseModel.__init__(__pydantic_self__, **data)
333 """
334 Create a new model by parsing and validating input data from keyword arguments.
335
336 Raises ValidationError if the input data cannot be parsed to form a valid model.
337 """
338 # Uses something other than `self` the first arg to allow "self" as a settable attribute
--> 339 values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
340 if validation_error:
341 raise validation_error
File /opt/conda/lib/python3.10/site-packages/pydantic/v1/main.py:1102, in validate_model(model, input_data, cls)
1100 continue
1101 try:
-> 1102 values = validator(cls_, values)
1103 except (ValueError, TypeError, AssertionError) as exc:
1104 errors.append(ErrorWrapper(exc, loc=ROOT_KEY))
File /opt/conda/lib/python3.10/site-packages/langchain_google_genai/chat_models.py:602, in ChatGoogleGenerativeAI.validate_environment(cls, values)
599 if isinstance(google_api_key, SecretStr):
600 google_api_key = google_api_key.get_secret_value()
--> 602 genai.configure(
603 api_key=google_api_key,
604 transport=values.get("transport"),
605 client_options=values.get("client_options"),
606 default_metadata=default_metadata,
607 )
608 if (
609 values.get("temperature") is not None
610 and not 0 <= values["temperature"] <= 1
611 ):
612 raise ValueError("temperature must be in the range [0.0, 1.0]")
File ~/.local/lib/python3.10/site-packages/sitecustomize.py:96, in post_import_logic.<locals>.new_configure(*args, **kwargs)
94 else:
95 default_metadata = []
---> 96 default_metadata.append(("x-kaggle-proxy-data", os.environ['KAGGLE_DATA_PROXY_TOKEN']))
97 user_secrets_token = os.environ['KAGGLE_USER_SECRETS_TOKEN']
98 default_metadata.append(('x-kaggle-authorization', f'Bearer {user_secrets_token}'))
AttributeError: 'tuple' object has no attribute 'append'
Summary:
The error occurs on this line:
default_metadata.append(("x-kaggle-proxy-data", os.environ['KAGGLE_DATA_PROXY_TOKEN']))
The error 'tuple' object has no attribute 'append' means that .append() is being called on a tuple, which tuples do not support because they are immutable. The fix is to make sure default_metadata is a mutable object (e.g. a list) before items are appended to it.
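The failure mode can be reproduced in a few lines, independent of Kaggle or genai (the header tuple below is a dummy value, not the real Kaggle token):

```python
# Appending to a tuple raises AttributeError; coercing to a list first succeeds.
default_metadata = ()  # immutable, like the value genai ends up passing on Kaggle

try:
    default_metadata.append(("x-kaggle-proxy-data", "dummy-token"))
except AttributeError as e:
    print(e)  # 'tuple' object has no attribute 'append'

# The fix Kaggle's patch would need: make the object mutable first.
default_metadata = list(default_metadata)
default_metadata.append(("x-kaggle-proxy-data", "dummy-token"))
print(default_metadata)  # [('x-kaggle-proxy-data', 'dummy-token')]
```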
I think this is why no one has created a Kaggle notebook about ScrapeGraph-AI; they are probably hitting the same problem.
Can you set the temperature to 0?
values.get("temperature") is not None
and not 0 <= values["temperature"] <= 1
Where do I put that parameter in my code?
EDIT:
I set the temperature to 0 like this:
graph_config = {
    "llm": {
        "api_key": gemini_key,
        "model": "gemini-pro",
        "temperature": 0,
    },
}
but I still get the same error.
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[3], line 16
6 url = "https://en.wikipedia.org/wiki/List_of_presidents_of_Indonesia"
8 graph_config = {
9 "llm": {
10 "api_key": gemini_key,
(...)
13 },
14 }
---> 16 smart_scraper_graph = SmartScraperGraph(
17 prompt=prompt,
18 # also accepts a string with the already downloaded HTML code
19 source=url,
20 config=graph_config
21 )
23 result = smart_scraper_graph.run()
24 print(result)
File /opt/conda/lib/python3.10/site-packages/scrapegraphai/graphs/smart_scraper_graph.py:47, in SmartScraperGraph.__init__(self, prompt, source, config)
46 def __init__(self, prompt: str, source: str, config: dict):
---> 47 super().__init__(prompt, config, source)
49 self.input_key = "url" if source.startswith("http") else "local_dir"
File /opt/conda/lib/python3.10/site-packages/scrapegraphai/graphs/abstract_graph.py:48, in AbstractGraph.__init__(self, prompt, config, source)
46 self.source = source
47 self.config = config
---> 48 self.llm_model = self._create_llm(config["llm"], chat=True)
49 self.embedder_model = self._create_default_embedder(llm_config=config["llm"]
50 ) if "embeddings" not in config else self._create_embedder(
51 config["embeddings"])
53 # Create the graph
File /opt/conda/lib/python3.10/site-packages/scrapegraphai/graphs/abstract_graph.py:152, in AbstractGraph._create_llm(self, llm_config, chat)
150 except KeyError as exc:
151 raise KeyError("Model not supported") from exc
--> 152 return Gemini(llm_params)
153 elif llm_params["model"].startswith("claude"):
154 try:
File /opt/conda/lib/python3.10/site-packages/scrapegraphai/models/gemini.py:20, in Gemini.__init__(self, llm_config)
17 def __init__(self, llm_config: dict):
18 # replace "api_key" to "google_api_key"
19 llm_config["google_api_key"] = llm_config.pop("api_key", None)
---> 20 super().__init__(**llm_config)
File /opt/conda/lib/python3.10/site-packages/pydantic/v1/main.py:339, in BaseModel.__init__(__pydantic_self__, **data)
333 """
334 Create a new model by parsing and validating input data from keyword arguments.
335
336 Raises ValidationError if the input data cannot be parsed to form a valid model.
337 """
338 # Uses something other than `self` the first arg to allow "self" as a settable attribute
--> 339 values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
340 if validation_error:
341 raise validation_error
File /opt/conda/lib/python3.10/site-packages/pydantic/v1/main.py:1102, in validate_model(model, input_data, cls)
1100 continue
1101 try:
-> 1102 values = validator(cls_, values)
1103 except (ValueError, TypeError, AssertionError) as exc:
1104 errors.append(ErrorWrapper(exc, loc=ROOT_KEY))
File /opt/conda/lib/python3.10/site-packages/langchain_google_genai/chat_models.py:602, in ChatGoogleGenerativeAI.validate_environment(cls, values)
599 if isinstance(google_api_key, SecretStr):
600 google_api_key = google_api_key.get_secret_value()
--> 602 genai.configure(
603 api_key=google_api_key,
604 transport=values.get("transport"),
605 client_options=values.get("client_options"),
606 default_metadata=default_metadata,
607 )
608 if (
609 values.get("temperature") is not None
610 and not 0 <= values["temperature"] <= 1
611 ):
612 raise ValueError("temperature must be in the range [0.0, 1.0]")
File ~/.local/lib/python3.10/site-packages/sitecustomize.py:96, in post_import_logic.<locals>.new_configure(*args, **kwargs)
94 else:
95 default_metadata = []
---> 96 default_metadata.append(("x-kaggle-proxy-data", os.environ['KAGGLE_DATA_PROXY_TOKEN']))
97 user_secrets_token = os.environ['KAGGLE_USER_SECRETS_TOKEN']
98 default_metadata.append(('x-kaggle-authorization', f'Bearer {user_secrets_token}'))
AttributeError: 'tuple' object has no attribute 'append'
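Since the same Kaggle sitecustomize line fails again even with temperature set, one possible workaround pattern (untested on Kaggle; both function names below are stand-ins, since the real patched configure only exists inside Kaggle's sitecustomize) is to wrap the patched configure so a tuple default_metadata is coerced to a list before the wrapper appends to it:

```python
def kaggle_patched_configure(**kwargs):
    # Stand-in mimicking sitecustomize's new_configure: it appends
    # Kaggle proxy headers to default_metadata and fails on a tuple.
    default_metadata = kwargs.get("default_metadata")
    if default_metadata is None:
        default_metadata = []
    default_metadata.append(("x-kaggle-proxy-data", "dummy-token"))
    return default_metadata

def safe_configure(**kwargs):
    # Coerce an immutable default_metadata to a list before delegating.
    md = kwargs.get("default_metadata")
    if isinstance(md, tuple):
        kwargs["default_metadata"] = list(md)
    return kaggle_patched_configure(**kwargs)

# A tuple now survives where it previously raised AttributeError:
print(safe_configure(default_metadata=(("a", "b"),)))
# [('a', 'b'), ('x-kaggle-proxy-data', 'dummy-token')]
```

In an actual Kaggle notebook the same idea would mean re-wrapping `google.generativeai.configure` this way before importing scrapegraphai, though whether Kaggle's proxy then accepts the request is not confirmed in this thread.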
@Kingki19, are you able to use gemini with langchain alone in a kaggle notebook?
From the error log, this seems unrelated to scrapegraph.
I am sorry, but I have never used LangChain myself. However, in other people's notebooks I have seen that they can run and use LangChain.
Hi, please install all the libraries listed in this notebook: https://colab.research.google.com/drive/1sEZBonBMGP44CtO6GQTwAlL0BGJXjtfd?usp=sharing
@VinciGit00 Thanks.