peterw / Chat-with-Github-Repo

This repository contains two Python scripts that demonstrate how to create a chatbot using Streamlit, OpenAI GPT-3.5-turbo, and Activeloop's Deep Lake.

Home Page: https://explodinginsights.com/

Corrupted dataset error when running github.py

rnabirov opened this issue

Traceback (most recent call last):
File "/Users/rnabirov/opt/anaconda3/lib/python3.8/site-packages/deeplake/core/dataset/dataset.py", line 240, in __init__
self._set_derived_attributes(address=address)
File "/Users/rnabirov/opt/anaconda3/lib/python3.8/site-packages/deeplake/core/dataset/dataset.py", line 2065, in _set_derived_attributes
self._set_read_only(
File "/Users/rnabirov/opt/anaconda3/lib/python3.8/site-packages/deeplake/core/dataset/dataset.py", line 1727, in _set_read_only
locked = self._lock(err=err)
File "/Users/rnabirov/opt/anaconda3/lib/python3.8/site-packages/deeplake/core/dataset/dataset.py", line 1296, in _lock
raise ReadOnlyModeError()
deeplake.util.exceptions.ReadOnlyModeError: Modification when in read-only mode is not supported!

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/Users/rnabirov/opt/anaconda3/lib/python3.8/site-packages/deeplake/api/dataset.py", line 569, in load
return dataset._load(dataset_kwargs, access_method)
File "/Users/rnabirov/opt/anaconda3/lib/python3.8/site-packages/deeplake/api/dataset.py", line 638, in _load
ret = dataset_factory(**dataset_kwargs)
File "/Users/rnabirov/opt/anaconda3/lib/python3.8/site-packages/deeplake/core/dataset/__init__.py", line 23, in dataset_factory
ds = clz(path=path, *args, **kwargs)
File "/Users/rnabirov/opt/anaconda3/lib/python3.8/site-packages/deeplake/core/dataset/dataset.py", line 246, in __init__
raise ReadOnlyModeError(
deeplake.util.exceptions.ReadOnlyModeError: This dataset cannot be open for writing as you don't have permissions. Try loading the dataset with `read_only=True`.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "github.py", line 46, in <module>
main(repo_url, root_dir, deeplake_repo_name, deeplake_username)
File "github.py", line 37, in main
db = DeepLake(dataset_path=f"hub://{username}/{repo_name}", embedding_function=embeddings)
File "/Users/rnabirov/opt/anaconda3/lib/python3.8/site-packages/langchain/vectorstores/deeplake.py", line 125, in __init__
self.ds = deeplake.load(
File "/Users/rnabirov/opt/anaconda3/lib/python3.8/site-packages/deeplake/util/spinner.py", line 139, in inner
return func(*args, **kwargs)
File "/Users/rnabirov/opt/anaconda3/lib/python3.8/site-packages/deeplake/api/dataset.py", line 581, in load
raise DatasetCorruptError(
deeplake.util.exceptions.DatasetCorruptError: Exception occured (see Traceback). The dataset maybe corrupted. Try using reset=True to reset HEAD changes and load the previous commit. This will delete all uncommitted changes on the branch you are trying to load.
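
Reading the chain from the bottom up, the final DatasetCorruptError wraps an earlier ReadOnlyModeError about missing write permissions, so the "corrupted" wording may be misleading here. The sketch below shows one way to walk such a chain in plain Python. The two exception classes and `load_dataset` are stand-ins written for this illustration (they are not the Deep Lake implementations), and `exception_chain` is a hypothetical helper name.

```python
class ReadOnlyModeError(Exception):
    """Stand-in for deeplake.util.exceptions.ReadOnlyModeError (illustration only)."""


class DatasetCorruptError(Exception):
    """Stand-in for deeplake.util.exceptions.DatasetCorruptError (illustration only)."""


def load_dataset(path):
    """Hypothetical loader that only mimics the chaining seen in the traceback."""
    try:
        try:
            raise ReadOnlyModeError("Modification when in read-only mode is not supported!")
        except ReadOnlyModeError:
            # Raising inside an except block sets __context__ implicitly
            # ("During handling of the above exception, another exception occurred").
            raise ReadOnlyModeError(
                "This dataset cannot be open for writing as you don't have permissions."
            )
    except ReadOnlyModeError as err:
        # "raise ... from err" sets __cause__ explicitly
        # ("The above exception was the direct cause of the following exception").
        raise DatasetCorruptError("The dataset maybe corrupted.") from err


def exception_chain(exc):
    """Return [exc, its cause or context, ...], outermost exception first."""
    chain = []
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        chain.append(exc)
        # Prefer the explicit cause; fall back to the implicit context.
        exc = exc.__cause__ if exc.__cause__ is not None else exc.__context__
    return chain


try:
    load_dataset("hub://someone/some-dataset")
except DatasetCorruptError as err:
    for link in exception_chain(err):
        print(f"{type(link).__name__}: {link}")
```

The last entries in the chain are the ones to look at first: here they point at write permissions rather than at damaged data.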

@rnabirov Hi! Thanks for reporting this. Will look into it shortly.

The issue is resolved. I was using the previous commit, where the Activeloop token was mine but the script was trying to reach the dataset owned by the original author, so I had no permission to write to it.
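
For anyone hitting the same error, a quick check along these lines can catch the mismatch before the script tries to open the dataset. This is only a sketch: `dataset_owner` and `check_write_target` are hypothetical helpers, not part of github.py, Deep Lake, or LangChain, and the only thing taken from the traceback is the `hub://{username}/{repo_name}` path format. It assumes the account name tied to your token is known, and it would wrongly reject datasets in an organization you do have write access to.

```python
def dataset_owner(dataset_path: str) -> str:
    """Extract the owner from a path of the form hub://<owner>/<dataset>."""
    prefix = "hub://"
    if not dataset_path.startswith(prefix):
        raise ValueError(f"expected a path starting with {prefix!r}: {dataset_path!r}")
    parts = dataset_path[len(prefix):].split("/")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"expected hub://<owner>/<dataset>, got {dataset_path!r}")
    return parts[0]


def check_write_target(dataset_path: str, my_username: str) -> str:
    """Fail early if the path points at an account other than my_username."""
    owner = dataset_owner(dataset_path)
    if owner != my_username:
        raise PermissionError(
            f"{dataset_path!r} belongs to {owner!r}, but the token is for "
            f"{my_username!r}; writing would likely fail with ReadOnlyModeError"
        )
    return dataset_path


# Same f-string as in github.py, with placeholder values.
username, repo_name = "my-user", "my-dataset"
path = check_write_target(f"hub://{username}/{repo_name}", my_username="my-user")
print(path)
```

Calling `check_write_target` right before the `DeepLake(...)` constructor would turn the long chained traceback into a single, direct message.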