kingoflolz / mesh-transformer-jax

Model parallel transformers in JAX and Haiku

Fine-tuning

preste-naava opened this issue

Hi! While fine-tuning GPT-J I always get the same google.api_core exception. I have uploaded the pretrained weights and the dataset to a bucket, created a Cloud TPU service account, and gave it read and write permissions on the bucket, but that didn't help. Any help would be appreciated.

saving a checkpoint for step 1
Traceback (most recent call last):
  File "device_train.py", line 58, in save
    with open(f"gs://{bucket}/{path}/meta.json", "r") as f:
  File "/home/preste-naava/.local/lib/python3.8/site-packages/smart_open/smart_open_lib.py", line 235, in open
    binary = _open_binary_stream(uri, binary_mode, transport_params)
  File "/home/preste-naava/.local/lib/python3.8/site-packages/smart_open/smart_open_lib.py", line 398, in _open_binary_stream
    fobj = submodule.open_uri(uri, mode, transport_params)
  File "/home/preste-naava/.local/lib/python3.8/site-packages/smart_open/gcs.py", line 105, in open_uri
    return open(parsed_uri['bucket_id'], parsed_uri['blob_id'], mode, **kwargs)
  File "/home/preste-naava/.local/lib/python3.8/site-packages/smart_open/gcs.py", line 138, in open
    fileobj = Reader(
  File "/home/preste-naava/.local/lib/python3.8/site-packages/smart_open/gcs.py", line 224, in __init__
    raise google.cloud.exceptions.NotFound('blob %s not found in %s' % (key, bucket))
google.api_core.exceptions.NotFound: 404 blob mesh_jax_gpt_6B_eliza_rotary/meta.json not found in gpt-j_bucket

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/preste-naava/.local/lib/python3.8/site-packages/google/cloud/storage/blob.py", line 2713, in create_resumable_upload_session
    upload, _ = self._initiate_resumable_upload(
  File "/home/preste-naava/.local/lib/python3.8/site-packages/google/cloud/storage/blob.py", line 1916, in _initiate_resumable_upload
    upload.initiate(
  File "/home/preste-naava/.local/lib/python3.8/site-packages/google/resumable_media/requests/upload.py", line 413, in initiate
    self._process_initiate_response(response)
  File "/home/preste-naava/.local/lib/python3.8/site-packages/google/resumable_media/_helpers.py", line 502, in _process_initiate_response
    _helpers.require_status_code(
  File "/home/preste-naava/.local/lib/python3.8/site-packages/google/resumable_media/_helpers.py", line 99, in require_status_code
    raise common.InvalidResponse(
google.resumable_media.common.InvalidResponse: ('Request failed with status code', 403, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.CREATED: 201>)

Hi, have you solved this error? I encountered the same one.