ultralytics / hub

Ultralytics HUB tutorials and support

Home Page: https://hub.ultralytics.com

Accessing checkpoints after Google Colab got disconnected during training

vipin-prabhakaran opened this issue · comments

Question

I was training on a dataset of around 13,000 images, but Google Colab got disconnected in the middle of training. The checkpoint URL is https://hub.ultralytics.com/models/fjTOVdZKmQMsYBEqakVV . I was trying to access best.pt in the weights section, since training was almost complete at 63 epochs.
Because the training did not finish, I was not able to access the weights folder and best.pt. Can anyone help me access the completed checkpoints so I can retrieve best.pt?

Additional

No response

👋 Hello @vipin-prabhakaran, thank you for raising an issue about Ultralytics HUB 🚀! Please visit our HUB Docs to learn more:

  • Quickstart. Start training and deploying YOLO models with HUB in seconds.
  • Datasets: Preparing and Uploading. Learn how to prepare and upload your datasets to HUB in YOLO format.
  • Projects: Creating and Managing. Group your models into projects for improved organization.
  • Models: Training and Exporting. Train YOLOv5 and YOLOv8 models on your custom datasets and export them to various formats for deployment.
  • Integrations. Explore different integration options for your trained models, such as TensorFlow, ONNX, OpenVINO, CoreML, and PaddlePaddle.
  • Ultralytics HUB App. Learn about the Ultralytics App for iOS and Android, which allows you to run models directly on your mobile device.
    • iOS. Learn about YOLO CoreML models accelerated on Apple's Neural Engine on iPhones and iPads.
    • Android. Explore TFLite acceleration on mobile devices.
  • Inference API. Understand how to use the Inference API for running your trained models in the cloud to generate predictions.

If this is a πŸ› Bug Report, please provide screenshots and steps to reproduce your problem to help us get started working on a fix.

If this is a ❓ Question, please provide as much information as possible, including dataset, model, environment details etc. so that we might provide the most helpful response.

We try to respond to all issues as promptly as possible. Thank you for your patience!

@vipin-prabhakaran If your Colab instance got disconnected and you do not have it connected to a permanent storage solution, it's probable that the environment was reset. I checked HUB and the model did not upload any checkpoints, so you are not able to resume.

Your best hope is checking the /content directory in Colab and seeing if the weights are still there.
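A minimal sketch of that check, assuming the Colab runtime is still alive and that any surviving checkpoints are `.pt` files somewhere under `/content`. The helper name `find_checkpoints` is hypothetical and not part of Ultralytics or Colab; it only uses the Python standard library.

```python
from pathlib import Path


def find_checkpoints(root="/content", pattern="*.pt"):
    """Return checkpoint files found under root, newest first.

    Hypothetical helper: the default root and the *.pt pattern are
    assumptions about where and how the weights were saved.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        # The directory is gone, e.g. the runtime was reset.
        return []
    files = [p for p in root_path.rglob(pattern) if p.is_file()]
    # Sort by modification time so the most recent checkpoint is listed first.
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for path in find_checkpoints():
        size_mb = path.stat().st_size / 1e6
        print(f"{path}  ({size_mb:.1f} MB)")
```

If this prints nothing, the runtime was most likely reset and the local weights are gone.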

We also recently released Ultralytics Cloud, which would solve your connection problems. You can try it by upgrading to Pro; we currently have a promotion of the first month free with the code CLOUDPRO20.

Hi @kalenmike, thank you so much for your response. You wrote: "If your Colab instance got disconnected and you do not have it connected with a permanent storage solution its probable that the environment got reset. I checked HUB and the model did not upload any checkpoints so you are not able to resume." As I am new to the system, can you please shed some light on how to connect Colab to permanent storage? Thanks in advance.

@vipin-prabhakaran This is not something that we handle directly and we suggest that you use our Pro plan with our Cloud Training feature.

If you prefer to continue with Colab you can find many guides online that show how to configure a connection to your storage. You will need to ensure that the path is correctly defined in yolo settings. You can read the docs to see how to configure your settings:

https://docs.ultralytics.com/quickstart/?h=settings#modifying-settings

Here are a few Colab posts I just sourced now:

https://towardsdatascience.com/different-ways-to-connect-google-drive-to-a-google-colab-notebook-pt-1-de03433d2f7a
https://stackoverflow.com/questions/64808087/how-do-i-save-files-from-google-colab-to-google-drive
https://www.youtube.com/watch?v=6UnCrulz-fE
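As a rough illustration of what those guides describe, the sketch below mounts Google Drive when run inside Colab and copies any `.pt` files to it. `google.colab.drive.mount` is the standard Colab call; the helper names and the paths `/content/runs` and `/content/drive/MyDrive/hub_backup` are assumptions chosen for illustration, not values taken from HUB.

```python
import shutil
from pathlib import Path


def mount_drive_if_in_colab(mount_point="/content/drive"):
    """Mount Google Drive when running in Colab; return False elsewhere."""
    try:
        from google.colab import drive  # only importable inside Colab
    except ImportError:
        return False
    drive.mount(mount_point)
    return True


def backup_weights(src_root, dst_root, pattern="*.pt"):
    """Copy matching files from src_root to dst_root, keeping subfolders.

    Hypothetical helper: files already inside dst_root are skipped so the
    backup folder is never copied into itself.
    """
    src_root = Path(src_root).resolve()
    dst_root = Path(dst_root).resolve()
    copied = []
    for src in sorted(src_root.rglob(pattern)):
        if not src.is_file() or dst_root in src.parents:
            continue
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied


if __name__ == "__main__":
    # Assumed paths: adjust to wherever your run actually writes weights.
    if mount_drive_if_in_colab():
        for path in backup_weights("/content/runs", "/content/drive/MyDrive/hub_backup"):
            print("backed up", path)
```

Running the backup periodically during training, for example from a second cell, keeps a copy of the latest weights outside the temporary runtime. The settings page linked above also documents a `runs_dir` entry that could be pointed at the mounted folder instead of copying; whether HUB-driven training honours it is not confirmed in this thread.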

Hi @kalenmike, thank you so much for your quick responses. Can you please tell me where in my account I should check to make sure that my Ultralytics HUB training is properly connected to the Cloud Pro URL or any kind of permanent storage?

@vipin-prabhakaran You can follow this document for getting started with Cloud Training and upgrading to HUB Pro:

https://docs.ultralytics.com/hub/cloud-training/

@kev0051 @kalenmike I have completed my training using the Ultralytics HUB web portal, without using scripts. It shows 100% completion, but after training it still says it got disconnected and offers to resume from epoch 72 (the total number of epochs is 75) for $0.74 per hour, and that option is not clickable. After optimizing the weights it reported a disconnect, and since this morning it has been stuck at "Preparing your cloud instance......" for hours. Can you please help me with this? It is very urgent for me and I have no idea what is happening.

@vipin-prabhakaran Yes, I can help. Can you supply me with your model ID? You can find it in the URL on the model page.

👋 Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO 🚀 and Vision AI ⭐