ClearML experiment tracking integration #8620


thepycoder
Contributor

@thepycoder thepycoder commented Jul 18, 2022

This PR adds integration with the open-source experiment tracker ClearML. Installing the package (pip install clearml) enables the integration and lets users track every training run in ClearML. Users can then keep track of different experiments, compare them to see the differences, and even run an experiment remotely (using the instructions in the ClearML README).

Features

Experiment Tracking and Comparison

After installing and initializing ClearML (with the clearml-init CLI command), one can run train.py with any desired configuration and the ClearML integration will be enabled automatically.

ClearML scalars dashboard

What will be captured:

  • Source code + uncommitted changes
  • Installed packages
  • (Hyper)parameters
  • Model files (use --save-period n to save a checkpoint every n epochs)
  • Console output
  • Scalars (mAP_0.5, mAP_0.5:0.95, precision, recall, losses, learning rates, ...)
  • General info such as machine details, runtime, creation date etc.
  • All produced plots such as label correlogram and confusion matrix
  • Images with bounding boxes per epoch
  • Mosaic per epoch
  • Validation images per epoch
  • ...

Each of these metrics can easily be compared between multiple experiments using the ClearML web console.
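As a rough illustration of how per-epoch scalars like the ones listed above could reach ClearML, here is a minimal sketch. It is not the code from this PR: the helper names split_metric_name, report_epoch_metrics and start_clearml_task, the project and task names, and the "group/name" metric key format are all assumptions made for the example. The only ClearML calls used are Task.init, Task.connect, Task.get_logger and Logger.report_scalar.

```python
from typing import Dict, Tuple


def split_metric_name(name: str) -> Tuple[str, str]:
    """Split 'metrics/mAP_0.5' into a (title, series) pair for one ClearML plot.

    Keys without a '/' are grouped under a hypothetical 'general' title.
    """
    title, sep, series = name.partition("/")
    return (title, series) if sep else ("general", name)


def report_epoch_metrics(logger, metrics: Dict[str, float], epoch: int) -> int:
    """Send every metric of one epoch to a ClearML-style logger.

    `logger` only needs a report_scalar(title, series, value, iteration)
    method, so any object providing it works. Returns the number of scalars sent.
    """
    for name, value in metrics.items():
        title, series = split_metric_name(name)
        logger.report_scalar(title=title, series=series, value=float(value), iteration=epoch)
    return len(metrics)


def start_clearml_task(hyperparameters: dict):
    """Create a ClearML task and attach hyperparameters.

    Requires `pip install clearml` and a configured server (`clearml-init`).
    """
    from clearml import Task  # imported lazily so the helpers above work without ClearML

    task = Task.init(project_name="YOLOv5", task_name="training")  # hypothetical names
    task.connect(hyperparameters)  # makes the values visible and editable on the task
    return task, task.get_logger()
```

With ClearML configured, something like `task, logger = start_clearml_task({"lr0": 0.01})` followed by `report_epoch_metrics(logger, {"metrics/mAP_0.5": 0.42}, epoch=0)` would produce one point on a "metrics" plot.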

Versioned Dataset Support

Users will be able to provide a ClearML dataset version as part of the YOLOv5 command line interface for training. The dataset is downloaded, or taken from the local cache, and then used for training.

ClearML Dataset Interface
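The download-or-cache behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: the helper names parse_clearml_dataset_id and resolve_dataset_path are hypothetical, and the "clearml://<dataset_id>" prefix is assumed as the way a dataset version is passed on the command line. Dataset.get and Dataset.get_local_copy are ClearML SDK calls.

```python
CLEARML_PREFIX = "clearml://"  # assumed prefix marking a ClearML dataset ID


def parse_clearml_dataset_id(data_arg: str):
    """Return the dataset ID if `data_arg` looks like 'clearml://<id>', else None."""
    if not data_arg.startswith(CLEARML_PREFIX):
        return None
    dataset_id = data_arg[len(CLEARML_PREFIX):].strip()
    if not dataset_id:
        raise ValueError("empty ClearML dataset ID")
    return dataset_id


def resolve_dataset_path(data_arg: str) -> str:
    """Return a local path: the argument itself, or a cached copy of a ClearML dataset."""
    dataset_id = parse_clearml_dataset_id(data_arg)
    if dataset_id is None:
        return data_arg  # ordinary path, e.g. a dataset YAML file

    from clearml import Dataset  # imported lazily; only needed for ClearML datasets

    # get_local_copy() downloads this dataset version, or reuses the local cache
    return Dataset.get(dataset_id=dataset_id).get_local_copy()
```

Keeping the parsing separate from the download means plain file paths never touch ClearML at all, which is one way to leave existing workflows unaffected.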

Hyperparameter Optimization

A standalone script is provided that lets users run hyperparameter optimization (HPO) on YOLOv5, either locally or in the cloud.
HPO
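To give a feel for what such an HPO script involves, here is a hedged sketch built on ClearML's HyperParameterOptimizer. It is not the script shipped in this PR: the search space, the "Hyperparameters/..." parameter names, the objective metric title and series, and the helper names validate_search_space and build_optimizer are assumptions chosen for illustration. The Optuna backend additionally requires the optuna package.

```python
# Hypothetical search space: (parameter name, lower bound, upper bound)
SEARCH_SPACE = [
    ("Hyperparameters/lr0", 1e-5, 1e-1),
    ("Hyperparameters/momentum", 0.6, 0.98),
    ("Hyperparameters/weight_decay", 0.0, 0.001),
]


def validate_search_space(space):
    """Check that every range has a name and lower < upper; return the names."""
    names = []
    for name, low, high in space:
        if not name or low >= high:
            raise ValueError(f"invalid range for {name!r}: [{low}, {high}]")
        names.append(name)
    return names


def build_optimizer(base_task_id: str, space=SEARCH_SPACE, total_max_jobs: int = 20):
    """Build an optimizer that clones `base_task_id` with sampled hyperparameters."""
    # Lazy imports: only needed when an optimizer is actually built
    from clearml.automation import HyperParameterOptimizer, UniformParameterRange
    from clearml.automation.optuna import OptimizerOptuna

    validate_search_space(space)
    return HyperParameterOptimizer(
        base_task_id=base_task_id,  # ID of an existing training task used as template
        hyper_parameters=[
            UniformParameterRange(name, min_value=low, max_value=high)
            for name, low, high in space
        ],
        objective_metric_title="metrics",   # assumed plot title of the objective
        objective_metric_series="mAP_0.5",  # assumed series within that plot
        objective_metric_sign="max",        # maximise the objective
        optimizer_class=OptimizerOptuna,
        max_number_of_concurrent_tasks=1,
        total_max_jobs=total_max_jobs,
    )
```

A run would typically create a controller task with Task.init first, then call `optimizer = build_optimizer("<template task id>")`, `optimizer.start_locally()`, `optimizer.wait()` and `optimizer.stop()`. Using `optimizer.start()` instead hands the trials to ClearML agents, which is how the same script could run remotely.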

🛠️ PR Summary

Made with ❤️ by Ultralytics Actions

🌟 Summary

Enhanced YOLOv5 with ClearML integration for advanced ML experiment tracking and management.

📊 Key Changes

  • Added recommendation for using Weights & Biases Logging.
  • Integrated ClearML tool for automatic experiment tracking and dataset versioning.
  • Modified README to introduce ClearML integration with its features and links.
  • Updated requirements to suggest the installation of ClearML.
  • Tweaked train.py to support data logging for ClearML.
  • Enhanced Jupyter notebook tutorial to include ClearML setup and usage guide.
  • Updated general.py to recognize 'clearml://' dataset IDs for dataset versioning.
  • Introduced ClearML logger in utils/loggers/__init__.py.
  • Provided a detailed README for ClearML integration at utils/loggers/clearml/README.md.
  • Crafted a separate ClearML utility module at utils/loggers/clearml/clearml_utils.py.
  • Included a standalone script for hyperparameter optimization (HPO) using ClearML at utils/loggers/clearml/hpo.py.
  • Adjusted wandb_utils.py to harmonize with ClearML dataset structures.
  • Modified metrics.py and plots.py to add plot titles for clarity.

🎯 Purpose & Impact

  • 🚀 Purpose: Incorporate ClearML for better experiment tracking, dataset versioning, and model management, offering an alternative to Weights & Biases.
  • 🎛 Impact:
    • Users gain the ability to trace training runs comprehensively with live updates and detailed metrics.
    • Offers additional capabilities like uncommitted code tracking and reproducibility across machines.
    • Expands dataset management by allowing users to version their datasets using ClearML dataset IDs.
    • Enhances the existing benchmarking system with an exhaustive hyperparameter optimization script.
    • Facilitates easier overview and analysis of experiments through well-organized dashboards and logging systems within the ClearML environment.

Contributor

@github-actions github-actions bot left a comment

👋 Hello @thepycoder, thank you for submitting a YOLOv5 🚀 PR! To allow your work to be integrated as seamlessly as possible, we advise you to:

  • ✅ Verify your PR is up-to-date with upstream/master. If your PR is behind upstream/master an automatic GitHub Actions merge may be attempted by writing /rebase in a new comment, or by running the following code, replacing 'feature' with the name of your local branch:
git remote add upstream https://github.com/ultralytics/yolov5.git
git fetch upstream
# git checkout feature  # <--- replace 'feature' with local branch name
git merge upstream/master
git push -u origin -f
  • ✅ Verify all Continuous Integration (CI) checks are passing.
  • ✅ Reduce changes to the absolute minimum required for your bug fix or feature addition. "It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is." -Bruce Lee

@glenn-jocher
Member

glenn-jocher commented Aug 5, 2022

@thepycoder I've reviewed and updated the PR as best I could. I'll merge now, but please review and verify I haven't broken any functionality, and please work on updates to streamline ClearML ops as mentioned over email so we can improve the user experience for ClearML logging.

FYI inserted links for improved analytics for us:
Screen Shot 2022-08-05 at 8 38 36 PM

@glenn-jocher
Member

@thepycoder I've removed the plot titles from labels.png and labels_correlogram.png as they were applied on the last subplot only. All other plot titles are nice additions, thanks!

labels_correlogram
labels

@glenn-jocher glenn-jocher merged commit 378bde4 into ultralytics:master Aug 5, 2022
@glenn-jocher
Member

@thepycoder PR is merged. Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐

@glenn-jocher glenn-jocher removed the TODO label Aug 5, 2022
@glenn-jocher
Member

@thepycoder I noticed the light/dark mode graphics look good in GitHub (very clever), but unfortunately do not extend well to other platforms with automatic README views that we use like Docker Hub and Paperspace Gradient (see links below). Can you please provide a single graphic for use here? Maybe the light mode graphic with an opaque white inside the circle would work.

ctjanuhowski pushed a commit to ctjanuhowski/yolov5 that referenced this pull request Sep 8, 2022
* Add titles to matplotlib plots

* Add ClearML Experiment Tracking integration.

* Add ClearML Data Version Management automatic download when requested

* Add ClearML Hyperparameter Optimization

* ClearML save period integration

* Fix wandb breaking when used with ClearML dataset

* Fix wandb breaking when used with ClearML resume and dataset

* Add ClearML documentation

* fixed small bug in clearml integration that misreports epoch number

* Final ClearMl additions before refactor

* Add correct epoch reporting

* Add remote execution and autoscaling docs for ClearML integration

* Added images to clearml integration docs

* fixed logo alignment bug and added hpo screenshot clearml

* Fixed small epoch number bug in clearml integration

* Remove saved model flush clearml

* Cleanup clearml readme section

* Cleaned up clearml logger docstring

* Remove resume readme section clearml

* Clearml integration cleanup

* Updated ClearML documentation

* Added dark vs light icons ClearML Readme

* Clearml Readme styling

* Add better gifs

* Fixed gif file size

* Add better images in tutorial notebook

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Addressed comments in PR ultralytics#8620

* Fixed circular import

* Fixed circular import

* Update tutorial.ipynb

* Update tutorial.ipynb

* Inline comment

* Restructured tutorial notebook

* Add correct ClearML link to README

* Update tutorial.ipynb

* Update general.py

* Update __init__.py

* Update __init__.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update __init__.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Update __init__.py

* Update README.md

* Update __init__.py

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* spelling

* Update tutorial.ipynb

* notebook cutt.ly links

* Update README.md

* Update README.md

* cutt.ly links in tutorial

* Removed labels as they show up on last subplot only

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Glenn Jocher <[email protected]>
@Robotatron

Robotatron commented Jan 8, 2023

Will ClearML be supported for segmentation as well? If so, any ETA on this? @thepycoder @glenn-jocher

@thepycoder
Contributor Author

Hey @Robotatron! What features would you like to see specifically? I can def take a look :)

@Robotatron

Robotatron commented Jan 11, 2023

> Hey @Robotatron! What features would you like to see specifically? I can def take a look :)

Hey @thepycoder, thanks for showing interest. The features would be all the common stuff you can think of (probably the same stuff you log with object detection for ClearML already), in order of importance:

  1. The current YOLOv5 segmentation TensorBoard scalar metrics, visualized as graphs in ClearML: losses and segmentation mAPs (AP50:95, AP50, etc.), i.e. the ones YOLOv5 logs into the "results.csv" file
  2. opt.yaml, the config of the run
  3. If possible, per class mAP metrics (we get those printed in the console when running inference, but not during training sadly)
  4. Inference images on the test set (the one we get with YOLO automatically in the experiment folder)
  5. Train batch images (the one we get with YOLO automatically in the experiment folder)
  6. (this will be probably too much, but prediction masks on the test set in the COCO format)
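The first item in the list above, turning the rows of results.csv into plottable scalars, can be sketched like this. It is a minimal illustration rather than the logger's actual code: the function name iter_scalars is hypothetical, and the assumed file layout (a space-padded header row containing an "epoch" column, with "group/name" metric columns) should be checked against a real results.csv.

```python
import csv
import io


def iter_scalars(csv_text: str):
    """Yield (title, series, value, epoch) for each metric cell of a results.csv-like text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [h.strip() for h in next(reader)]  # column names may be space-padded
    epoch_col = header.index("epoch")
    for row in reader:
        if not row:
            continue  # skip blank lines
        cells = [c.strip() for c in row]
        epoch = int(float(cells[epoch_col]))
        for name, cell in zip(header, cells):
            if name == "epoch" or not cell:
                continue
            # 'train/box_loss' -> plot title 'train', series 'box_loss'
            title, sep, series = name.partition("/")
            if not sep:
                title, series = "general", name
            yield title, series, float(cell), epoch
```

Each yielded tuple maps directly onto the arguments of ClearML's Logger.report_scalar(title, series, value, iteration), so a loop over iter_scalars(open(path).read()) would be enough to replay a finished run into the scalars dashboard.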

@thepycoder
Contributor Author

@Robotatron , please check out #10752 and let me know if it works for you! :)

@Robotatron

@thepycoder Everything works great, thank you!
Would you know if I have to configure ClearML or edit your logger to also log the best model under "artefacts"? I am new to ClearML so could be a stupid question :)

@thepycoder
Contributor Author

@Robotatron The best model should always be logged under the artifacts tab. If not, there's a bug somewhere.
To get the latest model too, you'll have to set a save interval using the YOLOv5 arguments themselves (--save-period n) :)

Labels
  • documentation: Improvements or additions to documentation
  • enhancement: New feature or request
4 participants