-
-
Notifications
You must be signed in to change notification settings - Fork 15.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix list paths #721
Fix list paths #721
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @Hatovix, thank you for submitting a PR! To allow your work to be integrated as seamlessly as possible, we advise you to:
- Verify your PR is up-to-date with origin/master. If your PR is behind origin/master update by running the following, replacing 'feature' with the name of your local branch:
git remote add upstream https://github.com/ultralytics/yolov5.git
git fetch upstream
git checkout feature # <----- replace 'feature' with local branch name
git rebase upstream/master
git push -u origin -f
- Verify all Continuous Integration (CI) checks are passing.
- Reduce changes to the absolute minimum required for your bug fix or feature addition. "It is not daily increase but daily decrease, hack away the unessential. The closer to the source, the less wastage there is." -Bruce Lee
@Hatovix ah, thank you, good catch! I will look through this. |
@Hatovix maybe we should just flip the if statements so it only checks datasets with I assume only the official datasets would have download keys, and they are all single-source datasets (i.e. train and val strings). |
You are probably right about the download keys, provided by public datasets. Another think about this function (
A nonexistant path displays the associated warning, but the training only stop later, after loading previous existing paths from the list, and fails on the first problematic one, (just noticed that I think it is better to check that everything is ready to be loaded, since loading can be time consuming. |
Just realized if newcomers use the coco128.yaml template for their own dataset, they may leave the download key there with an empty string maybe Perhaps this function may need a bit of a broader rethink. |
At the end of the day the desired outcome is simply that the official datasets autodownload if not found locally, the details of the function aren't too important, but I wanted to create a structure that allows people to implement on their own custom datasets a s well, should they choose to do so. |
This should be more robust than before. I removed checking of the train paths, as some use cases involve running test.py on coco val without actually requiring you to download coco train. Also added an OS check, as I'm getting a certificate error on MacOS locally with torch download. |
CI passing, merging. Thank you for your contributionS @Hatovix. |
* Add list paths on check_dataset * missing raise statement * Update general.py Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
* Add list paths on check_dataset * missing raise statement * Update general.py Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
* Add list paths on check_dataset * missing raise statement * Update general.py Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
I noticed earlier today that
check_dataset
function was only expecting single string fromdict['train']
anddict['val']
, generating errors while providing a list of paths.π οΈ PR Summary
Made with β€οΈ by Ultralytics Actions
π Summary
Enhanced dataset auto-download and improved platform compatibility in general utilities.
π Key Changes
platform
module is imported and used.check_git_status
function to useplatform.system()
for better OS detection.check_file
function.check_dataset
function, adding support for MacOS curl and refining the auto-download script logic.π― Purpose & Impact