Tools to label data #327

Closed
1 task done
hupo376787 opened this issue Jul 16, 2023 · 11 comments
Labels
question A HUB question that does not involve a bug

Comments

@hupo376787

Search before asking

Question

Hi, I'm a newbie in AI. I found that this platform can train models using existing datasets, which is great.
But where does the dataset text come from? I have a lot of images, so what tool do I need to label them?

Any advice is welcome.

image

Additional

No response

@hupo376787 hupo376787 added the question A HUB question that does not involve a bug label Jul 16, 2023
@github-actions

👋 Hello @hupo376787, thank you for raising an issue about Ultralytics HUB 🚀! Please visit our HUB Docs to learn more:

  • Quickstart. Start training and deploying YOLO models with HUB in seconds.
  • Datasets: Preparing and Uploading. Learn how to prepare and upload your datasets to HUB in YOLO format.
  • Projects: Creating and Managing. Group your models into projects for improved organization.
  • Models: Training and Exporting. Train YOLOv5 and YOLOv8 models on your custom datasets and export them to various formats for deployment.
  • Integrations. Explore different integration options for your trained models, such as TensorFlow, ONNX, OpenVINO, CoreML, and PaddlePaddle.
  • Ultralytics HUB App. Learn about the Ultralytics App for iOS and Android, which allows you to run models directly on your mobile device.
    • iOS. Learn about YOLO CoreML models accelerated on Apple's Neural Engine on iPhones and iPads.
    • Android. Explore TFLite acceleration on mobile devices.
  • Inference API. Understand how to use the Inference API for running your trained models in the cloud to generate predictions.

If this is a 🐛 Bug Report, please provide screenshots and steps to reproduce your problem to help us get started working on a fix.

If this is a ❓ Question, please provide as much information as possible, including dataset, model, environment details etc. so that we might provide the most helpful response.

We try to respond to all issues as promptly as possible. Thank you for your patience!

@hupo376787
Author

Found it.
YoloLabel

@glenn-jocher
Member

@hupo376787 I'm glad you found a tool for your needs!

The tool you mentioned, YoloLabel, is indeed widely used for labeling data in YOLO format. It lets you draw bounding boxes around objects of interest in images and generates the corresponding annotation files, which makes it a helpful utility for creating labeled datasets to train YOLO models.

Before you go ahead, I would like to suggest you check the quality of your labels regularly during the labeling process, especially if you are new to this. Accurate and consistent labels are very important for training a reliable model. Also, be sure to split your labeled images into training, validation, and testing datasets, so you can effectively train and evaluate your model.
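
The train/validation/test split mentioned above can be sketched in a few lines of Python. This is only a minimal illustration, not part of HUB: the function name `split_dataset`, the 80/10/10 ratio, and the file names are assumptions made for the example.

```python
import random


def split_dataset(image_paths, val_frac=0.1, test_frac=0.1, seed=0):
    """Shuffle image paths reproducibly and split them into train/val/test lists."""
    paths = sorted(image_paths)         # sort first so the result does not depend on input order
    random.Random(seed).shuffle(paths)  # seeded shuffle keeps the split reproducible
    n_val = int(len(paths) * val_frac)
    n_test = int(len(paths) * test_frac)
    return {
        "val": paths[:n_val],
        "test": paths[n_val:n_val + n_test],
        "train": paths[n_val + n_test:],
    }


if __name__ == "__main__":
    # Hypothetical file names; in practice these would be your labeled images.
    images = [f"img_{i:03d}.jpg" for i in range(20)]
    for name, items in split_dataset(images).items():
        print(name, len(items))
```

Each image's label file should follow its image into the same split, so that every image stays paired with its annotations.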

Happy labeling and best of luck with your AI journey! If you have any more questions down the line, feel free to reach out.

@hupo376787
Author

Hi, Glenn. Thanks for your advice. I used YoloLabel for a while and found that it can only draw bounding boxes, which makes it an efficient tool. I also heard that painting labels with a brush gives more accurate data for training models, though it takes the labeler a little more time.

So do you have any brush label tools recommended?

Thanks.

@glenn-jocher
Member

@hupo376787 hello,

I'm glad to hear that you're exploring different means of labeling data and considering different tools. You're absolutely right: while bounding-box-based labels (like those created with YoloLabel) are indeed efficient, more precise labels - such as segmentation masks created by "painting" over the objects of interest with a brush tool - can potentially lead to better model performance, especially in applications which require a detailed understanding of object shapes.

A tool worth exploring is Labelbox, which has the brush tool you're looking for. It lets you paint over your object, creating more sophisticated segmentation masks as opposed to just boxes. For an open-source solution, you might want to look into Labelme, which also supports polygon annotations.

Keep in mind that this added complexity takes extra time and effort, during both the labeling process and the model training phase. Whether the improved performance from more advanced annotations is worth this additional investment depends on your specific use case.

I hope this assists you in making an informed decision. If you need further guidance, feel free to ask. Happy labeling!

@hupo376787
Author

Hi, Glenn.

I found that different labeling tools generate different label data formats.
For example, YoloLabel's label data is

0 0.675585 0.690574 0.434783 0.520492
1 0.408584 0.262295 0.299889 0.418033

But for Labelme, the data is like

{
  "version": "4.5.9",
  "flags": {},
  "shapes": [
    {
      "label": "1",
      "points": [
        [
          244.2686567164179,
          232.25373134328356
        ],
        [
          579.3432835820895,
          471.8059701492537
        ]
      ],
      "group_id": null,
      "shape_type": "rectangle",
      "flags": {}
    },
    {
      "label": "2",
      "points": [
        [
          144.2686567164179,
          18.07462686567164
        ],
        [
          386.0597014925373,
          243.44776119402985
        ]
      ],
      "group_id": null,
      "shape_type": "rectangle",
      "flags": {}
    }
  ],
  "imagePath": "000000000009.jpg",
  "imageData": "",
  "imageHeight": 480,
  "imageWidth": 640
}

So if I use Ultralytics HUB to train on my data, it requires a specific data format, which means I can only use YoloLabel.
image

Am I correct?

@glenn-jocher
Member

@hupo376787 hello!

That's right about the format! Ultralytics HUB primarily supports the YOLO format for annotated training data, so labeling tools such as YoloLabel that produce this output directly are suitable. However, this does not make other tools unusable.

Tools like Labelme generate output in their own distinct format, as you've pointed out. The important point is that the raw information (object class, object coordinates, etc.) remains the same; it is just organized differently.
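
To make that concrete, here is a rough sketch that decodes one YOLO detection line (`class x_center y_center width height`, all normalized to the image size) back into the pixel corner points that a Labelme rectangle stores. The helper name `yolo_to_corners` is hypothetical, and the 640x480 image size is an assumption borrowed from the Labelme example, since the thread does not state the size for the YOLO file.

```python
def yolo_to_corners(line, img_w, img_h):
    """Convert 'class xc yc w h' (normalized) to (class_id, x1, y1, x2, y2) in pixels."""
    cls, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w  # scale horizontal values by image width
    yc, h = float(yc) * img_h, float(h) * img_h  # scale vertical values by image height
    return int(cls), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2


if __name__ == "__main__":
    # First YOLO line from the thread; the image size is assumed, not given.
    print(yolo_to_corners("0 0.675585 0.690574 0.434783 0.520492", 640, 480))
```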

If you decide to use a tool like Labelme, you would have to convert the Labelme annotations to the YOLO format before you can upload them to Ultralytics HUB for training. This conversion from one annotation format to another can be done programmatically with a script.
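
A conversion script along those lines might look like the sketch below. It is an illustration under stated assumptions rather than an official Ultralytics tool: it handles only `rectangle` shapes, the function name `labelme_to_yolo` is made up for the example, and the `class_ids` mapping from Labelme label names to YOLO class indices is something you would define for your own dataset.

```python
import json


def labelme_to_yolo(labelme, class_ids):
    """Convert rectangle shapes in a parsed Labelme dict to YOLO detection lines."""
    img_w, img_h = labelme["imageWidth"], labelme["imageHeight"]
    lines = []
    for shape in labelme["shapes"]:
        if shape["shape_type"] != "rectangle":
            continue  # this sketch handles bounding boxes only
        (xa, ya), (xb, yb) = shape["points"]
        x1, x2 = sorted((xa, xb))  # the two corners may come in either order
        y1, y2 = sorted((ya, yb))
        xc = (x1 + x2) / 2 / img_w  # normalized box center
        yc = (y1 + y2) / 2 / img_h
        w = (x2 - x1) / img_w       # normalized box size
        h = (y2 - y1) / img_h
        cls = class_ids[shape["label"]]
        lines.append(f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines


if __name__ == "__main__":
    # Trimmed-down version of the Labelme JSON posted earlier in this thread.
    sample = json.loads("""
    {"shapes": [{"label": "1", "shape_type": "rectangle",
                 "points": [[244.2686567164179, 232.25373134328356],
                            [579.3432835820895, 471.8059701492537]]}],
     "imageHeight": 480, "imageWidth": 640}
    """)
    # Hypothetical mapping: Labelme label "1" becomes YOLO class 0.
    print("\n".join(labelme_to_yolo(sample, {"1": 0, "2": 1})))
```

The resulting lines would be written to a `.txt` file with the same base name as the image, one file per image.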

I hope this answers your question. If you have more, feel free to reach out!

@hupo376787
Author

Understood, thanks for your reply.

@glenn-jocher
Member

@hupo376787, you're welcome! I'm glad to hear that you found the information helpful. If you have any other questions in the future related to data labeling, model training, or anything else about Ultralytics HUB, don't hesitate to ask. Good luck with your work!

@hupo376787
Author

Understood, thanks @glenn-jocher

@glenn-jocher
Member

@hupo376787, you're welcome! I'm glad I could help. If you have any other questions or concerns in the future, don't hesitate to reach out. Best of luck with your AI projects!
