
Fail to export CoreML model with decode layer and NMS #9694

Closed
Qwin opened this issue Oct 4, 2022 · 20 comments

@Qwin commented Oct 4, 2022

Hello everyone,

I have been working on this for days now to no avail, so I was wondering if anyone here could help me out. I first trained a model using the following dataset on Kaggle: https://www.kaggle.com/datasets/taranmarley/sptire

I used the Colab script from this GitHub repo to train my model. It works flawlessly; I can even see perfect detection of my tires with detect.py.

Now here is where the problem comes in: when I try to export the model to CoreML, everything seems to go well, but I get only one output from the model (my guess is one matrix with the results):

[screenshot: exported model showing a single output]

After the model has been exported, the script adds a decode layer to the spec and then NMS. Both steps fail with the following error:
[screenshot of the error]

My guess is that the model's output is incorrect, or that it has changed and the script I am using to export it is outdated and still expects two outputs while there is only one. If anyone here could help me get my YOLOv5 model converted correctly to CoreML, I would be really grateful.

Here is the export script that I am running:
https://colab.research.google.com/drive/1uR738UTlzI7apqeN0qr6mQ5ke_a5SKa8?usp=sharing

Here is the blog that I followed to run this script:
https://rockyshikoku.medium.com/convert-yolov5-to-coreml-also-add-a-decode-layer-113408b7a848

P.S. Please let me know if additional info is needed.

@glenn-jocher (Member) commented Oct 5, 2022

@Qwin Ultralytics HUB exports YOLOv5 pipelined CoreML models with NMS etc. See https://hub.ultralytics.com/

@glenn-jocher (Member) commented Oct 5, 2022

[screenshot]

@Qwin (Author) commented Oct 6, 2022

@glenn-jocher thank you so much, it worked!!! Is the iOS export code that the HUB uses open source, and may I take a look at it? I just want to see what I was doing wrong and what the difference is between the script I used from the blog and the one the HUB uses. I know I exported it using export.py (my guess is Ultralytics HUB does the same), but then it uses custom NMS code to get iOS values out of it, and I am curious what it is doing to manipulate the model.

I have a feeling it is because the script above expected two matrices as input, while the script the HUB uses takes one matrix to compute the missing NMS layer.
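
One way to check this is to inspect how many outputs the exported model actually has. A small sketch using the standard coremltools spec API (the model path is illustrative):

**
import coremltools as ct

# Load the exported model's spec and list its outputs
spec = ct.utils.load_spec("best.mlmodel")  # path is illustrative
for out in spec.description.output:
    print(out.name, out.type.WhichOneof("Type"))
**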

@junmcenroe commented Oct 6, 2022

Dear Qwin

I also had the same trouble with a model trained on a custom dataset, and reported it in "Export.py need train mode export for cormel model to add NMS and decode layer #9667".
There has been no response from the author yet, but I modified export.py as follows:


-model.eval()
+model.train() if train else model.eval()  # training mode = no Detect() layer grid construction

def run(
    ...
+   train=False,  # model.train() mode
    ...
)

+parser.add_argument('--train', action='store_true', help='model.train() mode')


With this change I can export the three output arrays, and I can add the NMS and decode layers as before.
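
For reference, with this patch the CoreML export would presumably be invoked with the new flag, mirroring the standard export command used elsewhere in this thread:

**
python export.py --weights YourBest.pt --include coreml --train
**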

@Qwin (Author) commented Oct 6, 2022

@junmcenroe here is the weird part: I tried the exact same thing, thinking the training mode (considering export.py was changed to remove --train) was what was causing the issues. However, when I tried patching the Python script and adding the training mode, I still got only one output matrix as the result. I must have done something wrong with the patch! But thank you so much; now I at least know WHY it was failing to add the NMS code and decode layer.

@junmcenroe commented Oct 7, 2022

@Qwin

Following is my modified export.py, for your reference.

export.py.zip

@zhaoqier

(quoting junmcenroe's export.py patch from above)

@junmcenroe Hi, I have exported my trained best.pt to best.mlmodel successfully, but without NMS, and there are four outputs right now. Could you please help with how to add the NMS and decode layer, as you described?
Thank you very much.

@junmcenroe commented Oct 13, 2022

@zhaoqier
As far as I know, the following methods can export a .mlmodel with NMS and decode layer for a custom-trained model.
Option-1) As Qwin mentioned, use the scripts from the following blog:
https://rockyshikoku.medium.com/convert-yolov5-to-coreml-also-add-a-decode-layer-113408b7a848
You can run the Google Colab scripts after replacing export.py from the original repo.
These scripts hard-code the class labels, so if you use them for your custom model you need to change the class label data manually.
If your custom model was trained with a non-standard image input size (not 640x640), you also need to change the image size parameter, which is likewise hard-coded in these scripts; see the parameterization sketch at the end of this comment.
**
featureMapDimensions = [640 // stride for stride in strides]

builder.add_scale(name=f"normalize_coordinates_{outputName}",
                  input_name=f"{outputName}_raw_coordinates",
                  output_name=f"{outputName}_raw_normalized_coordinates",
                  W=torch.tensor([1 / 640]).numpy(), b=0, has_bias=False)

builder.set_output(output_names=["raw_confidence", "raw_coordinates"],
                   output_dims=[(25200, numberOfClassLabels), (25200, 4)])

pipeline = ct.models.pipeline.Pipeline(
    input_features=[("image", ct.models.datatypes.Array(3, 460, 460)),
                    ("iouThreshold", ct.models.datatypes.Double()),
                    ("confidenceThreshold", ct.models.datatypes.Double())],
    output_features=["confidence", "coordinates"])

Note: (3, 460, 460) should be (3, 640, 640), I think.
**
Option-2) Write your own converter.py, modifying the above scripts so that the image size and class labels become arguments. It is very helpful not to have to modify them manually.
I made my own converter along these lines; see:
https://github.com/junmcenroe/YOLOv5-CoreML-Converter.git

Option-3) Use the https://github.com/mshamash/yolov5 repo, which includes NMS.
mshamash submitted a PR to the original repo, but it has still not been merged, so he built a branch. See tucan9389/ObjectDetection-CoreML#6.
**
git clone https://github.com/mshamash/yolov5
cd yolov5
git checkout fix/coreml_export_nms_layer
pip install -qr requirements.txt # install
python export.py --weights YourBest.pt --include coreml
**
This branch is 3 commits ahead of and 462 commits behind master, so it is slightly old, but with my custom model no big issues occurred.
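
As a rough illustration of Option-1 and Option-2, here is a minimal sketch of turning the hard-coded values into parameters. featureMapDimensions, strides, and numberOfClassLabels appear in the blog script quoted above; imageSize and classLabels are illustrative stand-ins for the hard-coded values:

**
imageSize = 640                        # change if your model uses a different input size
strides = [8, 16, 32]                  # YOLOv5 P3/P4/P5 strides
classLabels = ["tire"]                 # your custom class names
numberOfClassLabels = len(classLabels)
featureMapDimensions = [imageSize // stride for stride in strides]
# 3 anchors per cell on each feature map: 3 * (80*80 + 40*40 + 20*20) = 25200 at 640x640
numberOfPredictions = sum(3 * d * d for d in featureMapDimensions)
**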

@zhaoqier

Hi @junmcenroe 👍
Thanks for your reply. I finally tried Option-3 and it worked, and my iOS application runs as well!
Now I have a very strange problem that I don't know whether you have encountered:
my model works well on static images, but in real-time detection it finds no results or returns inaccurate positions. May I ask what the possible reason is?
The interesting thing is that my model also performed well in local video testing, so I am confused about what went wrong.

Anyway, thank you again sincerely!

@junmcenroe

Dear @zhaoqier

In my experience, missing results or inaccurate positions are usually caused by one of the following:

  1. The XY coordinate system of the video preview layer you are looking at differs from the XY coordinate system of the model input.
  2. The de-normalization of the model output positions, which are normalized to 0-1, is calculated incorrectly (see the sketch at the end of this comment).

Apple provides sample code here:
https://developer.apple.com/documentation/vision/recognizing_objects_in_live_capture
"Recognizing Objects in Live Capture": apply Vision algorithms to identify objects in real-time video.

I started from this code, and it now works well, even though I was confused by the video output orientation at first.

I hope this info is useful for you.
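
A minimal sketch of the de-normalization in point 2, written in Python for illustration (on iOS this would be Swift; the box layout and the top-left origin are assumptions, and Vision's normalized rects actually use a bottom-left origin, so UIKit views also need a y-flip):

**
def denormalize(box, view_w, view_h):
    # box is a normalized (0-1) [center-x, center-y, width, height]
    cx, cy, w, h = box
    left = (cx - w / 2) * view_w   # scale to preview-layer pixels
    top = (cy - h / 2) * view_h
    return left, top, w * view_w, h * view_h

print(denormalize((0.5, 0.5, 0.2, 0.3), 640, 480))  # (256.0, 168.0, 128.0, 144.0)
**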

@zhaoqier

Dear @junmcenroe:

Thanks for your reply; your suggestion was absolutely useful to me.

At first I thought it was a problem with my .mlmodel, but I eventually found it was caused by the coordinate transformation.

I have solved it, and the performance in Live Capture is now great.

Thanks for your help again.

Best Regards,

Kier

@junmcenroe

Dear @zhaoqier

Good news.

@philipperemy commented Oct 26, 2022

I adapted PR #7263 and was able to make it work.

But it's strange: does anyone know why the confidence shown in Xcode is always 100%?

I can tune conf_thres to control which bounding boxes I see, but any chosen bbox has a confidence of 1.

I checked, and my model does not output a confidence of 1 for those images.

Any help would be greatly appreciated!

[screenshot: Xcode preview showing 100% confidences]

@junmcenroe commented Oct 26, 2022

Dear @philipperemy

If the result comes from your own custom-trained model, I think a 100% confidence result is likely to happen when the preview image is one of the images used during the training phase. If you try a quite different background image in the preview, you might get a different result.

Or, try the following repo first; if you get a good result with it, something in your implementation might be wrong.


git clone https://github.com/mshamash/yolov5
cd yolov5
git checkout fix/coreml_export_nms_layer
pip install -qr requirements.txt # install
python export.py --weights YourBest.pt --include coreml


@glenn-jocher (Member) commented Oct 26, 2022

Confidences are correct in Xcode for HUB models:
[screenshot: Xcode preview of a HUB model with correct confidences]

@philipperemy commented Oct 27, 2022

@glenn-jocher Thanks, but I'd like to avoid using HUB. It should be possible to get the same output with this repository's export command. Or is there a quick way with HUB where I can upload this .pt checkpoint file and convert it to a .mlmodel?

@philipperemy

@junmcenroe Even on the test set, I always get 100% for each prediction. When I run the fix/coreml_export_nms_layer branch of mshamash/yolov5, I get this error:

AttributeError: Can't get attribute 'DetectionModel' on <module 'models.yolo' from '/private/tmp/yolov5/models/yolo.py'>

The last commit was on Sat Apr 9 12:51:04, so I guess this fork is too old compared with the latest commit of the main repo.

I might have to check out an old commit on the main repo from around April.

@junmcenroe

Hi @philipperemy

I tried the following and got the same result as with HUB:
**
git clone https://github.com/mshamash/yolov5
cd yolov5
git checkout fix/coreml_export_nms_layer
pip install -qr requirements.txt # install
curl -OL https://github.com/ultralytics/yolov5/releases/download/v6.2/yolov5s.pt
python export.py --weights yolov5s.pt --include coreml
**
"git checkout fix/coreml_export_nms_layer" should NOT apply current master blanch, should apply mishmash's blanch.
スクリーンショット 2022-10-27 21 48 43

@ztyree42 commented Nov 3, 2022

@philipperemy

In case you haven't sorted that error out, there's not much to it. You trained a custom model recently and are trying to use an export.py that is ~400 commits behind. To make it work, you need to go into models/yolo.py in that old branch and add the DetectionModel class from a models/yolo.py on master. (Maybe you can just check out the file from the master branch, but I'm not sure; I haven't tried. See the sketch below for another possible shortcut.)

The reason others are saying it works on yolov5s.pt is that that checkpoint is quite old and doesn't rely on this class.
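
Another possible shortcut, sketched under the assumption that upstream's DetectionModel is essentially a renamed Model (verify against master before relying on this): alias the old class under the new name before loading the checkpoint, so unpickling can resolve the attribute.

**
import models.yolo as yolo

# Hypothetical workaround: newer checkpoints reference
# models.yolo.DetectionModel when unpickled; this old branch only has Model.
if not hasattr(yolo, "DetectionModel"):
    yolo.DetectionModel = yolo.Model  # assumes the two classes are compatible
**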

@github-actions bot (Contributor) commented Dec 4, 2022

👋 Hello, this issue has been automatically marked as stale because it has not had recent activity. Please note it will be closed if no further activity occurs.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLOv5 🚀 and Vision AI ⭐!

github-actions bot added the Stale label on Dec 4, 2022
github-actions bot closed this as not planned (won't fix, can't repro, duplicate, stale) on Dec 14, 2022