Problems in replicating the performance #12

Open
vvigilante opened this issue Sep 11, 2018 · 1 comment
Comments

@vvigilante

Hello!
I'm trying to replicate your experiment on a Raspberry Pi 3 with Darknet19, built with NNPACK=1, ARM_NEON=1, NNPACK_FAST=1. All other switches are off (GPU, CUDNN, OPENCV, OPENMP, DEBUG, QPU_GEMM).

I should get: 1.3 seconds (first frame), 0.66 seconds (subsequent frames)
But I get: 6 seconds for the first frame and 3 seconds for the subsequent ones, which is about 4.5 times slower.
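
For reference, the slowdown can be checked with a quick calculation. This is only a sketch using the timings quoted above; the variable names are made up for illustration and are not part of the project.

```python
# Timings in seconds, copied from the numbers quoted in this post.
expected = {"first": 1.3, "subsequent": 0.66}   # published benchmark
measured = {"first": 6.0, "subsequent": 3.0}    # what I observe

# Slowdown factor = measured time / expected time, per frame type.
slowdown = {frame: measured[frame] / expected[frame] for frame in expected}

for frame, factor in slowdown.items():
    print(f"{frame} frame: {factor:.2f}x slower")
```

Both factors come out close to 4.5, so the gap is consistent across the first and subsequent frames rather than being a one-off warm-up cost.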

Do you think I'm missing some step in configuration? Did I set the wrong switches? Do I need to modify something in particular in NNPACK's init.c?

Thank you!

@shizukachan
Owner

shizukachan commented Sep 12, 2018

I think I derped. On a Pi3 @ 900MHz:
With the 2017-12-26 commit I got 6.44/3.33 on Darknet19 (224x224)
With the 2018-05-31 commit I got 7.02/3.88 on Darknet19 (256x256)

I get 2.10/1.05 on Darknet (256x256)... oops, I mislabeled the benchmark results!
The update to the yolov3 tree also updated all of the cfg files, so the benchmark results (other than yolov3) are all likely out of date: the cfg files now look like they are all 256x256, whereas I tested on 224x224.
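
As a rough sanity check on the resolution change, here is a small sketch comparing the pixel-count ratio with the measured ratios. It assumes inference cost grows roughly with the number of input pixels, which is only an approximation, and the two measurements come from different commits, so this is not a controlled comparison. The helper name `pixel_ratio` is hypothetical.

```python
def pixel_ratio(new_side: int, old_side: int) -> float:
    """Ratio of input pixel counts for two square input resolutions."""
    return (new_side / old_side) ** 2

# 256x256 input versus 224x224 input.
ratio = pixel_ratio(256, 224)

# Measured Darknet19 timings in seconds (first / subsequent frame),
# taken from the two commits listed above.
measured_first = 7.02 / 6.44
measured_subsequent = 3.88 / 3.33

print(f"pixel ratio:               {ratio:.3f}")
print(f"measured first-frame ratio: {measured_first:.3f}")
print(f"measured subsequent ratio:  {measured_subsequent:.3f}")
```

The measured increase is smaller than the pixel-count ratio, so the resolution change plausibly accounts for the difference between the two commits, but not for numbers that differ by a factor of several.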
