Adding training and validation accuracy to the training process #264
Comments
There are two ways to do that. You can split the dataset into training and validation sets inside the code, or pass two separate datasets, one for training and one for validation, when you call the flow module. Either way, you need to add some new parameters in the defaults.py file, modify the functions _batch, parse and shuffle in data.py (in both the yolo and yolov2 folders), and modify the train() method in flow.py. In train() you only have to run another batch (every iteration, or once every N iterations) using the same TensorFlow session, but without running the train_op, so the weights are not modified. You can also add another tf.summary.FileWriter for validation so you can visualize the validation loss graph in TensorBoard. I personally chose to pass two different datasets. It was pretty straightforward. I hope I was clear enough. |
@Costyv95 Can you share your code with the added parameters and the changes that you have suggested? |
Yes, no problem. I will upload the files here. If you have any questions, just ask. |
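To illustrate the point about running a validation batch without touching the weights, here is a minimal, framework-free sketch. It is not darkflow code: the one-weight model and the names `loss_and_grad`, `train_step` and `validation_step` are hypothetical, chosen only to show that a validation pass computes the loss and skips the update that a training pass applies.

```python
# Hypothetical toy model y = w * x with a mean-squared-error loss.
# None of these names come from darkflow; they only illustrate the idea.

def loss_and_grad(w, batch):
    """Return the mean squared error over the batch and its gradient w.r.t. w."""
    n = len(batch)
    loss = sum((w * x - y) ** 2 for x, y in batch) / n
    grad = sum(2 * (w * x - y) * x for x, y in batch) / n
    return loss, grad

def train_step(w, batch, lr=0.01):
    """Compute the loss AND apply a gradient update (the analogue of running train_op)."""
    loss, grad = loss_and_grad(w, batch)
    return w - lr * grad, loss

def validation_step(w, batch):
    """Compute the loss only; the weight is never modified."""
    loss, _ = loss_and_grad(w, batch)
    return loss

train_batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
val_batch = [(4.0, 8.0), (5.0, 10.0)]

w = 0.0
for step in range(1, 4):
    w, train_loss = train_step(w, train_batch)   # weight changes here
    w_before_val = w
    val_loss = validation_step(w, val_batch)     # weight is untouched here
    print(f"step {step} - loss {train_loss:.4f} - val loss {val_loss:.4f}")
```

In a TensorFlow 1.x session the same separation would presumably come from fetching only the loss tensor for the validation batch, as the comment above describes.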
@Costyv95 Does the validation set contribute to the gradient update in your implementation? |
I got it, validation samples do not contribute to the gradient update. |
Yes, validation is only for a preview of the model results outside the training set. |
@Costyv95 I just want to know how to run it after modifying the original code. Thanks very much! |
@Costyv95 I run it like this ./flow --model cfg/yolo.cfg --train --dataset "/home/thinkjoy/lwl/modify-darkflow-master/data/VOCdevkit/VOC2007/JPEGImages" --annotation "/home/thinkjoy/lwl/modify-darkflow-master/data/VOCdevkit/VOC2007/Annotations" --gpu 1.0 |
This happens because the code I gave you includes some modifications for an adaptive learning rate, and there is one more change you have to make. You can find it here: 124d55d. You should also add --val_dataset and val_annotation to the arguments to get a validation loss. |
@Costyv95 Can we control how many steps pass between validations? I think validating on every step wastes training time. Thanks! |
@Costyv95 And have you managed to add accuracy during validation? |
@dream-will For validation once every N steps, you can easily add an argument (val_steps) in defaults.py, and in the train method in flow.py run the code that comes after the "#validation time" comment inside an if statement like this: ` #validation time
In the defaults.py just add this line:
I don't quite get the second question about adding the accuracy. |
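The snippet from the comment above did not survive in this thread, so the following is only a sketch of the gating idea, not the author's code. The helper `should_validate` is hypothetical, and treating `val_steps <= 0` as "never validate" is an assumption made here for safety.

```python
def should_validate(step, val_steps):
    """Return True on every val_steps-th training step (hypothetical helper).

    Assumption: val_steps <= 0 disables validation entirely.
    """
    return val_steps > 0 and step % val_steps == 0

val_steps = 5  # would come from the new val_steps argument
validated_at = []
for step in range(1, 21):
    # ... the normal training batch would run here ...
    if should_validate(step, val_steps):
        # ... the validation batch (loss only, no weight update) would run here ...
        validated_at.append(step)

print("validation ran at steps:", validated_at)
```

With this gate, validation costs one extra batch every `val_steps` iterations instead of one per iteration.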
@Costyv95 Thanks for your answer. The second question means: when validating, can we get the validation accuracy as well as the validation loss? |
@dream-will For that you have to implement a custom accuracy method yourself that compares the ground-truth (GT) bboxes with the predicted bboxes (to get the predicted bboxes, see the code used in prediction), but I don't see a reason for that because the loss is enough. Be aware that the validation loss you see is computed only on a random mini-batch from the validation set, but it represents the testing loss very well when the validation dataset is big enough. |
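As a rough illustration of such a custom accuracy method, here is a self-contained sketch. It assumes boxes are `(xmin, ymin, xmax, ymax)` tuples and defines "accuracy" as the fraction of ground-truth boxes matched by a same-label prediction with IoU of at least 0.5. Both the box format and that definition are assumptions, not something darkflow provides, and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def detection_accuracy(gt, preds, iou_threshold=0.5):
    """Fraction of GT boxes matched by an unused prediction with the same label.

    gt and preds are lists of (label, box). Each prediction can match at most
    one GT box; the best-overlapping candidate is chosen greedily.
    """
    if not gt:
        return 0.0
    used = set()
    hits = 0
    for gt_label, gt_box in gt:
        best_index, best_score = None, iou_threshold
        for j, (pred_label, pred_box) in enumerate(preds):
            if j in used or pred_label != gt_label:
                continue
            score = iou(gt_box, pred_box)
            if score >= best_score:
                best_index, best_score = j, score
        if best_index is not None:
            used.add(best_index)
            hits += 1
    return hits / len(gt)

gt = [("dog", (0, 0, 10, 10)), ("cat", (20, 20, 30, 30))]
preds = [("dog", (1, 1, 10, 10)), ("cat", (50, 50, 60, 60))]
accuracy = detection_accuracy(gt, preds)
print("accuracy:", accuracy)
```

This measure ignores false positives, so it behaves more like recall than a full detection metric such as mAP; it is meant only as a starting point.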
@Costyv95 ok,thanks |
Hi @Costyv95. I'm having a problem outputting the val loss values. I modified all the files following your instructions and code. These are the errors: |
Can you print the value of the file variable? |
@Costyv95 no I can't. This is what I run: |
What I meant by the "file variable" is the variable used at line 36 in misc.py, because I cannot really understand what's wrong with your code. |
@Costyv95 Hi, I copied and pasted your files from diff.zip, then I tried to train with the command
But I still got an error: Parsing ./coke/yolo-coke-2c.cfg |
@khanh1412 P.S. It's a temporary solution. |
@Costyv95 Enter training ... |
…u#264) - the png -> jpg bug is solved - the wrong usage of glob is solved
Hi @Costyv95, where should I add the tf.summary.FileWriter for validation so I can visualize the validation loss graph in TensorBoard? Thanks |
@Costyv95 I tried your zip file, diff.zip. But the terminal tells me that --val_dataset is an invalid argument. Do I need to change other files? |
You should replace all the files, including the yolo-data and yolov2-data ones. Copy each of them into its corresponding folder and rename it to just "data" so that it replaces the existing file there. |
@KhanhHH |
Thank you so much!!!!!!!!
On May 25, 2019, at 3:11 AM, Jack wrote:
@KhanhHH Add the line “self.define('labels', 'labels.txt', 'path to labels file')” to "def setDefaults(self):" in "darkflow\defaults.py"; then you can use "--labels xxx.txt" as before.
|
@Costyv95 |
thanks @Costyv95! |
Hi @Costyv95, YOLO trains and outputs the validation loss, but after 1000 steps it throws an error: FileNotFoundError: [Errno 2] No such file or directory: 'gsutil': 'gsutil'. |
@akmeraki Hello, I ran into the same problem. Did you find a solution for this error? |
Hey guys. checkpoint I would appreciate the help guys. |
@Costyv95 |
It doesn't work for me. I get the error below: |
During training:
step 1 - loss 240.92623901367188 - moving ave loss 240.92623901367188
step 2 - loss 241.2866668701172 - moving ave loss 240.96228179931643
step 3 - loss 239.79562377929688 - moving ave loss 240.84561599731447
How do I add training accuracy and validation accuracy?
step 1 - loss 240.92623901367188 - moving ave loss 240.92623901367188 - train 0.221
step 2 - loss 241.2866668701172 - moving ave loss 240.96228179931643 - train 0.222
step 3 - loss 239.79562377929688 - moving ave loss 240.84561599731447 - train 0.223
Finished 1 Epoch, validation 0.210
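For reference, the "moving ave loss" values in the log above are numerically consistent with an exponential moving average that keeps 0.9 of the previous average and adds 0.1 of the new loss. The 0.9 factor is inferred from the three logged values, not taken from the darkflow source, and `update_moving_average` is a hypothetical helper. A training or validation accuracy could be smoothed and printed in the same way.

```python
def update_moving_average(previous, value, decay=0.9):
    """Exponential moving average; the first value initialises the average.

    decay=0.9 is an inference from the logged numbers, not a confirmed setting.
    """
    if previous is None:
        return value
    return decay * previous + (1.0 - decay) * value

# Per-step losses copied from the training log in this issue.
losses = [240.92623901367188, 241.2866668701172, 239.79562377929688]

moving = None
history = []
for step, loss in enumerate(losses, start=1):
    moving = update_moving_average(moving, loss)
    history.append(moving)
    print(f"step {step} - loss {loss} - moving ave loss {moving}")
```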