Add CLI params support + MobileNetV2 model + small refactoring #9
Conversation
Actually, a couple of comments.
BTW, can we require type annotations?
```python
self.dataFrame = pd.read_csv(csv_file)

self.transform = Compose([
    Resize(image_size),  # for MobileNetV2 - set image size to 256
```
If we are creating dataset-preparation logic for a specific solution (MobileNetV2), I would suggest creating a separate solution for it, rather than putting everything into one class/function.
This is indeed a good one. Do you have any idea what we could name it? We need to think about naming these kinds of combinations of dataset and model. I can't come up with a name for a combination of MobileNetV2 and MaskedFaceNet :) Naming is indeed one of the hardest parts of making software :P
I think we shouldn't couple the dataset with a particular model.
Instead, we should keep a generic dataset prep here (connected only with the dataset we are using: parsing labels, etc.).
Transforms specific to any model (or even generic ones for data augmentation) should only be applied here, not defined here. That should be easy to do if we pass all transforms in as a function argument and apply it here.
Maybe we should even go a step further and merge model-related transforms with data augmentation-related ones 🤔
I didn't have much free time last week; this week I'll try to come up with a nice file structure (after I finally decide on a collab tool).
If you want some inspiration, you can check out a template a friend of mine is building (related to his thesis, if I'm not wrong). Though I plan not to dive into it, so I stay unbiased when preparing our template.
Also a quick observation: almost always, when something is difficult to name, it means we are doing something wrong 😅 If functions / modules are nicely separated, then naming them shouldn't be as difficult (at least naming the modules / functions. Naming variables is and always will be the major problem 😄).
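To make the suggestion concrete, here is a minimal sketch of the idea, with hypothetical names and strings standing in for images and real torchvision transforms: the dataset only *applies* whatever transform pipeline the caller passes in, so model-specific steps stay outside it.

```python
# Sketch only: MaskedFaceNetDataset here is a stand-in for the real class.
def compose(*fns):
    """Chain transform functions left to right, like torchvision's Compose."""
    def run(x):
        for f in fns:
            x = f(x)
        return x
    return run

class MaskedFaceNetDataset:
    """Generic dataset prep: holds rows/labels, defines no transforms itself."""
    def __init__(self, rows, transform=None):
        self.rows = rows
        self.transform = transform or (lambda x: x)

    def __getitem__(self, i):
        # The injected pipeline is applied here, never defined here.
        return self.transform(self.rows[i])

    def __len__(self):
        return len(self.rows)

# The caller merges model-specific and augmentation transforms, then injects them:
resize_for_mobilenet = lambda x: f"resized({x})"  # stands in for Resize(256)
random_flip = lambda x: f"flipped({x})"           # stands in for an augmentation
ds = MaskedFaceNetDataset(["img0"], transform=compose(resize_for_mobilenet, random_flip))
print(ds[0])  # flipped(resized(img0))
```

With this shape, a MobileNetV2-specific pipeline and a generic augmentation pipeline can be merged at the call site, which also answers the naming question: the dataset keeps the dataset's name, and the transforms carry the model's.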
```diff
@@ -1,4 +1,4 @@
-from torch.nn import Module, Conv2d, Linear, MaxPool2d, ReLU, Sequential, Softmax, Flatten
+from torch.nn import Dropout, Module, Conv2d, Flatten, Linear, MaxPool2d, ReLU, Sequential, Sigmoid, Softmax


 class BasicCNN(Module):
```
I think it would be nice to add some results for the tested classifier. Maybe in the README? Without that kind of data, it's hard to say anything about the solution.
I love that idea. I created a separate README.md file in the models directory. You can check it here: mask_detector/models/README.md
```diff
@@ -0,0 +1,33 @@
+""" Dataset module
+"""
```
I also do not get the name. Why "masked face net"?
Because it's the name of our current dataset, MaskedFaceNet :) Any suggestions are more than welcome.
Looks great 💙
I added some picky comments, but some of them will probably be resolved when I finally propose a collab tool and the project structure 😞 Because of this, I'm already giving my approval ✅
```python
from utils import train_val_test_split
from models.basic_cnn import BasicCNN
from models.mobile_net_v2 import MobileNetV2
from datasets.masked_face_net import MaskedFaceNetDataset


class MaskClassifier(LightningModule):
```
Hmm, it would probably be convenient to prepare some basic, semi-generic class for our DataModule.
It should define all the common operations for the task we have (image processing), as most of them will probably be the same across different models.
Then we would extend this base class and override specific methods for each model we have (choosing the correct datasets, defining some transformations, etc.).
What do you think?
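A rough sketch of the base/override pattern being suggested. Class names are hypothetical, and strings stand in for real torchvision transforms so the example is self-contained; in the project this would presumably subclass LightningDataModule.

```python
class BaseFaceMaskDataModule:
    """Common image-processing steps shared across models (hypothetical name)."""

    def common_transforms(self):
        return ["to_tensor", "normalize"]  # the same for every model

    def model_transforms(self):
        return []  # overridden in each model-specific subclass

    def transforms(self):
        # Model-specific steps first (e.g. resizing), then the shared ones.
        return self.model_transforms() + self.common_transforms()


class MobileNetV2DataModule(BaseFaceMaskDataModule):
    """Only overrides the parts that differ for MobileNetV2."""

    def model_transforms(self):
        return ["resize_256"]  # MobileNetV2 expects a fixed input size


print(MobileNetV2DataModule().transforms())  # ['resize_256', 'to_tensor', 'normalize']
```

Adding a new model would then mean one small subclass that overrides `model_transforms` (and, analogously, dataset selection), instead of duplicating the whole DataModule.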
| Model | Params | Loss | Recall |
| ----------- | ----------------------------------------------------- | ------ | ------ |
| MobileNetV2 | max_epochs=10<br>batch_size=32<br>learning_rate=0.001 | 0.1449 | 0.9794 |
Having only loss and recall can be very misleading without precision as well.
Recall does not tell us anything about false positives. I suppose you have checked the dataset and that the classes are properly balanced.
We have a low loss value, though on its own it does not mean anything: we do not know how to interpret it, or whether a loss of `0.1` is much better than `0.2`. This is why `accuracy` is usually presented in addition.
On its own, accuracy is a very risky metric (without precision and recall we could have 99% accuracy when the model is e.g. always outputting the same value but the classes are not balanced).
However, it describes what matters most when evaluating the model outcome in the most intuitive way. If we see a recall / precision / F1 score of e.g. 0.7, we need to think about how to interpret it relative to the dataset, the possible predictions, etc. If we have an accuracy of 0.8, we immediately see that 80% of all predictions are correct. Then we can investigate other metrics to compare models directly, but accuracy is the true goal anyway.
Also, we are missing precision. Here is a nice illustration of what this could lead to (based on a shooting-range analogy):
I would opt for tracking all of these metrics (they are not slow to compute anyway). We never know when such information could become helpful.
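A toy example (made-up numbers, not from our dataset) of how recall alone can mislead: on an imbalanced test set, a degenerate model that always predicts "mask" gets perfect recall, and only precision exposes the false positives.

```python
# 90 "mask" (1) vs 10 "no mask" (0) samples; a model that always predicts 1.
y_true = [1] * 90 + [0] * 10
y_pred = [1] * 100

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = (tp + tn) / len(y_true)                  # 0.9
recall = tp / (tp + fn)                             # 1.0 -- looks perfect alone
precision = tp / (tp + fp)                          # 0.9 -- exposes the 10 false positives
f1 = 2 * precision * recall / (precision + recall)  # ~0.947
print(accuracy, recall, precision, round(f1, 3))
```

Tracking all four (loss aside) is cheap, and together they make degenerate cases like this one immediately visible.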
```python
output = self.convLayers1(output)
output = self.convLayers2(output)
output = self.convLayers3(output)
output = self.linearLayers(output)
```
We could investigate whether this incurs some computational-efficiency cost. It could be the case that PyTorch optimizes all the calculations within one Sequential layer, an optimization we lose by keeping separate Sequentials. Maybe it would be better to store the blocks in lists and then convert them into one Sequential?
TBC (to be checked) though; it may not be a problem at all.
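A sketch of the list-then-merge idea (the layer shapes are made up for illustration and do not mirror BasicCNN exactly): the blocks are built separately for readability, then flattened into a single `nn.Sequential`, so `forward()` becomes one call over one module chain.

```python
import torch
from torch import nn

# Hypothetical blocks standing in for convLayers1..3 and linearLayers.
blocks = [
    nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),
    nn.Sequential(nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),
    nn.Sequential(nn.Flatten(), nn.Linear(16 * 8 * 8, 2)),
]

# Kept as a list during construction, then merged into one Sequential.
model = nn.Sequential(*blocks)

x = torch.randn(1, 3, 32, 32)  # 32x32 -> pool -> 16x16 -> pool -> 8x8
print(model(x).shape)  # torch.Size([1, 2])
```

Whether the merged form is actually faster would still need benchmarking; functionally the two are equivalent, so the question is only about whatever fusion PyTorch can apply.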
@rpoltorak Will you be adding anything more to this PR, or can we merge it and have @azjawinski work on this topic in new PRs?
Closing this PR; I'm taking this over after Remek.