This repository has been archived by the owner on Oct 30, 2019. It is now read-only.

Using cifar-100 with 15 classes #203

Open
YotYot opened this issue May 5, 2018 · 7 comments


@YotYot

YotYot commented May 5, 2018

Hi,

I'm trying to classify my images into 15 classes, using cifar-100 for that.
I'm using the following command:

th main.lua -data -nClasses 15 -resetClassifier true -dataset cifar100 -depth 22

and I get the following error -

Assertion `t >= 0 && t < n_classes` failed.

I don't get this with any of the other datasets (cifar-10, imagenet)

Any clue, someone?

Thanks! Yotam

@onidzelskyi

You've missed the argument after the -data parameter. Your command should look like this:

th main.lua -data /dev/null -nClasses 15 -resetClassifier true -dataset cifar100 -depth 22

@YotYot
Author

YotYot commented May 23, 2018

Hi @onidzelskyi ,

Thanks for your answer. You're right, of course, but this isn't the problem: I just left the data path out of the command I posted here, and I did pass it when actually running it.

Running the same command with cifar10 works ok.

Thanks, Yotam

@onidzelskyi

Yes, you're right. I have the same issue when trying to train with a small number of classes (3 in my case); it gives the same error you experienced. It seems that for big networks (cifar100 in your case and resnet-200 in my case) the number of classes should be equal to or greater than some threshold value.
To check this, try increasing the number of classes for the cifar100 model and let me know if it has any positive effect.
Regards,
Oleksii

@aabobakr
Contributor

The option -resetClassifier replaces the output layer of the original model with a new output layer that has -nClasses outputs. So in your example it creates a network with 15 output neurons, while you are training on cifar100, which has 100 classes. The assertion fails because the target labels range over 100 classes and most of them fall outside the 15 outputs, so the loss function cannot be evaluated.
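
To make the failure concrete, here is a minimal Python sketch of the range check that the assertion expresses. It is an illustration only, not code from this repository: the function name `out_of_range_targets` is hypothetical, and labels are assumed to be 0-based, matching the `t >= 0 && t < n_classes` condition in the error message.

```python
def out_of_range_targets(targets, n_classes):
    """Return the targets that would violate `t >= 0 && t < n_classes`."""
    return [t for t in targets if not (0 <= t < n_classes)]


# Assumption for illustration: cifar100 labels cover 100 classes (0..99),
# while the reset classifier has only 15 outputs.
cifar100_labels = list(range(100))
bad = out_of_range_targets(cifar100_labels, n_classes=15)
print(len(bad))  # number of labels that a 15-way classifier cannot represent
```

Any label reported by such a check would trip the assertion, regardless of network depth.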

@onidzelskyi

onidzelskyi commented May 23, 2018

Correct me if I'm wrong.
To train on my own dataset with a custom number of classes:

  1. Overload datasets/imagenet.lua and datasets/imagenet-gen.lua for our dataset (e.g. )
  2. Train model with
    th main.lua -data <path to dataset directory> -resetClassifier true -nClasses <#classes> -dataset <custom dataset name>

But I get an error
unknown dataset: <custom dataset name>

I have no idea how to adapt it to my own dataset.

@aabobakr
Contributor

You don't need to set the -dataset argument; only the -data argument is needed, and it should contain the path to your dataset. Your dataset directory must be organised as follows:

/dataset
  |---> /train
           |--> /class#1
           |--> /class#2
  |---> /val
           |--> /class#1
           |--> /class#2
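
As a rough illustration of how such a layout can be turned into labels, the Python sketch below maps each class folder to an index. It assumes classes are indexed in sorted folder-name order, which is an assumption made for this example and may not match what the repository's generator script does; `class_to_index` is a hypothetical helper name.

```python
import os


def class_to_index(split_dir):
    """Map each class subdirectory name to a 0-based index, in sorted order."""
    classes = sorted(
        name for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name))
    )
    return {name: idx for idx, name in enumerate(classes)}
```

Calling `class_to_index` on the train directory and comparing `len(...)` of the result with the value passed to -nClasses is a quick sanity check before training.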

@onidzelskyi

onidzelskyi commented May 23, 2018

th main.lua -data /home/alex/test_car_dataset/ -resetClassifier true -nClasses 3

gives an error

=> Creating model from file: models/resnet.lua
| ResNet-34 ImageNet
=> Replacing classifier with 3-way classifier
=> Generating list of images
| finding all validation images
| finding all training images
| saving list of images to /home/alex/fb.resnet.torch/gen/imagenet.t7
=> Training epoch # 1
/home/alex/torch/extra/cunn/lib/THCUNN/ClassNLLCriterion.cu:57: void cunn_ClassNLLCriterion_updateOutput_kernel(Dtype *, Dtype *, Dtype *, long *, Dtype *, int, int, int, int, long) [with Dtype = float, Acctype = float]: block: [0,0,0], thread: [0,0,0] Assertion `t >= 0 && t < n_classes` failed.

...

THCudaCheck FAIL file=/home/alex/torch/extra/cutorch/lib/THC/generic/THCStorage.c line=32 error=59 : device-side assert triggered
/home/alex/torch/install/bin/luajit: cuda runtime error (59) : device-side assert triggered at /home/alex/torch/extra/cutorch/lib/THC/generic/THCStorage.c:32
stack traceback:
[C]: at 0x7ff5ed0f5210
[C]: in function '__index'
...lex/torch/install/share/lua/5.1/nn/ClassNLLCriterion.lua:50: in function 'updateOutput'
...torch/install/share/lua/5.1/nn/CrossEntropyCriterion.lua:20: in function 'forward'
./train.lua:58: in function 'train'
main.lua:52: in main chunk
[C]: in function 'dofile'
...alex/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

Dataset directory structure:

ls -lat /home/alex/test_car_dataset/
val
train

ls -lat /home/alex/test_car_dataset/train
1
3
2
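
One way to narrow a failure like this down is to check the dataset folders before training. The Python sketch below is a hypothetical diagnostic, not part of this repository, and it does not establish the cause of the error above (the listing does not show the contents of val). It only checks that train and val contain the same class folders and that their count matches the value passed to -nClasses; `list_classes` and `check_dataset` are names invented for this example.

```python
import os


def list_classes(split_dir):
    """Return the sorted names of the class subdirectories in split_dir."""
    return sorted(
        name for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name))
    )


def check_dataset(root, n_classes):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    train = list_classes(os.path.join(root, "train"))
    val = list_classes(os.path.join(root, "val"))
    if train != val:
        problems.append("train and val contain different class folders")
    if len(train) != n_classes:
        problems.append(
            "train has %d class folders but nClasses is %d" % (len(train), n_classes)
        )
    return problems
```

For the layout shown here, `check_dataset("/home/alex/test_car_dataset", 3)` would report whether val mirrors the three folders in train.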
