New dataset - label is coming from subfolder? csv no effect?


#1

Hi,

I have downloaded Studio desktop v 2.1.0.

Am I right that:

  1. I'm only able to create a dataset for image classification?
    (And try out the built in examples.)
    (Edited: I was able to create a dataset in which I merged several columns into one, and Studio handled it as an array, as in the text classification example.)

  2. The instructions for constructing the image classification dataset are outdated?
    It looks like the uploader doesn't care about the CSV any more. Is it enough to have separate folders for the images, with the system using the subfolder names as labels?

Please make it clear on the download page what the restrictions are for the Desktop version.

Greetings,
Janos


#2

Hi Janos,

There are no restrictions in the Desktop version. The uploader still looks for train.csv; when it does not find it, it treats the dataset as an image dataset and generates train.csv. This feature is not yet complete (it does not work when uploading a test dataset), so it is not advertised yet.

Regards
Rajendra
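
For readers who want to see what the fallback described above amounts to, here is a minimal sketch that builds a train.csv from subfolder names. It is not the Studio uploader's actual code: the function name `build_train_csv`, the column names `Filename` and `Label`, and the list of image extensions are all assumptions made for illustration.

```python
import csv
from pathlib import Path

# Assumed set of image extensions; adjust to what your dataset contains.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def build_train_csv(dataset_dir, csv_name="train.csv"):
    """Write <dataset_dir>/<csv_name> with one row per image.

    Each image is labeled with the name of the subfolder that contains it.
    The column names ("Filename", "Label") are hypothetical.
    """
    root = Path(dataset_dir)
    rows = []
    # Every direct subfolder is treated as one class.
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        for img in sorted(sub.iterdir()):
            if img.is_file() and img.suffix.lower() in IMAGE_EXTENSIONS:
                # Store the path relative to the dataset root, with "/" separators.
                rows.append((img.relative_to(root).as_posix(), sub.name))
    with open(root / csv_name, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Filename", "Label"])
        writer.writerows(rows)
    return rows
```

Writing the file yourself like this also makes it explicit which labels are in use, instead of relying on the fallback being triggered by a missing or misnamed train.csv.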


#3

Dear Rajendra!

Yes, you are right! It works!
I had created a CSV for the images under a different name, and no error or warning was displayed because this advanced feature (using the subfolder names as categories) switched on. I had used the same category names in the CSV, so I thought it had been processed, and when I started adding more columns I didn't understand why the system didn't recognize them. Now I know why.


#4

Hello all

The labels specified in the CSV file are not recognized. It shows its own… Could anyone please help me fix this?


#5

Hi,
Can you provide us with a screenshot of the CSV file and of the data section in DLS?

Regards
Rajat


#6

Thanks for the response. I identified the issue: the file was saved as train.csv.csv, which is why this was happening.
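
A doubled extension like this is easy to miss when the file manager hides known extensions. The following is a small sketch for spotting such files in a dataset folder; the function name `find_double_extensions` is made up for this example, and it only inspects file names, not file contents.

```python
from pathlib import Path


def find_double_extensions(dataset_dir):
    """Return files whose last two extensions are identical, e.g. train.csv.csv."""
    hits = []
    for path in sorted(Path(dataset_dir).rglob("*")):
        # Path.suffixes gives every extension, e.g. [".csv", ".csv"].
        suffixes = [s.lower() for s in path.suffixes]
        if path.is_file() and len(suffixes) >= 2 and suffixes[-1] == suffixes[-2]:
            hits.append(path)
    return hits
```

Calling `find_double_extensions("path/to/dataset")` returns a list of `Path` objects; an empty list means no file in the folder tree has a repeated extension.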


#7

I am having the same issue as @horvathj (see my post here: Are TIF Image Files Supported?)

I have over 2K TIF images in ~30 subfolders with a labeled train CSV, which I moved manually via the File Browser, and for some reason the files are not being recognized.

I’ve received no responses on this and I’m slowly beginning to lose confidence in the platform and what it can do - it should not be this difficult to utilize custom datasets.

Feedback and support are greatly appreciated.

Thanks,

AR


#8

Here is my CSV file for reference; I highlighted the corresponding folder structure.

[image: CSV file with the corresponding folder structure highlighted]


#9

I figured it out - the AutoML feature is limited to one (1) type of categorical label per output. It would really be a value add to extend AutoML with multi-label support - other platforms like KNIME and RapidMiner have (or soon will have) this capability.
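
One possible way to work within that limit is to split a CSV that has several label columns into one CSV per label column, and train on each separately. The sketch below assumes a single filename column plus any number of label columns; the function name `split_by_label_column` and the `train_<label>.csv` output naming are hypothetical, and nothing here is a Studio feature.

```python
import csv
from pathlib import Path


def split_by_label_column(csv_path, filename_column, out_dir):
    """Write one two-column CSV (filename, label) per label column in csv_path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        # Every column other than the filename column is treated as a label.
        label_columns = [c for c in (reader.fieldnames or []) if c != filename_column]
    written = []
    for label in label_columns:
        target = out_dir / f"train_{label}.csv"
        with open(target, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow([filename_column, label])
            writer.writerows((row[filename_column], row[label]) for row in rows)
        written.append(target)
    return written
```

Since the uploader looks for a file named train.csv (per post #2), each output would presumably need to be renamed to train.csv inside its own copy of the dataset before uploading.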