script to sample/create image pairs from original data #8

suchanv · 2020-01-13T22:44:13Z

No description provided.

visionjo · 2020-01-13T22:52:03Z

Why is this needed?

We provide a list of pairs.

Do you mean a script that starts from the list and builds the data table?

visionjo · 2020-01-13T22:54:02Z

Let me rephrase that. This would be good code to add, but I am asking in terms of priority and reproducibility. The pairs (lists) do not change; they are part of the dataset (per the benchmark of the current version of the DB).

suchanv · 2020-01-13T22:59:05Z

I think it would be good to release the full dataset, as opposed to only the sampled dataset that corresponds to the sampled pairs. It would also be useful for us if we change the dataset. Just my opinion. If you don't have that script, the list of sampled pairs works too. I'm not sure whether I'm going to need that part for other datasets, but if I do, it would be great if you could share the script or the detailed steps.

suchanv · 2020-01-13T22:59:29Z

But feel free to close this issue or assign low priority.

visionjo · 2020-01-17T03:10:44Z

I do not understand. I do have the scripts (actually in a different project, as code for a different dataset we built). They will certainly have to be modified to work here, but that is minimal work. But wouldn't this be more of a utility tool for someone else to build a different dataset? (That would be cool to share, as I hugely appreciate it when such things are provided.)

I guess the misunderstanding comes from the samples and pair lists not being the complete dataset? Sure, we sample faces, but that is for the sake of having a controlled experiment with a reasonable number of faces per subject (i.e., 25 faces for each of the 100 subjects in each of the 8 subgroups, split evenly). We had enough faces to probably do as many as 150 faces per subject (whatever the minimum number of samples across all subjects is), but that is a bit loaded: so much data and so many possible pairs for 800 subjects. We already have a decent-size set, with the potential to add more later given extended work, methods, or task protocols.
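For anyone who wants to build a similar controlled subset for a different dataset, a minimal sketch of this kind of per-subject sampling might look like the following. It is not the script used for this benchmark. The function names, the `faces_by_subject` mapping (subject ID to a list of image paths), and the seed are all assumptions made for illustration, and only same-subject pairs are generated here.

```python
import itertools
import random


def sample_faces(faces_by_subject, k, seed=0):
    """Pick exactly k faces per subject, reproducibly (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the same subset is drawn every run
    sampled = {}
    for subject in sorted(faces_by_subject):  # sorted for a stable draw order
        faces = sorted(faces_by_subject[subject])
        if len(faces) < k:
            raise ValueError(f"{subject} has only {len(faces)} faces, need {k}")
        sampled[subject] = sorted(rng.sample(faces, k))
    return sampled


def genuine_pairs(sampled):
    """List every same-subject pair among the sampled faces (hypothetical helper)."""
    pairs = []
    for subject in sorted(sampled):
        pairs.extend(itertools.combinations(sampled[subject], 2))
    return pairs
```

Sorting the subjects and faces before drawing, together with the fixed seed, is what keeps the output identical across runs. A list generated this way would still need to be saved and shared alongside the data for results to stay comparable.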

Thus, we provide a benchmark for others to run against (i.e., to try to improve the performance on the same experiment). If everyone generates their own list, then how would we be able to fairly compare and claim SOTA? (It would be a frenzy, and certainly open to people subsetting an easier test, amongst other issues.) Unless a dataset consists of raw data samples, ground-truth labels (at least for the test set, as this holds for unsupervised methods too), and lists, it is not a complete benchmark dataset. Make sense?

But perhaps there is something I am missing or misunderstanding. Furthermore, you could have something that would make for a better resource (i.e., if you are thinking of something that you could use, then others probably could too).

Thank you for sharing, and for not closing the issue until I better understand, or until you close it once you better understand.

suchanv assigned visionjo Jan 13, 2020
