Name		Name	Last commit message	Last commit date
parent directory ..
__pycache__		__pycache__
dataset_txt		dataset_txt
select_best		select_best
test_jsons		test_jsons
train_jsons		train_jsons
README.md		README.md
algorithm1.jpg		algorithm1.jpg
algorithm2.jpg		algorithm2.jpg
bayesian_blocks.py		bayesian_blocks.py
generate_bins.py		generate_bins.py
load_data.py		load_data.py
multinomial.py		multinomial.py
poisson.py		poisson.py
seeds10.txt		seeds10.txt
select_best.py		select_best.py
train.py		train.py

README.md

3.2 Revisiting Stage 2 (Creating Data splits) -- Sec 3.2

This folder contains the method in which the data-splits are created. This procedure partions the count range into balanced strata (bins) using Bayesian optimality criterion (Sec 3). The procedure can be summarised as shown below.

Optimal Bins for different datasets are:

Dataset name	Bins
NWPU	`[0.0, 0.5, 2.5, 5.5, 10.5, 18.5, 31.5, 53.5, 90.5, 152.5, 256.5, 429.5, 717.5, 1197.5, 1997.5, 3330.5, 5552.5, 9255.5, 15426.0]`
UCF	`[49.0, 49.5, 51.5, 55.5, 62.5, 73.5, 92.5, 123.5, 175.5, 261.5, 405.5, 644.5, 1043.5, 1708.5, 2816.5, 4662.5, 7738.5, 12865.0]`
STA	`[33.0, 33.5, 35.5, 38.5, 43.5, 51.5, 64.5, 85.5, 120.5, 178.5, 275.5, 436.5, 704.5, 1151.5, 1897.5, 3139.0]`
STB	`[13.0, 13.5, 15.5, 19.5, 25.5, 36.5, 54.5, 83.5, 132.5, 214.5, 351.5, 578.0]`

Algorithm Block 1	Algorithm block 2

Procedure for generating bins:

Step 1

First put your dataset txt in form of image_name,image_count in a .txt, for a demo you can look into the folder dataset_txt. As an example we have provided the txt files for NWPU,UCF-QNRF, STA and STB datasets.

Step 2

To start the training run

python train.py -d ./dataset_txt/Train_nwpu.txt -f multinomial

If you want to get bins on your dataset, change the -d argument to that path of your datasets information stored .txt format. The -f argument helps choose between the fitness functions (multinomial,poisson).

Step 3

After training (Step 2) there are best files generated in the select_best folder. To print out the top two best binning configurations, run the following code

python generate_bins.py  -d ./dataset_txt/Train_nwpu.txt -f multinomial

Here -d corresponds to dataset and -f corresponds to fitness function.

The folder structure of the codes in this folder is:

.
├── dataset_txt             # All files of dataset are in this folder.
│   ├── *.txt               # Dataset-wise txts 
│   └── ...
├── select_best             # After running the train.py file the best configs are stored in this (dataset-split-wise).
├── test_jsons              # While running the files generated are stored in here.
├── train_jsons             # While running the files generated are stored in here.
├── bayesian_blocks.py      # Modified version of bayesian blocks to fit our use case.
├── generate_bins.py        # This file is used to generate bins after training is over.
├── load_data.py            # For loading data
├── multinomial.py          # For multinomal distribution
├── poisson.py              # For poisson distribution
├── seeds_10.txt            # Fixing the seeds of the data splits.
├── select_best.py          # script that selects the best bin and is called in generate_bins.py
└── train.py                # For generating the bins split wise.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

binning

binning

README.md

3.2 Revisiting Stage 2 (Creating Data splits) -- Sec 3.2

Procedure for generating bins:

Step 1

Step 2

Step 3

Files

binning

Directory actions

More options

Directory actions

More options

Latest commit

History

binning

Folders and files

parent directory

README.md

3.2 Revisiting Stage 2 (Creating Data splits) -- Sec 3.2

Procedure for generating bins:

Step 1

Step 2

Step 3