active_adaptation

Rough Plan

  • Implement adversarial L_avg -> L_max
  • Robust optimization experiment
  • Active learning experiment
  • ICML Feb 24!!!!
  • Figure out the theory part of domain adaptation stuff
  • Domain adaptation as a discriminator
  • Domain adaptation experiment
  • Active domain adaptation experiment
  • ICCV March 17
  • RL experiment with robots
  • Counter-Factual learning
  • NIPS May 19

Step Baseline 1

  • Cifar10 training and test (full data, nothing new, using tf_base)
  • Set the seed training set and save it (num_images = 5000)
  • Random active learning test at budgets 0.1/0.2/0.3/0.4/0.5/0.6/0.7/0.8/0.9/1.0 (see the sketch after this list)
  • Plot the accuracy vs. training-set size (baseline 1); see acc_vs_size.png
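
A minimal sketch of the random baseline, assuming the indices are drawn once with a fixed seed and that each budget fraction includes the saved 5000-image seed set (the helper name and that assumption are illustrative, not part of the plan):

```python
import numpy as np

def build_random_subsets(num_train=50000, seed_size=5000, seed=0):
    """Fix a 5000-image seed set, then grow it with random extra images
    for each budget fraction 0.1 ... 1.0 of the CIFAR-10 training set."""
    rng = np.random.RandomState(seed)
    perm = rng.permutation(num_train)
    seed_idx, pool_idx = perm[:seed_size], perm[seed_size:]
    subsets = {}
    for frac in np.arange(0.1, 1.01, 0.1):
        budget = int(round(frac * num_train))
        extra = pool_idx[:max(budget - seed_size, 0)]
        subsets[round(frac, 1)] = np.concatenate([seed_idx, extra])
    return subsets

# Each subset is then used to train the tf_base CIFAR-10 model, and the
# resulting test accuracies give the accuracy-vs-size curve (baseline 1).
```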

Step Baseline 2

  • L_avg -> L_max experiment with the same setup
  • No random selection, just sampling with replacement (see the sketch after this list)
  • Refactor the code
  • Re-run the baseline 1
  • Run the baseline 2
  • Plot the accuracy vs training data (baseline 2)
  • Consider unbiasing the gradients; the samples are pretty uniform
  • Consider dropping one fc layer; the reason is the distribution difference
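
One way to read the L_avg -> L_max item is to draw the minibatch with replacement, with probability proportional to each example's current loss, and to re-weight if an unbiased gradient is wanted; a minimal sketch under that assumption (the helper name is illustrative):

```python
import numpy as np

def sample_batch_by_loss(per_sample_losses, batch_size, unbias=False):
    """Draw a minibatch with replacement, proportionally to per-sample
    loss, so training emphasizes the worst-loss examples (L_max-like).
    With unbias=True, also return importance weights that undo the
    biased sampling so the gradient estimates L_avg again."""
    probs = per_sample_losses / per_sample_losses.sum()
    idx = np.random.choice(len(probs), size=batch_size, replace=True, p=probs)
    weights = ((1.0 / len(probs)) / probs[idx]) if unbias else np.ones(batch_size)
    return idx, weights
```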

Step Active Learning

  • Consider a few tricks
    • Re-initialize everything; effective, but not by much
    • Keep a validation set and learn the adversarial part only on the validation set (no contamination since the actual network never sees it); effective, but not by much
  • Sample new data with the learned model
  • Maybe a diversity trick? (still a valid thing) It does not seem necessary since the t-SNE is pretty diverse
    • Diversity is a submodular function if defined as the sum of the total probability covered around each ball
    • Theory suggests a covering ball, so let's use that
  • Combinatorial algorithm: start with the greedy 2-OPT solution, then refine it using integer programming and binary search if feasible. This is actually pretty feasible; somehow Gurobi is more efficient than the greedy one at improving the solution
  • To match theory and practice, put feature learning in both players
  • Include a gradient reversal layer (see the sketch after this list)
    • Seems like the best option for now
    • Step 1: vanilla reversal. Note: Adam uses second-moment estimates, which behave badly in the adversarial setting, so use momentum instead
    • Step 1.5: Implement reversal with a single output (so it can learn the data distribution)
    • Step 2: vanilla (so) reversal + loss_rescale
    • Step 3: Reversal (so) domain estimate + sampling
    • Step 4: Reversal (so and not/so) + combinatorial sampling (this is desired simply because of the theory)
  • Try with the oracle loss; still worse than random, maybe it is introducing some sort of bias
  • Look at the t-SNE plot and check whether it is a diversity issue; it is pretty diverse
  • Exploration works, so test different degrees of exploration; 0.2 seems like a good value, maybe 0.25
  • Consider normalizing the features since their magnitudes blow up (maybe remove batch norm)
  • Use BiGAN or ALI as the semi-supervised algorithm
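
A minimal sketch of the gradient reversal layer referenced above, using the standard stop_gradient identity trick in TensorFlow; the lambda schedule and the domain-discriminator head are omitted, and the usage lines are illustrative:

```python
import tensorflow as tf

def gradient_reversal(x, lam=1.0):
    """Identity in the forward pass; multiplies the incoming gradient by
    -lam in the backward pass, so the feature extractor is trained
    adversarially against the domain discriminator."""
    # The first term is constant w.r.t. backprop (stop_gradient), so the
    # forward value is x while the only gradient path is through -lam * x.
    return tf.stop_gradient((1.0 + lam) * x) - lam * x

# Illustrative usage: features -> reversal -> single-output domain head
# domain_logit = domain_head(gradient_reversal(features, lam=0.1))
```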

Baselines

  • k-k^\prime
  • maximum uncertainty
  • uncertainty-based sampling (see the sketch after this list)
  • oracle uncertainty-based sampling
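
A minimal sketch of one common form of the uncertainty-based baselines, entropy sampling over the model's softmax outputs on the unlabeled pool (the function name is illustrative):

```python
import numpy as np

def entropy_uncertainty_sampling(softmax_probs, budget):
    """Pick the `budget` pool points with the highest predictive entropy.
    softmax_probs: (num_pool, num_classes) model outputs on the pool."""
    entropy = -np.sum(softmax_probs * np.log(softmax_probs + 1e-12), axis=1)
    return np.argsort(-entropy)[:budget]
```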

Step Domain Adaptation

  • Modify the loss/adversary to be applicable to domain adaptation

Step Active Learning with DA

  • Combine everything

Device Assignment

  • 109 0/5k/10k/15k
  • 110 20k/25k/30k/35k
  • 106 40k/45k

Results

50,55,62,68,70,73,76,78,79,81

Way to sample active datapoints

  • Get the top 5000 points by expected loss; set those to 1 and the rest to 0
  • Combine with gamma = 0.01 or 0.02 (see the sketch below)
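
A minimal sketch of one possible reading of this recipe: a 0/1 mask over the pool with 1 on the top-5000 expected-loss points, mixed with a uniform distribution using weight gamma so every point keeps nonzero probability. That gamma enters as a uniform mixing weight is an assumption, not something stated in the plan:

```python
import numpy as np

def active_sampling_distribution(expected_loss, top_k=5000, gamma=0.01):
    """Mass 1 on the top_k points by expected loss, 0 elsewhere, then mix
    with the uniform distribution using weight gamma (assumed mixing)."""
    mask = np.zeros_like(expected_loss, dtype=float)
    mask[np.argsort(-expected_loss)[:top_k]] = 1.0
    greedy = mask / mask.sum()
    uniform = np.ones_like(greedy) / len(greedy)
    return (1.0 - gamma) * greedy + gamma * uniform
```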

Report

  • Describe the pool-based active learning problem; state that it is a weakly supervised problem and needs to be treated like one
  • Discuss p(x), p_n(x), p_\hat{n}(x); give the basic idea behind loss re-scaling and show how it can be framed as alternating minimization (Adversarial Weak Supervision)
  • Discuss the theoretical aspect of active learning
    • Review robustness and generalization
    • Lemma (VGG is robust)
    • Theorem: Any robust algorithm is robust with fewer samples if ()
  • Discuss the empirical setup and explain the two concepts (fixed budget, single step)
    • Representations should be as close as possible so it is easier to cover the same space with fewer points
      • Gradient reversal layer
    • The bound depends solely on \gamma; hence, solve the combinatorial optimization to get the minimum covering ball
      • Binary search over a submodular problem (see the sketch after this list)
  • Experiments
    • MNIST
    • Cifar 10 / Cifar 100 on VGG
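
A minimal sketch of the covering-ball step: the greedy 2-OPT k-center solution gives the initial centers and radius, which the plan then refines with integer programming (Gurobi) and a binary search over the radius; only the greedy part is sketched here, and the function name is illustrative:

```python
import numpy as np

def greedy_k_center(points, k, first=0):
    """Greedy 2-OPT k-center: repeatedly add the point farthest from the
    current centers. Returns center indices and the covering radius,
    which can seed the binary-search / integer-programming refinement."""
    centers = [first]
    dists = np.linalg.norm(points - points[first], axis=1)
    while len(centers) < k:
        nxt = int(np.argmax(dists))
        centers.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(points - points[nxt], axis=1))
    return centers, float(dists.max())
```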

Current TODO

  • Get the features and look at the t-SNE (see the sketch after this list)
  • Sample far-away points and try this
  • Implement the combinatorial algorithm for N-D; try it with the 2D t-SNE points
  • Run the active learning experiment
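
A minimal sketch for the t-SNE step, assuming the network features (e.g. penultimate-layer activations) have already been exported as a NumPy array; uses scikit-learn, and the function name and file path are illustrative:

```python
import numpy as np
from sklearn.manifold import TSNE

def embed_features_2d(features, seed=0):
    """Project high-dimensional features to 2-D with t-SNE so diversity
    can be inspected and the combinatorial algorithm tried in 2-D first."""
    return TSNE(n_components=2, random_state=seed).fit_transform(features)

# xy = embed_features_2d(np.load("features.npy"))  # illustrative path
# xy can be plotted and fed to the greedy k-center sketch above.
```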
