Skip to content

Agents who seek information and reward in equal measure.

License

Notifications You must be signed in to change notification settings

CoAxLab/infomercial

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Informercial

infomercial contains code for Curiosity eliminates the exploration-exploitation dilemma, bioArxiv 671362v8 (2020).

In this paper we present an alternative interpretation of the classic but intractible exploration-exploitation dilemma. We prove the key to finding a tractable solution is to do an unintuitive thing--to explore without considering reward value.

Research summary

The exploration-exploitation dilemma is summarized by a simple question: “Should I exploit an available reward, or explore to try out a new uncertain action?” Unfortunately, it’s been proven that this dilemma, when stated as a mathematical problem, is intractable and so can’t be solved directly. This fundamentally limits our ability to predict optimal naturalistic behavior during foraging and exploration, and to optimally drive learning in artificial agents.

To overcome this field-wide limitation, we took a fresh look at the dilemma. Our goal was simple: when one mathematical problem can’t be solved, it’s often good to find another related problem that can be and use that to make progress on both.

We show, for the first time, that nearly any dilemma problem can also be viewed as competition, between exploiting known rewards or exploring to learn for its own sake. We prove this competition can be perfectly solved using the famous win-stay, lose-switch strategy from game theory. To ensure this solution is as broad as possible we also derived a new universal theory of information value which complements--but is independent of--Shannon’s Information Theory.

Code dependencies

  • A standard anaconda install
  • pytorch (>= 4.1)

Install

pip install . -e following cloning of this repo.

Experiments

All experiments can be (re)run from the top-level Makefile, found in the infomercial repo.

For analysis see ./notebooks/

For paper figures see ./figures/

About

Agents who seek information and reward in equal measure.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages