Performs random search on a parameter space where the objective function is a program that prints a number to stdout. The initial points are read from the input specified by --evaluated-points (default is stdin). At each step the set of existing points is sorted to find the "best" point (see --optimization-type). Exploratory points (see --num-proposals) are then generated by picking an offset vector on a scaled N-dimensional sphere (see --dimensionality and --radii) and applying it to the best point. The command is run once for each exploratory point to evaluate it. Each newly evaluated point is added back to the set of existing points, and the optimization repeats until convergence (see --stale-threshold and --stale-count). The value for a point is the first line in the command's stdout that can be converted to a float. This means the strings inf and -inf are parsable and can be used to mark impossible parameter points.
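The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program's actual source: the function names, the Gaussian-based direction sampling, and the exact meaning given to the stale threshold (a round counts as stale when the best value improves by no more than the threshold) are all hypothetical.

```python
import math
import random


def random_offset(radii, rng):
    """Pick a random direction on the unit sphere, then scale each axis by its radius."""
    direction = [rng.gauss(0.0, 1.0) for _ in radii]
    norm = math.sqrt(sum(d * d for d in direction)) or 1.0
    return [r * d / norm for r, d in zip(radii, direction)]


def random_search(objective, start, radii, num_proposals=4,
                  stale_threshold=1e-6, stale_count=2, minimize=True, seed=0):
    """Return the best (score, point) pair found (hypothetical helper)."""
    rng = random.Random(seed)
    sign = 1.0 if minimize else -1.0
    points = [(objective(start), list(start))]
    stale = 0
    while stale < stale_count:
        # Find the current best point among everything evaluated so far.
        best_value, best = min(points, key=lambda p: sign * p[0])
        # Propose and evaluate new points around the best one.
        for _ in range(num_proposals):
            offset = random_offset(radii, rng)
            candidate = [b + o for b, o in zip(best, offset)]
            points.append((objective(candidate), candidate))
        new_best_value, _ = min(points, key=lambda p: sign * p[0])
        improvement = sign * (best_value - new_best_value)
        # Assumed staleness rule; "not >" also treats NaN (inf - inf) as stale.
        stale = 0 if improvement > stale_threshold else stale + 1
    return min(points, key=lambda p: sign * p[0])


def quadratic(p):
    return (p[0] - 4) ** 2 + (p[1] - 4) ** 2


best_score, best_point = random_search(quadratic, [0.0, 0.0], radii=[0.5, 0.5])
print(best_score, *best_point)
```

With equal radii, every offset produced by this sketch has a length equal to the radius, so proposals lie on the surface of the sphere around the best point rather than inside it.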
Say you have a simulation and analysis test suite that runs a series of simulations for a given configuration and returns a single numeric score for that configuration:
$ ./evaluate_configuration_in_sim --alpha=0.3 --beta=1.0 -C 1000.0 342.4 10
1034.53
You want to find a good configuration, but the configuration/parameter space is five-dimensional (five numbers to tune). A grid search works nicely for getting a feel for the configuration space but becomes costly when each run of your simulation and analysis suite is expensive. Random search is a crude but simple method for exploring a configuration/parameter space.
pyrandomsearch is a "driver" that determines new configurations to try and then invokes your command with each configuration. It reads the score/cost from the command's output and repeats until progress slows (gets "stale").
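As a rough illustration of the driver's evaluation step, the sketch below fills the {} placeholders of a command template (as used in the example commands in this README) with a point's components, runs the command, and takes the first stdout line that parses as a float. The helper names and the use of str.format and subprocess are assumptions made for the example, not a description of the program's internals.

```python
import subprocess


def first_float(text):
    """Return the first line of text that parses as a float, or None if there is none."""
    for line in text.splitlines():
        try:
            return float(line.strip())
        except ValueError:
            continue
    return None


def evaluate(command_template, point):
    """Fill each {} with one component, run the command, and read its score.

    Hypothetical helper: assumes the template contains exactly one {} per component.
    """
    command = command_template.format(*point)
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return first_float(result.stdout)
```

Because float() accepts inf and -inf, a command can print either one to mark a parameter point as impossible, and any non-numeric lines printed before the score are skipped.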
Here's an example invocation:
$ echo "1034.53 0.3 1.0 1000.0 342.4 10" \
| pyrandomsearch \
--radii=0.1,0.5,500,100,4 \
--optimization-type=max \
--num-proposals=10 \
--print-date-and-time \
'./evaluate_configuration_in_sim --alpha={} --beta={} -C {} {} {}'
A point is the score/cost/evaluation for that point followed by all of the point's components. So in the above example, 1034.53 is the score for point (0.3, 1.0, 1000.0, 342.4, 10). The program's output follows the same format.
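If you want to post-process these lines yourself, the format is easy to read and write. The following is a minimal sketch with hypothetical helper names; skipping blank lines and lines that start with # is an assumption based on the ## comment lines in the program's output.

```python
def parse_point(line):
    """Split 'score c1 c2 ...' into (score, [c1, c2, ...])."""
    fields = [float(token) for token in line.split()]
    return fields[0], fields[1:]


def parse_points(text):
    """Parse every point line, skipping blank lines and '#' comment lines (assumed)."""
    points = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        points.append(parse_point(stripped))
    return points


def format_point(score, components):
    """Write a point back out as a space-separated line, score first."""
    return " ".join(repr(float(x)) for x in [score, *components])


score, components = parse_point("1034.53 0.3 1.0 1000.0 342.4 10")
print(score, components)
```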
If you want a quick example to run, we can use Python to evaluate an offset two-dimensional quadratic surface with a global minimum at (4, 4): python3 -c "print(({}-4)**2+({}-4)**2)". You can also give it empty input to start the optimization at the origin.
$ echo "" \
| pyrandomsearch \
--rng-seed=0 \
--dimensionality=2 \
--radii=0.5 \
--optimization-type=min \
--num-proposals=4 \
--stale-count=2 \
'python3 -c "print(({}-4)**2+({}-4)**2)"'
## WARN: No existing points, seeding with origin
## Existing points:
## New points:
33.33016693325995 0.27953760569721353 -0.41455847235470794
33.84771087498297 -0.43901558517616096 0.23930172580328984
36.523094960394396 -0.4987459069840192 -0.03539096306528034
35.31706172458439 0.10538510265751438 -0.48876781823056353
...
0.11336010432191714 4.301968141413402 4.148913887509703
0.20925802238514438 4.117004666628483 3.5577693697036556
0.24940966182576765 3.8327014026304047 4.470553760099666
0.42210897999430264 3.3673455505322423 4.1478422387646745
## Best point: 0.028137782991007325 3.8348607359667337 3.9705584228418638
If you have existing evaluated points in a file, you can specify that file for loading with --input:
$ echo "2 3 3" > points.txt
$ pyrandomsearch \
--input=points.txt \
--rng-seed=0 \
--dimensionality=2 \
--radii=0.5 \
--optimization-type=min \
--num-proposals=4 \
--stale-count=2 \
'python3 -c "print(({}-4)**2+({}-4)**2)"'
## Existing points:
2.0 3.0 3.0
## New points:
2.5200417333149883 3.2795376056972136 2.5854415276452922
2.649427718745742 2.560984414823839 3.2393017258032897
3.3182737400985984 2.501254093015981 2.9646090369347196
3.016765431146098 3.1053851026575146 2.5112321817694365
...
1.8470073760058219 2.830020070854084 3.3085129492162997
1.6228828613128918 2.8645426891004027 3.4224019075191237
1.141081146510152 3.749537277804338 2.9615638819356818
1.873808109892183 2.827588030939391 3.2934178853837235
## Best point: 1.0653193465094637 3.3223805124466272 3.221442888031091
You can append newly evaluated points to the input file by using the --append option:
$ echo "2 3 3" > points.txt
$ pyrandomsearch \
--input=points.txt \
--append \
--rng-seed=0 \
--dimensionality=2 \
--radii=0.5 \
--optimization-type=min \
--num-proposals=4 \
--stale-count=2 \
'python3 -c "print(({}-4)**2+({}-4)**2)"'
## Existing points:
2.0 3.0 3.0
## New points:
2.5200417333149883 3.2795376056972136 2.5854415276452922
2.649427718745742 2.560984414823839 3.2393017258032897
3.3182737400985984 2.501254093015981 2.9646090369347196
3.016765431146098 3.1053851026575146 2.5112321817694365
...
1.8470073760058219 2.830020070854084 3.3085129492162997
1.6228828613128918 2.8645426891004027 3.4224019075191237
1.141081146510152 3.749537277804338 2.9615638819356818
1.873808109892183 2.827588030939391 3.2934178853837235
## Best point: 1.0653193465094637 3.3223805124466272 3.221442888031091
$ tail points.txt
3.662986291560146 2.63202730265564 2.661479551564287
1.0653193465094637 3.3223805124466272 3.221442888031091
1.7864612304830279 3.085340872416278 3.0253924324063495
1.6503161905493098 2.716150981909868 3.9547463891129824
1.1367532497363355 3.3029967699834386 3.193191629268253
1.8470073760058219 2.830020070854084 3.3085129492162997
1.6228828613128918 2.8645426891004027 3.4224019075191237
1.141081146510152 3.749537277804338 2.9615638819356818
1.873808109892183 2.827588030939391 3.2934178853837235
## Best point: 1.0653193465094637 3.3223805124466272 3.221442888031091
This lets you use shell scripting to set up an "optimization rate" (i.e. radii) schedule. The following example takes advantage of the fact that --radii evaluates its arguments as expressions, which lets us pass math expressions.
$ echo "" > points.txt
$ for X in 0 1 2 3 4; do \
pyrandomsearch \
--input=points.txt \
--append \
--rng-seed=0 \
--dimensionality=2 \
--radii="math.pow(10, -$X)" \
--optimization-type=min \
--num-proposals=4 \
--stale-count=2 \
'python3 -c "print(({}-4)**2+({}-4)**2)"'; \
done
## WARN: No existing points, seeding with origin
## Existing points:
## New points:
35.160333866519906 0.5590752113944271 -0.8291169447094159
36.19542174996593 -0.8780311703523219 0.4786034516065797
41.54618992078879 -0.9974918139680384 -0.07078192613056068
39.13412344916878 0.21077020531502877 -0.9775356364611271
...
7.799654190847027e-09 3.999971132019926 4.000083464327214
1.2275783593113444e-08 3.99999678064581 3.9998892506462695
3.6832827976079833e-09 4.000051480672121 4.000032140678207
1.8918456610956505e-08 3.99986410973208 3.999978732849378
## Best point: 1.5620963397323232e-09 3.9999639852946767 3.999983720032549
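To see which radius each pass of the loop above uses, the schedule can be computed directly. The sketch below assumes the expression is evaluated as Python with the math module in scope; the helper name is hypothetical and this is not necessarily how the option is implemented.

```python
import math


def evaluate_radius(expression):
    """Evaluate a radius expression as Python with only `math` available (assumed)."""
    return float(eval(expression, {"math": math, "__builtins__": {}}))


# One radius per pass of the shell loop, for X = 0 through 4.
schedule = [evaluate_radius(f"math.pow(10, -{x})") for x in range(5)]
for step, radius in enumerate(schedule):
    print(step, radius)
```

Each pass shrinks the radius by a factor of ten, so early passes move quickly toward the optimum and later passes refine the result around the best point saved in points.txt.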
The easiest way to install it is via pip from the Python Package Index:
pip install pyrandomsearch
The program can also be installed from source as a Python module, which sets up the pyrandomsearch entry point:
python setup.py install
pyrandomsearch --help
Alternatively, pyrandomsearch.py is standalone and can be run directly:
./pyrandomsearch/pyrandomsearch.py --help