Add Common Optimization Methods #361
…bursts and variants as well as tilted runs.
…ased number of Gingles' districts; add further documentation SingleMetricOptimizer class methods; delint optimization.py
I have no comments (for now) other than |
I think this is a great jumping-off point for us to bring short bursts into harmony with guided acceptance functions!
I like the flexibility of thresholding our searches. I'm going to keep saying "majority" (i.e. `threshold = 0.50`) for simplicity, but this should all work if the threshold is set to something else. But I think we should broaden the `gingleator` initialization to be able to search for majority-minority or majority-party districts. This flexibility is sketched out here and here in my One-Click-Chains repo, though admittedly not in an "object-oriented" format. While broadening to partisan stuff makes the code a little less straightforward, I think we could gain that back by dropping `minority_perc_col` altogether and just expecting to be passed either a Tally updater name for the demographic group (along with a Tally updater for the total population) or an ElectionResults updater name on which we can call `.percents(party)` to query for the party percents.
Conceptually, one thing all of the score functions in the `gingleator` class have in common is that they look built to be used as simple comparators as you traverse the chain, i.e. accept the child if there is an improvement, otherwise accept with some fixed probability p. I think we should build out the functionality to make p change dynamically depending on how much worse the proposal is than its parent. In `gingleator` terms, this means a helper function like my `get_majdistricts_info()` function that returns the number of majority-{group} (demographic group or party) districts, and the percentages of a) the smallest district above the threshold and b) the largest district below the threshold. Then, in `SingleMetricOptimizer` we can build acceptance functions that use those helpers to cleverly accept with a variable p. As I see it, this would be an extension/improvement to your `tilted_short_bursts()`.
This is really exciting. I think that if we build this right we can be really flexible in how we search through the metagraph, and I would love to run experiments to see if layering all of these tricks together gives us more of a leg up (variable-length short bursts with a custom acceptance function that rejects worse plans in proportion to how bad they are?? could be huge)...
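To make the variable-p idea concrete, here is a minimal sketch, not the PR's implementation. It uses plain dicts of district id to group share instead of gerrychain partitions, and the names `majority_district_info`, `score`, and `accept_probability`, the `scale` parameter, and the exponential decay are all assumptions chosen for illustration.

```python
import math


def majority_district_info(percents, threshold=0.5):
    """Summarize district shares relative to a threshold.

    `percents` maps district id -> group share in that district.
    Returns (number of districts at or above the threshold,
             smallest share at or above the threshold,
             largest share below the threshold).
    The last two entries are None when no such district exists.
    """
    above = [p for p in percents.values() if p >= threshold]
    below = [p for p in percents.values() if p < threshold]
    return (
        len(above),
        min(above) if above else None,
        max(below) if below else None,
    )


def score(percents, threshold=0.5):
    """Hypothetical score: district count plus the share of the closest miss."""
    count, _, best_below = majority_district_info(percents, threshold)
    return count + (best_below if best_below is not None else 0.0)


def accept_probability(parent_score, child_score, scale=1.0):
    """Always accept improvements; otherwise decay with the size of the drop."""
    drop = parent_score - child_score
    if drop <= 0:
        return 1.0
    return math.exp(-drop / scale)


parent = {0: 0.55, 1: 0.48, 2: 0.30}
slightly_worse = {0: 0.55, 1: 0.45, 2: 0.33}
much_worse = {0: 0.49, 1: 0.48, 2: 0.36}

p_small = accept_probability(score(parent), score(slightly_worse))
p_large = accept_probability(score(parent), score(much_worse))
```

With these inputs a proposal that only loosens the closest-miss district is accepted far more often than one that loses a majority district outright, which is the behavior the comment above asks for. A smaller `scale` makes the rule stricter.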
Small bug, I think:
            if minority_perc_col is None:
                perc_up = {min_perc_column_name:
                           lambda part: {k: part[minority_pop_col][k] / part[total_pop_col][k]
                                         for k in part.parts.keys()}}
                initial_state.updaters.update(perc_up)

            score = partial(score_function, minority_perc_col=minority_perc_col, threshold=threshold)

            super().__init__(proposal, constraints, initial_state, score, minmax="max",
                             tracking_funct=tracking_funct)

        """
        Score Functions
        """

        @classmethod
        def num_opportunity_dists(cls, part, minority_perc_col, threshold):
            """
            Given a partition, returns the number of opportunity districts.

            :param `part`: Partition to score.
            :param `minority_perc_col`: Which updater is a mapping of district ids to the fraction of
                minority popultion within that district.
            :param `threshold`: Beyond which fraction to consider something a "Gingles"
                (or opportunity) district.

            :rtype int
            """
            dist_percs = part[minority_perc_col].values()
            return sum(list(map(lambda v: v >= threshold, dist_percs)))
...if `minority_perc_col` is `None`, then the updater that maps district IDs to the fraction of minority population will be called `min_perc_column_name`. But the score functions all seem to call `part[minority_perc_col]`, which seems like it would raise an error in this case.
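A small sketch of the failure mode and one possible fix, under stated assumptions: the partition is stood in for by a plain dict of updater name to values, and `resolve_perc_col` and the default name `"minority_perc"` are hypothetical, not taken from the PR.

```python
def num_opportunity_dists(part, minority_perc_col, threshold):
    """Count districts whose minority share is at or above the threshold."""
    return sum(v >= threshold for v in part[minority_perc_col].values())


def resolve_perc_col(minority_perc_col, default_name="minority_perc"):
    """Hypothetical helper: fall back to the generated updater name."""
    return default_name if minority_perc_col is None else minority_perc_col


# Stand-in for a partition whose percentage updater was registered
# under the generated name because no column name was supplied.
part = {"minority_perc": {0: 0.55, 1: 0.42, 2: 0.61}}

# Passing the original None through to the score function fails,
# because nothing is registered under the key None.
try:
    num_opportunity_dists(part, None, 0.5)
    raised = False
except KeyError:
    raised = True

# Resolving the name once, before building the score function, avoids it.
col = resolve_perc_col(None)
fixed_count = num_opportunity_dists(part, col, 0.5)
```

The point is only that the name used to register the updater and the name handed to the score function need to be the same value. Reassigning the column name inside the `None` branch would have the same effect.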
…partitions and to store rolling best partition/score as instance variables.
This looks really great! Just to check my understanding: if we called a simulated annealing run with a `gingles.hot_cold_cycle_beta_function_factory(0, 1000)` (in other words, only ever cold), would this be equivalent to a tilted run that always accepts better partitions and accepts worse partitions with a dynamic probability?
I made some small changes in docstrings, mostly just fixing some typos. I also want to flag a couple of spots where I think the documentation is unclear (might just be me, so I would love to get other folks' input as well)...
SingleMetricOptimizer hot_cold_cycle_beta_function_factory The |
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

    @@             Coverage Diff             @@
    ##             main     #361       +/-   ##
    ===========================================
    - Coverage   91.91%   80.14%   -11.77%
    ===========================================
      Files          38       40        +2
      Lines        1942     1894       -48
    ===========================================
    - Hits         1785     1518      -267
    - Misses        157      376      +219

... and 35 files with indirect coverage changes
Thanks for the review! (as well as typo catching - spelling is not my strong suit)
Yes, that would be equivalent to a tilted run with a dynamic probability of accepting worse-scoring plans. Although, it might be simpler to call a simulated annealing run with the beta function `beta_function = lambda _: 1`, which has slightly less computational overhead than overloading the
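A rough sketch of why the two calls coincide, assuming a Metropolis-style acceptance rule for a maximization problem in which a worse proposal is accepted with probability exp(beta * score_delta). The `hot_cold_cycle` function below is a hypothetical stand-in for the factory discussed above, written only to show the shape of the argument; the library's actual rule and factory may differ.

```python
import math


def acceptance_probability(score_delta, beta):
    """Assumed rule for maximization, with score_delta = child - parent."""
    if score_delta >= 0:
        return 1.0
    return math.exp(beta * score_delta)


def hot_cold_cycle(duration_hot, duration_cold):
    """Hypothetical stand-in: beta is 0 while hot and 1 while cold."""
    cycle_length = duration_hot + duration_cold

    def beta_function(step):
        return 0 if step % cycle_length < duration_hot else 1

    return beta_function


# The constant beta function suggested in the comment above.
constant_beta = lambda _: 1

# A cycle with no hot phase never leaves the cold regime.
only_cold = hot_cold_cycle(0, 1000)
same_schedule = all(only_cold(t) == constant_beta(t) for t in range(3000))

# Cold: better plans are always accepted, worse plans with a probability
# that shrinks as the score drop grows. Hot (beta = 0): everything is accepted.
p_cold_small_drop = acceptance_probability(-1, constant_beta(0))
p_cold_large_drop = acceptance_probability(-3, constant_beta(0))
p_hot = acceptance_probability(-3, 0)
```

Under these assumptions the constant beta function gives the same schedule while skipping the modular arithmetic on every step.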
Yes I can add some more context/docs to the notebook! I'd like to expand on the pros/cons of the different optimization methods, although that might take way longer runs to show in a plot so I'm not sure if an example notebook is the place for that code. I also think it might be useful to show the usage of the |
Definitely agree it would be good to show |
…xpose `best_part`, `best_score`, and `score` as readonly properties. Add stubs for new cycling beta functions.
Looks good! I just updated some stuff in the |
This was fixed, but GitHub will not show the comment in the code for me to resolve the conversation, so I have to do this the long way.
This PR adds common optimization methods to the gerrychain codebase.
The `SingleMetricOptimizer` class represents the class of optimization problems over a single plan metric and currently implements short bursts, a few variants, and tilted runs, with more to come.
The `Gingleator` class is a subclass of `SingleMetricOptimizer` and can be used to search for plans with increased numbers of Gingles' districts.
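For readers unfamiliar with short bursts, the idea can be sketched on a toy problem. This is an illustration only and does not use the `SingleMetricOptimizer` API: the function name `short_bursts`, its parameters, and the integer random walk are assumptions, and ties are broken in favor of the most recent state.

```python
import random


def short_bursts(initial, propose, score, burst_length, num_bursts, rng):
    """Run many short unbiased walks, restarting each from the best state so far."""
    best, best_score = initial, score(initial)
    for _ in range(num_bursts):
        state = best  # every burst restarts from the best state seen so far
        for _ in range(burst_length):
            state = propose(state, rng)  # unbiased step, always taken
            current = score(state)
            if current >= best_score:  # prefer the most recent state on ties
                best, best_score = state, current
    return best, best_score


# Toy problem: walk on the integers and try to get close to 10.
def step(x, rng):
    return x + rng.choice((-1, 1))


def closeness(x):
    return -abs(x - 10)


rng = random.Random(0)
best, best_score = short_bursts(0, step, closeness, burst_length=5, num_bursts=200, rng=rng)
```

Tilted runs differ in that they keep a single long chain and accept worse proposals with some probability p rather than restarting from the best state.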