Duplicated Maps #137

zmarkovich · 2022-05-10T23:37:32Z

Hello,

While using the redist_smc function, we've encountered a problem where the simulations seem to include a very large number of identical maps. Indeed, the problem is so bad that in one case only 437 unique maps were generated out of 10,000 draws. This seems to occur regardless of the simulation parameters.

Is this intended behavior? Are there any parameters we should tweak to resolve the issue? I'd be happy to email you files/code for a reproducible example; the data is just a bit large for GitHub.

Thank you for your help with understanding this behavior.

Best,
Zach

CoryMcCartan · 2022-05-11T03:50:18Z

The SMC algorithm isn't designed to generate as many independent maps as possible; it tries to produce a representative sample so that when you take averages w.r.t. that sample, they are correct. That being said, 437 uniques out of 10,000 is on the low side. How many districts and precincts does your map have, and are you using any constraints?
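For reference, the number of unique maps in a sample can be counted by dropping duplicated columns from the plans matrix. The following is a minimal sketch in base R on a toy matrix; `plans_mat` and its values are hypothetical. With a real plans object, the matrix would presumably come from `redist::get_plans_matrix()`, which is an assumption here since the thread does not show how the 437 figure was computed.

```r
# Toy stand-in for a plans matrix: one row per precinct, one column per draw.
# Hypothetical data; with a real redist_plans object the matrix would be
# obtained with something like redist::get_plans_matrix(plans).
plans_mat <- cbind(
    c(1, 1, 2, 2),
    c(1, 2, 1, 2),
    c(1, 1, 2, 2),  # exact duplicate of the first draw
    c(1, 2, 2, 1)
)

# unique() with MARGIN = 2 keeps one copy of each distinct column.
n_unique <- ncol(unique(plans_mat, MARGIN = 2))
n_draws <- ncol(plans_mat)
frac_unique <- n_unique / n_draws
```

This compares district labels exactly, so two draws that describe the same partition under different labels would be counted as distinct.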

zmarkovich · 2022-05-15T14:57:11Z

Thanks for the follow-up. We're drawing 63 districts out of about 15k precincts. In terms of constraints, we limited county splits using the counties argument (we specified 62 counties in the state). We also set seq_alpha=.25. The only other constraint is population tolerance (set to .05 for state legislative redistricting); compactness was left at the default (1), along with all other defaults. One interesting thing is that we didn't have nearly as much duplication in another set of simulations where we just left seq_alpha at its default; not sure if that's relevant info or not.

CoryMcCartan · 2022-05-16T23:49:19Z

OK, so 63 is a relatively large number of districts. I think seq_alpha=0.6 or 0.7 is probably more appropriate: smaller values like 0.25 won't prune bad samples aggressively enough and will let the range of the weights grow too extreme.
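The effect of seq_alpha on pruning and on the spread of the weights can be illustrated with a small base-R sketch. It assumes a scheme in which each sample is resampled with probability proportional to `w^alpha` and carries `w^(1 - alpha)` forward to the next iteration. That split is an assumption about how seq_alpha enters the algorithm, not something stated in this thread, and the weights and the helper `split_weights()` are hypothetical.

```r
# Toy weights for four samples spanning three orders of magnitude (hypothetical).
w <- c(0.01, 0.1, 1, 10)

# Assumed scheme: resample with probability proportional to w^alpha,
# then carry w^(1 - alpha) forward as the remaining weight.
split_weights <- function(w, alpha) {
    resample_prob <- w^alpha / sum(w^alpha)
    carried <- w^(1 - alpha)
    list(
        # chance that the lowest-weight sample is picked in one resampling draw
        worst_survival = resample_prob[which.min(w)],
        # ratio of the largest to the smallest carried-forward weight
        carried_range = max(carried) / min(carried)
    )
}

low <- split_weights(w, alpha = 0.25)
high <- split_weights(w, alpha = 0.7)
```

Under these assumptions, `low$worst_survival` is larger than `high$worst_survival` (the worst sample is pruned less often at 0.25), and `low$carried_range` is larger than `high$carried_range` (the carried-forward weights stay more spread out), which matches the direction of the advice above.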

If you install the current dev version (soon to be 4.0) with remotes::install_github("alarm-redist/redist@dev"), you can call summary() on your plans object and see some useful diagnostic information that will help with this. In particular, there's a column which keeps track of the rough number of unique plans seen at each iteration. As you adjust seq_alpha you should see how these trend -- ideally you want the # at each iteration to be roughly the same, & certainly not a big drop at the end.

CoryMcCartan added the question label May 11, 2022

CoryMcCartan closed this as completed Jun 1, 2022
