
Graphs from integrated circuit design have higher than expected cuts #145

profmadden opened this issue Mar 18, 2023 · 11 comments

@profmadden

Hi -- I'm trying to figure out if I'm running things incorrectly, or if there's something odd with the graphs I'm feeding into KaHyPar.

My application is a recursive bisection based circuit placer -- a modern version of this:
https://dl.acm.org/doi/10.1145/981066.981084

The input graph is a net list for a circuit (with varying vertex weights -- small values for standard cells, large for macro blocks -- weight is effectively logic element physical area).

My old code uses hMetis 2.0 (on a Mac -- the 2.0 libraries were released, and are still running). I've modified my placer so that I can switch between hMetis and KaHyPar, and am finding that the results are consistently better with hMetis. My thinking is that either I'm using an incorrect "ini" file, or there's something structurally weird with my graphs that's causing trouble. The difference is about 10% higher cuts on average (and the objective function is cut).

In my application, I have a single fixed source and sink vertex (each with a weight of 1), and lots of other vertices (weights starting at 100 for small cells, and up). I'm using cut_kKaHyPar_sea20.ini, which I think would be the right fit. There are a lot of parameters, and I'm not sure which ones to try tweaking.

Any idea what's happening, or what I should do to improve performance? I can supply a variety of hgr files that illustrate what I'm seeing, if that will help.

@SebastianSchlag (Member) commented Mar 19, 2023

Hi @profmadden,

thanks for reaching out.

> My thinking is that either I'm using an incorrect "ini" file, or there's something structurally weird with my graphs that's causing trouble. The difference is about 10% higher cuts on average (and the objective function is cut).
>
> In my application, I have a single fixed source and sink vertex (each with a weight of 1), and lots of other vertices (weights starting at 100 for small cells, and up). I'm using cut_kKaHyPar_sea20.ini, which I think would be the right fit. There are a lot of parameters, and I'm not sure which ones to try tweaking.

This is interesting. A few things spontaneously come to my mind:

  • Do the solutions computed by hMetis satisfy the balance constraint? In previous experiments, we noticed that especially the k-way partitioning algorithm of hMetis can produce imbalanced results quite often [1]. This seemed to be most pronounced for hypergraphs with vertex weights [2]. KaHyPar in a sense sacrifices solution quality if that is necessary to obtain a feasible solution, so this might be one possible explanation.
  • The cut_kKaHyPar_sea20.ini config is the right preset to use.
    • In our experimental evaluations, we noticed that the community detection algorithm can sometimes (although rarely) misguide the coarsening algorithm, which then leads to poorly coarsened graphs with a very bad initial partition. To test whether this is the case for your instances, you can set p-detect-communities=false.
    • Although whether it is activated depends on the net sizes of the hypergraph, it might also make sense to try disabling the pin sparsifier with p-use-sparsifier=false. I just double-checked: there is actually an old issue to extend the sparsifier to properly handle weighted hypergraphs. Unfortunately, this has not happened yet.
  • Since your hypergraphs have two fixed vertices, it could also be that the current fixed-vertex contraction policy messes things up. If this is the case, changing c-fixed-vertex-acceptance-criterion from fixed_vertex_allowed to free_vertex_only might help. The README has a bit more detail about this.
  • Out of curiosity: What values of k and epsilon are you using in your application?
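For example, a diagnostic run combining these settings could look like the sketch below. This assumes the stand-alone binary accepts the ini keys as --key=value overrides on top of the preset; instance.hgr is a placeholder:

```sh
kahypar -h instance.hgr -k 2 -e 0.05 -o cut -m direct -p cut_kKaHyPar_sea20.ini \
    --p-detect-communities=false \
    --p-use-sparsifier=false \
    --c-fixed-vertex-acceptance-criterion=free_vertex_only
```

Toggling one override at a time should show which component (community detection, sparsifier, or fixed-vertex policy) is responsible for the quality gap, if any.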

> I can supply a variety of hgr files that illustrate what I'm seeing, if that will help.

This would actually be very much appreciated. I would be really interested to find out what is going on there. It could very well be that there's a bug in KaHyPar that is triggered when partitioning weighted hypergraphs with fixed vertices.

@N-Maas @kittobi1992 FYI. This is something that we should investigate.

[1] https://arxiv.org/pdf/2106.08696.pdf
[2] https://arxiv.org/pdf/2102.01378.pdf

@profmadden (Author)

Hi -- thanks for the help. I'm always doing bisection (k=2), and normally aim for an imbalance of about 5%. I've put some of the graphs that fall out of my placement tool here: https://www.cs.binghamton.edu/~pmadden/benchmark.tgz

There's a "partbench" script that runs things.

#!/bin/tcsh

foreach B (000 001 002 003 004 005 006 007 009 009 010)
  echo Partitioning Test - fs-save$B.hgr fs-save$B.fix
  ~/bin/hmetis fs-save$B.hgr fs-save$B.fix 2 5 10 1 1 3 0 0 | grep "Hyperedge Cut"
  ~/bin/kahypar -h fs-save$B.hgr -k 2 -e 0.05 -o cut -m direct -p kahypar.ini -f fs-save$B.fix | grep "Hyperedge Cut"
end

The stand-alone partitioners for hmetis (1.5) and kahypar are run from the command line, but I'm seeing the same sort of behavior when I'm using the library interfaces.

Here are the results from my runs -- hMetis is consistently a bit better (though there may be a difference in how the balance constraints are being handled).


[Rocinante:stdcell/ibm-v2/ibm01] pmadden% ./partbench
Partitioning Test - fs-save000.hgr fs-save000.fix
                Hyperedge Cut:       103		(minimize)
Hyperedge Cut  (minimize) = 117 
Partitioning Test - fs-save001.hgr fs-save001.fix
                Hyperedge Cut:       199		(minimize)
Hyperedge Cut  (minimize) = 201 
Partitioning Test - fs-save002.hgr fs-save002.fix
                Hyperedge Cut:        91		(minimize)
Hyperedge Cut  (minimize) = 94 
Partitioning Test - fs-save003.hgr fs-save003.fix
                Hyperedge Cut:       160		(minimize)
Hyperedge Cut  (minimize) = 165 
Partitioning Test - fs-save004.hgr fs-save004.fix
                Hyperedge Cut:        88		(minimize)
Hyperedge Cut  (minimize) = 93 
Partitioning Test - fs-save005.hgr fs-save005.fix
                Hyperedge Cut:       160		(minimize)
Hyperedge Cut  (minimize) = 164 
Partitioning Test - fs-save006.hgr fs-save006.fix
                Hyperedge Cut:       194		(minimize)
Hyperedge Cut  (minimize) = 216 
Partitioning Test - fs-save007.hgr fs-save007.fix
                Hyperedge Cut:       132		(minimize)
Hyperedge Cut  (minimize) = 138 
Partitioning Test - fs-save009.hgr fs-save009.fix
                Hyperedge Cut:       142		(minimize)
Hyperedge Cut  (minimize) = 161 
Partitioning Test - fs-save009.hgr fs-save009.fix
                Hyperedge Cut:       143		(minimize)
Hyperedge Cut  (minimize) = 161 
Partitioning Test - fs-save010.hgr fs-save010.fix
                Hyperedge Cut:        88		(minimize)
Hyperedge Cut  (minimize) = 92 

Thanks very much for any insight!

@SebastianSchlag (Member) commented Mar 24, 2023

Thank you very much for these examples! I'll try to look a bit into this as soon as possible.

For k=2, the results might not be that surprising, though. At least from what we've seen experimentally (e.g., p. 194 in [1]), hMetis actually performs pretty well for bipartitioning, but its performance gets worse for k>2.

I was just looking at your hMetis parameter settings in the partbench script:

~/bin/hmetis fs-save$B.hgr fs-save$B.fix 2 5 10 1 1 3 0 0

If I understand these parameters correctly, you are using 10 runs and the 3 setting for V-cycles.

KaHyPar doesn't perform any V-cycles by default, but the number of V-cycles can be set using the --vcycles parameter. However, unlike hMetis, which, in mode 3, runs V-cycle refinement on each intermediate solution, KaHyPar currently only supports V-cycle refinement on the final solution (which essentially corresponds to the 1 setting of hMetis' Vcycle parameter).
With both nruns=10 and Vcycle=3, a single hMetis call therefore gets more tries to come up with the best partition than a single KaHyPar call does.

While KaHyPar doesn't have a parameter comparable to nruns, one thing we do currently support is time-limited partitioning: by setting --time-limited-repeated-partitioning=true and a time limit via --time-limit=X, KaHyPar will repeatedly partition the hypergraph until the time is up and use the best solution found during this time as the final solution.
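Putting the V-cycle and time-limit knobs together, an invocation might look like the following sketch. Here graph.hgr is a placeholder, and the time limit of 10 is an arbitrary example value (which I'd assume to be seconds):

```sh
kahypar -h graph.hgr -k 2 -e 0.05 -o cut -m direct -p cut_kKaHyPar_sea20.ini \
    --vcycles=3 \
    --time-limited-repeated-partitioning=true \
    --time-limit=10
```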

[1] https://publikationen.bibliothek.kit.edu/1000105953

@SebastianSchlag (Member)

@kittobi1992 Since MT-KaHyPar also runs in a traditional multi-level setting, a V-cycle approach like hMetis' 3 setting might actually be something that could further improve solution quality.

@profmadden (Author)

Thanks -- that's a few things for me to try (I hope to have time in the next week or two).

I'll turn on v-cycles, and I'll see if I can emulate multiple trials by calling KaHyPar multiple times (with different random seeds). During circuit placement, the partitioner will get called millions of times -- so it's hard to set a time limit that would make sense. I've used 10 runs on hMetis, because it seems to be a good "knee in the curve" for the graphs that go through the placement engine.

Is there an easy way to set the random seed of a context? For consistency across runs, I try to use the same random seed, so that I can replicate a result if needed.

@SebastianSchlag (Member)

> Is there an easy way to set the random seed of a context? For consistency across runs, I try to use the same random seed, so that I can replicate a result if needed.

Yes, you can use the --seed parameter for that.

@profmadden (Author)

Ah -- to be clear -- I'd like to do this from the library interface. Given a context -- is there a khcontext_set_seed(context, n) sort of call?

Worst case, I can create a set of ini files with different seeds, then load a bunch of different contexts, but that seems like a bit of a kludge.

@SebastianSchlag (Member)

Oh, I see. Just checked the interface again. Seems like this is indeed currently not possible. Let me see if I can add support for this over the weekend.

@profmadden (Author)

Not a huge rush -- I'm buried with other stuff for a bit. Very much appreciate the help!

@SebastianSchlag (Member)

Done: #147

@SebastianSchlag (Member)

Hey @profmadden,

just a quick heads-up: @kittobi1992 recently found an issue with VLSI hypergraphs from the Titan23 benchmark set, and we are currently preparing a fix: #172
Maybe this could help with your cases as well.
