Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++: Optimize SubBasicBlocks library #1824

Merged
merged 3 commits into from
Aug 26, 2019
Merged

C++: Optimize SubBasicBlocks library #1824

merged 3 commits into from
Aug 26, 2019

Conversation

jbj
Copy link
Contributor

@jbj jbj commented Aug 26, 2019

Some predicates in this library had performance that was quadratic in the number of sub-basic-blocks per basic block. After field flow was added, that has become a problem on two projects: intel/mkl-dnn and laurentkneip/opengv. This PR should fix the join order and make these predicates faster on all projects.

@jbj jbj added C++ Priority PR that should be reviewed and merged as a matter of priority. labels Aug 26, 2019
@jbj jbj added this to the 1.22 milestone Aug 26, 2019
@jbj jbj requested a review from pavgust August 26, 2019 10:32
@jbj jbj requested a review from a team as a code owner August 26, 2019 10:32
Copy link
Contributor

@pavgust pavgust left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@pavgust pavgust merged commit deacc23 into github:master Aug 26, 2019
pragma[nomagic]
private int outerPosToInnerPos(BasicBlock bb, int posInBB) {
posInBB = result + this.getIndexInBasicBlock(bb) and
result = [ 0 .. this.getNumberOfNodes() - 1 ]
Copy link
Contributor

@geoffw0 geoffw0 Aug 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How confident are we that this isn't quadratic? Naively, it seems that for any particular node bb, posInBB, every possible this in bb must be checked.

Other than this question, FWIW, these changes LGTM also.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the DIL we generate for this predicate:

noinline nomagic
SubBasicBlocks::SubBasicBlock::outerPosToInnerPos#ffff(/* SubBasicBlocks::SubBasicBlock */ cached entity this,
                                                       /* BasicBlocks::BasicBlock */ cached entity bb,
                                                       int posInBB, int result) :-
  exists(cached int call_result |
    exists(int range#high |
      exists(int Sub#lhs |
        SubBasicBlocks::SubBasicBlock::getIndexInBasicBlock#fff(this, bb,
                                                                call_result),
        SubBasicBlocks::SubBasicBlock::getNumberOfNodes#ff(this, Sub#lhs),
        Minus(Sub#lhs, 1, range#high)
      ),
      range(0, range#high, result)
    ),
    PlusInt(result, call_result, posInBB)
  )
.

It first joins in the two getters (functional relations), so the tuple count will be the number of SubBasicBlock. It then joins with the range, which multiplies that tuple count by the average number of nodes per SubBasicBlock. All together, this predicate should have the same number of tuples as there are ControlFlowNodes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, that gives me a bit more confidence.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C++ Priority PR that should be reviewed and merged as a matter of priority.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants