Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't emit atomics for thread-safe assignment patterns #401

Closed
wants to merge 1 commit into from

Conversation

Infinoid
Copy link
Contributor

This attempts to fix #316.

It prevents a #pragma omp atomic from being emitted for the assignment to y_vals[i] in this case:

$ bin/taco 'y(i) = A(i, j) * x(j)' -f=A:ds -s='parallelize(i, CPUThread, Atomics)'
// Generated by the Tensor Algebra Compiler (tensor-compiler.org)

int compute(taco_tensor_t *y, taco_tensor_t *A, taco_tensor_t *x) {
  int y1_dimension = (int)(y->dimensions[0]);
  double* restrict y_vals = (double*)(y->vals);
  int A1_dimension = (int)(A->dimensions[0]);
  int* restrict A2_pos = (int*)(A->indices[1][0]);
  int* restrict A2_crd = (int*)(A->indices[1][1]);
  double* restrict A_vals = (double*)(A->vals);
  int x1_dimension = (int)(x->dimensions[0]);
  double* restrict x_vals = (double*)(x->vals);

  #pragma omp parallel for schedule(runtime)
  for (int32_t i = 0; i < A1_dimension; i++) {
    double tjy_val = 0.0;
    for (int32_t jA = A2_pos[i]; jA < A2_pos[(i + 1)]; jA++) {
      int32_t j = A2_crd[jA];
      tjy_val += A_vals[jA] * x_vals[j];
    }
    y_vals[i] = tjy_val;
  }
  return 0;
}

Because each thread gets its own value of i, no other thread will assign to that location, and recent compilers treat an atomic pragma as invalid there. See #316 for details.

Are there any other cases this doesn't cover? All tests pass for me when configuring with -DOPENMP=ON -DPYTHON=ON.

@stephenchouca
Copy link
Contributor

This patch seems to generate incorrect code for expressions like 'y(j) = A(i,j,k) * x(k)' -s='parallelize(j, CPUThread, Atomics)', which still require atomics even after scalar promotion (the generate code omits the required atomic pragma).

More fundamentally though, I wonder if we should instead just fix the lowerer so that it won't emit unnecessary atomic pragmas for (non-reduction) assignments. That way, the bug would be fixed even if, for instance, a user decides to manually invoke the lowerer with concrete index notation that has an assignment inside a Forall statement synchronized with atomics. This should only require modifying a few lines in lowerAssignment; I can take a stab at it if you'd like.

@Infinoid
Copy link
Contributor Author

This should only require modifying a few lines in lowerAssignment; I can take a stab at it if you'd like.

Sure, please do. And I'll happily add your example as a test case.

@stephenchouca
Copy link
Contributor

Sure, please do. And I'll happily add your example as a test case.

That'd be great, thanks!

@Infinoid
Copy link
Contributor Author

Your patch works, I added the test case in #402. Thanks!

Canceling PR in favor of 0a48d17.

@Infinoid Infinoid closed this Feb 12, 2021
@Infinoid Infinoid deleted the fix-316 branch February 12, 2021 13:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

taco generates invalid "#pragma omp atomic" for Store ops
2 participants