Feature #5125

Closed
francisreyes-tfs opened this issue Mar 25, 2023 · 0 comments

If you have an active AWS support contract, please report the issue by opening a case with the AWS Premium Support team, following the documentation below:
https://docs.aws.amazon.com/awssupport/latest/user/case-management.html

Before submitting a new issue, please search through open GitHub Issues and check out the troubleshooting documentation.

Please include the following information to help identify the root cause.

Required Info:

  • AWS ParallelCluster version [e.g. 3.1.1]:
  • Full cluster configuration without any credentials or personal data.
  • Cluster name:
  • Output of pcluster describe-cluster command.
  • [Optional] ARN of the cluster CloudFormation main stack:

Bug description and how to reproduce:
A clear and concise description of what the bug is and the steps to reproduce the behavior.

If you are reporting issues about scaling or job failure:
We cannot work on issues without proper logs. We STRONGLY recommend following this guide and attaching the complete cluster log archive to the ticket.

For issues with Slurm scheduler, please attach the following logs:

  • From Head node: /var/log/parallelcluster/clustermgtd, /var/log/parallelcluster/clusterstatusmgtd (if version >= 3.2.0), /var/log/parallelcluster/slurm_resume.log, /var/log/parallelcluster/slurm_suspend.log, /var/log/parallelcluster/slurm_fleet_status_manager.log (if version >= 3.2.0) and /var/log/slurmctld.log.
  • From Compute node: /var/log/parallelcluster/computemgtd.log and /var/log/slurmd.log.
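One possible way to gather the head node logs listed above into a single archive is sketched below. This is not part of ParallelCluster: the `bundle_logs` helper and the archive name are hypothetical, the paths are copied as written in the list above, and reading files under /var/log may require elevated permissions.

```python
import os
import tarfile

# Head node log paths, copied as written in the list above.
# The two marked paths only apply to ParallelCluster >= 3.2.0.
HEAD_NODE_LOGS = [
    "/var/log/parallelcluster/clustermgtd",
    "/var/log/parallelcluster/clusterstatusmgtd",  # >= 3.2.0
    "/var/log/parallelcluster/slurm_resume.log",
    "/var/log/parallelcluster/slurm_suspend.log",
    "/var/log/parallelcluster/slurm_fleet_status_manager.log",  # >= 3.2.0
    "/var/log/slurmctld.log",
]


def bundle_logs(paths, archive_path):
    """Hypothetical helper: add each existing path to a gzip tarball.

    Returns (added, missing) so it is easy to see which logs were
    not found on this node.
    """
    added, missing = [], []
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in paths:
            if os.path.exists(path):
                tar.add(path)
                added.append(path)
            else:
                missing.append(path)
    return added, missing
```

Calling `bundle_logs(HEAD_NODE_LOGS, "head_node_logs.tar.gz")` on the head node would produce an archive to attach to the ticket, and the returned `missing` list shows which of the paths did not exist there.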

If you are reporting issues about cluster creation failure or node failure:

If cluster creation fails, please re-run the create-cluster action with the --rollback-on-failure false option.

We cannot work on issues without proper logs. We STRONGLY recommend following this guide and attaching the complete cluster log archive to the ticket.

Please be sure to attach the following logs:

  • From Head node: /var/log/cloud-init.log, /var/log/cfn-init.log and /var/log/chef-client.log
  • From Compute node: /var/log/cloud-init-output.log.

Additional context:
Any other context about the problem. E.g.:

  • CLI logs: ~/.parallelcluster/pcluster-cli.log
  • Custom bootstrap scripts, if any
  • Screenshots, if useful.