Unable to create a cluster out of an HPC Image derived from a VHD - package epel-release is not installed epel-release-7-11.noarch #461

Open
souvik-de opened this issue Jan 27, 2021 · 3 comments
Labels
bug Something isn't working

Comments

@souvik-de

souvik-de commented Jan 27, 2021

Describe the bug
We have a pipeline that allows us to test a CentOS VHD. The pipeline downloads the VHD into a storage account and then creates an image out of it. This image is then fed into the azhpc scripts to deploy a cluster, and benchmarks are run. Before December 2020 we never had an issue doing this, but now azhpc-build fails at the install_node_setup.sh step with the message "package epel-release is not installed epel-release-7-11.noarch".

To Reproduce
Steps to reproduce the behavior:

  1. Have a CentOS-HPC VHD at your disposal.
  2. Download it on to a storage account and create an image out of it.
  3. Utilize the azhpc scripts and the image to deploy a cluster.
  4. You should encounter the error here.

Expected behavior
As before Dec 2020, the azhpc-build should be able to deploy a cluster out of the image.

Screenshots
image

Configuration (please complete the following information):

  • OS and version: CentOS 7.6 HPC (Test VHD)
  • Context of execution: Ubuntu from WSL2
@souvik-de souvik-de added the bug Something isn't working label Jan 27, 2021
@xpillons
Collaborator

@edwardsp can you please have a look?

@edwardsp
Collaborator

@souvik-de this is just failing as you are unable to ssh from the jumpbox to the compute instance. Have you tried to access the VMSS instance yourself (as you are able to connect to the jumpbox)? Also, does this happen consistently or just occasionally?
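One way to run that manual check is sketched below in Python. It only builds (and optionally runs) a standard OpenSSH command that hops through the jumpbox with `-J`. The host names, the user name, and the helper names `build_ssh_probe` and `probe` are hypothetical placeholders, not part of azhpc.

```python
import subprocess
from typing import List


def build_ssh_probe(jumpbox: str, compute: str, user: str) -> List[str]:
    """Build an ssh command that reaches `compute` through `jumpbox`.

    All arguments are placeholders supplied by the caller; nothing here
    comes from the azhpc configuration.
    """
    return [
        "ssh",
        # Fail instead of prompting for a password or passphrase.
        # Note: -o options on the command line generally apply to the
        # destination host, not to the jump host.
        "-o", "BatchMode=yes",
        "-o", "ConnectTimeout=10",
        # Hop through the jumpbox (OpenSSH ProxyJump).
        "-J", f"{user}@{jumpbox}",
        f"{user}@{compute}",
        # Trivial remote command: success means login works end to end.
        "true",
    ]


def probe(jumpbox: str, compute: str, user: str) -> int:
    """Run the probe and return ssh's exit status (255 signals an ssh error)."""
    return subprocess.run(build_ssh_probe(jumpbox, compute, user), check=False).returncode


if __name__ == "__main__":
    # Hypothetical values, shown only to print the resulting command line.
    print(" ".join(build_ssh_probe("jumpbox.example.com", "10.0.0.4", "hpcuser")))
```

If `probe` returns 0, key-based login to the compute instance works and the failure lies elsewhere. A return value of 255 means ssh itself failed, either to connect or to authenticate.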

@souvik-de
Author

I cannot ssh into the headnode, even after resetting access with a password. The errors are "Permission denied (publickey,gssapi-keyex,gssapi-with-mic)" and "Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password)". This happens consistently.
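For reference, the parenthesised list in an OpenSSH "Permission denied" line names the authentication methods the server would still accept. The short Python sketch below compares the two messages quoted above. The helper name `offered_auth_methods` is hypothetical, and treating the two messages as an earlier and a later attempt is an assumption.

```python
import re
from typing import List


def offered_auth_methods(error_line: str) -> List[str]:
    """Return the methods listed in an OpenSSH 'Permission denied (...)' line."""
    match = re.search(r"Permission denied \(([^)]*)\)", error_line)
    if match is None:
        return []
    return [m.strip() for m in match.group(1).split(",") if m.strip()]


# The two messages from the report (their order in time is assumed).
first_message = "Permission denied (publickey,gssapi-keyex,gssapi-with-mic)"
second_message = "Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password)"

# Methods that appear only in the second message.
newly_offered = sorted(
    set(offered_auth_methods(second_message)) - set(offered_auth_methods(first_message))
)
print(newly_offered)
```

Under that assumption, the only difference is that the second message also lists `password`. That suggests the server started offering password login, yet the login was still rejected.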
