Create a ParallelCluster via AWS CLI
Create the cluster
pcluster create-cluster -c <pcluster-create.yml> -n <cluster-name> -r <region>
Checklist for pcluster-create.yml
Minimum required sections
Region, Image, HeadNode, and Scheduling (with SlurmQueues).
SSH access
Head node security group allows inbound TCP 22 from your IP (or VPN / bastion).
HeadNode.Ssh.KeyName matches an existing EC2 key pair in the region.
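The key pair check can be scripted. The following is a minimal sketch, assuming the AWS CLI is installed and configured; the function name key_pair_exists and the values in the usage line are placeholders, not part of the original setup.

```shell
# Succeeds (exit 0) if the named EC2 key pair exists in the given region.
# Note: a non-zero exit can also mean missing credentials or permissions,
# not only a missing key pair.
key_pair_exists() {
    # $1 = key pair name, $2 = region
    aws ec2 describe-key-pairs --key-names "$1" --region "$2" >/dev/null 2>&1
}

# Usage (placeholder values):
#   key_pair_exists my-keypair us-east-1 && echo "found" || echo "missing"
```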
Networking
Head node subnet, compute subnets, and any FSx file system are in the same VPC.
Security groups referenced in the config belong to the same VPC.
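One way to confirm the same-VPC requirement for subnets is to list their VPC IDs and check that only one comes back. This is a sketch under the assumption that the AWS CLI is configured; subnet_vpcs is a hypothetical helper name.

```shell
# Print the distinct VPC IDs of the given subnets, one per line.
# A single line of output means all subnets share one VPC.
subnet_vpcs() {
    # $1 = region, remaining arguments = subnet IDs
    region="$1"; shift
    aws ec2 describe-subnets --region "$region" --subnet-ids "$@" \
        --query 'Subnets[].VpcId' --output text | tr '\t' '\n' | sort -u
}

# Usage (placeholder IDs):
#   subnet_vpcs us-east-1 subnet-aaaa subnet-bbbb
```

The same idea should carry over to security groups with aws ec2 describe-security-groups and a 'SecurityGroups[].VpcId' query.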
Slurm queues
Each queue in SlurmQueues has a unique Name.
Each queue includes CapacityType, Networking.SubnetIds, and ComputeResources.
Each compute resource specifies InstanceType and integer MinCount / MaxCount.
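Structural problems like the ones in this checklist can often be caught before any resources are created. The sketch below assumes your pcluster version (3.x) supports the --dryrun option on create-cluster; check pcluster create-cluster --help if unsure. validate_config is a hypothetical wrapper name.

```shell
# Ask pcluster to validate the configuration without creating the cluster.
# Assumption: --dryrun true is available in the installed pcluster version.
validate_config() {
    # $1 = config file, $2 = cluster name, $3 = region
    pcluster create-cluster -c "$1" -n "$2" -r "$3" --dryrun true
}

# Usage (placeholder values):
#   validate_config pcluster-create.yml my-cluster us-east-1
```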
IAM note (common failure point)
The AWS identity used to run pcluster create-cluster must have permission to create and manage the required resources (EC2, CloudFormation, IAM) and to pass IAM roles.
You can verify the active identity with:
aws sts get-caller-identity
Example pcluster-create.yml
An example pcluster-create.yml:
Region: us-east-1
Image:
  Os: ubuntu2404
  CustomAmi: ami-08ea421ec9fad0a09
HeadNode:
  InstanceType: c5.large
  Ssh:
    KeyName: dzhang  # <-- change to your keypair name
  Networking:
    SubnetId: subnet-08895ae58a2f9167d
    AdditionalSecurityGroups:
      - sg-0f504cd36a5dc0b34
Scheduling:
  Scheduler: slurm
  SlurmSettings:
    QueueUpdateStrategy: DRAIN
  SlurmQueues:
    - Name: ondemand
      CapacityType: ONDEMAND
      Networking:
        SubnetIds:
          - subnet-08895ae58a2f9167d
        AdditionalSecurityGroups:
          - sg-0f504cd36a5dc0b34
      ComputeResources:
        - Name: c8a12xl
          InstanceType: c8a.12xlarge
          MinCount: 0
          MaxCount: 20
    - Name: spot
      CapacityType: SPOT
      Networking:
        SubnetIds:
          - subnet-08895ae58a2f9167d
        AdditionalSecurityGroups:
          - sg-0f504cd36a5dc0b34
      ComputeResources:
        - Name: c8a12xl
          InstanceType: c8a.12xlarge
          MinCount: 0
          MaxCount: 50
SharedStorage:
  - Name: fsx_input
    StorageType: FsxLustre
    MountDir: /fsx_input
    FsxLustreSettings:
      FileSystemId: fs-009ecfd7e44882657
Monitor the creation process
pcluster describe-cluster -n <cluster-name> -r <region>
clusterStatus:
CREATE_IN_PROGRESS → normal
CREATE_COMPLETE → success
CREATE_FAILED → stop and debug
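Rather than re-running describe-cluster by hand, the status check can be polled in a loop. This is a rough sketch: it assumes the pcluster --query option (a JMESPath filter) is available and prints the status as a JSON string. wait_for_cluster and POLL_SECONDS are hypothetical names.

```shell
# Poll clusterStatus until creation finishes.
# Returns 0 on CREATE_COMPLETE, 1 on CREATE_FAILED or any unexpected status.
wait_for_cluster() {
    # $1 = cluster name, $2 = region
    while :; do
        # tr strips the surrounding JSON quotes from the query result
        status=$(pcluster describe-cluster -n "$1" -r "$2" --query clusterStatus | tr -d '"')
        echo "clusterStatus: $status"
        case "$status" in
            CREATE_IN_PROGRESS) sleep "${POLL_SECONDS:-30}" ;;  # normal, keep waiting
            CREATE_COMPLETE)    return 0 ;;                     # success
            *)                  return 1 ;;                     # stop and debug
        esac
    done
}

# Usage (placeholder values):
#   wait_for_cluster my-cluster us-east-1 || echo "creation did not complete"
```

Treating every status other than CREATE_IN_PROGRESS and CREATE_COMPLETE as a failure keeps the loop from spinning forever when the command errors out or the cluster is in some other state.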
Delete ParallelCluster
Delete by cluster name
pcluster delete-cluster -n <cluster-name> -r <region>
Find the cluster name (if forgotten)
If the head node still exists, you can infer the cluster name from the
head node IAM role in the EC2 console:
the string before -RoleHeadNode is typically the cluster name.
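If you have copied the role name from the console, the prefix can be extracted with shell parameter expansion. This is a small sketch; infer_cluster_name is a hypothetical helper and the role name in the usage line is made up for illustration.

```shell
# Print the part of an IAM role name that precedes "-RoleHeadNode".
# If the marker is absent, the input is printed unchanged.
infer_cluster_name() {
    # %% removes the longest suffix matching the pattern, i.e. everything
    # from the first "-RoleHeadNode" onward
    printf '%s\n' "${1%%-RoleHeadNode*}"
}

# Usage (made-up role name):
#   infer_cluster_name "my-cluster-RoleHeadNode-ABC123"
```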
If the head node has already been deleted, list clusters in the region:
pcluster list-clusters -r <region>
Then inspect a specific cluster:
pcluster describe-cluster -n <cluster-name> -r <region>
Recover the configuration YAML used to create a cluster
From the output of pcluster describe-cluster, look for:
"clusterConfiguration": {
  "url": "https://parallelcluster-.../clusters/<cluster-name>/configs/cluster-config.yaml?versionId=..."
}
Download the configuration file:
curl -o cluster-config.yaml "<that-URL>"
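The lookup and download can be combined into one step. The sketch below assumes the pcluster --query option is available and returns the URL as a JSON string; fetch_cluster_config is a hypothetical helper name.

```shell
# Look up the configuration URL for a cluster and download the YAML.
fetch_cluster_config() {
    # $1 = cluster name, $2 = region, $3 = output file (optional)
    url=$(pcluster describe-cluster -n "$1" -r "$2" --query clusterConfiguration.url | tr -d '"')
    # -f makes curl exit non-zero on an HTTP error instead of saving the error page
    curl -fsS -o "${3:-cluster-config.yaml}" "$url"
}

# Usage (placeholder values):
#   fetch_cluster_config my-cluster us-east-1 recovered-config.yaml
```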