create_cluster
- ParallelComputingService.Client.create_cluster(**kwargs)
Creates a cluster in your account. Amazon Web Services PCS creates the cluster controller in a service-owned account. The cluster controller communicates with the cluster resources in your account. The subnets and security groups for the cluster must already exist before you use this API action.
Note
It takes time for Amazon Web Services PCS to create the cluster. The cluster is in a Creating state until it is ready to use. There can only be 1 cluster in a Creating state per Amazon Web Services Region per Amazon Web Services account. CreateCluster fails with a ServiceQuotaExceededException if there is already a cluster in a Creating state.
See also: AWS API Documentation
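Because the cluster stays in the Creating state for a while, callers usually poll before using it. The following is a minimal sketch, not an official waiter. It assumes a client created with boto3.client('pcs') and that get_cluster accepts a clusterIdentifier argument and returns the same 'cluster' structure described below. The function name, the polling interval, and the timeout are illustrative choices.

```python
import time


def wait_until_created(client, cluster_id, poll_seconds=30,
                       timeout_seconds=1800, sleep=time.sleep):
    """Poll get_cluster until the cluster leaves the CREATING state.

    Returns the last 'cluster' dict seen. The caller still has to check
    whether the final status is ACTIVE or CREATE_FAILED.
    """
    waited = 0
    while True:
        # Assumption: get_cluster takes clusterIdentifier (cluster name or ID).
        cluster = client.get_cluster(clusterIdentifier=cluster_id)["cluster"]
        if cluster["status"] != "CREATING":
            return cluster
        if waited >= timeout_seconds:
            raise TimeoutError(f"cluster {cluster_id} is still CREATING")
        sleep(poll_seconds)
        waited += poll_seconds
```

The sleep parameter exists only so the loop can be exercised without real delays.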
Request Syntax
response = client.create_cluster(
    clusterName='string',
    scheduler={
        'type': 'SLURM',
        'version': 'string'
    },
    size='SMALL'|'MEDIUM'|'LARGE',
    networking={
        'subnetIds': [
            'string',
        ],
        'securityGroupIds': [
            'string',
        ]
    },
    slurmConfiguration={
        'scaleDownIdleTimeInSeconds': 123,
        'slurmCustomSettings': [
            {
                'parameterName': 'string',
                'parameterValue': 'string'
            },
        ]
    },
    clientToken='string',
    tags={
        'string': 'string'
    }
)
- Parameters:
clusterName (string) –
[REQUIRED]
A name to identify the cluster. Example: MyCluster
scheduler (dict) –
[REQUIRED]
The cluster management and job scheduling software associated with the cluster.
type (string) – [REQUIRED]
The software Amazon Web Services PCS uses to manage cluster scaling and job scheduling.
version (string) – [REQUIRED]
The version of the specified scheduling software that Amazon Web Services PCS uses to manage cluster scaling and job scheduling.
size (string) –
[REQUIRED]
A value that determines the maximum number of compute nodes in the cluster and the maximum number of jobs (active and queued).
SMALL: 32 compute nodes and 256 jobs
MEDIUM: 512 compute nodes and 8,192 jobs
LARGE: 2,048 compute nodes and 16,384 jobs
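The limits above can be turned into a small lookup when the size has to be chosen programmatically. This is a sketch only: the helper name smallest_size_for is hypothetical, and the limits are copied from the list above, so they would need updating if the service changes them.

```python
# (size, max compute nodes, max jobs), smallest first, taken from the list above.
SIZE_LIMITS = [
    ("SMALL", 32, 256),
    ("MEDIUM", 512, 8192),
    ("LARGE", 2048, 16384),
]


def smallest_size_for(compute_nodes, jobs):
    """Return the smallest size value that fits both requirements."""
    for name, max_nodes, max_jobs in SIZE_LIMITS:
        if compute_nodes <= max_nodes and jobs <= max_jobs:
            return name
    raise ValueError("requirements exceed the LARGE cluster limits")
```

For example, a workload with 32 nodes but 300 queued jobs no longer fits SMALL, so the helper returns 'MEDIUM'.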
networking (dict) –
[REQUIRED]
The networking configuration used to set up the cluster’s control plane.
subnetIds (list) –
The list of subnet IDs where Amazon Web Services PCS creates an Elastic Network Interface (ENI) to enable communication between managed controllers and Amazon Web Services PCS resources. Subnet IDs have the form subnet-0123456789abcdef0. Subnets can’t be in Outposts, Wavelength or an Amazon Web Services Local Zone.
Note
Amazon Web Services PCS currently supports only 1 subnet in this list.
(string) –
securityGroupIds (list) –
A list of security group IDs associated with the Elastic Network Interface (ENI) created in subnets.
(string) –
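Since only one subnet is currently accepted, it can help to build the networking argument from a single subnet ID rather than a free-form list. The helper below is a hypothetical convenience, not part of the SDK, and it only checks the "subnet-" prefix described above. It does not verify that the subnet or security groups exist.

```python
def build_networking(subnet_id, security_group_ids=None):
    """Build the 'networking' dict with exactly one subnet ID."""
    if not subnet_id.startswith("subnet-"):
        raise ValueError(f"not a subnet ID: {subnet_id!r}")
    networking = {"subnetIds": [subnet_id]}
    if security_group_ids:
        # Copy into a list so the caller's sequence is not shared.
        networking["securityGroupIds"] = list(security_group_ids)
    return networking
```

The result can be passed directly as networking=build_networking(...) in the create_cluster call.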
slurmConfiguration (dict) –
Additional options related to the Slurm scheduler.
scaleDownIdleTimeInSeconds (integer) –
The time (in seconds) before an idle node is scaled down.
Default: 600
slurmCustomSettings (list) –
Additional Slurm-specific configuration that directly maps to Slurm settings.
(dict) –
Additional settings that directly map to Slurm settings.
parameterName (string) – [REQUIRED]
Amazon Web Services PCS supports configuration of the following Slurm parameters:
For clusters
For compute node groups
parameterValue (string) – [REQUIRED]
The values for the configured Slurm settings.
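The slurmCustomSettings list is a sequence of name/value pairs in which both fields are strings. A plain mapping is often easier to maintain, so the sketch below converts one into the expected shape. The function name is hypothetical and the parameter names in the usage note are placeholders, not a list of supported Slurm parameters.

```python
def build_slurm_configuration(custom_settings, scale_down_idle_seconds=None):
    """Build the 'slurmConfiguration' dict from a plain mapping."""
    config = {
        "slurmCustomSettings": [
            # Both fields are documented as strings, so coerce values.
            {"parameterName": str(name), "parameterValue": str(value)}
            for name, value in custom_settings.items()
        ]
    }
    if scale_down_idle_seconds is not None:
        config["scaleDownIdleTimeInSeconds"] = int(scale_down_idle_seconds)
    return config
```

For instance, build_slurm_configuration({"PlaceholderParameter": 1}, 300) yields one custom setting with the value "1" and an idle time of 300 seconds.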
clientToken (string) –
A unique, case-sensitive identifier that you provide to ensure the idempotency of the request. Idempotency ensures that an API request completes only once. With an idempotent request, if the original request completes successfully, subsequent retries with the same client token return the result from the original successful request and have no additional effect. If you don’t specify a client token, the CLI and SDK automatically generate one for you.
This field is autopopulated if not provided.
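When retries are handled by the caller, the token has to be generated once and reused across attempts, otherwise each attempt counts as a new request. The following sketch illustrates that pattern under stated assumptions: the function name and the retry_on default are illustrative, and in practice retry_on would be set to whichever exception types the caller treats as retryable.

```python
import uuid


def create_cluster_with_retries(client, request, attempts=3,
                                retry_on=(ConnectionError,)):
    """Call create_cluster up to `attempts` times with one shared token."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    params = dict(request)
    # Generate the token once, unless the caller already supplied one.
    params.setdefault("clientToken", str(uuid.uuid4()))
    last_error = None
    for _ in range(attempts):
        try:
            return client.create_cluster(**params)
        except retry_on as err:
            last_error = err
    raise last_error
```

Every attempt sends the same clientToken, so a retry after a lost response returns the original result instead of attempting a second cluster.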
tags (dict) –
One or more tags added to the resource. Each tag consists of a tag key and tag value. The tag value is optional and can be an empty string.
(string) –
(string) –
- Return type:
dict
- Returns:
Response Syntax
{
    'cluster': {
        'name': 'string',
        'id': 'string',
        'arn': 'string',
        'status': 'CREATING'|'ACTIVE'|'UPDATING'|'DELETING'|'CREATE_FAILED'|'DELETE_FAILED'|'UPDATE_FAILED',
        'createdAt': datetime(2015, 1, 1),
        'modifiedAt': datetime(2015, 1, 1),
        'scheduler': {
            'type': 'SLURM',
            'version': 'string'
        },
        'size': 'SMALL'|'MEDIUM'|'LARGE',
        'slurmConfiguration': {
            'scaleDownIdleTimeInSeconds': 123,
            'slurmCustomSettings': [
                {
                    'parameterName': 'string',
                    'parameterValue': 'string'
                },
            ],
            'authKey': {
                'secretArn': 'string',
                'secretVersion': 'string'
            }
        },
        'networking': {
            'subnetIds': [
                'string',
            ],
            'securityGroupIds': [
                'string',
            ]
        },
        'endpoints': [
            {
                'type': 'SLURMCTLD'|'SLURMDBD',
                'privateIpAddress': 'string',
                'publicIpAddress': 'string',
                'port': 'string'
            },
        ],
        'errorInfo': [
            {
                'code': 'string',
                'message': 'string'
            },
        ]
    }
}
Response Structure
(dict) –
cluster (dict) –
The cluster resource.
name (string) –
The name that identifies the cluster.
id (string) –
The generated unique ID of the cluster.
arn (string) –
The unique Amazon Resource Name (ARN) of the cluster.
status (string) –
The provisioning status of the cluster.
Note
The provisioning status doesn’t indicate the overall health of the cluster.
createdAt (datetime) –
The date and time the resource was created.
modifiedAt (datetime) –
The date and time the resource was modified.
scheduler (dict) –
The cluster management and job scheduling software associated with the cluster.
type (string) –
The software Amazon Web Services PCS uses to manage cluster scaling and job scheduling.
version (string) –
The version of the specified scheduling software that Amazon Web Services PCS uses to manage cluster scaling and job scheduling.
size (string) –
The size of the cluster.
SMALL: 32 compute nodes and 256 jobs
MEDIUM: 512 compute nodes and 8,192 jobs
LARGE: 2,048 compute nodes and 16,384 jobs
slurmConfiguration (dict) –
Additional options related to the Slurm scheduler.
scaleDownIdleTimeInSeconds (integer) –
The time (in seconds) before an idle node is scaled down.
Default: 600
slurmCustomSettings (list) –
Additional Slurm-specific configuration that directly maps to Slurm settings.
(dict) –
Additional settings that directly map to Slurm settings.
parameterName (string) –
Amazon Web Services PCS supports configuration of the following Slurm parameters:
For clusters
For compute node groups
parameterValue (string) –
The values for the configured Slurm settings.
authKey (dict) –
The shared Slurm key for authentication, also known as the cluster secret.
secretArn (string) –
The Amazon Resource Name (ARN) of the shared Slurm key.
secretVersion (string) –
The version of the shared Slurm key.
networking (dict) –
The networking configuration for the cluster’s control plane.
subnetIds (list) –
The ID of the subnet where Amazon Web Services PCS creates an Elastic Network Interface (ENI) to enable communication between managed controllers and Amazon Web Services PCS resources. The subnet must have an available IP address and cannot reside in AWS Outposts, AWS Wavelength, or an AWS Local Zone.
Example: subnet-abcd1234
(string) –
securityGroupIds (list) –
The list of security group IDs associated with the Elastic Network Interface (ENI) created in subnets.
The following rules are required:
Inbound rule 1
Protocol: All
Ports: All
Source: Self
Outbound rule 1
Protocol: All
Ports: All
Destination: 0.0.0.0/0 (IPv4)
Outbound rule 2
Protocol: All
Ports: All
Destination: Self
(string) –
endpoints (list) –
The list of endpoints available for interaction with the scheduler.
(dict) –
An endpoint available for interaction with the scheduler.
type (string) –
Indicates the type of endpoint running at the specific IP address.
privateIpAddress (string) –
The endpoint’s private IP address.
Example: 2.2.2.2
publicIpAddress (string) –
The endpoint’s public IP address.
Example: 1.1.1.1
port (string) –
The endpoint’s connection port number.
Example: 1234
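Note that port is returned as a string, so it usually needs converting before use. The sketch below looks up one endpoint type in the 'cluster' dict and returns its private address and numeric port. The function name is hypothetical, and it assumes the endpoints list may be absent or empty while the cluster is still being created.

```python
def endpoint_address(cluster, endpoint_type="SLURMCTLD"):
    """Return (private IP, port as int) for the given endpoint type, or None."""
    for endpoint in cluster.get("endpoints", []):
        if endpoint.get("type") == endpoint_type:
            return endpoint.get("privateIpAddress"), int(endpoint["port"])
    return None
```

It would typically be called on the 'cluster' value of a response once the status is ACTIVE.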
errorInfo (list) –
The list of errors that occurred during cluster provisioning.
(dict) –
An error that occurred during resource creation.
code (string) –
The short-form error code.
message (string) –
The detailed error information.
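When provisioning fails, the errorInfo entries are the main diagnostic available in the response. A minimal sketch of turning them into a readable message follows. The helper name is hypothetical, and it only treats CREATE_FAILED as a failure, which is an assumption suited to the create path.

```python
def describe_create_failure(cluster):
    """Return a summary of errorInfo if creation failed, otherwise None."""
    if cluster.get("status") != "CREATE_FAILED":
        return None
    lines = [
        f"{error.get('code', 'unknown')}: {error.get('message', '')}"
        for error in cluster.get("errorInfo", [])
    ]
    return "\n".join(lines) if lines else "no error details reported"
```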
Exceptions
ParallelComputingService.Client.exceptions.ServiceQuotaExceededException
ParallelComputingService.Client.exceptions.ThrottlingException
ParallelComputingService.Client.exceptions.ValidationException
ParallelComputingService.Client.exceptions.ConflictException
ParallelComputingService.Client.exceptions.InternalServerException
ParallelComputingService.Client.exceptions.AccessDeniedException
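These exception classes are exposed on the client object, so they can be caught without extra imports. The sketch below handles the two cases a caller is most likely to recover from and lets the others propagate. The function name and the returned labels are illustrative, not part of the SDK.

```python
def create_cluster_or_report(client, **request):
    """Return ('created', cluster) or a (label, message) pair on known errors."""
    try:
        return "created", client.create_cluster(**request)["cluster"]
    except client.exceptions.ServiceQuotaExceededException as err:
        # For example, another cluster in this Region is still Creating.
        return "quota_exceeded", str(err)
    except client.exceptions.ThrottlingException as err:
        return "throttled", str(err)
```

A caller might wait and try again on 'quota_exceeded' or 'throttled', while validation, conflict, and access errors surface as exceptions.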