EMRServerless

class EMRServerless.Client

A low-level client representing EMR Serverless

Amazon EMR Serverless is a new deployment option for Amazon EMR. EMR Serverless provides a serverless runtime environment that simplifies running analytics applications using the latest open source frameworks such as Apache Spark and Apache Hive. With EMR Serverless, you don’t have to configure, optimize, secure, or operate clusters to run applications with these frameworks.

The API reference to Amazon EMR Serverless is emr-serverless. The emr-serverless prefix is used in the following scenarios:

  • It is the prefix in the CLI commands for Amazon EMR Serverless. For example, aws emr-serverless start-job-run.
  • It is the prefix before IAM policy actions for Amazon EMR Serverless. For example, "Action": ["emr-serverless:StartJobRun"]. For more information, see Policy actions for Amazon EMR Serverless.
  • It is the prefix used in Amazon EMR Serverless service endpoints. For example, emr-serverless.us-east-2.amazonaws.com.

import boto3

client = boto3.client('emr-serverless')

These are the available methods:

can_paginate(operation_name)

Check if an operation can be paginated.

Parameters
operation_name (string) -- The operation name. This is the same as the method name on the client. For example, if the method name is create_foo and you'd normally invoke the operation as client.create_foo(**kwargs), then, provided the create_foo operation can be paginated, you can call client.get_paginator("create_foo").
Returns
True if the operation can be paginated, False otherwise.
cancel_job_run(**kwargs)

Cancels a job run.

See also: AWS API Documentation

Request Syntax

response = client.cancel_job_run(
    applicationId='string',
    jobRunId='string'
)
Parameters
  • applicationId (string) --

    [REQUIRED]

    The ID of the application on which the job run will be canceled.

  • jobRunId (string) --

    [REQUIRED]

    The ID of the job run to cancel.

Return type

dict

Returns

Response Syntax

{
    'applicationId': 'string',
    'jobRunId': 'string'
}

Response Structure

  • (dict) --

    • applicationId (string) --

      The output contains the application ID on which the job run is cancelled.

    • jobRunId (string) --

      The output contains the ID of the cancelled job run.

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.ResourceNotFoundException
  • EMRServerless.Client.exceptions.InternalServerException
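
The request and the documented ResourceNotFoundException can be combined in a small wrapper. The following is a minimal sketch rather than part of the API: the helper name cancel_job_run_if_found is hypothetical, and client is assumed to be an object created with boto3.client('emr-serverless').

```python
def cancel_job_run_if_found(client, application_id, job_run_id):
    """Cancel a job run and return the response, or None if it does not exist.

    Hypothetical helper; `client` is assumed to be boto3.client('emr-serverless').
    """
    try:
        return client.cancel_job_run(
            applicationId=application_id,
            jobRunId=job_run_id,
        )
    except client.exceptions.ResourceNotFoundException:
        # The service did not recognize the application ID or job run ID.
        return None
```

Returning None for a missing job run is a design choice for this sketch; callers that need to distinguish the cases may prefer to let the exception propagate.
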
close()

Closes underlying endpoint connections.

create_application(**kwargs)

Creates an application.

See also: AWS API Documentation

Request Syntax

response = client.create_application(
    name='string',
    releaseLabel='string',
    type='string',
    clientToken='string',
    initialCapacity={
        'string': {
            'workerCount': 123,
            'workerConfiguration': {
                'cpu': 'string',
                'memory': 'string',
                'disk': 'string'
            }
        }
    },
    maximumCapacity={
        'cpu': 'string',
        'memory': 'string',
        'disk': 'string'
    },
    tags={
        'string': 'string'
    },
    autoStartConfiguration={
        'enabled': True|False
    },
    autoStopConfiguration={
        'enabled': True|False,
        'idleTimeoutMinutes': 123
    },
    networkConfiguration={
        'subnetIds': [
            'string',
        ],
        'securityGroupIds': [
            'string',
        ]
    }
)
Parameters
  • name (string) -- The name of the application.
  • releaseLabel (string) --

    [REQUIRED]

    The EMR release version associated with the application.

  • type (string) --

    [REQUIRED]

    The type of application you want to start, such as Spark or Hive.

  • clientToken (string) --

    [REQUIRED]

    The client idempotency token of the application to create. Its value must be unique for each request.

    This field is autopopulated if not provided.

  • initialCapacity (dict) --

    The capacity to initialize when the application is created.

    • (string) --
      • (dict) --

        The initial capacity configuration per worker.

        • workerCount (integer) -- [REQUIRED]

          The number of workers in the initial capacity configuration.

        • workerConfiguration (dict) --

          The resource configuration of the initial capacity configuration.

          • cpu (string) -- [REQUIRED]

            The CPU requirements for every worker instance of the worker type.

          • memory (string) -- [REQUIRED]

            The memory requirements for every worker instance of the worker type.

          • disk (string) --

            The disk requirements for every worker instance of the worker type.

  • maximumCapacity (dict) --

    The maximum capacity to allocate when the application is created. This is cumulative across all workers at any given point in time, not just when an application is created. No new resources will be created once any one of the defined limits is hit.

    • cpu (string) -- [REQUIRED]

      The maximum allowed CPU for an application.

    • memory (string) -- [REQUIRED]

      The maximum allowed memory for an application.

    • disk (string) --

      The maximum allowed disk for an application.

  • tags (dict) --

    The tags assigned to the application.

    • (string) --
      • (string) --
  • autoStartConfiguration (dict) --

    The configuration for an application to automatically start on job submission.

    • enabled (boolean) --

      Enables the application to automatically start on job submission. Defaults to true.

  • autoStopConfiguration (dict) --

    The configuration for an application to automatically stop after a certain amount of time being idle.

    • enabled (boolean) --

      Enables the application to automatically stop after a certain amount of time being idle. Defaults to true.

    • idleTimeoutMinutes (integer) --

      The amount of idle time in minutes after which your application will automatically stop. Defaults to 15 minutes.

  • networkConfiguration (dict) --

    The network configuration for customer VPC connectivity.

    • subnetIds (list) --

      The array of subnet IDs for customer VPC connectivity.

      • (string) --
    • securityGroupIds (list) --

      The array of security group IDs for customer VPC connectivity.

      • (string) --
Return type

dict

Returns

Response Syntax

{
    'applicationId': 'string',
    'name': 'string',
    'arn': 'string'
}

Response Structure

  • (dict) --

    • applicationId (string) --

      The output contains the application ID.

    • name (string) --

      The output contains the name of the application.

    • arn (string) --

      The output contains the ARN of the application.

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.InternalServerException
  • EMRServerless.Client.exceptions.ConflictException
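
One way to assemble the request is to build the keyword arguments separately and pass them to create_application. The sketch below assumes nothing beyond the parameters documented above; the helper name build_create_application_request is hypothetical, and the release label and type shown in the usage comment are illustrative values, not taken from this reference.

```python
import uuid


def build_create_application_request(name, release_label, app_type,
                                     idle_timeout_minutes=15):
    """Build keyword arguments for client.create_application (hypothetical helper)."""
    return {
        "name": name,
        "releaseLabel": release_label,
        "type": app_type,
        # Idempotency token; its value must be unique for each request.
        "clientToken": str(uuid.uuid4()),
        "autoStartConfiguration": {"enabled": True},
        "autoStopConfiguration": {
            "enabled": True,
            "idleTimeoutMinutes": idle_timeout_minutes,
        },
    }


# Usage (illustrative values; `client` is assumed to be
# boto3.client('emr-serverless')):
# request = build_create_application_request("my-app", "emr-6.6.0", "SPARK")
# response = client.create_application(**request)
# application_id = response["applicationId"]
```
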
delete_application(**kwargs)

Deletes an application. An application must be in the STOPPED or CREATED state to be deleted.

See also: AWS API Documentation

Request Syntax

response = client.delete_application(
    applicationId='string'
)
Parameters
applicationId (string) --

[REQUIRED]

The ID of the application that will be deleted.

Return type
dict
Returns
Response Syntax
{}

Response Structure

  • (dict) --

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.ResourceNotFoundException
  • EMRServerless.Client.exceptions.InternalServerException
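
Because deletion depends on the application state, a caller may want to check the state first. The following is a minimal sketch under that assumption: the helper name delete_if_deletable is hypothetical, the state names come from the get_application response syntax, and client is assumed to be boto3.client('emr-serverless').

```python
# States in which an application can be deleted, per the description above.
DELETABLE_STATES = {"CREATED", "STOPPED"}


def delete_if_deletable(client, application_id):
    """Delete the application if its state allows it; return True if deleted."""
    application = client.get_application(applicationId=application_id)["application"]
    if application["state"] not in DELETABLE_STATES:
        # For example STARTED: the application would need to be stopped first.
        return False
    client.delete_application(applicationId=application_id)
    return True
```

The state can change between the two calls, so this check reduces failed requests but does not rule them out.
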
get_application(**kwargs)

Displays detailed information about a specified application.

See also: AWS API Documentation

Request Syntax

response = client.get_application(
    applicationId='string'
)
Parameters
applicationId (string) --

[REQUIRED]

The ID of the application that will be described.

Return type
dict
Returns
Response Syntax
{
    'application': {
        'applicationId': 'string',
        'name': 'string',
        'arn': 'string',
        'releaseLabel': 'string',
        'type': 'string',
        'state': 'CREATING'|'CREATED'|'STARTING'|'STARTED'|'STOPPING'|'STOPPED'|'TERMINATED',
        'stateDetails': 'string',
        'initialCapacity': {
            'string': {
                'workerCount': 123,
                'workerConfiguration': {
                    'cpu': 'string',
                    'memory': 'string',
                    'disk': 'string'
                }
            }
        },
        'maximumCapacity': {
            'cpu': 'string',
            'memory': 'string',
            'disk': 'string'
        },
        'createdAt': datetime(2015, 1, 1),
        'updatedAt': datetime(2015, 1, 1),
        'tags': {
            'string': 'string'
        },
        'autoStartConfiguration': {
            'enabled': True|False
        },
        'autoStopConfiguration': {
            'enabled': True|False,
            'idleTimeoutMinutes': 123
        },
        'networkConfiguration': {
            'subnetIds': [
                'string',
            ],
            'securityGroupIds': [
                'string',
            ]
        }
    }
}

Response Structure

  • (dict) --
    • application (dict) --

      The output displays information about the specified application.

      • applicationId (string) --

        The ID of the application.

      • name (string) --

        The name of the application.

      • arn (string) --

        The ARN of the application.

      • releaseLabel (string) --

        The EMR release version associated with the application.

      • type (string) --

        The type of application, such as Spark or Hive.

      • state (string) --

        The state of the application.

      • stateDetails (string) --

        The state details of the application.

      • initialCapacity (dict) --

        The initial capacity of the application.

        • (string) --
          • (dict) --

            The initial capacity configuration per worker.

            • workerCount (integer) --

              The number of workers in the initial capacity configuration.

            • workerConfiguration (dict) --

              The resource configuration of the initial capacity configuration.

              • cpu (string) --

                The CPU requirements for every worker instance of the worker type.

              • memory (string) --

                The memory requirements for every worker instance of the worker type.

              • disk (string) --

                The disk requirements for every worker instance of the worker type.

      • maximumCapacity (dict) --

        The maximum capacity of the application. This is cumulative across all workers at any given point in time during the lifespan of the application. No new resources will be created once any one of the defined limits is hit.

        • cpu (string) --

          The maximum allowed CPU for an application.

        • memory (string) --

          The maximum allowed memory for an application.

        • disk (string) --

          The maximum allowed disk for an application.

      • createdAt (datetime) --

        The date and time when the application was created.

      • updatedAt (datetime) --

        The date and time when the application was last updated.

      • tags (dict) --

        The tags assigned to the application.

        • (string) --
          • (string) --
      • autoStartConfiguration (dict) --

        The configuration for an application to automatically start on job submission.

        • enabled (boolean) --

          Enables the application to automatically start on job submission. Defaults to true.

      • autoStopConfiguration (dict) --

        The configuration for an application to automatically stop after a certain amount of time being idle.

        • enabled (boolean) --

          Enables the application to automatically stop after a certain amount of time being idle. Defaults to true.

        • idleTimeoutMinutes (integer) --

          The amount of idle time in minutes after which your application will automatically stop. Defaults to 15 minutes.

      • networkConfiguration (dict) --

        The network configuration for customer VPC connectivity for the application.

        • subnetIds (list) --

          The array of subnet IDs for customer VPC connectivity.

          • (string) --
        • securityGroupIds (list) --

          The array of security group IDs for customer VPC connectivity.

          • (string) --

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.ResourceNotFoundException
  • EMRServerless.Client.exceptions.InternalServerException
get_job_run(**kwargs)

Displays detailed information about a job run.

See also: AWS API Documentation

Request Syntax

response = client.get_job_run(
    applicationId='string',
    jobRunId='string'
)
Parameters
  • applicationId (string) --

    [REQUIRED]

    The ID of the application on which the job run is submitted.

  • jobRunId (string) --

    [REQUIRED]

    The ID of the job run.

Return type

dict

Returns

Response Syntax

{
    'jobRun': {
        'applicationId': 'string',
        'jobRunId': 'string',
        'name': 'string',
        'arn': 'string',
        'createdBy': 'string',
        'createdAt': datetime(2015, 1, 1),
        'updatedAt': datetime(2015, 1, 1),
        'executionRole': 'string',
        'state': 'SUBMITTED'|'PENDING'|'SCHEDULED'|'RUNNING'|'SUCCESS'|'FAILED'|'CANCELLING'|'CANCELLED',
        'stateDetails': 'string',
        'releaseLabel': 'string',
        'configurationOverrides': {
            'applicationConfiguration': [
                {
                    'classification': 'string',
                    'properties': {
                        'string': 'string'
                    },
                    'configurations': {'... recursive ...'}
                },
            ],
            'monitoringConfiguration': {
                's3MonitoringConfiguration': {
                    'logUri': 'string',
                    'encryptionKeyArn': 'string'
                },
                'managedPersistenceMonitoringConfiguration': {
                    'enabled': True|False,
                    'encryptionKeyArn': 'string'
                }
            }
        },
        'jobDriver': {
            'sparkSubmit': {
                'entryPoint': 'string',
                'entryPointArguments': [
                    'string',
                ],
                'sparkSubmitParameters': 'string'
            },
            'hive': {
                'query': 'string',
                'initQueryFile': 'string',
                'parameters': 'string'
            }
        },
        'tags': {
            'string': 'string'
        },
        'totalResourceUtilization': {
            'vCPUHour': 123.0,
            'memoryGBHour': 123.0,
            'storageGBHour': 123.0
        },
        'networkConfiguration': {
            'subnetIds': [
                'string',
            ],
            'securityGroupIds': [
                'string',
            ]
        },
        'totalExecutionDurationSeconds': 123
    }
}

Response Structure

  • (dict) --

    • jobRun (dict) --

      The output displays information about the job run.

      • applicationId (string) --

        The ID of the application the job is running on.

      • jobRunId (string) --

        The ID of the job run.

      • name (string) --

        The optional job run name. This doesn't have to be unique.

      • arn (string) --

        The ARN of the job run.

      • createdBy (string) --

        The user who created the job run.

      • createdAt (datetime) --

        The date and time when the job run was created.

      • updatedAt (datetime) --

        The date and time when the job run was updated.

      • executionRole (string) --

        The execution role ARN of the job run.

      • state (string) --

        The state of the job run.

      • stateDetails (string) --

        The state details of the job run.

      • releaseLabel (string) --

        The EMR release version associated with the application your job is running on.

      • configurationOverrides (dict) --

        The configuration settings that are used to override default configuration.

        • applicationConfiguration (list) --

          The override configurations for the application.

          • (dict) --

            A configuration specification to be used when provisioning an application. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

            • classification (string) --

              The classification within a configuration.

            • properties (dict) --

              A set of properties specified within a configuration classification.

              • (string) --
                • (string) --
            • configurations (list) --

              A list of additional configurations to apply within a configuration object.

        • monitoringConfiguration (dict) --

          The override configurations for monitoring.

          • s3MonitoringConfiguration (dict) --

            The Amazon S3 configuration for monitoring log publishing.

            • logUri (string) --

              The Amazon S3 destination URI for log publishing.

            • encryptionKeyArn (string) --

              The KMS key ARN to encrypt the logs published to the given Amazon S3 destination.

          • managedPersistenceMonitoringConfiguration (dict) --

            The managed log persistence configuration for a job run.

            • enabled (boolean) --

              Enables managed logging and defaults to true. If set to false, managed logging will be turned off.

            • encryptionKeyArn (string) --

              The KMS key ARN to encrypt the logs stored in managed log persistence.

      • jobDriver (dict) --

        The job driver for the job run.

        Note

        This is a Tagged Union structure. Only one of the following top level keys will be set: sparkSubmit, hive. If a client receives an unknown member it will set SDK_UNKNOWN_MEMBER as the top level key, which maps to the name or tag of the unknown member. The structure of SDK_UNKNOWN_MEMBER is as follows:

        'SDK_UNKNOWN_MEMBER': {'name': 'UnknownMemberName'}
        
        • sparkSubmit (dict) --

          The job driver parameters specified for Spark.

          • entryPoint (string) --

            The entry point for the Spark submit job run.

          • entryPointArguments (list) --

            The arguments for the Spark submit job run.

            • (string) --
          • sparkSubmitParameters (string) --

            The parameters for the Spark submit job run.

        • hive (dict) --

          The job driver parameters specified for Hive.

          • query (string) --

            The query for the Hive job run.

          • initQueryFile (string) --

            The query file for the Hive job run.

          • parameters (string) --

            The parameters for the Hive job run.

      • tags (dict) --

        The tags assigned to the job run.

        • (string) --
          • (string) --
      • totalResourceUtilization (dict) --

        The aggregate vCPU, memory, and storage resources used from the time the job starts executing until the time the job is terminated, rounded up to the nearest second.

        • vCPUHour (float) --

          The aggregated vCPU used per hour from the time the job starts executing until the time the job is terminated.

        • memoryGBHour (float) --

          The aggregated memory used per hour from the time the job starts executing until the time the job is terminated.

        • storageGBHour (float) --

          The aggregated storage used per hour from the time the job starts executing until the time the job is terminated.

      • networkConfiguration (dict) --

        The network configuration for customer VPC connectivity.

        • subnetIds (list) --

          The array of subnet IDs for customer VPC connectivity.

          • (string) --
        • securityGroupIds (list) --

          The array of security group IDs for customer VPC connectivity.

          • (string) --
      • totalExecutionDurationSeconds (integer) --

        The job run total execution duration in seconds. This field is only available for job runs in a COMPLETED, FAILED, or CANCELLED state.

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.ResourceNotFoundException
  • EMRServerless.Client.exceptions.InternalServerException
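
A common use of get_job_run is to poll a job run until it finishes. The sketch below assumes that SUCCESS, FAILED, and CANCELLED are the terminal values of state (an inference from the enum in the response syntax, not something this reference states). The helper name wait_for_job_run and its polling defaults are hypothetical, and client is assumed to be boto3.client('emr-serverless').

```python
import time

# Assumed terminal states, inferred from the 'state' enum in the response syntax.
TERMINAL_STATES = {"SUCCESS", "FAILED", "CANCELLED"}


def wait_for_job_run(client, application_id, job_run_id,
                     poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll get_job_run until the job run reaches a terminal state.

    Returns the final 'jobRun' dict, or raises TimeoutError after max_polls.
    """
    for _ in range(max_polls):
        job_run = client.get_job_run(
            applicationId=application_id,
            jobRunId=job_run_id,
        )["jobRun"]
        if job_run["state"] in TERMINAL_STATES:
            return job_run
        sleep(poll_seconds)
    raise TimeoutError(f"Job run {job_run_id} did not finish after {max_polls} polls")
```

The sleep parameter is injectable so the loop can be exercised without real delays.
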
get_paginator(operation_name)

Create a paginator for an operation.

Parameters
operation_name (string) -- The operation name. This is the same as the method name on the client. For example, if the method name is create_foo and you'd normally invoke the operation as client.create_foo(**kwargs), then, provided the create_foo operation can be paginated, you can call client.get_paginator("create_foo").
Raises OperationNotPageableError
Raised if the operation is not pageable. You can use the client.can_paginate method to check if an operation is pageable.
Return type
botocore.paginate.Paginator
Returns
A paginator object.
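
can_paginate and get_paginator are typically used together. The following is a minimal sketch that checks at runtime whether list_applications is pageable instead of assuming it; the helper name iter_applications is hypothetical, and client is assumed to be boto3.client('emr-serverless').

```python
def iter_applications(client, states=None):
    """Yield application summaries, using a paginator when one is available."""
    kwargs = {"states": list(states)} if states else {}
    if client.can_paginate("list_applications"):
        paginator = client.get_paginator("list_applications")
        # Each page has the same shape as a list_applications response.
        for page in paginator.paginate(**kwargs):
            yield from page.get("applications", [])
    else:
        # Fallback: a single call, which returns only the first page of results.
        response = client.list_applications(**kwargs)
        yield from response.get("applications", [])
```
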
get_waiter(waiter_name)

Returns an object that can wait for some condition.

Parameters
waiter_name (str) -- The name of the waiter to get. See the waiters section of the service docs for a list of available waiters.
Returns
The specified waiter object.
Return type
botocore.waiter.Waiter
list_applications(**kwargs)

Lists applications based on a set of parameters.

See also: AWS API Documentation

Request Syntax

response = client.list_applications(
    nextToken='string',
    maxResults=123,
    states=[
        'CREATING'|'CREATED'|'STARTING'|'STARTED'|'STOPPING'|'STOPPED'|'TERMINATED',
    ]
)
Parameters
  • nextToken (string) -- The token for the next set of application results.
  • maxResults (integer) -- The maximum number of applications that can be listed.
  • states (list) --

    An optional filter for application states. Note that if this filter contains multiple states, the resulting list will be grouped by the state.

    • (string) --
Return type

dict

Returns

Response Syntax

{
    'applications': [
        {
            'id': 'string',
            'name': 'string',
            'arn': 'string',
            'releaseLabel': 'string',
            'type': 'string',
            'state': 'CREATING'|'CREATED'|'STARTING'|'STARTED'|'STOPPING'|'STOPPED'|'TERMINATED',
            'stateDetails': 'string',
            'createdAt': datetime(2015, 1, 1),
            'updatedAt': datetime(2015, 1, 1)
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • applications (list) --

      The output lists the specified applications.

      • (dict) --

        The summary of attributes associated with an application.

        • id (string) --

          The ID of the application.

        • name (string) --

          The name of the application.

        • arn (string) --

          The ARN of the application.

        • releaseLabel (string) --

          The EMR release version associated with the application.

        • type (string) --

          The type of application, such as Spark or Hive.

        • state (string) --

          The state of the application.

        • stateDetails (string) --

          The state details of the application.

        • createdAt (datetime) --

          The date and time when the application was created.

        • updatedAt (datetime) --

          The date and time when the application was last updated.

    • nextToken (string) --

      The output displays the token for the next set of application results. This is required for pagination and is available as a response of the previous request.

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.InternalServerException
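
The nextToken field can be followed manually to collect every page. The following is a minimal sketch of that loop; the helper name list_all_applications and its page_size argument are hypothetical, and client is assumed to be boto3.client('emr-serverless').

```python
def list_all_applications(client, states=None, page_size=None):
    """Collect application summaries from every page of list_applications."""
    kwargs = {}
    if states:
        kwargs["states"] = list(states)
    if page_size:
        kwargs["maxResults"] = page_size

    applications = []
    while True:
        response = client.list_applications(**kwargs)
        applications.extend(response.get("applications", []))
        token = response.get("nextToken")
        if not token:
            # No token in the response means this was the last page.
            return applications
        kwargs["nextToken"] = token
```
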
list_job_runs(**kwargs)

Lists job runs based on a set of parameters.

See also: AWS API Documentation

Request Syntax

response = client.list_job_runs(
    applicationId='string',
    nextToken='string',
    maxResults=123,
    createdAtAfter=datetime(2015, 1, 1),
    createdAtBefore=datetime(2015, 1, 1),
    states=[
        'SUBMITTED'|'PENDING'|'SCHEDULED'|'RUNNING'|'SUCCESS'|'FAILED'|'CANCELLING'|'CANCELLED',
    ]
)
Parameters
  • applicationId (string) --

    [REQUIRED]

    The ID of the application for which to list the job runs.

  • nextToken (string) -- The token for the next set of job run results.
  • maxResults (integer) -- The maximum number of job runs that can be listed.
  • createdAtAfter (datetime) -- The lower bound of the option to filter by creation date and time.
  • createdAtBefore (datetime) -- The upper bound of the option to filter by creation date and time.
  • states (list) --

    An optional filter for job run states. Note that if this filter contains multiple states, the resulting list will be grouped by the state.

    • (string) --
Return type

dict

Returns

Response Syntax

{
    'jobRuns': [
        {
            'applicationId': 'string',
            'id': 'string',
            'name': 'string',
            'arn': 'string',
            'createdBy': 'string',
            'createdAt': datetime(2015, 1, 1),
            'updatedAt': datetime(2015, 1, 1),
            'executionRole': 'string',
            'state': 'SUBMITTED'|'PENDING'|'SCHEDULED'|'RUNNING'|'SUCCESS'|'FAILED'|'CANCELLING'|'CANCELLED',
            'stateDetails': 'string',
            'releaseLabel': 'string',
            'type': 'string'
        },
    ],
    'nextToken': 'string'
}

Response Structure

  • (dict) --

    • jobRuns (list) --

      The output lists information about the specified job runs.

      • (dict) --

        The summary of attributes associated with a job run.

        • applicationId (string) --

          The ID of the application the job is running on.

        • id (string) --

          The ID of the job run.

        • name (string) --

          The optional job run name. This doesn't have to be unique.

        • arn (string) --

          The ARN of the job run.

        • createdBy (string) --

          The user who created the job run.

        • createdAt (datetime) --

          The date and time when the job run was created.

        • updatedAt (datetime) --

          The date and time when the job run was last updated.

        • executionRole (string) --

          The execution role ARN of the job run.

        • state (string) --

          The state of the job run.

        • stateDetails (string) --

          The state details of the job run.

        • releaseLabel (string) --

          The EMR release version associated with the application your job is running on.

        • type (string) --

          The type of job run, such as Spark or Hive.

    • nextToken (string) --

      The output displays the token for the next set of job run results. This is required for pagination and is available as a response of the previous request.

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.InternalServerException
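
The createdAtAfter filter and the state field can be combined to summarize recent activity. The sketch below counts job runs by state over a recent time window; the helper name count_recent_job_runs_by_state, the 24-hour default, and the use of a timezone-aware UTC datetime are assumptions for illustration, and client is assumed to be boto3.client('emr-serverless').

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def count_recent_job_runs_by_state(client, application_id, hours=24, now=None):
    """Count job runs created in the last `hours` hours, grouped by state."""
    now = now or datetime.now(timezone.utc)
    kwargs = {
        "applicationId": application_id,
        # Lower bound of the creation-time filter.
        "createdAtAfter": now - timedelta(hours=hours),
    }
    counts = Counter()
    while True:
        response = client.list_job_runs(**kwargs)
        counts.update(run["state"] for run in response.get("jobRuns", []))
        token = response.get("nextToken")
        if not token:
            return counts
        kwargs["nextToken"] = token
```
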
list_tags_for_resource(**kwargs)

Lists the tags assigned to a resource.

See also: AWS API Documentation

Request Syntax

response = client.list_tags_for_resource(
    resourceArn='string'
)
Parameters
resourceArn (string) --

[REQUIRED]

The Amazon Resource Name (ARN) that identifies the resource to list the tags for. Currently, the supported resources are Amazon EMR Serverless applications and job runs.

Return type
dict
Returns
Response Syntax
{
    'tags': {
        'string': 'string'
    }
}

Response Structure

  • (dict) --
    • tags (dict) --

      The tags for the resource.

      • (string) --
        • (string) --

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.ResourceNotFoundException
  • EMRServerless.Client.exceptions.InternalServerException
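
Since this operation takes an ARN and not an application ID, one approach is to look up the ARN with get_application first. The following is a minimal sketch of that two-step lookup; the helper name get_application_tags is hypothetical, and client is assumed to be boto3.client('emr-serverless').

```python
def get_application_tags(client, application_id):
    """Return the tags of an application as a dict, looked up by application ID."""
    application = client.get_application(applicationId=application_id)["application"]
    response = client.list_tags_for_resource(resourceArn=application["arn"])
    # Treat a missing 'tags' key as "no tags" for the purposes of this sketch.
    return response.get("tags", {})
```
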
start_application(**kwargs)

Starts a specified application and initializes initial capacity if configured.

See also: AWS API Documentation

Request Syntax

response = client.start_application(
    applicationId='string'
)
Parameters
applicationId (string) --

[REQUIRED]

The ID of the application to start.

Return type
dict
Returns
Response Syntax
{}

Response Structure

  • (dict) --

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.ResourceNotFoundException
  • EMRServerless.Client.exceptions.InternalServerException
  • EMRServerless.Client.exceptions.ServiceQuotaExceededException
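
The response to start_application is empty, so a caller that needs the application to be running has to observe its state separately. The sketch below polls get_application until the state is STARTED; the helper name start_and_wait and its polling defaults are hypothetical, and client is assumed to be boto3.client('emr-serverless').

```python
import time


def start_and_wait(client, application_id,
                   poll_seconds=10, max_polls=60, sleep=time.sleep):
    """Start an application and poll until its state is STARTED.

    Returns the 'application' dict, or raises TimeoutError after max_polls.
    """
    client.start_application(applicationId=application_id)
    for _ in range(max_polls):
        application = client.get_application(
            applicationId=application_id
        )["application"]
        if application["state"] == "STARTED":
            return application
        sleep(poll_seconds)
    raise TimeoutError(f"Application {application_id} did not reach STARTED")
```
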
start_job_run(**kwargs)

Starts a job run.

See also: AWS API Documentation

Request Syntax

response = client.start_job_run(
    applicationId='string',
    clientToken='string',
    executionRoleArn='string',
    jobDriver={
        'sparkSubmit': {
            'entryPoint': 'string',
            'entryPointArguments': [
                'string',
            ],
            'sparkSubmitParameters': 'string'
        },
        'hive': {
            'query': 'string',
            'initQueryFile': 'string',
            'parameters': 'string'
        }
    },
    configurationOverrides={
        'applicationConfiguration': [
            {
                'classification': 'string',
                'properties': {
                    'string': 'string'
                },
                'configurations': {'... recursive ...'}
            },
        ],
        'monitoringConfiguration': {
            's3MonitoringConfiguration': {
                'logUri': 'string',
                'encryptionKeyArn': 'string'
            },
            'managedPersistenceMonitoringConfiguration': {
                'enabled': True|False,
                'encryptionKeyArn': 'string'
            }
        }
    },
    tags={
        'string': 'string'
    },
    executionTimeoutMinutes=123,
    name='string'
)
Parameters
  • applicationId (string) --

    [REQUIRED]

    The ID of the application on which to run the job.

  • clientToken (string) --

    [REQUIRED]

    The client idempotency token of the job run to start. Its value must be unique for each request.

    This field is autopopulated if not provided.

  • executionRoleArn (string) --

    [REQUIRED]

    The execution role ARN for the job run.

  • jobDriver (dict) --

    The job driver for the job run.

    Note

    This is a Tagged Union structure. Only one of the following top level keys can be set: sparkSubmit, hive.

    • sparkSubmit (dict) --

      The job driver parameters specified for Spark.

      • entryPoint (string) -- [REQUIRED]

        The entry point for the Spark submit job run.

      • entryPointArguments (list) --

        The arguments for the Spark submit job run.

        • (string) --
      • sparkSubmitParameters (string) --

        The parameters for the Spark submit job run.

    • hive (dict) --

      The job driver parameters specified for Hive.

      • query (string) -- [REQUIRED]

        The query for the Hive job run.

      • initQueryFile (string) --

        The query file for the Hive job run.

      • parameters (string) --

        The parameters for the Hive job run.

  • configurationOverrides (dict) --

    The configuration overrides for the job run.

    • applicationConfiguration (list) --

      The override configurations for the application.

      • (dict) --

        A configuration specification to be used when provisioning an application. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

        • classification (string) -- [REQUIRED]

          The classification within a configuration.

        • properties (dict) --

          A set of properties specified within a configuration classification.

          • (string) --
            • (string) --
        • configurations (list) --

          A list of additional configurations to apply within a configuration object.

    • monitoringConfiguration (dict) --

      The override configurations for monitoring.

      • s3MonitoringConfiguration (dict) --

        The Amazon S3 configuration for monitoring log publishing.

        • logUri (string) --

          The Amazon S3 destination URI for log publishing.

        • encryptionKeyArn (string) --

          The KMS key ARN to encrypt the logs published to the given Amazon S3 destination.

      • managedPersistenceMonitoringConfiguration (dict) --

        The managed log persistence configuration for a job run.

        • enabled (boolean) --

          Enables managed logging and defaults to true. If set to false, managed logging will be turned off.

        • encryptionKeyArn (string) --

          The KMS key ARN to encrypt the logs stored in managed log persistence.

  • tags (dict) --

    The tags assigned to the job run.

    • (string) --
      • (string) --
  • executionTimeoutMinutes (integer) -- The maximum duration, in minutes, for the job run. If the job run exceeds this duration, it is automatically cancelled.
  • name (string) -- The optional job run name. This doesn't have to be unique.
Return type

dict

Returns

Response Syntax

{
    'applicationId': 'string',
    'jobRunId': 'string',
    'arn': 'string'
}

Response Structure

  • (dict) --

    • applicationId (string) --

      This output displays the application ID on which the job run was submitted.

    • jobRunId (string) --

      The output contains the ID of the started job run.

    • arn (string) --

      The ARN of the started job run.

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.ResourceNotFoundException
  • EMRServerless.Client.exceptions.InternalServerException
  • EMRServerless.Client.exceptions.ConflictException
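
As a rough illustration of the request shape documented above, the following sketch assembles the keyword arguments for a Spark job run. The helper name build_spark_job_request and every sample value (application ID, role ARN, S3 paths) are hypothetical; only the parameter names are taken from the request syntax.

```python
import uuid


def build_spark_job_request(application_id, execution_role_arn, entry_point,
                            entry_point_arguments=None,
                            spark_submit_parameters=None,
                            log_uri=None, timeout_minutes=None, name=None):
    """Assemble keyword arguments for client.start_job_run (Spark driver only)."""
    spark_submit = {"entryPoint": entry_point}
    if entry_point_arguments:
        spark_submit["entryPointArguments"] = list(entry_point_arguments)
    if spark_submit_parameters:
        spark_submit["sparkSubmitParameters"] = spark_submit_parameters

    request = {
        "applicationId": application_id,
        # Explicit idempotency token; the field is autopopulated if omitted.
        "clientToken": str(uuid.uuid4()),
        "executionRoleArn": execution_role_arn,
        # jobDriver is a tagged union: set sparkSubmit or hive, never both.
        "jobDriver": {"sparkSubmit": spark_submit},
    }
    if log_uri:
        request["configurationOverrides"] = {
            "monitoringConfiguration": {
                "s3MonitoringConfiguration": {"logUri": log_uri}
            }
        }
    if timeout_minutes is not None:
        request["executionTimeoutMinutes"] = timeout_minutes
    if name:
        request["name"] = name
    return request


# Hypothetical sample values, for illustration only.
request = build_spark_job_request(
    application_id="00example123",
    execution_role_arn="arn:aws:iam::123456789012:role/ExampleJobRole",
    entry_point="s3://example-bucket/scripts/job.py",
    entry_point_arguments=["--date", "2022-06-01"],
    log_uri="s3://example-bucket/logs/",
    timeout_minutes=60,
    name="example-run",
)
# With a real client: response = client.start_job_run(**request)
```

Building the dictionary separately from the call makes it easy to inspect or log the request before submitting it. The response would then carry the applicationId, jobRunId, and arn keys shown in the response syntax.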
stop_application(**kwargs)

Stops a specified application and releases initial capacity if configured. All scheduled and running jobs must be completed or cancelled before stopping an application.

See also: AWS API Documentation

Request Syntax

response = client.stop_application(
    applicationId='string'
)
Parameters
applicationId (string) --

[REQUIRED]

The ID of the application to stop.

Return type
dict
Returns
Response Syntax
{}

Response Structure

  • (dict) --

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.ResourceNotFoundException
  • EMRServerless.Client.exceptions.InternalServerException
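
Because all scheduled and running jobs must be completed or cancelled before an application can be stopped, one possible approach is to cancel unfinished job runs, wait for them to settle, and only then call stop_application. The sketch below assumes that SUBMITTED, PENDING, SCHEDULED, RUNNING, and CANCELLING are the states that block a stop; that grouping, the helper names, and the polling defaults are assumptions rather than documented behavior.

```python
import time

# Assumed grouping of job run states; not taken from the service documentation.
CANCELLABLE_STATES = ["SUBMITTED", "PENDING", "SCHEDULED", "RUNNING"]
UNFINISHED_STATES = CANCELLABLE_STATES + ["CANCELLING"]


def list_job_run_ids(client, application_id, states):
    """Collect IDs of job runs in the given states, following nextToken."""
    ids = []
    kwargs = {"applicationId": application_id, "states": states}
    while True:
        page = client.list_job_runs(**kwargs)
        ids.extend(run["id"] for run in page.get("jobRuns", []))
        if not page.get("nextToken"):
            return ids
        kwargs["nextToken"] = page["nextToken"]


def drain_and_stop(client, application_id, poll_seconds=10, max_polls=30):
    """Cancel unfinished job runs, then stop the application.

    Returns True if stop_application was called, False if job runs were
    still unfinished after max_polls checks.
    """
    for job_run_id in list_job_run_ids(client, application_id, CANCELLABLE_STATES):
        client.cancel_job_run(applicationId=application_id, jobRunId=job_run_id)

    for _ in range(max_polls):
        if not list_job_run_ids(client, application_id, UNFINISHED_STATES):
            client.stop_application(applicationId=application_id)
            return True
        time.sleep(poll_seconds)
    return False
```

A call might look like drain_and_stop(boto3.client('emr-serverless'), application_id), where application_id is the ID of an existing application.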
tag_resource(**kwargs)

Assigns tags to resources. A tag is a label that you assign to an AWS resource. Each tag consists of a key and an optional value, both of which you define. Tags enable you to categorize your AWS resources by attributes such as purpose, owner, or environment. When you have many resources of the same type, you can quickly identify a specific resource based on the tags you've assigned to it.

See also: AWS API Documentation

Request Syntax

response = client.tag_resource(
    resourceArn='string',
    tags={
        'string': 'string'
    }
)
Parameters
  • resourceArn (string) --

    [REQUIRED]

    The Amazon Resource Name (ARN) that identifies the resource to tag. Currently, the supported resources are Amazon EMR Serverless applications and job runs.

  • tags (dict) --

    [REQUIRED]

    The tags to add to the resource, given as a dictionary that maps each tag key to its value.

    • (string) --
      • (string) --
Return type

dict

Returns

Response Syntax

{}

Response Structure

  • (dict) --

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.ResourceNotFoundException
  • EMRServerless.Client.exceptions.InternalServerException
untag_resource(**kwargs)

Removes tags from resources.

See also: AWS API Documentation

Request Syntax

response = client.untag_resource(
    resourceArn='string',
    tagKeys=[
        'string',
    ]
)
Parameters
  • resourceArn (string) --

    [REQUIRED]

    The Amazon Resource Name (ARN) that identifies the resource to remove tags from. Currently, the supported resources are Amazon EMR Serverless applications and job runs.

  • tagKeys (list) --

    [REQUIRED]

    The keys of the tags to be removed.

    • (string) --
Return type

dict

Returns

Response Syntax

{}

Response Structure

  • (dict) --

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.ResourceNotFoundException
  • EMRServerless.Client.exceptions.InternalServerException
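
tag_resource and untag_resource are often used together to bring a resource's tags in line with a desired set. The following is a minimal sketch of that pattern; the helper names plan_tag_changes and apply_tag_changes and the sample tags are hypothetical, and the current tags are assumed to have been fetched separately.

```python
def plan_tag_changes(current_tags, desired_tags):
    """Return (tags_to_set, keys_to_remove) that turn current_tags into desired_tags."""
    # Keys that are new or whose value differs need tag_resource.
    tags_to_set = {
        key: value
        for key, value in desired_tags.items()
        if current_tags.get(key) != value
    }
    # Keys that are no longer wanted need untag_resource.
    keys_to_remove = sorted(key for key in current_tags if key not in desired_tags)
    return tags_to_set, keys_to_remove


def apply_tag_changes(client, resource_arn, current_tags, desired_tags):
    """Issue only the tag_resource / untag_resource calls that are needed."""
    tags_to_set, keys_to_remove = plan_tag_changes(current_tags, desired_tags)
    if tags_to_set:
        client.tag_resource(resourceArn=resource_arn, tags=tags_to_set)
    if keys_to_remove:
        client.untag_resource(resourceArn=resource_arn, tagKeys=keys_to_remove)
    return tags_to_set, keys_to_remove


# Hypothetical sample tags, for illustration only.
current = {"team": "data", "env": "dev", "temp": "yes"}
desired = {"team": "data", "env": "prod", "owner": "alice"}
to_set, to_remove = plan_tag_changes(current, desired)
```

Here to_set holds the env and owner tags and to_remove holds the temp key, so unchanged tags such as team are never resent.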
update_application(**kwargs)

Updates a specified application. An application has to be in a stopped or created state in order to be updated.

See also: AWS API Documentation

Request Syntax

response = client.update_application(
    applicationId='string',
    clientToken='string',
    initialCapacity={
        'string': {
            'workerCount': 123,
            'workerConfiguration': {
                'cpu': 'string',
                'memory': 'string',
                'disk': 'string'
            }
        }
    },
    maximumCapacity={
        'cpu': 'string',
        'memory': 'string',
        'disk': 'string'
    },
    autoStartConfiguration={
        'enabled': True|False
    },
    autoStopConfiguration={
        'enabled': True|False,
        'idleTimeoutMinutes': 123
    },
    networkConfiguration={
        'subnetIds': [
            'string',
        ],
        'securityGroupIds': [
            'string',
        ]
    }
)
Parameters
  • applicationId (string) --

    [REQUIRED]

    The ID of the application to update.

  • clientToken (string) --

    [REQUIRED]

    The client idempotency token of the application to update. Its value must be unique for each request.

    This field is autopopulated if not provided.

  • initialCapacity (dict) --

    The capacity to initialize when the application is updated.

    • (string) --
      • (dict) --

        The initial capacity configuration per worker.

        • workerCount (integer) -- [REQUIRED]

          The number of workers in the initial capacity configuration.

        • workerConfiguration (dict) --

          The resource configuration of the initial capacity configuration.

          • cpu (string) -- [REQUIRED]

            The CPU requirements for every worker instance of the worker type.

          • memory (string) -- [REQUIRED]

            The memory requirements for every worker instance of the worker type.

          • disk (string) --

            The disk requirements for every worker instance of the worker type.

  • maximumCapacity (dict) --

    The maximum capacity to allocate when the application is updated. This is cumulative across all workers at any given point in time during the lifespan of the application. No new resources will be created once any one of the defined limits is hit.

    • cpu (string) -- [REQUIRED]

      The maximum allowed CPU for an application.

    • memory (string) -- [REQUIRED]

      The maximum allowed memory for an application.

    • disk (string) --

      The maximum allowed disk for an application.

  • autoStartConfiguration (dict) --

    The configuration for an application to automatically start on job submission.

    • enabled (boolean) --

      Enables the application to automatically start on job submission. Defaults to true.

  • autoStopConfiguration (dict) --

    The configuration for an application to automatically stop after a certain amount of time being idle.

    • enabled (boolean) --

      Enables the application to automatically stop after a certain amount of time being idle. Defaults to true.

    • idleTimeoutMinutes (integer) --

      The amount of idle time in minutes after which your application will automatically stop. Defaults to 15 minutes.

  • networkConfiguration (dict) --

    The network configuration for customer VPC connectivity.

    • subnetIds (list) --

      The array of subnet IDs for customer VPC connectivity.

      • (string) --
    • securityGroupIds (list) --

      The array of security group IDs for customer VPC connectivity.

      • (string) --
Return type

dict

Returns

Response Syntax

{
    'application': {
        'applicationId': 'string',
        'name': 'string',
        'arn': 'string',
        'releaseLabel': 'string',
        'type': 'string',
        'state': 'CREATING'|'CREATED'|'STARTING'|'STARTED'|'STOPPING'|'STOPPED'|'TERMINATED',
        'stateDetails': 'string',
        'initialCapacity': {
            'string': {
                'workerCount': 123,
                'workerConfiguration': {
                    'cpu': 'string',
                    'memory': 'string',
                    'disk': 'string'
                }
            }
        },
        'maximumCapacity': {
            'cpu': 'string',
            'memory': 'string',
            'disk': 'string'
        },
        'createdAt': datetime(2015, 1, 1),
        'updatedAt': datetime(2015, 1, 1),
        'tags': {
            'string': 'string'
        },
        'autoStartConfiguration': {
            'enabled': True|False
        },
        'autoStopConfiguration': {
            'enabled': True|False,
            'idleTimeoutMinutes': 123
        },
        'networkConfiguration': {
            'subnetIds': [
                'string',
            ],
            'securityGroupIds': [
                'string',
            ]
        }
    }
}

Response Structure

  • (dict) --

    • application (dict) --

      Information about the updated application.

      • applicationId (string) --

        The ID of the application.

      • name (string) --

        The name of the application.

      • arn (string) --

        The ARN of the application.

      • releaseLabel (string) --

        The EMR release version associated with the application.

      • type (string) --

        The type of application, such as Spark or Hive.

      • state (string) --

        The state of the application.

      • stateDetails (string) --

        The state details of the application.

      • initialCapacity (dict) --

        The initial capacity of the application.

        • (string) --

          • (dict) --

            The initial capacity configuration per worker.

            • workerCount (integer) --

              The number of workers in the initial capacity configuration.

            • workerConfiguration (dict) --

              The resource configuration of the initial capacity configuration.

              • cpu (string) --

                The CPU requirements for every worker instance of the worker type.

              • memory (string) --

                The memory requirements for every worker instance of the worker type.

              • disk (string) --

                The disk requirements for every worker instance of the worker type.

      • maximumCapacity (dict) --

        The maximum capacity of the application. This is cumulative across all workers at any given point in time during the lifespan of the application. No new resources will be created once any one of the defined limits is hit.

        • cpu (string) --

          The maximum allowed CPU for an application.

        • memory (string) --

          The maximum allowed memory for an application.

        • disk (string) --

          The maximum allowed disk for an application.

      • createdAt (datetime) --

        The date and time when the application was created.

      • updatedAt (datetime) --

        The date and time when the application was last updated.

      • tags (dict) --

        The tags assigned to the application.

        • (string) --
          • (string) --
      • autoStartConfiguration (dict) --

        The configuration for an application to automatically start on job submission.

        • enabled (boolean) --

          Enables the application to automatically start on job submission. Defaults to true.

      • autoStopConfiguration (dict) --

        The configuration for an application to automatically stop after a certain amount of time being idle.

        • enabled (boolean) --

          Enables the application to automatically stop after a certain amount of time being idle. Defaults to true.

        • idleTimeoutMinutes (integer) --

          The amount of idle time in minutes after which your application will automatically stop. Defaults to 15 minutes.

      • networkConfiguration (dict) --

        The network configuration for customer VPC connectivity for the application.

        • subnetIds (list) --

          The array of subnet IDs for customer VPC connectivity.

          • (string) --
        • securityGroupIds (list) --

          The array of security group IDs for customer VPC connectivity.

          • (string) --

Exceptions

  • EMRServerless.Client.exceptions.ValidationException
  • EMRServerless.Client.exceptions.ResourceNotFoundException
  • EMRServerless.Client.exceptions.InternalServerException
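
Since an application has to be in a stopped or created state in order to be updated, a caller may want to check the state before sending the update. The sketch below shows one way to do that using get_application. The helper names, the worker type keys (DRIVER, EXECUTOR), and the capacity strings such as "4 vCPU" and "8 GB" are illustrative assumptions, not values confirmed by this reference.

```python
UPDATABLE_STATES = {"CREATED", "STOPPED"}


def build_update_request(application_id, worker_specs, max_cpu, max_memory,
                         idle_timeout_minutes=15):
    """Assemble keyword arguments for client.update_application.

    worker_specs maps a worker type to a (count, cpu, memory) tuple.
    """
    initial_capacity = {}
    for worker_type, (count, cpu, memory) in worker_specs.items():
        if count < 1:
            raise ValueError(f"workerCount for {worker_type} must be at least 1")
        initial_capacity[worker_type] = {
            "workerCount": count,
            "workerConfiguration": {"cpu": cpu, "memory": memory},
        }
    # clientToken is left out here; the field is autopopulated if not provided.
    return {
        "applicationId": application_id,
        "initialCapacity": initial_capacity,
        "maximumCapacity": {"cpu": max_cpu, "memory": max_memory},
        "autoStopConfiguration": {
            "enabled": True,
            "idleTimeoutMinutes": idle_timeout_minutes,
        },
    }


def update_if_allowed(client, request):
    """Send the update only when the application is CREATED or STOPPED."""
    current = client.get_application(applicationId=request["applicationId"])
    if current["application"]["state"] not in UPDATABLE_STATES:
        return None
    return client.update_application(**request)["application"]


# Hypothetical sample values, for illustration only.
request = build_update_request(
    "00example123",
    {"DRIVER": (1, "2 vCPU", "4 GB"), "EXECUTOR": (4, "4 vCPU", "8 GB")},
    max_cpu="32 vCPU",
    max_memory="64 GB",
)
```

update_if_allowed returns None when the application is in any other state, which leaves it to the caller to decide whether to stop the application first or retry later.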

Paginators

The available paginators are:

class EMRServerless.Paginator.ListApplications
paginator = client.get_paginator('list_applications')
paginate(**kwargs)

Creates an iterator that will paginate through responses from EMRServerless.Client.list_applications().

See also: AWS API Documentation

Request Syntax

response_iterator = paginator.paginate(
    states=[
        'CREATING'|'CREATED'|'STARTING'|'STARTED'|'STOPPING'|'STOPPED'|'TERMINATED',
    ],
    PaginationConfig={
        'MaxItems': 123,
        'PageSize': 123,
        'StartingToken': 'string'
    }
)
Parameters
  • states (list) --

    An optional filter for application states. Note that if this filter contains multiple states, the resulting list will be grouped by the state.

    • (string) --
  • PaginationConfig (dict) --

    A dictionary that provides parameters to control pagination.

    • MaxItems (integer) --

      The total number of items to return. If the total number of items available is more than the value specified in MaxItems, then a NextToken will be provided in the output that you can use to resume pagination.

    • PageSize (integer) --

      The size of each page.

    • StartingToken (string) --

      A token to specify where to start paginating. This is the NextToken from a previous response.

Return type

dict

Returns

Response Syntax

{
    'applications': [
        {
            'id': 'string',
            'name': 'string',
            'arn': 'string',
            'releaseLabel': 'string',
            'type': 'string',
            'state': 'CREATING'|'CREATED'|'STARTING'|'STARTED'|'STOPPING'|'STOPPED'|'TERMINATED',
            'stateDetails': 'string',
            'createdAt': datetime(2015, 1, 1),
            'updatedAt': datetime(2015, 1, 1)
        },
    ],
    'NextToken': 'string'
}

Response Structure

  • (dict) --

    • applications (list) --

      The output lists the specified applications.

      • (dict) --

        The summary of attributes associated with an application.

        • id (string) --

          The ID of the application.

        • name (string) --

          The name of the application.

        • arn (string) --

          The ARN of the application.

        • releaseLabel (string) --

          The EMR release version associated with the application.

        • type (string) --

          The type of application, such as Spark or Hive.

        • state (string) --

          The state of the application.

        • stateDetails (string) --

          The state details of the application.

        • createdAt (datetime) --

          The date and time when the application was created.

        • updatedAt (datetime) --

          The date and time when the application was last updated.

    • NextToken (string) --

      A token to resume pagination.
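
The paginator can be consumed page by page with a simple loop. The following sketch flattens the pages into a stream of application summaries and groups their names by state; the helper names iter_applications and group_names_by_state and the default page size are hypothetical.

```python
def iter_applications(client, states=None, page_size=50):
    """Yield application summaries from every page of list_applications."""
    paginator = client.get_paginator("list_applications")
    kwargs = {"PaginationConfig": {"PageSize": page_size}}
    if states:
        kwargs["states"] = list(states)
    for page in paginator.paginate(**kwargs):
        for application in page.get("applications", []):
            yield application


def group_names_by_state(applications):
    """Map each application state to the names of applications in that state."""
    grouped = {}
    for application in applications:
        grouped.setdefault(application["state"], []).append(application["name"])
    return grouped
```

With a real client this could be used as group_names_by_state(iter_applications(client, states=['STARTED', 'STOPPED'])). Because iter_applications is a generator, later pages are only requested as the caller consumes the results.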

class EMRServerless.Paginator.ListJobRuns
paginator = client.get_paginator('list_job_runs')
paginate(**kwargs)

Creates an iterator that will paginate through responses from EMRServerless.Client.list_job_runs().

See also: AWS API Documentation

Request Syntax

response_iterator = paginator.paginate(
    applicationId='string',
    createdAtAfter=datetime(2015, 1, 1),
    createdAtBefore=datetime(2015, 1, 1),
    states=[
        'SUBMITTED'|'PENDING'|'SCHEDULED'|'RUNNING'|'SUCCESS'|'FAILED'|'CANCELLING'|'CANCELLED',
    ],
    PaginationConfig={
        'MaxItems': 123,
        'PageSize': 123,
        'StartingToken': 'string'
    }
)
Parameters
  • applicationId (string) --

    [REQUIRED]

    The ID of the application for which to list the job runs.

  • createdAtAfter (datetime) -- The lower bound of the option to filter by creation date and time.
  • createdAtBefore (datetime) -- The upper bound of the option to filter by creation date and time.
  • states (list) --

    An optional filter for job run states. Note that if this filter contains multiple states, the resulting list will be grouped by the state.

    • (string) --
  • PaginationConfig (dict) --

    A dictionary that provides parameters to control pagination.

    • MaxItems (integer) --

      The total number of items to return. If the total number of items available is more than the value specified in MaxItems, then a NextToken will be provided in the output that you can use to resume pagination.

    • PageSize (integer) --

      The size of each page.

    • StartingToken (string) --

      A token to specify where to start paginating. This is the NextToken from a previous response.

Return type

dict

Returns

Response Syntax

{
    'jobRuns': [
        {
            'applicationId': 'string',
            'id': 'string',
            'name': 'string',
            'arn': 'string',
            'createdBy': 'string',
            'createdAt': datetime(2015, 1, 1),
            'updatedAt': datetime(2015, 1, 1),
            'executionRole': 'string',
            'state': 'SUBMITTED'|'PENDING'|'SCHEDULED'|'RUNNING'|'SUCCESS'|'FAILED'|'CANCELLING'|'CANCELLED',
            'stateDetails': 'string',
            'releaseLabel': 'string',
            'type': 'string'
        },
    ],
    'NextToken': 'string'
}

Response Structure

  • (dict) --

    • jobRuns (list) --

      The output lists information about the specified job runs.

      • (dict) --

        The summary of attributes associated with a job run.

        • applicationId (string) --

          The ID of the application the job is running on.

        • id (string) --

          The ID of the job run.

        • name (string) --

          The optional job run name. This doesn't have to be unique.

        • arn (string) --

          The ARN of the job run.

        • createdBy (string) --

          The user who created the job run.

        • createdAt (datetime) --

          The date and time when the job run was created.

        • updatedAt (datetime) --

          The date and time when the job run was last updated.

        • executionRole (string) --

          The execution role ARN of the job run.

        • state (string) --

          The state of the job run.

        • stateDetails (string) --

          The state details of the job run.

        • releaseLabel (string) --

          The EMR release version associated with the application your job is running on.

        • type (string) --

          The type of job run, such as Spark or Hive.

    • NextToken (string) --

      A token to resume pagination.
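
As an example of combining the filters above, this sketch collects recently created job runs that ended in the FAILED state. The helper name recent_failed_job_runs, the 24-hour window, and the MaxItems cap are illustrative choices, not recommendations from the service.

```python
from datetime import datetime, timedelta, timezone


def recent_failed_job_runs(client, application_id, hours=24, max_items=100, now=None):
    """Return (job run ID, state details) pairs for recent FAILED job runs."""
    now = now or datetime.now(timezone.utc)
    paginator = client.get_paginator("list_job_runs")
    pages = paginator.paginate(
        applicationId=application_id,
        # Lower bound on the creation time of the job runs to include.
        createdAtAfter=now - timedelta(hours=hours),
        states=["FAILED"],
        # Caps the total number of job runs returned across all pages.
        PaginationConfig={"MaxItems": max_items},
    )
    return [
        (run["id"], run.get("stateDetails", ""))
        for page in pages
        for run in page.get("jobRuns", [])
    ]
```

The now parameter exists only to make the time window reproducible, for instance in tests; in normal use it can be left unset.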