start_job_run

EMRContainers.Client.start_job_run(**kwargs)

Starts a job run. A job run is a unit of work, such as a Spark jar, PySpark script, or SparkSQL query, that you submit to Amazon EMR on EKS.

See also: AWS API Documentation

Request Syntax

response = client.start_job_run(
    name='string',
    virtualClusterId='string',
    clientToken='string',
    executionRoleArn='string',
    releaseLabel='string',
    jobDriver={
        'sparkSubmitJobDriver': {
            'entryPoint': 'string',
            'entryPointArguments': [
                'string',
            ],
            'sparkSubmitParameters': 'string'
        },
        'sparkSqlJobDriver': {
            'entryPoint': 'string',
            'sparkSqlParameters': 'string'
        }
    },
    configurationOverrides={
        'applicationConfiguration': [
            {
                'classification': 'string',
                'properties': {
                    'string': 'string'
                },
                'configurations': {'... recursive ...'}
            },
        ],
        'monitoringConfiguration': {
            'persistentAppUI': 'ENABLED'|'DISABLED',
            'cloudWatchMonitoringConfiguration': {
                'logGroupName': 'string',
                'logStreamNamePrefix': 'string'
            },
            's3MonitoringConfiguration': {
                'logUri': 'string'
            }
        }
    },
    tags={
        'string': 'string'
    },
    jobTemplateId='string',
    jobTemplateParameters={
        'string': 'string'
    },
    retryPolicyConfiguration={
        'maxAttempts': 123
    }
)

Parameters
  • name (string) -- The name of the job run.
  • virtualClusterId (string) --

    [REQUIRED]

    The virtual cluster ID for which the job run request is submitted.

  • clientToken (string) --

    [REQUIRED]

    The client idempotency token of the job run request.

    This field is autopopulated if not provided.

  • executionRoleArn (string) -- The execution role ARN for the job run.
  • releaseLabel (string) -- The Amazon EMR release version to use for the job run.
  • jobDriver (dict) --

    The job driver for the job run.

    • sparkSubmitJobDriver (dict) --

      The job driver parameters specified for Spark submit.

      • entryPoint (string) -- [REQUIRED]

        The entry point of the job application.

      • entryPointArguments (list) --

        The arguments for the job application.

        • (string) --
      • sparkSubmitParameters (string) --

        The Spark submit parameters that are used for job runs.

    • sparkSqlJobDriver (dict) --

      The job driver parameters specified for Spark SQL.

      • entryPoint (string) --

        The SQL file to be executed.

      • sparkSqlParameters (string) --

        The Spark parameters to be included in the Spark SQL command.

  • configurationOverrides (dict) --

    The configuration overrides for the job run.

    • applicationConfiguration (list) --

      The configurations for the application run by the job run.

      • (dict) --

        A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.

        • classification (string) -- [REQUIRED]

          The classification within a configuration.

        • properties (dict) --

          A set of properties specified within a configuration classification.

          • (string) --
            • (string) --
        • configurations (list) --

          A list of additional configurations to apply within a configuration object.

    • monitoringConfiguration (dict) --

      The configurations for monitoring.

      • persistentAppUI (string) --

        Monitoring configurations for the persistent application UI.

      • cloudWatchMonitoringConfiguration (dict) --

        Monitoring configurations for CloudWatch.

        • logGroupName (string) -- [REQUIRED]

          The name of the log group for log publishing.

        • logStreamNamePrefix (string) --

          The specified name prefix for log streams.

      • s3MonitoringConfiguration (dict) --

        Amazon S3 configuration for monitoring log publishing.

        • logUri (string) -- [REQUIRED]

          Amazon S3 destination URI for log publishing.

  • tags (dict) --

    The tags assigned to job runs.

    • (string) --
      • (string) --
  • jobTemplateId (string) -- The job template ID to be used to start the job run.
  • jobTemplateParameters (dict) --

    The values of job template parameters to start a job run.

    • (string) --
      • (string) --
  • retryPolicyConfiguration (dict) --

    The retry policy configuration for the job run.

    • maxAttempts (integer) -- [REQUIRED]

      The maximum number of attempts on the job's driver.
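
As a minimal sketch of how the parameters above fit together, the hypothetical helper below assembles the keyword arguments for a Spark submit job run. The helper name, its arguments, and every value in the usage lines (cluster ID, role ARN, release label, S3 paths) are placeholders invented for illustration and are not part of the API; only the request keys come from the Request Syntax. The helper builds a plain dictionary and does not call AWS.

```python
import uuid


def build_spark_submit_request(
    virtual_cluster_id,
    execution_role_arn,
    release_label,
    entry_point,
    entry_point_arguments=None,
    spark_conf=None,
    log_uri=None,
    max_attempts=None,
    client_token=None,
    name=None,
):
    """Assemble keyword arguments for start_job_run (hypothetical helper)."""
    driver = {"entryPoint": entry_point}
    if entry_point_arguments:
        driver["entryPointArguments"] = list(entry_point_arguments)
    if spark_conf:
        # sparkSubmitParameters is one string, so join the options with spaces.
        driver["sparkSubmitParameters"] = " ".join(
            f"--conf {key}={value}" for key, value in spark_conf.items()
        )

    request = {
        "virtualClusterId": virtual_cluster_id,
        # Pass the same token again when re-sending the same logical request.
        "clientToken": client_token or str(uuid.uuid4()),
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "jobDriver": {"sparkSubmitJobDriver": driver},
    }
    if name:
        request["name"] = name
    if log_uri:
        request["configurationOverrides"] = {
            "monitoringConfiguration": {
                "s3MonitoringConfiguration": {"logUri": log_uri}
            }
        }
    if max_attempts is not None:
        request["retryPolicyConfiguration"] = {"maxAttempts": max_attempts}
    return request


# Placeholder values only; substitute identifiers from your own account.
request = build_spark_submit_request(
    virtual_cluster_id="examplecluster",
    execution_role_arn="arn:aws:iam::111122223333:role/example-job-role",
    release_label="emr-6.15.0-latest",
    entry_point="s3://example-bucket/scripts/job.py",
    entry_point_arguments=["--date", "2024-01-01"],
    spark_conf={"spark.executor.instances": "2"},
    log_uri="s3://example-bucket/logs/",
    max_attempts=3,
    name="example-job",
)
```

The resulting dictionary can then be passed as client.start_job_run(**request). Generating the token in the caller, rather than relying on autopopulation, makes it possible to resend an identical request with the same token.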

Return type

dict

Returns

Response Syntax

{
    'id': 'string',
    'name': 'string',
    'arn': 'string',
    'virtualClusterId': 'string'
}

Response Structure

  • (dict) --

    • id (string) --

      This output displays the started job run ID.

    • name (string) --

      This output displays the name of the started job run.

    • arn (string) --

      This output displays the ARN of the started job run.

    • virtualClusterId (string) --

      This output displays the virtual cluster ID for which the job run was submitted.
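
A job run is identified in later calls by its ID together with its virtual cluster ID, so a common next step is to pull those two fields out of the response. The sketch below assumes a response shaped like the Response Syntax above; the helper name and all sample values are hypothetical.

```python
def describe_job_run_args(start_response):
    """Pick the fields that identify a job run in follow-up calls."""
    return {
        "id": start_response["id"],
        "virtualClusterId": start_response["virtualClusterId"],
    }


# Sample response with made-up values, shaped like the Response Syntax.
sample_response = {
    "id": "0000000example",
    "name": "example-job",
    "arn": "arn:aws:emr-containers:us-east-1:111122223333:"
           "/virtualclusters/examplecluster/jobruns/0000000example",
    "virtualClusterId": "examplecluster",
}

follow_up = describe_job_run_args(sample_response)
# With a real client: client.describe_job_run(**follow_up)
```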

Exceptions

  • EMRContainers.Client.exceptions.ValidationException
  • EMRContainers.Client.exceptions.ResourceNotFoundException
  • EMRContainers.Client.exceptions.InternalServerException
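
The exceptions listed above are exposed on the client's exceptions attribute, so they can be caught without importing service-specific classes. The following is one possible way to handle them, not a prescribed pattern: the wrapper name and the status strings it returns are assumptions made for illustration.

```python
def start_job_run_safely(client, **request):
    """Call start_job_run and map each documented exception to an outcome."""
    try:
        response = client.start_job_run(**request)
    except client.exceptions.ValidationException as err:
        # The request itself is malformed; resending it unchanged will not help.
        return {"status": "invalid_request", "error": str(err)}
    except client.exceptions.ResourceNotFoundException as err:
        # A referenced resource, such as the virtual cluster, was not found.
        return {"status": "not_found", "error": str(err)}
    except client.exceptions.InternalServerException as err:
        # Service-side failure; the caller may choose to retry later.
        return {"status": "retry_later", "error": str(err)}
    return {"status": "started", "jobRunId": response["id"]}
```

When retrying after an InternalServerException, sending the same clientToken as the original request lets the service treat the retry as the same request.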