start_job_run

EMRContainers.Client.start_job_run(**kwargs)

Starts a job run. A job run is a unit of work, such as a Spark jar, PySpark script, or SparkSQL query, that you submit to Amazon EMR on EKS.

Request Syntax

response = client.start_job_run(
    name='string',
    virtualClusterId='string',
    clientToken='string',
    executionRoleArn='string',
    releaseLabel='string',
    jobDriver={
        'sparkSubmitJobDriver': {
            'entryPoint': 'string',
            'entryPointArguments': [
                'string',
            ],
            'sparkSubmitParameters': 'string'
        },
        'sparkSqlJobDriver': {
            'entryPoint': 'string',
            'sparkSqlParameters': 'string'
        }
    },
    configurationOverrides={
        'applicationConfiguration': [
            {
                'classification': 'string',
                'properties': {
                    'string': 'string'
                },
                'configurations': {'... recursive ...'}
            },
        ],
        'monitoringConfiguration': {
            'managedLogs': {
                'allowAWSToRetainLogs': 'ENABLED'|'DISABLED',
                'encryptionKeyArn': 'string'
            },
            'persistentAppUI': 'ENABLED'|'DISABLED',
            'cloudWatchMonitoringConfiguration': {
                'logGroupName': 'string',
                'logStreamNamePrefix': 'string'
            },
            's3MonitoringConfiguration': {
                'logUri': 'string'
            },
            'containerLogRotationConfiguration': {
                'rotationSize': 'string',
                'maxFilesToKeep': 123
            }
        }
    },
    tags={
        'string': 'string'
    },
    jobTemplateId='string',
    jobTemplateParameters={
        'string': 'string'
    },
    retryPolicyConfiguration={
        'maxAttempts': 123
    }
)
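
As a minimal sketch, the request below uses only the Spark submit driver. The helper `build_spark_submit_request` is illustrative and not part of boto3, and every value (cluster ID, role ARN, release label, S3 path) is a hypothetical placeholder.

```python
def build_spark_submit_request(virtual_cluster_id, execution_role_arn,
                               release_label, entry_point, args=None,
                               spark_params=None, name=None):
    """Assemble keyword arguments for start_job_run (illustrative helper)."""
    driver = {"entryPoint": entry_point}
    if args:
        driver["entryPointArguments"] = list(args)
    if spark_params:
        driver["sparkSubmitParameters"] = spark_params

    request = {
        "virtualClusterId": virtual_cluster_id,
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "jobDriver": {"sparkSubmitJobDriver": driver},
    }
    if name:
        request["name"] = name
    return request


# Placeholder values; replace them with your own resources.
request = build_spark_submit_request(
    virtual_cluster_id="abcdefghij1234567890",
    execution_role_arn="arn:aws:iam::111122223333:role/example-job-role",
    release_label="emr-6.15.0-latest",
    entry_point="s3://example-bucket/scripts/job.py",
    args=["--date", "2024-01-01"],
    spark_params="--conf spark.executor.instances=2",
    name="example-job",
)

# With credentials configured, the request could be sent like this:
# import boto3
# client = boto3.client("emr-containers")
# response = client.start_job_run(**request)
```

The sketch leaves out clientToken and relies on the autopopulation described in the parameter list.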

Parameters:

name (string) – The name of the job run.
virtualClusterId (string) –
[REQUIRED]

The virtual cluster ID for which the job run request is submitted.
clientToken (string) –
[REQUIRED]

The client idempotency token of the job run request.

This field is autopopulated if not provided.
executionRoleArn (string) – The execution role ARN for the job run.
releaseLabel (string) – The Amazon EMR release version to use for the job run.
jobDriver (dict) –
The job driver for the job run.
- sparkSubmitJobDriver (dict) –
  
  The job driver parameters specified for Spark submit.
  - entryPoint (string) – [REQUIRED]
    
    The entry point of the job application.
  - entryPointArguments (list) –
    
    The arguments for the job application.
    - (string) –
  - sparkSubmitParameters (string) –
    
    The Spark submit parameters that are used for job runs.
- sparkSqlJobDriver (dict) –
  
  The job driver parameters specified for Spark SQL.
  - entryPoint (string) –
    
    The SQL file to be executed.
  - sparkSqlParameters (string) –
    
    The Spark parameters to be included in the Spark SQL command.
configurationOverrides (dict) –
The configuration overrides for the job run.
- applicationConfiguration (list) –
  
  The configurations for the application run by the job run.
  - (dict) –
    
    A configuration specification to be used when provisioning virtual clusters, which can include configurations for applications and software bundled with Amazon EMR on EKS. A configuration consists of a classification, properties, and optional nested configurations. A classification refers to an application-specific configuration file. Properties are the settings you want to change in that file.
    - classification (string) – [REQUIRED]
      
      The classification within a configuration.
    - properties (dict) –
      
      A set of properties specified within a configuration classification.
      - (string) –
        
        (string) –
    - configurations (list) –
      
      A list of additional configurations to apply within a configuration object.
- monitoringConfiguration (dict) –
  
  The configurations for monitoring.
  - managedLogs (dict) –
    
    The entity that controls configuration for managed logs.
    - allowAWSToRetainLogs (string) –
      
      Determines whether Amazon Web Services can retain logs.
    - encryptionKeyArn (string) –
      
      The Amazon Resource Name (ARN) of the encryption key for logs.
  - persistentAppUI (string) –
    
    Monitoring configurations for the persistent application UI.
  - cloudWatchMonitoringConfiguration (dict) –
    
    Monitoring configurations for CloudWatch.
    - logGroupName (string) – [REQUIRED]
      
      The name of the log group for log publishing.
    - logStreamNamePrefix (string) –
      
      The specified name prefix for log streams.
  - s3MonitoringConfiguration (dict) –
    
    Amazon S3 configuration for monitoring log publishing.
    - logUri (string) – [REQUIRED]
      
      Amazon S3 destination URI for log publishing.
  - containerLogRotationConfiguration (dict) –
    
    The configuration for container log rotation.
    - rotationSize (string) – [REQUIRED]
      
      The file size at which to rotate logs. The minimum is 2KB and the maximum is 2GB.
    - maxFilesToKeep (integer) – [REQUIRED]
      
      The number of files to keep in the container after rotation.
tags (dict) –
The tags assigned to job runs.
- (string) –
  - (string) –
jobTemplateId (string) – The job template ID to be used to start the job run.
jobTemplateParameters (dict) –
The values of job template parameters to start a job run.
- (string) –
  - (string) –
retryPolicyConfiguration (dict) –
The retry policy configuration for the job run.
- maxAttempts (integer) – [REQUIRED]
  
  The maximum number of attempts on the job’s driver.
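
The optional parameters above can be combined as in the following sketch. `build_overrides` is an illustrative helper, not part of boto3. The log group, S3 URI, tag, and Spark property are placeholders, and generating clientToken with `uuid` is just one possible way to supply an idempotency token.

```python
import uuid


def build_overrides(log_group, log_uri, spark_defaults=None):
    """Build a configurationOverrides dict (illustrative helper)."""
    overrides = {
        "monitoringConfiguration": {
            "persistentAppUI": "ENABLED",
            "cloudWatchMonitoringConfiguration": {
                "logGroupName": log_group,         # required in this block
                "logStreamNamePrefix": "example",  # placeholder prefix
            },
            "s3MonitoringConfiguration": {"logUri": log_uri},
            "containerLogRotationConfiguration": {
                "rotationSize": "10MB",  # documented range: 2KB to 2GB
                "maxFilesToKeep": 5,
            },
        }
    }
    if spark_defaults:
        # Property keys and values are both strings.
        overrides["applicationConfiguration"] = [
            {"classification": "spark-defaults",
             "properties": dict(spark_defaults)}
        ]
    return overrides


extra_kwargs = {
    "configurationOverrides": build_overrides(
        log_group="/example/emr-on-eks",
        log_uri="s3://example-bucket/logs/",
        spark_defaults={"spark.executor.memory": "2G"},
    ),
    "retryPolicyConfiguration": {"maxAttempts": 3},
    "tags": {"team": "example"},
    # Keep the same token if the same request is resubmitted.
    "clientToken": str(uuid.uuid4()),
}
```

These keyword arguments would be passed alongside the required ones, for example `client.start_job_run(virtualClusterId=..., **extra_kwargs)`.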

Return type:

dict

Returns:

Response Syntax

{
    'id': 'string',
    'name': 'string',
    'arn': 'string',
    'virtualClusterId': 'string'
}

Response Structure

(dict) –
- id (string) –
  
  This output displays the started job run ID.
- name (string) –
  
  This output displays the name of the started job run.
- arn (string) –
  
  This output lists the ARN of the job run.
- virtualClusterId (string) –
  
  This output displays the virtual cluster ID for which the job run was submitted.
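
A short sketch of reading the response fields follows. The sample dictionary is hand-written to match the response syntax and is not real service output, and `describe_started_job` is a hypothetical helper.

```python
def describe_started_job(response):
    """Format the fields returned by start_job_run into one line."""
    # name is optional in the request, so fall back to a marker if absent.
    name = response.get("name") or "<unnamed>"
    return (f"job run {response['id']} ({name}) "
            f"on virtual cluster {response['virtualClusterId']}: "
            f"{response['arn']}")


# Hand-written sample shaped like the response syntax; not real output.
sample_response = {
    "id": "0000000000000000001",
    "name": "example-job",
    "arn": ("arn:aws:emr-containers:us-east-1:111122223333:"
            "/virtualclusters/abc/jobruns/0000000000000000001"),
    "virtualClusterId": "abc",
}
summary = describe_started_job(sample_response)
```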

Exceptions

EMRContainers.Client.exceptions.ValidationException
EMRContainers.Client.exceptions.ResourceNotFoundException
EMRContainers.Client.exceptions.InternalServerException
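
The listed exceptions can be caught through the client's `exceptions` attribute. The sketch below assumes `client` is an EMRContainers client. The `(status, payload)` return convention and the status strings are illustrative choices, not part of the API.

```python
def start_job_run_safely(client, **request):
    """Call start_job_run and map the documented exceptions to a status.

    Returns a (status, payload) tuple; this convention is illustrative.
    """
    try:
        return "started", client.start_job_run(**request)
    except client.exceptions.ValidationException as err:
        # The request parameters were rejected.
        return "invalid_request", str(err)
    except client.exceptions.ResourceNotFoundException as err:
        # A resource referenced by the request was not found.
        return "not_found", str(err)
    except client.exceptions.InternalServerException as err:
        # Server-side failure reported by the service.
        return "server_error", str(err)
```

A typical call might look like `status, payload = start_job_run_safely(boto3.client("emr-containers"), **request)`, where `request` holds the keyword arguments described under Parameters.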