update_training_job
- SageMaker.Client.update_training_job(**kwargs)
Update a model training job to request a new Debugger profiling configuration or to change warm pool retention length.
See also: AWS API Documentation
Request Syntax
response = client.update_training_job(
    TrainingJobName='string',
    ProfilerConfig={
        'S3OutputPath': 'string',
        'ProfilingIntervalInMilliseconds': 123,
        'ProfilingParameters': {
            'string': 'string'
        },
        'DisableProfiler': True|False
    },
    ProfilerRuleConfigurations=[
        {
            'RuleConfigurationName': 'string',
            'LocalPath': 'string',
            'S3OutputPath': 'string',
            'RuleEvaluatorImage': 'string',
            'InstanceType': 'ml.t3.medium'|'ml.t3.large'|'ml.t3.xlarge'|'ml.t3.2xlarge'|'ml.m4.xlarge'|'ml.m4.2xlarge'|'ml.m4.4xlarge'|'ml.m4.10xlarge'|'ml.m4.16xlarge'|'ml.c4.xlarge'|'ml.c4.2xlarge'|'ml.c4.4xlarge'|'ml.c4.8xlarge'|'ml.p2.xlarge'|'ml.p2.8xlarge'|'ml.p2.16xlarge'|'ml.p3.2xlarge'|'ml.p3.8xlarge'|'ml.p3.16xlarge'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.18xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.12xlarge'|'ml.m5.24xlarge'|'ml.r5.large'|'ml.r5.xlarge'|'ml.r5.2xlarge'|'ml.r5.4xlarge'|'ml.r5.8xlarge'|'ml.r5.12xlarge'|'ml.r5.16xlarge'|'ml.r5.24xlarge'|'ml.g4dn.xlarge'|'ml.g4dn.2xlarge'|'ml.g4dn.4xlarge'|'ml.g4dn.8xlarge'|'ml.g4dn.12xlarge'|'ml.g4dn.16xlarge',
            'VolumeSizeInGB': 123,
            'RuleParameters': {
                'string': 'string'
            }
        },
    ],
    ResourceConfig={
        'KeepAlivePeriodInSeconds': 123
    },
    RemoteDebugConfig={
        'EnableRemoteDebug': True|False
    }
)
- Parameters:
TrainingJobName (string) –
[REQUIRED]
The name of the training job whose Debugger profiling configuration is to be updated.
ProfilerConfig (dict) –
Configuration information for Amazon SageMaker Debugger system monitoring, framework profiling, and storage paths.
S3OutputPath (string) –
Path to Amazon S3 storage location for system and framework metrics.
ProfilingIntervalInMilliseconds (integer) –
The interval, in milliseconds, at which system metrics are captured. Available values are 100, 200, 500, 1000 (1 second), 5000 (5 seconds), and 60000 (1 minute). The default value is 500 milliseconds.
ProfilingParameters (dict) –
Configuration information for capturing framework metrics. Available key strings for different profiling options are DetailedProfilingConfig, PythonProfilingConfig, and DataLoaderProfilingConfig. To learn more about how to configure the ProfilingParameters parameter, see Use the SageMaker and Debugger Configuration API Operations to Create, Update, and Debug Your Training Job.
(string) –
(string) –
DisableProfiler (boolean) –
To turn off Amazon SageMaker Debugger monitoring and profiling while a training job is in progress, set to True.
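The allowed values for ProfilingIntervalInMilliseconds listed above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of boto3: the helper name build_profiler_config and the bucket path are hypothetical, and ProfilingParameters is left out because its value format is not described in this section.

```python
# Allowed sampling intervals, as listed in the parameter description above.
ALLOWED_INTERVALS_MS = (100, 200, 500, 1000, 5000, 60000)


def build_profiler_config(s3_output_path, interval_ms=500, disable=False):
    """Build a ProfilerConfig dict (hypothetical helper, not part of boto3)."""
    if interval_ms not in ALLOWED_INTERVALS_MS:
        raise ValueError(
            f"interval_ms must be one of {ALLOWED_INTERVALS_MS}, got {interval_ms}"
        )
    return {
        "S3OutputPath": s3_output_path,
        "ProfilingIntervalInMilliseconds": interval_ms,
        "DisableProfiler": disable,
    }


# Placeholder bucket path; replace it with a location you own.
profiler_config = build_profiler_config("s3://example-bucket/profiler-output", 1000)
```

The resulting dict can then be passed as ProfilerConfig=profiler_config in a call to update_training_job.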
ProfilerRuleConfigurations (list) –
Configuration information for Amazon SageMaker Debugger rules for profiling system and framework metrics.
(dict) –
Configuration information for profiling rules.
RuleConfigurationName (string) – [REQUIRED]
The name of the rule configuration. It must be unique relative to other rule configuration names.
LocalPath (string) –
Path to local storage location for output of rules. Defaults to /opt/ml/processing/output/rule/.
S3OutputPath (string) –
Path to Amazon S3 storage location for rules.
RuleEvaluatorImage (string) – [REQUIRED]
The Amazon Elastic Container Registry Image for the managed rule evaluation.
InstanceType (string) –
The instance type to deploy a custom rule for profiling a training job.
VolumeSizeInGB (integer) –
The size, in GB, of the ML storage volume attached to the processing instance.
RuleParameters (dict) –
Runtime configuration for rule container.
(string) –
(string) –
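Because each rule configuration requires RuleConfigurationName and RuleEvaluatorImage, and names must be unique, a small pre-flight check can catch mistakes before the API call. This is a sketch under stated assumptions: validate_rule_configurations is a hypothetical helper, the image URI is a placeholder to fill in, and the RuleParameters entry is an illustrative key/value pair that should be checked against the Debugger rule documentation.

```python
REQUIRED_RULE_KEYS = ("RuleConfigurationName", "RuleEvaluatorImage")


def validate_rule_configurations(rules):
    """Check required keys and name uniqueness (hypothetical helper)."""
    seen = set()
    for rule in rules:
        missing = [key for key in REQUIRED_RULE_KEYS if key not in rule]
        if missing:
            raise ValueError(f"rule is missing required keys: {missing}")
        name = rule["RuleConfigurationName"]
        if name in seen:
            raise ValueError(f"duplicate RuleConfigurationName: {name}")
        seen.add(name)
    return rules


rules = validate_rule_configurations([
    {
        "RuleConfigurationName": "ProfilerReport-example",   # placeholder name
        "RuleEvaluatorImage": "<rule-evaluator-image-uri>",  # placeholder, fill in
        "InstanceType": "ml.t3.medium",
        "VolumeSizeInGB": 10,
        # Illustrative parameter only; an assumption, not taken from this page.
        "RuleParameters": {"rule_to_invoke": "ProfilerReport"},
    },
])
```

The validated list can be passed as ProfilerRuleConfigurations=rules.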
ResourceConfig (dict) –
The ResourceConfig of the training job, used to update the warm pool retention length.
KeepAlivePeriodInSeconds (integer) – [REQUIRED]
The KeepAlivePeriodInSeconds value specified in the ResourceConfig to update.
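Changing the warm pool retention length only needs the job name and a ResourceConfig containing KeepAlivePeriodInSeconds. A minimal sketch follows; update_keep_alive is a hypothetical wrapper, and the client argument is assumed to be a SageMaker client such as the one returned by boto3.client("sagemaker").

```python
def update_keep_alive(client, job_name, keep_alive_seconds):
    """Request a new warm pool retention length and return the job ARN.

    `client` is assumed to be a SageMaker client; this wrapper is a
    hypothetical convenience, not part of boto3.
    """
    response = client.update_training_job(
        TrainingJobName=job_name,
        ResourceConfig={"KeepAlivePeriodInSeconds": keep_alive_seconds},
    )
    return response["TrainingJobArn"]


# Example usage (requires AWS credentials; the job name is a placeholder):
#   import boto3
#   client = boto3.client("sagemaker")
#   arn = update_keep_alive(client, "my-training-job", 1800)
```

Taking the client as an argument keeps the wrapper easy to exercise with a stub in tests.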
RemoteDebugConfig (dict) –
Configuration for remote debugging while the training job is running. You can update the remote debugging configuration when the SecondaryStatus of the job is Downloading or Training. To learn more about the remote debugging functionality of SageMaker, see Access a training container through Amazon Web Services Systems Manager (SSM) for remote debugging.
EnableRemoteDebug (boolean) –
If set to True, enables remote debugging.
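Since the remote debugging configuration can only be updated while SecondaryStatus is Downloading or Training, one option is to read the status with describe_training_job first. This is a sketch, not a definitive pattern: enable_remote_debug is a hypothetical helper, and the status can still change between the two calls, so the update may be rejected even after the check passes.

```python
# Statuses in which the description above says the update is allowed.
UPDATABLE_STATUSES = ("Downloading", "Training")


def enable_remote_debug(client, job_name):
    """Enable remote debugging if the job is in an updatable SecondaryStatus.

    Returns the TrainingJobArn on success, or None when the status check
    fails. `client` is assumed to be a SageMaker client.
    """
    description = client.describe_training_job(TrainingJobName=job_name)
    if description["SecondaryStatus"] not in UPDATABLE_STATUSES:
        return None
    response = client.update_training_job(
        TrainingJobName=job_name,
        RemoteDebugConfig={"EnableRemoteDebug": True},
    )
    return response["TrainingJobArn"]
```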
- Return type:
dict
- Returns:
Response Syntax
{
    'TrainingJobArn': 'string'
}
Response Structure
(dict) –
TrainingJobArn (string) –
The Amazon Resource Name (ARN) of the training job.
Exceptions
SageMaker.Client.exceptions.ResourceNotFound
SageMaker.Client.exceptions.ResourceLimitExceeded
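The two modeled exceptions listed above are exposed on the client's exceptions attribute, so they can be caught directly. The sketch below shows one possible way to handle them; try_update and its status strings are hypothetical, and other errors (for example validation failures) are deliberately left to propagate.

```python
def try_update(client, job_name, **update_kwargs):
    """Call update_training_job and return a (status, arn) tuple.

    The status strings are an illustrative convention, not part of the API.
    `client` is assumed to be a SageMaker client.
    """
    try:
        response = client.update_training_job(
            TrainingJobName=job_name, **update_kwargs
        )
    except client.exceptions.ResourceNotFound:
        # No training job with this name was found.
        return ("not_found", None)
    except client.exceptions.ResourceLimitExceeded:
        # An account or resource limit was exceeded.
        return ("limit_exceeded", None)
    return ("ok", response["TrainingJobArn"])
```

A caller might use it as try_update(client, "my-training-job", ResourceConfig={"KeepAlivePeriodInSeconds": 0}), where the job name is a placeholder.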