list_inference_recommendations_job_steps
- SageMaker.Client.list_inference_recommendations_job_steps(**kwargs)
Returns a list of the subtasks for an Inference Recommender job.
The supported subtasks are benchmarks, which evaluate the performance of your model on different instance types.
See also: AWS API Documentation
Request Syntax
response = client.list_inference_recommendations_job_steps(
    JobName='string',
    Status='PENDING'|'IN_PROGRESS'|'COMPLETED'|'FAILED'|'STOPPING'|'STOPPED'|'DELETING'|'DELETED',
    StepType='BENCHMARK',
    MaxResults=123,
    NextToken='string'
)
- Parameters:
JobName (string) –
[REQUIRED]
The name for the Inference Recommender job.
Status (string) – A filter to return benchmarks of a specified status. If this field is left empty, then all benchmarks are returned.
StepType (string) –
A filter to return details about the specified type of subtask.
BENCHMARK: Evaluate the performance of your model on different instance types.
MaxResults (integer) – The maximum number of results to return.
NextToken (string) – A token that you can specify to return more results from the list. Specify this field if you have a token that was returned from a previous request.
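The NextToken parameter can be used to page through all steps of a job. The following is a minimal sketch, assuming `client` is a SageMaker client such as one created with `boto3.client("sagemaker")`. The helper name `iter_job_steps` and the default page size are illustrative and not part of the API.

```python
def iter_job_steps(client, job_name, status=None, page_size=50):
    """Yield every step of an Inference Recommender job, following NextToken.

    `client` is assumed to be a SageMaker client, for example
    boto3.client("sagemaker"). The helper name is hypothetical.
    """
    kwargs = {"JobName": job_name, "StepType": "BENCHMARK", "MaxResults": page_size}
    if status is not None:
        # Leaving Status out returns benchmarks in every state.
        kwargs["Status"] = status
    while True:
        response = client.list_inference_recommendations_job_steps(**kwargs)
        yield from response.get("Steps", [])
        token = response.get("NextToken")
        if not token:
            break
        # Pass the token from the previous response to fetch the next page.
        kwargs["NextToken"] = token
```

A call such as `list(iter_job_steps(client, "my-job", status="COMPLETED"))` would then collect only the completed benchmarks, where `"my-job"` is a placeholder job name.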
- Return type:
dict
- Returns:
Response Syntax
{
    'Steps': [
        {
            'StepType': 'BENCHMARK',
            'JobName': 'string',
            'Status': 'PENDING'|'IN_PROGRESS'|'COMPLETED'|'FAILED'|'STOPPING'|'STOPPED'|'DELETING'|'DELETED',
            'InferenceBenchmark': {
                'Metrics': {
                    'CostPerHour': ...,
                    'CostPerInference': ...,
                    'MaxInvocations': 123,
                    'ModelLatency': 123,
                    'CpuUtilization': ...,
                    'MemoryUtilization': ...,
                    'ModelSetupTime': 123
                },
                'EndpointMetrics': {
                    'MaxInvocations': 123,
                    'ModelLatency': 123
                },
                'EndpointConfiguration': {
                    'EndpointName': 'string',
                    'VariantName': 'string',
                    'InstanceType': 'ml.t2.medium'|'ml.t2.large'|'ml.t2.xlarge'|'ml.t2.2xlarge'|'ml.m4.xlarge'|'ml.m4.2xlarge'|'ml.m4.4xlarge'|'ml.m4.10xlarge'|'ml.m4.16xlarge'|'ml.m5.large'|'ml.m5.xlarge'|'ml.m5.2xlarge'|'ml.m5.4xlarge'|'ml.m5.12xlarge'|'ml.m5.24xlarge'|'ml.m5d.large'|'ml.m5d.xlarge'|'ml.m5d.2xlarge'|'ml.m5d.4xlarge'|'ml.m5d.12xlarge'|'ml.m5d.24xlarge'|'ml.c4.large'|'ml.c4.xlarge'|'ml.c4.2xlarge'|'ml.c4.4xlarge'|'ml.c4.8xlarge'|'ml.p2.xlarge'|'ml.p2.8xlarge'|'ml.p2.16xlarge'|'ml.p3.2xlarge'|'ml.p3.8xlarge'|'ml.p3.16xlarge'|'ml.c5.large'|'ml.c5.xlarge'|'ml.c5.2xlarge'|'ml.c5.4xlarge'|'ml.c5.9xlarge'|'ml.c5.18xlarge'|'ml.c5d.large'|'ml.c5d.xlarge'|'ml.c5d.2xlarge'|'ml.c5d.4xlarge'|'ml.c5d.9xlarge'|'ml.c5d.18xlarge'|'ml.g4dn.xlarge'|'ml.g4dn.2xlarge'|'ml.g4dn.4xlarge'|'ml.g4dn.8xlarge'|'ml.g4dn.12xlarge'|'ml.g4dn.16xlarge'|'ml.r5.large'|'ml.r5.xlarge'|'ml.r5.2xlarge'|'ml.r5.4xlarge'|'ml.r5.12xlarge'|'ml.r5.24xlarge'|'ml.r5d.large'|'ml.r5d.xlarge'|'ml.r5d.2xlarge'|'ml.r5d.4xlarge'|'ml.r5d.12xlarge'|'ml.r5d.24xlarge'|'ml.inf1.xlarge'|'ml.inf1.2xlarge'|'ml.inf1.6xlarge'|'ml.inf1.24xlarge'|'ml.dl1.24xlarge'|'ml.c6i.large'|'ml.c6i.xlarge'|'ml.c6i.2xlarge'|'ml.c6i.4xlarge'|'ml.c6i.8xlarge'|'ml.c6i.12xlarge'|'ml.c6i.16xlarge'|'ml.c6i.24xlarge'|'ml.c6i.32xlarge'|'ml.g5.xlarge'|'ml.g5.2xlarge'|'ml.g5.4xlarge'|'ml.g5.8xlarge'|'ml.g5.12xlarge'|'ml.g5.16xlarge'|'ml.g5.24xlarge'|'ml.g5.48xlarge'|'ml.g6.xlarge'|'ml.g6.2xlarge'|'ml.g6.4xlarge'|'ml.g6.8xlarge'|'ml.g6.12xlarge'|'ml.g6.16xlarge'|'ml.g6.24xlarge'|'ml.g6.48xlarge'|'ml.p4d.24xlarge'|'ml.c7g.large'|'ml.c7g.xlarge'|'ml.c7g.2xlarge'|'ml.c7g.4xlarge'|'ml.c7g.8xlarge'|'ml.c7g.12xlarge'|'ml.c7g.16xlarge'|'ml.m6g.large'|'ml.m6g.xlarge'|'ml.m6g.2xlarge'|'ml.m6g.4xlarge'|'ml.m6g.8xlarge'|'ml.m6g.12xlarge'|'ml.m6g.16xlarge'|'ml.m6gd.large'|'ml.m6gd.xlarge'|'ml.m6gd.2xlarge'|'ml.m6gd.4xlarge'|'ml.m6gd.8xlarge'|'ml.m6gd.12xlarge'|'ml.m6gd.16xlarge'|'ml.c6g.large'|'ml.c6g.xlarge'|'ml.c6g.2xlarge'|'ml.c6g.4xlarge'|'ml.c6g.8xlarge'|'ml.c6g.12xlarge'|'ml.c6g.16xlarge'|'ml.c6gd.large'|'ml.c6gd.xlarge'|'ml.c6gd.2xlarge'|'ml.c6gd.4xlarge'|'ml.c6gd.8xlarge'|'ml.c6gd.12xlarge'|'ml.c6gd.16xlarge'|'ml.c6gn.large'|'ml.c6gn.xlarge'|'ml.c6gn.2xlarge'|'ml.c6gn.4xlarge'|'ml.c6gn.8xlarge'|'ml.c6gn.12xlarge'|'ml.c6gn.16xlarge'|'ml.r6g.large'|'ml.r6g.xlarge'|'ml.r6g.2xlarge'|'ml.r6g.4xlarge'|'ml.r6g.8xlarge'|'ml.r6g.12xlarge'|'ml.r6g.16xlarge'|'ml.r6gd.large'|'ml.r6gd.xlarge'|'ml.r6gd.2xlarge'|'ml.r6gd.4xlarge'|'ml.r6gd.8xlarge'|'ml.r6gd.12xlarge'|'ml.r6gd.16xlarge'|'ml.p4de.24xlarge'|'ml.trn1.2xlarge'|'ml.trn1.32xlarge'|'ml.trn1n.32xlarge'|'ml.inf2.xlarge'|'ml.inf2.8xlarge'|'ml.inf2.24xlarge'|'ml.inf2.48xlarge'|'ml.p5.48xlarge'|'ml.m7i.large'|'ml.m7i.xlarge'|'ml.m7i.2xlarge'|'ml.m7i.4xlarge'|'ml.m7i.8xlarge'|'ml.m7i.12xlarge'|'ml.m7i.16xlarge'|'ml.m7i.24xlarge'|'ml.m7i.48xlarge'|'ml.c7i.large'|'ml.c7i.xlarge'|'ml.c7i.2xlarge'|'ml.c7i.4xlarge'|'ml.c7i.8xlarge'|'ml.c7i.12xlarge'|'ml.c7i.16xlarge'|'ml.c7i.24xlarge'|'ml.c7i.48xlarge'|'ml.r7i.large'|'ml.r7i.xlarge'|'ml.r7i.2xlarge'|'ml.r7i.4xlarge'|'ml.r7i.8xlarge'|'ml.r7i.12xlarge'|'ml.r7i.16xlarge'|'ml.r7i.24xlarge'|'ml.r7i.48xlarge',
                    'InitialInstanceCount': 123,
                    'ServerlessConfig': {
                        'MemorySizeInMB': 123,
                        'MaxConcurrency': 123,
                        'ProvisionedConcurrency': 123
                    }
                },
                'ModelConfiguration': {
                    'InferenceSpecificationName': 'string',
                    'EnvironmentParameters': [
                        {
                            'Key': 'string',
                            'ValueType': 'string',
                            'Value': 'string'
                        },
                    ],
                    'CompilationJobName': 'string'
                },
                'FailureReason': 'string',
                'InvocationEndTime': datetime(2015, 1, 1),
                'InvocationStartTime': datetime(2015, 1, 1)
            }
        },
    ],
    'NextToken': 'string'
}
Response Structure
(dict) –
Steps (list) –
A list of all subtask details in Inference Recommender.
(dict) –
A returned array object for the Steps response field in the ListInferenceRecommendationsJobSteps API command.
StepType (string) –
The type of the subtask.
BENCHMARK: Evaluate the performance of your model on different instance types.
JobName (string) –
The name of the Inference Recommender job.
Status (string) –
The current status of the benchmark.
InferenceBenchmark (dict) –
The details for a specific benchmark.
Metrics (dict) –
The metrics of recommendations.
CostPerHour (float) –
Defines the cost per hour for the instance.
CostPerInference (float) –
Defines the cost per inference for the instance.
MaxInvocations (integer) –
The expected maximum number of requests per minute for the instance.
ModelLatency (integer) –
The expected model latency at maximum invocations per minute for the instance.
CpuUtilization (float) –
The expected CPU utilization at maximum invocations per minute for the instance.
NaN indicates that the value is not available.
MemoryUtilization (float) –
The expected memory utilization at maximum invocations per minute for the instance.
NaN indicates that the value is not available.
ModelSetupTime (integer) –
The time it takes to launch new compute resources for a serverless endpoint. The time can vary depending on the model size, how long it takes to download the model, and the start-up time of the container.
NaN indicates that the value is not available.
EndpointMetrics (dict) –
The metrics for an existing endpoint compared in an Inference Recommender job.
MaxInvocations (integer) –
The expected maximum number of requests per minute for the instance.
ModelLatency (integer) –
The expected model latency at maximum invocations per minute for the instance.
EndpointConfiguration (dict) –
The endpoint configuration made by Inference Recommender during a recommendation job.
EndpointName (string) –
The name of the endpoint made during a recommendation job.
VariantName (string) –
The name of the production variant (deployed model) made during a recommendation job.
InstanceType (string) –
The instance type recommended by Amazon SageMaker Inference Recommender.
InitialInstanceCount (integer) –
The number of instances recommended to launch initially.
ServerlessConfig (dict) –
Specifies the serverless configuration for an endpoint variant.
MemorySizeInMB (integer) –
The memory size of your serverless endpoint. Valid values are in 1 GB increments: 1024 MB, 2048 MB, 3072 MB, 4096 MB, 5120 MB, or 6144 MB.
MaxConcurrency (integer) –
The maximum number of concurrent invocations your serverless endpoint can process.
ProvisionedConcurrency (integer) –
The amount of provisioned concurrency to allocate for the serverless endpoint. Should be less than or equal to MaxConcurrency.
Note
This field is not supported for serverless endpoint recommendations for Inference Recommender jobs. For more information about creating an Inference Recommender job, see CreateInferenceRecommendationsJobs.
ModelConfiguration (dict) –
Defines the model configuration. Includes the specification name and environment parameters.
InferenceSpecificationName (string) –
The inference specification name in the model package version.
EnvironmentParameters (list) –
Defines the environment parameters, which include keys, value types, and values.
(dict) –
A list of environment parameters suggested by the Amazon SageMaker Inference Recommender.
Key (string) –
The environment key suggested by the Amazon SageMaker Inference Recommender.
ValueType (string) –
The value type suggested by the Amazon SageMaker Inference Recommender.
Value (string) –
The value suggested by the Amazon SageMaker Inference Recommender.
CompilationJobName (string) –
The name of the compilation job used to create the recommended model artifacts.
FailureReason (string) –
The reason why a benchmark failed.
InvocationEndTime (datetime) –
A timestamp that shows when the benchmark completed.
InvocationStartTime (datetime) –
A timestamp that shows when the benchmark started.
NextToken (string) –
A token that you can specify in your next request to return more results from the list.
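The response structure above can be reduced to a compact per-benchmark summary. The sketch below assumes a response dict shaped as documented here. The function name `summarize_completed_benchmarks` and the choice of fields are illustrative. Because CpuUtilization may be NaN when the value is not available, the sketch maps NaN to None.

```python
import math


def summarize_completed_benchmarks(response):
    """Collect instance type, cost, and CPU utilization for completed benchmarks."""
    rows = []
    for step in response.get("Steps", []):
        if step.get("Status") != "COMPLETED":
            continue
        benchmark = step.get("InferenceBenchmark", {})
        metrics = benchmark.get("Metrics", {})
        endpoint = benchmark.get("EndpointConfiguration", {})
        cpu = metrics.get("CpuUtilization")
        # NaN means the value is not available; map it to None.
        if isinstance(cpu, float) and math.isnan(cpu):
            cpu = None
        rows.append(
            {
                "InstanceType": endpoint.get("InstanceType"),
                "CostPerHour": metrics.get("CostPerHour"),
                "CpuUtilization": cpu,
            }
        )
    return rows
```

Steps in any other status are skipped, since failed or pending benchmarks may not carry a full set of metrics; their FailureReason field can be inspected separately.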
Exceptions
SageMaker.Client.exceptions.ResourceNotFound
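A ResourceNotFound error can be caught through the client's `exceptions` attribute. This is a minimal sketch, assuming `client` is a SageMaker client created with boto3. The helper name `get_steps_or_none` is hypothetical, and returning None for a missing job is one possible design choice rather than prescribed behavior.

```python
def get_steps_or_none(client, job_name):
    """Return the first page of steps for a job, or None if the job is not found."""
    try:
        response = client.list_inference_recommendations_job_steps(JobName=job_name)
    except client.exceptions.ResourceNotFound:
        # Raised when the named Inference Recommender job cannot be found.
        return None
    return response.get("Steps", [])
```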